CS378: Natural Language Processing (Fall 2022)
NOTE: This page is for an old semester of this class
Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 11am - 12:30pm, JGB 2.216
Instructor Office Hours (all on Zoom): Tuesday 5:30pm-6:30pm, Wednesday 10am-11am
TAs: Xi Ye, Lokesh Pugalenthi
TA Office Hours (all on Zoom):
- Lokesh: Monday 10am-11am
- Xi: Wednesday 4pm-5pm
- Xi: Thursday 2pm-3pm
- Lokesh: Friday 3pm-4pm
Discussion Board (Piazza)
Course Format
The course lectures will be delivered in a traditional, in-person format. Recordings will be made available
via the LecturesOnline service for students to browse after class. All course materials will be posted
on this website. Note that additional pre-recorded video content overlapping with the concepts in this course
is available on the CS388 website.
The exam will be given in-person. Contact the instructor ASAP if this poses a problem for you.
Description
This course provides an introduction to modern natural language processing
using machine learning and deep learning approaches. Content includes
linguistics fundamentals (syntax, semantics, distributional properties of
language), machine learning models (classifiers, sequence taggers, deep
learning models), key algorithms for inference, and applications to a range of
problems. Students will get hands-on experience building systems to do tasks
including text classification, syntactic analysis, language modeling, and language generation.
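To give a concrete flavor of the first of these tasks, below is a minimal sketch of a perceptron sentiment classifier over bag-of-words features, the approach covered in the first Classification lectures. This is illustrative only, not course code: the function names, feature set, and training loop here are assumptions, and A1 specifies its own data format and interface.

    # Minimal sketch of a bag-of-words perceptron for sentiment classification.
    # Illustrative only: names and interfaces here are hypothetical, not A1's.
    from collections import Counter

    def extract_features(sentence: str) -> Counter:
        # Unigram bag-of-words features: token -> count.
        return Counter(sentence.lower().split())

    def predict(weights: Counter, feats: Counter) -> int:
        # Returns 1 (positive) if the weighted feature score is nonnegative.
        score = sum(weights[f] * v for f, v in feats.items())
        return 1 if score >= 0 else 0

    def train_perceptron(data, epochs: int = 5) -> Counter:
        # data: list of (sentence, gold_label) pairs with labels in {0, 1}.
        weights = Counter()
        for _ in range(epochs):
            for sentence, gold in data:
                feats = extract_features(sentence)
                if predict(weights, feats) != gold:
                    # Perceptron update on mistakes only: move the weights
                    # toward the gold label (+features if positive, -if negative).
                    sign = 1 if gold == 1 else -1
                    for f, v in feats.items():
                        weights[f] += sign * v
        return weights

    # Toy usage: train on two examples, then classify a new sentence.
    toy_data = [("the movie was great", 1), ("the movie was awful", 0)]
    w = train_perceptron(toy_data)
    print(predict(w, extract_features("a great film")))  # expect 1

Treat this as a picture of the perceptron update rule, not as a starting point for the assignment.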
Requirements
- CS 429
- Recommended: CS 331, familiarity with probability and linear algebra, programming experience in Python
- Helpful: Exposure to AI and machine learning (e.g., CS 342/343/363)
Detailed syllabus with course policies
Assignments:
Assignment 0: Warmup (ungraded) [nyt dataset] [tokenizer.py] [solutions]
Assignment 1: Sentiment Classification (due September 8, 11:59pm) [Code and data]
Assignment 2: Feedforward Neural Networks and Optimization (due September 22, 11:59pm) [Code and data]
Assignment 3: Sequence Modeling and Parsing (due October 6, 11:59pm) [Code and data on Canvas]
Midterm (topics) [midterm in-class on Tuesday, October 11] [practice exams: fall 2021 midterm / solutions and fall 2020 midterm / solutions (both longer than this one will be); spring 2020 midterm / solutions]
Assignment 4: Character Language Modeling with Transformers (due November 1, 11:59pm) [Code and data]
Assignment 5: Machine Translation (due November 8, 11:59pm) [Code and data]
Final Project: [instructions], [instructions for independent projects], [Code]
Readings: Textbook readings are assigned to complement the material discussed in lecture. You may find it useful
to do these readings before lecture as preparation or after lecture to review, but you are not expected to know everything discussed
in the textbook if it isn't covered in lecture.
Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.
Bold readings and videos are most central to the course content; it's recommended that you look at these.
The chief text in this course is Eisenstein: Natural Language Processing,
available as a free PDF online.
(Another generally useful NLP book is Jurafsky and Martin: Speech and Language Processing (3rd ed. draft), with many draft chapters available for free online; however,
we will not be using it much for this course.)
Schedule (subject to change through the first day of classes)
Date | Topics | Readings | Assignments
Aug 23 | Introduction [4pp] | | A0 out (ungraded)
Aug 25 | Classification 1: Features, Perceptron | Classification lecture note; Eisenstein 2.0, 2.1, 2.3.1, 4.1, 4.3 | A1 out
Aug 30 | Classification 2: Logistic Regression, Optimization | Classification lecture note; Perceptron Loss (VIDEO); perc_lecture_plot.py; Jurafsky and Martin 5.0-5.3 |
Sept 1 | Classification 3: Multiclass (slides: [1pp] [4pp]) | Pang+02; Wang+Manning12; Socher+13 Sentiment; Multiclass lecture note; Eisenstein 2.4.1, 2.5, 2.6, 4.2 |
Sept 6 | Classification 4: Fairness [4pp] / Neural 1 | Goldberg 4; Schwartz+13 Authorship; Hutchinson and Mitchell18 Fairness; Eisenstein 3.0-3.3 |
Sept 8 | Neural 2: Implementation, Word embeddings intro [4pp] | Neural Net Optimization (VIDEO); ffnn_example.py; Eisenstein 3.3; Goldberg 3, 6; Iyyer+15 DANs; Init and backprop | A1 due / A2 out
Sept 13 | Neural 3: Word embeddings | Other embedding methods (VIDEO); Eisenstein 14.5-14.6; Goldberg 5; Mikolov+13 word2vec; Pennington+14 GloVe |
Sept 15 | Neural 4: Bias, multilingual [4pp] | Bolukbasi+16 Gender; Gonen+19 Debiasing; Ammar+16 Xlingual embeddings; Mikolov+13 Word translation |
Sept 20 | Sequence 1: Tagging, POS, HMMs | Eisenstein 7.1-7.4, 8.1 |
Sept 22 | Sequence 2: HMMs, Viterbi | Viterbi lecture note; Eisenstein 7.1-7.4 | A2 due / A3 out
Sept 27 | Sequence 3: Beam Search, POS, CRFs/NER (slides: [1pp] [4pp]) | CRFs (VIDEO); Viterbi lecture note; Eisenstein 7.5-7.6; Manning11 POS |
Sept 29 | Trees 1: PCFGs, CKY (slides: [1pp] [4pp]) | Eisenstein 10.1-3, 10.4.1 |
Oct 4 | Trees 2: Dependency (slides: [1pp] [4pp]) | Eisenstein 11.3-4; KleinManning03 Unlexicalized |
Oct 6 | Trees 3: Shift-reduce / Midterm review | State-of-the-art Parsers (VIDEO); ChenManning14; Andor+16 | A3 due
Oct 11 | Midterm (in-class) | |
Oct 13 | LM 1: N-grams, RNNs | Eisenstein 6.1-6.2 |
Oct 18 | LM 2: Self-attention, Transformers (slides: [1pp] [4pp]) | Luong+15 Attention; Vaswani+17 Transformers; Alammar Illustrated Transformer | A4 out
Oct 20 | LM 3: Implementation (slides: [1pp] [4pp]) | Kaplan+20 Scaling Laws; Beltagy+20 Longformer; Choromanski+21 Performer; Tay+20 Efficient Transformers | Custom FP proposals due
Oct 25 | MT 1: Alignment, Phrase-based MT | Eisenstein 18.1-18.2, 18.4; Michael Collins IBM Models 1+2; JHU slides; History of MT |
Oct 27 | MT 2: Seq2seq, Systems [4pp] | Holtzman+19 Nucleus Sampling; Liu+20 mBART; Wu+16 Google; Chen+18 Google; SennrichZhang19 Low-resource |
Nov 1 | Pre-training 1: ELMo, BERT [4pp] | Peters+18 ELMo; Devlin+19 BERT; Clark+20 ELECTRA; He+21 DeBERTa; Alammar Illustrated BERT | A4 due / A5 out
Nov 3 | Pre-training 2: BART/T5, GPT-3, Prompting [4pp] | Raffel+19 T5; Lewis+19 BART; Radford+19 GPT2; Brown+20 GPT3; Chowdhery+22 PaLM; Sanh+21 T0; Chung+22 Flan-PaLM |
Nov 8 | Understanding NNs 1: Dataset Bias [4pp] | Gururangan+18 Artifacts; McCoy+19 Right; Gardner+20 Contrast; Swayamdipta+20 Cartography; Utama+20 Debiasing | A5 due / FP out
Nov 10 | Understanding NNs 2: Interpretability [4pp] | Lipton+16 Mythos; Ribeiro+16 LIME; Simonyan+13 Visualizing; Sundararajan+17 Int Grad; Interpretation Tutorial |
Nov 15 | Question Answering [4pp] | Eisenstein 12; Chen+17 DrQA; Guu+20 REALM; Kwiatkowski+19 NQ; Nakano+21 WebGPT |
Nov 17 | Dialogue, Ethics of Generation [4pp] | Adiwardana+20 Google Meena; Roller+20 Facebook Blender; Thoppilan+22 LaMDA; BenderGebru+21 Stochastic Parrots; Gehman+20 Toxicity | FP check-ins due Nov 18
Nov 22 | NO CLASS | |
Nov 24 | NO CLASS | |
Nov 29 | Multilingual, Multimodal Models [4pp] | Ammar+16 Xlingual embeddings; Conneau+19 XLM-R; Pires+19 How multilingual is mBERT?; Radford+21 CLIP |
Dec 1 | Wrapup + Ethics [4pp] | HovySpruit16 Social Impact of NLP; Zhao+17 Bias Amplification; Rudinger+18 Gender Bias in Coref; BenderGebru+21 Stochastic Parrots; Gebru+18 Datasheets for Datasets; Raji+20 Auditing | FP due Dec 9