CS378: Natural Language Processing (Fall 2022)
NOTE: This page is for an old semester of this class
Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 11am - 12:30pm, JGB 2.216
Instructor Office Hours (all on Zoom): Tuesday 5:30pm-6:30pm, Wednesday 10am-11am
TAs: Xi Ye, Lokesh Pugalenthi
TA Office Hours (all on Zoom):
- Lokesh: Monday 10am-11am
- Xi: Wednesday 4pm-5pm
- Xi: Thursday 2pm-3pm
- Lokesh: Friday 3pm-4pm
Discussion Board (Piazza)
Course Format
The course lectures will be delivered in a traditional, in-person format. Recordings will be made available
via the LecturesOnline service for students to browse after class. All course materials will be posted
on this website. Note that additional pre-recorded video content overlapping with the concepts in this course
is available on the CS388 website.
The exam will be given in-person. Contact the instructor ASAP if this poses a problem for you.
Description
This course provides an introduction to modern natural language processing
using machine learning and deep learning approaches. Content includes
linguistics fundamentals (syntax, semantics, distributional properties of
language), machine learning models (classifiers, sequence taggers, deep
learning models), key algorithms for inference, and applications to a range of
problems. Students will get hands-on experience building systems to do tasks
including text classification, syntactic analysis, language modeling, and language generation.
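To give a concrete flavor of the first of these tasks, below is a minimal sketch of a perceptron sentiment classifier over bag-of-words features, the approach covered in the first Classification lectures. This is illustrative only, not course code: the function names, feature set, and training loop here are assumptions, and A1 specifies its own data format and interface.

    # Minimal sketch of a bag-of-words perceptron for sentiment classification.
    # Illustrative only: names and interfaces here are hypothetical, not A1's.
    from collections import Counter

    def extract_features(sentence: str) -> Counter:
        # Unigram bag-of-words features: token -> count.
        return Counter(sentence.lower().split())

    def predict(weights: Counter, feats: Counter) -> int:
        # Returns 1 (positive) if the weighted feature score is nonnegative.
        score = sum(weights[f] * v for f, v in feats.items())
        return 1 if score >= 0 else 0

    def train_perceptron(data, epochs: int = 5) -> Counter:
        # data: list of (sentence, gold_label) pairs with labels in {0, 1}.
        weights = Counter()
        for _ in range(epochs):
            for sentence, gold in data:
                feats = extract_features(sentence)
                if predict(weights, feats) != gold:
                    # Perceptron update on mistakes only: move the weights
                    # toward the gold label (+features if positive, -if negative).
                    sign = 1 if gold == 1 else -1
                    for f, v in feats.items():
                        weights[f] += sign * v
        return weights

    # Toy usage: train on two examples, then classify a new sentence.
    toy_data = [("the movie was great", 1), ("the movie was awful", 0)]
    w = train_perceptron(toy_data)
    print(predict(w, extract_features("a great film")))  # expect 1

Treat this as a picture of the perceptron update rule, not as a starting point for the assignment.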
Requirements
- CS 429
- Recommended: CS 331, familiarity with probability and linear algebra, programming experience in Python
- Helpful: Exposure to AI and machine learning (e.g., CS 342/343/363)
Detailed syllabus with course policies
Assignments:
Assignment 0: Warmup (ungraded) [nyt dataset] [tokenizer.py] [solutions]
Assignment 1: Sentiment Classification (due September 8, 11:59pm) [Code and data]
Assignment 2: Feedforward Neural Networks and Optimization (due September 22, 11:59pm) [Code and data]
Assignment 3: Sequence Modeling and Parsing (due October 6, 11:59pm) [Code and data on Canvas]
Midterm (topics) [midterm in-class on Tuesday, October 11] [practice exams: fall 2021 midterm / solutions and fall 2020 midterm / solutions (both longer than this one will be); spring 2020 midterm / solutions]
Assignment 4: Character Language Modeling with Transformers (due November 1, 11:59pm) [Code and data]
Assignment 5: Machine Translation (due November 8, 11:59pm) [Code and data]
Final Project: [instructions], [instructions for independent projects], [Code]
Readings: Textbook readings are assigned to complement the material discussed in lecture. You may find it useful
to do these readings before lecture as preparation or after lecture to review, but you are not expected to know everything discussed
in the textbook if it isn't covered in lecture.
Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.
Bold readings and videos are most central to the course content; it's recommended that you look at these.
The chief text in this course is Eisenstein: Natural Language Processing,
available as a free PDF online.
(Another generally useful NLP book is Jurafsky and Martin: Speech and Language Processing (3rd ed. draft), with many draft chapters available for free online; however,
we will not be using it much for this course.)
Schedule (subject to change through the first day of classes)
Date | Topics | Readings | Assignments
Aug 23 | Introduction [4pp] | | A0 out (ungraded)
Aug 25 | Classification 1: Features, Perceptron | Classification lecture note; Eisenstein 2.0, 2.1, 2.3.1, 4.1, 4.3 | A1 out
Aug 30 | Classification 2: Logistic Regression, Optimization | Classification lecture note; Perceptron Loss (VIDEO); perc_lecture_plot.py; Jurafsky and Martin 5.0-5.3 |
Sept 1 | Classification 3: Multiclass (slides: [1pp] [4pp]) | Pang+02; Wang+Manning12; Socher+13 Sentiment; Multiclass lecture note; Eisenstein 2.4.1, 2.5, 2.6, 4.2 |
Sept 6 | Classification 4: Fairness [4pp] / Neural 1 | Goldberg 4; Schwartz+13 Authorship; Hutchinson and Mitchell18 Fairness; Eisenstein 3.0-3.3 |
Sept 8 | Neural 2: Implementation, Word embeddings intro [4pp] | Neural Net Optimization (VIDEO); ffnn_example.py; Eisenstein 3.3; Goldberg 3, 6; Iyyer+15 DANs; Init and backprop | A1 due / A2 out
Sept 13 | Neural 3: Word embeddings | Other embedding methods (VIDEO); Eisenstein 14.5-14.6; Goldberg 5; Mikolov+13 word2vec; Pennington+14 GloVe |
Sept 15 | Neural 4: Bias, multilingual [4pp] | Bolukbasi+16 Gender; Gonen+19 Debiasing; Ammar+16 Xlingual embeddings; Mikolov+13 Word translation |
Sept 20 | Sequence 1: Tagging, POS, HMMs | Eisenstein 7.1-7.4, 8.1 |
Sept 22 | Sequence 2: HMMs, Viterbi | Viterbi lecture note; Eisenstein 7.1-7.4 | A2 due / A3 out
Sept 27 | Sequence 3: Beam Search, POS, CRFs/NER (slides: [1pp] [4pp]) | CRFs (VIDEO); Viterbi lecture note; Eisenstein 7.5-7.6; Manning11 POS |
Sept 29 | Trees 1: PCFGs, CKY (slides: [1pp] [4pp]) | Eisenstein 10.1-3, 10.4.1 |
Oct 4 | Trees 2: Dependency (slides: [1pp] [4pp]) | Eisenstein 11.3-4; KleinManning03 Unlexicalized |
Oct 6 | Trees 3: Shift-reduce / Midterm review | State-of-the-art Parsers (VIDEO); ChenManning14; Andor+16 | A3 due
Oct 11 | Midterm (in-class) | |
Oct 13 | LM 1: N-grams, RNNs | Eisenstein 6.1-6.2 |
Oct 18 | LM 2: Self-attention, Transformers (slides: [1pp] [4pp]) | Luong+15 Attention; Vaswani+17 Transformers; Alammar Illustrated Transformer | A4 out
Oct 20 | LM 3: Implementation (slides: [1pp] [4pp]) | Kaplan+20 Scaling Laws; Beltagy+20 Longformer; Choromanski+21 Performer; Tay+20 Efficient Transformers | Custom FP proposals due
Oct 25 | MT 1: Alignment, Phrase-based MT | Eisenstein 18.1-18.2, 18.4; Michael Collins IBM Models 1+2; JHU slides; History of MT |
Oct 27 | MT 2: Seq2seq, Systems [4pp] | Holtzman+19 Nucleus Sampling; Liu+20 mBART; Wu+16 Google; Chen+18 Google; SennrichZhang19 Low-resource |
Nov 1 | Pre-training 1: ELMo, BERT [4pp] | Peters+18 ELMo; Devlin+19 BERT; Clark+20 ELECTRA; He+21 DeBERTa; Alammar Illustrated BERT | A4 due / A5 out
Nov 3 | Pre-training 2: BART/T5, GPT-3, Prompting [4pp] | Raffel+19 T5; Lewis+19 BART; Radford+19 GPT2; Brown+20 GPT3; Chowdhery+22 PaLM; Sanh+21 T0; Chung+22 Flan-PaLM |
Nov 8 | Understanding NNs 1: Dataset Bias [4pp] | Gururangan+18 Artifacts; McCoy+19 Right; Gardner+20 Contrast; Swayamdipta+20 Cartography; Utama+20 Debiasing | A5 due / FP out
Nov 10 | Understanding NNs 2: Interpretability [4pp] | Lipton+16 Mythos; Ribeiro+16 LIME; Simonyan+13 Visualizing; Sundararajan+17 Int Grad; Interpretation Tutorial |
Nov 15 | Question Answering [4pp] | Eisenstein 12; Chen+17 DrQA; Guu+20 REALM; Kwiatkowski+19 NQ; Nakano+21 WebGPT |
Nov 17 | Dialogue, Ethics of Generation [4pp] | Adiwardana+20 Google Meena; Roller+20 Facebook Blender; Thoppilan+22 LaMDA; BenderGebru+21 Stochastic Parrots; Gehman+20 Toxicity | FP check-ins due Nov 18
Nov 22 | NO CLASS | |
Nov 24 | NO CLASS | |
Nov 29 | Multilingual, Multimodal Models [4pp] | Ammar+16 Xlingual embeddings; Conneau+19 XLM-R; Pires+19 How multilingual is mBERT?; Radford+21 CLIP |
Dec 1 | Wrapup + Ethics [4pp] | HovySpruit16 Social Impact of NLP; Zhao+17 Bias Amplification; Rudinger+18 Gender Bias in Coref; BenderGebru+21 Stochastic Parrots; Gebru+18 Datasheets for Datasets; Raji+20 Auditing | FP due Dec 9