CS371N: Natural Language Processing (Fall 2024)
Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 10:45am, JGB 2.218
Instructor Office Hours: Monday 1pm-2pm, Thursday 2pm-3pm, GDC 3.812 and on Zoom (hybrid; see Canvas for link)
TAs: Juan Diego Rodriguez, Grace Kim
TA Office Hours:
- Tuesday 11am, Desk 5 GDC TA Station (1st floor) [Juan Diego]
- Wednesday 4pm, Desk 1 GDC TA Station (1st floor) [Grace]
- Thursday 5pm, Desk 5 GDC TA Station (1st floor) [Juan Diego]
- Friday 3pm, Desk 1 GDC TA Station (1st floor) [Grace]
See Canvas for a link to the discussion board (Ed Discussion)
Description
This course provides an introduction to modern natural language processing
using machine learning and deep learning approaches. Content includes
linguistics fundamentals (syntax, semantics, distributional properties of
language), machine learning models (classifiers, sequence taggers, Transformers, and large language models),
algorithms for decoding and inference, and contemporary methods for applying language models to solve
a range of problems. Students will get hands-on experience building systems for tasks
including text classification, language modeling, and textual entailment.
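As a taste of the hands-on work, here is a minimal illustrative sketch of a unigram bag-of-words perceptron for binary sentiment classification, roughly the kind of system built in Assignment 1. This is not course code; the function names and toy data are assumptions made for illustration.

    # Minimal perceptron sentiment classifier (illustrative sketch only).
    from collections import Counter

    def extract_features(sentence):
        # Unigram bag-of-words counts; real systems tokenize more carefully.
        return Counter(sentence.lower().split())

    def train_perceptron(examples, num_epochs=10):
        # examples: list of (sentence, label) pairs with label in {0, 1}.
        weights = Counter()  # missing features default to weight 0
        for _ in range(num_epochs):
            for sentence, label in examples:
                feats = extract_features(sentence)
                score = sum(weights[f] * v for f, v in feats.items())
                pred = 1 if score > 0 else 0
                if pred != label:
                    # Mistake-driven update: push weights toward the gold label.
                    direction = 1 if label == 1 else -1
                    for f, v in feats.items():
                        weights[f] += direction * v
        return weights

    train = [("the movie was great", 1), ("utterly boring and bad", 0)]
    w = train_perceptron(train)
    # Classify a new sentence: positive iff its score is above zero.
    print(1 if sum(w[f] * v for f, v in extract_features("great fun").items()) > 0 else 0)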
Requirements
- CS 429
- Recommended: CS 331, familiarity with probability and linear algebra, programming experience in Python
- Helpful: Exposure to AI and machine learning (e.g., CS 342/343/363)
Course Details
The course lectures will be delivered in a traditional, in-person format. Recordings will be made available
via the LecturesOnline service for students to browse after class. All course materials will be posted
on this website. Note that additional pre-recorded video content overlapping with the concepts in this course
is available on the CS388 website.
The exam will be given in person. Contact the instructor ASAP if this poses a problem for you.
Assignments:
Assignment 0: Warmup (ungraded) [nyt dataset] [tokenizer.py] [solutions]
Assignment 1: Sentiment Classification (due September 12, 11:59pm) [Code and data]
Assignment 2: Feedforward Neural Networks and Optimization (due September 26, 11:59pm) [Code and data]
Assignment 3: Transformer Language Modeling (due October 10, 11:59pm) [Code and data] (see the illustrative attention sketch after this list)
Assignment 4: Sequence Modeling and Parsing (due October 22, 11:59pm)
Midterm (topics) [in-class midterm on Thursday, October 24] [fall 2023 midterm / solutions, fall 2022 midterm / solutions]
Assignment 5: Factuality of ChatGPT (due November 7, 11:59pm) [Code and data]
Final Project: Dataset Artifacts [instructions], [Code], [instructions for independent projects] [Project sample 1] [Project sample 2]
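For a sense of what Assignment 3 involves, here is a tiny illustrative sketch of single-head causal self-attention, the core operation of a Transformer language model. This is not course code: it assumes NumPy is available, and the shapes and names below are made up for illustration.

    import numpy as np

    def causal_self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head) projections.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])      # (seq_len, seq_len)
        # Causal mask: position i may only attend to positions j <= i.
        scores[np.triu(np.ones(scores.shape, dtype=bool), k=1)] = -np.inf
        # Row-wise softmax turns scores into attention weights.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V                           # (seq_len, d_head)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 16))                     # 5 toy "token" vectors
    Wq, Wk, Wv = [rng.normal(size=(16, 8)) for _ in range(3)]
    print(causal_self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)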
Readings: Textbook and paper readings are assigned to complement the material discussed in lecture. You may find it useful
to do these readings before lecture as preparation or after lecture for review, but you are not expected to know material
from the textbook that isn't covered in lecture.
Paper readings are intended to supplement the course material if you are interested in diving deeper on particular topics.
The chief text in this course is Eisenstein: Natural Language Processing,
available as a free PDF online.
(Another generally useful NLP book is Jurafsky and Martin: Speech and Language Processing (3rd ed. draft), with many draft chapters available for free online; however,
we will not be using it much for this course.)
Schedule (subject to change through the first day of classes)
Date | Topics | Readings | Assignments
Aug 27 |
Introduction [4pp] |
|
A0 out (ungraded) |
Aug 29 |
Classification 1: Features, Perceptron |
Classification lecture note
Perceptron Loss (VIDEO)
perc_lecture_plot.py
Eisenstein 2.0, 2.1, 2.3.1, 4.1, 4.3
|
A1 out |
Sept 3 |
Classification 2: Logistic Regression |
Classification lecture note
Optimization (VIDEO)
Jurafsky and Martin 5.0-5.3
|
|
Sept 5 |
Classification 3: Multiclass, Examples (slides: [1pp] [4pp]) |
Multiclass lecture note
Eisenstein 2.4.1, 2.5, 2.6, 4.2
Pang+02
Wang+Manning12
Socher+13 Sentiment
Schwartz+13 Authorship
|
|
Sept 10 |
Classification 4: Fairness / Neural 1: Feedforward, Backpropagation [4pp] [handwritten notes] |
Fairness (VIDEO)
HutchinsonMitchell18 Fairness
Eisenstein 3.0-3.3
Goldberg 4
|
|
Sept 12 |
Neural 2: Implementation, Word embeddings intro [4pp] |
Neural Net Optimization (VIDEO)
ffnn_example.py
Eisenstein 3.3
Goldberg 3, 6
Iyyer+15 DANs
Init and backprop
|
A1 due / A2 out |
Sept 17 |
Neural 3: Word embeddings [4pp] [handwritten notes] |
Eisenstein 14.5-14.6
Goldberg 5
Mikolov+13 word2vec
Pennington+14 GloVe
|
|
Sept 19 |
Neural 4: Bias, multilingual [4pp] |
Bolukbasi+16 Gender
Gonen+19 Debiasing
Ammar+16 Xlingual embeddings
Mikolov+13 Word translation
|
|
Sept 24 |
LM 1: N-grams, RNNs |
Eisenstein 6.1-6.2
|
|
Sept 26 |
LM 2: Self-attention, Transformers |
Luong+15 Attention
Vaswani+17 Transformers
Alammar Illustrated Transformer
PhuongHutter22 Transformers
|
A2 due / A3 out |
Oct 1 |
LM 3: Implementation, Extensions [4pp] [handwritten notes] |
Kaplan+20 Scaling Laws
Beltagy+20 Longformer
Choromanski+21 Performer
Tay+20 Efficient Transformers
|
|
Oct 3 |
Pre-training 1: Encoders (BERT), Tokenization [4pp] |
Peters+18 ELMo
Devlin+19 BERT
Alammar Illustrated BERT
Liu+19 RoBERTa
BostromDurrett20 Tokenizers
|
|
Oct 8 |
Pre-training 2: Decoders (GPT/T5), Decoding Methods [4pp] |
Raffel+19 T5
Lewis+19 BART
Radford+19 GPT2
Brown+20 GPT3
Chowdhery+22 PaLM
Holtzman+19 Nucleus Sampling
|
|
Oct 10 |
Sequence 1: Tagging, POS, HMMs |
Eisenstein 7.1-7.4, 8.1
|
A3 due / A4 out |
Oct 15 |
Sequence 2: HMMs, Viterbi |
Viterbi lecture note
Eisenstein 7.1-7.4
|
|
Oct 17 |
Trees 1: PCFGs, CKY [4pp] [handwritten notes] |
Eisenstein 10.1-3, 10.4.1
|
|
Oct 22 |
Trees 2: Dependency, Shift-reduce [4pp] |
Eisenstein 11.3-4
KleinManning03 Unlexicalized
ChenManning14
Andor+16
|
A4 due |
Oct 24 |
Midterm [4pp] |
|
|
Oct 29 |
Understanding GPT-3 1: Prompting GPT-3, Factuality [4pp] |
Zhao+21 Calibrate Before Use
Min+22 Rethinking Demonstrations
Gonen+22 Demystifying Prompts
Olsson+22 Induction Heads
Min+23 FActScore
Gao+22 RARR
|
A5 out |
Oct 31 |
Understanding GPT-3 2: Rationales, Chain-of-thought [4pp] |
Camburu+18 e-SNLI
Wei+22 CoT
YeDurrett22 Unreliability
Kojima+22 Step-by-step
Gao+22 Program-aided
Sprague+24 To CoT or not to CoT
Ye+23 SatLM
Yao+23 Tree-of-thought
|
Custom FP proposals due |
Nov 5 |
Understanding GPT-3 3: Instruction tuning, RL in NLP [4pp] |
Sanh+21 T0
Liu+21 Prompting
Chung+22 Flan-PaLM
Ouyang+22 Human Feedback
Rafailov+23 DPO
Singhal+23 Length
|
|
Nov 7 |
Understanding NNs 1: Dataset Bias [4pp] |
Gururangan+18 Artifacts
McCoy+19 Right
Gardner+20 Contrast
Swayamdipta+20 Cartography
Utama+20 Debiasing
|
A5 due / FP out |
Nov 12 |
Understanding NNs 2: Interpretability [4pp] |
Lipton+16 Mythos
Ribeiro+16 LIME
Simonyan+13 Visualizing
Sundararajan+17 Int Grad
Interpretation Tutorial
|
|
Nov 14 |
Machine Translation, Multilinguality [4pp] |
Eisenstein 18.1-18.2, 18.4
Michael Collins IBM Models 1+2
|
|
Nov 19 |
Language Grounding [4pp] |
Radford+21 CLIP
Ahn+22 SayCan
Driess+23 PaLM-E
|
|
Nov 21 |
Modern Topics 1: LLM efficiency [4pp] |
Leviathan+23 Speculative
Medusa Heads (blog)
Dao+23 Flash Attention
Xia+23 Sheared LLaMA
Sanh+19 DistilBERT
Hsieh+23 Distill Step-by-Step
BenZaken+22 BitFit
Hu+21 LoRA
Dettmers+22 LLM.int8()
Dettmers+23 QLoRA
|
|
Nov 26 |
NO CLASS |
|
|
Nov 28 |
NO CLASS |
|
|
Dec 3 |
Modern Topics 2: LLM safety, RAG [4pp] |
Shen+23 Jailbreaking
Zou+23 Attacks on LLMs
EldanRussinovich23 Unlearning
Mitchell+22 Model Editing
Onoe+23 Challenges in Propagating
|
|
Dec 5 |
Wrapup + Ethics [4pp] |
HovySpruit16 Social Impact of NLP
Zhao+17 Bias Amplification
Rudinger+18 Gender Bias in Coref
BenderGebru+21 Stochastic Parrots
Gebru+18 Datasheets for Datasets
Raji+20 Auditing
|
FP due Dec 13 |