Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 11:00am, Garrison Hall 0.132 (GAR)
Instructor Office Hours: Wednesday 10:00am - 12:00pm, GDC 3.420 (additional OHs by appointment)
TA: Ye Zhang
TA Office Hours: Tuesday and Thursday 2:00pm - 3:00pm, GDC 1.302
Piazza
This class covers a range of topics in structured prediction and deep learning with a focus on applications to NLP. We discuss model structures that commonly arise in NLP, such as sequence models, tree-structured models, more general graphical models, recurrent neural networks, and convolutional neural networks, as well as the connections among them. We study the models themselves, examples of problems they are applied to, inference methods, parameter estimation (both supervised and unsupervised approaches), and optimization. Programming assignments involve building scalable machine learning systems for various NLP tasks, with a focus on understanding the design decisions surrounding modeling, inference, and learning, and how these interact.
Differences from CS388: This class is intended to complement CS388. CS388 is not a prerequisite for this class, and those who have taken CS388 will not have seen everything covered here. In particular, this class places greater emphasis on the fundamentals of structured machine learning and covers a wider range of deep learning techniques, while CS388 focuses more on surveying broadly important problems in NLP and studying the underlying linguistic phenomena.
Requirements
Detailed syllabus with course policies
This course is broken into two halves: the first half covers structured prediction techniques with linear models, and the second revisits these techniques and structures in the context of deep neural networks. Throughout the course, methods will be illustrated via a number of NLP tasks including POS tagging, named entity recognition, syntactic parsing, sentiment analysis, machine translation, image captioning, and others.
This schedule is tentative! Because this is the first time this course is being offered, lecture topics toward the end of the semester may shift around.
Assignments: There are three programming assignments that require implementing models discussed in class. Framework code in Python and datasets will be provided. You may use another language if you prefer, but you'll have to implement some basic file I/O and other parts of the framework code yourself. In addition, there is an open-ended final project to be done either individually or in teams of two. This project should constitute novel exploration beyond directly implementing concepts from lecture and should result in a report that reads roughly like an NLP/ML conference submission in terms of presentation and scope.
Samples of successful Project 1 reports: Sample 1 Sample 2
Project 1: CRF for NER [download code]
Project 2: Shift-Reduce Parsing [download code]
Project 3: Neural Networks for Sentiment Analysis [download code and data (20MB)]
Readings: Readings are purely optional and intended to supplement lecture and give you another view of the material. Two main sources will be used: