Syllabus for CS378: Natural Language Processing

Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 11:00am, WAG 214
Instructor Office Hours: Tuesday 1pm-2pm / Wednesday 10am-11am GDC 3.812
TAs: Yasumasa Onoe (yasumasa@utexas.edu), Shrey Desai (shreydesai@utexas.edu)
TA Office Hours: All in GDC 1.302:

Description

Natural language processing (NLP) is a subfield of AI focused on solving problems that involve dealing with human language in a sophisticated way: these include information extraction, machine translation, automatic summarization, conversational dialogue, syntactic analysis, and many others. Much of the progress on these problems over the last 25 years has been driven by statistical machine learning and, more recently, deep learning. One distinctive feature of language compared to other types of data is its structured nature: modeling language involves understanding the linguistic phenomena it exhibits and grappling with it as a sequentially-structured, tree-structured, or graph-structured entity.

This class is intended to be a survey of modern NLP in two respects. First, it covers the main applications of NLP techniques today, both in academia and in industry, as well as enough linguistics to put these problems in context and understand their challenges. Second, it covers a range of models in structured prediction and deep learning including classifiers, sequence models, statistical parsers, neural network encoders, and encoder-decoder models. We study the models themselves, examples of problems they are applied to, inference methods, parameter estimation, and optimization. Programming assignments involve building scalable machine learning systems for various NLP tasks and seeing how these models can be put into practice.

Prerequisites

Lectures

Lectures are 9:30-11:00am Tuesday and Thursday in WAG 214. A full schedule of lectures and assignments, including readings, is on the main website page.

Coursework

The timeline of assignments is on the course calendar. Assignment specifications, code, and data will be made available on the course website and Canvas.

Religious Holy Days: A student who is absent from an examination or cannot meet an assignment deadline due to the observance of a religious holy day may take the exam on an alternate day or submit the assignment up to 24 hours late without penalty, if proper notice of the planned absence has been given. Notice must be given at least 14 days prior to the classes which will be missed. For religious holy days that fall within the first 2 weeks of the semester, notice should be given on the first day of the semester. Notice should be personally delivered to the instructor and signed and dated by the instructor, or emailed, in which case a student submitting email notification must receive email confirmation from the instructor.

Other Extensions: Extensions may be granted in cases of medical emergency or other circumstances. In all cases, the student should inform the course staff as soon as is practical, and the extension must be negotiated before the assignment's original due date.

Assignments

The assignments will feature a combination of written questions and coding assignments of varying scope. Detailed instructions for assignment completion and submission are given with each assignment.

Slip Days: Each student is given 2 slip days to use throughout the term. Any number of these days can be applied to any assignment to extend the deadline for that assignment by that many days. E.g., you can turn in Assignment 1 one day late and Assignment 4 one day late, or you could turn in a single assignment two days late. Slip days can only be used for assignments and not the midterm or final project. Slip days cannot be used fractionally: you must choose to use 0, 1, or 2 slip days for an assignment.

Late Assignments: For each day an assignment is late beyond what is covered by slip days or a negotiated extension (both described above), 15% of the credit for that assignment will be deducted. For example, an assignment turned in two days late with no slip days applied will automatically lose 30%.
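
As an illustration of how slip days and the late penalty interact, here is a minimal Python sketch. The function name and the cap at 100% are assumptions made for this example and are not part of the official policy; the 2 slip days and the 15% per day come from the text above.

```python
SLIP_DAYS_PER_TERM = 2   # from the Slip Days policy above
PENALTY_PER_DAY = 15     # percent of assignment credit lost per uncovered late day


def late_deduction_percent(days_late: int, slip_days_applied: int = 0) -> int:
    """Return the percent of an assignment's credit deducted for lateness.

    Hypothetical helper: whole days only, since slip days cannot be used
    fractionally. Capping the deduction at 100% is an assumption.
    """
    if days_late < 0 or slip_days_applied < 0:
        raise ValueError("day counts must be non-negative")
    if slip_days_applied > SLIP_DAYS_PER_TERM:
        raise ValueError("at most 2 slip days are available per term")
    # Days not covered by slip days are the ones that incur the penalty.
    uncovered_days = max(0, days_late - slip_days_applied)
    return min(100, PENALTY_PER_DAY * uncovered_days)


print(late_deduction_percent(2))     # two days late, no slip days
print(late_deduction_percent(2, 2))  # two days late, both slip days used
```

Under these assumptions, the first call corresponds to the 30% example above, and the second shows a two-day delay fully covered by slip days.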

Midterm

There will be one in-class midterm as described on the course calendar. Students will be allowed one letter-size (8.5" x 11") page of notes during the exam. Use of electronic communication devices (phones, laptops, calculators, etc.) is banned during the exam.

Final Project

The final project is either an in-depth exploration of question answering or an opportunity for more open-ended exploration of concepts in the course. Both options can be completed individually or in groups of 2; working in groups is encouraged! If you wish to pursue your own independent project, your group must write a brief 1-page proposal by March 24 describing what you plan to do and how you plan to do it, which the course staff will provide feedback on. Independent projects do not necessarily have to "work," but will be held to a high standard in terms of expected effort, insight, and technical sophistication.

Final Grades

Your final grade is computed based on the total points earned across all assignments. The final grade is mapped to a letter as follows, with grades on the boundary receiving the higher grade:

A 100 - 93.3
A- 93.3 - 90.0
B+ 90.0 - 86.6
B 86.6 - 83.3
B- 83.3 - 80.0
C+ 80.0 - 76.6
C 76.6 - 73.3
C- 73.3 - 70.0
D 70.0 - 65.0
F below 65.0

Depending on class performance, the instructors may shift these boundaries down to raise students' grades.
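
The mapping above, including the rule that a score on a boundary receives the higher grade, can be sketched in Python as follows. The names GRADE_CUTOFFS and letter_grade are hypothetical, and the sketch uses the listed boundaries only; it does not model any downward shift the instructors may apply.

```python
# Lower bound of each letter grade, highest first (taken from the table above).
GRADE_CUTOFFS = [
    (93.3, "A"),
    (90.0, "A-"),
    (86.6, "B+"),
    (83.3, "B"),
    (80.0, "B-"),
    (76.6, "C+"),
    (73.3, "C"),
    (70.0, "C-"),
    (65.0, "D"),
]


def letter_grade(score: float) -> str:
    """Map a final score (0-100) to a letter grade.

    Using >= means a score exactly on a boundary gets the higher grade.
    """
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


print(letter_grade(90.0))  # on the A- / B+ boundary
print(letter_grade(64.9))  # below the D cutoff
```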

Academic Honesty

Please read the department's academic honesty policies. For this course, students should complete all assignments independently, excluding the final project, which may be completed in groups. Limit any discussion of assignments with other students to clarification of the requirements or definitions of the problems, or to understanding the existing code or general course material. Never discuss issues directly relevant to problem solutions with other students. Finally, note that you may not use external resources (e.g., code on GitHub that does the assigned task) except where explicitly authorized by the course staff.

Be sure you respect these policies when posting on Piazza. Asking clarifying questions, addressing possible bugs in the provided code, etc. are fair game, but you should not discuss solutions in a substantive way. When in doubt, post privately to the instructors.

Students who violate these policies may receive a failing grade on the assignment in question or for the course overall, depending on the instructors' judgment and the severity of the infraction.

Miscellaneous

Disabilities: Students with disabilities may request appropriate academic accommodations from the Division of Diversity and Community Engagement, Services for Students with Disabilities at 512-471-6259.