Lecture 0: Course introduction
Why are we here? Let's talk about JavaScript.
JavaScript has arrays:
[]
[1, 2, 3]
[1, 2, 3].length
What's [] + []? Empty array? Type error? No, it's the empty string.
Ok, fine. JavaScript also has objects, also known as maps, also known as dictionaries:
{}
{a: 5, b: "hello"}
({a: 5, b: "hello"}).a
Can you add [] + {}? No, right, what would that even mean? JavaScript knows! It's "[object Object]" -- a string, not an object.
Of course, we learned in math class that + commutes, so {} + [] should be the same thing ... oh, it's 0.
Now for all the marbles, what's {} + {}? No arrays here! This is, of course, not a number! NaN.
(This was by far the most ridiculous one I could find, but let's not beat up on JavaScript any more and look at a couple of other examples.)
In Python, we can write lists and append them together:
a = [1, 2]
b = a
a = a + [3, 4]
a # [1, 2, 3, 4]
b # [1, 2]
Now, one of the little things we learn about many languages like C and Python is that instead of writing a = a + _ we can just write a += _. So let's try that:
a = [1, 2]
b = a
a += [3, 4]
a # [1, 2, 3, 4]
b # [1, 2, 3, 4]
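Here is a minimal sketch of why the two versions differ. For lists, a + [3, 4] builds a new list and rebinds the name a, while a += [3, 4] extends the existing list in place, so b (still pointing at that same list) sees the change. The function names are made up for illustration.

```python
def concat_then_rebind():
    a = [1, 2]
    b = a              # b and a name the same list object
    a = a + [3, 4]     # builds a new list and rebinds a; b is untouched
    return a, b, a is b


def augmented_assign():
    a = [1, 2]
    b = a              # b and a name the same list object
    a += [3, 4]        # extends the existing list in place
    return a, b, a is b


print(concat_then_rebind())  # ([1, 2, 3, 4], [1, 2], False)
print(augmented_assign())    # ([1, 2, 3, 4], [1, 2, 3, 4], True)
```

The third component of each result is the identity check a is b, which tells us whether the two names still refer to one object.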
That's a little weird. But Python's weird. Surely an industrial-strength language like C, the one we all trust, wouldn't do things like this?
#include <stdint.h>

uint64_t mul(uint16_t a, uint16_t b) {
    uint32_t c = a * b;
    return c;
}
What does mul(60000, 60000) evaluate to? Let's try gcc -o demo demo.c -O0:
3600000000
But -O0 is slow, and we want to be fast. Let's try -O3:
18446744073014584320
Welp.
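One reading that is consistent with both outputs, sketched in Python. In C, the two uint16_t operands are promoted to int before the multiplication. Assuming a 32-bit int, 60000 * 60000 does not fit, and signed overflow is undefined behavior, so the compiler may produce anything. The helper names below are hypothetical, and this only reproduces the arithmetic; it is not a claim about what gcc does internally.

```python
INT_MAX = 2**31 - 1  # largest value of a 32-bit signed int


def wrap_signed32(x):
    """Reinterpret the low 32 bits of x as a two's-complement signed value."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x


def zero_extended_product(a, b):
    """Keep the low 32 bits of the product and widen them as unsigned."""
    return (a * b) & 0xFFFFFFFF


def sign_extended_product(a, b):
    """Treat the product as a signed 32-bit int, then widen it to 64 bits."""
    return wrap_signed32(a * b) & 0xFFFFFFFFFFFFFFFF


print(60000 * 60000 > INT_MAX)              # True: the int multiplication overflows
print(zero_extended_product(60000, 60000))  # 3600000000, the -O0 output
print(sign_extended_product(60000, 60000))  # 18446744073014584320, the -O3 output
```

In the C code, casting one operand first, as in (uint32_t)a * b, makes the multiplication unsigned when int is 32 bits, and unsigned arithmetic wraps around instead of being undefined.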
What's going on?
Programming languages are hard! And yet, programming is everywhere today—everything from critical services like medicine and aviation to less critical things like Instagram. What makes a programming language "good" for solving all these problems?
- Simplicity -- clean language constructs that are easy to understand and orthogonal. "There should be one-- and preferably only one --obvious way to do it."
- Readability -- the syntax of the language should be elegant and easy to read. Not COBOL constantly shouting at you.
- Safety -- it should be difficult to write programs that "go wrong". The language should protect us.
- Modularity -- software is large and needs to be worked on by multiple people. The language should help us separate that work.
- Efficiency -- programs should be fast and efficient to run. It must be possible to write a good compiler.
But these things are almost always in tension with each other.
- Type systems provide strong guarantees but restrict expressiveness -- they make some programs difficult or impossible to read or write
- Run-time safety checks (assertions) rule out errors but come with an efficiency cost
- Different domains demand different compromises between these things -- the needs of aviation are different to those of my dentist's website
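As a small illustration of the safety/efficiency tension, here is a sketch using binary search, which is only correct on sorted input. The function names are made up for this example. The unchecked version is fast but silently gives a wrong answer on unsorted input; the checked version rules that error out, at the cost of a linear scan on every call.

```python
from bisect import bisect_left


def contains(sorted_xs, target):
    """Binary search: O(log n), but only correct if the input is sorted."""
    i = bisect_left(sorted_xs, target)
    return i < len(sorted_xs) and sorted_xs[i] == target


def contains_checked(xs, target):
    """Same search, guarded by a run-time check that costs O(n) per call."""
    # Note: Python skips assert statements when run with the -O flag.
    assert all(x <= y for x, y in zip(xs, xs[1:])), "input must be sorted"
    return contains(xs, target)


print(contains([1, 3, 5], 3))  # True
print(contains([5, 1, 3], 5))  # False: wrong answer, and nothing complains

try:
    contains_checked([5, 1, 3], 5)
except AssertionError as err:
    print("rejected:", err)    # rejected: input must be sorted
```

A type system that could express "this list is sorted" would move the check to compile time, which is the kind of trade-off this course studies.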
Programming languages -- and this course -- are all about studying these trade-offs and how to make them better. How much of our cake can we find a way to eat?
How to study programming languages
One thing we could do: just study a few different programming languages.
What are some languages that people think are good, worthy of studying?
Of course, it's important to study negative examples too -- what are some languages people think are bad?
I have no idea which programming languages will be popular in the future. Some of my favorite languages are 70 years old, others weren't around 10 years ago.
We will do a little bit of studying particular languages to introduce you to some applications of the ideas in this class.
But mostly, this is a class about building foundations for programming languages. The whole course is really just about two things:
- What do programs mean?
- How can we be sure a program is correct?
But these two things carry a surprising amount of depth. We'll start with simple languages, and build them up by extending them with more features.
One thing we won't see much of in this class is answers to these two questions for "real" programming languages. Real languages are hard! Although the ideas in this class are well established, bringing them to real languages is still at the cutting edge of computer science research. We don't know how to do a lot of this stuff. Maybe you can be the one to figure it out! We'll see a few examples along the way of how people have tried to solve this problem, but mostly we'll concern ourselves with small example languages.
Nonetheless, these tools are incredibly valuable for learning new languages—the foundations haven't changed all that much over the years.
- You should know of the Turing Award, the highest prize in computer science, roughly the "Nobel Prize for computing". A total of 75 people have received the Turing Award since it was created in 1966. Of those, 24 received it for work on programming languages! These ideas have been incredibly influential in computer science.
- Also, there's something of a renaissance of new programming languages right now. Languages and frameworks like TypeScript, Rust, and React are taking ideas that were previously buried in theoretical PL and bringing them to the mainstream. It's a really exciting time for PL!
The ulterior motive of this course
In addition to learning the foundations of PL, I have a secret ulterior motive for this class.
Theory is hard and tedious. We're going to write a lot of proofs in this class. I don't like grading proofs. And I bet you don't like writing them all that much, either. Writing proofs on paper is hard because it's tricky to know if you really got it right.
So I also want to use this class to introduce you to a really cool family of programming language tools called proof assistants. These are basically programming languages in which you write proofs and the computer checks them for you automatically. This has a huge upside -- the computer can tell you if the proof is correct or not! It also has a huge downside -- computers are really pedantic!
Proof assistants are really popular in PL research right now, but even beyond PL, anyone who works with math for a living can benefit from them. There's a growing group of mathematicians who use proof assistants to formalize their work, for example. We'll talk more about proof assistants in the next lecture, but the important point is: while I'll be writing some proofs on the board this semester, you won't be writing any proofs on paper for homework. All our proofs will be done in a proof assistant instead.
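To give a flavor of what a machine-checked proof looks like, here is a minimal sketch in Lean 4. Lean is an assumption made purely for illustration; these notes don't say which proof assistant the course uses, and the theorem name is made up. It proves 0 + n = n by induction on n, the kind of proof we'll be doing a lot of.

```lean
-- A proof by induction that 0 + n = n for every natural number n.
theorem zero_add_sketch (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl                           -- base case: 0 + 0 = 0 by computation
  | succ k ih => rw [Nat.add_succ, ih]    -- step: rewrite using the hypothesis ih
```

If either case were wrong or missing, the checker would reject the proof and point at the goal that remains unsolved. That's the pedantry and the payoff in one place.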
Course logistics
First thing to know is that this is actually two courses:
- CS 345H is the undergraduate honors PL course.
- CS 386L is the graduate PL course.
These are the same course, with only one difference we'll discuss in just a moment. For everything else -- lectures, homework, exam -- the course is identical.
Me, Sam, Sammy.
Prerequisites
- The only formal prerequisites are a discrete math course (like 311) and a computer organization course (like 429).
- But more generally, this is a math heavy class, so I assume some mathematical maturity. We're going to get really good at proofs by induction.
Course website, Ed
Homework
- There are five homeworks.
- They will always be due on a Thursday at 6pm.
- The first one is very short, just to make sure you can set up a proof assistant and know how to use GitHub Classroom, which we're using for homework submissions.
- The other homeworks are more involved. This is an advanced class. That doesn't mean I'm here to try to stump you—just the opposite. But it does mean that homeworks will ask you to do things we haven't covered in lectures. The idea is for the homework to teach you new things, not to practice repeating stuff from lectures. So be sure to start early and ask for help if you get stuck.
Final exam
- There is a final exam.
- It's a take home exam, done in a proof assistant, like the homeworks. It is open book, but if you use any resources other than the course materials, you must cite them appropriately.
- The exam is available for 48 hours during the final exam period, from 6pm December 8 to 6pm December 10. If this timing is an issue for you (for example, maybe you have 4 other exams during this 48 hour window), please let me know as soon as possible and we'll figure something out.
- Unlike the homeworks, the final will focus on things we've covered during lectures, and is mostly aimed at testing your understanding of the foundational ideas we've developed in this class.
Paper readings
- Towards the end of the class we'll read a few seminal programming languages research papers and discuss them during lecture.
- For each paper you'll need to answer some questions before lecture, and also submit some suggested discussion questions for the lecture.
Course project
- This is the only difference between the undergraduate and graduate version of the course.
- If you're in the graduate version, you'll need to do a course project.
- If you're in the undergraduate version of the course, you can opt in to doing the project if you want to. I'll explain how that affects your grade in a moment.
- The project is open-ended; take it as an opportunity to explore something interesting in the realm of programming languages (very broadly construed).
- A few examples (there are more on the course website):
  - Contribute some code to an open-source compiler
  - Build a program synthesis tool for a domain you're interested in
  - Use PL ideas in domains you might not think of as "programming" -- cell biology, education, etc.
  - Model and prove the correctness of some interesting feature of a language you like
- Basically: build something, either in a proof assistant or in code.
- Apply PL ideas to your research!
- You can do the project alone or in a group of up to 3 people. Larger groups should plan to do larger projects.
- Two milestones for the project:
  - Submit a short project proposal by September 29 describing what you want to do. We'll use this as an opportunity to give feedback on project size and direction.
  - Submit a final project report and your code by December 1, describing what you did and evaluating how well it worked. Think of this like writing a small workshop paper -- set up the problem, discuss your approach, and present the results and any future directions.
Grading
For 386L (graduate students):
- Homework: 40% (4% for Homework 0, 9% each for Homeworks 1–4)
- Final exam: 25%
- Course project: 30%
- Paper readings: 5%
For 345H (undergraduates):
- Homework: 65% (5% for Homework 0, 15% each for Homeworks 1–4)
- Final exam: 30%
- Paper readings: 5%
For undergrads, the course project is redeemable against your final grade. That means that if you choose to do the project, we'll compute your final grade using both these grading scales and you'll get whichever grade is higher. In other words, trying the project can't make your grade worse. To choose to do the project, all you need to do is submit the proposal. You can't opt in after the proposal due date.
Please don't take this flexibility as an excuse to flake out on a group project, though. I reserve the right to deny you the redeemable option, or assign you a different project grade to the rest of your group, if this happens.
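The redeemable rule for undergrads can be sketched as a small calculation. The weights below come straight from the two grading scales above; the function and key names are hypothetical, scores are percentages from 0 to 100, and details like rounding or letter-grade cutoffs are not covered here.

```python
# Weights in percentage points, taken from the two grading scales.
GRAD_SCALE = {
    "hw0": 4, "hw1": 9, "hw2": 9, "hw3": 9, "hw4": 9,
    "final": 25, "project": 30, "readings": 5,
}
UNDERGRAD_SCALE = {
    "hw0": 5, "hw1": 15, "hw2": 15, "hw3": 15, "hw4": 15,
    "final": 30, "readings": 5,
}


def weighted_grade(scores, scale):
    """Combine per-item scores (0-100) using the given weights."""
    return sum(weight * scores[item] for item, weight in scale.items()) / 100


def undergrad_grade(scores, submitted_proposal):
    """Undergrads who opt in get the higher of the two scales."""
    without_project = weighted_grade(scores, UNDERGRAD_SCALE)
    if not submitted_proposal:
        return without_project
    return max(without_project, weighted_grade(scores, GRAD_SCALE))


scores = {"hw0": 80, "hw1": 80, "hw2": 80, "hw3": 80, "hw4": 80,
          "final": 80, "readings": 80, "project": 100}
print(undergrad_grade(scores, submitted_proposal=True))   # 86.0: the project helps
scores["project"] = 50
print(undergrad_grade(scores, submitted_proposal=True))   # 80.0: the project can't hurt
```

Because the result is a maximum over both scales, a weak project leaves the grade where it would have been without one.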
Other course policies
I prohibit the concealed carry of firearms in my office.
Academic integrity:
- You've all been here a while and know how this works.
- Anything you submit in this class for credit must be entirely your own work.
- You're welcome to discuss homework with other students, but do not share code, and consider taking a break after discussion before writing it up to make sure you don't accidentally write the same thing.
- You may not discuss the final exam with anyone except the course staff.
- In general, if you're ever in doubt about academic integrity, please just ask me before submitting anything. There is never a penalty for asking permission.