This is an advanced topics course, which means we won't go into the details of classic NLP problems and methods. Instead, we will study in depth a set of advanced topics in Natural Language Understanding. Here is a list of expectations:
This is a seminar-style graduate course.
For each lecture, you will read about 2 papers and write a commentary in the form of a conference-style review of the papers. You might be asked to peer-review at least one commentary.
In the lecture, a small group of people will present and lead the class discussion on the readings.
A semester-long course project.
Your goal while writing the review here is not to assess the quality of the papers.
Rather, your goal is to construct a rich mental map of existing work, which you will sooner or later be able to use as a foundation for your own research.
For each paper, you should answer the following questions in your commentary:
What’s the paper about? (Summarize the paper and its contributions in your own words.)
What are the strengths of the paper?
What are the weaknesses of the paper?
What’s a research-level question/observation that you have after having read the paper? A “research-level” question is something deeper than “What did the Greek letters on page 4 mean?” or “What’s the baseline in Figure 6?” This could be about a concept that you did not understand or a modeling phenomenon that you couldn't follow. An observation might be something like, “The problem this paper addresses reminds me of the X problem, which is similar in ways A and B, but different in way C. Could this paper’s approach, or something like it, be used to tackle X?”
What are low-level details that you did not understand? (optional)
Commentaries should be at most 500 words (shorter commentaries are fine) and in PDF format.
Commentaries are due at noon on the day before the lecture. Late commentaries will not be accepted.
Additionally, you should post your answers to questions 2, 3, 4 and 5 to Piazza by midnight on the day before the lecture. Posting your answer to Q1 is optional.
You don't need to submit a commentary for the day you present.
Free pass: You will be exempted from having to submit a commentary for one lecture of your choosing. Inform the instructor of your "free pass" day.
Noah Smith's advice on how to write good conference reviews.
In the lecture, one or more people will present and lead the class discussion on the readings. The size of the group presenting the readings will depend on the number of students in the class.
If there are multiple presenters, all of them are collectively responsible for the presentation. This means that they should discuss the papers to be presented with me (if needed) and post the paper titles by midnight on Tuesday of the previous week.
The presenter(s) must email me a draft of their slides by noon on the previous day.
The presenter(s) should post a link to an anonymous feedback form for themselves on Piazza by noon on the previous day.
Do the reading well in advance, so that you have time to 'soak in' the material.
You will be evaluated on your presentation skills among other things. So, prepare well.
For the discussion, have some suggested discussion questions to kick things off. These could be the 'research-level question/observation' that you would have written had you submitted a commentary (even though you don't need to). You do not necessarily need to have the answers!
Check out Simon Peyton Jones's advice on how to give a great research talk.
Michael Ernst has lots of good advice, too.
Some more advice on good presentations
Patrick Winston's useful advice on how to give a good talk/lecture.
You could work in a small group or individually.
Towards the end of the course, you will be expected to present your project as a report and/or a short in-class presentation.
Short summary of what you are planning to work on. To be presented as slides.
List names of all project members.
Only one presentation needed per group.
Questions to answer in your summary (if you were to write a paper on your project, these are the questions you would answer in the introduction):
What is the problem you are solving? (if possible, give an example of expected input and output)
Who cares about this problem (and its solution)?
Why is this a hard problem to solve? What are the challenges?
Which dataset(s) will you use?
How will you evaluate?
Prepare slides for your presentation
Questions to answer in your slides:
What is the problem you are solving? (give an example of expected input and output)
Who cares about this problem (and its solution)?
Why is this a hard problem to solve? What are the challenges?
What has been done so far for solving such a problem?
What dataset are you using?
How are you evaluating?
How are you solving the problem?
What baselines will you compare with?
Your slides should include:
Introduction
1-slide problem definition (the input and output to your system with examples).
1 or 2-slide dataset details (stats including but not limited to the number of instances, train/test splits, etc.). Also include (annotated) examples from your dataset, indicating what part of the dataset you are using.
Your primary method. The description should include:
a slide with the key insight behind your method in 1 or 2 sentences (no math/figures please!).
an overview of the method.
The details of the method. If you have long details like prompts, include them in main/backup slides so that we can read the prompts if needed.
Experiments and Results. Please keep the following points in mind:
If you use non-standard evaluation metrics, please define them.
You can also present your reasonable but not-so-successful attempts as baselines.
Please make sure that the fonts are large enough.
While describing an experiment, explain the structure of the tables/figures, i.e. explain what the rows, columns, axes, and legend mean.
Before describing an experiment, please articulate the 1-sentence research question that the experiment is trying to answer.
After describing an experiment, summarize the conclusion of that experiment in 1-2 bullets. This would answer the research question you posed.
Conclusion of your project. Include the key takeaway message(s).
Commentary on readings: 15%
Participation in class discussion: 20%
Presentations: 20%
Course project: 45%
Amazon customer reviews: https://s3.amazonaws.com/amazon-reviews-pds/readme.html
Yelp reviews: https://www.yelp.com/dataset (samples)
The RocStories corpus: https://www.cs.rochester.edu/nlp/rocstories/
Sentiment analysis on mixed languages: https://competitions.codalab.org/competitions/20789#learn_the_details
Codalab might have interesting problems: https://competitions.codalab.org/competitions/
Also check out the "Resources/Datasets" heading under "Topics"
** Special thanks to Lindsey Kuper for letting me borrow content and advice from her course.