LIN386/CS395T Home

Instructor: Jason Baldridge   Time: T/Th 2-3:30   Location: Parlin 101

The field of computational linguistics has undergone a major shift over the last two decades toward statistical methods. For some tasks, such as language modeling, there is a wealth of data available for training models, but for many tasks, the performance of models is severely limited by the amount of relevant labeled training material. Semisupervised learning seeks to combine small amounts of annotated data with (possibly) large amounts of raw text to improve performance over using the annotated data alone.

This class will look at the theory and practice of semisupervised learning in the context of computational linguistics. The main goal will be to provide a high-level view of machine learning methods as applied to computational linguistics tasks, with an aim toward understanding when they should be expected to work well for a task and how to apply them. The focus will be on practical and applied concerns in natural language processing. Even so, what is linguistics if not the search for concise characterizations of linguistic patterns? Unsupervised and semisupervised machine learning seek to learn good models using less human-guided input, and we will consider the possible ramifications for core linguistic concerns, such as acquiring syntactic structure.

Course Notes

  • 1st International Academic Conference on Social Broadcasting Technology March 13 & 14, 2011 - Austin, Texas Andrew Whinston (Professor, McCombs Business School and CS dept) is chairing a new conference to be held during South-by-Southwest in Austin next March. Many of the projects in ...
    Posted Oct 11, 2010, 8:10 AM by Jason Baldridge
  • Drop-in CS talk by L. Venkata Subramaniam This talk is an hour before our class meets on Thursday, so it would be possible to attend. Type of Talk: UTCS Colloquium/AI Speaker/Affiliation: L. Venkata ...
    Posted Oct 11, 2010, 7:52 AM by Jason Baldridge
  • LDC data at UT Austin's library Our subscriptions to the Linguistic Data Consortium mean that folks at UT have access to many of the standard datasets used in computational linguistics. Here's the list ...
    Posted Sep 22, 2010, 12:33 PM by Jason Baldridge
  • UTCS talk by Philip Resnik on Sep. 30 @ 2pm Our class will be attending the talk by Philip Resnik on September 30 at 2pm. Details below. Type of Talk: UTCS Colloquium Speaker/Affiliation: Philip Resnik/University of Maryland Date ...
    Posted Sep 12, 2010, 8:58 PM by Jason Baldridge
  • ACL Anthology If you want to find computational linguistics papers, here's the place to start:
    Posted Sep 3, 2010, 8:20 AM by Jason Baldridge