Seminar Announcement: Monday April 18, 2011 - Ves Stoyanov: Modeling Natural Language with Minimum-Risk Trained CRFs

Post date: Apr 16, 2011 5:17:10 PM

There is a seminar on Monday that students in our class will find interesting. It's by our own course instructor, Ves! He's giving a talk on some of his recent research on approximate CRFs, which is all very relevant to our course content. Here are the full details:

Monday, April 18, 2011

810 Wyman Park Drive, Baltimore, MD 21211

10:00 am, North Conference Room

Title: Modeling Natural Language with Minimum-Risk Trained CRFs

Abstract: Conditional Random Fields (CRFs) are a suitable formalism for modeling the dependencies in natural language. It is well known how to train CRFs for which exact inference and decoding are tractable. Some NLP phenomena, however, require CRFs for which exact inference is intractable, necessitating approximations such as variational inference. I will present three such NLP problems that can be modeled naturally with CRFs with loopy structure: (i) modeling congressional votes, (ii) information extraction from semi-structured text, and (iii) collective multi-label classification.
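For students who want a concrete picture of what "loopy structure" means here, the sketch below runs sum-product loopy belief propagation on a toy 3-variable binary cycle, the smallest graph where the exact chain/tree inference algorithms from class no longer apply directly, and compares its approximate marginals to brute-force exact ones. All potentials and numbers are invented for illustration and are not from the talk.

```python
import itertools

# Toy cycle MRF: 3 binary variables connected in a ring (not a chain/tree).
PAIRS = [(0, 1), (1, 2), (2, 0)]
PSI = [[2.0, 1.0], [1.0, 2.0]]              # pairwise potential: favor agreement
PHI = [[3.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # unary: variable 0 biased toward state 0

def incoming(msgs, var, exclude):
    """Product of messages into `var`, skipping the message from `exclude`."""
    prod = [1.0, 1.0]
    for (src, dst), m in msgs.items():
        if dst == var and src != exclude:
            prod = [prod[x] * m[x] for x in (0, 1)]
    return prod

def loopy_bp(iters=50):
    """Approximate marginals via sum-product message passing on the cycle."""
    edges = [(i, j) for a, b in PAIRS for (i, j) in ((a, b), (b, a))]
    msgs = {e: [1.0, 1.0] for e in edges}
    for _ in range(iters):
        new = {}
        for (i, j) in edges:
            inc = incoming(msgs, i, exclude=j)
            out = [sum(PHI[i][xi] * PSI[xi][xj] * inc[xi] for xi in (0, 1))
                   for xj in (0, 1)]
            z = sum(out)
            new[(i, j)] = [o / z for o in out]  # normalize for numerical stability
        msgs = new
    beliefs = []
    for v in range(3):
        inc = incoming(msgs, v, exclude=None)
        b = [PHI[v][x] * inc[x] for x in (0, 1)]
        z = sum(b)
        beliefs.append([x / z for x in b])
    return beliefs

def exact_marginals():
    """Brute force over all 2**3 assignments -- feasible only at toy scale."""
    score = {}
    for x in itertools.product((0, 1), repeat=3):
        p = 1.0
        for v in range(3):
            p *= PHI[v][x[v]]
        for (i, j) in PAIRS:
            p *= PSI[x[i]][x[j]]
        score[x] = p
    z = sum(score.values())
    marg = [[0.0, 0.0] for _ in range(3)]
    for x, p in score.items():
        for v in range(3):
            marg[v][x[v]] += p / z
    return marg
```

On this example, loopy BP converges and its marginals land close to the exact ones (slightly overconfident, a known tendency of BP on loops), while on a chain the same message-passing would be exact.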

I will argue that when a CRF model will be used with approximations, it should be trained to minimize the risk given the approximate inference and decoding methods. I will present a back-propagation algorithm to compute gradients of arbitrary losses, and discuss how these gradients can be used by a local optimizer to learn parameters. I will present results of an experimental study using synthetic data showing that models trained using our approach outperform the corresponding models trained by an approximation to maximum likelihood. In addition, I will present results on the three NLP problems mentioned above showing that: (i) modeling natural language with CRFs that require approximate inference or decoding can lead to improved performance, and (ii) minimum-risk training learns more accurate models in this setting as well.
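To make the training idea concrete before the talk: "minimize the risk given the approximate inference method" means treating the fixed, approximate inference procedure as part of the model and differentiating the task loss through it. The toy sketch below does this for a 2-variable model whose inference is a fixed-iteration mean-field loop; I use finite differences in place of the back-propagation algorithm the abstract describes, and the model, loss, and step size are invented for illustration.

```python
import math

def mean_field(theta, iters=5):
    """Approximate inference: a fixed 5-iteration mean-field update for two
    coupled binary variables. Returns q1, q2 = approximate P(x_i = 1)."""
    u1, u2, w = theta           # two unary parameters and one pairwise weight
    q1 = q2 = 0.5
    for _ in range(iters):
        q1 = 1.0 / (1.0 + math.exp(-(u1 + w * q2)))
        q2 = 1.0 / (1.0 + math.exp(-(u2 + w * q1)))
    return q1, q2

def risk(theta, gold=(1.0, 0.0)):
    """Task loss evaluated on the *approximate* marginals -- this is the
    quantity minimum-risk training optimizes, not the likelihood."""
    q1, q2 = mean_field(theta)
    return (q1 - gold[0]) ** 2 + (q2 - gold[1]) ** 2

def grad(theta, eps=1e-5):
    """Finite-difference gradient of the risk *through* the inference loop
    (a crude stand-in for back-propagating through the approximation)."""
    g = []
    for i in range(len(theta)):
        up = list(theta); up[i] += eps
        dn = list(theta); dn[i] -= eps
        g.append((risk(up) - risk(dn)) / (2 * eps))
    return g

# One minimum-risk gradient step with an arbitrary step size of 0.5:
theta = [0.0, 0.0, 0.0]
theta = [t - 0.5 * gi for t, gi in zip(theta, grad(theta))]
```

Because the loss is measured on what the approximate inference actually outputs, the learned parameters compensate for the approximation's quirks, which is the intuition behind the synthetic-data results the abstract mentions.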

Finally, I will put this work in the context of a probabilistic approach to structured relational learning (SRL) that we are developing. I will describe how this approach will be used to model problems such as entity linking, label propagation and social network modeling.