Projects

SEECAT: Speech and Eye-tracking Enabled Computer Assisted Translation

Link (deprecated)



*Last updated: 29th Nov, 2015

1. Automatic Speech Conversation Summarization: We developed an automatic summarization system for spoken conversations. We adapt a hypernym-based template generation algorithm to create abstractive templates, and the system exploits the relationship between transcripts and summaries to select appropriate templates for generating a summary of a conversation. We also propose an alternative, novel architecture that generates templates on-the-fly using an RNN-based encoder-decoder. The system scores higher than previously reported systems on this task. A minimal sketch of the template idea appears after the reference below.

Link to Toolkit: Will be available soon

Reference paper: Stepanov, E., et al. "Automatic Summarization of Call-Center Conversations." ASRU Demo 2015.
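Below is a minimal sketch of the hypernym-based template idea: specific nouns in a summary sentence are generalized to their WordNet hypernyms to form an abstractive template, and a template is later selected by simple word overlap with a new transcript. It assumes NLTK with the WordNet data installed; the function names and the overlap-based selection are illustrative, not the actual system code.

    # Hedged sketch: hypernym-based template generation and simple template
    # selection. Assumes NLTK with the WordNet corpus downloaded.
    from nltk.corpus import wordnet as wn

    def generalize(summary_tokens):
        """Replace nouns with their first WordNet hypernym to create template slots."""
        template = []
        for tok in summary_tokens:
            synsets = wn.synsets(tok, pos=wn.NOUN)
            hypernyms = synsets[0].hypernyms() if synsets else []
            if hypernyms:
                template.append("<" + hypernyms[0].lemmas()[0].name() + ">")
            else:
                template.append(tok)
        return " ".join(template)

    def select_template(transcript_tokens, templates):
        """Pick the stored template with the highest word overlap with the transcript."""
        transcript = set(transcript_tokens)
        return max(templates, key=lambda t: len(transcript & set(t.split())))

    print(generalize("customer asked for a refund on a broken phone".split()))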

2. Speaker Diarization for Heterogeneous News Data: I implemented a speaker diarization module for a massive heterogeneous television news corpus using LIUM and PyCASP, with pre-processing steps that let a user supply their own segmentation information or fall back on automatic segmentation. Another contribution was training a speaker identification module within the network using a technique called ``Constrained Global Clustering'' across video files. This framework will then be combined with the open-source toolkit ``voice-id'' to enable a multi-modal approach that also incorporates information from visual features.
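A rough sketch of the constrained-clustering idea: per-segment speaker embeddings are merged bottom-up across files, except where a constraint forbids the merge (here a hypothetical cannot-link list, e.g. segments that overlap in time within one file). The cosine distance and threshold are assumptions for illustration, not the exact procedure used in the project.

    # Hedged sketch of constrained global clustering over speaker embeddings.
    # `embeddings` holds one vector per speech segment (e.g. i-vectors); the
    # cannot-link constraints and threshold are illustrative assumptions.
    import numpy as np

    def cosine_distance(a, b):
        return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    def constrained_clustering(embeddings, cannot_link, threshold=0.4):
        clusters = [{i} for i in range(len(embeddings))]
        centroid = lambda c: np.mean([embeddings[i] for i in c], axis=0)
        while True:
            best = None
            for a in range(len(clusters)):
                for b in range(a + 1, len(clusters)):
                    # never merge clusters that would violate a cannot-link pair
                    if any((i, j) in cannot_link or (j, i) in cannot_link
                           for i in clusters[a] for j in clusters[b]):
                        continue
                    d = cosine_distance(centroid(clusters[a]), centroid(clusters[b]))
                    if d < threshold and (best is None or d < best[0]):
                        best = (d, a, b)
            if best is None:           # no allowed merge is close enough: stop
                return clusters        # each remaining cluster = one speaker across files
            _, a, b = best
            clusters[a] |= clusters[b]
            del clusters[b]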

3. Exploring the use of vector space models in dependency parsing: We explore methods for using word embeddings in dependency parsing of Hindi, a MoR-FWO (morphologically rich, relatively free word order) language, and show that they not only improve parsing quality but can also serve as a cheap alternative to traditional features that are costly to acquire. We demonstrate that using distributed representations of lexical items instead of features produced by costly tools such as a morphological analyzer yields competitive results. This implies that a monolingual corpus alone can suffice to reach good accuracy for resource-poor languages for which these tools are unavailable. We also explored the importance of these representations for domain adaptation. A sketch of how embeddings can replace such features appears after the reference below.

Reference paper: Tammewar, Aniruddha, et al. "Can Distributed Word Embeddings be an Alternative to Costly Linguistic Features: A Study on Parsing Hindi." SPMRL 2015.
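The sketch below illustrates the general idea of substituting distributed word vectors for features a morphological analyzer would normally provide in a transition-based parser's classifier. The file format, dimensionality, and feature layout are assumptions made for the example, not the paper's exact setup.

    # Hedged sketch: word vectors trained on monolingual text stand in for
    # morphological features in a parser's feature vector.
    import numpy as np

    EMB_DIM = 50
    UNK = np.zeros(EMB_DIM)

    def load_embeddings(path):
        """Read word2vec-style text embeddings (word followed by EMB_DIM floats)."""
        vectors = {}
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                parts = line.rstrip().split()
                if len(parts) == EMB_DIM + 1:
                    vectors[parts[0]] = np.array(parts[1:], dtype=float)
        return vectors

    def configuration_features(stack_top, buffer_front, vectors):
        """Concatenate vectors of two key words of a parser configuration;
        no morphological-analyzer output is required."""
        return np.concatenate([vectors.get(stack_top, UNK),
                               vectors.get(buffer_front, UNK)])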

4. Exploring the use of semantic features from WordNet in dependency parsing: This work explores the importance of semantic class information induced via WordNet for the dependency parsing of Hindi (a morphologically rich language) and English (a positional language).
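A small sketch of one way to induce a semantic-class feature from WordNet: walk a few levels up the hypernym hierarchy from a word's first noun sense and use that ancestor's lemma as a coarse class. This uses NLTK's English WordNet (for Hindi one would substitute Hindi WordNet), and the depth is an arbitrary choice for illustration.

    # Hedged sketch: coarse semantic-class feature from the WordNet hypernym tree.
    from nltk.corpus import wordnet as wn

    def semantic_class(word, depth=3):
        """Return the lemma of the ancestor `depth` levels above the first noun sense."""
        synsets = wn.synsets(word, pos=wn.NOUN)
        if not synsets:
            return "NONE"
        node = synsets[0]
        for _ in range(depth):
            hypernyms = node.hypernyms()
            if not hypernyms:
                break
            node = hypernyms[0]
        return node.lemmas()[0].name()

    # The coarse class (e.g. semantic_class("dog")) can then be added as one more
    # categorical feature alongside POS tags in the parser's feature templates.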

5. Graph Visualization Toolkit

6. Game with a real-life environment: This was done as part of the graphics course. It is a car-racing game with a 3D environment.

Link to code (easy to run): https://github.com/ksingla025/3D-car-world

7. SEECAT: Typing has traditionally been the only input method used by human translators working with computer-assisted translation (CAT) tools. However, speech is a natural communication channel for humans and, in principle, it should be faster and easier than typing on a keyboard. This contribution investigates the integration of automatic speech recognition (ASR) into a CAT workbench, testing its real use by human translators while post-editing machine translation (MT) outputs. This work also explores combining MT with ASR to improve recognition accuracy, in a workbench that integrates eye-tracking functionalities to collect process-oriented information about translators' performance.
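One simple way to combine MT output with ASR, sketched below, is to rescore the ASR n-best list by how much each hypothesis overlaps with the MT output of the segment being post-edited, since the dictated translation is expected to stay close to it. The interpolation weight and overlap score are illustrative assumptions, not SEECAT's actual implementation.

    # Hedged sketch: rescoring ASR n-best hypotheses with the MT output of the
    # current segment. Weights and the overlap measure are illustrative only.
    def rescore(asr_nbest, mt_output, asr_weight=0.7):
        """asr_nbest: list of (hypothesis, asr_score) pairs with scores in [0, 1]."""
        mt_words = set(mt_output.lower().split())

        def combined(hypothesis, asr_score):
            words = set(hypothesis.lower().split())
            overlap = len(words & mt_words) / max(len(words), 1)
            return asr_weight * asr_score + (1 - asr_weight) * overlap

        return max(asr_nbest, key=lambda pair: combined(*pair))

    best = rescore([("the contract was signed yesterday", 0.62),
                    ("the contact was signed yesterday", 0.64)],
                   mt_output="the contract was signed yesterday")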