Research

Query Word Labeling using Supervised Machine Learning

FIRE2014 Shared Task on Transliterated Search in collaboration with I.R.S.I. and Microsoft Research

Identify each word in a sentence written in Roman script as belonging to an Indian language (L) or to English (E); if a word belongs to the Indian language (L), transliterate it into its Devanagari equivalent.
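The two steps of the task (label each word, then transliterate the L words) can be illustrated with a minimal sketch. This is not the system described in the working notes: the character-bigram Naive Bayes classifier, the toy word lists in `TRAIN`, and the `TO_DEVANAGARI` lookup table are all assumptions made up for illustration.

```python
import math
from collections import Counter

# Toy training data (hypothetical, not taken from the shared-task corpus).
TRAIN = {
    "E": ["the", "is", "what", "where", "song", "movie", "this", "and"],
    "L": ["kya", "hai", "mera", "naam", "kahan", "gaana", "aur", "nahi"],
}

# Toy transliteration lookup (hypothetical); a real system would learn this mapping.
TO_DEVANAGARI = {"kya": "क्या", "hai": "है", "naam": "नाम"}


def char_bigrams(word):
    """Return the character bigrams of a word, using ^ and $ as boundary markers."""
    padded = "^" + word.lower() + "$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def train(data):
    """Count bigrams per label and collect the vocabulary shared by all labels."""
    counts = {label: Counter() for label in data}
    for label, words in data.items():
        for word in words:
            counts[label].update(char_bigrams(word))
    vocab = set().union(*counts.values())
    return counts, vocab


def classify(word, counts, vocab):
    """Pick the label with the highest add-one-smoothed log-likelihood."""
    best_label, best_score = None, -math.inf
    for label, counter in counts.items():
        total = sum(counter.values()) + len(vocab)
        score = sum(math.log((counter[g] + 1) / total) for g in char_bigrams(word))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def label_query(query, counts, vocab):
    """Return (token, label, output) triples; L tokens are transliterated if known."""
    result = []
    for token in query.split():
        label = classify(token, counts, vocab)
        # Unknown L words are left in Roman script rather than guessed.
        output = TO_DEVANAGARI.get(token.lower(), token) if label == "L" else token
        result.append((token, label, output))
    return result


COUNTS, VOCAB = train(TRAIN)
```

Calling `label_query("the song hai", COUNTS, VOCAB)` returns one triple per word. With sixteen training words the classifier is only a demonstration of the labeling idea; it says nothing about the accuracy of the submitted system.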

Read the entire working notes here.

Normalization based stop-word approach to source code plagiarism detection

FIRE2015 Task on Cross Language Plagiarism detection

We approach this task as a text-document plagiarism task, without considering the formal grammatical structure of programming languages.

We normalize commonly used identifiers to detect pairs of programs that have the same objective. We also find that removing these normalized operations entirely improves the system.
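The idea of normalizing common identifiers, or dropping them like stop words, can be sketched as follows. This is an illustration under assumptions, not the submitted system: the `NORMALIZE` table, the regular-expression tokenizer, and the cosine similarity measure are hypothetical choices.

```python
import math
import re
from collections import Counter

# Hypothetical normalization table: language-specific identifiers mapped to shared tokens.
NORMALIZE = {
    "printf": "PRINT", "println": "PRINT", "print": "PRINT",
    "scanf": "READ", "nextInt": "READ",
    "int": "TYPE", "Integer": "TYPE", "long": "TYPE",
}


def tokens(source, drop_normalized=False):
    """Split source text into word tokens, normalizing (or dropping) known identifiers.

    The program is treated as plain text, so words inside string literals
    (such as the "d" in "%d") are kept as ordinary tokens.
    """
    out = []
    for tok in re.findall(r"[A-Za-z_]\w*", source):
        if tok in NORMALIZE:
            if not drop_normalized:
                out.append(NORMALIZE[tok])
        else:
            out.append(tok)
    return out


def cosine(a, b):
    """Cosine similarity between the token-count vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def similarity(src_a, src_b, drop_normalized=False):
    """Compare two programs after normalization, optionally removing normalized tokens."""
    return cosine(tokens(src_a, drop_normalized), tokens(src_b, drop_normalized))


C_SNIPPET = 'int total = a + b; printf("%d", total);'
JAVA_SNIPPET = "int total = a + b; System.out.println(total);"
```

`similarity(C_SNIPPET, JAVA_SNIPPET)` compares the two snippets with `printf` and `println` mapped to the same `PRINT` token, while `similarity(C_SNIPPET, JAVA_SNIPPET, drop_normalized=True)` removes the normalized tokens altogether, treating them as stop words. Whether removal helps on real data is an empirical question that this toy pair cannot answer.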

Read the entire working notes here.