Bibliography of Related Work in NLI

[See below to download full .bibtex file]

    • Ahn, C. S. (2011). Automatically Detecting Authors' Native Language. Master's thesis, Naval Postgraduate School, Monterey, CA. [pdf]

    • Al-Rfou, R. (2012). Detecting English Writing Styles for Non-native Speakers. [pdf]

    • Bergsma, S., Post, M., and Yarowsky, D. (2012). Stylometric analysis of scientific articles. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 327–337, Montréal, Canada. Association for Computational Linguistics. [pdf]

    • Bestgen, Y., Granger, S., and Thewissen, J. (2012). Error Patterns and Automatic L1 Identification. In Jarvis, S. and Crossley, S. A., editors, Approaching Language Transfer through Text Classification, pages 127-153. Multilingual Matters. [book link] [download pre-print]

    • Blanchard, D., Tetreault, J., Higgins, D., Cahill, A., and Chodorow, M. (To Appear in 2013). TOEFL11: A Corpus of Non-Native English. Technical report, Educational Testing Service.

    • Brooke, J. and Hirst, G. (2011). Native language detection with ‘cheap’ learner corpora. In Conference of Learner Corpus Research (LCR2011), Louvain-la-Neuve, Belgium. Presses universitaires de Louvain. [pdf]

    • Brooke, J. and Hirst, G. (2012a). Measuring interlanguage: Native language identification with L1-influence metrics. In Calzolari, N., Choukri, K., Declerck, T., Dogan, M. U., Maegaard, B., Mariani, J., Odijk, J., and Piperidis, S., editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), pages 779–784, Istanbul, Turkey. [pdf]

    • Brooke, J. and Hirst, G. (2012b). Robust, Lexicalized Native Language Identification. In Proceedings of COLING 2012, pages 391-408, Mumbai, India. The COLING 2012 Organizing Committee. [pdf]

    • Bykh, S. and Meurers, D. (2012). Native Language Identification using Recurring n-grams - Investigating Abstraction and Domain Dependence. In Proceedings of COLING 2012, pages 425-440, Mumbai, India. The COLING 2012 Organizing Committee. [pdf]

    • Crossley, S. A. and McNamara, D. (2012). Detecting the First Language of Second Language Writers Using Automated Indices of Cohesion, Lexical Sophistication, Syntactic Complexity and Conceptual Knowledge. In Jarvis, S. and Crossley, S. A., editors, Approaching Language Transfer through Text Classification, pages 106-126. Multilingual Matters. [book link] [download pre-print]

    • Estival, D., Gaustad, T., Pham, S. B., Radford, W., and Hutchinson, B. (2007a). Author profiling for English emails. In Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics, pages 263–272, Melbourne, Australia. [pdf]

    • Estival, D., Gaustad, T., Pham, S. B., Radford, W., and Hutchinson, B. (2007b). TAT: An Author Profiling Tool with Application to Arabic Emails. In Proceedings of the Australasian Language Technology Workshop 2007, pages 21–30, Melbourne, Australia. [pdf]

    • Golcher, F. and Reznicek, M. (2011). Stylometry and the interplay of topic and L1 in the different annotation layers in the FALKO corpus. QITL-4 Proceedings of Quantitative Investigations in Theoretical Linguistics, 4:29–34. [pdf]

    • van Halteren, H. (2008). Source Language Markers in EUROPARL Translations. In Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pages 937–944, Manchester, UK. Coling 2008 Organizing Committee. [pdf]

    • van Halteren, H. and Oostdijk, N. (2004). Linguistic profiling of texts for the purpose of language verification. In Proceedings of Coling 2004, pages 966–972, Geneva, Switzerland. COLING. [pdf]

    • Jarvis, S., Bestgen, Y., Crossley, S. A., Granger, S., Paquot, M., Thewissen, J., and McNamara, D. (2012a). The Comparative and Combined Contributions of n-Grams, Coh-Metrix Indices and Error Types in the L1 Classification of Learner Texts. In Jarvis, S. and Crossley, S. A., editors, Approaching Language Transfer through Text Classification, pages 154-177. Multilingual Matters. [book link] [download pre-print]

    • Jarvis, S., Castañeda-Jiménez, G., and Nielsen, R. (2012b). Detecting L2 Writers' L1s on the Basis of Their Lexical Styles. In Jarvis, S. and Crossley, S. A., editors, Approaching Language Transfer through Text Classification, pages 34-70. Multilingual Matters. [book link] [download pre-print]

    • Jarvis, S. and Crossley, S., editors (2012). Approaching Language Transfer Through Text Classification: Explorations in the Detection-based Approach, volume 64. Multilingual Matters Limited, Bristol, UK. [book link]

    • Jarvis, S. and Paquot, M. (2012). Exploring the Role of n-Grams in L1 Identification. In Jarvis, S. and Crossley, S. A., editors, Approaching Language Transfer through Text Classification, pages 71-105. Multilingual Matters. [book link] [download pre-print]

    • Kochmar, E. (2011). Identification of a writer’s native language by error analysis. Master’s thesis, University of Cambridge. [pdf]

    • Koppel, M., Schler, J., and Zigdon, K. (2005b). Determining an author’s native language by mining a text for errors. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 624–628, Chicago, IL. ACM. [pdf]

    • Koppel, M., Schler, J., and Argamon, S. (2008). Computational methods in authorship attribution. Journal of the American Society for Information Science and Technology, 60(1):9–26. [abstract]

    • Swanson, B. and Charniak, E. (2012). Native language detection with tree substitution grammars. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 193–197, Jeju Island, Korea. Association for Computational Linguistics. [pdf]

    • Tetreault, J., Blanchard, D., Cahill, A., and Chodorow, M. (2012). Native tongues, lost and found: Resources and empirical evaluations in native language identification. In Proceedings of COLING 2012, pages 2585-2602, Mumbai, India. The COLING 2012 Organizing Committee. [pdf]

    • Tetreault, J., Blanchard, D., and Cahill, A. (2013). A report on the first native language identification shared task. In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, Atlanta, GA, USA. Association for Computational Linguistics.

    • Tofighi, P., Köse, C., and Rouka, L. (2012). Author’s native language identification from web-based texts. International Journal of Computer and Communication Engineering, 1(1):47-50. [pdf]

    • Tomokiyo, L. M. and Jones, R. (2001). You’re not from ’round here, are you?: Naive Bayes detection of non-native utterance text. In Proceedings of the Second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies, pages 1–8, Pittsburgh, PA. Association for Computational Linguistics. [pdf]

    • Torney, R., Vamplew, P., and Yearwood, J. (2012). Using psycholinguistic features for profiling first language of authors. Journal of the American Society for Information Science and Technology, 63(6):1256-1269. [abstract]

    • Tsur, O. and Rappoport, A. (2007). Using Classifier Features for Studying the Effect of Native Language on the Choice of Written Second Language Words. In Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition, pages 9–16, Prague, Czech Republic. Association for Computational Linguistics. [pdf]

    • Wong, S.-M. J. and Dras, M. (2009). Contrastive Analysis and Native Language Identification. In Proceedings of the Australasian Language Technology Association Workshop 2009, pages 53–61, Sydney, Australia. [pdf]

    • Wong, S.-M. J. and Dras, M. (2011). Exploiting Parse Structures for Native Language Identification. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1600–1610, Edinburgh, Scotland, UK. Association for Computational Linguistics. [pdf]

    • Wong, S.-M. J., Dras, M., and Johnson, M. (2011). Topic Modeling for Native Language Identification. In Proceedings of the Australasian Language Technology Association Workshop 2011, pages 115–124, Canberra, Australia. [pdf]

    • Wong, S.-M. J., Dras, M., and Johnson, M. (2012). Exploring Adaptor Grammars for Native Language Identification. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 699–709, Jeju Island, Korea. Association for Computational Linguistics. [pdf]