In September 2016 I joined Digital Operatives, in Arlington, Virginia, to help with the NLP needs of various cybersecurity-related projects.
From September 2014 to September 2016 I was a Senior Research Scientist in the London group of Thomson Reuters Corporate Research and Development. I worked on a parser, and on methods for dealing with name ambiguity when screening individuals and companies.
I worked as a research scientist in Ron Kaplan's NLU group at Nuance, Sunnyvale, from March 1, 2013 to August 2014. Before that I was a research scientist at the Educational Testing Service, and before that I taught and did research in universities, at Sussex, Edinburgh and Ohio State. I am proud of the efforts of a wonderful group of Ph.D., Master's and senior thesis students with whom I have worked over the years. It is very gratifying to see the impact that they are having in research, development and service. Mirella Lapata, Sabine Schulte im Walde and Anna Feldman have long-term academic positions in Scotland, Germany and the USA, and most of the others have research-oriented positions in government or industry.
My background is in Computational Linguistics, especially Statistical NLP and Computational Semantics. I studied at the University of Sussex, in the Department of Experimental Psychology, where my D.Phil advisor was Steve Isard. He taught me to focus on the work, and not to worry about what the department is called.
I enjoy writing my own software. It is personally gratifying to build a complete system that is competitive with the best. A short answer system that I built at the Educational Testing Service is the basis for a portion of a new version of the Praxis test that many states use for assessment of teacher training and professional preparation. The parser that I built at Thomson Reuters is likely to find its way into production as part of a system that legal professionals (and others) can use to query TR's databases. Dezhao Song is the lead author on our papers about early versions of this system.
My favorite recent achievement is a system that came 10th out of 156 entries in the ASAP essay scoring competition. I built it using Python, scikit-learn and pandas. The main point was to practice for ETS's participation in the ASAP short answer scoring competition, for which I was part of ETS's team. We placed 5th, out of the money. My Kaggle name is joshnk. I also recently came 3rd out of 91 in the Twitter personality prediction competition. See code. I also have a didactic chart parser with versions in Java and Python, and some code written for my own amusement that breaks simple ciphers, including Playfair. Stephen Boxwell and Dennis Mehay wrote a CCG-based semantic role labeler that formed the basis of some of our publications.
My textbook on Language and Computers, written with Markus Dickinson and Detmar Meurers, came out in November 2012. It's designed for a general audience, so it tries to be as gentle as possible in explaining technical ideas. The cover is going to have these kind words on it:
We hope that students and teachers will find the book useful.