From 2010 to 2011 I was on leave from UW-Madison, working as Chief Scientist of Kosmix, a startup in social media analytics.
While at Kosmix I was involved in a number of projects that built and used Web-scale knowledge bases, especially for information extraction and integration, entity matching, and social media analytics. Parts of these projects are described in the following papers:
Social Media Analytics: the Kosmix Story, with many authors. IEEE Data Engineering Bulletin, Sept 2013.
Entity Extraction, Linking, Classification, and Tagging for Social Media: A Wikipedia-Based Approach, A. Gattani, D. Lamba, N. Garera, M. Tiwari, X. Chai, S. Das, S. Subramaniam, A. Rajaraman, V. Harinarayan, and A. Doan. VLDB-13, industrial paper. slides
Building, Maintaining, and Using Knowledge Bases: A Report from the Trenches, O. Deshpande, D. Lamba, M. Tourn, S. Das, S. Subramaniam, A. Rajaraman, V. Harinarayan, A. Doan. SIGMOD-13, industrial paper. slides
Muppet: MapReduce-Style Processing of Fast Data, W. Lam, L. Liu, S. Prasad, A. Rajaraman, Z. Vacheri, A. Doan. VLDB-12, industrial paper. slides
I also worked on event detection and monitoring for social media. The following talk describes the work at Kosmix at a high level: Social Media, Data Integration, and Human Computation.
Kosmix was acquired by Walmart in 2011 and turned into WalmartLabs, the research and development lab for e-commerce at Walmart. From 2011 to 2014 I worked as Chief Scientist of WalmartLabs. Some of this work, along with collaboration efforts from 2014 until now, is described here. These efforts involve building product knowledge bases, product matching, information extraction, data cleaning, and crowdsourcing.