MARKETING
comScore: online users and purchases. Yong has the data.
DEMOGRAPHIC, CENSUS & STATISTICS
FedStats: a comprehensive source of US statistics and more
DataFerrett, a data mining tool that accesses and manipulates TheDataWeb, a collection of many online US Government datasets.
AI-D (AI for Development)
GENERAL DATA MINING & MACHINE LEARNING
Agnostic Learning vs Prior Knowledge
Visualize Free, with wonderful visualizations of real-world datasets
TEXT MINING & NATURAL LANGUAGE PROCESSING
CORA at UMass, provided by Andrew McCallum
911 Pager Data: includes 50M text messages covering the 24-hour period surrounding September 11, 2001.
AOL Search Data Scandal (mirror sites listed)
Yahoo Webscope: Yong, Wenjun, & Zhongmou have requested selected datasets.
SOCIAL NETWORK
Amazon Web Services (AWS) Public Data Sets, a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications.
GRAPH & NETWORK
TIME SERIES
FINANCIAL
WRDS (Wharton Research Data Services): financial database from the Wharton business school. Subscription required; ask mqf for access.
MULTIMEDIA & SPATIAL
MIT data from Xiaogang Wang: Yong has the complete data.
R-Portal: includes some GPS traces of buses and trucks in Greece.
BIOINFORMATICS