Resources

LaTeX Templates and Publisher Policies

Mailing Lists

  • DDLBETA: An invitation-only mailing list for discussions of text classification, text mining, and related issues.
  • SIG-IRList: A moderated, regular source of IR news.
  • DBWorld: A mailing list for messages of interest to the database research community.
  • KDNuggets: A bi-weekly electronic newsletter focusing on data mining and knowledge discovery.

Software Documents and Manuals

Programming Guides

Suggestion Collections

Crawlers

  • Heritrix: A web crawler written in Java.
  • Larbin: Another web crawler, written in C++.
  • Sitemap: A crawling protocol supported by Google, Yahoo! and MSN that goes beyond robots.txt by letting a site list the URLs it wants crawled.
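
As a rough illustration of how a crawler might combine the two conventions mentioned above, the sketch below reads the URLs listed in a Sitemap file and keeps only those that robots.txt permits. It uses Python's standard library only. The helper names (sitemap_urls, allowed_urls), the "ExampleBot" user agent, and the example.com data are assumptions made up for this example; they are not taken from Heritrix, Larbin, or any search engine's implementation.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# XML namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values listed in a Sitemap <urlset> document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def allowed_urls(robots_txt, user_agent, urls):
    """Keep only the URLs that the given robots.txt text lets user_agent fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]


# Hypothetical inputs, inlined so the sketch runs without network access.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/private/report.html</loc></url>
</urlset>"""

EXAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    listed = sitemap_urls(EXAMPLE_SITEMAP)
    for url in allowed_urls(EXAMPLE_ROBOTS, "ExampleBot", listed):
        print(url)
```

Here the Sitemap advertises two pages, while robots.txt excludes everything under /private/, so only the site root remains in the crawl list. A real crawler would fetch both files over HTTP and would also need to handle Sitemap index files, which this sketch leaves out.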

"It is a very sad thing that nowadays there is so little useless information." -- Oscar Wilde
Back to Shaozhi's Homepage