| When an email collection grows into a large pile of documents, we often
need to organize and analyze it to learn what it contains. Email clients
such as Thunderbird or Microsoft Outlook provide tools for basic
functions: sending, retrieving, and organizing emails, and detecting
spam. Those tools are not designed for analyzing large email
collections. For that, we need a dedicated analysis tool that provides
information about the collection. By analyzing emails, we can learn who
communicates with us, which groups form (and how many) based on email
frequency, what information participants exchange most often, and which
topics are discussed most.
Based on these requirements and problems, I have developed a tool with many features designed specifically for analyzing large document collections. I call this application BuddyMiner. The first version of BuddyMiner can only read the mbox file format (the format in which Mozilla mail clients store email). BuddyMiner helps us find patterns of information, cluster emails automatically, produce statistical charts of the email collection, perform information retrieval over the collection, and so on. BuddyMiner's features are built on text-mining clustering, information retrieval, and information extraction theory. This approach makes BuddyMiner a specialized application that can help any organization find hidden patterns of information in its email collection. BuddyMiner is designed to analyze Indonesian and English documents. Picture 1 illustrates the main interface of BuddyMiner. |
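
To make the idea of "who communicates with us" concrete, here is a minimal sketch of counting senders in an mbox file with Python's standard `mailbox` module. It is not BuddyMiner's code, and nothing here describes how BuddyMiner is implemented. The function name `count_senders` is a hypothetical helper chosen for illustration.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path):
    """Count how many messages each sender address has in an mbox file.

    Illustrative helper only; it is not part of BuddyMiner.
    """
    counts = Counter()
    # create=False: raise an error instead of silently creating a missing file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits "Name <addr>" into (name, addr);
            # addr is an empty string when the header is missing or unparsable.
            _, address = parseaddr(str(message.get("From", "")))
            if address:
                # Lowercase so "A@example.com" and "a@example.com" count together.
                counts[address.lower()] += 1
    finally:
        box.close()
    return counts
```

Calling `count_senders("Inbox").most_common(10)` on an mbox file named `Inbox` (the file name is an assumption) would list the ten most frequent senders, which is the simplest form of the frequency statistics described above.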
