tremit, Trend mining tool

Goal:

Our trend mining tool, tremit, is being developed with three goals in mind. First and foremost, it is meant to be a flexible and extensible tool for mining trends in different data sets, with results that are sufficiently interpretable. Our current database consists mainly of web document corpora, but we are working on extending the tool to interpret WiFi data and mobile user data as well. The tool allows us to experiment with and test different mining algorithms, which provides valuable feedback for our research on trend mining. Our second goal is to develop a general trend mining tool that offers various built-in functions, so that several analysis methods can be applied to different data sets with a temporal aspect.

Thirdly, we want to give interested people an easier introduction to trend mining research by providing a way to test and experiment with known results as implemented in tremit.

tremit is an open source project. If you want to support its development in any way, we would be very interested in hearing from you!


Figure A.1 The GUI of tremit


Users:
  1. Anyone interested in trend mining
  2. Researchers in related fields such as temporal data mining and situation recognition
  3. Those involved with the analysis of temporal data in general

Use cases and data types:
  •  Web data (e.g. XML files, as in the case of web documents): a trend in web documents is an emergent topic area that increases in interest and utility over time*. By mining trends in web documents we aim to find the most significant topic(s) emerging within a given time frame in the data set and to provide information about them. Here the focus is on so-called long-term trends, for which we offer an ex post approach.
  •  Ambient signal data (WiFi data as obtained through CSI): this use case is based on our experiments with situation recognition. For this data we look for changes in the signal over a short period of time in order to support situation recognition.
  •  Mobile user data: this use case aims at finding trends in data obtained from mobile users.
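To make the idea of an emerging topic in web documents more concrete, here is a minimal Python sketch of an ex post comparison between two time periods. It is not tremit's algorithm: the function names, the period labels, and the tiny sample corpus are all hypothetical, and a plain term-share comparison merely stands in for a real trend mining method.

```python
from collections import Counter, defaultdict

def term_counts_by_period(docs):
    """docs: iterable of (period, list_of_terms). Returns {period: Counter}."""
    counts = defaultdict(Counter)
    for period, terms in docs:
        counts[period].update(terms)
    return dict(counts)

def rising_terms(counts, early, late):
    """Return terms whose share of all terms grew from `early` to `late`,
    ordered from the largest gain to the smallest."""
    early_total = sum(counts[early].values()) or 1
    late_total = sum(counts[late].values()) or 1
    growth = {
        term: counts[late][term] / late_total - counts[early][term] / early_total
        for term in counts[late]
    }
    return sorted((t for t, g in growth.items() if g > 0), key=lambda t: -growth[t])

# Made-up sample data: (period, already tokenized document)
sample_docs = [
    ("2010", ["bank", "kredit", "bank"]),
    ("2010", ["aktie", "bank"]),
    ("2011", ["solar", "aktie", "solar"]),
    ("2011", ["solar", "bank"]),
]

print(rising_terms(term_counts_by_period(sample_docs), "2010", "2011"))
```

Because the comparison looks back over periods that are already complete, it is ex post in the sense used above; detecting short-term changes in signal data would need a different approach.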

GUI:

                                                        
Figure A.2 The GUI of tremit: explanation


  • (1) The "terminal" – this is where tremit keeps you updated about what it is doing and what you have been doing, e.g. changing the filter, starting an algorithm, or connecting to a database. This is also where part of the "Topic Modeling" results are displayed upon successful completion of the algorithm. This field acts as tremit’s main communication channel to the user; it can be scrolled and copied from.
  • (2) This is the visualization area. The "Default Visualization Tab" (currently unavailable) displays the colorful results of some of the algorithms, while the "Browser" tabs display the rendered HTML output of the "K-Means Clustering" and "Topic Modeling" algorithms upon their successful completion, as shown in the screenshot. This output can be exported [more on that in (7)].
  • (3) This is the configuration area. Here, after loading your data set into tremit using (4), you select the desired analysis and then choose its parameters. Currently, the only fully available algorithms are Topic Modeling and K-Means Clustering. After you have selected the parameters, press "Start" to run the algorithm. tremit will use the terminal (1) to keep you informed of what is happening. Keep in mind that tremit may (and likely will) seem unresponsive after you click on "Start". Don’t worry: the algorithm is being executed, and you will be notified if any errors occur.
  • (4) This is the DB Connection area. Enter the details of your DB Connection here to load your data set into tremit. Only then can tremit actually execute any of the algorithms. You can define a filter on your DB query using (5) if you only want part of your data set to be loaded into tremit. Keep in mind that each time you connect to the database the data set currently stored in tremit will be overwritten.
  • (5) This is the filter area. Click on the button to configure the conditions of your DB query in the new window that appears. When you are done, click "Accept" to apply that filter, "Discard" to return without changing the current filter, or "Clear Filter" to delete the current filter. If you click "Accept", the terminal (1) will display the resulting SQL query condition that will be used to access your DB, so feel free to take a look at it.
  • (6) This is the filter display area. It shows the current details of your chosen filter.
  • (7) This is the export area. Clicking on this button opens the export window. Here you can choose to export the most recent result(s) of the Topic Modeling algorithm as a ".txt" file to a location of your choice. If there was no result, an empty document is created. The same applies to the most recent result of K-Means Clustering, which is exported as an ".html" document. Of course, you can also just copy things from the terminal into a file, but the export function may be more convenient, especially for other, longer output formats such as ".rdf" or ".csv" (currently unavailable).
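The filter in (5) is turned into an SQL query condition. The following Python sketch shows one way such a condition could be assembled from a list of filter entries. It is an illustration only and not tremit's implementation: the function name, the column names, the table name "documents", and the set of supported operators are assumptions, and an in-memory SQLite database stands in for your DB.

```python
import sqlite3

ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">=", "LIKE"}

def build_where(conditions):
    """conditions: list of (column, operator, value) tuples.
    Returns (sql_condition, params) using "?" placeholders for the values."""
    clauses, params = [], []
    for column, operator, value in conditions:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        if not column.isidentifier():
            raise ValueError(f"invalid column name: {column}")
        clauses.append(f"{column} {operator} ?")
        params.append(value)
    return " AND ".join(clauses), params

# Hypothetical table and data, for demonstration only
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE documents (title TEXT, pub_date TEXT)")
connection.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("DAX report", "2010-05-01"), ("Other news", "2009-01-15")],
)

condition, params = build_where([("pub_date", ">=", "2010-01-01"), ("title", "LIKE", "%DAX%")])
rows = connection.execute(f"SELECT title FROM documents WHERE {condition}", params).fetchall()
print(condition)  # the kind of condition a terminal could echo back
print(rows)
```

Passing the values as parameters rather than pasting them into the query string keeps user input from changing the structure of the query.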

Test data:

Our current test corpus consists of around 40,000 web documents (DAX News) in XML format. Each file consists of tags that tremit can take into account separately (e.g. analysis by Abstract only, Content only, Title only, etc.). The analysis of web documents is one of our use cases (see "Use cases and data types" above), and we provide a small sample of this data as a test data set with the tremit download so that you can get started (see Demo below). This sample consists of around 900 document files, with file sizes distributed as evenly as possible. The documents are in German and have been pre-processed with part-of-speech analysis, stemming, and named-entity recognition.
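As a rough illustration of analysing a document by individual tags, the Python sketch below reads selected fields from one XML document with the standard library. The XML layout is an assumption: the element names Title, Abstract and Content mirror the analysis options mentioned above, but the actual structure of the corpus files may differ, and the sample text is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the real corpus files may be structured differently
SAMPLE_XML = """<document>
  <Title>DAX schliesst im Plus</Title>
  <Abstract>Kurze Zusammenfassung.</Abstract>
  <Content>Der vollstaendige Text des Artikels.</Content>
</document>"""

def extract_fields(xml_text, fields=("Title", "Abstract", "Content")):
    """Return {field: text} for the requested child elements of the root.
    Missing or empty elements yield an empty string."""
    root = ET.fromstring(xml_text)
    result = {}
    for name in fields:
        element = root.find(name)
        result[name] = (element.text or "").strip() if element is not None else ""
    return result

# Analysis by "Title only" would then look at just this field
print(extract_fields(SAMPLE_XML, fields=("Title",)))
```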

In addition to the document corpus, we have an experimental WiFi data corpus obtained through our research on situation recognition.


Demo:

The demo download will be available soon!
Meanwhile, see our screencast of the running tremit version 1.0 here:

Screencast of tremit, trend mining tool. from xtrmtggng on Vimeo.