Implementation Notes

The NubFinder classifier is trained on data from the International Survey On Emotion Antecedents And Reactions (ISEAR) project. In this project, “Student respondents, both psychologists and non-psychologists, were asked to report situations in which they had experienced all of 7 major emotions (joy, fear, anger, sadness, disgust, shame, and guilt). In each case, the questions covered the way they had appraised the situation and how they reacted. The final data set thus contained reports on seven emotions each by close to 3000 respondents in 37 countries on all 5 continents.”


NubFinder currently works only with the first bunch of messages that the free-of-charge Twitter Search API returns.
 
NubTrend processes every new, non-empty bunch of messages *and, on every subsequent bunch, re-classifies all previous bunches as well* until the time period you requested expires. Processing each 'message bunch' generates a new Emotion Trend Chart for the given request.
As soon as the first chart is ready, NubTrend shows a green "ready" status in the Web GUI. This does not mean, though, that processing of the request is complete! NubTrend continues to work in the background, and new charts keep being generated until the time period expires. When a new chart is created, it replaces the one open in your browser; otherwise you will see the updated chart the next time you click on the 'minimized' request with 'ready' status. Updated charts should reflect classification results that improve over time: the first charts are rough class approximations that suffer from noise, while the last one, as I strongly hope, will show a better classification result. (More experimentation and result analysis are needed here.)
Note: Re-processing *all message bunches seen so far* every time should improve message classification in the long run. The longer a query runs, the better all of its messages should get classified.
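
The re-classification loop described above can be sketched roughly as follows. This is only an illustration in Python, not the actual Haskell server code: the names `trend_charts`, `classify_all`, and `toy_classifier` are hypothetical, a "chart" is reduced to a count of messages per emotion, and the iterable of bunches is assumed to end when the requested time period expires.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List


def trend_charts(
    bunches: Iterable[List[str]],
    classify_all: Callable[[List[str]], List[str]],
) -> Iterator[Counter]:
    """Yield one emotion-count 'chart' per non-empty bunch of messages."""
    seen: List[str] = []  # every message received so far for this request
    for bunch in bunches:
        if not bunch:
            continue  # empty bunches are skipped and produce no chart
        seen.extend(bunch)
        # Re-classify ALL messages seen so far, not just the new bunch.
        labels = classify_all(seen)
        yield Counter(labels)


def toy_classifier(messages: List[str]) -> List[str]:
    """Stand-in classifier (assumption): one keyword decides the label."""
    return ["joy" if "happy" in m.lower() else "fear" for m in messages]


if __name__ == "__main__":
    incoming = [["I am happy"], [], ["so scared", "happy again"]]
    for chart in trend_charts(incoming, toy_classifier):
        print(dict(chart))
```

Each yielded chart replaces the previous one, which matches the behaviour of the Web GUI: the first chart is available early, and later charts cover the same messages plus the newer ones.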

The NubFinder and NubTrend Web clients are implemented with GWT. The server is Apache Tomcat running on a single Amazon AWS instance.
All server-side code, including the Twitter crawler and the Sparse Vector Space Model classifier, is written in Haskell and compiled with the Glasgow Haskell Compiler (GHC).
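
To give an idea of what a sparse vector space model classifier does, here is a minimal sketch in Python (again, not the Haskell code that actually runs on the server). These notes do not specify the term weighting or the similarity measure, so the sketch assumes raw term counts, one summed vector per emotion, and cosine similarity; all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# A sparse vector stores only the terms that actually occur in a text.
SparseVec = Dict[str, float]


def vectorize(text: str) -> SparseVec:
    """Turn a text into a sparse term-count vector (whitespace tokens)."""
    return dict(Counter(text.lower().split()))


def cosine(a: SparseVec, b: SparseVec) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(examples: Iterable[Tuple[str, str]]) -> Dict[str, SparseVec]:
    """Sum the vectors of all (label, text) examples into one vector per label."""
    sums: Dict[str, Counter] = defaultdict(Counter)
    for label, text in examples:
        sums[label].update(vectorize(text))
    return {label: dict(vec) for label, vec in sums.items()}


def classify(model: Dict[str, SparseVec], text: str) -> str:
    """Return the label whose vector is most similar to the text."""
    vec = vectorize(text)
    return max(model, key=lambda label: cosine(vec, model[label]))


if __name__ == "__main__":
    # Made-up training sentences, standing in for ISEAR reports.
    model = train([
        ("joy", "i passed my exam and felt happy"),
        ("fear", "i was alone in the dark and felt afraid"),
    ])
    print(classify(model, "so happy about the exam"))
```

In a setup like this, the labelled reports would play the role of the training examples, and each incoming message would be compared against the per-emotion vectors.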