Making Machine Learning Applications for Time-Series Sensor Data Graphical and Interactive (ACM TiiS 2017, Expedition in Computing)

The goal of this project is to create a simple, usable tool for handling time-series sensor data streams in applications that incorporate ML. In the latest study, we addressed issues related to the volume, varied input sources, and unintelligibility of time-series sensor data streams, aiming to help domain experts who are intermediate ML users (neither experts nor novices) better interact with ML applications.

For the study, we recruited 30 domain experts who are intermediate users of machine learning in fields such as HCI, robotics, behavioral science, educational technology, and language technology. We conducted a 10-person contextual inquiry to identify difficulties in modeling sensor-based time-series data, and two 10-person probe-based studies of ML users to improve the usability of our prototype ML system, Gimlets 1.0 → 2.0 → 2.5 (Figure 6).

I integrated visual analytics allowing rich user interactivity into the ML pipeline, thereby enabling the system to provide support for synchronization of multimedia data streams (e.g., video and audio), sensor data streams, and human-annotation data; interactive visualization of data annotations; quick creation of derived features; plug-ins for extensibility and structured guidance for moving through the model-creation process (Gimlets 2.0, Figure 6 top and bottom left); and mixed-initiative support through end-user interactive, mouse-driven visualizations for finding and manipulating erroneous and outlier data (Gimlets 2.5, Figure 6 bottom left).
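To make the first two kinds of support concrete, the following is a minimal sketch (not the Gimlets implementation; the stream names, sampling rates, and derived feature are assumptions for illustration) of how two sensor streams with different rates can be synchronized onto a common timeline and summarized with a quickly derived rolling feature, using Python and pandas:

import numpy as np
import pandas as pd

# Hypothetical raw streams: an accelerometer axis sampled at ~50 Hz and an
# EDA sensor sampled at ~4 Hz, each indexed by its own timestamps.
accel = pd.DataFrame(
    {"accel_x": np.random.randn(500)},
    index=pd.date_range("2017-01-01", periods=500, freq="20ms"),
)
eda = pd.DataFrame(
    {"eda": np.abs(np.random.randn(40)).cumsum()},
    index=pd.date_range("2017-01-01", periods=40, freq="250ms"),
)

# Synchronize both streams onto a common 100 ms timeline so they can be
# annotated and visualized against the same clock as the video recordings.
timeline = pd.date_range("2017-01-01", periods=100, freq="100ms")
synced = pd.concat(
    [
        accel.resample("100ms").mean(),
        eda.resample("100ms").mean().interpolate(),
    ],
    axis=1,
).reindex(timeline)

# A quickly derived feature: rolling 1-second standard deviation of the
# accelerometer signal, a rough proxy for how much the wearer is moving.
synced["accel_x_std_1s"] = synced["accel_x"].rolling("1s").std()
print(synced.head())

Resampling to a shared timeline before deriving features is one straightforward way to keep video, sensor, and annotation data aligned; the actual system exposes this kind of step through its interactive visualizations rather than through code.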

For the evaluation, Gimlets 1.0 users classified physical activities such as walking, standing up, sitting down, lying down, bending, falling, and bicycling (on a stationary bike), which required them to explore sub-layouts across the ML pipeline. Gimlets 2.0 users evaluated visual-analytic features embedded in the pipeline: one experimenter demonstrated data-inspection tasks to participants using datasets from a study on driver-activity prediction (i.e., video recordings of drivers and sensor data about their motion and physiological responses while driving) and from a study on a sensor-incorporated Rapid ABC test (i.e., video recordings and electrodermal activity data from a child and an examiner during a short interactive screening; Figure 6, bottom right).
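As a rough illustration of the activity-classification task that Gimlets 1.0 users worked through, a windowed feature-extraction and classification pipeline might look like the sketch below; the window length, summary statistics, and classifier are assumptions for the example, not the system's actual configuration.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical data: a 3-axis accelerometer stream at ~50 Hz and per-sample
# activity labels (0 = walking, 1 = sitting down, 2 = lying down, ...).
rng = np.random.default_rng(0)
signal = rng.normal(size=(50 * 60 * 5, 3))      # ~5 minutes of samples
labels = rng.integers(0, 3, size=signal.shape[0])

def windowed_features(signal, labels, win=100, step=50):
    """Slide a fixed-length window over the stream, summarize each window
    with simple statistics, and label each window by its majority label."""
    X, y = [], []
    for start in range(0, signal.shape[0] - win, step):
        w = signal[start:start + win]
        feats = np.concatenate([w.mean(axis=0), w.std(axis=0),
                                w.min(axis=0), w.max(axis=0)])
        X.append(feats)
        y.append(np.bincount(labels[start:start + win]).argmax())
    return np.array(X), np.array(y)

X, y = windowed_features(signal, labels)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())

Each stage of such a pipeline (windowing, feature derivation, model training, and error inspection) corresponds to a sub-layout that users navigate in the tool.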

Figure 6. Exploratory layout of Gimlets 2.0 (top); interactive visual-analytics and ML features in Gimlets 1.0, 2.0, and 2.5 (bottom left); and visual inspection tasks using Gimlets 2.0, checking electrodermal activity (EDA) fluctuations during an autism screening, the Rapid ABC test (bottom right).