Ye:2023:TVCG
Z. Ye and M. Chen. Visualizing ensemble predictions of music mood. IEEE Transactions on Visualization and Computer Graphics, 29(1):864-874, 2023. DOI. (Presented at IEEE VIS 2022.)
The work reported in this paper is part of the DPhil project of the first author. It is concerned with the development of an ensemble model for predicting music mood. The ensemble model consists of 210 sub-models. If one simply relies on a summary prediction of the ensemble model, one naturally feels unsure about such a prediction, much as one may feel about another person's judgement. Such uncertainty is actually caused by the fact that the ensemble model compresses the information too quickly (Alg-High-AC). One obvious solution is to use visualization to convey the predictions by individual sub-models in a less-aggregated manner (Vis-Low-AC). Below, we use orange text to indicate the original text in the paper. The work considered two groups of target users: music experts and ML developers.
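The difference between Alg-High-AC and Vis-Low-AC can be illustrated with a minimal sketch. The mood labels and vote counts below are hypothetical, not taken from the paper; the point is only that a single summary prediction hides how closely contested the vote may be, whereas keeping the full distribution preserves that information for visualization.

```python
from collections import Counter

# Hypothetical votes by an ensemble of sub-models for one music section.
# The mood labels are made up for illustration.
votes = ["happy"] * 80 + ["sad"] * 70 + ["calm"] * 40 + ["tense"] * 20

# Alg-High-AC: compress quickly to a single summary prediction.
summary = Counter(votes).most_common(1)[0][0]

# Vis-Low-AC: keep the full distribution, so a viewer can see that
# "sad" received almost as many votes as "happy".
distribution = Counter(votes)

print(summary)        # happy
print(dict(distribution))
```

Here the summary alone gives no hint that the runner-up mood trails the winner by only 10 votes out of 210, which is exactly the kind of information a less-aggregated visualization can surface.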
• R1. Both would like to observe how ensemble models collectively voted on individual sections of music, so someone with music knowledge can reason if the voting results are sensible or not.
• R2. Both would like to observe how ensemble models are collectively influenced by the less accurate “global” ground truth labels in parts of the music where the mood changes.
• R3. Both would like to locate where ensemble models voted for a mood change, so they can relate such changes to the corresponding music score.
• R4. Both would like to see the dominant opinion of ML models, the second dominant opinion, the third, and so on, and music experts would like to exercise their own interpretations of the different predictions generated by an ensemble of ML models.
• R5. Both would like ideally to identify visual representations that can be used to accompany music for non-experts.
• R6. ML developers would like to observe sub-groups of models (e.g., by methods and interval length) to compare their performance with the ensemble group.
• R7. ML developers would like to observe individual models’ performance to compare their performance with the ensemble group and related subgroups.
Three Visual Designs. For the requirements described in Section 3, the three line-graph-based visual designs (i.e., stacked line graph, original ThemeRiver, and dual-flux ThemeRiver) cannot support R6 or R7 easily. Although all three visual designs convey more or less the same amount of information, they have different strengths and weaknesses in supporting R1–R5.
Symptom: With the stacked line graph and the original ThemeRiver, one cannot easily see the changes in the dominant opinion, the ordering of other opinions, and the places where the ordering changes. Observing such information is an essential part of R1–R5.
Cause: Although such information is depicted implicitly, the cognitive cost for gaining it is very high, as it would involve perceptual estimation of the heights of different cross-sections, and cognitive comparison of such height measures [5]. The stacked line graph has some advantages over the original ThemeRiver in estimating the total height and that of the bottom stream.
Remedy: Introduce a more explicit depiction of such information to reduce the cognitive cost. With the dual-flux ThemeRiver, the dominant opinion, the ordering, and the places of mood changes are all explicit, ready to be perceived.
Side-effect: The mood streams are no longer continuous, and it may take extra effort to re-connect the same stream, e.g., to quantify the amount of mood change. With only four moods and appropriate color-coding, the side-effect is not a big issue. It could become more serious if there were many streams.
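The information that the dual-flux ThemeRiver makes explicit, and that the other two designs only depict implicitly, can be sketched as a small computation. The per-timestep vote counts below are hypothetical; the sketch shows how the dominant opinion, the full ordering of opinions, and the places where the dominant opinion changes can all be derived directly from the vote distributions.

```python
# Hypothetical vote counts of the ensemble at three consecutive time steps.
votes_over_time = [
    {"happy": 120, "sad": 50, "calm": 30, "tense": 10},
    {"happy": 90,  "sad": 80, "calm": 30, "tense": 10},
    {"sad": 110,   "happy": 60, "calm": 25, "tense": 15},
]

def ordering(counts):
    """Moods sorted from most-voted to least-voted."""
    return [mood for mood, _ in sorted(counts.items(), key=lambda kv: -kv[1])]

orderings = [ordering(v) for v in votes_over_time]
dominant = [o[0] for o in orderings]

# Time steps at which the dominant opinion changes (cf. R3).
changes = [t for t in range(1, len(dominant)) if dominant[t] != dominant[t - 1]]

print(dominant)  # ['happy', 'happy', 'sad']
print(changes)   # [2]
```

With a stacked line graph or the original ThemeRiver, a viewer would have to recover `orderings` and `changes` perceptually by comparing stream heights; the dual-flux design draws them explicitly.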
Pixel-based Visualization and dual-flux ThemeRiver. This combined use of two visual designs supports requirements R6 and R7. To address the issue, we have to go back to the traditional methods for observing ML models' performance. When one has only a few models to compare, one might be able to afford the demanding effort of observing their performance against individual data objects (music clips in this work) by reading classification logs. However, this does not scale up to 210 ML models.
Symptom: It is almost impossible to observe a large number of ensemble models against individual data objects by reading classification logs.
Cause: It incurs very high cognitive costs of reading numbers, remembering them for building up a mental overview model, and performing comparative tasks mentally.
Remedy: Both pixel-based visualization and dual-flux ThemeRiver provide external memorization, substantially reducing the cost of repeated reading-remembering. By removing the burden of memorization, the users can devote more cognitive resources to the patterns depicted.
Side-effect (new symptom): Identifying individual ML models is difficult with an arbitrary list of models, and grouping models visually is even harder.
Cause: Labelling small pixels is not easy. Visual grouping demands extra cognitive load for remembering and formulating groups mentally.
Remedy: Use different sorting schemes.
Side-effect: There could be an issue if the sorting scheme is unfamiliar to a user. For ML model developers, this is unlikely.
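The sorting-scheme remedy can be sketched as follows. The record fields (`method`, `interval`, `accuracy`) and their values are hypothetical, not the paper's actual metadata; the point is that the same list of sub-model records can be ordered by method and interval length, so that related models form contiguous visual groups, or by accuracy, so that ranking is directly readable.

```python
# Hypothetical sub-model records; a real list would have 210 entries.
models = [
    {"id": 3, "method": "CNN", "interval": 2.0, "accuracy": 0.71},
    {"id": 1, "method": "SVM", "interval": 0.5, "accuracy": 0.64},
    {"id": 2, "method": "CNN", "interval": 0.5, "accuracy": 0.69},
]

# Scheme A: group by method, then interval length (supports R6):
by_group = sorted(models, key=lambda m: (m["method"], m["interval"]))

# Scheme B: rank by accuracy (supports R7):
by_accuracy = sorted(models, key=lambda m: -m["accuracy"])

print([m["id"] for m in by_group])     # [2, 3, 1]
print([m["id"] for m in by_accuracy])  # [3, 2, 1]
```

Sorting moves the grouping work from the viewer's memory into the layout itself, which is why it counteracts the cognitive load identified in the Cause above.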
The paper was first submitted to EuroVis 2022 with the cost-benefit analysis. We hoped that reviewers might be able to evaluate our diagnosis of the symptoms, analysis of the causes, and prescription of remedies in a way similar to how one doctor evaluates another doctor's diagnosis, analysis, and treatment. The reviewers rejected the paper, asking us to conduct a user-centered evaluation. We conducted one and resubmitted the paper to IEEE VIS 2022, which accepted it.