The work reported in this paper is part of the DPhil project of the first author. It concerns the development of an ensemble model for predicting music mood, where the ensemble consists of 210 sub-models. If one simply relies on a summary prediction of the ensemble model, one naturally feels unsure about such a prediction, much as one may feel about another person's judgement. This uncertainty arises because the ensemble model compresses the information too quickly (Alg-High-AC). One obvious solution is to use visualization to convey the predictions of individual sub-models in a less-aggregated manner (Vis-Low-AC); a sketch illustrating this contrast follows the requirement list below. In the following, we use orange text to indicate the original text in the paper. The work considered two groups of target users: music experts and ML developers.
• R1. Both would like to observe how ensemble models collectively voted on individual sections of music, so that someone with music knowledge can reason about whether the voting results are sensible.
• R2. Both would like to observe how ensemble models are collectively influenced by the less accurate “global” ground truth labels in parts of the music where the mood changes.
• R3. Both would like to locate where ensemble models voted for a mood change, so that such changes can be related to the corresponding music score.
• R4. Both would like to see the dominant opinion of the ML models, the second most dominant opinion, the third, and so on, and music experts would like to exercise their own interpretations of the different predictions generated by an ensemble of ML models.
• R5. Ideally, both would like to identify visual representations that can be used to accompany music for non-experts.
• R6. ML developers would like to observe sub-groups of models (e.g., grouped by method and interval length) to compare their performance with that of the full ensemble.
• R7. ML developers would like to observe the performance of individual models and compare it with that of the full ensemble and related sub-groups.
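To make the contrast between Alg-High-AC and Vis-Low-AC concrete, the following is a minimal Python sketch, not the paper's implementation: the matrix layout, the four mood labels, and the synthetic votes are illustrative assumptions. It shows how collapsing the 210 sub-model votes for a time window into a single majority label (heavy aggregation) discards the ranked vote distribution that R1 and R4 ask users to see.

    # Sketch only: illustrates summary prediction vs. less-aggregated votes.
    # The array shape, mood labels, and random votes are assumptions,
    # not data or code from the paper.
    from collections import Counter

    import numpy as np

    N_MODELS = 210                              # sub-models in the ensemble
    MOODS = ["happy", "sad", "tense", "calm"]   # hypothetical mood labels

    rng = np.random.default_rng(0)
    # votes[m, t] = mood index predicted by sub-model m for time window t
    votes = rng.integers(0, len(MOODS), size=(N_MODELS, 8))

    def summary_prediction(votes_t):
        """Alg-High-AC: compress the 210 votes into one majority label."""
        counts = Counter(votes_t)
        return MOODS[counts.most_common(1)[0][0]]

    def vote_distribution(votes_t):
        """Vis-Low-AC: keep the ranked distribution (dominant opinion,
        second most dominant, and so on, cf. R4) instead of one label."""
        counts = Counter(votes_t)
        return [(MOODS[i], n) for i, n in counts.most_common()]

    t = 3  # an arbitrary time window
    print(summary_prediction(votes[:, t]))   # one label; information lost
    print(vote_distribution(votes[:, t]))    # e.g. [('calm', 61), ('sad', 55), ...]

A visualization driven by vote_distribution retains exactly the information that the summary throws away, which is what allows users to judge whether the voting results are sensible (R1) and to inspect the dominant and less dominant opinions (R4).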