Zhang:2016:CGF

This is one of the three post-hoc case studies discussed in the 2019 paper by Chen and Ebert, where the IVAS framework was proposed. The analysis was reported in Appendix C.2 of the paper. The second author (DSE) suggested this paper, which was previously unknown to the first author (MC). MC first read the paper and then wrote a report as an independent reviewer. The report took about 40-60 minutes to complete, including the effort of writing but excluding the time for the first reading. MC then emailed the report to DSE for comments. The report was then discussed over a few emails.

MC: Here is a Report of Abstract Reasoning for Case Study 2. I am sorry that my analysis may sound a bit critical, and I guess that it is the "fault" of abstract reasoning as it forces one to search for problems and solutions systematically.

Jiawei Zhang et al. A Visual Analytics Framework for Microblog Data Analysis at Multiple Scales of Aggregation, CGF, 2016.

This paper did not describe a particular original system, but some techniques in the literature were mentioned as the benchmark (reference) system. Nevertheless, the users were clearly defined, i.e., causal analysis experts in the emergency and law enforcement agencies. From Section 3 of this paper, one can gather that the users have access to sentiment analysis and topic modelling in their workflow, but have found that these techniques provide only a coarse-grained overview of the situation.

1. Symptoms of the Original Workflow:

(A). The lack of fine-grained, crisis-related categorizations. This implies that the existing analytical techniques have a high level of alphabet compression, but the resulting output categories do not allow experts to reconstruct the parts of the original data useful to their task. Using the students’ examination marks as an example again, this is a bit like categorizing the students (cf. microblogs) by their calculus or algebra results when the users really want a categorization based on their sports interests. Inferring sports interests from categorizations of mathematical ability would incur a lot of potential distortion.

(B). The users wish to have some control over the scales of space, time, and data categorization, but pre-defined aggregation schemes cause abrupt changes. The lack of user control over the scales of statistical aggregation implies too much alphabet compression by the algorithms, which leaves few options (i.e., letters) to users.

(C). The rapid arrival of the data results in rapid changes of the visualization, disrupting the ongoing analysis. This implies not enough alphabet compression by the visualization.

(D). Related to (A), (B), (C), there is an obvious symptom that viewing overly fine-grained data (e.g., all microblogs) is too costly and not effective. Although it was not mentioned in the paper, we can assume that this option has been ruled out at the beginning.
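
The alphabet-compression reading of symptom (A) can be illustrated with a small sketch. It assumes that alphabet compression is measured as a reduction in Shannon entropy and that the usefulness of a categorization for a task is measured by how much uncertainty about the task variable it removes. The toy microblogs, their labels, and the function names are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, groups):
    """H(target | group): uncertainty left about target once group is known."""
    by_group = {}
    for t, g in zip(targets, groups):
        by_group.setdefault(g, []).append(t)
    n = len(targets)
    return sum(len(ts) / n * entropy(ts) for ts in by_group.values())


# Hypothetical toy data: (message id, sentiment, crisis type).
microblogs = [
    (0, "negative", "traffic accident"),
    (1, "negative", "street fight"),
    (2, "negative", "fire"),
    (3, "negative", "flood"),
    (4, "positive", "traffic accident"),
    (5, "positive", "street fight"),
    (6, "positive", "fire"),
    (7, "positive", "flood"),
]
ids = [m[0] for m in microblogs]
sentiment = [m[1] for m in microblogs]
crisis = [m[2] for m in microblogs]

# Alphabet compression of the existing (sentiment) categorization:
# entropy of the raw messages minus entropy of the output categories.
compression = entropy(ids) - entropy(sentiment)

# How much the existing categorization tells the analyst about the
# categorization they actually need (crisis type).
info_about_crisis = entropy(crisis) - conditional_entropy(crisis, sentiment)

print(f"alphabet compression by sentiment: {compression:.2f} bits")
print(f"information about crisis type:     {info_about_crisis:.2f} bits")
```

In this toy case the sentiment categories compress the data substantially, yet knowing the sentiment removes none of the uncertainty about the crisis type. This is the situation described in (A): the compression is high, but it is along a dimension that does not support the reconstruction the experts need.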

2. Analysis of Possible Causes:

(A) is caused firstly by an algorithm designed for a different categorization, which makes it inappropriate for the task. Thus simply reducing the scale of alphabet compression will not necessarily solve this problem. When the domain experts said “we want to see more”, in this case they might mean “we want to see different topic modelling or sentiment analysis.” It might also be caused by the same issue as (B), as the paper noted.

(B) is caused by the lack of interaction for controlling the algorithms.

(C) may be caused by a lack of alphabet compression, but may also be caused by humans’ poor memory of what has been observed, i.e., potential distortion in recall, together with the cost of the cognitive load of remembering and of the effort of repeatedly going backward and forward. Thus (C) may really be caused by the lack of a suitable alphabet compression that uses visualization for memory externalization.

DSE: This implies the need for a better visual encoding method.

MC: (continued)

3. Analysis of Optional Remedies:

The solutions presented in the paper focused on (B) through requirements R1, R2, R3, R5, and (C) through R4. The issue of (A) seems to be addressed in conjunction with (B). As an independent "reviewer", I will look at the above causes individually.

(A) Since the current provision of topic modelling and sentiment analysis does not provide a useful categorization, one optional remedy must clearly be to develop more suitable topic modelling and sentiment analysis. Although no single categorization will suit all situations, there must be some commonly occurring situations, such as traffic accidents, street fights, etc. Some pre-built topic models and sentiment analysis can be called upon when those common ones occur. Hence better alphabet compression may help.

DSE: It was hard to come up with a topic model for inputs each with 280 characters and words used in different manners - e.g., kill - my head is killing me, I killed that exam, I want to kill someone. Machine learning can help here but requires very long training and is hard to adapt to new situations - each crisis is different - but we are working on this.

MC: (continued)

(B) All topic models and sentiment analysis are sensitive to space and time. Hence introducing user interaction is good. This means a bit less alphabet compression by algorithms, and a bit more by users (i.e., interactions). In this way, the latter causes less potential distortion, as the users have access to extra variables, such as contextual information, confidential information, and their knowledge and experience.

(C) The remedy for cause (C) is to design visualization that can assist memory recall. In this case, petal-based glyphs are used as they feature a very high level of alphabet compression (i.e., a lot of information loss) and are thus easy to remember. Some other visual designs, which may provide more memory externalization, could potentially be explored, such as the changes of petal-based glyphs over a period for specific locations, or the movement of certain types of petal-based glyphs, such as those representing the largest volume of microblogs.

4. Analysis of Potential Side-Effects:

Acquiring the skills for interactively working with topic modelling and sentiment analysis can be costly (interaction, high cost). Interactively setting different spatial and time scales may cause different scales of statistical aggregation, making the visualization, especially petal-based glyphs, sensitive to scale changes (visualization, high potential distortion). This was addressed by the empirical study. The domain experts prefer the ability to make rapid and intuitive observations about the difference between neighboring glyphs. As the users are involved in setting the scales, they have better reconstructions than a passive observer would have. This is similar to the scenario in which a user changes an axis to a different scale (e.g., log): that user can reconstruct the data better than one who does not know about the change.
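
The axis-scale analogy can be sketched in a few lines. This is only an illustration under stated assumptions: the data values are made up, and mean absolute error is used as a simple stand-in for potential distortion rather than a formal divergence measure.

```python
import math

# Hypothetical data values shown on a chart whose axis was switched to log10.
true_values = [1.0, 10.0, 100.0, 1000.0]
positions = [math.log10(v) for v in true_values]  # where each mark is drawn

# A user who made the change knows the axis is logarithmic and inverts it.
informed = [10 ** p for p in positions]

# A passive observer assumes a linear axis spanning the same end points.
lo, hi = min(true_values), max(true_values)
p_lo, p_hi = min(positions), max(positions)
uninformed = [lo + (p - p_lo) / (p_hi - p_lo) * (hi - lo) for p in positions]


def mean_abs_error(estimates, truth):
    """Average absolute reconstruction error (stand-in for distortion)."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


print("informed user error:  ", mean_abs_error(informed, true_values))
print("passive observer error:", mean_abs_error(uninformed, true_values))
```

The same marks on screen lead to very different reconstructions depending on whether the viewer knows which transformation was applied, which is why users who set the aggregation scales themselves are less exposed to distortion from scale changes.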

Petal-based glyphs require users to remember which petal is for which attribute, which imposes a high cognitive load (visualization, high cost). Since the system is expected to be used by experts for some time, one may consider some investment in more meaningful glyph designs that are easy to remember.

Future research topics:

DSE: This is the type of feedback that would be good to get in the review process.