Technologies Used: C#, Python, Unity, VRTK, MRTK, scikit-learn, OpenAI, Google Speech-to-Text API, OptiTrack suite
Fig. 1: Two seated, head-tracked individuals interacting with Articulate+ during the data collection phase.
Fig. 2: PSA is an immersive MR/VR framework for experiencing and analyzing recorded conversations and events. We see two participants represented as virtual avatars (from Fig. 1) engaged in a conversation. Their conversation is accessible through rising speech bubbles displayed next to them, along with the associated audio. The display screen shows the visualizations generated by the AI agent Articulate+ based on requests made by the participants. An interactive, on-demand word cloud can be used to access attributes mentioned in the meeting and jump to any point in the conversation where the attribute was uttered. A ray originating from a participant's forehead can be used to understand gaze patterns.
Fig. 3: A user embedded in a conversation using PSA in VR; the same two participants from Fig. 1 are represented as virtual avatars.
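The gaze ray shown in Fig. 2 and Fig. 3 can be realized in Unity by casting a ray forward from the avatar's head transform. The sketch below is a minimal, hypothetical component (the class and field names are ours, not part of the released PSA code) that assumes the avatar prefab exposes a head transform and a LineRenderer.

```csharp
using UnityEngine;

// Hypothetical sketch: visualizes a participant's gaze as a ray
// originating at the avatar's forehead (see Fig. 2 and Fig. 3).
public class GazeRayVisualizer : MonoBehaviour
{
    [SerializeField] private Transform headTransform;   // forehead anchor on the avatar
    [SerializeField] private LineRenderer rayRenderer;  // thin line used to draw the gaze
    [SerializeField] private float maxRayLength = 10f;  // fallback length when nothing is hit

    void Update()
    {
        Vector3 origin = headTransform.position;
        Vector3 direction = headTransform.forward;

        // Stop the ray at the first object it hits (e.g. the display screen),
        // so an analyst can see what the participant was looking at.
        Vector3 endPoint = origin + direction * maxRayLength;
        if (Physics.Raycast(origin, direction, out RaycastHit hit, maxRayLength))
        {
            endPoint = hit.point;
        }

        rayRenderer.SetPosition(0, origin);
        rayRenderer.SetPosition(1, endPoint);
    }
}
```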
Strategizing and sensemaking in immersive environments have predominantly been explored for training, learning, and recreational tasks; embedding these concepts into the data analysis pipeline is only beginning to emerge and has seen limited research. In our work, we present Personal Situated Analytics (PSA), which provides individuals the ability to embed themselves in recorded events at multiple degrees of immersion on the Reality-Virtuality spectrum; we evaluate the benefits of this approach and study users' exploration patterns. With the increasing availability of sensors and capture devices, we can now actively record and leverage data from our daily activities to gain valuable insights and make informed decisions. At its core, PSA combines embodied cognition, situated analytics, and conversation analysis while exploiting the benefits of strategic immersion and sensemaking in immersive environments. To that end, we first developed a framework for embedding individuals in the environment where the conversation originally occurred. Second, we conducted a pilot user study (n=12, protocol #STUDY2023-0018) to compare user experiences while exploring recorded conversations in Mixed Reality through the HoloLens 2 device and in Virtual Reality through the Quest 2 device. Our proposed framework encompasses several stages: tracking, data capturing, data cleaning, data synchronization, prototype building, and deployment of the final product to end-user hardware. In the first part of the experiment, we used datasets consisting of seated participants (protocol #2022-0354). In the future, we will record and analyze datasets with standing and non-stationary participants (protocol #STUDY2023-0509) in two settings that employ various collaboration profiles among the participants. Drawing on our experience during the preliminary user study and the feedback we received from participants, we catalog lessons learned and insights that can potentially drive advancements in embodied situated analytics and thereby enhance the conversation analysis pipeline.
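The data synchronization and replay stages can be illustrated with a small Unity-style sketch. The snippet below is an assumption-laden outline (the class, field, and method names are hypothetical and not the actual PSA implementation): it steps through timestamped transcript segments, such as those produced by the Google Speech-to-Text API, and raises the corresponding speech bubble as the shared audio clock advances, so speech, audio, and avatar motion stay aligned during playback.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical replay sketch: aligns timestamped transcript segments
// (e.g. from the Google Speech-to-Text API) with the recorded audio,
// so speech bubbles appear next to the speaking avatar at the right moment.
[System.Serializable]
public class TranscriptSegment
{
    public float startTime;   // seconds from the start of the recording
    public float endTime;
    public string speakerId;  // which avatar uttered this segment
    public string text;
}

public class ConversationReplay : MonoBehaviour
{
    [SerializeField] private AudioSource conversationAudio;       // master clock for the replay
    [SerializeField] private List<TranscriptSegment> transcript;  // sorted by startTime
    private int nextSegment = 0;

    void Update()
    {
        // Use the audio playback position as the single source of time,
        // so bubbles, audio, and avatar poses stay synchronized.
        float t = conversationAudio.time;
        while (nextSegment < transcript.Count && transcript[nextSegment].startTime <= t)
        {
            ShowSpeechBubble(transcript[nextSegment]);
            nextSegment++;
        }
    }

    private void ShowSpeechBubble(TranscriptSegment segment)
    {
        // Placeholder: in the full framework this would spawn a rising
        // speech bubble next to the avatar identified by segment.speakerId.
        Debug.Log($"[{segment.startTime:F1}s] {segment.speakerId}: {segment.text}");
    }
}
```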
Preliminary exam announcement: https://www.evl.uic.edu/events/2720
ISMAR Workshop Paper - https://shorturl.at/eszN1
UIST 2023 Poster - https://dl.acm.org/doi/10.1145/3586182.3616697
Video - https://www.youtube.com/watch?v=Rif75BUbfAk&ab_channel=AshwiniNaik