Human-Centered Reinforcement Learning
for Personalized Coaching in Health

Introduction

Chronic diseases such as type 2 diabetes (T2D), hypertension, and obesity place an ever-increasing burden on individuals and society at large [1]. Health coaching has emerged as an effective approach to promoting self-management [4,11,14,18]. In health coaching, human experts draw on their knowledge and experience to help individuals who wish to improve their health identify appropriate health goals, and they provide practical advice, encouragement, and feedback towards attaining these goals. However, there are not enough coaching professionals to accommodate the growing population of individuals with chronic diseases, particularly those in medically underserved communities with limited access to traditional healthcare [5,13]. In addition, barriers such as transportation and cost create disparities in access to in-person coaching [5,12].

Conversational agents (CA) have the potential to overcome these barriers and make health coaching available to more diverse populations. CA have been used successfully in many health contexts and domains, including health coaching [2]. The majority of CA in health have relied on fully-scripted approaches, in which the flow of all possible dialogs is written in advance and users choose from available requests and responses [8]. However, these approaches are less suitable for more open-ended dialogs, such as coaching dialogs in the context of chronic diseases, where the complexity of dialog structures can quickly become unmanageable and fully-scripted dialogs can be perceived as rigid and repetitive [3,7]. In these contexts, data-driven CA that rely on machine learning (ML) to learn appropriate dialog structures have advantages over fully-scripted ones.

One particularly promising data-driven approach is reinforcement learning (RL), an ML technique that learns from interactions and prescribes sequences of actions towards reaching a predetermined goal [3,15]. Given the fluid and flexible structure of coaching and its emphasis on addressing the unique needs of each client, RL is well suited to coaching CA. However, RL-based CA have several limitations that are particularly important in the context of health coaching. First, previous studies showed that optimization algorithms used to train RL-based CA often produce dialogs that are efficient (short) but are perceived as random by human users [16]. Consequently, there is a need for new approaches to aligning RL-based CA with human reasoning and expectations. Second, this problem is exacerbated by the general lack of explainability of actions chosen by an RL agent, a problem common to many ML methods. There is therefore also a need for new approaches to generating explanations for RL inferences and recommendations.

In this project, we will develop a new approach to providing health coaching with RL-based CA, while also addressing the more general challenges described above in designing human-centered and scalable RL-based CA. We will build upon our prior investigations of CA for nutritional coaching in T2D [9,10]. In our prior work we developed T2 Coach, a fully-scripted CA that follows the Brief Action Planning (BAP) coaching protocol [6]. T2 Coach helps individuals set nutritional goals and provides assistance with goal attainment via daily dialogs that include reminders, suggestions, and opportunities for reflection. Furthermore, we have investigated new approaches to more directly supporting daily mealtime choices with micro-coaching, which focuses on helping individuals determine how well their planned meals align with their specified nutritional goals. Making this determination is a critical need that often presents a significant barrier to individuals with low nutritional literacy, and it is particularly acute for medically underserved communities that suffer from a higher prevalence of T2D [1].

To that end, we have developed a micro-coaching CA that 1) elicits individuals’ meal plans (e.g., “Hi Lena, what do you plan to have for lunch?”), 2) collects answers as free-text responses (e.g., “a salad”), 3) converts these free-form descriptions into a computable form using natural language processing (NLP) and a food ontology, 4) uses a set of rules to assess how well the described meals align with the individual’s previously established nutritional goals, and 5) uses follow-up questions learned with RL to resolve cases in which the initial meal description does not contain sufficient information [9].
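Steps 3 and 4 of this pipeline can be illustrated with a minimal sketch. It is not the implementation used in [9]: the ontology entries, the carbohydrate values, the substring-matching parser, and the single carbohydrate-limit rule are all hypothetical stand-ins chosen only to show how a free-text description can end up as "meets goal", "does not meet goal", or "needs follow-up".

```python
from typing import Optional

# Hypothetical toy ontology: food phrase -> estimated carbohydrates in grams.
# None means the phrase alone is too vague to estimate (illustrative values only).
TOY_ONTOLOGY: dict[str, Optional[float]] = {
    "grilled chicken": 0.0,
    "white rice": 45.0,
    "salad": None,
}


def parse_meal(text: str) -> list[str]:
    """Step 3 stand-in: map free text to known ontology terms by substring match."""
    lowered = text.lower()
    return [term for term in TOY_ONTOLOGY if term in lowered]


def assess(text: str, max_carbs_g: float) -> str:
    """Step 4 stand-in: rule-based check against a hypothetical carbohydrate goal."""
    terms = parse_meal(text)
    # Nothing recognized, or a recognized item is underspecified: ask a follow-up.
    if not terms or any(TOY_ONTOLOGY[t] is None for t in terms):
        return "needs_follow_up"
    total = sum(TOY_ONTOLOGY[t] for t in terms)
    return "meets_goal" if total <= max_carbs_g else "does_not_meet_goal"


if __name__ == "__main__":
    for reply in ("a salad", "grilled chicken", "Grilled chicken with white rice"):
        print(reply, "->", assess(reply, max_carbs_g=30.0))
```

In this sketch, a reply such as “a salad” is recognized but underspecified, which is the situation in which step 5 would select a follow-up question.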

An initial evaluation of this approach found that RL-based dialogs efficiently collected the information needed to determine whether meal plans met the goals. However, the approach required considerable manual engineering of the RL action and reward spaces to align them with each nutritional goal, which limited its generalizability. In addition, it suffered from challenges common to RL-based chatbots: while RL-learned follow-up question sequences were short and effective, they were rated low in quality and perceived as unintuitive. These problems were compounded by the lack of explanations for both the questions and the final determination, which limited users’ ability to alter their meal choices.
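To make concrete how a hand-designed action and reward space leads to short question sequences, the following toy sketch applies tabular Q-learning to a question-selection problem. Everything in it is an assumption made for illustration rather than the formulation used in [9]: three hypothetical follow-up questions, of which two are assumed to be informative, a cost of -1 per question, and a bonus of +5 once the assessment can be completed.

```python
import random
from collections import defaultdict

QUESTIONS = (0, 1, 2)             # hypothetical follow-up question ids
INFORMATIVE = frozenset({0, 2})   # assumed: answers to these settle the assessment


def step(asked, action):
    """Ask one question: each costs -1, completing the assessment earns +5."""
    nxt = asked | {action}
    done = INFORMATIVE <= nxt
    return nxt, -1.0 + (5.0 if done else 0.0), done


def available(asked):
    """Questions that have not been asked yet in this dialog."""
    return [q for q in QUESTIONS if q not in asked]


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = frozenset(), False
        while not done:
            actions = available(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in available(nxt))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def greedy_dialog(q):
    """Follow the learned policy from an empty dialog and return the questions asked."""
    state, done, order = frozenset(), False, []
    while not done:
        action = max(available(state), key=lambda a: q[(state, a)])
        order.append(action)
        state, _, done = step(state, action)
    return order


if __name__ == "__main__":
    print(greedy_dialog(train()))
```

Under this reward, the learned policy asks only the two informative questions and skips the third. The objective rewards brevity alone and contains no term for how natural the question order feels to a user.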

In the proposed work we will address these limitations in two complementary ways. First, we will develop a more general, data-driven approach to learning appropriate follow-up questions that does not rely on manual engineering of the RL action and state spaces and is suitable for multiple nutritional goals. Second, we will address the more general challenges of RL-based CA: aligning RL-based dialogs with human reasoning and generating human-understandable explanations of RL inferences and recommendations.

Intellectual Merit

The intellectual merit of this research is two-fold: it will advance the state of the art in RL-based CA for health coaching, and it will address general limitations of RL-based CA that extend beyond the domain of health coaching. Specifically, the project will contribute:

1) A set of mechanisms for delivering nutritional micro-coaching using RL-based CA. Specifically, we will develop a generalizable solution to learning appropriate follow-up questions that can work across different nutritional goals. The ability to disambiguate user statements with appropriate follow-up questions is a common need in many conversational domains, and our proposed approach may inform further work in this area beyond nutritional coaching.
2) An approach to using representations learned by RL during question-asking to generate tailored feedback on meal-goal alignment that incorporates relevant information in the form of explanations. Explainable AI is an area of active research, and our approach to generating explanations for meal-goal alignment will contribute to this ongoing research.
3) A new approach to aligning RL-generated dialogs with human reasoning to improve their perceived intuitiveness and quality. Specifically, we will develop a human-centered reward design solution that incorporates user feedback on dialog quality, and we will explore inverse reinforcement learning solutions to replicate expert dialog flow and reasoning. We expect these solutions to be informative beyond the nutritional CA use case of this project.
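One simple way to picture a reward that incorporates user feedback on dialog quality is to add a scaled rating term to the task reward. The sketch below is only an illustration under stated assumptions: the 1-to-5 rating scale, the centering at 3, the weight of 0.5, and the function name shaped_reward are hypothetical, and the proposal does not commit to this form. Inverse reinforcement learning is not shown.

```python
def shaped_reward(task_reward: float, user_rating: int, weight: float = 0.5) -> float:
    """Blend a task reward with a hypothetical 1-5 user rating of dialog quality."""
    if not 1 <= user_rating <= 5:
        raise ValueError("user_rating must be between 1 and 5")
    # Center the rating at the neutral value 3 and scale it to the range [-1, 1].
    feedback = (user_rating - 3) / 2.0
    return task_reward + weight * feedback


if __name__ == "__main__":
    # The same question cost (-1) is softened by a high rating
    # and worsened by a low one.
    print(shaped_reward(-1.0, user_rating=5))
    print(shaped_reward(-1.0, user_rating=1))
```

With a term of this kind, two question sequences of equal length no longer receive the same return if users rate one of them as more intuitive.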

Broader Impact

We anticipate that this research will be consequential to society at large in several ways. This research can pave the way for new, human-centered approaches for designing health coaching tools for diverse populations. We anticipate that conversational interfaces can lower entry barriers for engaging with technological interventions in health and wellness for diverse communities with different degrees of education and experience with technologies, and reduce “intervention-generated inequalities” in health [17]. Furthermore, while we focus on nutrition coaching, new techniques for aligning RL with human reasoning and explaining its inferences and choices to users can increase its applicability to a broader set of problems and domains.

On a broader level, this research and educational plan take important steps towards further promoting human-centered approaches to data science, ML, and AI education, which can have a broader impact on future research in this field. The plan includes the development of a new course on interaction paradigms for intelligent decision support, a summer training program on principles of human-centered AI for students in biomedical informatics and related fields, and a summer training program on human-centered design of data-driven systems for undergraduate and high school students as part of the undergraduate research program at the Department of Biomedical Informatics at Columbia University.

References

[1] Gerard Anderson and Jane Horvath. 2004. The growing burden of chronic disease in America. Public Health Reports (Washington, D.C.: 1974) 119, 3: 263–270. https://doi.org/10.1016/j.phr.2004.04.005
[2] Tessa Beinema, Harm op den Akker, Hermie J. Hermens, and Lex van Velsen. 2022. What to Discuss?—A Blueprint Topic Model for Health Coaching Dialogues With Conversational Agents. International Journal of Human–Computer Interaction 0, 0: 1–19. https://doi.org/10.1080/10447318.2022.2041884
[3] Timothy W. Bickmore, Rebecca A. Silliman, Kerrie Nelson, Debbie M. Cheng, Michael Winter, Lori Henault, and Michael K. Paasche-Orlow. 2013. A randomized controlled trial of an automated exercise coach for older adults. Journal of the American Geriatrics Society 61, 10: 1676–1683. https://doi.org/10.1111/jgs.12449
[4] Diabetes Prevention Program Research Group. 2009. 10-year follow-up of diabetes incidence and weight loss in the Diabetes Prevention Program Outcomes Study. The Lancet 374, 9702: 1677–1686. https://doi.org/10.1016/S0140-6736(09)61457-4
[5] Alison B. Evert, Michelle Dennison, Christopher D. Gardner, W. Timothy Garvey, Ka Hei Karen Lau, Janice MacLeod, Joanna Mitri, Raquel F. Pereira, Kelly Rawlings, Shamera Robinson, Laura Saslow, Sacha Uelmen, Patricia B. Urbanski, and William S. Yancy. 2019. Nutrition therapy for adults with diabetes or prediabetes: A consensus report. https://doi.org/10.2337/dci19-0014
[6] T. Gutnick, C. Davis, K. Reims, and H.L. Gainforth. 2014. Brief Action Planning to Facilitate Behavior Change and Support Patient Self-Management. Journal of Clinical Outcomes Management: JCOM 21, 1: 17–29.
[7] Alankar Jain, Florian Pecune, Yoichi Matsuyama, and Justine Cassell. 2018. A User Simulator Architecture for Socially-Aware Conversational Agents. In Proceedings of the 18th International Conference on Intelligent Virtual Agents (IVA ’18), 133–140. https://doi.org/10.1145/3267851.3267916
[8] Liliana Laranjo, Adam G. Dunn, Huong Ly Tong, Ahmet Baki Kocaballi, Jessica Chen, Rabia Bashir, Didi Surian, Blanca Gallego, Farah Magrabi, Annie Y. S. Lau, and Enrico Coiera. 2018. Conversational agents in healthcare: a systematic review. Journal of the American Medical Informatics Association 25, 9: 1248–1258. https://doi.org/10.1093/jamia/ocy072
[9] Elliot Mitchell, Noemie Elhadad, and Lena Mamykina. 2022. Examining AI Methods for Micro-Coaching Dialogs. In CHI Conference on Human Factors in Computing Systems (CHI ’22), 1–24. https://doi.org/10.1145/3491102.3501886
[10] Elliot G. Mitchell, Rosa Maimone, Andrea Cassells, Jonathan N. Tobin, Patricia Davidson, Arlene M. Smaldone, and Lena Mamykina. 2021. Automated vs. Human Health Coaching: Exploring Participant and Practitioner Experiences. Proceedings of the ACM on Human-Computer Interaction 5, CSCW1: 99:1-99:37. https://doi.org/10.1145/3449173
[11] Jeanette M. Olsen and Bonnie J. Nesbitt. 2010. Health Coaching to Improve Healthy Lifestyle Behaviors: An Integrative Review. American Journal of Health Promotion 25, 1: e1–e12. https://doi.org/10.4278/ajhp.090313-lit-101
[12] Mark Peyrot, Richard R. Rubin, Martha M. Funnell, and Linda M. Siminerio. 2009. Access to diabetes self-management education: results of national surveys of patients, educators, and physicians. The Diabetes Educator 35, 2: 246–8, 252–6, 258–63. https://doi.org/10.1177/0145721708329546
[13] Neesha Ramchandani. 2019. Virtual Coaching to Enhance Diabetes Care. Diabetes Technology and Therapeutics 21, S2: S2-48-S2-51. https://doi.org/10.1089/dia.2019.0016
[14] Diana Sherifali, Virginia Viscardi, Johnny Wei Bai, and R. Muhammad Usman Ali. 2016. Evaluating the Effect of a Diabetes Health Coach in Individuals with Type 2 Diabetes. Canadian Journal of Diabetes 40, 84–94. https://doi.org/10.1016/j.jcjd.2015.10.006
[15] Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, Mass.
[16] Chun-Hua Tsai, Yue You, Xinning Gui, Yubo Kou, and John M. Carroll. 2021. Exploring and Promoting Diagnostic Transparency and Explainability in Online Symptom Checkers. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–17. https://doi.org/10.1145/3411764.3445101
[17] Tiffany C. Veinot, Hannah Mitchell, and Jessica S. Ancker. 2018. Good intentions are not enough: how informatics interventions can worsen inequality. Journal of the American Medical Informatics Association. https://doi.org/10.1093/jamia/ocy052
[18] Ruth Q. Wolever, Leigh Ann Simmons, Gary A. Sforzo, Diana Dill, Miranda Kaye, Elizabeth M. Bechard, Mary Elaine Southard, Mary Kennedy, Justine Vosloo, and Nancy Yang. 2013. A Systematic Review of the Literature on Health and Wellness Coaching: Defining a Key Behavioral Intervention in Healthcare. Global Advances in Health and Medicine 2, 4: 38–57. https://doi.org/10.7453/gahmj.2013.042
