My Internship

My internship is with Dr. Elizabeth Umberfield at the Regenstrief Institute, where her research aims to amplify patients’ choices regarding their health care using a range of informatics methods. In this project, we aim to automatically identify documentation of goals of care conversations (GOCC) between clinicians and patients. Patients articulate their goals, values, and preferences about end-of-life care during GOCCs, so that their care can be guided by these goals. These conversations are incredibly important for health care providers to make informed decisions regarding their patients’ end-of-life care.

The problem is that goals of care information can become buried and difficult to find within clinical notes in the medical record. We want to determine whether we can reliably predict which clinical notes contain narrative descriptions of GOCCs using natural language processing methods. By accurately predicting which notes contain documentation of GOCCs, we can potentially identify and classify that documentation within clinical notes. Automatically identifying these source notes would support health care providers in finding this critical information and acting on patient preferences. In the future, the natural language processing algorithm could be embedded in tools such as clinical support systems that automatically query advanced care planning documentation. This might decrease the time clinicians spend and the cognitive burden they carry, so they are able to provide more concordant care with patients being the central decision-makers regarding their own care.

Learning and Skills

The LHSI experience will allow me to learn the methods that are used to conduct research and further explore what area of research I want to go into post-graduation. I hope to better my critical thinking skills outside of a classroom setting as well as develop my communication skills in writing and displaying research. My role lies in supporting the Qualitative Analysis scope of the research project where I will aid in the completion of a Literature Review regarding the use of POLST/MOLST documentation forms. Additionally, my prior coding knowledge is used to assist in the data preparation and management prior to the development of a Natural Language Processing (NLP) algorithm to identify and extract goals of care conversations within clinician notes. As I am taking informatics courses simultaneously with this internship, I will be able to directly apply gained programming knowledge to the tasks at hand. Thus far, I have assisted in data extraction and normalization of this information. Additionally, I have contributed to the writing and editing of a textbook chapter titled “Syntactic Interoperability and the Role of Syntactic Standards in Health Information Exchange.” In this aspect of research, I have to continually work on my communication skills and how to express my ideas while matching the writing style of others so the chapter can flow seamlessly through different sections.

I am most excited to start the merging and cleaning of data sets to begin the Natural Language Processing aspect of the project. I have entry-level knowledge of various coding languages, but have never ventured into the realm of machine learning. I want to expand my knowledge in this area and challenge myself intellectually through data analysis using coding solutions. As we start cleaning the preliminary data, I have to actively try solutions that may fail. Failure is an intrinsic part of finding an effective solution as it provides new information that will further develop a strategy to solve the presented problem. I try to resolve most problems on my own and provide unique solutions but seek help when I reach a roadblock. In this way, constructive criticism is an extremely helpful guide towards the bigger picture of the research project.

One of my courses, Health Information Management Systems (HIM-M), fits directly into the topics of my internship. We regularly work in EHRGo, which is an educational Electronic Health Record that simulates how information is entered and queried in an EHR. With my internship site, we work with the applications of an EHR in advanced care planning and extracting goals of care conversations from narrative clinical notes. In analyzing and writing the chapter on Syntactic Interoperability, I was already familiar with the concepts that were presented in my HIM-M course material. For instance, we learned about the implementation of HL7 during weekly readings, but in my internship, I was actively writing excerpts about HL7 and FHIR (two standards for exchanging information within an EHR). In tandem with my internship, I am able to better understand my course material through the perspective of a professional in clinical informatics.

During my time as an intern, I have learned a great deal about myself and how to best tackle challenges in the data preparation stage prior to creating an NLP algorithmic model. Through this experience, I can begin to call myself an informatician or data analyst because of the skills I've developed and grown. A skill I have continued to develop is my ability to adapt prior knowledge and use online documentation to find a solution that works for the current problem. While I learned the basics of common Python packages through my coursework, I have developed a much better working knowledge through application. Additionally, I have learned an enormous amount about the field of clinical informatics and more specifically Advanced Care Planning (ACP). Before participating in the LHSI program, I had no prior knowledge of how clinical notes are extracted from an EHR or used on a daily basis to make decisions for a patient’s plan of care. ACP documentation becomes more prominent when these decisions start revolving around a patient’s preferences towards their end-of-life care. An important distinction I have learned during my time at the Regenstrief Institute is the difference between goals of care conversations (GOCC) and ACP. Patients articulate their goals, values, and preferences about end-of-life care during GOCCs, while ACP details the physical documentation of these preferences.

My contributions have greatly impacted the overall progress of the project as I have been leading the data preparation and management prior to NLP analysis. When discrepancies between the two datasets we are working with meant that GOCCs couldn’t be automatically mapped to their clinical source notes using a Python program, I manually completed the mapping process using Excel functions. This step was essential to create the training and testing dataset that would power an NLP algorithm. Without coding the source notes for the presence or absence of a GOCC, an NLP predictive model could not be developed. After this step was accomplished, I used CLAMP, a natural language processing software package, to convert the source note text files into XMI format in order to annotate them for their GOCCs. That is, for the source notes that contained a GOCC, the GOCCs needed to be annotated, or highlighted in XMI markup, to indicate where they were located within the note. These XMI files are used to train an NLP algorithm for automated prediction of GOCCs.

From the spring self-evaluation, I feel that I am always able to prioritize, plan, and meet my deliverables in the time frame set by my supervisor. Since I typically work independently on all of my projects, I feel that I have grown in my ability to adapt and apply previously learned techniques to a variety of situations. I continually work through my own challenges, and when I am unable to resolve them on my own, I always reach out for assistance. Due to the independent nature of my work, I would still like to develop my interpersonal skills in collaborating with others. Recently, my supervisor took on another intern through the Regenstrief Institute and we are starting to work together on projects that require us to share the workload. For example, I had to train her on the project I’ve been working on for the past two semesters (matching GOCCs to source notes). Through this process, I was able to work on my communication and leadership skills to convey the project’s goals and how to complete the required tasks. I had to learn how to train another intern and delegate the workload while also taking leadership of the overall data management of the datasets. I was also able to reflect on which aspects of my communication still need work, based on the feedback given and what needed to be explained further. From this feedback, I can adjust my training method as we start to annotate the source notes using the CLAMP software.

In December 2021, I participated in the Indiana University Undergraduate Research Conference (IUURC), which has been one of my favorite experiences thus far. I was able to practice writing an academic abstract and develop a poster to display the work we’re conducting. At the conference, presenting our poster was an enriching experience as I was able to discuss the project with individuals from a wide range of fields. I was also able to see what undergraduates from other IU campuses were working on in their respective projects. It was great to have discussions with other undergraduate researchers about the projects they were passionate about in an environment that fosters development. Below is the poster I presented as well as pictures from the event.

The Workplace

The most prominent thing I’ve noticed about professionals in the workplace is how they carry themselves. Everyone I’ve seen shows a level of competence that comes from being an expert in their chosen field. They are able to answer any question about their research project with integrity and clarity. As a professional, I want to grow my interpersonal communication skills so that I can succinctly express my ideas verbally. During meetings, my supervisor allows me to offer my own ideas or present challenges to hers. I often have to present at the beginning of these meetings to establish what progress has been made and what next steps need to be taken. I find constructive criticism and continual feedback very helpful in developing my skills as a professional, and I enjoy it whenever my supervisor discusses what I did well and what needs work when I complete a project.

Going into my internship, I expected to handle projects of lesser importance. However, my intern advisor has involved me in every aspect of research. I have started to view my position less as an intern and more as an undergraduate research assistant. She has stated that while she has research milestones she wants to achieve, as a mentor her goal is to aid in my success past the LHSI internship. Additionally, I expected I would have strict guidelines with the projects I was assigned. To my great surprise, my supervisor typically provides me loose guidelines to see what I come up with first before inputting constructive criticism. With this method towards the beginning of the internship, my supervisor now knows the level of my work and expects that with new assigned projects. This helps me set expectations for myself in what I view as acceptable work to present. I have a lot of flexibility on when I can complete work and get deliverables in by. Due to my independent nature, I enjoy working in this type of environment while still getting to interact with other professionals once I have a product to present. The structure of independent work until you need feedback is helpful as I don’t feel like I am being micro-managed and have room to figure out solutions on my own or until I need outside help.

Starting this experience, I was genuinely surprised by the overall dynamic of the mentor-mentee relationship. While I’ve taken part in mentorship programs such as the Honors Peer Mentor Program, I’ve gained so much insight into the professional realm of this field through my mentor. Dr. Umberfield is incredibly supportive of all my endeavors, even offering to proofread important emails or applications to external sites. Every week, we have a specified meeting time where we focus on a professional development topic. For example, one week we discussed what a salary negotiation looks like and how to initiate those conversations. These meetings are extremely helpful in developing the soft skills that I can apply after this internship experience. Through the partnership with Dr. Umberfield, I feel I have become a much more well-rounded professional.

The Regenstrief Institute has a huge network of working professionals who are all experts in their respective fields. As a result, when we need an outside opinion on a certain aspect of the project, I am able to interact with individuals outside of my daily network. Every individual comes from a different background, many from varying cultures. This allows for a diverse collaboration of professionals who bring new perspectives to a project. We continually communicate with our partners at the University of Texas (UT), who are experts on the Natural Language Processing (NLP) side of our study. I think it’s important to interact not only with those outside of our own cultural backgrounds, but also with those whose experience and fieldwork differ from ours. While our partners at UT are experts in NLP analysis, we had to explicitly discuss the application of the NLP analysis in terms of GOCCs so they could fully understand the scope of the study. It was interesting to see how the gap between application terminology and technical language can create a divide in understanding. This process also stresses the importance of communication and of learning to reduce specialized jargon when speaking to others outside one’s direct field.

At my internship site, there is continual collaboration between departments. Since the Regenstrief Institute has such a wide breadth of expertise, PIs are able to gather a team on site to accomplish a set goal by pulling individuals from different areas. This type of collaboration across departments requires constant communication. My internship site uses Microsoft Teams to exchange updates on projects either in 1:1 chat messages or in group threads of multiple investigators. The tone in this medium is very informal, and it is used for check-ins on progress and deliverables as well as questions that can be answered without a face-to-face meeting. Formal requests are sent via email with the entire team cc’ed so they are aware of what is going on without necessarily requiring their reply.

I feel the environment at my current internship site exemplifies my ideal workplace culture. I value that everyone at my workplace is able to communicate, in a welcoming way, with others who are outside of their discipline and aren’t intrinsically familiar with the subject matter. This instills a foundation of communication between individuals in the workplace, where it isn’t necessary for everyone to share a homogeneous mindset. I feel this enhances diversity, as different backgrounds and disciplines converge in a single location. Additionally, I greatly enjoy the structure of independent work until you need feedback, so I am able to resolve challenges on my own. One of my supervisors values short meetings, which has made me prepare thoroughly for meetings so they stay focused on the problem or future progress.

Since the projects within my internship site are independent in nature, interacting with those outside my direct cultural group occurs most often when receiving critical feedback. During these instances, my attitudes are continually challenged as I'm presented with different ways to solve a problem. For example, if I’ve hit a roadblock in my code and present the problem and attempted solutions, a differing perspective might provide insight into a path I hadn’t even considered. In the future, I would like to grow as a professional through direct collaboration on a project with individuals whose backgrounds differ from my own.

Successes and Challenges

Having written sections of the Syntactic Interoperability textbook chapter, I will be a co-author on my first publication once the chapter is submitted and accepted by the journal. I consider this a huge success as I thought having my name in a publication was only an option post-graduation. Collaborating with a group of researchers on a piece of writing was very validating as I was treated as an equal despite my limited experience. The data preparation stage prior to a natural language processing algorithm was quite extensive, requiring multiple Python scripts before the datasets were ready for text mapping. Every time I got a piece of code to run without error, it was a tiny success that let me move forward with the next task.

While each time a script ran smoothly was a success, there were many challenges to overcome to get to that point. For instance, when I was writing the text mapping program that indicated whether a clinical note contained a goals of care conversation, I kept getting no matches. This was unexpected, as the algorithm I wrote worked on a dummy dataset I had made (I did this so I could work with a smaller dataset instead of one with 31k clinical note text files). To fix this problem, I had to write a separate Python script to normalize both the clinical notes and GOCC excerpts by removing XML tags, non-standardized character encodings, and white space variability.
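A minimal sketch of what such a normalization and matching step could look like in Python is shown here. The function names, the sample notes, and the exact cleaning rules are illustrative assumptions and not the project's actual script; in particular, treating "non-standardized character encodings" as HTML/XML entities plus Unicode compatibility forms is a guess about what the real data needed.

```python
import html
import re
import unicodedata
from typing import Dict, List

# Hypothetical patterns: anything between angle brackets is treated as a tag,
# and any run of whitespace is collapsed to a single space.
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Strip markup, unify character forms, and collapse whitespace."""
    text = TAG_RE.sub(" ", text)                 # remove XML-style tags
    text = html.unescape(text)                   # decode entities such as &amp;
    text = unicodedata.normalize("NFKC", text)   # e.g. non-breaking space -> space
    text = WHITESPACE_RE.sub(" ", text)          # collapse spaces, tabs, newlines
    return text.strip()


def flag_notes(notes: Dict[str, str], excerpts: List[str]) -> Dict[str, bool]:
    """Return, for each note id, whether any normalized excerpt occurs in the note."""
    clean_excerpts = [e for e in (normalize(x) for x in excerpts) if e]
    flags = {}
    for note_id, body in notes.items():
        clean_body = normalize(body)
        flags[note_id] = any(e in clean_body for e in clean_excerpts)
    return flags


# Made-up sample data for illustration only.
notes = {
    "note_1": "<p>Patient   states she wants\u00a0comfort-focused care.</p>",
    "note_2": "Routine follow-up visit.\nNo changes.",
}
excerpts = ["Patient states she wants comfort-focused care."]
result = flag_notes(notes, excerpts)
```

Normalizing both sides with the same function is the key design choice: an exact substring check only succeeds when the note and the excerpt have been cleaned in the same way.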

As we enter the spring semester, I will keep working on my problem-solving skills as we move towards developing the natural language processing algorithm. Being able to identify mistakes and take actionable steps to fix them is a valuable skill to have. It is even more important to view these mistakes as learning opportunities instead of failures.

Following on from my previously stated challenge, I later found major discrepancies between the two working datasets, even after normalizing for XML tags, non-standardized character encodings, and white space variability. This meant I couldn’t design a program that would automatically conduct 1:1 GOCC excerpt to clinical source note matching. To remedy this problem, I have been manually matching the excerpts to their source notes through various Excel functions. While this is tedious, it is more thorough, and I can see exactly why my program didn’t initially work. This setback has led to a new success, as I have been named a project manager on this project. I will be training other interns to conduct these manual matches so the project will hopefully be concluded in the next few weeks.

A new challenge I have faced is taking on another project at my internship site under a different supervisor that is outside my work under Dr. Umberfield. The challenge I am working through is time management and splitting my time between these two projects. While I want to perform to the best of my ability in both spaces, it is difficult knowing I could get further in either if I could allocate my entire week’s time to a single project. Additionally, the actual work of the secondary project is outside my working knowledge, so I continually have to push myself and learn new things I can use to create a working solution.

The most challenging part of this internship was consistently pushing my knowledge base to meet the technical needs of the projects I’m working on. I had to use my informatics coursework as a foundation to build upon instead of something to fall back on. In this way, I was continually learning on both the general and technical side. Coming into this experience, I faced a steep learning curve as I wasn't aware of the inner workings of a health information exchange, how that information aids patient-driven care, and how it can be further improved. Co-authoring a textbook chapter on messaging standards in health information exchange allowed me to rapidly overcome this learning curve through reading and writing a large volume of material. In this process, I was able to practice professional writing to efficiently convey these ideas to readers who were learning just as I was. Additionally, becoming deeply familiar with datasets extracted directly from the INPC allowed me to understand how data is structured in an HIE so I was then able to restructure it for the project’s purpose.

Most recently, the greatest success with the GOCC NLP project is finally completing the training and testing dataset prior to the development of an NLP algorithm. The finalization of this dataset has allowed the study team to begin planning the next steps for the project. To read more about the results, conclusions, and next steps of this project, head to the Project section under My Internship!


