Virtual 3D Human Chatbot
by Ruben de Boer for computer science honours 2023-2024
As part of the Disciplinary Honours Program at Utrecht University, I worked on the 3D Digital Human Chatbot project. The goal of this project was to develop an interactive chatbot specifically designed for the Game and Media Technology Master's program. The chatbot is intended to assist prospective students, particularly during events like the Master's Open Day, by answering their questions about the program.
What sets this chatbot apart is its human-like 3D interface. Rather than just exchanging text, users can engage in a more natural and immersive conversation. The bot features a digital human face and body, and it communicates using voice, facial expressions, gestures, and gaze, creating a more lifelike and intuitive interaction experience.
In this project, my main focus was on developing the prototype, which will later be developed further by a new team. The prototype features a lifelike (though slightly uncanny) digital human assistant and delivers fast response times to ensure a smooth, comfortable user experience. The bot engages in natural conversation and shows subtle human-like behavior such as emotions and blinking. We conducted user tests to evaluate the prototype: participants were intrigued (if a bit unsettled) by the human-like appearance and behaviour of the assistant, and were impressed by its ability to respond clearly and accurately to questions about the Master's program.
The project team consisted of me (Ruben) and two Master's students, Pascal and Romy, who conducted their thesis work as part of the project. Romy focused on developing the digital human part of the project, while Pascal and I were mostly responsible for building the chatbot component. The project was supervised by Dr. Zerrin Yumak from Utrecht University.
As mentioned above, I was partly responsible for developing the chatbot's core functionality. Specifically, I implemented the connection to OpenAI's API and integrated it with LangChain and NVIDIA Audio2Face, using Python. The main technical challenge was maintaining low-latency responses while feeding custom data to the chatbot over the network, which adds considerable overhead. Another key focus of my work was making the dialogue feel human-like: I researched and applied prompt engineering techniques to guide the chatbot's behavior, aiming to make its responses feel more engaging and fluid. Finally, I studied relevant literature to guide our design choices and deepen my understanding of LLMs and HCI.
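A common way to attack the latency problem described above is to stream the model's output and hand each completed sentence to text-to-speech immediately, instead of waiting for the full answer. The sketch below illustrates that idea only; the function name `sentence_chunks` and the fake token stream are my own illustrations, not the project's actual code.

```python
import re

def sentence_chunks(token_stream):
    """Group streamed LLM tokens into complete sentences, so that
    text-to-speech can start speaking the first sentence while the
    model is still generating the rest of the answer."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # Flush whenever the buffer contains sentence-final punctuation
        # followed by whitespace.
        while True:
            match = re.search(r"[.!?]\s", buffer)
            if not match:
                break
            yield buffer[:match.end()].strip()
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()

# Fake token stream standing in for the streamed OpenAI API response:
tokens = ["The Master", "'s takes ", "two years. ", "Apply before ", "April."]
print(list(sentence_chunks(tokens)))
```

With streaming, the time to the first spoken word depends only on how quickly the model produces its first sentence, not on the length of the whole reply.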
The assistant is built by integrating several existing technologies. The dialogue is powered by OpenAI's GPT-4o, which LangChain supplies with custom data from the university website. When a user asks a question, it is sent to GPT-4o along with tailored instructions and relevant university information to guide its behavior and ensure accurate responses. The model answers in text, which is then converted into speech using OpenAI's text-to-speech technology. This audio is processed by NVIDIA's Audio2Face to animate the assistant's lip movements in real time. The final output, including speech, lip sync, blinking, and other facial animations, is rendered in Unreal Engine using MetaHuman, creating a realistic and responsive virtual assistant.
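The pipeline above can be sketched as a simple composition of four stages. In this hedged sketch, each stage is passed in as a callable so the real services (LangChain retrieval, GPT-4o, OpenAI TTS, Audio2Face) can be replaced by stubs; the function `answer_question` and the lambda stubs are illustrative, not the project's real interfaces.

```python
from typing import Callable

def answer_question(question: str,
                    retrieve: Callable[[str], str],
                    llm: Callable[[str, str], str],
                    tts: Callable[[str], bytes],
                    animate: Callable[[bytes], str]) -> str:
    """One pass through the pipeline:
    retrieval -> LLM -> speech synthesis -> facial animation."""
    context = retrieve(question)    # LangChain: fetch relevant website passages
    reply = llm(question, context)  # GPT-4o: grounded text answer
    audio = tts(reply)              # OpenAI TTS: text -> audio bytes
    return animate(audio)           # Audio2Face/MetaHuman: drive the face

# Stubs standing in for the real services:
demo = answer_question(
    "How long is the program?",
    retrieve=lambda q: "The program takes two years.",
    llm=lambda q, ctx: f"Answer based on: {ctx}",
    tts=lambda text: text.encode(),
    animate=lambda audio: f"rendered {len(audio)} audio bytes",
)
print(demo)
```

Keeping the stages decoupled like this is also what allows a follow-up team to swap out any single component (for example, a different TTS voice or renderer) without touching the rest.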