Exemplars of ergonomic, portable user interfaces for remote robot control: (a) a portable control console with IMUs for motion tracking; (b) a handheld controller with a depth camera and IMUs for motion tracking; (c) a handheld controller with a stereo camera and IMUs for motion tracking; and (d) a wearable controller with embedded IMUs for motion tracking.
To offer users a more intuitive robot control experience, we have integrated natural language-based robot control into the HuBotVerse framework. Utilizing the capabilities of Large Language Models (LLMs), such as GPT-3.5 Turbo, robots can now comprehend users' spoken instructions. After a spoken command is transcribed, the system transmits it to ChatGPT, which formulates a response containing Python code commands for robot movement. Once this code is executed, users can effectively direct the robots.
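The overall flow can be summarized in a minimal sketch. Everything in it is illustrative rather than our actual implementation: the Robot class, the placeholder helpers transcribe_speech and generate_robot_code (stand-ins for the Azure and GPT calls detailed below), and the use of Python's exec to run the generated commands are all assumptions for the sake of the example.

```python
# Minimal end-to-end sketch of the voice-control loop. transcribe_speech()
# and generate_robot_code() are placeholders for the concrete Azure and GPT
# calls sketched further below; the Robot class and exec() are illustrative.

class Robot:
    def move_forward(self, meters: float) -> None:
        print(f"Moving forward {meters} m")

def transcribe_speech() -> str:
    # Placeholder: would return the user's spoken command as text.
    return "move the robot ten centimeters forward"

def generate_robot_code(command: str) -> str:
    # Placeholder: would ask the LLM for Python movement commands.
    return "robot.move_forward(0.10)"

robot = Robot()
generated = generate_robot_code(transcribe_speech())
exec(generated, {"robot": robot})  # executing the generated code directs the robot
```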
More specifically, we have embedded Speech-to-Text and Text-to-Speech functions into our proposed framework, leveraging Azure Cognitive Services and AudioSource components.
The procedure commences with users voicing commands, which are recorded and subsequently transcribed into text.
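As a concrete illustration, the snippet below sketches how this Speech-to-Text step (and the corresponding Text-to-Speech reply) might be realized with the Azure Cognitive Services Speech SDK for Python. The subscription key and region are placeholders, and recognizing a single utterance from the default microphone is an assumption for the example; in our framework the equivalent steps run alongside AudioSource components.

```python
# Sketch of Speech-to-Text and Text-to-Speech via the Azure Speech SDK
# (pip install azure-cognitiveservices-speech). Key/region are placeholders.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")

# Recognize a single spoken command from the default microphone.
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
result = recognizer.recognize_once()

if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    command_text = result.text
    print("Transcribed command:", command_text)

# Text-to-Speech: speak a confirmation back to the user.
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
synthesizer.speak_text_async("Command received.").get()
```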
Following this, the transcribed text is processed via the GPT-3.5 or GPT-4 APIs. Before invoking these APIs, we need to tune parameters such as temperature, max tokens, and presence penalty; this parameter tuning, together with the configuration of prompts, tailors the model's response generation. To facilitate the easy integration of new robotic systems, ChatGPT can produce a Python file for robot repositioning based on pre-established prompts.
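The snippet below sketches this step with the legacy openai Python SDK (pre-1.0 interface): the transcribed command is sent to gpt-3.5-turbo together with a pre-established system prompt, with temperature, max_tokens, and presence_penalty set explicitly, and the reply is saved as a Python file for robot repositioning. The prompt wording, the parameter values, and the move_robot.py filename are illustrative assumptions, not the framework's exact configuration.

```python
# Sketch: send the transcribed command to GPT-3.5 Turbo and save the reply
# as an executable Python file (legacy openai<1.0 SDK interface assumed).
import openai

openai.api_key = "YOUR_OPENAI_KEY"  # placeholder

SYSTEM_PROMPT = (
    "You control a mobile robot. Reply only with Python code that calls "
    "robot.move_to(x, y, theta) to reposition the robot."
)  # pre-established prompt; exact wording is an assumption

def command_to_python(command_text: str) -> str:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",   # or "gpt-4"
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": command_text},
        ],
        temperature=0.2,         # low temperature for more deterministic code
        max_tokens=256,
        presence_penalty=0.0,
    )
    return response["choices"][0]["message"]["content"]

code = command_to_python("Move two meters forward, then turn left.")
with open("move_robot.py", "w") as f:  # output file name is illustrative
    f.write(code)
```

Keeping the temperature low biases the model toward reproducible code rather than free-form prose, which matters when the output is executed directly.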