Approach & Method
Approach & Method
LLMNode decodes the textual-based natural language conversations.
CLIPNode provides a visual and semantic understanding of the robot's task environment.
REM node abstracts the high-level understanding from the LLMNode to the actual physical robot's actions.
ChatGUI serves as the user’s primary textual-based interaction point.
SRNode decodes the vocal or auditory natural language conversations from humans.