Conversational Agents (F20/F21 CA)
Class times: Mondays 9-11am, room GRID: Inspire & Collaborate 2. Slides, videos, etc. are available on Canvas: https://canvas.hw.ac.uk/courses/20751/modules
Project group meetings: Wednesdays and Thursdays, 10am and 11am (allocation TBD): rooms:
BBC4: Alana on "The Joy of AI" with Jim Al-Khalili
Course descriptor:
This course aims to give students the opportunity to develop:
An extensive, detailed and critical knowledge of design, implementation and evaluation techniques for conversational agents and spoken language processing.
An awareness of current research and emerging issues in the field of conversational agents and spoken language processing.
A range of interdisciplinary research methods and specialised practical skills involved in building working conversational interfaces.
This course covers current and emerging topics in conversational agents, spoken language processing, and multimodal interfaces, including:
Introduction to research areas, such as spoken dialogue systems, multi-modal interaction, natural language processing, and human-robot interaction (HRI).
Spoken input processing and interpretation.
Interaction Management.
Output generation, multimodal fission, speech and gesture synthesis.
System development and evaluation.
Assessment
- by coursework and class participation: Presentation 15% + Demonstration 15% + Written report 60% + Self-reflection report 10%
Student Project description
Some ideas for projects:
next generation conversational search using Large Language Models (LLMs)
control a robot in an elderly care home, which will encourage people to play games and talk with each other (Furhat and ARI robots)
receptionist robot for the National Robotarium building (Furhat and ARI robots)
collaborative conversational systems for human-machine shared tasks.
Each project should address several of the following topics:
design: explain and motivate possible designs for your chosen system, including an analysis of the target user group and/or a relevant corpus analysis. Compare and contrast at least 2 possible designs;
analyse: based on research literature and previous approaches relevant to your system, write an analytical review of the main problems and research issues for your chosen task;
develop: use some of the tools listed below (or other relevant software, or your own code) to develop a working prototype of your system design;
evaluate: using either a working prototype or a functional mockup of your system(s), perform a contrastive evaluation of 2 or more possible designs, using qualitative and/or quantitative methods. Analyse your results for statistical significance.
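As a sketch of the quantitative side of the evaluate step, the following compares made-up user ratings for two hypothetical designs using Welch's t-test, implemented by hand from the standard formula. The ratings are invented placeholder data; in practice you would collect your own and could simply use scipy.stats.ttest_ind.

```python
# Hedged sketch: contrastive evaluation of two designs via Welch's t-test.
# The ratings below are invented placeholder data, not real results.
import math
from statistics import mean, variance

design_a = [4, 5, 3, 4, 5, 4, 4]   # e.g. 1-5 Likert ratings for design A
design_b = [3, 3, 4, 2, 3, 3, 4]   # ratings for design B

def welch_t(xs, ys):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    nx, ny = len(xs), len(ys)
    vx, vy = variance(xs) / nx, variance(ys) / ny   # sample variance / n
    t = (mean(xs) - mean(ys)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

t, df = welch_t(design_a, design_b)
print(f"t = {t:.2f}, df = {df:.1f}")   # compare t against a t-table for significance
```

With these toy numbers the test gives t ≈ 2.71 with about 12 degrees of freedom; whether that is significant at your chosen alpha is what you would then look up or compute.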
Ethical issues, ethical approval: https://heriotwatt.sharepoint.com/sites/EthicsManagementSystem
Schedule
Week 1: Course introduction (OL):
Intro to spoken dialogue systems and conversational agents. Intro to Natural Language Processing. System architectures.
Sci-fi vs reality: HAL, "Her", ELIZA, WITAS, Siri, Google Assistant, Alexa, ChatGPT, Alana
Alana system demo
Lab: group projects (presentations and discussion)
Reading: Jurafsky and Martin (J&M) chapter 24
Week 2: Pragmatics of conversation
What do we do when we have a conversation?
Speech acts, Relevance, Context
Grounding, Repair, Turn-taking
Lab: analysing and annotating dialogue data: bAbI, bAbI+, Alana
Lab: data collection and annotation
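For the annotation labs, dialogue-act annotations can be represented as simply as the following sketch. The tagset (greet, inform, request, ack) and the example turns are invented for illustration, not taken from bAbI or Alana.

```python
# Hedged sketch: representing and counting dialogue-act annotations.
# Tagset and turns are illustrative only.
from collections import Counter

dialogue = [
    {"speaker": "USR", "text": "Hi, I'm looking for a cheap restaurant.",
     "acts": ["greet", "inform"]},
    {"speaker": "SYS", "text": "What kind of food would you like?",
     "acts": ["request"]},
    {"speaker": "USR", "text": "Italian, please.", "acts": ["inform"]},
    {"speaker": "SYS", "text": "OK, Italian.", "acts": ["ack"]},  # grounding move
]

def act_counts(turns):
    """Count how often each dialogue act occurs in an annotated dialogue."""
    return Counter(act for turn in turns for act in turn["acts"])

print(act_counts(dialogue))
```

Counting act frequencies like this is a typical first step when analysing an annotated corpus.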
Week 3: Natural Language Understanding (NLU):
Introduction to Natural Language Understanding (NLU): rule-based systems, statistical methods, distributional semantics.
RASA NLU tools and Furhat NLU tools
Lab: using RASA and Furhat
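To make the rule-based side of NLU concrete, here is a toy keyword-and-regex parser for intents and slots. It is purely illustrative and not the RASA or Furhat API; RASA instead trains a classifier from annotated example utterances. The intent names and patterns are invented.

```python
# Hedged sketch: toy rule-based NLU (intent + slot extraction).
# Intents and patterns are invented for illustration; RASA learns these from data.
import re

INTENT_PATTERNS = {
    "greet":        re.compile(r"\b(hello|hi|hey)\b", re.I),
    "book_table":   re.compile(r"\b(book|reserve)\b.*\btable\b", re.I),
    "request_menu": re.compile(r"\bmenu\b", re.I),
}

def parse(utterance):
    """Return (intent, slots) for an utterance, or ('unknown', {})."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(utterance):
            # Very naive slot filling: look for a party size like "for 4".
            m = re.search(r"\bfor (\d+)\b", utterance)
            slots = {"party_size": int(m.group(1))} if m else {}
            return intent, slots
    return "unknown", {}

print(parse("Can you book a table for 4?"))
```

The brittleness of hand-written rules like these is exactly what motivates the statistical and distributional methods covered in the lecture.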
Week 4: Evaluation and Data Collection (VR):
evaluation methods
crowdsourcing dialogue data
Class exercise: evaluation plan for each group
Week 5: Student Project presentations + feedback session (ALL):
Your presentation should cover:
What will your system do?
Example dialogues; screen shots / mock-ups.
What is the main research question / focus of the project? E.g. SLU, DM, NLG or ….?
What components will it have? Which tools / subsystems will you use?
How will you evaluate it? e.g. User tests? Simulations?
What are your roles in the project?
Project plan for weeks 5-12
Week 6: Consolidation / project work
Week 7: Ethics/Safety and Response Generation (NLG) (VR):
Templates, grammars, aggregation, prosody. SimpleNLG, ...
Transformers: Write With Transformer
Lab: (1) chatting to rule-based and neural chatbots, such as NeuralConvo; (2) language models / writing with transformers; (3) playing with word embeddings; (4) testing visual dialogue (exercises described in slides)
Reading: J&M chapter 20
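A minimal sketch of the template-and-aggregation approach to response generation. SimpleNLG itself is a Java library with a much richer realisation API; this toy Python version, with invented attribute templates, only illustrates the idea of aggregating several attributes into one fluent sentence rather than generating one sentence per attribute.

```python
# Hedged sketch: template-based NLG with simple aggregation.
# Attribute templates and the example entity are invented for illustration.
def describe(name, attributes):
    """Realise a one-sentence description, aggregating all known attributes."""
    parts = []
    if "food" in attributes:
        parts.append(f"serves {attributes['food']} food")
    if "area" in attributes:
        parts.append(f"is in the {attributes['area']} of town")
    if "price" in attributes:
        parts.append(f"has {attributes['price']} prices")
    if not parts:
        return f"{name} is a restaurant."
    if len(parts) == 1:
        body = parts[0]
    else:
        # Aggregation: join clauses with commas and a final "and".
        body = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"{name} {body}."

print(describe("Luca's", {"food": "Italian", "area": "centre", "price": "moderate"}))
```

Compare the aggregated output with the three separate sentences you would otherwise get: this trade-off between fluency and template complexity is a recurring NLG design issue.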
Week 8: Neural Response Generation + Huggingface tutorial
Week 9: Dialogue Management (OL):
DM methods (VXML, AIML, rules, plans, RL ...)
Reinforcement Learning for Dialogue Management
Lab: using RASA core
Reading: J&M chapter 19
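To illustrate Reinforcement Learning for dialogue management, the following is a toy tabular Q-learning agent on an invented 2-slot form-filling task: the state is the number of slots filled, "ask" fills one slot at a small turn cost, and "close" ends the dialogue with a reward that depends on whether the form is complete. States, actions and rewards are all made up for illustration; real systems learn over far larger state spaces.

```python
# Hedged sketch: tabular Q-learning for a toy 2-slot form-filling dialogue.
# The MDP (states, actions, rewards) is invented for illustration.
import random

ACTIONS = ["ask", "close"]
Q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.95, 0.2

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    if action == "ask":
        return min(state + 1, 2), -1.0, False      # turn cost for each question
    return state, (10.0 if state == 2 else -10.0), True  # reward for closing

random.seed(0)
for _ in range(500):                               # training episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        target = r if done else r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

policy = {s: max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(3)}
print(policy)
```

After training, the greedy policy asks until both slots are filled and only then closes, i.e. the agent has learned a sensible dialogue strategy purely from the reward signal.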
Week 10: TTS, HRI, and multimodal systems:
Lab/practical: developing emotional embodied conversational agents.
Week 11: Student project demos/ reports + feedback session (ALL):
Your demo presentation should be a mixture of slides and video or a live demo (recommended length: about 10 minutes), used to explain:
- the overall concept and aims of your system
- the main problems to be solved in creating your system
- what your demo is showing – i.e. what your system’s new features are
- the main software / NLP modules that your team has worked on
- the evaluation plan (and any results you have)
- how it could be improved / future work
- roles of different team members in the system development
Week 12: Project troubleshooting session, report writing (ALL):
Deadline: student project report (minimum 6 pages, maximum 8 pages; appendices and bibliography not counted in the page count; ACL conference-paper style, using either the LaTeX template or the Word template) + individual self-reflection report (1 page): send individually by email with the title: Self-reflection report <MY USER NAME> <MY STUDENT ID>
Deadline: Individual self-reflection report: Write a report (maximum 1 page) on your experience of the group process answering the following questions:
1. How did you plan and manage your own work within the group?
2. To what extent did you independently solve problems and take initiative within the group?
3. How did you take responsibility for your own and others' work by contributing effectively and conscientiously to the work of your group?
4. How did you actively maintain good working relationships with group members?
5. Did you lead the direction of the group project or any aspect of it?
6. Critically reflect on your roles and responsibilities within the group, and the roles and responsibilities of the other members.
Resources
SIGDIAL conferences: 2015, 2016, 2017, 2018, 2019 ...
Software and Tools
RASA masterclass / youtube channel
Chatito: https://github.com/rodrigopivi/Chatito
InProTK: video demonstration with OpenDial integration [here]
DyLan: semantic parser
LUIS: example-based trainable SLU via http
Cereproc: speech synthesiser, free voices for Windows and Mac OS X
NVivo: software for qualitative research
NLTK: Natural Language ToolKit: Information Extraction, chunking, tagging, parsing, NER
CoreNLP: Stanford parsing and NLP tools
AIML: free online course
Praat: speech analysis software
ELAN: annotation tool
Android speech API: speech recognition and synthesis on Android
OpenEars: free speech recognition and synthesis on iPhone
Web speech API for Chrome
KALDI: speech recognition toolkit
Boxer: language understanding
WIT AI : API for spoken language understanding
OpenDial: dialogue system toolkit
IrisTK: multimodal dialogue system toolkit
Voice XML
SPSS: statistical analysis (this is installed on the university computers)
Sirius: open source personal assistant (like Siri)
Balsamiq: wireframing / mockup tools
Relevant videos
Dialogue Semantics and Pragmatics: David Schlangen: part 1, part 2
Interaction Lab videos: SpeechCity, JAMES, ECHOES, Parlance etc
Incremental spoken dialogue system: "Numbers"
Google video on speech understanding, deep neural networks
Geoff Hinton's Royal Society Lecture on Deep Learning
Apps to play with
Google Now
Assistant (api.ai)
Siri
Cortana
Indigo
SpeechCity
Assessment -- see mark sheet below