
This repo contains the official implementation for the paper A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents, published as a demonstration at the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS),

by Rajmund Nagy, Taras Kucherenko, Birger Moëll, André Pereira, Hedvig Kjellström and Ulysses Bernardet.


We present a framework for integrating recent data-driven gesture generation models into interactive conversational agents in Unity. Our video demonstration is available below:

video demonstration

Instructions for running the demo

This branch contains the Blenderbot version of our implementation, with a built-in chatbot and TTS. You may visit the dialogflow_demo branch for an alternative version that integrates DialogFlow into the project for speech generation.

Please follow the instructions in INSTALLATION.md to install and run the project.

Architecture

Our framework is designed to be fully modular, so it can be used with different voices, chatbot backends, gesture generation models, and 3D characters. However, using it in a new project will require some coding, for which we provide guidance below.

Unity integration

The source code of the Unity scene with DialogFlow integration is available at this link, while the Blenderbot version is available here. The relevant C# scripts are found in the Assets/Scripts/ folder. The entry point of the Python code is the main.py file, while the bulk of the implementation is found in gesture_generator_service.py.

  • The C# and the Python scripts communicate over ActiveMQ, as implemented in the ActiveMQClient.cs and the messaging_server.py files (a minimal sketch of this pattern follows the list).
  • Once the generated motion arrives at the 3D agent, the MotionVisualizer.cs file animates its model by modifying the localRotation values of each joint.
    • There is no clear convention for how 3D models handle joint rotations in Unity. The 3D joint angles generated by Gesticulator follow the BVH format; applying them to new character models will require Unity knowledge and some tinkering (see the second sketch below for one piece of that conversion).
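
The concrete message format lives in messaging_server.py; the sketch below only illustrates the general pattern, using the stomp.py client (ActiveMQ supports the STOMP protocol). The topic names and JSON payload here are assumptions made for the example, not the repo's actual protocol.

```python
import json
import time

import stomp  # pip install stomp.py; listener signature below matches stomp.py 8.x


def generate_motion(text, audio_path):
    """Placeholder for the gesture model: returns per-frame joint angles."""
    return []


class SpeechListener(stomp.ConnectionListener):
    """Replies to speech messages from Unity with generated motion frames."""

    def __init__(self, connection):
        self.connection = connection

    def on_message(self, frame):
        request = json.loads(frame.body)  # hypothetical JSON payload
        motion = generate_motion(request["text"], request["audio_path"])
        self.connection.send(
            destination="/topic/motion", body=json.dumps({"frames": motion})
        )


conn = stomp.Connection([("localhost", 61613)])  # ActiveMQ's default STOMP port
conn.set_listener("", SpeechListener(conn))
conn.connect(wait=True)  # add credentials here if your broker requires them
conn.subscribe(destination="/topic/speech", id=1, ack="auto")

while True:  # keep the process alive; messages arrive on a background thread
    time.sleep(1)
```

On the Unity side, ActiveMQClient.cs would play the mirror-image role, publishing speech and subscribing to the motion topic.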
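
One piece of that tinkering can be made concrete: BVH stores each joint's rotation as Euler angles in a per-joint channel order, while Unity's localRotation is a quaternion, so a conversion along the following lines is needed somewhere in the pipeline. The channel order below is an assumption; check your skeleton's HIERARCHY section.

```python
from scipy.spatial.transform import Rotation


def bvh_euler_to_quaternion(z_deg, x_deg, y_deg):
    """Convert one joint's BVH Euler angles (degrees) to a quaternion (x, y, z, w).

    Assumes the common Zrotation/Xrotation/Yrotation channel order; BVH files
    are free to declare a different order per joint.
    """
    # Uppercase axes = intrinsic rotations, matching how BVH chains its channels.
    q = Rotation.from_euler("ZXY", [z_deg, x_deg, y_deg], degrees=True).as_quat()
    # Note: BVH data is right-handed while Unity is left-handed, so an axis
    # flip is usually needed on top of this; its exact form depends on the model.
    return q
```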

Chatbot backend

  • This branch uses Blenderbot as the built-in chatbot backend; the dialogflow_demo branch integrates DialogFlow instead.

Gesture generation

  • We use the Gesticulator model in both demonstrations; it generates motion as 3D joint angles, using speech text and audio as input.
  • To use other models, the following must be considered:
    • 3D joint angles are required to animate the 3D model in Unity, so the gesture generation model must return motion in that format.
    • For each model, an interface must be implemented that returns the generated gestures for a given speech segment; the GesturePredictor class shows how we implemented this for Gesticulator. (A sketch of such an interface follows this list.)
  • StyleGestures is a good alternative model with a compatible codebase.
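
The shape of such an interface might look like the minimal sketch below. Apart from the GesturePredictor name, the classes and method names are hypothetical; the real implementation in gesture_generator_service.py wraps Gesticulator's own inference code.

```python
import abc


class BaseGesturePredictor(abc.ABC):
    """Hypothetical base interface: speech in, per-frame 3D joint angles out."""

    @abc.abstractmethod
    def predict_gestures(self, text: str, audio_path: str) -> list:
        """Return a list of per-frame 3D joint angles for the given utterance."""


class GesticulatorPredictor(BaseGesturePredictor):
    """Sketch of what a Gesticulator-backed predictor could look like."""

    def __init__(self, model):
        self.model = model  # a loaded Gesticulator checkpoint

    def predict_gestures(self, text, audio_path):
        # 'generate_gestures' stands in for Gesticulator's actual inference
        # entry point; see the repo's GesturePredictor class for the real call.
        return self.model.generate_gestures(text, audio_path)
```

Swapping in StyleGestures, or any other model, then amounts to writing another subclass that returns motion in the same joint-angle format.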

Acknowledgements

The authors would like to thank Lewis King for sharing the source code of his JimBot project with us.

Citation

If you use this code in your research, please cite our paper:

@inproceedings{Nagy2021gesturebot,
author = {Nagy, Rajmund and Kucherenko, Taras and Moell, Birger and Pereira, Andr\'{e} and Kjellstr\"{o}m, Hedvig and Bernardet, Ulysses},
title = {A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents},
year = {2021},
isbn = {9781450383073},
publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
address = {Richland, SC},
booktitle = {Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems},
location = {Virtual Event, United Kingdom},
series = {AAMAS '21}
}