ASR Flipper2.0 MeaningMiner Integration Demo
This page explains the integration of ASR, Flipper2.0, and Meaning Miner inside Greta.
The Automatic Speech Recognizer (ASR) component identifies full phrases in the spoken language as a person is speaking and converts them into a machine-readable format.
Flipper2.0 (Flipper) is a dialogue engine that aims to help developers of embodied conversational agents (ECAs) to quickly and flexibly create dialogues. Flipper provides a technically stable and robust dialogue management system to integrate with other components of ECAs such as behavior realizers.
Meaning Miner focuses on representational gestures, i.e., gestures that accompany and illustrate the content of the speech. In particular, this component automatically produces metaphoric gestures that are aligned with the speech of the agent in terms of timing and meaning.
The ASR-Flipper2.0-MeaningMiner demo can be launched using the "Greta - ASR Flipper2 and Meaning Miner.xml" configuration file.
Flipper receives the input from the speech recognizer, evaluates its templates, finds appropriate rules to fire, executes the chosen rule, updates the information state, and finally sends the generated FML (FML-APML) messages to the MeaningMiner component to be played by Greta.
This demo uses a predefined set of FML templates. Flipper chooses one of these templates depending on the template rules and the user input. An important aspect of this demonstration is that the FML parameters of these templates can be modified dynamically at execution time; that is, the same template can be used to express different expressions of the agent depending on the application logic and the user input.
The interaction between the different components (FlipperDemo, SpeechRecognizer, and MeaningMiner) happens via ActiveMQ messages on port 61616 of localhost. These modules subscribe to or publish messages on ActiveMQ topics, as shown in the following three figures.
Fig 1: FlipperDemo GUI frame configuration specifying ActiveMQ configuration and Flipper Configuration
Fig 2: ASR configuration specifying ActiveMQ configuration
Fig 3: Meaning Miner configuration specifying ActiveMQ configuration to receive input from Flipper
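As a minimal illustration of this message flow (not the demo's actual code), a module can connect to the broker and publish a transcription on a topic. The topic name greta.ASR.result and the message text below are assumptions for illustration:

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

// Minimal sketch: publish one text message on an ActiveMQ topic.
// The broker URL matches the demo's configuration (localhost:61616);
// the topic name "greta.ASR.result" is a hypothetical example.
public class AsrTopicPublisherSketch {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createTopic("greta.ASR.result"));
        producer.send(session.createTextMessage("hello agent"));
        connection.close();
    }
}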
The following section illustrates the pipeline of the architecture for the demo. The class diagram of the architecture is shown in Fig 4.
Fig 4: Class Diagram for ASR-Flipper-MeaningMiner demo
The FlipperDemoGUIFrame class collects the ActiveMQ parameters and the Flipper configuration specifications, as shown in Fig 1. This class instantiates the FlipperLauncherMain class and initializes Flipper with the configuration parameters. The configuration parameters include the path of the property file (.properties) and the repository of the FML templates.
The flipperDemo.properties file contains the input specifications for Flipper's TemplateController. This property file mainly lists the templates used by Flipper.
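A minimal sketch of such a property file is shown below; the templates key follows the Flipper2 launcher examples, and the file names are hypothetical:

# Illustrative flipperDemo.properties; key name per the Flipper2
# launcher examples, template paths are hypothetical
templates=templates/initializeModules.xml,templates/dialogue.xml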
Flipper instantiates the information state that stores information about the current state of the interaction.
<is name="example">
{
    "init" : {},
    "input" : {
        "speech" : ""
    },
    "core" : {
        "uIntent" : "",
        "aIntent" : ""
    },
    "output" : {
        "speech" : ""
    },
    "agent": {
        "log": "",
        "fileName": "",
        "fml": {
            "template": "",
            "parameters": {}
        }
    }
}
</is>
Flipper initializes ASRInputManager and FMLGenerator as it executes the initializeModules template.
<!-- Initialize the modules -->
<template id="initializeModules" name="initializeModules">
    <preconditions>
        <condition>is.example.init === "{}"</condition>
        <condition>helpPrint("initializing")</condition>
    </preconditions>
    <initeffects>
        <method name="init" is="is.example.init.ASR">
            <object persistent="asr" class="greta.FlipperDemo.input.ASRInputManager">
                <constructors/>
            </object>
        </method>
        <method name="init" is="is.example.init.agent">
            <object persistent="fmlGenerator" class="greta.FlipperDemo.dm.managers.FMLGenerator">
                <constructors/>
            </object>
        </method>
    </initeffects>
</template>
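For reference, a class instantiated this way only needs the shape the template expects: a public no-argument constructor (declared via <constructors/>) and the method named in the template. The skeleton below is a hypothetical sketch, not the demo's actual ASRInputManager, and assumes (as the is attribute suggests) that the method's return value is written to the given IS path:

package greta.FlipperDemo.input;

// Hypothetical skeleton of a module Flipper instantiates from a template.
public class ASRInputManager {

    public ASRInputManager() {
        // e.g., subscribe to the ASR ActiveMQ topic here (assumption)
    }

    public String init() {
        return "ok"; // plausibly stored at is.example.init.ASR by Flipper
    }
}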
ASRInputManager recognizes the user's speech input and publishes the transcribed text. Flipper receives the user input, computes a suitable user intent, and updates the information state.
<!-- Set user intent when speech -->
<template id="setUserIntent">
    <preconditions>
        <condition>is.example.input.speech !== ""</condition>
    </preconditions>
    <effects>
        <assign is="is.example.core.uIntent">getUserIntent(is.example.input.speech)</assign>
        <assign is="is.example.input.speech">""</assign>
    </effects>
</template>
Based on the user's intent, Flipper computes the agent intent and chooses the corresponding FML output file name and the list of FML parameters.
<!-- Set agent speech based on agent intent -->
<template id="setAgentSpeech">
    <preconditions>
        <condition>is.example.core.aIntent !== ""</condition>
    </preconditions>
    <effects>
        <assign is="is.example.output.speech">setAgentSpeech(is.example.core.aIntent)</assign>
        <assign is="is.example.agent.fml.template">setAgentSpeech(is.example.core.aIntent)</assign>
        <assign is="is.example.agent.fml.parameters['emotion.e1.type']">"joy"</assign>
        <assign is="is.example.core.aIntent">""</assign>
    </effects>
</template>
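To illustrate where such a parameter could land in the output, the fragment below sketches an FML-APML emotion element whose type attribute carries the value "joy". The element shape follows Greta's FML-APML examples, but the time markers and the exact parameter-to-attribute mapping are assumptions:

<!-- Hypothetical FML-APML fragment: the parameter key emotion.e1.type
     plausibly addresses the type attribute of the emotion element e1 -->
<fml>
    <emotion id="e1" type="joy" start="s1:tm1" end="s1:tm2" importance="1.0"/>
</fml>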
Flipper then executes the executeFMLTemplate template, which launches the executeTemplate behaviour of the FMLGenerator class. This class retrieves the content of the specified FML file and replaces the FML parameters. The modified content (FML-APML) is sent to the MeaningMiner component, which plays it and dynamically generates the metaphoric gestures.
<!-- Say agent speech -->
<template id="executeFMLTemplate">
    <preconditions>
        <condition>is.example.agent.fml.template !== ""</condition>
    </preconditions>
    <effects>
        <behaviour name="executeTemplate">
            <object class="greta.FlipperDemo.dm.managers.FMLGenerator" persistent="fmlGenerator"></object>
            <arguments>
                <value class="String" is="is.example.agent.fml" is_type="JSONString"/>
            </arguments>
        </behaviour>
        <assign is="is.example.agent.fml.template">""</assign>
        <assign is="is.example.output.speech">""</assign>
    </effects>
</template>
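A rough sketch of what executeTemplate could do is given below. It is a hypothetical reconstruction, assuming that the is.example.agent.fml object arrives as a JSON string (as the is_type="JSONString" argument suggests), that the templates live in the FMLTemplates directory, that placeholders look like ${emotion.e1.type}, and that the org.json library is available:

package greta.FlipperDemo.dm.managers;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.json.JSONObject;

// Hypothetical sketch of the executeTemplate behaviour: load the named
// FML-APML template, substitute the parameters, and hand the result on.
// File layout, placeholder syntax, and the publishing step are assumptions.
public class FMLGeneratorSketch {

    public String executeTemplate(String fmlJson) throws IOException {
        JSONObject fml = new JSONObject(fmlJson);
        String content = new String(
                Files.readAllBytes(Paths.get("FMLTemplates", fml.getString("template") + ".xml")),
                StandardCharsets.UTF_8);
        JSONObject params = fml.getJSONObject("parameters");
        for (String key : params.keySet()) {
            // e.g., replace ${emotion.e1.type} with "joy"
            content = content.replace("${" + key + "}", params.getString(key));
        }
        // here the FML-APML string would be published to MeaningMiner over ActiveMQ
        return content;
    }
}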
The original Flipper uses the absolute path of the ClassLoader to read the specified files (e.g., templates, behaviours). This causes an issue when Flipper is used as a library (.jar) inside Greta. Therefore, we slightly modified the original Flipper code to replace the ClassLoader's absolute path with a relative path read through a FileInputStream.
For example,
InputStream libStream = this.getClass().getClassLoader().getResourceAsStream(libPath);
is replaced by
// requires: java.io.FileInputStream, java.io.FileNotFoundException,
// java.io.InputStream, java.util.logging.Level, java.util.logging.Logger
InputStream libStream = null;
try {
    libStream = new FileInputStream(libPath);
} catch (FileNotFoundException ex) {
    Logger.getLogger(TemplateController.class.getName()).log(Level.SEVERE, null, ex);
}
This change affects the following Flipper source files: TemplateController.java and Database.java. The newly generated Flipper2 jar is therefore usable as a library.
The different functionalities, and an example explaining how the Flipper templates work, are described in the following paper.
@inproceedings{flipper2_IVA18_VanEtAl,
    author    = {van Waterschoot, Jelte and Bruijnes, Merijn and Flokstra, Jan and Reidsma, Dennis and Davison, Daniel and Theune, Mari\"{e}t and Heylen, Dirk},
    title     = {Flipper 2.0: A Pragmatic Dialogue Engine for Embodied Conversational Agents},
    year      = {2018},
    publisher = {Association for Computing Machinery},
    series    = {IVA '18},
    pages     = {43--50},
    numpages  = {8},
    doi       = {10.1145/3267851.3267882}
}
Flipper2.0 enables the user to specify XML templates which, based on the Information State (IS) of the current situation, can modify this Information State and execute behaviour. The templates specify what to do and when.
The documentation about flipper templates can be found at: https://github.com/ARIA-VALUSPA/Flipper/wiki/manual
This document first describes the structure of the templates and how to create new ones; it then describes how the system works and the general structure of the system.
The most important task of Flipper is to select the templates to execute. This selection is done each time the templates are evaluated; how often that happens is up to the user. When evaluating the templates, Flipper takes the Information State and checks the preconditions of all stored templates. Based on the results, it takes the following actions:
- Templates without a Behaviour element (which only update the IS) are always executed.
- If there are 0 Templates with a Behaviour:
  - Templates with a Behaviour that fulfil all preconditions except a trigger will be prepared.
  - No behaviour is executed at this moment.
- If there are multiple Behaviours with the highest quality-value, then 1 is chosen randomly.
When integrating this system into your own project, the following steps have to be taken:
- Write your templates in XML, and put those files in the 'templates' directory.
- Write FML-APML templates corresponding to your application's needs, and place them in the FMLTemplates directory.
- In your project:
  - Write BehaviourClasses that perform the behaviours you specified in the Templates.
  - Create a new TemplateController properties (.properties) file and give it the names of the Template files.
  - Initialize a FlipperLauncherThread (from hmi.flipper2.launcher.FlipperLauncherThread) by passing it the path of the property file (.properties), as sketched after this list.
  - Create an Information State.
  - Write the functions (JavaScript) you specified in the Templates. These functions may include some computation logic and the logic to update the information state.
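A minimal sketch of the launcher step, assuming the FlipperLauncherThread constructor takes the loaded Properties object as in the Flipper2 launcher examples (the property file path is illustrative):

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;
import hmi.flipper2.launcher.FlipperLauncherThread;

// Minimal sketch: load the Flipper property file and start the thread.
public class LaunchFlipperSketch {
    public static void main(String[] args) throws IOException {
        Properties ps = new Properties();
        try (FileInputStream in = new FileInputStream("flipperDemo.properties")) {
            ps.load(in);
        }
        FlipperLauncherThread flt = new FlipperLauncherThread(ps);
        flt.start();
    }
}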