Named entity trees #361

Open
wants to merge 2 commits into
base: master


markferry

Rationale

STT engines like wit.ai (site requires a WebKit-based browser) return not only the raw text they transcribe but also a tree (or trees) of named entities identified in that text.
Named-entity trees identify the speaker's intent considerably more reliably than simple word extraction, and using them moves a significant amount of error-prone work out of the modules.

Changes

This change updates WitAiSTT.transcribe() to attach the named entity trees to transcribed text. To make this change transparent and backwards-compatible with existing modules, class TaggedText (a subclass of unicode) is introduced.
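
To illustrate why a string subclass stays transparent to existing modules, here is a minimal sketch of the idea. It is not the code from this PR: the constructor signature and the default empty `tags` list are assumptions, and it subclasses `str` so that it runs on Python 3, whereas the PR subclasses Python 2's `unicode`.

```python
class TaggedText(str):
    """Text that behaves like a plain string but carries optional tags.

    Assumption: the `tags` keyword and its empty-list default are
    illustrative; the PR's real class may differ.
    """

    def __new__(cls, text, tags=None):
        # str is immutable, so extra state is attached in __new__.
        obj = super().__new__(cls, text)
        obj.tags = tags if tags is not None else []
        return obj


transcribed = TaggedText("Lounge lights on", tags=[{"intent": "lights"}])

# Existing modules can keep treating the value as ordinary text.
is_plain_text = isinstance(transcribed, str)
mentions_lights = "lights" in transcribed.lower()

# Tag-aware modules can look for the extra attribute.
has_tags = hasattr(transcribed, "tags") and len(transcribed.tags) > 0
```

One caveat of this approach: string methods such as `lower()` return a plain string, so the tags do not survive transformations and a module has to read `tags` from the original object.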

Example wit.ai response including named-entity data

{
  "msg_id": "5d7647b5-3970-44d5-94a0-d286773c0c98",
  "_text": "Lounge lights on",
  "outcomes": [
    {
      "_text": "Lounge lights on",
      "intent": "lights",
      "entities": {
        "on_off": [
          {
            "value": "on"
          }
        ],
        "room": [
          {
            "value": "lounge",
            "metadata": ""
          }
        ],
        "light_group": [
          {
            "suggested": true,
            "value": "lights",
            "type": "value"
          }
        ]
      },
      "confidence": 0.985
    }
  ]
}

The example response comes from my trained wit.ai Home Automation app.

Usage

def handle(text, mic, profile):
    if hasattr(text, 'tags') and len(text.tags) > 0:
        _handle_entity_tree(text, mic, profile)
    else:
        # the usual handle function
        _handle_text(text, mic, profile)
...
def _handle_entity_tree(tagged_text, mic, profile):
    tree = tagged_text.tags
    # parse named-entity-tree
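
As a rough sketch of what parsing the tree could look like, the snippet below picks the most confident outcome and flattens its entities into a simple dict. It assumes that `tags` holds the `outcomes` list from the wit.ai response shown above; the helper names `best_outcome` and `flatten_entities` and the confidence threshold are hypothetical and not part of this PR.

```python
def best_outcome(outcomes, min_confidence=0.5):
    """Return the highest-confidence outcome, or None if none qualifies."""
    usable = [o for o in outcomes if o.get("confidence", 0.0) >= min_confidence]
    if not usable:
        return None
    return max(usable, key=lambda o: o.get("confidence", 0.0))


def flatten_entities(outcome):
    """Map each entity name to the value of its first candidate."""
    flat = {}
    for name, candidates in outcome.get("entities", {}).items():
        if candidates:
            flat[name] = candidates[0].get("value")
    return flat


# Assumption: tagged_text.tags would look like the "outcomes" list above.
outcomes = [
    {
        "_text": "Lounge lights on",
        "intent": "lights",
        "entities": {
            "on_off": [{"value": "on"}],
            "room": [{"value": "lounge", "metadata": ""}],
            "light_group": [
                {"suggested": True, "value": "lights", "type": "value"}
            ],
        },
        "confidence": 0.985,
    }
]

outcome = best_outcome(outcomes)
command = flatten_entities(outcome) if outcome else {}
```

With the flattened dict a module can dispatch on `outcome["intent"]` and read `command["room"]` directly instead of matching words in the raw text.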

@markferry
Author

You can see an example of the named-entity-tree use in my Home Automation module.

Subclassing unicode is obviously a little sneaky.
The alternatives I tried were:

  • Adding an extra entity_tree parameter to handle() and isValid()
    • This breaks all existing modules.
  • Adding extra handle_entity_tree() and is_valid_entity_tree() functions
    • This breaks modules calling mic functions internally (Joke, HN, etc).

@Holzhaus
Member

Thanks, but I'd rather not integrate a feature that deep into Jasper that's only supported by a single STT engine. That'd lead to some modules only supporting the Wit.ai STT engine and thus forcing users that want to use that module to use Wit.ai - even if they'd prefer one of the other STT implementations. @shbhrsaha @crm416 What's your take on this?

@markferry
Author

Thanks, that's a fair comment.

Perhaps a way forward is to abstract the Text-To-Named-Entities aspect into a separate service. So Jasper would plug together STT, TTNE and TTS services - with TTNE being optional just as TTS is.

I could imagine Jasper wanting to support multiple TTNE services (when and if they exist).

A simplistic local TTNE plugin could provide a mechanism for the existing example modules to map text to commands.

The goal would be to encourage module writers to use this (uniform) method rather than each author rolling yet another hybrid of regexes and decision trees to extract commands.

Perhaps each module could still handle its own Named-Entity recognition by providing a callback function for the TTNE plugin.

The wit.ai plugin could provide both STT and fast TTNE (with some simple caching).
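
To make the proposal a bit more concrete, here is a rough sketch of what a simplistic local TTNE plugin with per-module callbacks might look like. Nothing here exists in Jasper: `LocalTTNE`, `register`, `extract` and `lights_extractor` are all hypothetical names, and the regex is only an example of what a module might supply.

```python
import re


class LocalTTNE:
    """Hypothetical local text-to-named-entities plugin.

    Modules register a callback that maps raw text to a dict of
    entities, or returns None when the text is not meant for them.
    """

    def __init__(self):
        self._extractors = []

    def register(self, extractor):
        """Add a module-supplied callback taking text, returning a dict."""
        self._extractors.append(extractor)

    def extract(self, text):
        """Return the first non-empty entity dict produced by a callback."""
        for extractor in self._extractors:
            entities = extractor(text)
            if entities:
                return entities
        return {}


def lights_extractor(text):
    """Example module callback: recognise '<room> lights on|off'."""
    match = re.match(r"(?P<room>\w+) lights (?P<on_off>on|off)$", text.lower())
    return match.groupdict() if match else None


ttne = LocalTTNE()
ttne.register(lights_extractor)
result = ttne.extract("Lounge lights on")
```

A wit.ai-backed TTNE could expose the same `extract()` shape, so modules would see a uniform entity dict regardless of which service produced it.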

Does this seem like a direction Jasper might want to explore?
