Fix jovo-model-nlpjs to allow for regex and built-in entities for number, ordinal, date, time #71

Open
1 of 4 tasks
rmtuckerphx opened this issue May 3, 2022 · 0 comments
Labels
help wanted Extra attention is needed

I'm submitting a...

  • Bug report
  • Feature request
  • Documentation issue or request
  • Other... Please describe:

Expected Behavior

NLP.js allows entities of various types, including regex, and, depending on the configured settings, can handle built-in types such as number, ordinal, date, and time.

These built-in entity types are defined based on specified packages such as builtin-microsoft:
https://github.com/axa-group/nlp.js/blob/581b945b19a4c0205d85e4b575b6542a4b69372b/packages/builtin-microsoft/src/builtin-microsoft.js#L85-L103

The Jovo Model v4 structure is converted to an NLP.js corpus.json file, but the conversion appears to support only enum entities, as can be seen from the interface, which expects an options property:

export interface NlpjsEntity {
  options: Record<string, string[]>;
}
// Native NLP.js JSON format
export interface NlpjsModelFile {
  name: string;
  locale: string;
  data: NlpjsData[];
  entities?: Record<string, NlpjsEntity>;
}
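
One possible way to widen these types so that a regex entity fits next to an enum entity is a union type. This is only a sketch, not the package's actual code: NlpjsEnumEntity, NlpjsRegexEntity, NlpjsCorpusEntity, and isRegexEntity are hypothetical names, and it assumes a regex entity is written as a plain "/pattern/flags" string in the corpus.

```typescript
// Hypothetical widening of the corpus entity types (all names are illustrative).

// Enum entity: option name -> list of synonyms (the shape supported today).
interface NlpjsEnumEntity {
  options: Record<string, string[]>;
}

// Regex entity: assumed to be a "/pattern/flags" string in the corpus.
type NlpjsRegexEntity = string;

type NlpjsCorpusEntity = NlpjsEnumEntity | NlpjsRegexEntity;

// Type guard that tells the two shapes apart.
function isRegexEntity(entity: NlpjsCorpusEntity): entity is NlpjsRegexEntity {
  return typeof entity === 'string';
}

const heroEntity: NlpjsCorpusEntity = { options: { thor: ['thor'] } };
const emailEntity: NlpjsCorpusEntity = '/\\w+@\\w+\\.\\w{2,3}/gi';
```

With a guard like this, the converter could keep its existing enum handling and add a separate branch for string-valued entities.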

Here is a sample corpus that shows an enum entity (hero) and a regex entity (email):

{
  "entities": {
    "hero": {
      "options": {
        "spiderman": ["spiderman", "spider-man"],
        "ironman": ["ironman", "iron-man"],
        "thor": ["thor"]
      }
    },
    "email": "/\\b(\\w[-._\\w]*\\w@\\w[-._\\w]*\\w\\.\\w{2,3})\\b/gi"
  }
}
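
To sanity-check a regex entity string like the email one above before it goes into a corpus, it can be turned into a RegExp and tried against sample text. The helper below is a hypothetical sketch (parseRegexLiteral is not part of NLP.js or Jovo), and it assumes the "/pattern/flags" layout used in the sample.

```typescript
// Hypothetical helper: turn a "/pattern/flags" string into a RegExp.
function parseRegexLiteral(literal: string): RegExp {
  const lastSlash = literal.lastIndexOf('/');
  // Needs a leading slash and a separate closing slash.
  if (literal.charAt(0) !== '/' || lastSlash <= 0) {
    throw new Error('Not a /pattern/flags string: ' + literal);
  }
  const pattern = literal.slice(1, lastSlash);
  const flags = literal.slice(lastSlash + 1);
  return new RegExp(pattern, flags);
}

// The email pattern from the sample corpus. The doubled backslashes are
// string escaping, exactly as in the JSON file.
const emailRegex = parseRegexLiteral(
  '/\\b(\\w[-._\\w]*\\w@\\w[-._\\w]*\\w\\.\\w{2,3})\\b/gi',
);

// With the "g" flag, match() returns every match found in the text.
const matches = 'Contact jane.doe@example.com please'.match(emailRegex);
```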

Full sample is here

Tasks:

  1. Decide on a format for specifying a non-enum entity in the Jovo Model v4 format.
    Something like:
"entities": {
  "animal": {
    "type": "ANIMAL_SYNONYMS" // enum type
  },
  "size": {
    "type": "ANIMAL_SIZE" // enum type
  },
  "number": {
    "type": {
      "nlpjs": "builtin-microsoft::Number"
    }
  },
  "email": {
    "type": {
      "nlpjs": "regex::/\\b(\\w[-._\\w]*\\w@\\w[-._\\w]*\\w\\.\\w{2,3})\\b/gi"
    }
  }
}
  2. Update jovo-model-nlpjs so that it can handle non-enum entities.
  3. Allow passing the needed configuration values to NLP.js, whether you are using it as the NLU for the Jovo Debugger or in your Jovo app.
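
As a rough illustration of the second task, the proposed "nlpjs" type strings could be split on the first "::" into a prefix and a value. This is a sketch under assumptions: the "prefix::value" syntax is only the suggestion from this issue, and parseNlpjsType and ParsedNlpjsType are hypothetical names. How a built-in type would then be wired into NLP.js is not covered here.

```typescript
// Hypothetical parser for the "nlpjs" type strings suggested in this issue.
type ParsedNlpjsType =
  | { kind: 'regex'; regex: string }
  | { kind: 'builtin'; provider: string; name: string };

function parseNlpjsType(value: string): ParsedNlpjsType {
  // Split on the first "::" only, so a regex body may itself contain "::".
  const separator = value.indexOf('::');
  if (separator === -1) {
    throw new Error('Expected "<prefix>::<value>", got: ' + value);
  }
  const prefix = value.slice(0, separator);
  const rest = value.slice(separator + 2);
  if (prefix === 'regex') {
    return { kind: 'regex', regex: rest };
  }
  // Any other prefix is treated as the name of a built-in provider package.
  return { kind: 'builtin', provider: prefix, name: rest };
}

const numberType = parseNlpjsType('builtin-microsoft::Number');
const emailType = parseNlpjsType(
  'regex::/\\b(\\w[-._\\w]*\\w@\\w[-._\\w]*\\w\\.\\w{2,3})\\b/gi',
);
```

A regex result could be written to the corpus entities as a string, while a builtin result would more likely influence which NLP.js plugins and settings are configured.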

Current Behavior

Only enum entity types can be used with NLP.js when the model is specified in the Jovo Model v4 format.

Error Log

No error.

Your Environment

  • Jovo Framework version used: 4.2.12
  • Operating System: Windows 10 10.0.22000