Feat/knowledge extractor #63

Merged
merged 11 commits into from
Sep 5, 2023

Collaborator

@marmg marmg commented Aug 29, 2023

| Status | Type | ⚠️ Core Change | Issue |
| :---: | :---: | :---: | :---: |
| Ready | Feature | No | |

Summary

This PR introduces a new component, the knowledge extractor.

A knowledge extractor jointly extracts spans and the relations between them, returning the result as triples of the form (subject, relation, object).
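To make the triple format concrete, here is a minimal, library-independent sketch of how one list of (subject, relation, object) triples carries both the spans and the relations. The `Span` and `Triple` classes, the `find_span` and `split_triples` helpers, and the example values are all hypothetical and chosen for illustration; they are not zshot's own classes, nor KnowGL's actual output.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical character-offset span (not zshot's own Span class)."""
    start: int
    end: int
    label: str


@dataclass(frozen=True)
class Triple:
    """Hypothetical (subject, relation, object) triple."""
    subject: Span
    relation: str
    object_: Span


def find_span(text: str, surface: str, label: str) -> Span:
    """Locate the first occurrence of `surface` in `text` and wrap it in a Span."""
    start = text.index(surface)
    return Span(start, start + len(surface), label)


def split_triples(triples: List[Triple]) -> Tuple[List[Span], List[Tuple[Span, str, Span]]]:
    """Derive the span list and the relation list from one list of triples."""
    spans: List[Span] = []
    for triple in triples:
        for span in (triple.subject, triple.object_):
            if span not in spans:  # keep first-seen order, drop repeats
                spans.append(span)
    relations = [(t.subject, t.relation, t.object_) for t in triples]
    return spans, relations


text = "CH2O2 is a chemical compound similar to Acetamide."
# Made-up labels and relation, for illustration only.
subject = find_span(text, "CH2O2", "chemical compound")
obj = find_span(text, "Acetamide", "chemical compound")
triples = [Triple(subject, "similar to", obj)]

spans, relations = split_triples(triples)
print([text[s.start:s.end] for s in spans])
```

Because every triple already contains its subject and object spans, no separate mention-detection step is needed in this sketch; the span list is simply the deduplicated set of triple endpoints.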

The first knowledge extractor implemented is KnowGL.

Usage:

import spacy

from zshot import PipelineConfig, displacy
from zshot.knowledge_extractor import KnowGL
from zshot.utils.mappings import spans_to_wikipedia

# Build a blank English pipeline and add zshot with KnowGL as the knowledge extractor.
nlp = spacy.blank("en")
nlp_config = PipelineConfig(
    knowledge_extractor=KnowGL()
)
nlp.add_pipe("zshot", config=nlp_config, last=True)

# Example inputs.
texts = [
    "The Italian Space Agency’s Light Italian CubeSat for Imaging of Asteroids, or LICIACube, will fly by Dimorphos to capture images and video of the impact plume as it sprays up off the asteroid and maybe even spy the crater it could leave behind.",
    "CH2O2 is a chemical compound similar to Acetamide used in International Business Machines Corporation (IBM).",
]

# Run the pipeline on each text, render the extracted relations,
# and print the spans mapped to Wikipedia.
for text in texts:
    doc = nlp(text)
    displacy.render(doc, style='rel')
    print(spans_to_wikipedia(doc._.spans))

@marmg marmg self-assigned this Aug 29, 2023
marmg added 5 commits August 29, 2023 15:01
Signed-off-by: Marcos Martinez <[email protected]>
Signed-off-by: Marcos Martinez <[email protected]>
* ✅🐛 Improve tests performance. Fix minor bugs related

Signed-off-by: Marcos Martinez <[email protected]>

* ✅ Added download models

Signed-off-by: Marcos Martinez <[email protected]>

* 🎨 Fixed flake

Signed-off-by: Marcos Martinez <[email protected]>

* 🎨 Fixed flake

Signed-off-by: Marcos Martinez <[email protected]>

* Added try/except on load models

Signed-off-by: Marcos Martinez <[email protected]>

* Added try/except on load models

Signed-off-by: Marcos Martinez <[email protected]>

* ✅ Added xfail to tars tests

Signed-off-by: Marcos Martinez <[email protected]>

* ✅ Updated pydantic requirements

Signed-off-by: Marcos Martinez <[email protected]>

* ✅ Added fewrel tests

Signed-off-by: Marcos Martinez <[email protected]>

* ✅ Updated tests

Signed-off-by: Marcos Martinez <[email protected]>

* Revert "✅ Updated tests"

This reverts commit eae7c8c.

Signed-off-by: Marcos Martinez <[email protected]>

---------

Signed-off-by: Marcos Martinez <[email protected]>
@marmg marmg merged commit 89f0e08 into main Sep 5, 2023
3 checks passed
@marmg marmg deleted the feat/knowledge_extractor branch September 5, 2023 10:19