first draft of biotic interaction template #107

Merged
merged 9 commits into from
May 22, 2023

diatomsRcool
Member

New template to capture biotic interactions from text - ultimately want to add to GloBI. May need to have a discussion about the annotator.

@diatomsRcool diatomsRcool requested a review from caufieldjh May 18, 2023 15:05
Member

@cmungall cmungall left a comment

can you also add an example text?

It should be in tests/input/cases/

and the name should be biotic_interactions-xxx.txt

@caufieldjh
Member

Using the annotators that I've added here is probably excessive - ncbitaxon will capture everything and then some, while BERO is likely too broad for the kinds of things considered "biotic interactions" - so let's consider this the absolute upper bound of annotation and get more specific.

@caufieldjh
Member

I've been using this abstract for testing: https://pubmed.ncbi.nlm.nih.gov/26639575/
@diatomsRcool is that kind of thing in scope?

@diatomsRcool
Member Author

The only issue with NCBI is that they have identifiers only for about 25% of the known taxa. It might be ok to start with tho.

@diatomsRcool
Member Author

and yes, that PubMed abstract would be a good example of the kind of thing that is in scope

@caufieldjh
Member

OK, great - running

ontogpt pubmed-extract -t biotic_interaction.BioticInteraction 26639575

yields

WARNING:root:Could not find any mappings for NCIT:C121660                                                                           
WARNING:root:Could not find any mappings for NCIT:C121660
input_text: 'Title: Bacteria may contribute to distant species recognition in ant-aphid
  mutualistic relationships.

 ...

raw_completion_output: "label: mutualistic relationship \nsource_taxon: ant and aphid\
  \ species \ntarget_taxon: ant and aphid species \ninteraction_type: mutualistic\
  \ interaction \n\nlabel: attraction \nsource_taxon: bacterial honeydew \ntarget_taxon:\
  \ ant \ninteraction_type: attraction \n\nlabel: discrimination \nsource_taxon: aphid\
  \ species \ntarget_taxon: ant \ninteraction_type: semiochemical-based discrimination\
  \ \n\nlabel: preference \nsource_taxon: Aphis fabae \ntarget_taxon: ant \ninteraction_type:\
  \ preference \n\nlabel: microbial influence \nsource_taxon: bacteria \ntarget_taxon:\
  \ ant and aphid species \ninteraction_type: influence"

...

extracted_object:
  label: microbial influence
  source_taxon: NCBITaxon:1869227
  target_taxon: ant and aphid species
  interaction_type: influence
named_entities:
- id: NCBITaxon:1869227
  label: bacteria
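
To compare the raw completion with the single extracted object above, it can help to split the completion into one record per interaction. This is a minimal sketch, not part of ontogpt: `parse_completion` is a hypothetical helper, and the input is two of the five blocks from the output above, assuming blocks are separated by blank lines and each line has a `key: value` shape.

```python
import re


def parse_completion(raw: str) -> list[dict[str, str]]:
    """Split a raw completion into one dict per blank-line-separated block."""
    records = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        record = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # ignore lines that are not "key: value"
                record[key.strip()] = value.strip()
        if record:
            records.append(record)
    return records


# Two blocks copied from the raw_completion_output in this thread.
raw = (
    "label: mutualistic relationship \nsource_taxon: ant and aphid species \n"
    "target_taxon: ant and aphid species \n"
    "interaction_type: mutualistic interaction \n\n"
    "label: microbial influence \nsource_taxon: bacteria \n"
    "target_taxon: ant and aphid species \ninteraction_type: influence"
)

records = parse_completion(raw)
for r in records:
    print(r["source_taxon"], "->", r["target_taxon"], f"({r['interaction_type']})")
```

Listing the records this way makes it easy to see which interactions the model proposed before any grounding was attempted.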

@diatomsRcool
Member Author

diatomsRcool commented May 19, 2023

It seems to completely miss the ant - aphid interaction. Could that be a result of NCBI being so heavily skewed toward microbes?

…ogpt into thessen-1

merging changes from Chris and Harry
@diatomsRcool
Member Author

I added some example text as Chris requested. It's all about sharks. Should be some predator/prey and parasite interactions

Member

@cmungall cmungall left a comment

The test cases should be labeled biotic_interaction- (not pluralized) but we can deal later
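
The naming rule described here (template name, a hyphen, then a case label) can be checked mechanically. A small sketch, assuming the cases live in tests/input/cases/ as noted earlier; `matches_template` and the `sharks` file names are hypothetical and used only for illustration.

```python
from pathlib import PurePosixPath


def matches_template(filename: str, template: str = "biotic_interaction") -> bool:
    """Return True if the file name is '<template>-<anything>.txt'."""
    name = PurePosixPath(filename).name
    return name.startswith(template + "-") and name.endswith(".txt")


# Hypothetical file names: singular prefix passes, pluralized prefix does not.
print(matches_template("tests/input/cases/biotic_interaction-sharks.txt"))   # True
print(matches_template("tests/input/cases/biotic_interactions-sharks.txt"))  # False
```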

@caufieldjh
Member

> It seems to completely miss the ant - aphid interaction. Could that be a result of NCBI being so heavily skewed toward microbes?

I suspect ncbitaxon isn't ideal for this, yeah - maybe the taxon slim is more balanced

@caufieldjh
Member

It catches some ant vs. aphid interactions, just not in a way that maps to anything:

raw_completion_output:
	label: discrimination
	source_taxon: aphid species
	target_taxon: ant
	interaction_type: semiochemical-based discrimination

	label: preference
	source_taxon: Aphis fabae
	target_taxon: ant
	interaction_type: preference
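
One way to see which values failed to map is to flag every slot whose value is still free text rather than an identifier. This is a rough sketch under one assumption: a grounded value looks like a CURIE such as `NCBITaxon:1869227`. The `ungrounded` helper is hypothetical and is not how ontogpt reports grounding.

```python
import re

# Assumption: grounded values are CURIEs (prefix, colon, local ID, no spaces).
CURIE = re.compile(r"[A-Za-z][\w.]*:\S+")

SLOTS = ("source_taxon", "target_taxon", "interaction_type")


def ungrounded(record: dict[str, str]) -> dict[str, str]:
    """Return the slots whose values are plain strings, not CURIEs."""
    return {s: record[s] for s in SLOTS if s in record and not CURIE.fullmatch(record[s])}


# The two ant/aphid records quoted above, plus the extracted object from the first run.
records = [
    {"label": "discrimination", "source_taxon": "aphid species",
     "target_taxon": "ant", "interaction_type": "semiochemical-based discrimination"},
    {"label": "preference", "source_taxon": "Aphis fabae",
     "target_taxon": "ant", "interaction_type": "preference"},
    {"label": "microbial influence", "source_taxon": "NCBITaxon:1869227",
     "target_taxon": "ant and aphid species", "interaction_type": "influence"},
]

for r in records:
    print(r["label"], "->", sorted(ungrounded(r)))
```

In this sample, only the `source_taxon` of the last record carries an identifier; every ant and aphid mention remains a plain string.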

@caufieldjh caufieldjh self-requested a review May 22, 2023 15:37
@caufieldjh
Member

caufieldjh commented May 22, 2023

ah, found an issue on translating to pydantic:

$ make src/ontogpt/templates/biotic_interaction.py
poetry run gen-pydantic src/ontogpt/templates/biotic_interaction.yaml > src/ontogpt/templates/biotic_interaction.py.tmp && mv src/ontogpt/templates/biotic_interaction.py.tmp src/ontogpt/templates/biotic_interaction.py
ValueError: File "biotic_interaction.yaml", line 42, col 15: A semi-colon separated list of taxon to taxon relationships for example: Carcharodon carcharias eats elephant seal; Pandarus sinuatus parasitizes Carcharodon carcharias; orca eats Carcharodon carcharias: Not a valid NCName
make: *** [Makefile:31: src/ontogpt/templates/biotic_interaction.py] Error 1

(will add fix in new pr)
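
The error says the whole prompt sentence was treated as a name and rejected as an NCName. An NCName is an XML name without a colon, so free text with spaces, semicolons, or colons cannot qualify. The check below is a simplified, ASCII-only approximation for illustration (the real definition also admits many non-ASCII characters); `looks_like_ncname` is a hypothetical helper, not the validator gen-pydantic uses.

```python
import re

# Simplified ASCII-only approximation of an XML NCName:
# starts with a letter or underscore, then letters, digits, '_', '.', or '-'.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def looks_like_ncname(text: str) -> bool:
    """Return True if the whole string fits the simplified NCName pattern."""
    return NCNAME_ASCII.fullmatch(text) is not None


print(looks_like_ncname("prompt"))          # True
print(looks_like_ncname("prompt.example"))  # True
# Opening words of the text quoted in the ValueError above.
print(looks_like_ncname("A semi-colon separated list of taxon to taxon relationships"))  # False
```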

@cmungall cmungall merged commit 213a631 into main May 22, 2023
@diatomsRcool
Member Author

So what was the problem?

@caufieldjh
Member

The prompt example needed to be under the annotations heading. I also split it into prompt for description text and prompt.example for the example text.

See #113

3 participants