Possible float value in EFO #49

masarunakajima · 2024-05-22T21:55:27Z

Running
df1 = text2term.map_terms("test/unstruct_terms.txt", "http://www.ebi.ac.uk/efo/efo.owl")
gave me an error similar to the following:

.....
  File "/MY_PATH_TO/text2term/tfidf_mapper.py", line 50, in _tokenize
    vocabulary = count_vectorizer.fit(source_terms + target_labels).vocabulary_
....
  File "/MY_PATH_TO/sklearn/feature_extraction/text.py", line 69, in _preprocess
    doc = doc.lower()
          ^^^^^^^^^
AttributeError: 'float' object has no attribute 'lower'

The problem seems to be that count_vectorizer.fit receives a list that includes a float, so I guess there is a float value somewhere in EFO.
Changing the call to use a different ontology,
df1 = text2term.map_terms("test/unstruct_terms.txt", "http://purl.obolibrary.org/obo/cl.owl")
worked.
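
To narrow down this kind of failure, it can help to look for non-string entries in a list before it reaches the vectorizer. The following is a minimal sketch in plain Python, not text2term or scikit-learn code: `find_non_strings` is a hypothetical helper and the labels are made up, with 92.1 standing in for the offending float. The list comprehension only mimics the `doc.lower()` call shown in the traceback.

```python
def find_non_strings(labels):
    """Return (index, value) pairs for every entry that is not a str."""
    return [(i, v) for i, v in enumerate(labels) if not isinstance(v, str)]


# Made-up labels for illustration; 92.1 stands in for the float value.
labels = ["asthma", "body mass index", 92.1]

offenders = find_non_strings(labels)
print(offenders)  # [(2, 92.1)]

# Mimic the lowercasing step from the traceback to reproduce the error.
try:
    [doc.lower() for doc in labels]
    message = None
except AttributeError as err:
    message = str(err)

print(message)  # 'float' object has no attribute 'lower'
```

Running a check like this on the labels and synonyms loaded from an ontology would show which entries are not strings, assuming they are available as a plain list.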


tomarashish · 2024-05-31T08:47:53Z

Changing line 69 of /MY_PATH_TO/sklearn/feature_extraction/text.py to doc = str(doc).lower() worked.
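
Editing an installed copy of scikit-learn is lost on the next upgrade, so another option is to coerce values to strings before they are vectorized. The sketch below is a plain-Python illustration under assumptions, not the actual text2term fix: `coerce_to_strings` is a hypothetical helper, the term lists are made up, and dropping `None` entries is a design choice of this example.

```python
def coerce_to_strings(values):
    """Return a new list with every non-None entry converted to str."""
    return [str(v) for v in values if v is not None]


# Made-up inputs; the names mirror the variables in the traceback.
source_terms = ["asthma", "heart attack"]
target_labels = ["asthma", 92.1, None, "myocardial infarction"]

cleaned = coerce_to_strings(source_terms + target_labels)

# Every entry now supports .lower(), which is what the vectorizer calls.
lowered = [doc.lower() for doc in cleaned]
print(lowered)
# ['asthma', 'heart attack', 'asthma', '92.1', 'myocardial infarction']
```

The cleaned list could then be passed to a vectorizer in place of the raw concatenation, which has the same effect as the str(doc) patch without touching library code.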

rsgoncalves · 2024-06-03T20:38:22Z

Thank you for reporting.

There is indeed a bug in the t2t input handler, which throws this error when an ontology term label or synonym is not a string. In this case, a term in EFO has the synonym 92.1, which is what causes the issue.

A bug fix will be released, either today or tomorrow, to address this issue.

closes #49

rsgoncalves · 2024-06-05T17:08:09Z

This issue should no longer occur in the latest release, v4.1.4.

rsgoncalves added the bug label Jun 3, 2024

rsgoncalves added a commit that referenced this issue Jun 3, 2024

Parameterize ngram length. Ensure inputs are strings

d2f7efc

closes #49

rsgoncalves mentioned this issue Jun 3, 2024

Minor improvements and bug fixes #50

Merged

rsgoncalves closed this as completed in #50 Jun 4, 2024
