-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Python type hints #708
Conversation
Types used in tests do not cover all cases. This also avoids many Union[] sets.
These would allow all types
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #708 +/- ##
=======================================
Coverage 99.66% 99.67%
=======================================
Files 89 89
Lines 6295 6380 +85
=======================================
+ Hits 6274 6359 +85
Misses 21 21
☔ View full report in Codecov by Sentry. |
1ca7f9b
to
7c3c5dc
Compare
a193ade
to
2711ba4
Compare
12d710b
to
ab9e815
Compare
ab9e815
to
bb9951f
Compare
I annotated manually one module (annif.backend.hyperopt.py) that monkeytype was not able to process. These annif package modules are still lacking type hints altogether:
Not sure if they all need type hints. Currently running Errorsannif/lexical/tokenset.py:44: error: Need type annotation for "_index" [var-annotated]
annif/lexical/tokenset.py:58: error: Need type annotation for "subj_tsets" (hint: "subj_tsets: Dict[<type>, <type>] = ...") [var-annotated]
annif/exception.py:47: error: Incompatible types in assignment (expression has type "str", base class "AnnifException" defined the type as "None") [assignment]
annif/exception.py:53: error: Incompatible types in assignment (expression has type "str", base class "AnnifException" defined the type as "None") [assignment]
annif/exception.py:60: error: Incompatible types in assignment (expression has type "str", base class "AnnifException" defined the type as "None") [assignment]
annif/exception.py:66: error: Incompatible types in assignment (expression has type "str", base class "AnnifException" defined the type as "None") [assignment]
annif/suggestion.py:157: error: Argument 1 to "SuggestionResults" has incompatible type "Generator[SuggestionBatch, None, None]"; expected "List[SuggestionBatch]" [arg-type]
annif/suggestion.py:161: error: Argument 1 to "from_iterable" of "chain" has incompatible type "List[SuggestionBatch]"; expected "Iterable[Iterable[<nothing>]]" [arg-type]
annif/suggestion.py:161: error: Argument 1 to "from_iterable" of "chain" has incompatible type "List[SuggestionBatch]"; expected "Iterable[Iterable[Any]]" [arg-type]
annif/lexical/util.py:24: error: "Node" has no attribute "language" [attr-defined]
annif/eval.py:85: error: Need type annotation for "_suggestion_arrays" (hint: "_suggestion_arrays: List[<type>] = ...") [var-annotated]
annif/eval.py:86: error: Need type annotation for "_gold_subject_arrays" (hint: "_gold_subject_arrays: List[<type>] = ...") [var-annotated]
annif/eval.py:97: error: Argument 1 to "from_sequence" of "SuggestionBatch" has incompatible type "Union[List[List[SubjectSuggestion]], List[Iterator[Any]]]"; expected "List[List[SubjectSuggestion]]" [arg-type]
annif/eval.py:106: error: "SubjectSet" has no attribute "__iter__" (not iterable) [attr-defined]
annif/eval.py:178: error: Incompatible types in assignment (expression has type "dict_keys[str, Callable[[], Any]]", variable has type "Sequence[str]") [assignment]
annif/eval.py:234: error: "SubjectIndex" has no attribute "__iter__" (not iterable) [attr-defined]
annif/eval.py:235: error: "SubjectIndex" has no attribute "__iter__" (not iterable) [attr-defined]
annif/eval.py:268: error: Argument 4 to "output_result_per_subject" of "EvaluationBatch" has incompatible type "Optional[str]"; expected "str" [arg-type]
annif/backend/backend.py:52: error: Name "datetime.datetime" is not defined [name-defined]
annif/backend/backend.py:113: error: Argument 1 to "int" has incompatible type "Optional[Any]"; expected "Union[str, bytes, bytearray, memoryview, array[Any], mmap, _CData, PickleBuffer, SupportsInt, SupportsIndex, SupportsTrunc]" [arg-type]
annif/lexical/mllm.py:39: error: List or tuple literal expected as the second argument to "namedtuple()" [misc]
annif/lexical/mllm.py:45: error: List or tuple literal expected as the second argument to "namedtuple()" [misc]
annif/lexical/mllm.py:49: error: Second argument of IntEnum() must be string, tuple, list or dict literal for mypy to determine Enum members [misc]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "doc_length" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "subject_id" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "freq" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "is_pref" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "n_tokens" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "ambiguity" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "first_occ" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "last_occ" for "Candidate" [call-arg]
annif/lexical/mllm.py:63: error: Unexpected keyword argument "spread" for "Candidate" [call-arg]
annif/lexical/mllm.py:110: error: "Candidate" has no attribute "subject_id" [attr-defined]
annif/lexical/mllm.py:111: error: Value of type variable "_SCT" of "zeros" cannot be "bool" [type-var]
annif/lexical/mllm.py:111: error: "ModelData" has no attribute "related" [attr-defined]
annif/lexical/mllm.py:113: error: "ModelData" has no attribute "broader" [attr-defined]
annif/lexical/mllm.py:114: error: "ModelData" has no attribute "narrower" [attr-defined]
annif/lexical/mllm.py:115: error: "ModelData" has no attribute "related" [attr-defined]
annif/lexical/mllm.py:116: error: "ModelData" has no attribute "collection" [attr-defined]
annif/lexical/mllm.py:118: error: "Candidate" has no attribute "subject_id" [attr-defined]
annif/lexical/mllm.py:119: error: "Type[Feature]" has no attribute "freq" [attr-defined]
annif/lexical/mllm.py:119: error: "Candidate" has no attribute "freq" [attr-defined]
annif/lexical/mllm.py:120: error: "Type[Feature]" has no attribute "doc_freq" [attr-defined]
annif/lexical/mllm.py:120: error: "ModelData" has no attribute "doc_freq" [attr-defined]
annif/lexical/mllm.py:121: error: "Type[Feature]" has no attribute "subj_freq" [attr-defined]
annif/lexical/mllm.py:121: error: "ModelData" has no attribute "subj_freq" [attr-defined]
annif/lexical/mllm.py:122: error: "Type[Feature]" has no attribute "tfidf" [attr-defined]
annif/lexical/mllm.py:122: error: "Candidate" has no attribute "freq" [attr-defined]
annif/lexical/mllm.py:122: error: "ModelData" has no attribute "idf" [attr-defined]
annif/lexical/mllm.py:123: error: "Type[Feature]" has no attribute "is_pref" [attr-defined]
annif/lexical/mllm.py:123: error: "Candidate" has no attribute "is_pref" [attr-defined]
annif/lexical/mllm.py:124: error: "Type[Feature]" has no attribute "n_tokens" [attr-defined]
annif/lexical/mllm.py:124: error: "Candidate" has no attribute "n_tokens" [attr-defined]
annif/lexical/mllm.py:125: error: "Type[Feature]" has no attribute "ambiguity" [attr-defined]
annif/lexical/mllm.py:125: error: "Candidate" has no attribute "ambiguity" [attr-defined]
annif/lexical/mllm.py:126: error: "Type[Feature]" has no attribute "first_occ" [attr-defined]
annif/lexical/mllm.py:126: error: "Candidate" has no attribute "first_occ" [attr-defined]
annif/lexical/mllm.py:127: error: "Type[Feature]" has no attribute "last_occ" [attr-defined]
annif/lexical/mllm.py:127: error: "Candidate" has no attribute "last_occ" [attr-defined]
annif/lexical/mllm.py:128: error: "Type[Feature]" has no attribute "spread" [attr-defined]
annif/lexical/mllm.py:128: error: "Candidate" has no attribute "spread" [attr-defined]
annif/lexical/mllm.py:129: error: "Type[Feature]" has no attribute "doc_length" [attr-defined]
annif/lexical/mllm.py:129: error: "Candidate" has no attribute "doc_length" [attr-defined]
annif/lexical/mllm.py:130: error: "Type[Feature]" has no attribute "broader" [attr-defined]
annif/lexical/mllm.py:131: error: "Type[Feature]" has no attribute "narrower" [attr-defined]
annif/lexical/mllm.py:132: error: "Type[Feature]" has no attribute "related" [attr-defined]
annif/lexical/mllm.py:133: error: "Type[Feature]" has no attribute "collection" [attr-defined]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "broader" for "ModelData" [call-arg]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "narrower" for "ModelData" [call-arg]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "related" for "ModelData" [call-arg]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "collection" for "ModelData" [call-arg]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "doc_freq" for "ModelData" [call-arg]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "subj_freq" for "ModelData" [call-arg]
annif/lexical/mllm.py:158: error: Unexpected keyword argument "idf" for "ModelData" [call-arg]
annif/lexical/mllm.py:245: error: Need type annotation for "_doc_freq" [var-annotated]
annif/lexical/mllm.py:247: error: Need type annotation for "_subj_freq" [var-annotated]
annif/lexical/mllm.py:349: error: "Candidate" has no attribute "subject_id" [attr-defined]
annif/corpus/skos.py:93: error: Invalid index type "str" for "Union[defaultdict[str, List[str]], defaultdict[None, List[str]]]"; expected type "None" [index]
annif/corpus/skos.py:94: error: Invalid index type "str" for "Union[defaultdict[str, List[str]], defaultdict[None, List[str]]]"; expected type "None" [index]
annif/corpus/skos.py:95: error: Invalid index type "None" for "Union[defaultdict[str, List[str]], defaultdict[None, List[str]]]"; expected type "str" [index]
annif/corpus/skos.py:96: error: Invalid index type "None" for "Union[defaultdict[str, List[str]], defaultdict[None, List[str]]]"; expected type "str" [index]
annif/backend/yake.py:31: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/yake.py:40: error: Dict entry 1 has incompatible type "str": "float"; expected "str": "int" [dict-item]
annif/backend/yake.py:41: error: Dict entry 2 has incompatible type "str": "str"; expected "str": "int" [dict-item]
annif/backend/yake.py:44: error: Dict entry 5 has incompatible type "str": "None"; expected "str": "int" [dict-item]
annif/backend/yake.py:45: error: Dict entry 6 has incompatible type "str": "List[str]"; expected "str": "int" [dict-item]
annif/backend/yake.py:170: error: Item "None" of "Optional[Any]" has no attribute "get" [union-attr]
annif/backend/yake.py:173: error: Argument 2 to "max" has incompatible type "int"; expected "floating[_64Bit]" [arg-type]
annif/backend/yake.py:177: error: "list" expects 1 type argument, but 2 given [type-arg]
annif/backend/yake.py:178: error: "list" expects 1 type argument, but 2 given [type-arg]
annif/backend/stwfsa.py:32: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/stwfsa.py:50: error: Dict entry 0 has incompatible type "str": "str"; expected "str": "int" [dict-item]
annif/backend/stwfsa.py:51: error: Dict entry 1 has incompatible type "str": "str"; expected "str": "int" [dict-item]
annif/backend/stwfsa.py:52: error: Dict entry 2 has incompatible type "str": "str"; expected "str": "int" [dict-item]
annif/backend/stwfsa.py:110: error: "object" not callable [operator]
annif/backend/stwfsa.py:130: error: Item "None" of "Optional[Any]" has no attribute "suggest_proba" [union-attr]
annif/backend/hyperopt.py:45: error: "Trial" has no attribute "value" [attr-defined]
annif/backend/hyperopt.py:107: error: Argument "callbacks" to "optimize" of "Study" has incompatible type "List[Callable[[Study, Trial], None]]"; expected "Optional[List[Callable[[Study, FrozenTrial], None]]]" [arg-type]
annif/backend/http.py:8: error: Library stubs not installed for "dateutil.parser" [import]
annif/backend/http.py:8: note: Hint: "python3 -m pip install types-python-dateutil"
annif/backend/http.py:8: note: (or run "mypy --install-types" to install all missing stub packages)
annif/backend/http.py:8: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
annif/backend/http.py:8: error: Library stubs not installed for "dateutil" [import]
annif/backend/http.py:9: error: Library stubs not installed for "requests" [import]
annif/backend/http.py:10: error: Library stubs not installed for "requests.exceptions" [import]
annif/backend/http.py:10: note: Hint: "python3 -m pip install types-requests"
annif/backend/http.py:22: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/http.py:36: error: Incompatible return value type (got "Union[bool, str, None]", expected "bool") [return-value]
annif/backend/dummy.py:15: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/corpus/subject.py:44: error: The return type of a generator function should be "Generator" or one of its supertypes [misc]
annif/corpus/subject.py:52: error: Argument 1 to "serialize_subjects_to_skos" has incompatible type "None"; expected "Iterator[Any]" [arg-type]
annif/corpus/subject.py:72: error: Incompatible types in assignment (expression has type "None", variable has type "Dict[str, Optional[str]]") [assignment]
annif/corpus/subject.py:89: error: Item "None" of "Optional[List[str]]" has no attribute "__iter__" (not iterable) [union-attr]
annif/corpus/subject.py:94: error: The return type of a generator function should be "Generator" or one of its supertypes [misc]
annif/corpus/subject.py:103: error: Argument 1 to "serialize_subjects_to_skos" has incompatible type "None"; expected "Iterator[Any]" [arg-type]
annif/corpus/subject.py:117: error: Need type annotation for "_subjects" (hint: "_subjects: List[<type>] = ...") [var-annotated]
annif/corpus/subject.py:118: error: Need type annotation for "_uri_idx" (hint: "_uri_idx: Dict[<type>, <type>] = ...") [var-annotated]
annif/corpus/subject.py:119: error: Need type annotation for "_label_idx" (hint: "_label_idx: Dict[<type>, <type>] = ...") [var-annotated]
annif/corpus/subject.py:134: error: Incompatible return value type (got "None", expected "List[str]") [return-value]
annif/corpus/subject.py:141: error: Incompatible types in assignment (expression has type "List[Any]", variable has type "None") [assignment]
annif/corpus/subject.py:195: error: "None" has no attribute "__iter__" (not iterable) [attr-defined]
annif/corpus/subject.py:200: error: "SubjectIndex" has no attribute "__iter__" (not iterable) [attr-defined]
annif/corpus/subject.py:241: error: Argument 1 of "__eq__" is incompatible with supertype "object"; supertype defines the argument type as "object" [override]
annif/corpus/subject.py:241: note: This violates the Liskov substitution principle
annif/corpus/subject.py:241: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
annif/corpus/subject.py:241: note: It is recommended for "__eq__" to work with arbitrary objects, for example:
annif/corpus/subject.py:241: note: def __eq__(self, other: object) -> bool:
annif/corpus/subject.py:241: note: if not isinstance(other, SubjectSet):
annif/corpus/subject.py:241: note: return NotImplemented
annif/corpus/subject.py:241: note: return <logic to compare two SubjectSet instances>
annif/backend/tfidf.py:32: error: Need type annotation for "_buffer" (hint: "_buffer: List[<type>] = ...") [var-annotated]
annif/backend/tfidf.py:65: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/tfidf.py:108: error: "None" has no attribute "vocabulary_" [attr-defined]
annif/backend/tfidf.py:134: error: "None" has no attribute "transform" [attr-defined]
annif/backend/tfidf.py:135: error: Value of type "Optional[Any]" is not indexable [index]
annif/backend/svc.py:27: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/svc.py:105: error: Item "None" of "Optional[Any]" has no attribute "classes_" [union-attr]
annif/backend/svc.py:115: error: "None" has no attribute "transform" [attr-defined]
annif/backend/svc.py:116: error: Item "None" of "Optional[Any]" has no attribute "decision_function" [union-attr]
annif/backend/omikuji.py:29: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/omikuji.py:80: error: "None" has no attribute "vocabulary_" [attr-defined]
annif/backend/omikuji.py:140: error: "None" has no attribute "transform" [attr-defined]
annif/backend/omikuji.py:143: error: Need type annotation for "batch_results" (hint: "batch_results: List[<type>] = ...") [var-annotated]
annif/backend/omikuji.py:150: error: Item "None" of "Optional[Any]" has no attribute "predict" [union-attr]
annif/backend/mllm.py:32: error: "AnnifBackend" has no attribute "_load_train_data" [attr-defined]
annif/backend/mllm.py:38: error: "AnnifBackend" has no attribute "_generate_candidates" [attr-defined]
annif/backend/mllm.py:42: error: Name "np.float" is not defined [name-defined]
annif/backend/mllm.py:49: error: "AnnifBackend" has no attribute "_model" [attr-defined]
annif/backend/mllm.py:55: error: "AnnifBackend" has no attribute "_model" [attr-defined]
annif/backend/mllm.py:57: error: "AnnifBackend" has no attribute "_model" [attr-defined]
annif/backend/mllm.py:60: error: "AnnifBackend" has no attribute "_prediction_to_result" [attr-defined]
annif/backend/mllm.py:78: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/mllm.py:89: error: Dict entry 2 has incompatible type "str": "float"; expected "str": "int" [dict-item]
annif/backend/mllm.py:93: error: Signature of "get_hp_optimizer" incompatible with supertype "AnnifHyperoptBackend" [override]
annif/backend/mllm.py:93: note: Superclass:
annif/backend/mllm.py:93: note: def get_hp_optimizer(self, corpus: DocumentCorpus) -> Any
annif/backend/mllm.py:93: note: Subclass:
annif/backend/mllm.py:93: note: def get_hp_optimizer(self, corpus: DocumentCorpus, metric: str) -> MLLMOptimizer
annif/backend/mllm.py:156: error: Item "None" of "Optional[MLLMModel]" has no attribute "generate_candidates" [union-attr]
annif/backend/mllm.py:170: error: Item "None" of "Optional[MLLMModel]" has no attribute "predict" [union-attr]
annif/backend/fasttext.py:26: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/fasttext.py:48: error: Dict entry 1 has incompatible type "str": "float"; expected "str": "int" [dict-item]
annif/backend/fasttext.py:50: error: Dict entry 3 has incompatible type "str": "str"; expected "str": "int" [dict-item]
annif/backend/fasttext.py:156: error: Item "None" of "Optional[Any]" has no attribute "predict" [union-attr]
annif/backend/fasttext.py:170: error: Need type annotation for "label_scores" [var-annotated]
annif/backend/ensemble.py:135: error: Incompatible return value type (got "Union[floating[_64Bit], float]", expected "float") [return-value]
annif/backend/ensemble.py:145: error: Incompatible types in assignment (expression has type "str", base class "AnnifBackend" defined the type as "None") [assignment]
annif/backend/ensemble.py:155: error: No return value expected [return-value]
annif/backend/ensemble.py:157: error: Signature of "get_hp_optimizer" incompatible with supertype "AnnifHyperoptBackend" [override]
annif/backend/ensemble.py:157: note: Superclass:
annif/backend/ensemble.py:157: note: def get_hp_optimizer(self, corpus: DocumentCorpus) -> Any
annif/backend/ensemble.py:157: note: Subclass:
annif/backend/ensemble.py:157: note: (Skipping most remaining errors due to unresolved imports or missing stubs; fix these first)
Found 154 errors in 20 files (checked 55 source files)
|
Now I noticed that the hints of the standard collection types do not need to be from the ones from the Also, it could be considered to try using type hints only with Python 3.10. That would allow simpler syntax for Union[] and dropping using Optional[] (PEP604). |
Sounds promising, but would the code still run with Python 3.8 and 3.9 if it included type hints using the new syntax? |
Changing |
I think we should use the simplest, most elegant, most modern approach that works with Python 3.8 (the annotations don't have to work, but the code needs to run). So starting directly with PEP 585 and PEP 604 would make sense if that can be done. |
Using the And it seems mypy also supports the PEP 585 and 604 features/syntax also on Python 3.8 from mypy version 0.800 onwards (when using the mentioned import). |
- Use standard collection types instead of types from Typing (PEP 585) - Write union types as X | Y (PEP 604) - Write optional values as X | None (PEP 604)
return self.vocab.subjects | ||
|
||
def _get_info(self, key): | ||
def _get_info(self, key: str) -> bool | datetime | None: |
Check notice
Code scanning / CodeQL
Explicit returns mixed with implicit (fall through) returns
def __init__( | ||
self, | ||
backend_id: str, | ||
config_params: dict[str, Any] | SectionProxy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This union could be replaced by a MutableMapping, but the current dict[str, Any] | SectionProxy
union seemed more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should convert objects from configparser module into plain dicts before passing them into the backend? I think that would be cleaner
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
I think this could be merged now, and with a squash merge. This just adds type hints to function/method signatures and is already touching quite many lines. However, runtime behaviour should not be affected. There are many mypy errors (see PR description/first comment), but resolving many of them would need modifying actual code (which could then affect runtime, e.g.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, let's leave the next steps (e.g. fixing mypy warnings, possible use of mypyc) to subsequent PRs. I agree that a squash merge makes sense.
The first round of type hints were generated by running
monkeytype apply --pep_563
on all modules.Then type hints have been manually corrected, and simplified by (at least with)
float
forint | float
(allowed by PEP484)dict[X, Any]
for many dictionaries, because otherwise more narrow union types become long and verbose (the type of the key/value cannot be set as union inside the dict, e.g.dict[str, float | str]
does not work, but this should be written asdict[X, str] | dict[X, int]
; )float
fornumpy.float64
andint
fornumpy.int32
etc.I have been using this mypy configuration (in
mypy.ini
), which allows to just runmypy
in root of Annif repo and it makes mypy ignore a bunch of errors:With this config mypy gives "only" 20 errors.
Show errors
When not ignoring any errors, mypy throws 182 of them.
Show all errors