#269: fixed token class span output #270
result_output = Output(result[0].rows)
mock_base_model_factory: Union[ModelFactoryProtocol, MagicMock] = create_autospec(ModelFactoryProtocol,
    _name="mock_base_model_factory")
number_of_intendet_used_models = params.expected_model_counter  # todo is this always same?
This depends on the test case, so having the number in the parameters is correct
Only the name expected_model_counter is a bit odd. Maybe expected_model_calls or something like that.
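For illustration, a minimal sketch of keeping the expected number of factory calls in the per-test-case parameters and checking it against an autospec'd factory. `ModelFactory`, `from_pretrained`, `load_models`, `count_factory_calls`, and the two parameter classes are hypothetical stand-ins, not the project's code; only `create_autospec` from `unittest.mock` is taken from the snippet above. The "load each distinct model once" behaviour is an assumption of the sketch.

```python
from typing import Protocol
from unittest.mock import create_autospec


class ModelFactory(Protocol):
    """Hypothetical stand-in for the project's model factory protocol."""

    def from_pretrained(self, model_name: str) -> object: ...


class SingleModelParams:
    """Hypothetical test case: the same model is used in every batch."""
    model_names = ["model_a", "model_a"]
    expected_model_calls = 1


class MultipleModelParams:
    """Hypothetical test case: two distinct models are used."""
    model_names = ["model_a", "model_b"]
    expected_model_calls = 2


def load_models(factory: ModelFactory, model_names: list[str]) -> None:
    """Loads each distinct model once (assumed caching behaviour)."""
    for name in dict.fromkeys(model_names):  # de-duplicate, keep order
        factory.from_pretrained(name)


def count_factory_calls(params) -> int:
    """Runs the loader against a mock factory and returns its call count."""
    mock_factory = create_autospec(ModelFactory, _name="mock_model_factory")
    load_models(mock_factory, params.model_names)
    return mock_factory.from_pretrained.call_count
```

A test would then compare `count_factory_calls(params)` with `params.expected_model_calls`, so the expected value lives next to the data that determines it.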
…f agg strategy is "none"
...df_wrapper_params/sequence_classification/error_on_prediction_single_model_multiple_batch.py
class ErrorNotCachedMultipleModelMultipleBatch:
    """
    not cached error, multiple model, multiple batch
    """
    expected_model_counter = 0
    expected_model_counter = 1
Same as https://github.com/exasol/transformers-extension/pull/270/files#r1845243573 I guess, but shouldn't it be called twice?
Improve docstring
improve docstrings
new issue for docstring improvements #276
class ErrorNotCachedSingleModelMultipleBatch:
    """
    not cached error, single model, multiple batch
    """
    expected_model_counter = 1
    expected_model_counter = 0
Improve docstring
Improve docstring
entity_covered_text, entity_type, score, entity_docid, entity_char_begin, entity_char_end,
error_msg)]

#todo if use in all tests
If it is a permanent todo, please create a ticket
removed
score=0.1
error_msg = None

#todo comment explain entity/token naming mess
What about this todo?
removed
def run(ctx: UDFContext):
    udf.run(ctx)

class ErrorOnPredictionMultipleModelMultipleBatch:
    """
    not cached error, multiple model, multiple batch
improve docstring
...ts/udf_wrapper_params/token_classification/error_not_cached_multiple_model_multiple_batch.py
...ests/udf_wrapper_params/token_classification/error_not_cached_single_model_multiple_batch.py
...udf_wrapper_params/token_classification/error_on_prediction_multiple_model_multiple_batch.py
def run(ctx: UDFContext):
    udf.run(ctx)

class ErrorOnPredictionSingleModelMultipleBatch:
    """
    error on prediction, single model, multiple batch,
Improve docstring
tests/unit_tests/udf_wrapper_params/token_classification/make_data_row_functions.py
entity_type="ENTITY_TYPE"
score=0.1
error_msg = None
Add type hints to functions
also added to new ticket
while the type/class of the found token is called "entity_group".
unless aggregation_strategy == "none", then the type/class of the found
token is called "entity" in the model output.
returns a list of number_entities times the model output row.
Use proper docstring formatting for return values so that we have a unified style across projects. See https://google.github.io/styleguide/pyguide.html#383-functions-and-methods
also added to new ticket
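As an illustration of the requested style, a sketch of a type-hinted helper with a Google-style docstring that covers the "entity" versus "entity_group" naming described in the quoted docstring. The function name `make_model_output_rows`, its parameters, and the placeholder values for `word`, `start`, and `end` are assumptions for illustration and not the PR's actual code.

```python
from typing import Any, Dict, List


def make_model_output_rows(
    number_entities: int,
    aggregation_strategy: str,
    entity_type: str = "ENTITY_TYPE",
    score: float = 0.1,
) -> List[Dict[str, Any]]:
    """Builds mock token classification model output.

    The type/class of a found token is stored under "entity_group",
    unless aggregation_strategy is "none", in which case it is stored
    under "entity".

    Args:
        number_entities: How many identical output rows to return.
        aggregation_strategy: Aggregation strategy passed to the model.
        entity_type: Value to use for the token type/class.
        score: Score to use for every row.

    Returns:
        A list containing the model output row number_entities times.
    """
    label_key = "entity" if aggregation_strategy == "none" else "entity_group"
    # Placeholder values; a real test would take these from its parameters.
    row = {"word": "text", "start": 0, "end": 20, label_key: entity_type, "score": score}
    return [dict(row) for _ in range(number_entities)]
```

The `Args:` and `Returns:` sections follow the layout from the linked style guide, which makes the return value explicit instead of burying it in the description.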
def make_number_of_strings(input_str: str, desired_number: int):
    """
    returns desired number of "input_strX", where X is counting up to desired_number.
Docstring style
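One possible shape for this helper in the suggested docstring style, as a sketch only. The signature is taken from the quoted snippet; the return type hint, the function body, and the counter starting at 1 are assumptions, since the original implementation is not shown here.

```python
from typing import List


def make_number_of_strings(input_str: str, desired_number: int) -> List[str]:
    """Creates numbered variants of a string.

    Args:
        input_str: Base string to which a counter is appended.
        desired_number: Number of strings to create.

    Returns:
        A list of desired_number strings of the form "<input_str>X",
        where X counts from 1 up to desired_number (the start value of
        1 is an assumption of this sketch).
    """
    return [f"{input_str}{i}" for i in range(1, desired_number + 1)]
```

With this sketch, `bfs_conn1, bfs_conn2 = make_number_of_strings("bfs_conn", 2)` unpacks two distinct connection names.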
@@ -110,6 +119,66 @@ def create_mock_metadata(udf_wrapper):
    )
    return meta

# todo these functions should be reusable for the other unit tests. should we move them to a utils file or something?
If you need them, yes it is probably a good idea to move them. However, I suggest you create a ticket and do it in another PR
added to #274
token_docid = 1
start = 0
end = 20
bfs_conn1, bfs_conn2 = make_number_of_strings(bucketfs_conn, 2)  # todo why two in this test case? multiple model could still be same bfs con right?
Probably no specific reason; it might make sense to use the same bfs conn.
changed
@@ -13,34 +13,33 @@ class MultipleModelMultipleBatchComplete:
    data_size = 2
    n_entities = 3

    bfs_conn1, bfs_conn2 = make_number_of_strings(bucketfs_conn, 2)  # todo why two in this test case? multiple model could still be same bfs con right?
    sub_dir1, sub_dir2 = make_number_of_strings(sub_dir, 2)
Maybe for another PR: multiple sub_dirs might not be necessary either.
All Submissions:
[CodeBuild] to the commit message
fixes #269
fixes #272
fixes #273