[alpha] Improvements to ModelWrapper and better QA/Classification implementation #8
Conversation
See `evalem.misc.datasets`. We now have a `datasets.get_squad_v2(...)` function.
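For illustration, a minimal sketch of calling it; only the function name comes from this PR, and the keyword arguments are assumptions:

```python
from evalem.misc import datasets

# Hypothetical arguments; adjust to the actual signature.
data = datasets.get_squad_v2(data_type="validation", nsamples=100)
```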
Now we have `metrics.semantics.SemanticMetric`. There are two implementations for now:
- `metrics.semantics.BertScore`
- `metrics.semantics.BartScore`
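A hedged sketch of invoking one of these metrics, assuming a metric instance is callable on parallel lists of predictions and references (the exact interface is not shown in this PR; `BartScore` would be analogous):

```python
from evalem.metrics import BertScore

# Assumption: a metric instance is callable on predictions/references
# and returns a score structure.
bert_score = BertScore()  # per this PR's minor changes, defaults to bert-base-uncased
result = bert_score(
    predictions=["The cat sat on the mat."],
    references=["A cat was sitting on the mat."],
)
```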
We make use of two kwargs to any model wrapper:
- `inputs_preprocessor` (maps inputs to a specific format, defaults to identity)
- `predictions_postprocessor` (maps model outputs to a specific format, defaults to identity)

Also, `models.HFPipelineWrapperForQuestionAnswering` is created and `models.DefaultQAModelWrapper` is deprecated. A sketch of wiring both hooks is shown below.
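The lambdas here are placeholders, and the dict shapes of the inputs and raw outputs are assumptions about the HF QA pipeline format:

```python
from evalem.models import HFPipelineWrapperForQuestionAnswering

wrapper = HFPipelineWrapperForQuestionAnswering(
    # reshape each dataset record into the QA pipeline's expected format
    inputs_preprocessor=lambda inputs: [
        dict(question=i["question"], context=i["context"]) for i in inputs
    ],
    # keep only the answer text from each raw pipeline output (assumed dict shape)
    predictions_postprocessor=lambda preds: [p["answer"] for p in preds],
)
```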
See `models.defaults.TextClassificationHFPipelineWrapper`. This also improves the construction of the HF pipeline object in the existing wrapper. `evaluators.basics.TextClassificationEvaluator` is also added.
This flag is used to return precision/recall/f1 score per prediction instance.
…rics-models Fix merge conflicts
TextClassificationHFPipelineWrapper: previously, the tokenizer was set to some default. However, that is incorrect; we want the tokenizer to be the one the provided model was trained with. So `tokenizer` now defaults to `None`.
""" | ||
nsamples = nsamples or 0 | ||
data = load_dataset("imdb")[data_type] | ||
data = data.shuffle(seed=42) if shuffle else data |
move seed to a config or a constant.
Ah ya. Good call. The framework-level config could be a nice way to manage these seeds.
Can I resolve this in the next PR? It doesn't hamper the behavior of the framework at this point.
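For reference, one hedged sketch of what that follow-up could look like, with a hypothetical module-level constant standing in for a framework-level config (the `get_imdb` signature here is an assumption reconstructed from the excerpt above):

```python
from datasets import load_dataset

# Hypothetical constant; a framework-level config object could hold this instead.
DEFAULT_SEED = 42

def get_imdb(data_type: str = "test", nsamples: int = 0, shuffle: bool = False):
    nsamples = nsamples or 0
    data = load_dataset("imdb")[data_type]
    # seed is no longer hard-coded at the call site
    data = data.shuffle(seed=DEFAULT_SEED) if shuffle else data
    ...
```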
Major Changes
- `evalem.models.QuestionAnsweringHFPipelineWrapper` and `evalem.models.TextClassificationHFPipelineWrapper` are now the main wrappers for QA and Text Classification tasks respectively. An `hf_params` dict is also provided as a parameter that is used for initializing the HF pipeline.
- `evalem.evaluators.TextClassificationEvaluator` has been added with basic metrics for text classification (F1 score, precision, recall, confusion matrix).
- `evalem.models._base.ModelWrapper` now utilizes 2 distinct processing parameters (one for pre-processing and one for post-processing), each of which should be a `Callable` (lambda function, external module that can be called, etc.):
  - `inputs_preprocessor` works on the input dataset and changes inputs to the models. The `ModelWrapper._preprocess_inputs` method is used, which can also be overridden by any downstream sub-class.
  - `predictions_postprocessor` post-processes the model's predictions. The `ModelWrapper._postprocess_predictions` method is used, which can also be overridden by any downstream sub-class.

Minor Changes
- `evalem.models.DefaultQAModelWrapper` has been deprecated. Users will get a `DeprecationWarning` error when trying to initialize the object.
- `evalem.metrics.BertScore` now uses `bert-base-uncased` as the default model instead of `roberta-large`.
- The `evalem.misc.datasets.get_imdb` function has been added to load the IMDB dataset out-of-the-box.

Usage
QA Task
Defaults
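A sketch of the default QA flow, assuming the pieces from this PR compose as shown (the evaluator name, the callable convention, and the dict keys on `data` are all assumptions):

```python
from evalem.misc import datasets
from evalem.models import QuestionAnsweringHFPipelineWrapper
from evalem.evaluators import QAEvaluator  # hypothetical; not named in this PR

data = datasets.get_squad_v2(data_type="validation", nsamples=10)

wrapper = QuestionAnsweringHFPipelineWrapper()  # default HF QA pipeline
predictions = wrapper(data["inputs"])           # assumed callable convention

evaluator = QAEvaluator()
results = evaluator(predictions=predictions, references=data["references"])
```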
Using a custom model and the post-processing functionality
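A sketch of the customized path; `hf_params` and `predictions_postprocessor` come from this PR, while the model checkpoint and the raw output shape are assumptions:

```python
from evalem.models import QuestionAnsweringHFPipelineWrapper

wrapper = QuestionAnsweringHFPipelineWrapper(
    # forwarded to the HF pipeline initializer (per this PR)
    hf_params=dict(model="deepset/roberta-base-squad2"),
    # reduce each raw QA pipeline output (a dict) to the answer string
    predictions_postprocessor=lambda preds: [p["answer"] for p in preds],
)
```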
Text Classification
Defaults
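A sketch mirroring the QA defaults above, using the wrapper and evaluator added in this PR (the `get_imdb` arguments are assumed from the diff excerpt; the callable convention and `data` keys are assumptions):

```python
from evalem.misc import datasets
from evalem.models import TextClassificationHFPipelineWrapper
from evalem.evaluators import TextClassificationEvaluator

data = datasets.get_imdb(data_type="test", nsamples=10)

wrapper = TextClassificationHFPipelineWrapper()  # default HF classification pipeline
predictions = wrapper(data["inputs"])            # assumed callable convention

evaluator = TextClassificationEvaluator()
results = evaluator(predictions=predictions, references=data["references"])
```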
Customized
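A sketch of a customized classification wrapper; the checkpoint name and output shape are assumptions, and note that per this PR the tokenizer defaults to `None` so the model's own tokenizer is used:

```python
from evalem.models import TextClassificationHFPipelineWrapper

wrapper = TextClassificationHFPipelineWrapper(
    # forwarded to the HF pipeline initializer (per this PR)
    hf_params=dict(model="distilbert-base-uncased-finetuned-sst-2-english"),
    # keep only the predicted label from each raw pipeline output (assumed dict shape)
    predictions_postprocessor=lambda preds: [p["label"] for p in preds],
)
```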