Issue/classifier classes/test #481
Conversation
Remove code repetition:
- If you are going to keep using unittest, create the test once, and create a list of classifiers. Then call the test for each element of the list inside a subTest block (see the sketch after this comment).
- Otherwise, you can investigate how to do the same using pytest fixtures.
The method you choose is up to you, but it should have minimal repetition and be legible.
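A minimal sketch of that pattern, assuming a plain unittest.TestCase; the data and the scikit-learn classifiers below are stand-ins chosen only to keep the example self-contained, and the real suite would use the skfda estimators and functional data instead.

import unittest

import numpy as np
from sklearn.neighbors import KNeighborsClassifier  # stand-in classifiers,
from sklearn.tree import DecisionTreeClassifier     # not the skfda ones


class TestClassifierClasses(unittest.TestCase):
    """Write the check once and reuse it for every classifier."""

    def setUp(self) -> None:
        rng = np.random.default_rng(0)
        self.X_train = rng.normal(size=(30, 5))
        self.y_train = np.resize(np.array(["a", "b"]), 30)
        self.tested_classifiers = [
            KNeighborsClassifier(),
            DecisionTreeClassifier(),
        ]

    def test_classes(self) -> None:
        for clf in self.tested_classifiers:
            # The subTest label identifies the failing classifier by name.
            with self.subTest(classifier=clf.__class__.__name__):
                clf.fit(self.X_train, self.y_train)
                np.testing.assert_array_equal(
                    clf.classes_,
                    np.array(["a", "b"]),
                )


if __name__ == "__main__":
    unittest.main()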
I have made the corresponding changes, but I am not sure if the typing in the list of tested classifiers is correct. I assume using |
            stratify=y,
            random_state=0,
        )
        self._X_train = X_train
As test classes are not meant to be used by users, you can make their fields public without problem. That way it is more legible.
"""Check classes attribute.""" | ||
# Iterate over all classifiers with index | ||
for i, clf in enumerate(self.tested_classifiers): | ||
with self.subTest(i=i): |
The values passed to subTest are used for display purposes if the test fails. The number of the iteration is not very informative in that regard. Use the class name instead.
        self._y_train = y_train

        self.classes = _classifier_get_classes(self._y_train)[0]
        self.tested_classifiers: list[ClassifierMixin[FData, NDArrayAny]] = [
Until we cease support for Python 3.8 you need to use List (imported from typing) instead.
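For illustration only, the 3.8-compatible annotation could look like the line below; scikit-learn's ClassifierMixin stands in here for the skfda mixin used in the PR.

from typing import List

from sklearn.base import ClassifierMixin  # stand-in for the skfda mixin

# Built-in generics such as list[...] are only valid annotations on
# Python >= 3.9 (PEP 585); typing.List also works on 3.8.
tested_classifiers: List[ClassifierMixin] = []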
        self._y_train = y_train

        self.classes = _classifier_get_classes(self._y_train)[0]
        self.tested_classifiers: list[ClassifierMixin[FData, NDArrayAny]] = [
Right now the tests won't pass, as no fixes have been done yet, so I cannot accept it. I propose leaving the list empty and then adding each fixed classifier in its corresponding PR.
    def setUp(self) -> None:
        """Establish train and test data sets."""
        X, y = fetch_growth(return_X_y=True)
        X_train, X_test, y_train, _ = train_test_split(
For this test we don't actually require splitting, do we?
    def setUp(self) -> None:
        """Establish train and test data sets."""
        X, y = fetch_growth(return_X_y=True)
I would use synthetic data for this test, as the actual functions are not used. I would also try a case where y consists of string labels and another where numbers are used instead.
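One possible way to cover both cases, sketched with illustrative names and sizes: build each target array by tiling a small set of class labels, once with strings and once with numbers.

import numpy as np

n_samples = 30

# One run with string class labels, another with numeric labels.
classes_list = [
    np.array(["class_a", "class_b"]),
    np.array([0, 1, 2]),
]
y_list = [np.resize(classes, n_samples) for classes in classes_list]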
Codecov Report
Base: 85.28% // Head: 85.31% // Increases project coverage by +0.02%.
Additional details and impacted files

@@            Coverage Diff             @@
##           develop     #481      +/-   ##
===========================================
+ Coverage    85.28%   85.31%   +0.02%
===========================================
  Files          137      138       +1
  Lines        11044    11065      +21
===========================================
+ Hits          9419     9440      +21
  Misses        1625     1625
            message = (
                f'Testing classifier {clf.__class__.__name__} '
                f'with classes {classes}'
            )
            with self.subTest(msg=message):
This is enough:
-            message = (
-                f'Testing classifier {clf.__class__.__name__} '
-                f'with classes {classes}'
-            )
-            with self.subTest(msg=message):
+            with self.subTest(classifier=clf, classes=classes):
            np.resize(classes, n_samples)
            for classes in self.classes_list
        ]
        self.X = make_multimodal_samples(
I would rather use make_gaussian_process unless you really need something that only make_multimodal_samples could provide. Gaussian processes are the equivalent of the normal distribution for functional data; the multimodal samples are used only in registration examples.
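A sketch of generating the synthetic functional covariates with make_gaussian_process from skfda.datasets; the sample and grid sizes are placeholders.

from skfda.datasets import make_gaussian_process

# 30 synthetic trajectories discretized on 50 grid points.
X = make_gaussian_process(
    n_samples=30,
    n_features=50,
    random_state=0,
)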
Tests for ensuring the existence of the classes_ attribute in a classifier after calling the fit method.