-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support spatial transform
in estimators that inherit from TransformerMixin
#16
Comments
Looks great @grovduck, thanks! I just did a quick check and it looks like nearly all transformers implement from sklearn.utils.discovery import all_estimators
for name, trans in all_estimators("transformer"):
if not hasattr(trans, "get_feature_names_out"):
print(name)
"""
FeatureHasher
HashingVectorizer
LabelBinarizer
LabelEncoder
MultiLabelBinarizer
PatchExtractor
""" At a glance, none of these seem relevant to applying in 2D space. |
That definitely seems like a reasonable approach to me - those transformers don't strike me as any that would be necessary. The flip side would be whether those estimators/transformers that do implement |
Yeah, most of those 84 probably would never be used, but I suppose if they follow a consistent protocol and we can implement them all in one go, there's no harm (maybe with a disclaimer that we only explicitly test/support a subset of them?). One thing we'll need to watch out for is any estimator/transformer that modifies the spatial shape, i.e. that returns more or fewer samples than it was fit with, since that would break the Dask side. I'm not aware of anything like that, but I've probably never touched 90% of the functionality in sklearn, so I wouldn't be shocked. I'm pretty excited for this feature - being able to run PCA or StandardScaler on images seamlessly will be great! |
Good call. I recognized only a handful of those transformers as well, so I'm unfamiliar if any would modify the shape, but definitely something to watch out for.
Absolutely! Just to be clear, your plan is to tackle #13 before handling this one, correct? Please let me know if there are "side jobs" on either of these issues that you'd like my help with (other than reviews). |
Yes, I made some changes in #18 that should hopefully reduce some code duplication when adding new methods like this (although that means your implementation will need to be refactored a little bit to match, sorry!). If you want to tackle this issue once #18 is merged, that would be great! You've got a better idea of the real-world use cases for this, and you've already figured out the tricky part of getting the output shape. |
For some use cases, it will be valuable to return the spatial representation of an estimator's
transform
method where that estimator inherits from sklearn'sTransformerMixin
(for example, sklearn'sPCA
or sknnr'sCCATransformer
).As of #12,
sknnr-spatial
supportspredict
andkneighbors
in a functional context (soon to be implemented as an estimator wrapper in #13). Supportingtransform
will likely be an extension of this logic. Based on some initial crude experimentation, the following code implementstransform
using a functional API. This code is not fully tested, but introducing here to keep a record of what was done.src/sknnr-spatial/image/_base.py
src/sknnr-spatial/image/ndarray.py
src/sknnr-spatial/image/_dask_backed.py
src/sknnr-spatial/image/dataarray.py
src/sknnr-spatial/image/dataset.py
The text was updated successfully, but these errors were encountered: