Feature request: relieve constraints on non edsnlp custom attributes #220
Comments
Thank you for letting us know! Indeed, this is problematic. This change was made to provide uniform, generic access to the normalized entity value (see #47), which is useful for printing results, exporting as a dataframe, etc., regardless of the span label. In any case, you're right:
Thanks for the PR link, your approach makes more sense to me now. However, I still find the use you are suggesting for the "value" extension, as a label_ getter, dubious. You are basically erasing this extension by tying it to "label_", which is limited to span categories, when it could serve many different use cases. I do understand the wish to store normalized values, but why not read the "label_" attribute directly when needed?

In my use case, I see the value extension more as a free-form entry that can be set and overridden by different processes, much like how you use the value of your measurements. I run various pipelines, each storing information in its own specifically named attributes, and a "decision" pipeline that settles on a value given the information on the spans. That means the value is not necessarily normalized and can take many shapes, including custom models.

I think attributes with strong constraints should be isolated under very specific names to avoid overlapping with more general attributes, if that makes sense. Personally, I'm not afraid of verbosity: we could imagine "span._.label_getter" for the same use case. Anyway, thanks for explaining your view. I can understand that multiple approaches can be considered.
Agreed, we will soon allow this. But in the case a value has not been set by the user, we will keep the getter behavior.
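One way to read this fallback is a getter/setter pair in which a user-set value wins and the label is only the default. The sketch below is an illustration under assumptions, not the EDS-NLP implementation: `ToySpan`, `value_getter`, `value_setter` and the `"value"` storage key are hypothetical names, and a plain dataclass stands in for a spaCy span so the snippet runs without spaCy installed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Sentinel so that None remains a legitimate user-set value.
_UNSET = object()


@dataclass
class ToySpan:
    """Hypothetical stand-in for a spaCy Span (only what the sketch needs)."""
    label_: str
    user_data: Dict[str, Any] = field(default_factory=dict)


def value_getter(span: ToySpan) -> Any:
    """Return the user-set value if there is one, else fall back to the label."""
    stored = span.user_data.get("value", _UNSET)
    return span.label_ if stored is _UNSET else stored


def value_setter(span: ToySpan, value: Any) -> None:
    """Store the value set by the user so the getter can find it later."""
    span.user_data["value"] = value


span = ToySpan(label_="drug")
default_value = value_getter(span)   # nothing set yet: falls back to the label
value_setter(span, {"dose": "500 mg"})
custom_value = value_getter(span)    # the user-set value now takes priority
print(default_value, custom_value)
```

With spaCy itself, a pair shaped like this could presumably be registered through `Span.set_extension("value", getter=..., setter=..., force=True)`, with the stored value kept in `span.doc.user_data` rather than on a toy object.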
It's true that just typing
In our case, a normalized value can also be an instance of a specific class (for instance an |
Feature type
Enhance compatibility of EDSnlp custom attributes with potential external pipelines.
Description
In the BaseComponent class, in this commit, you added this line:
Doing this, you are enforcing (overwriting if already defined) an attribute that is not edsnlp-specific, i.e. not named specifically for your use case, and that could very easily be needed by anyone using your package as part of a broader pipeline (see the spaCy good practices regarding naming components and attributes).
Forcing a getter function means that if a component later tries to set a value, it will be ignored.
Example of potential conflict:
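A minimal sketch of how such a conflict can arise, assuming the enforced extension is "value" with a getter returning `span.label_`, as the discussion suggests. This is a toy model rather than spaCy or EDS-NLP code: `ToySpan`, `EXTENSIONS`, `set_extension`, `set_value` and `get_value` are hypothetical stand-ins that mimic one behavior only, namely that a registered getter takes precedence over any stored value on read.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class ToySpan:
    """Hypothetical stand-in for a spaCy Span."""
    label_: str
    user_data: Dict[str, Any] = field(default_factory=dict)


# Extension name -> getter (None means "plain stored attribute").
EXTENSIONS: Dict[str, Optional[Callable[[ToySpan], Any]]] = {}


def set_extension(name: str, getter: Optional[Callable[[ToySpan], Any]] = None) -> None:
    """Register an extension, overwriting any earlier definition (like force=True)."""
    EXTENSIONS[name] = getter


def set_value(span: ToySpan, name: str, value: Any) -> None:
    """Store a value; no setter is registered, so this only writes to user_data."""
    span.user_data[name] = value


def get_value(span: ToySpan, name: str) -> Any:
    """Read an extension: a registered getter wins over any stored value."""
    getter = EXTENSIONS[name]
    if getter is not None:
        return getter(span)
    return span.user_data.get(name)


# An external pipeline declares a free-form "value" attribute...
set_extension("value")
# ...and the library later re-registers it as a getter on the label.
set_extension("value", getter=lambda span: span.label_)

span = ToySpan(label_="drug")
set_value(span, "value", {"dose": "500 mg"})  # the external component sets its value
result = get_value(span, "value")
print(result)  # the label comes back; the stored dict is silently ignored
```

The write does not fail, which is what makes the conflict hard to notice: the value is stored, but every later read goes through the getter.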
With my modest experience, I would suggest avoiding enforced attributes in general, or, if they are necessary, renaming the attribute to avoid conflicts. If that is not possible, allow the attribute to be renamed, and/or make sure users know which attributes you are enforcing. (I believe this should be made very clear in the docs, especially for attributes such as "value".)
I ran into this conflict when upgrading edsnlp from 0.7.4 to 0.9. I might be missing something here and would love to hear the reasons if this is absolutely needed.