-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add standardize and normalize #339
base: master
Are you sure you want to change the base?
Conversation
…py of data, have invalid col check, force float dtype have better docs, standardization uses ddof=0
… (separate functions for raster and vector data)
@em-t or @lehtonenp , would you have time to review this? Or perhaps @msmiyels or @nialov if you are not too busy. |
@nmaarnio, I got time to review. So, I'll review this one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran tests and cli functions successfully.
I was wondering whether we should have private functions for both normalize and standardize just like in other functions. Are private functions needed? I understand that the functions are short. But the somewhat lengthy if else statements would not be so cluttered as now.
Initially we implemented private functions for all tools, but this aspect our style guide has loosened to the point where it is not required anymore. I personally started shifting towards implementing one or more private functions only when I felt they were useful / the function had more contents than just checks and simple call to a library function. Additionally, the gh-pages documentation site generated by Here I felt they would not add much and was not sure which part(s) to separate. However, I am open to suggestions and not against refactoring this! Do you have a specific suggestion for diving the code? |
I see that the private functions would not add much value to normalize and standardize for the given reasons. It is a plus that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests and cli functions work as expected. Private function is not required.
Approving.
@nmaarnio sorry for beeing late on this. We actually have the What we do not have there yet is handling of tabular data (and I don´t know the CLI status of those), since this was decided during the review process of the functions. |
Could you point out the |
No worries. I was aware that they can be performed with the existing transformations functions, but I thought it a good idea to create public functions that are called There are some differences with the inputs and handling. I myself prefer that operations are defined for both vector and raster data if applicable for the plugin and CLI, but defining the core operation on data (Numpy array or Pandas DF) if it makes sense. Maybe we should give this a little thought now, should we merge this or not and how to proceed. |
@nmaarnio I think it is quite easy to bring in confusion here, since there can be only very subtle differences between those terms and their purpose. Generally:
Explicitly:
So it looks that there are some inconsistencies among the naming of these functions, but all of these are What about naming these functions Alternatively, change the existing public functions to accept tabular data? I know it's repetition, but this was explicitely excluded during the review process of the transformation functions 😵 Another thought that just came in 🤯: if it's only for the Plugin 📺 or even CLI ⚙️, we could just name it "stan...", "norm..." and call the existing functions under the hood. What do you think? |
Hi @msmiyels , I am revisiting this now. So I agree that a good way to implement for example I guess accepting tabular data or just vector datasets is another thing. I don't know how many people wish the set of transformation tools to accept vector data input in QGIS plugin, but this would mean changes to all of the transformation functions. So at least for now it would make sense to either create a simple higher-level function for |
No description provided.