-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide standardize API for 1D array #490
Conversation
Codecov Report
@@ Coverage Diff @@
## master #490 +/- ##
==========================================
- Coverage 83.99% 83.95% -0.04%
==========================================
Files 21 21
Lines 2162 2163 +1
==========================================
Hits 1816 1816
- Misses 346 347 +1
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #490 +/- ##
==========================================
+ Coverage 90.15% 90.47% +0.32%
==========================================
Files 21 21
Lines 2031 2100 +69
==========================================
+ Hits 1831 1900 +69
Misses 200 200
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. @wildart, is this OK?
Correct the implementation, add tests and update docs. |
Can you also adjust |
Do you mean to implement a Sorry for adding unrelated whitespaces. My editor do so while I save. And could I merge
into
also? |
Co-Authored-By: yuehhua <[email protected]>
Yes, you should do that. |
Sorry, I hadn't realized that this is inconsistent with the fact that vectors are treated as column vectors in Julia. Can you change this? I think this also illustrates a problem that I hadn't spotted when reviewing the original PR which added transformations: we shouldn't apply standardization to rows by default, as it's inconsistent with what e.g. |
Great! OK, Let's check our plan.
The Suppose we have a m-by-n matrix
Similar to
We could separate the API and implementation. API design would follow the design above. The implementation goes in column-wise fashion for calculation efficiency. Thus, we provide the efficient implementation with user-friendly API. The details of implementation: For For The addition of
Could I change/redesign the behavior of whole transformations.jl to achieve the description above? |
Yes,
Yes, your proposal sounds consistent with how
Yes, you can use Otherwise, the plan sounds good. |
A intuitive and user-friendly design is necessary, however, I also considering the consistency between APIs. So far, I saw the counter-arguments are also used in Flux.jl. I have no further guesses about this. Maybe some suggestions?
I also considered the scenarios you mentioned above. Transpose doesn't copy the matrix, while the deep copy may not get efficiency. So, I decide to implement it straight forward. |
What do you mean about Flux.jl? AFAIK the difference in conventions is only about defaults, which can be tackled separately from the meaning we assign to |
For your reference, FluxML/Flux.jl#563 |
I think that the design of arguments is not related to the domain (statistics/machine learning). |
That's fine with me, but the fact that the current method chose a different default indicates that not everybody feels that way. Anyway as a first step we should add a |
Something like this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dims
argument handling
src/transformations.jl
Outdated
if dims == 1 | ||
return transform!(x, t, x) | ||
elseif dims == 2 | ||
return transform!(x', t, x')' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return x
instead of calling '
, which would return an Adjoint
object wrapping it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still applies:
return transform!(x', t, x')' | |
transform!(x', t, x') | |
return x |
Same below. Tests should be adapted of course.
src/transformations.jl
Outdated
T = eltype(X) | ||
m, s = mean_and_std(X, 2) | ||
|
||
return ZScoreTransform(d, (center ? vec(m) : zeros(T, 0)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dims
must be stored in ZScoreTransform
as @wildart suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still an issue.
src/transformations.jl
Outdated
T = eltype(X) | ||
m, s = mean_and_std(X, 2) | ||
|
||
return ZScoreTransform(d, (center ? vec(m) : zeros(T, 0)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still an issue.
new transform API new reconstruct API Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> fix fix Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> fix Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Fix fix
If we keep the |
I guess |
Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Update src/transformations.jl Co-Authored-By: Milan Bouchet-Valat <[email protected]> Refactor
@nalimilan Is it good? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Round 2
It is common for in-place update functions to pass as a first parameter a variable where the result of the operation will be stored. So, for |
Yeah... I got your point. I know it is common way to go.
If I move dimensions checks from
Implementing a similar algorithm in row is not applicable. |
You should be able to avoid this by creating a new transformation object with the expected |
@nalimilan I committed. Would you please review it? Thank you. |
Co-Authored-By: Milan Bouchet-Valat <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In-place transforms
Fix bug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I found that
standardize
API is just available for 2D array, and not for 1D array.I just add supporting for 1D array.