Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix some issues with equality of factorizations #41228

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

simeonschaub
Copy link
Member

  • hash did not respect the type of a factorization, so completely
    different factorizations with the same underlying data would result in
    same hash leading to inconsistencies with isequal. This likely
    doesn't occur very often in practice, but definitely seems worth
    fixing.
  • == and isequal only returned true if two factorizations are of
    exactly the same type, which is inconsistent with their implementation
    for other objects and with the definition of hash for factorizations.
  • Equality for QRCompactWY did not ignore the subdiagonal entries of
    T leading to nondeterministic behavior. Perhaps T should be
    directly stored as UpperTriangular in QRCompactWY, but that seems
    potentially breaking.

Relying on implementation details of DataType here is certainly less
than ideal, but I could not come up with a nicer solution.

@simeonschaub simeonschaub added linear algebra Linear algebra bugfix This change fixes an existing bug backport 1.6 Change should be backported to release-1.6 backport 1.7 labels Jun 14, 2021
lq,
lu,
qr,
x -> qr(x, ColumnNorm()),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be manually changed to qr(x, Val(true)) when backporting this to 1.6. @KristofferC what is the best way to ensure this? Should I maybe remove the backport-1.6 label here and open a new PR with that change against the release-1.6 branch?

- `hash` did not respect the type of a factorization, so completely
  different factorizations with the same underlying data would result in
  same `hash` leading to inconsistencies with `isequal`. This likely
  doesn't occur very often in practice, but definitely seems worth
  fixing.
- `==` and `isequal` only returned true if two factorizations are of
  exactly the same type, which is inconsistent with their implementation
  for other objects and with the definition of `hash` for factorizations.
- Equality for `QRCompactWY` did not ignore the subdiagonal entries of
  `T` leading to nondeterministic behavior. Perhaps `T` should be
  directly stored as `UpperTriangular` in `QRCompactWY`, but that seems
  potentially breaking.

Relying on implementation details of `DataType` here is certainly less
than ideal, but I could not come up with a nicer solution.
@simeonschaub simeonschaub force-pushed the sds/fix_hash_factorization branch from ae06e73 to 642e6e9 Compare June 14, 2021 22:11
@simeonschaub simeonschaub added the equality Issues relating to equality relations: ==, ===, isequal label Jun 16, 2021
@KristofferC KristofferC mentioned this pull request Jun 17, 2021
20 tasks
Base.:(==)( F::T, G::T) where {T<:Factorization} = all(f -> getfield(F, f) == getfield(G, f), 1:nfields(F))
Base.isequal(F::T, G::T) where {T<:Factorization} = all(f -> isequal(getfield(F, f), getfield(G, f)), 1:nfields(F))::Bool
function Base.hash(F::Factorization, h::UInt)
return mapreduce(f -> hash(getfield(F, f)), hash, 1:nfields(F); init=hash(typeof(F).name.wrapper, h))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been told that you should never fetch the internal fields like this. Maybe you can hash propertynames instead?

More generally, should the hash traverse propertynames instead of the fields? That should avoid the issue with non-active memory affecting the hash and maybe make the specialized methods redundant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been told that you should never fetch the internal fields like this. Maybe you can hash propertynames instead?

I agree it's a bit ugly, but I am unsure just using propertynames for the hash is quite enough, since a lot of them like vectors and values for Eigen are quite generic and not necessarily unique. Perhaps that is being a bit too paranoid, but hashing by type identity seems more formally correct, at least to me. For a stdlib, relying on internals is probably also not quite as bad, since it will always get tested and updated alongside other Base changes.

More generally, should the hash traverse propertynames instead of the fields?

Yeah, maybe? I think that would mean hashing the same data twice for some factorizations though, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a bit ugly

I don't think the reason to avoid it was aesthetics but it's really not my department. Others should comment that if it should be avoided. Maybe it used to be a problem but no longer is. Otherwise, let's just keep it.

I think that would mean hashing the same data twice for some factorizations though, right?

Yes. For e.g. Cholesky it would probably end up hashing the same data three times so we should probably have specialized versions. However, hashing the fields seems wrong since the inactive memory will affect the hash so I think using the properties will be the correct, although slow, fallback. E.g.

julia> A = Symmetric(randn(3,3) + 10I, :U)
3×3 Symmetric{Float64, Matrix{Float64}}:
 8.52303    0.491492   0.311278
 0.491492   9.43415   -1.63795
 0.311278  -1.63795    9.80197

julia> Ac = copy(A);

julia> Ac.data[3,1] += 1
1.9147549806916584

julia> hash(cholesky(A)) == hash(cholesky(Ac))
false

julia> cholesky(A).U == cholesky(Ac).U
true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main reason we discourage use of internal APIs like this is because they might change at any time, which will break such code. I don't think there are any other glaring problems with this approach.

Using propertynames for the fallback seems reasonable, I will change that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, trying this locally, using propertynames fails for bunchkaufman.

For e.g. Cholesky it would probably end up hashing the same data three times so we should probably have specialized versions.

It seems like this is an issue for quite a lot of factorizations, so if we want this to be efficient, we would end up having to define this manually for a lot of cases, which kind of defeats the purpose of having this definition in the first place.

The more I think about it, the more I am convinced that we should just remove these definitions for the abstract type Factorization, since they really don't make much sense in terms of the abstract Factorization interface, but instead have a macro similar to https://github.com/andrewcooke/AutoHashEquals.jl, which defines these for each factorization separately. That would also avoid having to rely on F.name.wrapper. What do you think? It might be somewhat breaking if users define their own factorization types though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump. How do you think we should proceed here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the macro based version end up inspecting the fields which was what I argued against in #41228 (comment)

simeonschaub added a commit that referenced this pull request Jun 25, 2021
Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior. Perhaps `T` should be
directly stored as `UpperTriangular` in `QRCompactWY`, but that seems
potentially breaking.

This is pulled out from #41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.
simeonschaub added a commit that referenced this pull request Jun 28, 2021
Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior.

This is pulled out from #41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.
simeonschaub added a commit that referenced this pull request Jun 28, 2021
Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior.

This is pulled out from #41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.
KristofferC pushed a commit that referenced this pull request Jun 29, 2021
* fix equality of QRCompactWY (#41363)

Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior.

This is pulled out from #41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.
@KristofferC KristofferC mentioned this pull request Jun 29, 2021
45 tasks
KristofferC pushed a commit that referenced this pull request Jun 30, 2021
Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior.

This is pulled out from #41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.

(cherry picked from commit 74fab49)
johanmon pushed a commit to johanmon/julia that referenced this pull request Jul 5, 2021
Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior.

This is pulled out from JuliaLang#41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.
@KristofferC KristofferC mentioned this pull request Nov 19, 2021
29 tasks
@KristofferC KristofferC mentioned this pull request Jan 10, 2022
50 tasks
@KristofferC KristofferC mentioned this pull request May 16, 2022
45 tasks
staticfloat pushed a commit that referenced this pull request Dec 23, 2022
* fix equality of QRCompactWY (#41363)

Equality for `QRCompactWY` did not ignore the subdiagonal entries of
`T` leading to nondeterministic behavior.

This is pulled out from #41228, since this change should be less
controversial than the other changes there and this particular bug just
came up in ChainRules again.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.6 Change should be backported to release-1.6 bugfix This change fixes an existing bug equality Issues relating to equality relations: ==, ===, isequal linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants