Collection Equality (Sets, Seqs and Maps should be equal by content - modulo type) #1555
Scala behaves as follows:
scala> val set1 = scala.collection.immutable.SortedSet.empty[String] + ("a", "c", "b")
set1: scala.collection.immutable.SortedSet[String] = TreeSet(a, b, c)
scala> val set2 = scala.collection.immutable.Set.empty + ("c", "b", "a")
set2: scala.collection.immutable.Set[String] = Set(c, b, a)
scala> set1.equals(set2)
res0: Boolean = true
scala> set2 == set1
res1: Boolean = true
scala> set1 == set2
res2: Boolean = true
scala> set2.equals(set1)
res3: Boolean = true
Sequences seem to be equal too if the type differs, but Seq/Set does not work that way:
scala> List(1, 2, 3) == Stream(1 ,2 ,3)
res4: Boolean = true
scala> List(1, 2, 3) equals Stream(1 ,2 ,3)
res5: Boolean = true
scala> List(1, 2, 3) equals Set(1 ,2 ,3)
res6: Boolean = false
In Javaslang We should definitely rethink the current behavior, maybe for Changing
Update: For now I will consider it a bug and target 2.0.4. After investigation we might change the target to 3.0.0. |
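For comparison, the JDK's own java.util collections follow the same content-based rules as the Scala session above. The following is only a sketch using standard JDK types (not Javaslang); the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; it only uses standard java.util collections.
final class JdkEqualityDemo {

    // A sorted set and a hash set with the same elements compare equal, in both directions.
    static boolean sortedSetEqualsHashSet() {
        Set<String> sorted = new TreeSet<>(List.of("a", "c", "b"));
        Set<String> hashed = new HashSet<>(List.of("c", "b", "a"));
        return sorted.equals(hashed) && hashed.equals(sorted);
    }

    // Two different List implementations with the same elements in the same order compare equal.
    static boolean arrayListEqualsLinkedList() {
        List<Integer> a = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> b = new LinkedList<>(List.of(1, 2, 3));
        return a.equals(b) && b.equals(a);
    }

    // A List is never equal to a Set, even with the same elements.
    static boolean listEqualsSet() {
        List<Integer> list = List.of(1, 2, 3);
        Set<Integer> set = Set.of(1, 2, 3);
        return list.equals(set) || set.equals(list);
    }

    public static void main(String[] args) {
        System.out.println(sortedSetEqualsHashSet());    // true
        System.out.println(arrayListEqualsLinkedList()); // true
        System.out.println(listEqualsSet());             // false
    }
}
```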
This changes the behavior of collection's Collection Equality
Source: Scala Documentation
A test for sequences on the Scala REPL:
|
It is not a bug - it is a change of behavior. The structural equals method |
Is For
The same code with different values:
So, if all you got is two Because of the above, you can only use Is there something that we could do about this? |
@zsolt-donca thank you! I think on the level of Value the only thing we can do is update the Javadoc. A Traversable, e.g. a HashSet, may contain elements that have no natural order. So there will be no chance for We could also add more introspection methods like
But Value seems to be the wrong place for these methods; they would fit better in Traversable. Note: If we add more of these methods, we can build the Spliterator characteristics automatically. I think this is a good idea! Value could have a default implementation for single-valued types. Traversable should override it and use the introspection methods to build a Spliterator. (see #1635) |
Hi! After some investigation I have found these problems:
@danieldietrich What do you think? Especially about the second problem. |
@v1ctor thank you for your investigation!
We should create static methods located in the package-private class Collections in order to hide these methods from the public API. It is sufficient to do this for Seq, Set, Map and Multimap:
class Collections {
    static <T> boolean equals(Set<T> self, Object that) {
        ...
    }
    static <T> boolean equals(Seq<T> self, Object that) {
        ...
    }
    static <K, V> boolean equals(Map<K, V> self, Object that) {
        ...
    }
    static <K, V> boolean equals(Multimap<K, V> self, Object that) {
        ...
    }
}
We also need to ensure that the hashCodes are equal if the collections are equal, especially for Sets. Because Sets may have different element order we can't just do
We focus only on Set, Seq, Map and Multimap. Iterator, Tree and PriorityQueue are out of scope for this issue. |
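To illustrate the hashCode concern, here is a sketch that contrasts an order-sensitive hash with an order-insensitive one. It borrows the schemes specified by java.util.List#hashCode and java.util.Set#hashCode; whether Javaslang should use the same formula is an assumption, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical helper class, not part of Javaslang.
final class SetHashSketch {

    // Order-sensitive: the 31 * h + e scheme specified by java.util.List#hashCode.
    static int orderedHash(Iterable<?> elements) {
        int h = 1;
        for (Object e : elements) {
            h = 31 * h + Objects.hashCode(e);
        }
        return h;
    }

    // Order-insensitive: the sum of element hashes, as specified by java.util.Set#hashCode.
    // Addition is commutative, so iteration order does not matter.
    static int unorderedHash(Iterable<?> elements) {
        int h = 0;
        for (Object e : elements) {
            h += Objects.hashCode(e);
        }
        return h;
    }

    public static void main(String[] args) {
        List<Integer> asc = Arrays.asList(1, 2, 3);
        List<Integer> desc = Arrays.asList(3, 2, 1);
        System.out.println(orderedHash(asc) == orderedHash(desc));               // false
        System.out.println(unorderedHash(asc) == unorderedHash(desc));           // true
        System.out.println(unorderedHash(asc) == new HashSet<>(asc).hashCode()); // true
    }
}
```

A plain sum is cheap but mixes poorly; a stronger commutative combination would satisfy the same contract.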
No, it was solved #1818 |
@danieldietrich |
Great! Feel free to create an unfinished PR at any time if you want early feedback. |
OUTDATED: The actual version can be found below.
Scala will introduce 'Multiversal Equality', currently explored in Dotty (using type classes / a trait Eq). That means the types are partitioned into multiple disjoint universes (by type) and equality is defined there. Note: we do this already for our types (like Option) by checking In order to keep symmetry, i.e. When computing equals() we determine the actual partition by calculating the upper type-bound of the given collections. Example:
When having equals(SortedSet, SortedSet) we need to respect element order. We do this by using iteration order during comparison.
Idea
We might expect our
(Here X is a collection type and ~ ⊆ X × X, a ~ b := equals(a, b) == true)
I suggest relaxing this equivalence property a bit by allowing sub-partitions
In particular we need that for our sorted collections.
Proposed solution
final class Collections {
    static <T> boolean equals(Seq<T> seq1, Object o) {
        if (o == seq1) {
            return true;
        } else if (seq1 != null && o instanceof Seq) {
            @SuppressWarnings("unchecked")
            final Seq<T> seq2 = (Seq<T>) o;
            return seq1.size() == seq2.size() && areEqual(seq1, seq2);
        } else {
            return false;
        }
    }
    static <T> boolean equals(Set<T> set1, Object o) {
        if (o == set1) {
            return true;
        } else if (set1 != null && o instanceof Set) {
            @SuppressWarnings("unchecked")
            final Set<T> set2 = (Set<T>) o;
            if (set1.size() != set2.size()) {
                return false;
            }
            return !set1.isOrdered() ? set2.forAll(set1::contains) :
                   !set2.isOrdered() ? set1.forAll(set2::contains) : areEqual(set1, set2);
        } else {
            return false;
        }
    }
    ...
    // already exists!
    static boolean areEqual(Iterable<?> iterable1, Iterable<?> iterable2) { ... }
}
Tests:
{
    Set<Integer> set1 = Set(1, 2, 3);
    Set<String> set2 = Set("a", "b", "c");
    println(equals(set1, set2)); // = false (especially no ClassCastException)
    println(equals(set2, set1)); // = false
}
{
    Set<Integer> set1 = SortedSet(1, 2, 3);
    Set<Integer> set2 = API.<Integer> SortedSet((i, j) -> i - j, 1, 2, 3);
    println(equals(set1, set2)); // = true (because of same order)
    println(equals(set2, set1)); // = true
}
{
    Set<Integer> set1 = SortedSet(1, 2, 3);
    Set<Integer> set2 = API.<Integer> SortedSet((i, j) -> j - i, 1, 2, 3);
    println(equals(set1, set2)); // = false (because of different order)
    println(equals(set2, set1)); // = false
}
{
    Set<Integer> set1 = Set(1, 2, 3);
    Set<Integer> set2 = API.<Integer> SortedSet((i, j) -> j - i, 1, 2, 3);
    println(equals(set1, set2)); // = true (ignoring different order)
    println(equals(set2, set1)); // = true
}
References |
@v1ctor There is a counter-example for transitivity of the proposed Set equality:
Set<Integer> a = API.<Integer> SortedSet((i, j) -> i - j, 1, 2, 3);
Set<Integer> b = Set(1, 2, 3);
Set<Integer> c = API.<Integer> SortedSet((i, j) -> j - i, 1, 2, 3);
Then equals(a, b) and equals(b, c) hold, but equals(a, c) does not.
We have to think about another implementation...
In Scala:
scala> val a: Set[Int] = SortedSet(1, 2, 3)((i, j) => i - j)
a: scala.collection.immutable.Set[Int] = TreeSet(1, 2, 3)
scala> val b: Set[Int] = Set(1, 2, 3)
b: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> val c: Set[Int] = SortedSet(1, 2, 3)((i, j) => j - i)
c: scala.collection.immutable.Set[Int] = TreeSet(3, 2, 1)
scala> a == b
res0: Boolean = true
scala> b == c
res1: Boolean = true
scala> a == c
res2: Boolean = true |
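The broken transitivity can be reproduced with plain JDK collections. The sketch below re-creates the proposed rule (compare iteration order only when both sets are sorted, otherwise compare content) using java.util types instead of Javaslang; `proposedEquals` and `sortedSet` are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class, not part of Javaslang.
final class TransitivityDemo {

    // Re-creation of the proposed rule: order matters only if BOTH sets are sorted.
    static boolean proposedEquals(Set<Integer> s1, Set<Integer> s2) {
        if (s1.size() != s2.size()) {
            return false;
        }
        if (s1 instanceof SortedSet && s2 instanceof SortedSet) {
            // Both sorted: compare element by element in iteration order.
            return new ArrayList<>(s1).equals(new ArrayList<>(s2));
        }
        // At least one side is unordered: compare content only.
        return s1.containsAll(s2);
    }

    // Builds a TreeSet that orders its elements with the given comparator.
    static SortedSet<Integer> sortedSet(Comparator<Integer> order, Integer... elements) {
        SortedSet<Integer> set = new TreeSet<>(order);
        set.addAll(Arrays.asList(elements));
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> a = sortedSet(Comparator.naturalOrder(), 1, 2, 3);
        Set<Integer> b = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> c = sortedSet(Comparator.reverseOrder(), 1, 2, 3);
        System.out.println(proposedEquals(a, b)); // true
        System.out.println(proposedEquals(b, c)); // true
        System.out.println(proposedEquals(a, c)); // false, so transitivity is violated
    }
}
```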
We can't just use the
Set<Integer> set1 = SortedSet(1, 2, 3);
Set<String> set2 = SortedSet("a", "b", "c");
println(equals(set1, set2)); // throws ClassCastException
println(equals(set2, set1)); // throws ClassCastException |
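The same effect shows up with the JDK's TreeSet: a lookup has to compare the probe against stored elements, so an element of an unrelated type fails with a ClassCastException, while a hash-based set simply reports a miss. This is a sketch with java.util types only; `containsThrows` is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of Javaslang.
final class SortedContainsDemo {

    // Returns true if looking up `probe` in `set` throws a ClassCastException.
    static boolean containsThrows(Set<?> set, Object probe) {
        try {
            set.contains(probe);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<Integer> sorted = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> hashed = new HashSet<>(Arrays.asList(1, 2, 3));
        // TreeSet must compare "a" with an Integer, which fails.
        System.out.println(containsThrows(sorted, "a")); // true
        // HashSet only uses hashCode/equals, so it just finds nothing.
        System.out.println(containsThrows(hashed, "a")); // false
    }
}
```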
@v1ctor You were right! - using
// scala
trait GenSetLike[A, +Repr] with (A => Boolean) {
  override def equals(that: Any): Boolean = that match {
    case that: GenSet[_] =>
      (this eq that) ||
      (that canEqual this) &&
      (this.size == that.size) &&
      (try this subsetOf that.asInstanceOf[GenSet[A]]
       catch { case ex: ClassCastException => false })
    case _ =>
      false
  }
}
(Source: scala/scala: GenSetLike.scala)
We only need to catch ClassCastException that may occur when calling
Please take the following equals impl for Sets, it obeys all laws of an equivalence relation:
// javaslang
@SuppressWarnings("unchecked")
static <T> boolean equals(Set<T> set1, Object o) {
    if (o == set1) {
        return true;
    } else if (set1 != null && o instanceof Set) {
        final Set<T> set2 = (Set<T>) o;
        try {
            return set1.size() == set2.size() && set1.forAll(set2::contains);
        } catch (ClassCastException x) {
            return false;
        }
    } else {
        return false;
    }
}
Because we can't rely on the order of elements for Sets, we need another hashing strategy. Scala uses the following:
// scala
trait GenSetLike[A, +Repr] with (A => Boolean) {
  override def hashCode() = scala.util.hashing.MurmurHash3.setHash(seq)
}
Proposed hashing strategy see here. |
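The content-only rule with a ClassCastException guard can be tried out against java.util sets. This is a sketch of the same idea, not the Javaslang implementation; `contentEquals` is a hypothetical name, and sets containing null are not considered.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of Javaslang.
final class SafeSetEquality {

    // Two sets are equal if they have the same size and every element of the
    // first is contained in the second. Iteration order is ignored entirely.
    static boolean contentEquals(Set<?> set1, Object o) {
        if (o == set1) {
            return true;
        }
        if (set1 == null || !(o instanceof Set)) {
            return false;
        }
        Set<?> set2 = (Set<?>) o;
        if (set1.size() != set2.size()) {
            return false;
        }
        try {
            for (Object e : set1) {
                if (!set2.contains(e)) {
                    return false;
                }
            }
            return true;
        } catch (ClassCastException x) {
            // A sorted set could not compare an element of an unrelated type.
            return false;
        }
    }

    public static void main(String[] args) {
        Set<Integer> asc = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> desc = new TreeSet<>(Comparator.reverseOrder());
        desc.addAll(Arrays.asList(1, 2, 3));
        Set<Integer> hashed = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<String> strings = new TreeSet<>(Arrays.asList("a", "b", "c"));

        System.out.println(contentEquals(asc, hashed));  // true
        System.out.println(contentEquals(hashed, desc)); // true
        System.out.println(contentEquals(asc, desc));    // true, so transitivity holds here
        System.out.println(contentEquals(asc, strings)); // false, no exception escapes
        System.out.println(contentEquals(strings, asc)); // false
    }
}
```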
* issue #1555: add equals and hashCode, to Seq, Set, Map, Multimap
* issue #1555: rewrite equals methods and fixed all tests
* issue #1555: supress rawtypes warning
* issue #1555: supress unchecked warning
* issue #1555: add tests
* issue #1555: rewrite equals methods
* issue #1555: add some tests
* Hardening collection equality constraints and adding javadoc
Fixed with #1948 |
One would typically expect implementations of a set interface to consider two sets equal regardless of subtype or ordering as long as they contained exactly the same elements.
But in javaslang this:
Results in this:
Is this the intended behavior or could it be improved?