Fix spurious subtype check pruning when both sides have unions #18213

Linyxus · 2023-07-14T23:31:58Z

In TypeComparer, fourthTry calls isNewSubType and isCovered to detect the subtype queries that have been covered by previous attempts and prune them. However, the pruning is spurious when both sides contain union types, as exemplified by the following subtype trace before the PR:

==> isSubType (test1 : (Int | String){def foo(x: Int): Int}) <:< Int | String?
  ==> isSubType (Int | String){def foo(x: Int): Int} <:< Int | String?
    ==> isSubType (Int | String){def foo(x: Int): Int} <:< Int?
    <== isSubType (Int | String){def foo(x: Int): Int} <:< Int = false
    ==> isSubType (Int | String){def foo(x: Int): Int} <:< String?
      ==> isSubType (Int | String){def foo(x: Int): Int} <:< String?
      <== isSubType (Int | String){def foo(x: Int): Int} <:< String = false
    <== isSubType (Int | String){def foo(x: Int): Int} <:< String = false
    // (1): follow-up subtype checks are pruned here by isNewSubType
  <== isSubType (Int | String){def foo(x: Int): Int} <:< Int | String = false
<== isSubType (test1 : (Int | String){def foo(x: Int): Int}) <:< Int | String = false

At (1), the pruning condition is met and the follow-up recursions are skipped. However, in this case it is only after (1) that the refinement on the LHS is dropped and the subtyping between the two identical OrTypes is accepted. The pruning is therefore spurious.

This PR tempers the pruning conditions specified in isCovered and isNewSubType to fix these false negatives.

odersky

I believe this needs a better justification why expensive recursion operations are needed and some reworking to avoid allocations.

odersky · 2023-07-17T15:51:43Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

-      if (isCovered(tp1) && isCovered(tp2))
+
+      def isCovered(tp: Type): (Boolean, Boolean) =
+        var containsOr: Boolean = false


This entails an allocation since variables accessed from local functions are heap allocated

odersky · 2023-07-17T15:54:42Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+      val (covered1, hasOr1) = isCovered(tp1)
+      val (covered2, hasOr2) = isCovered(tp2)
+
+      if covered1 && covered2 && !(hasOr1 && hasOr2) then


So in total, 4 allocations just here, not counting all the lists built in the recursive calls. We should avoid them in subtype checks where possible.

Techniques to do so:

Instead of a variable, pass a second parameter to recur.

Use an enum instead of a pair of booleans. We need three states: Uncovered, CoveredWithOr, Covered
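For illustration only, here is a minimal sketch of the three-state idea on a toy type model. The `Ty` cases, the `Coverage` enum and the `combine` helper are assumptions made up for this example; they are not dotty's classes or the code in this PR. Simple enum cases are singletons, so returning one does not allocate a tuple per call.

```scala
object CoverageEnumSketch:
  // Toy stand-ins for compiler types; these are not dotty's real classes.
  enum Ty:
    case ClassRef(name: String)               // reference to a class
    case AbstractRef(name: String)            // reference to a non-class type
    case Refined(parent: Ty, member: String)  // parent { member }
    case And(left: Ty, right: Ty)
    case Or(left: Ty, right: Ty)

  // Declared from weakest to strongest so that ordinals can be compared.
  enum Coverage:
    case Uncovered, CoveredWithOr, Covered

  import Ty.*, Coverage.*

  // Returns the weaker of two statuses.
  def combine(a: Coverage, b: Coverage): Coverage =
    if a.ordinal <= b.ordinal then a else b

  def coverage(tp: Ty): Coverage = tp match
    case ClassRef(_)        => Covered
    case AbstractRef(_)     => Uncovered
    case Refined(parent, _) => coverage(parent)
    case And(l, r)          => combine(coverage(l), coverage(r))
    // A union is at best CoveredWithOr, whatever its operands are.
    case Or(l, r)           => combine(CoveredWithOr, combine(coverage(l), coverage(r)))
```

In this model, a refined union such as `(Int | String) { foo }` comes out as `CoveredWithOr`, which is the state the pruning condition has to treat differently from `Covered`.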

odersky · 2023-07-17T15:56:49Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+
+      def isCovered(tp: Type): (Boolean, Boolean) =
+        var containsOr: Boolean = false
+        @annotation.tailrec def recur(todos: List[Type]): Boolean = todos match


Why recur over a list of types?

odersky · 2023-07-17T15:57:23Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+              case tp: TypeRef =>
+                if tp.symbol.isClass && tp.symbol != NothingClass && tp.symbol != NullClass then recur(todos)
+                else false
+              case tp: AppliedType => recur(tp.tycon :: todos)


These nested recurrences might turn out to be really expensive. Why do we need them?

The recursion is equivalent to the one before. I refactored it to take a list of types, essentially a worklist of pending recursive invocations, to make the function tail-recursive (e.g. recur(tp.tp1) && recur(tp.tp2) becomes recur(tp.tp1 :: tp.tp2 :: todos)). I thought this would save stack space and could improve performance, but I can revert this refactoring to avoid passing a list around during recursion and creating a new list object at each invocation.

Previously we did not recurse in arguments of AppliedTypes, just in the type constructor. That's the one I would expect to matter most.

Generally, I think using stack is cheaper than allocating.

Here we are not recursing on the arguments of the AppliedType: note that the todos is the working list, not the arguments of the AppliedType.

Thanks for pointing this out! I will revert this rewriting, and in the future I'll prefer reducing heap allocations over achieving tail recursion.

Ah, right, I had misread that!
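To make the two shapes of recursion in this exchange concrete, here is a small sketch on a toy type model. `Ty`, `coveredDirect` and `coveredWorklist` are hypothetical names for this example, not the compiler's code. Both functions visit only the type constructor of an applied type; the worklist variant is tail-recursive, but each `::` allocates a list cell.

```scala
object WorklistSketch:
  // Toy stand-ins for compiler types; these are not dotty's real classes.
  enum Ty:
    case ClassRef(name: String)
    case AbstractRef(name: String)
    case Applied(tycon: Ty, args: List[Ty])
    case And(left: Ty, right: Ty)

  import Ty.*

  // Direct recursion: uses the call stack and builds no worklist.
  def coveredDirect(tp: Ty): Boolean = tp match
    case ClassRef(_)       => true
    case Applied(tycon, _) => coveredDirect(tycon) // arguments are not visited
    case And(l, r)         => coveredDirect(l) && coveredDirect(r)
    case AbstractRef(_)    => false

  // Worklist version: tail-recursive, but every `::` allocates a list cell.
  def coveredWorklist(tp: Ty): Boolean =
    @annotation.tailrec
    def recur(todos: List[Ty]): Boolean = todos match
      case Nil                       => true
      case ClassRef(_) :: rest       => recur(rest)
      case Applied(tycon, _) :: rest => recur(tycon :: rest) // still only the constructor
      case And(l, r) :: rest         => recur(l :: r :: rest)
      case _                         => false
    recur(tp :: Nil)
```

The two functions agree on every input of this toy model; the difference is only where the pending work lives (stack frames versus heap-allocated list cells).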

In fact, maybe it's simplest to just use Ints as the result of recur. Then && could be replaced by min.

Yes, I just pushed the commit that reverts the tailrec rewriting and uses an integer-based representation for the result of isCovered, interpreted bitwise. min is used to aggregate results, as you suggested. (I was at first using a bitwise & for aggregation, but later found that min is more logical.) We use bitwise operations to inspect the results as well.
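A rough sketch of the integer encoding with `min` as the aggregation, under stated assumptions: the numeric values for `CoveredWithOr` and `Covered` and the helper name `combine` are chosen for this example, and only the ordering Uncovered < CoveredWithOr < Covered matters. The `inline` parameter is one way to get short-circuiting without allocating a closure for a by-name argument.

```scala
object IntStatusSketch:
  type Repr = Int

  // Assumed encoding: a larger value means "more covered",
  // so `min` picks the weaker of two statuses.
  val Uncovered: Repr = 1
  val CoveredWithOr: Repr = 2
  val Covered: Repr = 3

  // Hypothetical helper. Because `that` is an inline parameter, its argument
  // expression is only evaluated when the else-branch is reached.
  inline def combine(s: Repr, inline that: Repr): Repr =
    if s == Uncovered then Uncovered else s min that
```

With this encoding, `recur(a) && recur(b)` on booleans becomes `combine(recur(a), recur(b))`, and the second operand is skipped as soon as the first is `Uncovered`.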

- Revert tailrec rewrite to reduce allocation
- Use an integer-based representation for the result of `isCovered`

The results are now aggregated and inspected with bitwise operations.

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

Co-authored-by: Dale Wijnand <[email protected]>

odersky

Style comments only. Otherwise all LGTM

odersky · 2023-07-19T17:06:30Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

-      if (isCovered(tp1) && isCovered(tp2))
+      def isCovered(tp: Type): CoveredStatus =
+        tp.dealiasKeepRefiningAnnots.stripTypeVar match
+          case tp: TypeRef if tp.symbol.isClass && tp.symbol != NothingClass && tp.symbol != NullClass => CoveredStatus.Covered


Suggested change

-          case tp: TypeRef if tp.symbol.isClass && tp.symbol != NothingClass && tp.symbol != NullClass => CoveredStatus.Covered
+          case tp: TypeRef =>
+            if tp.symbol.isClass && tp.symbol != NothingClass && tp.symbol != NullClass
+            then CoveredStatus.Covered
+            else CoveredStatus.Uncovered

It's a bit faster since it does not try the other patterns in case of a non-class TypeRef.
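The effect can be observed in isolation with a small sketch. `Ref`, `Later` and the counter are hypothetical and exist only to make the fall-through visible; the real cases in `isCovered` are plain type tests. When a guard fails, matching continues with the following cases, whereas an unguarded case with an inner `if` settles the result immediately.

```scala
object GuardSketch:
  final case class Ref(isClass: Boolean)

  // Hypothetical extractor that counts how often it is tried.
  var laterTests = 0
  object Later:
    def unapply(x: Any): Boolean =
      laterTests += 1
      false

  // Guarded case: if the guard fails, the later cases are still tested.
  def guarded(x: Any): String = x match
    case r: Ref if r.isClass => "covered"
    case Later()             => "other"
    case _                   => "uncovered"

  // Unguarded case with an inner `if`: a `Ref` never reaches the later cases.
  def nested(x: Any): String = x match
    case r: Ref  => if r.isClass then "covered" else "uncovered"
    case Later() => "other"
    case _       => "uncovered"
```

Both functions return the same result for every input; only the number of patterns tried for a non-class `Ref` differs.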

odersky · 2023-07-19T17:09:48Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+          if s == Uncovered then Uncovered
+          else s min that
+
+    inline def bothHaveOr(s1: Repr, s2: Repr): Boolean = ~((s1 | s2) & NotHasOr) != 0


Suggested change

-    inline def bothHaveOr(s1: Repr, s2: Repr): Boolean = ~((s1 | s2) & NotHasOr) != 0
+    inline def bothHaveOr(s1: Repr, s2: Repr): Boolean = s1 == CoveredWithOr && s2 == CoveredWithOr

Or just inline the rhs where it is used.

odersky · 2023-07-19T17:10:52Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+          else s min that
+
+    inline def bothHaveOr(s1: Repr, s2: Repr): Boolean = ~((s1 | s2) & NotHasOr) != 0
+    inline def bothCovered(s1: Repr, s2: Repr): Boolean = (s1 & s2 & IsCovered) != 0


Suggested change

-    inline def bothCovered(s1: Repr, s2: Repr): Boolean = (s1 & s2 & IsCovered) != 0
+    inline def bothCovered(s1: Repr, s2: Repr): Boolean = s1 >= CoveredWithOr && s2 >= CoveredWithOr

Or just inline rhs at the point of use.

Then you don't need the IsCovered and NotHasOr fields. Better not to be too clever with bitsets unless it really gains performance: they are easy to get wrong and generally harder to read than straight comparisons.
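As a sanity check of the point about bitsets, the quoted expressions can be compared exhaustively over the three statuses. This is a sketch under assumptions: `Uncovered = 1`, `NotHasOr = 1` and `IsCovered = 2` are taken from the snippets quoted in this thread, while `CoveredWithOr = 2` and `Covered = 3` are inferred from that bit layout and may not match any particular commit.

```scala
object BitsVsComparisons:
  type Repr = Int

  // Uncovered, NotHasOr and IsCovered as quoted in this thread;
  // CoveredWithOr and Covered are inferred from the bit layout (assumption).
  val Uncovered: Repr = 1
  val CoveredWithOr: Repr = 2
  val Covered: Repr = 3
  val NotHasOr = 1
  val IsCovered = 2

  val all: List[Repr] = List(Uncovered, CoveredWithOr, Covered)

  def bothCoveredBits(s1: Repr, s2: Repr): Boolean = (s1 & s2 & IsCovered) != 0
  def bothCoveredCmp(s1: Repr, s2: Repr): Boolean = s1 >= CoveredWithOr && s2 >= CoveredWithOr

  def bothHaveOrBits(s1: Repr, s2: Repr): Boolean = ~((s1 | s2) & NotHasOr) != 0
  def bothHaveOrCmp(s1: Repr, s2: Repr): Boolean = s1 == CoveredWithOr && s2 == CoveredWithOr

  // All pairs of statuses on which two formulations disagree.
  def disagreements(f: (Repr, Repr) => Boolean, g: (Repr, Repr) => Boolean): List[(Repr, Repr)] =
    for
      a <- all
      b <- all
      if f(a, b) != g(a, b)
    yield (a, b)
```

Under these assumed constants, the two `bothCovered` variants agree everywhere. The bitwise `bothHaveOr` as quoted, however, holds for every pair, because `~x` is non-zero whenever `x` is 0 or 1. That may only reflect an intermediate state of the PR, but it shows how easily such expressions go wrong.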

odersky · 2023-07-19T17:14:05Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+
+      val covered1 = isCovered(tp1)
+      val covered2 = isCovered(tp2)
+      if CoveredStatus.bothCovered(covered1, covered2) && !CoveredStatus.bothHaveOr(covered1, covered2) then


Suggested change

-      if CoveredStatus.bothCovered(covered1, covered2) && !CoveredStatus.bothHaveOr(covered1, covered2) then
+      if (covered1 min covered2) >= CoveredStatus.CoveredWithOr && (covered1 max covered2) == CoveredStatus.Covered then

Then no helper functions are needed, and it's still quite clear, I think.

Yes, that's much simpler and clearer. Thanks!
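The inlined min/max condition can be checked against the helper-based formulation by enumerating all nine pairs. The sketch below assumes the encoding Uncovered = 1, CoveredWithOr = 2, Covered = 3; the function names are made up for the example.

```scala
object PruneConditionSketch:
  type Repr = Int

  // Assumed encoding; only the ordering matters here.
  val Uncovered: Repr = 1
  val CoveredWithOr: Repr = 2
  val Covered: Repr = 3

  val all: List[Repr] = List(Uncovered, CoveredWithOr, Covered)

  // Helper-style condition: both sides covered, and not both containing a union.
  def viaHelpers(c1: Repr, c2: Repr): Boolean =
    (c1 >= CoveredWithOr && c2 >= CoveredWithOr) &&
      !(c1 == CoveredWithOr && c2 == CoveredWithOr)

  // Inlined condition from the suggestion above.
  def viaMinMax(c1: Repr, c2: Repr): Boolean =
    (c1 min c2) >= CoveredWithOr && (c1 max c2) == Covered

  // True if both formulations agree on every pair of statuses.
  def equivalent: Boolean =
    all.forall(a => all.forall(b => viaHelpers(a, b) == viaMinMax(a, b)))
```

In words: pruning applies when neither side is uncovered and at least one side is free of unions, so the case from the PR description (unions on both sides) is no longer pruned.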

odersky · 2023-07-19T17:18:14Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

@@ -2991,6 +2996,33 @@ object TypeComparer {
  end ApproxState
  type ApproxState = ApproxState.Repr

+  /** Result of `isCovered` check. */
+  object CoveredStatus:


Can be private

odersky · 2023-07-19T17:19:22Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+    private inline val NotHasOr = 1
+
+    /** The type is not covered. */
+    val Uncovered: Repr = 1


Style: In this case I'd write the three vals on subsequent lines with // comments to the right. It's more compact that way and just as legible.

odersky

Otherwise LGTM

odersky · 2023-07-20T08:41:11Z

compiler/src/dotty/tools/dotc/core/TypeComparer.scala

+  private object CoveredStatus:
+    type Repr = Int
+
+    private inline val IsCovered = 2


These two are no longer needed.

- Speedup pattern matching in `isCovered`
- Make `CoveredStatus` private
- Use direct integer comparison instead of bit operations for better readability and maintainability

Fix spurious subtype check pruning when both sides have unions

455aa50

Linyxus requested a review from odersky July 14, 2023 23:32

Linyxus mentioned this pull request Jul 14, 2023

Fix issue #17465 #17627

Closed

Linyxus assigned odersky Jul 17, 2023

odersky requested changes Jul 17, 2023

odersky assigned Linyxus and unassigned odersky Jul 17, 2023

Linyxus force-pushed the fix-i17465 branch 2 times, most recently from 72405a5 to 2d3b0d9 Compare July 17, 2023 17:31

Speedup isCovered

380bd1b

- Revert tailrec rewrite to reduce allocation
- Use an integer-based representation for the result of `isCovered`

The results are now aggregated and inspected with bitwise operations.

Linyxus force-pushed the fix-i17465 branch from 2d3b0d9 to 380bd1b Compare July 17, 2023 17:36

dwijnand reviewed Jul 17, 2023

Linyxus and others added 2 commits July 17, 2023 19:51

Fix precedence error found in review

33c8b60

Co-authored-by: Dale Wijnand <[email protected]>

Enable short circuiting for CoveredStatus.CombinedWith

8275cef

Linyxus force-pushed the fix-i17465 branch from f98b664 to 8275cef Compare July 18, 2023 04:13

Linyxus assigned odersky and unassigned Linyxus Jul 19, 2023

odersky reviewed Jul 19, 2023

odersky assigned Linyxus and unassigned odersky Jul 19, 2023

Linyxus force-pushed the fix-i17465 branch from 111f43e to 2638448 Compare July 20, 2023 08:32

odersky approved these changes Jul 20, 2023

Linyxus force-pushed the fix-i17465 branch from 2638448 to b046c94 Compare July 20, 2023 08:43

Apply suggested changes in review

13c87ba

- Speedup pattern matching in `isCovered`
- Make `CoveredStatus` private
- Use direct integer comparison instead of bit operations for better readability and maintainability

Linyxus force-pushed the fix-i17465 branch from b046c94 to 13c87ba Compare July 20, 2023 08:44

Linyxus enabled auto-merge July 20, 2023 08:45

Linyxus merged commit 78e7163 into scala:main Jul 20, 2023

Linyxus deleted the fix-i17465 branch July 20, 2023 11:47

Kordyjan added this to the 3.4.0 milestone Aug 1, 2023
