Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] CHOOSE semantics wrong #841

Open
lemmy opened this issue Jun 3, 2021 · 12 comments
Open

[BUG] CHOOSE semantics wrong #841

lemmy opened this issue Jun 3, 2021 · 12 comments
Assignees
Labels
FSMT Feature: Improvements in the SMT encoding product-owner-triage This should be triaged by the product owner user-support helping the users

Comments

@lemmy
Copy link
Contributor

lemmy commented Jun 3, 2021

Specifying Systems page 73

---- MODULE C ----
EXTENDS Integers

S ==
    {1,2,3}

VARIABLE 
    \* @type: Int;
    x

Init ==
    x = CHOOSE n \in S: TRUE

Next ==
    x' = CHOOSE n \in S: TRUE

Inv ==
    x = x'
====

Note that replacing S with Nat causes scala.NotImplementedError: A set filter over InfSet[Int] is not implemented

markus@avocado:~/src/TLA/ewd998(main)$ apalache check --inv=Inv C.tla
# Tool home: /home/markus/src/TLA/_community/apalache
# Package:   /home/markus/src/TLA/_community/apalache/mod-distribution/target/apalache-pkg-0.15.7-SNAPSHOT-full.jar
# JVM args:  -Xmx4096m -DTLA-Library=/home/markus/src/TLA/_community/apalache/src/tla
#
# APALACHE version 0.15.7-SNAPSHOT build v0.15.6-11-g95c1068
#
# WARNING: This tool is in the experimental stage.
#          Please report bugs at: [https://github.com/informalsystems/apalache/issues]
# 
# Usage statistics is ON. Thank you!
# If you have changed your mind, disable the statistics with config --enable-stats=false.

Checker options: filename=C.tla, init=, next=, inv=Inv            I@20:53:02.942
Tuning:                                                           I@20:53:03.312
PASS #0: SanyParser                                               I@20:53:03.313
Parsing file /home/markus/src/TLA/ewd998/C.tla
Parsing file /tmp/Integers.tla
Parsing file /tmp/Naturals.tla
PASS #1: TypeCheckerSnowcat                                       I@20:53:03.849
 > Running Snowcat .::.                                           I@20:53:03.849
 > Your types are great!                                          I@20:53:03.958
 > All expressions are typed                                      I@20:53:03.959
PASS #2: ConfigurationPass                                        I@20:53:03.995
  > C.cfg: Loading TLC configuration                              I@20:53:03.998
  > No TLC configuration found. Skipping.                         I@20:53:04.003
  > Command line option --init is not set. Using Init             I@20:53:04.003
  > Command line option --next is not set. Using Next             I@20:53:04.003
  > Set the initialization predicate to Init                      I@20:53:04.003
  > Set the transition predicate to Next                          I@20:53:04.004
  > Set an invariant to Inv                                       I@20:53:04.005
PASS #3: DesugarerPass                                            I@20:53:04.027
  > Desugaring...                                                 I@20:53:04.027
PASS #4: UnrollPass                                               I@20:53:04.044
  > Unroller                                                      I@20:53:04.061
PASS #5: PrimingPass                                              I@20:53:04.078
  > Introducing InitPrimed for Init'                              I@20:53:04.083
PASS #6: CoverAnalysisPass                                        I@20:53:04.101
PASS #7: InlinePass                                               I@20:53:04.103
  > InlinerOfUserOper                                             I@20:53:04.103
  > LetInExpander                                                 I@20:53:04.104
  > InlinerOfUserOper                                             I@20:53:04.105
Leaving only relevant operators: CInitPrimed, Init, InitPrimed, Inv, Next I@20:53:04.107
PASS #8: VCGen                                                    I@20:53:04.124
  > Producing verification conditions from the invariant Inv      I@20:53:04.125
  > VCGen produced 1 verification condition(s)                    I@20:53:04.132
PASS #9: PreprocessingPass                                        I@20:53:04.153
  > Before preprocessing: unique renaming                         I@20:53:04.154
 > Applying standard transformations:                             I@20:53:04.161
  > PrimePropagation                                              I@20:53:04.162
  > Desugarer                                                     I@20:53:04.176
  > UniqueRenamer                                                 I@20:53:04.190
  > Normalizer                                                    I@20:53:04.205
  > Keramelizer                                                   I@20:53:04.220
  > After preprocessing: UniqueRenamer                            I@20:53:04.240
PASS #10: TransitionFinderPass                                    I@20:53:04.258
  > Found 1 initializing transitions                              I@20:53:04.273
  > Found 1 transitions                                           I@20:53:04.275
  > No constant initializer                                       I@20:53:04.275
  > Applying unique renaming                                      I@20:53:04.277
PASS #11: OptimizationPass                                        I@20:53:04.288
 > Applying optimizations:                                        I@20:53:04.316
  > ConstSimplifier                                               I@20:53:04.317
  > ExprOptimizer                                                 I@20:53:04.319
  > ConstSimplifier                                               I@20:53:04.320
PASS #12: AnalysisPass                                            I@20:53:04.342
 > Marking skolemizable existentials and sets to be expanded...   I@20:53:04.346
  > SkolemizationMarker                                           I@20:53:04.347
  > ExpansionMarker                                               I@20:53:04.349
 > Running analyzers...                                           I@20:53:04.352
  > Introduced expression grades                                  I@20:53:04.364
  > Introduced 0 formula hints                                    I@20:53:04.367
PASS #13: PostTypeCheckerSnowcat                                  I@20:53:04.368
 > Running Snowcat .::.                                           I@20:53:04.373
 > Your types are great!                                          I@20:53:04.440
 > All expressions are typed                                      I@20:53:04.448
PASS #14: BoundedChecker                                          I@20:53:04.484
Step 0: picking a transition out of 1 transition(s)               I@20:53:04.842
State 0: Checking 1 action invariants                             I@20:53:04.853
State 0: action invariant 0 violated. Check the counterexample in: counterexample.tla, MC.out, counterexample.json E@20:53:04.885
The outcome is: Error                                             I@20:53:04.889
Checker has found an error                                        I@20:53:04.889
It took me 0 days  0 hours  0 min  1 sec                          I@20:53:04.890
Total time: 1.950 sec                                             I@20:53:04.890
EXITCODE: ERROR (12)
---------------------------- MODULE counterexample ----------------------------

EXTENDS C

(* Constant initialization state *)
ConstInit == TRUE

(* Initial state *)
State0 == x = 1

(* Transition 0 to State1 *)
State1 == x = 2

(* The following formula holds true in the last state and violates the invariant *)
InvariantViolation == ~(x = x')

================================================================================
(* Created by Apalache on Wed Jun 02 20:53:04 PDT 2021 *)
(* https://github.com/informalsystems/apalache *)

TLC

...
Inv == [][x = x']_x
TLC2 Version 2.16 of Day Month 20?? (rev: 5682c4a)
Running breadth-first search Model-Checking with fp 13 and seed -8927696691377591672 with 1 worker on 4 cores with 5291MB heap and 64MB offheap memory [pid: 232344] (Linux 5.4.0-73-generic amd64, Ubuntu 11.0.11 x86_64, MSBDiskFPSet, DiskStateQueue).
Parsing file /home/markus/src/TLA/ewd998/C.tla
Parsing file /tmp/Integers.tla (jar:file:/opt/toolbox/tla2tools.jar!/tla2sany/StandardModules/Integers.tla)
Parsing file /tmp/Naturals.tla (jar:file:/opt/toolbox/tla2tools.jar!/tla2sany/StandardModules/Naturals.tla)
Semantic processing of module Naturals
Semantic processing of module Integers
Semantic processing of module C
Starting... (2021-06-02 20:56:37)
Computing initial states...
Finished computing initial states: 1 distinct state generated at 2021-06-02 20:56:37.
Model checking completed. No error has been found.
  Estimates of the probability that TLC did not check all reachable states
  because two distinct states had the same fingerprint:
  calculated (optimistic):  val = 5.4E-20
2 states generated, 1 distinct states found, 0 states left on queue.
The depth of the complete state graph search is 1.
The average outdegree of the complete state graph is 0 (minimum is 0, the maximum 0 and the 95th percentile is 0).
Finished in 00s at (2021-06-02 20:56:37)
@lemmy lemmy added the bug label Jun 3, 2021
@konnov
Copy link
Collaborator

konnov commented Jun 3, 2021

Yep, we have a non-deterministic translation of CHOOSE to SMT. It's an over-approximation of CHOOSE in TLA+. In the worst case, you get a counterexample, as there are more executions, like in your example. So far, I have not seen real specifications that use determinism of CHOOSE, though I could write one :-)

Actually, I know how to make CHOOSE deterministic in our encoding, but that will produce O(n) constraints, where n is the number of elements in the set approximation. Perhaps, we will make an option that enforces determinism of CHOOSE.

@konnov
Copy link
Collaborator

konnov commented Jun 3, 2021

The main reason for keeping CHOOSE non-deterministic was to reduce complexity of recursive operators that heavily depend on CHOOSE. Once we support the fold operator, this should be less important.

@konnov konnov self-assigned this Jun 3, 2021
@konnov konnov added this to the July iteration milestone Jun 3, 2021
@konnov konnov added usability UX improvements FSMT Feature: Improvements in the SMT encoding user-support helping the users and removed usability UX improvements labels Jun 3, 2021
@lemmy
Copy link
Contributor Author

lemmy commented Jun 3, 2021

The main reason for keeping CHOOSE non-deterministic was to reduce complexity of recursive operators that heavily depend on CHOOSE. Once we support the fold operator, this should be less important.

It also relies on users switching to fold operators. Perhaps, the CommunityModules should be bundled with/included in Apalache to lower the barrier?

Also, if there are no specs that rely on CHOOSE, it should probably be deterministic by default, and the option allows advanced users to explicitly switch to non-determinism.

@konnov
Copy link
Collaborator

konnov commented Jun 3, 2021

Also, if there are no specs that rely on CHOOSE, it should probably be deterministic by default, and the option allows advanced users to explicitly switch to non-determinism.

It will make Apalache slow by default. A faster tool with false positives that can be turned off is better than a slow tool by default. False negatives are of course bad :)

@konnov
Copy link
Collaborator

konnov commented Jun 3, 2021

It also relies on users switching to fold operators. Perhaps, the CommunityModules should be bundled with/included in Apalache to lower the barrier?

Yeah, maybe we should place a warning or something when we start to support fold.

@konnov
Copy link
Collaborator

konnov commented Jun 3, 2021

I remember that TLC did not faithfully support CHOOSE on equivalent sets, right?

@lemmy
Copy link
Contributor Author

lemmy commented Jun 3, 2021

See the warning in 14.6 of Specifying Systems (p. 262). In practice, the given example is too simple, though:

Welcome to the TLA+ REPL!
TLC2 Version 2.16 of Day Month 20??
Enter a constant-level TLA+ expression.
(tla+) CHOOSE s \in {1,2,3,3,2,1}: s < 3
1
(tla+) CHOOSE s \in {3,2,1}: s < 3
1

I haven't checked, but I expect this to only happen if symbolic (lazy) sets don't get normalized (enumerated) first.

@konnov
Copy link
Collaborator

konnov commented Jun 27, 2021

We will introduce the new operator as scheduled in #888. However, I do not understand this issue anymore. The book clearly says that it's OK for TLC to loosen semantics of CHOOSE, when it is too hard to enforce it in TLC. When we say the same, this is not considered acceptable.

@konnov
Copy link
Collaborator

konnov commented Jun 27, 2021

Since this comment is public, just to keep the record that there is no bias towards TLC (as Leslie has written in private communication). The point is that our non-deterministic semantics of CHOOSE may be exploited in unexpected ways. So we will introduce an alternative in #888 to reflect the difference.

@lemmy
Copy link
Contributor Author

lemmy commented Jun 29, 2021

(Leslie agreed to make the aforementioned discussion public)

@xxyzz:

I presume you're talking about the weakening of the semantics of CHOOSE on page 262 of "Specifying Systems". The weakening in question is there because maintaining the precise semantics of CHOOSE is computationally impossible. The semantics of CHOOSE implies that if this formula is true:

(*)    (S = T) /\ (\A x \in S : P(x) = Q(x))  =>  (CHOOSE x \in S : P(x))  =  (CHOOSE x \ T : Q(x))

For arbitrary S, T, P, and Q, this is impossible to guarantee (*) because the truth of S = T is undecidable. In principle, it is possible to guarantee if S and T are computable expressions of finite sets. However, even if S and T are identical, guaranteeing this for all possible P and Q is nontrivial. TLC makes a reasonable effort to maintain this aspect of the semantics---for example, for finite sets S and T, it assures

   (S = T)  =>  (CHOOSE x \in S : P(x))  = (CHOOSE x \in T : P(x))  

as long as it can compute S and T. However, I decide that because it will seldom matter in practice.

It never occurred to me not to require that this should always be true:

(+)   (CHOOSE x \in S : P(x))  =  (CHOOSE x \in S : P(x)) 

As a mathematician, I would find that as bad as not requiring 1+1 = 1+1 always to be true.

Even so, my initial reaction on learning that Apalache would violate (+) is that it was OK because it would seldom matter in practice. However, after thinking about it for a while, I decided that it would matter. Many users believe that (+) isn't necessarily true, and having Apalache reinforce this misconception would be a bad idea. (A warning in the user manual would be even easier to ignore than all the things I've written saying that (+) is always true.) Requiring Apalache users to explicitly choose between the semantics of CHOOSE and efficiency was not just a nuisance; it would teach users something. At worst, Apalache users would forget that CHOOSE exists and only use ApChoose. In that case, they'd be reminded (and be forced to learn how to write correct specs) if they used another tool like TLC or TLAPS.

The subsequent discussion about CHOOSE, recursive operators, and FoldSet can be found at tlaplus/rfcs#3

@konnov konnov removed their assignment Nov 8, 2021
@konnov konnov removed the bug label Nov 8, 2021
@konnov konnov removed this from the July iteration milestone Nov 18, 2021
@shonfeder
Copy link
Contributor

Turned up on #1123, here's an example of unsoundness resulting from interactions between our inability to guard against duplicate indices and the indeterminacy of our implementation of CHOOSE:

------------------- MODULE Dup -----------------------

EXTENDS Naturals

VARIABLES
  \* @type: Int;
  n,
  \* @type: Int -> Int;
  someFun

Init ==
  /\ n = 0
  /\ someFun = [x \in {0} |-> n]

Next ==
  /\ someFun' = [x \in {0,0,0} |-> CHOOSE y \in 0..10: TRUE]
  /\ n' = n + 1

Inv == someFun[0] /= 5

============================================================

Run with apalache-mc check --inv=Inv scratch/Dup.tla

Giving the counterexample:

---------------------------- MODULE counterexample ----------------------------

EXTENDS Dup

(* Constant initialization state *)
ConstInit == TRUE

(* Initial state *)
State0 == n = 0 /\ someFun = 0 :> 0

(* Transition 0 to State1 *)
State1 == n = 1 /\ someFun = (0 :> 0 @@ 0 :> 5) @@ 0 :> 0

(* The following formula holds true in the last state and violates the invariant *)
InvariantViolation == someFun[0] = 5

================================================================================

@konnov
Copy link
Collaborator

konnov commented Dec 3, 2021

Let's add the expensive uniqueness check in the encoding and introduce the non-deterministic version of this operator, so the user would be able to disable this check.

@konnov konnov added the product-owner-triage This should be triaged by the product owner label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FSMT Feature: Improvements in the SMT encoding product-owner-triage This should be triaged by the product owner user-support helping the users
Projects
None yet
Development

No branches or pull requests

3 participants