Skip to content

Commit

Permalink
Add a section to the manual for Cryptol->Coq extraction.
Browse files Browse the repository at this point in the history
Other minor manual fixes.
  • Loading branch information
robdockins committed Nov 10, 2021
1 parent a12b6b1 commit 58fcd63
Show file tree
Hide file tree
Showing 3 changed files with 249 additions and 29 deletions.
2 changes: 1 addition & 1 deletion doc/manual/galois-whitepaper.cls
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@
\rhead{\sffamily \textit{\@title{}}}
\makeatother

\lfoot{\sffamily\color{lightgray}\small\copyright{} 2019 Galois, Inc.}
\lfoot{\sffamily\color{lightgray}\small\copyright{} 2021 Galois, Inc.}

\rfoot{
% \sffamily \color{galoislightgray}\small
Expand Down
276 changes: 248 additions & 28 deletions doc/manual/manual.md
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ command-line options:

`-b path, --java-bin-dirs`

~ Specfy a colon-delimited list of paths to search for a Java
~ Specify a colon-delimited list of paths to search for a Java
executable.

`-d num, --sim-verbose=num`
Expand Down Expand Up @@ -498,7 +498,7 @@ understand the role of the Cryptol language in SAW.

# Cryptol and its Role in SAW

Cyptol is a domain-specific language originally designed for the
Cryptol is a domain-specific language originally designed for the
high-level specification of cryptographic algorithms. It is general
enough, however, to describe a wide variety of programs, and is
particularly applicable to describing computations that operate on
Expand Down Expand Up @@ -1082,19 +1082,19 @@ Whether proving small lemmas (in the form of rewrite rules) or a
top-level theorem, the process builds on the idea of a *proof script*
that is run by one of the top level proof commands.

* `prove_print : ProofScript SatResult -> Term -> TopLevel Theorem`
* `prove_print : ProofScript () -> Term -> TopLevel Theorem`
takes a proof script (which we'll describe next) and a `Term`. The
`Term` should be of function type with a return value of `Bool` (`Bit`
at the Cryptol level). It will then use the proof script to attempt to
show that the `Term` returns `True` for all possible inputs. If it is
successful, it will print `Valid` and return a `Theorem`. If not, it
will abort.

* `sat_print : ProofScript SatResult -> Term -> TopLevel ()` is similar
* `sat_print : ProofScript () -> Term -> TopLevel ()` is similar
except that it looks for a *single* value for which the `Term` evaluates
to `True` and prints out that value, returning nothing.

* `prove_core : ProofScript SatResult -> String -> TopLevel Theorem`
* `prove_core : ProofScript () -> String -> TopLevel Theorem`
proves and returns a `Theorem` from a string in SAWCore syntax.

## Automated Tactics
Expand Down Expand Up @@ -1197,11 +1197,11 @@ provers is to unfold everything before sending a goal to a prover.
However, with some provers it is possible to indicate that specific
named subterms should be represented as uninterpreted functions.

* `unint_cvc4 : [String] -> ProofScript SatResult`
* `unint_cvc4 : [String] -> ProofScript ()`

* `unint_yices : [String] -> ProofScript SatResult`
* `unint_yices : [String] -> ProofScript ()`

* `unint_z3 : [String] -> ProofScript SatResult`
* `unint_z3 : [String] -> ProofScript ()`

The list of `String` arguments in these cases indicates the names of the
subterms to leave folded, and therefore present as uninterpreted
Expand All @@ -1218,15 +1218,15 @@ In addition to the built-in automated provers already discussed, SAW
supports more generic interfaces to other arbitrary theorem provers
supporting specific interfaces.

* `external_aig_solver : String -> [String] -> ProofScript SatResult`
* `external_aig_solver : String -> [String] -> ProofScript ()`
supports theorem provers that can take input as a single-output AIGER
file. The first argument is the name of the executable to run. The
second argument is the list of command-line parameters to pass to that
executable. Any element of this list equal to `"%f"` will be replaced
with the name of the temporary AIGER file generated for the proof goal.
The output from the solver is expected to be in DIMACS solution format.

* `external_cnf_solver : String -> [String] -> ProofScript SatResult`
* `external_cnf_solver : String -> [String] -> ProofScript ()`
works similarly but for SAT solvers that take input in DIMACS CNF format
and produce output in DIMACS solution format.

Expand All @@ -1237,15 +1237,15 @@ until a later time, there are functions to write the current goal to a
file in various formats, and then assume that the goal is valid through
the rest of the script.

* `offline_aig : String -> ProofScript SatResult`
* `offline_aig : String -> ProofScript ()`

* `offline_cnf : String -> ProofScript SatResult`
* `offline_cnf : String -> ProofScript ()`

* `offline_extcore : String -> ProofScript SatResult`
* `offline_extcore : String -> ProofScript ()`

* `offline_smtlib2 : String -> ProofScript SatResult`
* `offline_smtlib2 : String -> ProofScript ()`

* `offline_unint_smtlib2 : [String] -> String -> ProofScript SatResult`
* `offline_unint_smtlib2 : [String] -> String -> ProofScript ()`

These support the AIGER, DIMACS CNF, shared SAWCore, and SMT-Lib v2
formats, respectively. The shared representation for SAWCore is
Expand All @@ -1259,23 +1259,26 @@ listed in its first argument as uninterpreted functions.
Some proofs can be completed using unsound placeholders, or using
techniques that do not require significant computation.

* `assume_unsat : ProofScript SatResult` indicates that the current goal
should be assumed to be unsatisfiable. At the moment,
`jvm_verify` and `llvm_verify` (described below) run
their proofs in a satisfiability-checking (negated) context, so
`assume_unsat` indicates that the property being checked should be
assumed to be true. This is likely to change in the future.
* `assume_unsat : ProofScript ()` indicates that the current goal
should be assumed to be unsatisfiable. This is an alias for
`assume_valid`. Users should prefer to use `admit` instead.

* `assume_valid : ProofScript ProofResult` indicates that the current
goal should be assumed to be valid.
* `assume_valid : ProofScript ()` indicates that the current
goal should be assumed to be valid. Users should prefer to
use `admit` instead

* `quickcheck : Int -> ProofScript SatResult` runs the goal on the given
* `admit : String -> ProofScript ()` indicates that the current
goal should be assumed to be valid without proof. The given
string should be used to record why the user has decided to
assume this proof goal.

* `quickcheck : Int -> ProofScript ()` runs the goal on the given
number of random inputs, and succeeds if the result of evaluation is
always `True`. This is unsound, but can be helpful during proof
development, or as a way to provide some evidence for the validity of a
specification believed to be true but difficult or infeasible to prove.

* `trivial : ProofScript SatResult` states that the current goal should
* `trivial : ProofScript ()` states that the current goal should
be trivially true (i.e., the constant `True` or a function that
immediately returns `True`). It fails if that is not the case.

Expand Down Expand Up @@ -1354,9 +1357,6 @@ operate on AIGs.
"blasting" all of its primitive operations (things like bit-vector
addition) down to the level of individual bits.

* `cec : AIG -> AIG -> TopLevel ProofResult` compares two `AIG` values,
returning a `ProofResult` representing whether the two are equivalent.

* `load_aig : String -> TopLevel AIG` loads an `AIG` from an external
AIGER file.

Expand Down Expand Up @@ -2757,3 +2757,223 @@ let main : TopLevel () = do {
~~~~
[^4]: https://en.wikipedia.org/wiki/Salsa20
# Extraction to the Coq theorem prover
In addition to the (semi-)automatic and compositional proof modes
already discussed above, SAW has experimental support for exporting
Cryptol and SAWCore values as terms to the Coq proof assistant[^5].
This is intended to support more manual proof efforts for properties
that go beyond what SAW can support (for example, proofs requiring
induction) or for connecting to preexisting formalizations in Coq
of useful algorithms (e.g. the fiat crypto library[^6]).
This support consists of two related pieces. The first piece is a library of
formalizations of the primitives underlying Cryptol and SAWCore and
various supporting concepts that help bridge the conceptual gap
between SAW and Coq. The second piece is a term translation
that maps the syntactic forms of SAWCore onto corresponding
concepts in Coq syntax, designed to dovetail with the concepts
defined in the support library. SAWCore is a quite similar language
to the core calculus underlying Coq, so much of this translation is
quite straightforward; however, the languages are not exactly
equivalent, and there are some tricky cases that mostly arise from
Cryptol code that can only be partially supported. We will note
these restrictions later in the manual.
We expect this extraction process to work with a fairly wide range of Coq
versions, as we are not using bleeding-edge Coq features. It has been
most fully tested with Coq version 8.13.2.
## Support Library
In order to make use of SAW's extraction capabilities, one must first
compile the support library using Coq so that the included definitions
and theorems can be referenced by the extracted code. From the top
directory of the SAW source tree, the source code for this support
library can be found in the `saw-core-coq/coq` subdirectory.
In this subdirectory you will find a `_CoqProject` and a `Makefile`.
A simple `make` invocation should be enough to compile
all the necessary files, assuming Coq is installed and `coqc` is
available in the user's `PATH`. HTML documentation for the support
library can also be generated by `make html` from the same directory.
Once the library is compiled, the recommended way to import
it into your subsequent development is by adding the following
lines to your `_CoqProject` file:
```
-Q <SAWDIR>/saw-core-coq/coq/generated/CryptolToCoq CryptolToCoq
-Q <SAWDIR>/saw-core-coq/coq/handwritten/CryptolToCoq CryptolToCoq
```
Here `<SAWDIR>` refers to the location on your system where the
SAWScript source tree is checked out. This will add the relevant
library files to the `CryptolToCoq` namespace, where the extraction
process will expect to find them.
The support library for extraction is broken into two parts: those
files which are handwritten, versus those that are automatically
generated. The handwritten files are generally fairly readable and
are reasonable for human inspection; they define most of the
interesting pipe-fitting that allows Cryptol and SAWCore definitions
to connect to corresponding Coq concepts. In particular the
file `SAWCoreScaffolding.v` file defines most of the bindings of base
types to Coq types, and the `SAWCoreVectorsAsCoqVectors.v` defines the
core bitvector operations. The automatically generated files are direct
translations of the SAWCore source files
(`saw-core/prelude/Prelude.sawcore` and
`cryptol-saw-core/saw/Cryptol.sawcore`) that correspond to the
standard libraries for SAWCore and Cryptol, respectively.
The autogenerated files are intended to be kept up-to-date with
changes in the corresponding `sawcore` files, and end users should
not need to generate them. Nonetheless, if they are out of sync for some
reason, these files may be regenerated using the `saw` executable
by running `(cd saw-core-coq/coq; saw saw/generate_scaffolding.saw)`
from the top-level of the SAW source directory before compiling them
with Coq as described above.
You may also note some additional files and concepts in the standard
library, such as `CompM.v`, and a variety of lemmas and definitions
related to it. These definitions are related to the "heapster" system,
which form a separate use-case for the SAWCore to Coq translation.
These definitions will not be used for code extracted from Cryptol.
## Cryptol module extraction
There are several modes of use for the SAW to Coq term extraction
facility, but the easiest to use is whole Cryptol module extraction.
This will extract all the definitions in the given Cryptol module,
together with it's transitive dependencies, into a single Coq module
which can then be compiled and pulled into subsequence developments.
Suppose we have a Cryptol source file named `source.cry` and we want
to generate a Coq file named `output.v`. We can accomplish this by
running the following commands in saw (either directly from the `saw`
command prompt, or via a script file)
```
enable_experimental;
write_coq_cryptol_module "source.cry" "output.v" [] [];
```
In this default mode, identifiers in the Cryptol source will be
directly translated into identifiers in Coq. This may occasionally
cause problems if source identifiers clash with Coq keywords or
preexisting definitions. The third argument to
`write_coq_cryptol_module` can be used to remap such names if
necessary by giving a list of `(in,out)` pairs of names. The fourth
argument is a list of source identifiers to skip translating, if
desired. Authoritative online documentation for this command can be
obtained directly from the `saw` executable via `:help
write_coq_cryptol_module` after `enable_experimental`.
The resulting "output.v" file will have some of the usual hallmarks of
computer-generated code; it will have poor formatting and, explicit
parenthesis and fully-qualified names. Thankfully, once loaded into
Coq, the Coq pretty-printer will do a much better job of rendering
these terms in a somewhat human-readable way.
## Proofs involving uninterpreted functions
It is possible to write a Cryptol module that references uninterpreted
functions by using the `primitive` keyword to declare them in your
Cryptol source. Primitive Cryptol declarations will be translated into
Coq section variables; as usual in Coq, uses of these section
variables will be translated into additional parameters to the
definitions from outside the section. In this way, consumers of
the translated module can instantiate the declared Cryptol functions
with corresponding terms in subsequent Coq developments.
Although the Cryptol interpreter itself will not be able to compute
with declared but undefined functions of this sort, they can be used
both to provide specifications for functions to be verified with
`llvm_verify` or `jvm_verify` and also for Coq extraction.
For example, if I write the following Cryptol source file:
```
primitive f : Integer -> Integer
g : Integer -> Bool
g x = f (f x) > 0
```
After extraction, the generated term `g` will have Coq type:
```
(Integer -> Integer) -> Integer -> Bool
```
## Translation limitations and caveats
Translation from Cryptol to Coq has a number of fundamental
limitations that must be addressed. The most severe of these is that
Cryptol is a fully general-recursive language, and may exhibit runtime
errors directly via calls to the `error` primitive, or via partial
operations (such as indexing a sequence out-of-bounds). The internal
language of Coq, by contrast, is a strongly-normalizing language of
total functions. As such, our translation is unable to extract all
Cryptol programs.
### Recursive programs
The most severe of the currently limitations for our system
is that the translation is unable to translate any recursive Cryptol
program. Doing this would require us to attempt to find some
termination argument for the recursive function sufficient to satisfy
Coq; for now, no attempt is made to do so. if you attempt to
extract a recursive function, SAW will produce an error
about a "malformed term" with `Prelude.fix` as the head symbol.
Certain limited kinds of recursion are available via the
`foldl` Cryptol primitive operation, which is translated directly
into a fold operation in Coq. This is sufficient for many basic
iterative algorithms.
### Type coercions
Another limitation of the translation system is that Cryptol uses SMT
solvers during its typechecking process and uses the results of solver
proofs to justify some of its typing judgments. When extracting these
terms to Coq, type equality coercions must be generated. Currently, we
do not have a good way to transport the reasoning done inside
Cryptol's typechecker into Coq, so we just supply a very simple `Ltac`
tactic to discharge these coercions (see `solveUnsafeAssert` in
`CryptolPrimitivesForSAWCoreExtra.v`). This tactic is able to
discover simple coercions, but for anything nontrivial it may
fail. The result will be a typechecking failure when compiling the
generated code in Coq when the tactic fails. If you encounter this
problem, it may be possible to enhance the `solveUnsafeAssert` tactic
to cover your use case.
### Error terms
A final caveat that is worth mentioning is that Cryptol can sometimes
produce runtime errors. These can arise from explicit calls to the
`error` primitive, or from partially defined operations (e.g.,
division by zero or sequence access out of bounds). Such instances
are translated to occurrences of an unrealized Coq axiom named `error`.
In order to avoid introducing an inconsistent environment, the `error`
axiom is restricted to apply only to inhabited types. All the types
arising from Cryptol programs are inhabited, so this is no problem
in principle. However, collecting and passing this information around
on the Coq side is a little tricky.
The main technical device we use here is the `Inhabited` type class;
it simply asserts that a type has some distinguished inhabitant. We
provide instances for the base types and type constructors arising
from Cryptol, so the necessary instances ought to be automatically
constructed when needed. However, polymorphic Cryptol functions add
some complications, as type arguments must also come together with
evidence that they are inhabited. The translation process takes care
to add the necessary `Inhabited` arguments, so everything ought to
work out. However, if Coq typechecking of generated code fails with
errors about `Inhabited` class instances, it likely represents some
problem with this aspect of the translation.
[^5]: https://coq.inria.fr
[^6]: https://github.com/mit-plv/fiat-crypto
Binary file modified doc/manual/manual.pdf
Binary file not shown.

0 comments on commit 58fcd63

Please sign in to comment.