- Youngju Song, Minki Cho, Dongjoo Kim, Yonghyun Kim, Jeehoon Kang, Chung-Kil Hur.
CompCertM: CompCert with Lightweight Modular Verification and Multi-Language Linking
CompCertM is based on CompCertR which is a refactored version of CompCert. In order to build CompCertM, you need to build CompCertR first.
-
Clone https://github.com/snu-sf/CompCertR and follow the installation instructions.
-
Go to the CompCertR directory and issue the following commands:
git clone https://github.com/snu-sf/CompCertM.git cd CompCertM make -j[N]
For directories that existed in CompCert (e.g. backend/), there are files that extends original CompCert's files.
New directories are roughly as follows.
- bound/: adequacy w.r.t. C and assembly (UpperBound_A.v, UpperBound_B.v, LowerBound.v)
- compose/: interaction semantics and its properties
- demo/: examples including Unreadglob, mutual-sum and utod
- selfsim/: self-simulation of Clight and Assembly
- scripts/: scripts used for SLOC counting (Table 1, 2, 3)
- proof/: our meta theory
We list a few important definitions/proofs from the paper.
Semantics
- Interaction Semantics (Section 5) : compose/Sem.v
- Module (Page 18, Fig. 9): compose/Mod.v
- Module Semantics (Page 18, Fig. 9): compose/ModSem.v
Verification Techniques
- RUSC Theory (Section 2.3): proof/RUSC.v
- Mixed Simulation (Section 2.4): proof/Simulation.v -
xsim
- Parameter-MemRel (Page 20, Fig. 10): proof/SimMem.v
- Parameter-SymbRel (Page 20, Fig. 10): proof/SimSymb.v
- Parameter-MemPred (Page 20, Fig. 10): proof/Sound.v
(NOTE: The definition of a memory predicate is a little different from that in the paper. If we consider(Sound.t * Memory.mem)
as a single component (= "t" in the paper), two definitions are equivalent.) - Parameterized Open Simulation (Page 21, Fig. 11): proof/SimModSem.v
- Open Preservation (Page 21, Fig. 11): proof/Preservation.v -
local_preservation_standard
- Adequacy of Parameterized Open Simulation (Page 22, Theorem 6.2): proof/AdequacyLocal.v -
Theorem adequacy_local
, corollary.
Correctness Theorems
- Compiler cocrrectness theorem (main result): driver/CompilerC.v
- Compositional Correctness I (Page 23, Theorem 7.3):
Theorem compiler_correct
- Compositional Correctness II (Page 24, Theorem 7.5):
Theorem compiler_correct_full
- Compositional Correctness I (Page 23, Theorem 7.3):
- Adequacy w.r.t. Assembly (Page 24, Theorem 7.6): bound/LowerBound.v -
Theorem lower_bound_correct
- Adequacy w.r.t. C (Page 24, Theorem 7.7): bound/UpperBound_B.v -
Theorem upper_bound_b_correct
Examples
- Unreadglob example (Section 4.1): demo/unreadglob (demo directory in CompCertR also contains actual optimization and proof files)
- mutual-sum example (Section 4.2): demo/mutrec
- a.c, b.asm (Page 16, Fig. 7): a.c, b.s
- a.spec, b.spec (Page 16, Fig. 7): MutrecAspec.v, MutrecBspec.v
- ab.spec (Page 16, Fig. 7): MutrecABspec.v
- refinement proof: MutrecRefinement.v -
Theorem Mutrec_correct
- utod example: demo/utod
We first describe a step-by-step process of software verification using CompCert, and then describe such instructions for CompCertM.
(Running)
Write a C program, compile it to binary using CompCert executable (i.e. ccomp
), and run it.
(Verifying)
- Translate each C module into Coq using
clightgen
. - CompCert's C semantics formally defines behavior of your translated program.
- Verify properties about such behavior with your own technique. (e.g. a workflow using VST)
For CompCertM, it is basically the same except we also support hand-written assembly modules.
(Running)
Write a program (you can use both C and assembly), compile C modules to binary using CompCert executable (i.e. compcomp
), and run it. (a.c, b.s)
(Verifying)
-
Translate each C module into Coq using
clightgen
as before. (MutrecA.v)Unfortunately, as there is no such translation tool yet for assembly language, user should manually translate assembly modules into Coq. (MutrecB.v)
CompCert emits assembly module (defined in Coq) into actual
.s
file usingPrintAsm.ml
.You should make sure that printing your assembly (defined in Coq) using
PrintAsm.ml
yields a.s
file that is equivalent to the one you have manually wrote. -
CompCertM's interaction semantics formally defines behavior of your translated program.
-
Verify properties about such behavior with your own technique.
As discussed in the paper, we advocate the use of RUSC theory in program verification, and we demonstrate this with mutrec example.
Define open specs for each module. (MutrecAspec.v, MutrecBspec.v)
Verify each module against its specification modularly. (MutrecAproof.v, MutrecBproof.v)
Merge such open specs togather (MutrecABproof.v)
Prove self-simulations required for RUSC theory. (IdSim*.v)
Get final result with RUSC theory. (MutrecRefinement.v)
Roughly speaking, the implementation is strictly more general or equivalent to the presentation of the paper.
Fig 11
- In (STEP) case, the implementation is more general than the paper version, allowing "stuttering". (We support 4 different kinds of stuttering, forward/backward x source/target)
- In (CALL) case, the implementation existentially quantifies source call data (
(SAFESRC: ms_src.(is_call) st_src0)
) and then universally quantifies it(ATSRC: ms_src.(at_external) st_src0 args_src)
, following forward simulation style. In paper, it is just existentially quantified. However, as we requireat_external
to be deterministic (ModSem.at_external_dtm
), both are equivalent. The implementation followed this style just for uniformity with forward (STEP) case. - The paper hides coinductive nature of
open_sim
(following CompCert's style), but in implementation we used coinductive predicate using Paco library, which gives more flexibility to the user (e.g. allowing intermediate states outside thematch_states
). - In (INIT) case, the paper presents it in a forward-style, but it is implemented in backward-style.
- In paper, (SIM:MOD) asserts
open_prsv
ofsound_state
but in implementation (SIM:MODSEM) asserts it. We designedMod
datatype to be minimal syntactic data, so it does not know the type ofstate
and can't state aboutsound_state
. Changing the implementation to follow the paper's presentation will take just 3 person-hour or so, but we prefer current design. - In implementation, we have
local_preservation_noguarantee
which is not mentioned in the paper. This predicate is strictly easier to prove than thelocal_preservation
, lessening the proof obligation for user once (s)he has provenlocal_preservation
for at least one analysis.