Newer
Older
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
<p>WIP (TA)</p>
<p>The Frama-C kernel acts as a <em>lingua franca</em> for all <a href=
"https://frama-c.frama-c.com/html/kernel-plugin.html">Frama-C plugins</a>. A software analysis platform that
checks security, verifies requirements, and guarantees trust in C programs.</p>
<p>Frama-C is a source code analysis platform that aims at conducting verification of industrial-size C programs. It
provides its users with a collection of plug-ins that perform static analysis, deductive verification, and testing, for
safety- and security-critical software. Collaborative verification across cooperating plug-ins is enabled by their
integration on top of a shared kernel and datastructures, and their compliance to a common specification language. This
foundational article presents a consolidated view of the platform, its main and composite analyses, and some of its
industrial achievements.</p>
<h1>Architecture</h1>
<p>The Frama-C platform is written in OCaml [LDF+13], a functional language whose features are very in- teresting for
implementing program analyzers [CSB+09]. Fig. 2 shows a functional view of the Frama-C plug-in-oriented architecture (à
la Eclipse) whose kernel is based on a modified version of CIL [NMRW02]. CIL is a front-end for C that parses ISO C99
programs into a normalized representation. For instance, loop constructs (for, while, do & while) are given a single
normalized form, normalized expressions have no side-effects, etc. Frama-C extends CIL to support dedicated source code
annotations expressed in ACSL (see Section 3.2). This modified CIL front-end produces the C + ACSL abstract syntax tree
(AST), an abstract view of the program shared among all analyzers. This AST takes into account machine-dependent
parameters (size of integral types, endianness, etc.) which can easily be modified by the end-user.<br>
In addition to the AST, the kernel provides several general services for helping plug-in development [SCP13] and
providing convenient features to Frama-C’s end-user.</p>
<ul>
<li>Messages, source code and annotations are uniformly displayed.</li>
<li>Parameters and command line options are homogeneously handled.</li>
<li>A journal of user actions can be synthesized and replayed afterwards, a useful feature in debugging and qualification contexts [Sig14].</li>
<li>A safe serialization mechanism [CDS11] allows the user to save results of analyses and reload them later.</li>
<li>Projects, presented in Section 3.3, isolate unrelated program representations, and guarantee the integrity of analyses.</li>
<li>Consistency mechanisms control the collaboration between analyzers (Section 3.4).</li>
<li>A visitor mechanism, partly inherited from CIL, facilitates crawling through the AST and writing code transformation plug-ins.</li>
</ul>
<p>Analyzers are developed as separate plug-ins on top of the kernel. Plug-ins are dynamically linked against the kernel
to offer new analyses, or to modify existing ones. Any plug-in can register new services in a plug-in database stored in
the kernel, thereby making these services available to all plug-ins.</p>
<h1>ACSL</h1>
Functional properties of C programs can be expressed within Frama-C as ACSL annotations [BFH+13]. ACSL, the ANSI/ISO C
Specification Language, is a formal specification language inspired by Java’s JML [BCC+05], both being based on the
notion of function contract introduced by Eiffel [Mey97]. In its most basic form, the specification (or contract) of a
function states the pre-conditions it requires from its caller and the post- conditions it ensures when returning. Among
these post-conditions, one kind of clause plays a particular role by saying which memory locations the function assigns,
i.e. which locations might have a different value between the pre- and the post-state. Conversely, any memory location
not mentioned in this clause is supposed to be left unmodified by the function.<br>
Annotations are written in first-order logic, and it is possible to define custom functions and predicates for use in
annotations together with ACSL built-ins. Indeed, ACSL features its own functions and predicates to describe memory
states. However, it does not introduce any notion beyond the C standard, leaving each plug-in free to perform its own
abstractions over the concrete memory layout.</p>
<p>For instance, Fig. 3 provides a formal ACSL specification for a swap function. Informally, swap is supposed to
exchange the content of two (valid) pointer cells given as argument.<br>
The first pre-condition states that the two arguments must be valid int pointers, i.e. that dereferencing a or b will not
produce a run-time error. In addition, the second pre-condition asks that the two locations do not overlap. \valid and
\separated are two built-in ACSL predicates.<br>
The assigns clause states that only the locations pointed to by a and b might be modified by a call to swap; any other
memory location is untouched. Finally, the post-condition says that at the end of the function, *a contains the value
that was in *b in the pre-state, and vice versa.<br>
Function contracts can also be structured into behaviors. In addition to the clauses described above, a behavior can be
guarded by assumes clauses, which specify under which conditions the behavior is activated. This provides a convenient
way to distinguish various cases under which the function can be called. Behaviors do not need to be either complete (at
least one behavior is active for any given call) or disjoint (there is at most one behavior active for a given call), but
this can be stipulated with the appropriate clauses in order to check that the specification is correct in this respect.
As an example, Fig. 4 presents the specification of a function swap_or_null similar to swap except that it will attempt
to swap contents only when both pointers are non-null. This corresponds to the behavior not_null, in which case the
function expects to have valid (in C a non-null pointer is not necessarily valid) and separated pointers and will act as
above. On the other hand, in the null case, the function does nothing: at least one pointer is null, so the values cannot
be swapped. We are always in exactly one of these two cases, so null and not_null are clearly complete and disjoint.<br>
In addition to function specifications, ACSL offers the possibility of writing annotations in the code, in the form of
assertions, properties that must be true at a given point, or loop invariants, properties that are true for each loop
step. More precisely, a loop invariant is associated to a for, while, or do & while loop. It must hold when arriving at
the loop entry for the first time, and must be preserved by a loop step. That is, if we enter the loop body in a state
that verifies the invariant, then the invariant must hold at the end of the block – except if we exit from the block
through goto, break or continue. With these two properties, we can then prove by induction that the invariant is true for
any number of loop steps<br>
(including 0). As with assigns clauses in function contracts, loop assigns clauses are a particular case of invariant
that state which memory locations might have been modified since the beginning of the loop (implicitly stating that all
other locations are not modified by the loop). Loop invariants are required for<br>
deductive verification (see Section 5).</p>
<p>As an example, Fig. 5 presents a simple for loop initializing the elements of an array, together with its associated
invariants. The first invariant specifies the bounds of the index of the loop. Note that the invariant uses i<=10,
while the test in the code is i<10. Indeed, the invariant must be preserved by any step of the loop, including the
last one, in which i is equal to 10 at the end of the step. The second invariant states that the i-1 first cells of the
array have been initialized. It is true before entering the loop, as there is no j such that 0<=j<0, so the
implication is trivially true. Then, if we suppose that the invariant holds at the beginning of a loop step, the step
will initialize cell i and increment i, so that the invariant is still true af the end of the step. Similarly, the loop
assigns clause indicates that the loop has potentially modified i and the first i-1 cells of the array. Namely, this is
true when entering the loop (nothing has been modified), and if we suppose that only i and the beginning of the array has
been assigned, the next loop step also assigns i and the ith cell, again making the invariant true at the end of the
step. Loop invariants are not concerned with termination. Instead, termination can be ensured through the special clause
loop variant. It requires an integer expression that strictly decreases during a loop step, while remaining positive each
time the loop enters a new iteration. When both these conditions are met, we know that the loop can only be taken a
finite number of times. An example of variant for the loop in Fig. 5 would be the following: loop variant 10 - i; Each
plug-in can provide a validity status to any ACSL property and/or generate ACSL annotations. This lets ACSL annotations
play an important role in the communication between plug-ins, as explained in Section 3.4.</p>
<h1>Projects</h1>
<p>Frama-C allows a
user to work on several programs in parallel thanks to the notion of project. A project consistently stores a program
with all its required information, including results computed by analyzers and their parameters. Several projects may
coexist in memory at the same time. A non-interference theorem guarantees project partitioning [Sig09]: any modification
on a value of a project P does not impact a value of another project P′. Such a feature is of particular interest when
a program transformer such as E-ACSL (Section 7), Aora ̈ı (Section 8), or Slicing (Section 10.1) is used. The result of
the transformation is a fresh AST that coexists with the original one, making backtracking and comparisons easy. This is
illustrated in Section 10.2. Another use of projects is to process the same program in different ways – for instance
with different analysis parameters.</p>
<h1>Collaborations across analyzers</h1>
In Frama-C, analyzers can interoperate in two
different ways: either sequentially, by chaining analysis results to perform complex operations; or in parallel, by
combining partial analysis results into a full program verification. The former consists in using the results of an
analyzer as input to another one thanks to the plug-in database stored by the Frama-C kernel. Refer to Section 10.1 for
an illustration of a sequential analysis. The parallel collaboration of analyzers consists in verifying a program by
heterogeneous means. ACSL is used to this end as a collaborative language: plug-ins generate program annotations, which
are then vali- dated by other plug-ins. Partial results coming from various plug-ins are integrated by the kernel to
provide a consolidated status of the validity of all ACSL properties. For instance, when the Value plug-in (Section 4) is
unable to ensure the validity of a pointer p, it emits an unproved ACSL annotation assert \valid(p). ACSL’s blocking
semantics states that an execution trace leading to an invalid property stops its execu- tion (see [GGJK08, HMM12,
CS12]). Thus, Value can assume that p is valid from this program point onwards, since the execution can only continue if
the assert is valid. The WP plug-in (Section 5) may later be used to prove this hypothesis, or it can be checked at
runtime by leveraging the E-ACSL plug-in (Section 7). The kernel automatically computes the validity status of each
program property from the information provided by all analyzers and ensures the correctness of the entire verification
process. For that, plug-ins can set the local validity status of an ACSL property, together with its set of dependencies.
Those dependencies can be:</p>
<ul>
<li>other ACSL properties that are assumed to be true;</li>
<li>the reachability of a statement;</li>
<li>some parameters of the plug-in that set a hypothesis over the memory state of the program.</li>
</ul>
<p>With this information, the
kernel can then compute a consolidated validity status for each ACSL property. [CS12] presents the algorithm that is used
for that purpose. Its main correctness property can be stated as: “if the consolidated status of a property is computed
as valid [resp. invalid] by the kernel, then the property is valid [resp. invalid] with respect to ACSL semantics”.
This algorithm is also complete:“if a property is proven valid [resp. invalid] by at least one analyzer, then its
computed consolidated status is valid [resp. invalid] as soon as one other analyzer does not prove the contrary, in which
case the special status ‘inconsistent’ is computed.”</p>