Commit 3f1124d5 authored by Valentin Perrelle's avatar Valentin Perrelle
Browse files

Suite et fin.

parent afa55b8b
---
title: "Interpreted Automata and how they are used in Frama-C"
title: "A new Control Flow Graph representation for Frama-C"
author: Valentin Perrelle
date: April 9, 2018
header-includes:
- \usepackage{tikz}
- \usepackage{macros}
- \usetikzlibrary{chains,interpretedautomata,backgrounds,calc}
- \renewcommand{\ttdefault}{pcr}
- \lstset{
escapechar=^,
basicstyle=\footnotesize\ttfamily,
keywordstyle=\bfseries,
columns=fullflexible
}
...
......@@ -17,26 +24,16 @@ Context
- Static analysis of C code with \FramaC / \Eva
- Control Flow partially abstracted by Cil
- All loops are transformed into `while (true)` loops
- All structured loops are transformed into `while (true)` loops
- Lazy operators are translated to `if-then-else`
- Returns are only allowed as the last statement
- The `Cfg` module can translate `break`, `continue` and `switch` to
`if` and `goto`
- The analysis still have to deal with "undefined sequences"
- The analysis still has to deal with "undefined sequences"
- ... and new constructs from C++ like `try` - `catch`
- Generic dataflow analysis are possible (modules `Dataflow2`, `Dataflows`)
- It completely abstract the control-flow for the analyser, but forces the
logic and the order of iteration
Why use intermediate representations ?
--------------------------------------
- Avoir une représentation intermédiaire du flot de contrôle
- Décharge du besoin de se préoccuper de l'analyse du flot de contrôle
- Ouverte aux transformations
- (Indépendante du langage source)
- Sans notion d'ordre d'itération
- Les analyses peuvent fixer l'ordre d'itération librement
- Generic dataflow analysis are available (modules `Dataflow2`, `Dataflows`)
- It completely abstracts the control-flow for the analyser, but forces
the logic and the order of a fixpoint iteration
Automates Interprétés : Exemple {.fragile}
......@@ -79,13 +76,15 @@ while (i < 10) {
\end{minipage}
Définition
----------
Why use an intermediate representation ?
----------------------------------------
- Graphe de flot de contrôle
Flot de contrôle + annotation sur les arcs
- Représentation intermédiaire
- Motivation : Simplifier le traitement des statements
- An intermediate representation for the control flow graph
- discharges the anlyses from reconstructing it
- with only light constraints
- Suitable for transformations
- Independent from source language, as long as transfer functions are
general enough
Semantics {.fragile}
......@@ -135,7 +134,7 @@ $f$ is a {\bf command} : & $c1,\sigma \longrightarrow c2,f(\sigma)$ &
Y = \bigcup_{c_i \rightarrow d} F_i\tikzlabel{tf}(X_i)
\]
\end{minipage}
\uncover<2>{\tikzcomment{tf}{{\it Transfer functions}, in theory could be \\ any monotonic function}}
\uncover<2>{\tikzcomment{tf}{\textwidth,-2ex}{{\it Transfer functions}, in theory could be \\ any monotonic function}}
### Abstract Semantics
......@@ -193,13 +192,12 @@ Expressivity {.fragile}
\end{tikzpicture}
Weak Topological Order {.fragile}
----------------------
\noindent
\begin{minipage}{.4\textwidth}
\begin{onlyenv}<1-2>
\begin{onlyenv}<1-3>
\begin{lstlisting}[language=C]
int A[10][10];
int i = 0;
......@@ -213,7 +211,7 @@ while (i < 10) {
}
\end{lstlisting}
\end{onlyenv}
\begin{onlyenv}<3->
\begin{onlyenv}<4->
\begin{lstlisting}[language=C]
int A[10][10];
int i = 0;
......@@ -225,15 +223,17 @@ l1: if (i < 10) {
goto l2;
}
i++;
goto l1;
goto^\tikzlabel{goto}^ l1;
}
\end{lstlisting}
\tikzcomment{goto}{\textwidth,0}{Works the same with \\ {\tt goto} loops}
\end{onlyenv}
\end{minipage}
\begin{minipage}{.5\textwidth}
\centering
\only<1>{\tikzstyle{head} = [cp]}
\only<2->{\tikzstyle{head} = [wp]}
\begin{tikzpicture}[chainautomata]
\begin{tikzpicture}[chainautomata,remember picture]
\node[cp] (c1) {$c_1$};
\node[head] (c2) {$c_2$};
\node[cp] (c3) {$c_3$};
......@@ -258,33 +258,94 @@ l1: if (i < 10) {
\uncover<2->{
\draw[component,fill=white!80!green] ($(c7.south west)-(1.5,0.2)$) rectangle ($(c2.north east)+(1.5,0.1)$);
\draw[component] ($(c6.south west)-(0.4,0.2)$) rectangle ($(c4.north east)+(0.4,0.1)$);
\node (scc2) at ($(c4.north east)+(0.4,0.1)$) {};
}
\end{scope}
\end{tikzpicture}
\uncover<2->{
$(c_1~(c_2~c_3~(c_4~c_5~c_6)~c_7)~c_8)$
}
\end{minipage}
\begin{minipage}{.05\textwidth}
\begin{overlayarea}{\textwidth}{2cm}
\only<3>{
\tikzcomment{c2}{\textwidth,12ex}{Loop head}
\tikzcomment{scc2}{\textwidth,4ex}{Strongly connected {\bf sub}-component}
}
\end{overlayarea}
\end{minipage}
WTO properties
--------------
Pourquoi le Wto ?
-----------------
- Rebuild the original structure of the program
- But also works on destructurated programs with non natural loops.
- For abstract interpretation, gives widening points
- Calcul
- Intérêt
- reconstruit les boucles originales mais détecte également les boucles non originales
- marche aussi sur les boucles non naturelles
- définit des points de widening
WTO with priorities {.fragile}
-------------------
\noindent
\begin{minipage}{.55\textwidth}
\begin{lstlisting}[language=C]
start:
int i = 0;
//@ loop invariant i <= 10;
while (true) {
i++;
if (i >= 10)
break;
}
goto start;
\end{lstlisting}
\end{minipage}
\begin{minipage}{.40\textwidth}
\tikzstyle{h1} = []
\tikzstyle{h2} = []
\tikzstyle{h3} = []
\only<2>{\tikzstyle{h1} = [wp]}
\only<3>{\tikzstyle{h2} = [wp]}
\only<4>{\tikzstyle{h3} = [wp]}
\begin{tikzpicture}[chainautomata,remember picture]
\node[cp,h1,h3] (c1) {$c_1$};
\node[cp,h1,h2] (c2) {$c_2$};
\node[cp,h3] (c3) {$c_3$};
\node[cp] (c4) {$c_4$};
\node[cp] (c5) {$c_5$};
\draw
(c1) edge node[right] {\tt i = 0} (c2)
(c2) edge node[right] {\tt i ++} (c3)
(c3) edge node[right] {\tt i < 10} (c4)
(c4) edge [bend left=45] node[left] {} (c2)
(c5) edge [bend left=70] node[left] {} (c1)
;
\draw (c3) .. controls +(1.5, 0) and +(1.5,0.0) .. node[right] {\tt i >= 10} (c5);
Wto avec priorités
------------------
\begin{scope}[on background layer]
\uncover<2->{
\draw[component,fill=white!80!green] ($(c5.south west)-(1.5,0.2)$) rectangle ($(c1.north east)+(1.5,0.1)$);
}
\uncover<2,4>{
\draw[component] ($(c4.south west)-(0.4,0.2)$) rectangle ($(c2.north east)+(0.4,0.1)$);
}
\end{scope}
\end{tikzpicture}
\end{minipage}
Avantages (invariants de boucles), inconvénients (boucles non naturelles)
\begin{itemize}
\item Several natural choices of loop heads
\begin{itemize}
\item \only<2>{\bf} First vertex encountered \normalfont
\item \only<3>{\bf} Structured loops with invariants \normalfont
\item \only<4>{\bf} Before loop exits \normalfont
\end{itemize}
\end{itemize}
Les automates interprétés dans Frama-C
======================================
Interpreted Automata in Frama-C
===============================
Frama-C Architecture Changes
----------------------------
......@@ -343,39 +404,471 @@ Frama-C Architecture Changes
\end{center}
Intérêt des automates interprétés
---------------------------------
Why use interpreted automata in Eva
-----------------------------------
Simplification du moteur d'itération
- Leave / Enter, If-then-else, Instanciation du foncteur Dataflow
Ajout de passes de compilation intermédiaires
Réutilisation de la structure pour exporter les informations
- Simplifies the iteration engine
- Simpler control flow, does not go in and out `Dataflow2` functor
- Automatic handling of Enter / Leave transitions
- Total liberty in the order of evaluation
- Widening points are now synchronized with the iteration strategies
- Easy implementation of descending sequences
- Easy implementation of a "reset sub-components before iteration"
- The graph structure can be reused to store the result of the analysis more
precisely (Not done yet)
Ocaml Definition
----------------
Ocaml Definition 1/2
--------------------
\lstset{basicstyle=\scriptsize\ttfamily}
````ocaml
type guard_kind = Then | Else
type assert_kind = Invariant | Assert | Check | Assume
type 'vertex transition =
| Skip
| Return of exp option * stmt
| Guard of exp * guard_kind * stmt
| Prop of assert_kind * identified_predicate * guard_kind *
'vertex labels * Property.t
| Instr of instr * stmt
| Enter of block
| Leave of block
type 'vertex edge = private {
edge_key : int;
edge_kinstr : kinstr;
edge_transition : 'vertex transition;
edge_loc : location;
}
````
Ocaml Definition 2/2
--------------------
\lstset{basicstyle=\scriptsize\ttfamily}
````ocaml
type info = NoneInfo | LoopHead of int (* level *)
type vertex = private {
vertex_key : int;
mutable vertex_start_of : stmt option;
mutable vertex_info : info;
}
Choix architecturaux
module G : Graph.Sig.I
with type V.t = vertex
and type E.t = vertex * vertex edge * vertex
and type V.label = vertex
and type E.label = vertex edge
type automaton = {
graph : G.t;
entry_point : vertex;
return_point : vertex;
stmt_table : (vertex * vertex) Cil_datatype.Stmt.Hashtbl.t;
}
````
Architectural choices
---------------------
- Only one transfer function per edge
- Keep a lot of intermediate vertices: a single statement can lead to several
vertices
- Traceability
- Each vertex is associated to the statement it is the start to
- Each statement is associated to the two vertex starting and ending
the statement
Wto in Frama-C
--------------
- Implementation in `src/utils/wto.ml`
- Original version from Loïc Correnson in 2015
- Functorized version from Matthieu Lemerre in 2016
- Improved with preferences by François Bobot in 2017
````ocaml
type 'n component =
| Component of 'n * 'n partition
| Node of 'n
and 'n partition = 'n component list
````
Fixpoint calculus with WTO
--------------------------
````ocaml
let process_vertex (v : vertex) : bool =
...
let rec iterate_list (l : wto) =
List.iter iterate_element l
and iterate_element = function
| Wto.Node v ->
ignore (process_vertex v)
| Wto.Component (v, w) as component ->
while process_vertex v do
iterate_list w;
done
let wto = Interpreted_automata.get_wto kf in
iterate_list wto
````
Future improvements {.fragile}
-------------------
\lstset{basicstyle=\tiny\ttfamily}
\centering
### Function contracts, logic
\noindent
\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language=C]
/*@ behavior B1:
assimes A1;
requires R1;
behavior B2:
assumes A2;
requires R2; */
void f() {
...
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{.45\textwidth}
\begin{tikzpicture}[chainautomata,scale=0.7,every node/.style={scale=0.7},node distance=2.5mm]
\node[cp] (c1) {$c_1$};
\node[cp] (c2) {$c_2$};
\node[cp] (c3) {$c_3$};
\node[cp] (c4) {$c_4$};
\node[cp] (c5) {$c_5$};
\draw
(c1) edge node[right] {$A_1$} (c2)
edge[bend right] node[left] {$\neg A_1$} (c3)
(c2) edge node[right] {$R_1$} (c3)
(c3) edge node[right] {$A_2$} (c4)
edge[bend right] node[left] {$\neg A_2$} (c5)
(c4) edge node[right] {$R_2$} (c5)
;
\end{tikzpicture}
\end{minipage}
### Exception handling
\noindent
\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language=C]
void f()
{
try {
g();
}
catch (...) {
throw;
}
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{.45\textwidth}
\begin{tikzpicture}[chainautomata,scale=0.7,every node/.style={scale=0.7},node distance=2.5mm]
\node[cp] (c1) {$c_1$};
\node[cp] (c2) {$c_2$};
\node[cp] (c3) {$c_3$};
\node[cp] (c4) {$c_4$};
\node[cp,right=1cm of c3] (c5) {$c_5$};
\draw
(c1) edge node[right] {\tt f()} (c2)
(c2) edge node[right] {\tt catch?} (c3)
edge node[above right] {$\neg$ \tt catch?} (c5)
(c3) edge node[right] {\tt throw e} (c4)
;
\end{tikzpicture}
\end{minipage}
More features using automata indirectly
=======================================
Descending sequences {.fragile}
--------------------
- Une seule fonction de transfert par arc
- Beaucoup de noeuds intermédiaires
\noindent
\begin{minipage}{.35\textwidth}
\begin{lstlisting}[language=C]
void main()
{
int n = 10;
int i = 0;
while (++i < n) {
...
}
}
\end{lstlisting}
\end{minipage}
\begin{onlyenv}<2->
\begin{minipage}{.6\textwidth}
\only<-2>{\tikzstyle{exit} = []}
\only<3->{\tikzstyle{exit} = [thick,draw=blue]}
\begin{tikzpicture}[chainautomata]
\node[cp] (c1) {$c_1$};
\node[cp] (c2) {$c_2$};
\node[wp,exit] (c3) {$c_3$};
\node[cp,exit] (c4) {$c_4$};
\node[cp] (c5) {$c_5$};
\node[cp] (c6) {$c_6$};
\draw
(c1) edge node[right] {\tt n = 10} (c2)
(c2) edge node[right] {\tt i = 0} (c3)
(c3) edge[exit] node[right] {\tt i = i + 1} (c4)
(c4) edge node[right] {\tt i < n} (c5)
(c5) edge [bend left=90] node[left] {} (c3)
;
\draw[exit] (c4) .. controls +(1.2, 0) and +(1.2,0.0) .. node[right] {\tt i >= n} (c6);
\begin{scope}[on background layer]
\draw[component,fill=white!80!green] ($(c5.south west)-(0.7,0.2)$) rectangle ($(c3.north east)+(0.7,0.1)$);
\end{scope}
\node[right=20mm of c1] {1st};
\node[right=20mm of c2] {$[0,0]$};
\node[right=20mm of c3] {$[0,0]$};
\node[right=20mm of c4] {$[1,1]$};
\node[right=20mm of c5] {$[1,1]$};
\node[right=32mm of c1] {2nd};
\node[right=32mm of c3] {$[0,1]$};
\node[right=32mm of c4] {$[1,2]$};
\node[right=32mm of c5] {$[1,2]$};
\node[right=44mm of c1] {3rd};
\node[right=44mm of c3] {$[0,+\infty[$};
\node[right=44mm of c4] {$[1,+\infty[$};
\node[right=44mm of c5] {$[1,9]$};
\node[right=44mm of c6] {$[10,+\infty[$};
\only<3->{
\node[right=56mm of c1] {D};
\node[right=56mm of c3] {$[0,9]$};
\node[right=56mm of c4] {$[1,10]$};
\node[right=56mm of c6] {$[10,10]$};
}
\end{tikzpicture}
\end{minipage}
\end{onlyenv}
\begin{onlyenv}<1>
\lstset{basicstyle=\tiny\ttfamily}
\begin{lstlisting}
test.c:5:[value:alarm] ^\bfseries warning: signed overflow. assert i + 1 $\leq$ 2147483647^;
[value] ====== VALUES COMPUTED ======
[value:final-states] Values at end of function main:
n ^$\in$^ {10}
^\bfseries i $\in$ [10..2147483647]^
\end{lstlisting}
\end{onlyenv}
Calcul de point fixe sur un automate interprété
-----------------------------------------------
Loop unrolling {.fragile}
--------------
\noindent
\begin{minipage}{.4\textwidth}
\begin{lstlisting}[language=C]
/*@ slevel ^\only<1>{5}\only<2>{6}\only<3>{0}^; */
int i = 0;
while (i < 5) {
if (nondet)
break;
i++;
}
int j = 0;
Séquences descendantes
----------------------
^\uncover<3>{/*@ loop unroll 4; */}^
while (j < 4) {
A[j] = 0;
j++;
}
\end{lstlisting}
\end{minipage}
\begin{minipage}{.4\textwidth}
\tikzstyle{as} = [
circle,
fill=orange,
inner sep=0.2pt,
minimum size=6pt,
on chain=2,
node distance=1mm,
font=\tiny]
\tikzstyle{us} = [as,minimum size=8pt]
\tikzstyle{c0} = [fill=orange]
\tikzstyle{c1} = [fill=red]
\tikzstyle{c2} = [fill=purple]
\tikzstyle{c3} = [fill=blue]
\tikzstyle{c4} = [fill=cyan]
\tikzstyle{c5} = [fill=green]
\tikzstyle{c6} = [fill=black,font=\tiny\color{white}]
\begin{tikzpicture}[chainautomata,node distance=3.5mm,start chain=2 going right]
\node[cp] (c1) {$c_1$};
\node[cp] (c2) {$c_2$};
\node[cp] (c3) {$c_3$};
\node[cp] (c4) {$c_4$};
\node[cp] (c5) {$c_5$};
\node[cp] (c6) {$c_6$};
\node[cp] (c7) {$c_7$};
\node[cp] (c8) {$c_8$};
\node[cp,right=28mm of c7] (c9) {$c_9$};
- Intérêt
- Chemins de sortie
\draw
(c1) edge node[right] {\tt i = 0} (c2)
(c2) edge node[right] {\tt i < 5} (c3)
(c3) edge node[right] {\tt nondet} (c4)
(c4) edge[bend left=70] node[left] {\tt i++} (c2)
(c5) edge node[left] {\tt j = 0} (c6)
(c6) edge node[right] {\tt j < 4} (c7)
edge node[above right] {\tt j >= 10} (c9)
(c7) edge node[right] {\tt A[j] = 0} (c8)
(c8) edge [bend left=70] node[left] {\tt j++} (c6)
;
\draw (c3) .. controls +(3.0, 0) and +(3.0,0.0) .. node[right] {$\neg$ \tt nondet} (c5);
\tikzset{node distance=1mm}
\only<1>{
\node[as, c0, right=of c1] {};
\node[as, c0, right=of c2] {};
\node[as, c1] {};
\node[as, c2] {};
\node[as, c3] {};
\node[as, c4] {};
\node[us, c5] {};
\node[as, c0, right=of c3] {};
\node[as, c1] {};
\node[as, c2] {};
\node[as, c3] {};
\node[as, c4] {};
\node[us, c5] {};
\node[as, c0, right=of c4] {};
\node[as, c1] {};
\node[as, c2] {};
\node[as, c3] {};
\node[as, c4] {};
\node[us, c6, right=of c5] {};