Commit 9974feac authored by Andre Maroneze's avatar Andre Maroneze
[Userman] add documentation about analysis-scripts

%% --------------------------------------------------------------------------
%% --- Analysis scripts
%% --------------------------------------------------------------------------
\chapter{Analysis Scripts}
\label{user-analysis-scripts}
This chapter describes some tools and scripts shipped with \FramaC to help
users set up and run analyses on large code bases. These scripts can also help
when dealing with unfamiliar code and when automating analyses.
\section{Requirements}
These analysis scripts are chiefly based on the following tools:
\begin{description}
\item[Python]: most scripts are written using Python 3. Some scripts require
features from specific Python versions, and perform version checks when
starting.
\item[GNU Make]: the main workflow for analysis scripts consists in using
GNU Make (4.0 or newer) to compute dependencies and cache intermediate
results.
\item[Bash]: some scripts are written in Bash.
\end{description}
Besides those, a few tools are occasionally required by the scripts, such as
Perl and GNU Parallel.
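As a quick sanity check, the presence and versions of the main requirements
can be verified from a shell, for instance as follows (a minimal sketch; the
version numbers to look for are those stated above):
\begin{lstlisting}
python3 --version   # Python 3 is required by most scripts
make --version      # GNU Make 4.0 or newer is required
bash --version      # some scripts are written in Bash
\end{lstlisting}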
\section{Usage}
Most scripts are accessible via the \texttt{frama-c-script} command, installed
along with \FramaC. Running this command without any arguments prints the list
of available commands, with a brief description of each of them. A few extra
scripts must be run directly; in both cases, the scripts themselves are
installed in \FramaC's \texttt{share} directory, underneath
\texttt{analysis-scripts}.
\subsection{General Framework}
{\em Note}: the analysis scripts are intended for use in a wide variety
of scenarios, with different plug-ins. Currently, however, they focus
on usage with the \Value plug-in.

The main usage mode of \texttt{analysis-scripts} consists in creating a
Makefile dedicated to the analysis of a C code base.
This Makefile has three main purposes:
\begin{enumerate}
\item To separate the main analysis steps, saving partial results and logging
output messages;
\item To avoid recomputing unnecessary data (e.g. the AST) when modifying
analysis-specific options;
\item To document analysis options and improve replayability, e.g. when
iteratively fine-tuning results.
\end{enumerate}
The intended usage is as follows:
\begin{enumerate}
\item The user identifies a C code base on which they would like to run
\FramaC;
\item The user runs a script to interactively fill a template for
the analysis, with the list of source files and required parameters
(architecture, preprocessing flags, main function);
\item The user edits and runs the generated Makefile, adjusting the
analysis as needed and re-running \texttt{make}.
\end{enumerate}
Ideally, after modifying the source code or re-parametrizing the analysis,
re-running \texttt{make} should be enough to obtain a new result.
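For instance, a typical session might look as follows (a sketch only; the
interactive questions and the parts of the \texttt{GNUmakefile} to adjust
depend on the code base):
\begin{lstlisting}
frama-c-script make-template  # interactively creates GNUmakefile
make                          # parses the sources and runs the analyses
# edit GNUmakefile (e.g. EVAFLAGS, CPPFLAGS) or the sources, then:
make                          # recomputes only what is needed
\end{lstlisting}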
Section~\ref{sec:using-generated-makefile} details usage of the Makefile
and presents an illustrative diagram.
\subsection{Alternative Workflows in the Absence of Build Information}
\label{alternative-workflows}
It is sometimes the case that the \FramaC user is not the developer of the
code under analysis, and does not have full build information about it;
or the code contains non-portable features or missing libraries which prevent
\FramaC from parsing it.
In such cases, these analysis scripts provide two alternative workflows,
depending on how one prefers to choose their source files:
{\em one at a time} or {\em all-in}, described below.
\subsubsection{One at a time}
In this workflow, the user starts from the entry point
of the analysis: typically the \texttt{main} function of a program or a test
case. Only the file defining that function is included at first. Then, using
\texttt{make-wrapper} (described in section~\ref{sec:script-descriptions}),
the user iteratively adds new sources as needed,
so that all function definitions are included.
\begin{itemize}
\item Advantages: higher code coverage; faster preprocessing/parsing; and
avoids including possibly unrelated files
(e.g. for an alternative OS/architecture).
\item Drawbacks: the iterative approach recomputes the analysis several times;
also, it may miss files containing global initializers, which are not flagged
as missing.
\end{itemize}
\subsubsection{All-in}
In this workflow, the user adds {\em all} source files to the analysis, and
if necessary removes some of them, e.g. when there are conflicting definitions,
such as when multiple test files define a \texttt{main} function.
\begin{itemize}
\item Advantages: optimistic approach; may not require any extra iterations,
provided everything is defined exactly once. Does not miss global initializers,
and may find issues in code which is not reachable (e.g. syntax-related
warnings).
\item Drawbacks: preprocesses and parses several files which may end up never
being used; lower code coverage; if parsing fails, it may be harder to
find the cause (especially if due to unnecessary sources).
\end{itemize}
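As an illustration, one way to enumerate candidate sources for the {\em all-in}
approach is a simple shell command such as the following (a sketch; real
projects usually require filtering out tests or platform-specific files):
\begin{lstlisting}
# list all C sources under the current directory
find . -name '*.c' | sort
\end{lstlisting}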
\subsection{Using a JSON Compilation Database (JCDB)}
Independently of the chosen workflow, some partial information can be retrieved
when CMake or Makefile scripts are available for compiling the sources.
They allow the production of a JSON Compilation Database
(\texttt{compile\_commands.json}, called JCDB for short; see related option
in section~\ref{sec:preprocessing}). This leads to a different workflow:
\begin{enumerate}
\item Run CMake with the flag \texttt{-DCMAKE\_EXPORT\_COMPILE\_COMMANDS=1},
or install Build EAR (\url{https://github.com/rizsotto/Bear}) and run
\texttt{bear make <targets>} instead of \texttt{make <targets>}. This will
create a \texttt{compile\_commands.json} file.
\item Run \texttt{frama-c-script list-files}. A list of the compiled files,
along with files defining a \texttt{main} function, will be presented.
\item Run \texttt{frama-c-script make-template} to create a template for
\FramaC/\Value. Answer ``yes'' when asked about using the
\texttt{compile\_commands.json} file.
\end{enumerate}
Ideally, the above approach should result in a working template. In practice,
however, the compilation database may include extraneous sources
(used to compile other files than the target object) and duplicate flags
(e.g. when compiling the same source for different binary targets or test
cases). Manual intervention may be necessary.
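The corresponding commands might look as follows (a sketch; the CMake/Bear
invocations depend on the build system of the analyzed project):
\begin{lstlisting}
# either, for CMake-based projects:
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=1 <path to sources>
# or, for Makefile-based projects (using Build EAR):
bear make <targets>
# then:
frama-c-script list-files      # lists compiled files and main functions
frama-c-script make-template   # answer "yes" to use compile_commands.json
\end{lstlisting}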
\section{Using the generated Makefile}
\label{sec:using-generated-makefile}
The generated Makefile can be used to run one or several analyses.
The diagram in Fig.~\ref{fig:analysis-scripts} summarizes its usage.
Makefile targets and outputs are detailed in this section.
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=\textwidth]{analysis-scripts.pdf}
\caption{Overview of the {\em analysis-scripts} workflow:
the inputs on the left produce a GNUmakefile which is then used
for parsing, analyzing and visualizing results.}
\label{fig:analysis-scripts}
\end{center}
\end{figure}
Each analysis is defined in a target, written by the user as follows:
\texttt{<target>.parse: file1.c file2.c ...}
That is, the target name (chosen by the user), suffixed with \texttt{.parse},
is defined as depending on each of its source files. Changes to any of these
sources will trigger a recomputation of the AST.
{\em Note:} the target name itself {\em cannot} contain slashes or dots.
See also the {\em Technical Notes} section about some current limitations.
Then, for each \texttt{.parse} target, a corresponding \texttt{.eva} target
needs to be added to the \texttt{TARGETS} variable in the Makefile.
This will run \Value on the test case.
Each \texttt{.eva} target depends on its corresponding \texttt{.parse} target;
if the sources change, the analysis must take into account the new AST.
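For instance, a target named \texttt{main}, built from two source files,
could be declared as follows (the file and target names are purely
illustrative, and the exact form of the \texttt{TARGETS} assignment depends
on the generated template):
\begin{lstlisting}
TARGETS = main.eva
main.parse: main.c utils.c
\end{lstlisting}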
\subsection{Important Variables}
Several Makefile variables are available to customize \FramaC; the main
ones are presented below.
\begin{description}
\item[TARGETS]: as mentioned before, must contain the list of targets,
suffixed with \texttt{.eva}.
\item[CPPFLAGS]: preprocessing options, passed to \FramaC inside option
\texttt{-cpp-extra-args}, when parsing the sources.
\item[FCFLAGS]: extra command-line options given to \FramaC when parsing the
sources and when running analyses. Typically, the options given to the
\FramaC kernel.
\item[EVAFLAGS]: extra command-line options given to \Value when running
its analyses.
\end{description}
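A global parametrization might look as follows (the specific options shown
are only illustrative and must be adapted to the analyzed code):
\begin{lstlisting}
CPPFLAGS += -I include       # preprocessing flags (illustrative)
FCFLAGS  += -machdep x86_32  # kernel options (illustrative)
EVAFLAGS += -eva-slevel 10   # Eva options (illustrative)
\end{lstlisting}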
These variables are defined globally, but they can also be customized for
each target; for instance, if a given target \texttt{t1} has a main
function \texttt{test1} and requires a global macro \texttt{-DTEST1}, but
target \texttt{t2}'s main function is \texttt{test2} and it requires
\texttt{-DTEST2}, you can locally modify \texttt{FCFLAGS} and \texttt{CPPFLAGS}
as follows:
\begin{lstlisting}
t1.parse: FCFLAGS += -main test1
t1.parse: CPPFLAGS += -DTEST1
t2.parse: FCFLAGS += -main test2
t2.parse: CPPFLAGS += -DTEST2
\end{lstlisting}
\subsection{Predefined targets}
The predefined targets below are the {\em raison d'être} of the generated
Makefile; they speed up analyses, provide self-documentation, and enable
quick iterations during parametrization of the analysis.
\begin{description}
\item[all (default target)]: the default target simply calls
\texttt{<target>.eva}, for each \texttt{<target>} added to variable
\texttt{TARGETS}. Does nothing once the analysis is finished and saved.
\item[<target>.parse]: runs \FramaC on the specified target, preprocessing
and parsing its source files. Produces a directory \texttt{<target>.parse}
containing several logs, a pretty-printed version of the parsed code, and
a \FramaC session file (\texttt{framac.sav}) to be loaded in the GUI or by
other analyses. Does nothing if parsing already happened.
\item[<target>.eva]: loads the parsed result (from the \texttt{.parse} target)
and runs the \Value plug-in, with the options given in \texttt{EVAFLAGS}.
If the analysis succeeds, produces a directory \texttt{<target>.eva} with the
analysis results and a saved session file.
Also creates a timestamped version of \texttt{<target>.eva}, to enable
future comparisons between different parametrizations. The non-timestamped
version corresponds to the latest (successful) analysis.
If the analysis fails, tries to save a partial result in
\texttt{<target>.eva.error} (when possible).
\item[<target>.eva.gui]: loads the result of the corresponding
\texttt{<target>.eva} session and opens it in the GUI. This allows inspecting
the results of \Value. This target always opens the GUI, even when no
changes have happened.
\item[clean]: removes all results produced by the \texttt{.parse} and
\texttt{.eva} targets.
\end{description}
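For a target named \texttt{main} (an illustrative name), typical invocations
are:
\begin{lstlisting}
make main.parse    # preprocess and parse the sources
make main.eva      # parse if needed, then run Eva
make main.eva.gui  # open the latest Eva results in the GUI
make clean         # remove all generated .parse/.eva results
\end{lstlisting}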
\section{Script Descriptions}
\label{sec:script-descriptions}
The most useful commands are described below.
Run \texttt{frama-c-script help} for more details and optional arguments.
\begin{description}
\item[make-template]: used to create the initial Makefile, based on a template.
This command creates a file named \texttt{GNUmakefile} with some hardcoded
sections, some sections filled in interactively by the user, and comments
indicating which parts may need to be changed. Once created, it enables the
general workflow mentioned earlier.
\item[make-wrapper <target> <args>]: calls \texttt{make <target> <args>} with
a special wrapper: when running \Value, upon encountering one of a few known
error messages, suggests some actions on how to proceed.
For instance, if a missing function definition is encountered when analyzing
the code with \Value, the wrapper will look for its definition and, if found,
suggest that its source code be added to the analysis. This script is meant
to be used with the {\em one at a time} workflow described in
section~\ref{alternative-workflows}.
\item[find-fun <fun>]: looks for possible declarations and definitions
of function \texttt{<fun>}. Uses a heuristic that does not depend on \FramaC
being able to parse the sources. Useful to find entry points and missing
includes.
\end{description}
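For instance (the target and function names below are hypothetical):
\begin{lstlisting}
frama-c-script make-wrapper main.eva  # make main.eva, with suggestions
frama-c-script find-fun parse_args    # locate declarations/definitions
\end{lstlisting}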
Other commands, only useful in a few cases, are described below.
\begin{description}
\item[configure <machdep>]: used to run a \texttt{configure}
script (based on Autoconf) with some settings to emulate a more portable
system, removing optional code features that could prevent \FramaC from
parsing the sources. Currently still depends partially on the host system,
so many features are not disabled.
\item[make-path] (for \FramaC developers): to be used when \FramaC is not
installed in the \texttt{PATH}; adds a \texttt{frama-c-path.mk} file that is
used by the Makefile generated by \texttt{make-template}.
\item[flamegraph]: opens a {\em flamegraph}\footnote{%
See \url{https://github.com/brendangregg/FlameGraph} for details about
flamegraphs.} to help visualize which functions take most of the time
during analysis with \Value.
\item[summary]: for monitoring the progress of multiple analyses defined
in a single Makefile. Presents a summary of the analyses when done. Mostly
useful for benchmarking or when dealing with several test cases.
\end{description}
The following commands require a JSON Compilation Database.
\begin{description}
\item[list-files]: lists all files in the given JCDB. Useful for filling
out the Makefile template when running \texttt{make-template}.
\item[normalize-jcdb]: converts absolute paths inside a
\texttt{compile\_commands.json} file into relative paths (w.r.t. PWD)
when possible. Used to allow moving/versioning the directory containing
the JCDB file.
\end{description}
Finally, there is the following script, which is {\em not} available as a
command in \texttt{frama-c-script}, since its usage scenario is very
different. It is available at\\
\texttt{\$FRAMAC\_SHARE/analysis-scripts/creduce.sh}.
\begin{description}
\item[creduce.sh]: a script to help run the C-Reduce\footnote{%
See \url{https://embed.cs.utah.edu/creduce} for more details.} tool to reduce
C programs causing crashes in \FramaC; useful e.g. when submitting a bug
report to the \FramaC team, without needing to submit potentially confidential
data. The script contains extensive comments about its usage. It is also
described in a post from the \FramaC blog.
\end{description}
\section{Practical Examples: Open Source Case Studies}
The {\em open-source-case-studies} Git repository (OSCS for short),
available at \url{https://git.frama-c.com/pub/open-source-case-studies},
contains several open-source C code bases parametrized with the help of
analysis scripts. Each case study has its own directory, with a
\texttt{GNUmakefile} defining one or more analysis targets.
Due to the variety of test cases, the OSCS repository provides practical usage
examples of the \texttt{GNUmakefile} described in this chapter.
The case studies are periodically synchronized with the public \FramaC
repository (daily snapshots), so they may rely on features not yet available
in major \FramaC releases. A few may also contain legacy features which
are no longer used; but overall, they provide useful examples and allow
the user to tweak analysis parameters to test their effects.
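A possible way to experiment with a case study is sketched below (the
directory and target names depend on the chosen case study):
\begin{lstlisting}
git clone https://git.frama-c.com/pub/open-source-case-studies
cd open-source-case-studies/<case study>
make                   # runs the analyses defined in its GNUmakefile
make <target>.eva.gui  # inspects the results of one of its targets
\end{lstlisting}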
\section{Technical Notes}
This section lists known issues and technical details which may help users
understand some unintuitive behaviors.
\paragraph{\em Changes to header files do not trigger a new parsing/analysis.}
Currently, changes to included files (e.g. headers) are {\em not}
tracked by the generated Makefile and may require running \texttt{make}
with \texttt{-B} (to unconditionally remake the targets), or running
\texttt{make clean} before re-running \texttt{make}.
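For instance (with a hypothetical target named \texttt{main}):
\begin{lstlisting}
# after editing a header included by the sources, either:
make -B main.parse main.eva
# or, more drastically:
make clean
make
\end{lstlisting}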
\paragraph{\em Why is the generated Makefile called \texttt{GNUmakefile}?}
GNU Make, by default, searches for a file named \texttt{GNUmakefile} before
searching for a \texttt{Makefile}. Thus, running \texttt{make} without
arguments results in running the Makefile generated by \texttt{make-template}.
You can rename it to \texttt{framac.mk} or something else, and then run it
via \texttt{make -f framac.mk <targets>}.
\paragraph{\em Most scripts are heuristics-based and offer no
correctness/completeness guarantees.} In order to handle files {\em before}
the source preparation step is complete (that is, before \FramaC is able to
parse the sources into a complete AST), most commands use scripts based on
syntactic heuristics, which were found to work well in practice but are
easily disturbed by syntactic changes (e.g. whitespace).
\paragraph{\em Most commands are experimental.} These analysis scripts are a
recent addition and subject to changes. They are provided on a best-effort
basis.
@@ -53,6 +53,13 @@ The remainder of this manual is organized in several chapters.
 the platform.
 \item[Chapter~\ref{user-gui}] gives a detailed description of the graphical
 user interface of \FramaC.
+\item[Chapter~\ref{user-report}] describes the \texttt{Report} plug-in, used
+for textual consolidation and export of warnings, errors and properties.
+\item[Chapter~\ref{user-variadic}] presents the \texttt{Variadic} plug-in,
+used to help other plug-ins handle code containing variadic functions, such
+as \texttt{printf} and \texttt{scanf}.
+\item[Chapter~\ref{user-analysis-scripts}] details several scripts used to
+help setup and run analyses on large code bases.
 \item[Chapter~\ref{user-errors}] explains how to report errors \via \FramaC's
 public Gitlab repository.
 \end{description}

@@ -17,7 +17,7 @@ their prototype by an ellipsis (\ldots) after a set of fixed arguments.
 Some functions in the C standard library are variadic, in particular
 formatted input/output functions, such as \texttt{printf}/\texttt{scanf}.
 Due to the dynamic nature of their arguments, such functions present additional
-challenges to code analysis. The \textttdef{Variadic} helps dealing with some
+challenges to code analysis. The \textttuse{Variadic} helps dealing with some
 of these challenges, reducing or eliminating the need for plug-ins to have
 to deal with these special cases.

@@ -64,6 +64,7 @@ Baudin, Mickaël Delahaye, Philippe Hermann, Benjamin Monate and Dillon Pariente
 \include{user-gui}
 \include{user-report}
 \include{user-variadic}
+\include{user-analysis-scripts}
 \include{user-errors}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%