%%% Local Variables:
%%% TeX-master: "developer.tex"
%%% ispell-local-dictionary: "english"
%%% End:

\chapter{Advanced Plug-in Development}\label{chap:advance}
\lstset{language=Ocaml} %% makes Ocaml the default language for listings, eg. \lstinline.

This chapter details how to use the services provided by \framac in order to be
fully operational when developing plug-ins. Each section describes
technical points a developer should be aware of. Otherwise,
one could find oneself in one or more of the following situations
\footnote{It
  is fortunately quite difficult (but not impossible) to fall into the worst
  situation by mistake if you are not a kernel developer.}
(from bad to worse):
\begin{enumerate}
\item reinventing the (\framac) wheel;
\item being unable to do some specific things (\eg saving
  results\index{Saving} of an analysis on disk, see
  Section~\ref{proj:states});
\item introducing bugs in his/her code;
\item introducing bugs in other plug-ins using his/her code;
\item breaking the kernel consistency\index{Consistency} and so potentially
  breaking all \framac plug-ins (\eg if s/he modifies the
  AST\index{AST!Modification} without changing project\index{Project}, see
  Section~\ref{proj:use}).
\end{enumerate}

In this chapter, we assume that the reader is able to write a minimal plug-in
like \texttt{hello}\index{Hello} described in Chapter~\ref{chap:tutorial} and
knows about the software architecture of \framac (Chapter~\ref{chap:archi}).
Moreover, plug-in development requires the use of advanced features of
\caml (module system, classes and objects, \emph{etc}). Static plug-in
development requires some knowledge of \autoconf and \make.
Each section summarizes its own prerequisites at its beginning (if any).

Note that the following sections can be read in no particular
order: their contents are quite independent from one another, even if
there are references from one section to another. Pointers to the
reference manual (Chapter~\ref{chap:refman}) are also provided for readers who
want full details about specific parts.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% \section{File Tree Overview}\label{adv:files}

%% \begin{target}beginners.\end{target}

%% The \framac main directory is split in several sub-directories. The \framac source
%% code is mostly provided in directories \texttt{cil}\codeidx{cil} and
%% \texttt{src}\codeidx{src}. The first one contains the source code of
%% \cil~\cite{cil}\index{Cil} extended with an \acsl~\cite{acsl}
%% implementation\index{ACSL}. The second one is the core implementation of
%% \framac. This last directory contains directories of the \framac
%% kernel\index{Kernel} and directories of the provided \framac plug-in.

%% A pretty complete description of the \framac file tree is provided in
%% Section~\ref{refman:files}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{\framac Configure.in}\label{adv:configure}
\codeidxdef{configure.in}

\begin{target}not for standard plug-ins developers.\end{target}

\begin{prereq}
  Knowledge of \autoconf and shell programming.
\end{prereq}

In this section, we detail how to modify the file \texttt{configure.in} in
order to configure plug-ins (\framac configuration has been introduced in
Sections~\ref{tut2:configure} and~\ref{tut2:makefile}).

First, Section~\ref{conf:principle} introduces the general principle and
organisation of \texttt{configure.in}. Then, Section~\ref{conf:add} explains how
to configure a new simple plug-in without any dependencies. Next, we show how to
declare dependencies on external libraries and tools
(Section~\ref{conf:dep-lib}) and on other plug-ins
(Section~\ref{conf:dep-plug}). Finally, Section~\ref{conf:lib} presents the
configuration of external libraries and tools needed by a new plug-in but not
used anywhere else in \framac.

\subsection{Principle}\label{conf:principle}

When you execute \texttt{autoconf}, file \texttt{configure.in} is used to
generate the \texttt{configure} script. Each \framac user executes this script
to check his/her system and determine the most appropriate configuration: at the
end of this configuration (if successful), the script summarizes the
status of each plug-in, which can be:\index{Plug-in!Status}
\begin{itemize}
\item \emph{available} (everything is fine with this plug-in);
\item \emph{partially available}: either an optional dependency of the plug-in
  is not fully available, or a mandatory dependency of the plug-in is only
  partially available; or\index{Plug-in!Dependency}
\item \emph{not available}: either the plug-in itself is not provided by
  default, or a mandatory dependency of the plug-in is not available.
\end{itemize}

The important notion in the above definitions is
\emph{dependency}\index{Plug-in!Dependency|bfit}. A dependency of a plug-in $p$
is either an external library/tool\index{Library}\index{Tool} or another
\framac plug-in. It is either \emph{mandatory} or \emph{optional}. A mandatory
dependency must be present in order to build $p$, whereas an optional
dependency provides additional features to $p$ that are not strictly required
(in particular, $p$ must be compilable without any of its optional
dependencies).

Hence, for the plug-in developer, the main role of \texttt{configure.in} is to
define the optional and mandatory dependencies of each plug-in. Another standard
job of \texttt{configure.in} is the addition of the options \texttt{--enable-$p$}
and \texttt{--disable-$p$} to \texttt{configure} for each plug-in $p$. These
options respectively force $p$ to be available and disable $p$ (its status is
then automatically ``not available'').

In practice, \texttt{configure.in} is organised in different sections, each
dedicated to specific configuration checks. Each of them begins with a title
delimited by comments, which is highlighted when \texttt{configure} is executed.
These sections are described in Section~\ref{refman:configure}. Here we focus on
the modifications to perform in order to integrate a new plug-in into \framac.

\subsection{Addition of a Simple Plug-in}\label{conf:add}

In order to add a new plug-in, you have to add a new subsection for the new
plug-in to the section \emph{Wished Plug-ins}. This is usually very easy to
perform by copying/pasting from another existing plug-in (\eg
\texttt{occurrence}\index{Plug-in!Occurrence|see{Occurrence}}\index{Occurrence})
and by replacing the plug-in name (here \texttt{occurrence}) with the new
plug-in name in the pasted part. In this section, plug-ins are sorted in
lexicographic order.
For instance, the section \emph{Wished Plug-ins} introduces a new subsection for
the plug-in \texttt{occurrence} in the following way.
\codeidxdef{check\_plugin}
\begin{configurecode}
# occurrence
############
check_plugin(occurrence,src/plugins/occurrence,
             [support for occurrence analysis],yes)
\end{configurecode}
\begin{itemize}
\item The first argument is the plug-in name;
\item the second one is the name of the directory containing the source files
  of the plug-in (usually a sub-directory of \texttt{src/plugins});
\item the third one is a help message for the \texttt{--enable-occurrence}
  option of \texttt{configure};
\item the last one indicates whether the plug-in is enabled by default
  (\texttt{yes/no}).

\begin{important}
The plug-in name must contain only alphanumeric characters and
underscores. It must be the same as the \texttt{name} value given as
argument to the functor \texttt{Plugin.Register} of
Section~\ref{adv:plugin-services} (with spaces replaced by underscores). It must
also be the same (modulo upper/lower case) as the \texttt{PLUGIN\_NAME} variable
given in the plug-in's Makefile presented in Section~\ref{adv:dynamic-make}.
\end{important}

\begin{important}
  The macro \texttt{check\_plugin}\scodeidx{configure.in}{check\_plugin} sets
  the following variables:
  \texttt{FORCE\_OCCURRENCE}\scodeidxdef{configure.in}{FORCE\_$plugin$},
  \texttt{REQUIRE\_OCCURRENCE}\scodeidx{configure.in}{REQUIRE\_$plugin$},
  \texttt{USE\_OCCURRENCE}\scodeidx{configure.in}{USE\_$plugin$}, and
  \texttt{ENABLE\_OCCURRENCE}\scodeidxdef{configure.in}{ENABLE\_$plugin$}.
\end{important}

The first variable indicates whether the user explicitly requires the
availability of \texttt{occurrence} by setting the option
\texttt{--enable-occurrence}. The second and third variables are used by other
plug-ins in order to handle their dependencies (see
Section~\ref{conf:dep-plug}). The fourth variable, \texttt{ENABLE\_OCCURRENCE},
indicates the plug-in status (available, partially available or not available).
If \texttt{--enable-occurrence} is set, then \texttt{ENABLE\_OCCURRENCE} is \yes
(plug-in available); if \texttt{--disable-occurrence} is set, then its value is
\texttt{no} (plug-in not available). If no option is specified on the command
line of \texttt{configure}, its value is set to the default one (according to
the value of the fourth argument of \texttt{check\_plugin}).

%%%%
\subsection{Configuration of New Libraries or Tools}
\label{conf:lib}
\index{Library!Configuration}
\index{Tool!Configuration}

Some plug-ins need additional tools or libraries to be fully
functional. The \texttt{configure} script takes care of these in two
steps. First, it checks that an appropriate version of each external
dependency exists on the system. Second, it verifies for each plug-in
that its dependencies are met. Section~\ref{conf:dep-lib} explains how
to make a plug-in depend on a given library (or tool). The present
section deals with the first part, that is how to check for a given
library or tool on a system. Configuration of new libraries and configuration of
new tools are similar.  In this section, we therefore choose to focus
on the configuration of new libraries. This is done by calling a
predefined macro called
\texttt{configure\_library}\scodeidxdef{configure.in}{configure\_library}\scodeidxdef{configure.in}{configure\_tools}\footnote{For
  tools, there is a macro \texttt{configure\_tool} which works in the
  same way as \texttt{configure\_library}.}.
The \texttt{configure\_library} macro takes three arguments. The first one is
the (uppercase) name of the library; the second one is a filename
which is used by the script to check the availability of the library.
In case there are multiple possible locations for the library, this
argument can be a list of filenames. In this case, the argument must
be properly quoted ({\it i.e.} enclosed in a {\tt [}, {\tt ]} pair).
Each name is checked in turn, and the first one which corresponds to an
existing file is selected. If no name in the list corresponds to an
existing file, the library is considered to be unavailable. The last
argument is a warning message to display if a configuration problem
appears (usually because the library does not exist).

Results of this macro are available through two variables which are
substituted in the files generated by \texttt{configure}.
\begin{itemize}
\item \texttt{HAS\_$library$}\scodeidxdef{configure.in}{HAS\_$library$}
  is set to \texttt{yes} or \texttt{no} depending on the availability
  of the library;
\item \texttt{SELECTED\_$library$}\scodeidxdef{configure.in}{SELECTED\_$library$}
  contains the name of the version selected as described above.
\end{itemize}

When checking for \ocaml{} libraries and object files, remember that
they come in two flavors: bytecode and native
code, which have distinct suffixes. Therefore, you should use the
variables \texttt{LIB\_SUFFIX}\scodeidx{configure.in}{LIB\_SUFFIX}
(for libraries) and
\texttt{OBJ\_SUFFIX}\scodeidx{configure.in}{OBJ\_SUFFIX}
(for object files) to check the presence of a
given file. These variables are initialized at the beginning of the
\texttt{configure} script depending on the availability of a
native-code compiler on the current installation.

\begin{example}
The library \lablgtksourceview\index{Lablgtksourceview2} (used to improve
the rendering of C sources in the GUI) is part of
  \lablgtk\index{Lablgtk}~\cite{lablgtk}.
  This is checked through the following command,
  where \texttt{LABLGTKPATH\_FOR\_CONFIGURE} is the path where
  \texttt{configure} has found \lablgtk itself.

\begin{configurecode}
configure_library(
  [GTKSOURCEVIEW],
  [\$LABLGTKPATH_FOR_CONFIGURE/lablgtksourceview2.\$LIB_SUFFIX],
  [lablgtksourceview not found])
\end{configurecode}
\end{example}

\subsection{Addition of Library/Tool Dependencies}\label{conf:dep-lib}
\index{Library!Dependency}\index{Tool!Dependency}

Dependencies upon external tools and libraries are governed by two macros:
\begin{itemize}
\item \texttt{plugin\_require\_external($plugin$,$library$)}%
\scodeidxdef{configure.in}{plugin\_require\_external} indicates that
$plugin$ requires $library$ in order to be compiled.
\item \texttt{plugin\_use\_external($plugin$,$library$)}%
\scodeidxdef{configure.in}{plugin\_use\_external} indicates that
$plugin$ uses $library$, but can nevertheless be compiled if $library$
is not installed (potentially offering reduced functionality).
\end{itemize}

\begin{convention}
The best place to declare such dependencies is just after the addition of $p$,
which sets the value of \texttt{ENABLE\_$p$}.
\end{convention}

\begin{example}
  Plug-in \texttt{gui}\index{Plug-in!GUI} requires
  \lablgtk\index{Lablgtk} and \gnomecanvas\index{GnomeCanvas}. It
also optionally uses \dottool\index{Dot} for displaying graphs (graphs cannot
be displayed without this tool). So, just after its declaration, the following
lines appear in \texttt{configure.in}.
\begin{configurecode}
plugin_require_external(gui,lablgtk)
plugin_require_external(gui,gnomecanvas)
plugin_use_external(gui,dot)
\end{configurecode}
\end{example}

\subsection{Addition of Plug-in Dependencies}\label{conf:dep-plug}
\index{Plug-in!Dependency|bfit}

Adding a dependency on another plug-in is much the same as adding a
dependency on an external library or tool (see
Section~\ref{conf:dep-lib}). For this purpose, \texttt{configure.in}
uses two macros:
\begin{itemize}
\item \texttt{plugin\_require($plugin1$,$plugin2$)}%
\scodeidxdef{configure.in}{plugin\_require} states that
$plugin1$ needs $plugin2$.
\item \texttt{plugin\_use($plugin1$,$plugin2$)}%
\scodeidxdef{configure.in}{plugin\_use} states that
$plugin1$ can be used in absence of $plugin2$, but requires $plugin2$
for full functionality.
\end{itemize}

There can be mutual dependencies between plug-ins. This is for instance
the case for plug-ins \texttt{value} and \texttt{from}.
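Such a mutual dependency could be declared as follows (a sketch; the actual
declarations in \texttt{configure.in} may differ slightly):
\begin{configurecode}
plugin_require(value,from)
plugin_require(from,value)
\end{configurecode}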

\section{Plug-in Specific Configure.ac}
\codeidxdef{configure.ac}
\label{sec:config-external}

\begin{target}standard plug-ins developers.\end{target}

\begin{prereq}
  Knowledge of \autoconf and shell programming.
\end{prereq}

External plug-ins can have their own configuration file and can rely on the
macros defined for \framac. In addition, as mentioned in
Section~\ref{dynamic-make:comp-same-time}, those plug-ins can be
compiled directly from \framac's own Makefile. In order for them to
integrate well in this setting, they should follow a particular layout,
described below.
First, they need to be able to refer to the auxiliary \texttt{configure.ac}
file defining \framac-specific macros when they are used as stand-alone
plug-ins. This can be done with the following code:
\begin{configurecode}
m4_define([plugin_file],Makefile)

m4_define([FRAMAC_SHARE_ENV],
          [m4_normalize(m4_esyscmd([echo $FRAMAC_SHARE]))])

m4_define([FRAMAC_SHARE],
	  [m4_ifval(FRAMAC_SHARE_ENV,[FRAMAC_SHARE_ENV],
                                     [m4_esyscmd(frama-c -print-path)])])

m4_ifndef([FRAMAC_M4_MACROS],
         [m4_include(FRAMAC_SHARE/configure.ac)]
        )
\end{configurecode}
%$ emacs-highlighting is a bit lost otherwise...

\texttt{plugin\_file} is the file which must be present to ensure that
\texttt{autoconf} is called in the appropriate directory (see documentation for
the \texttt{AC\_INIT} macro of autoconf). \texttt{configure.ac} can be found in
two ways: either by relying on the \texttt{FRAMAC\_SHARE} shell variable (when
\framac{} is not installed, {\it i.e.} when configuring the plug-in together
with the main \framac), or by calling an installed \framac (when installing
the plug-in separately). The inclusion of \texttt{configure.ac} needs to be
guarded to prevent multiple inclusions, as the configuration file of the plug-in
might itself be included by \texttt{configure.in}
(see section~\ref{dynamic-make:comp-same-time} for more details).
The configuration of the plug-in itself or related libraries and tools can then
proceed as described in Sections~\ref{conf:add} and \ref{conf:lib}. References
to specific files in the plug-in source directory should be guarded with the
following macro:
\begin{configurecode}
PLUGIN_RELATIVE_PATH(file)
\end{configurecode}

If the external plug-in has some dependencies, as described in
Sections~\ref{conf:dep-lib} and~\ref{conf:dep-plug},
the \texttt{configure} script must check that all dependencies
are met. This is done with the following macro:\scodeidxdef{configure.in}{check\_plugin\_dependencies}
\begin{configurecode}
check_plugin_dependencies
\end{configurecode}
An external plug-in can
have dependencies upon previously installed plug-ins.
However, two separately installed plug-ins cannot be
mutually dependent on each other. Nevertheless, they can be compiled together
with the main \framac sources using the \texttt{--enable-external} option
of \texttt{configure} (see Section~\ref{dynamic-make:comp-same-time}
for more details).

Finally, the configuration must end with the following command:
\begin{configurecode}
write_plugin_config(files)
\end{configurecode}
where \texttt{files} lists the files that must be processed by
\texttt{configure} (as in the \texttt{AC\_CONFIG\_FILES} macro).
\texttt{PLUGIN\_RELATIVE\_PATH} is not needed here.

\begin{important}
For technical reasons, the macros
\texttt{configure\_library}, \texttt{configure\_tool}, 
\texttt{check\_plugin\_dependencies}, and \texttt{write\_plugin\_config}
must not be inside a conditional part of the \texttt{configure}
script. More precisely,
they use the diversion mechanism of \texttt{autoconf} in order to
ensure that the tests are performed after all dependency information
has been gathered from all existing plug-ins. Diversion is a very primitive,
text-to-text transformation. Using those macros within a
conditional (or anything that alters the control flow of the script) is likely
to result in putting some unrelated part of the script in the same branch of
the conditional.
\end{important}
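Putting everything together, the tail of a stand-alone plug-in's
\texttt{configure.ac} might look as follows. This is only a sketch: the
plug-in name \texttt{myplugin}, its source directory, its help message and its
dependency upon \texttt{value} are all hypothetical.
\begin{configurecode}
# after the inclusion code shown at the beginning of this section:
check_plugin(myplugin,.,[support for the myplugin analysis],yes)
plugin_require(myplugin,value)
check_plugin_dependencies
write_plugin_config(Makefile)
\end{configurecode}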

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{\framac Makefile}\label{adv:make}\codeidxdef{Makefile}

\begin{target}not for standard plug-in developers.\end{target}

\begin{prereq}
  Knowledge of \make.
\end{prereq}

In this section, we detail the use of the \texttt{Makefile} dedicated to \framac
compilation. This file is split into several sections, which are described in
Section~\ref{make:sections}. By default, executing \texttt{make} only displays
an overview of the commands. For example, here is the output of the compilation
of source file \texttt{src/kernel\_services/plugin\_entry\_points/db.cmo}.
\begin{shell}
\$ make src/kernel_services/plugin_entry_points/db.cmo
Ocamlc      src/kernel_services/plugin_entry_points/db.cmo
\end{shell}
If you wish to see the exact command lines, you have to set the variable
\texttt{VERBOSEMAKE}\codeidxdef{VERBOSEMAKE} to \texttt{yes}, as below.
\begin{shell}
\$ make VERBOSEMAKE=yes src/kernel_services/plugin_entry_points/db.cmo
\end{shell}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Plug-in Specific Makefile}\label{adv:dynamic-make}
\codeidxdef{Makefile.dynamic}

\begin{prereq}
  Knowledge of \make.
\end{prereq}

\subsection{Using \texttt{Makefile.dynamic}}
\label{dynamic-make:dynamic}

In this section, we detail how to write a Makefile for a given plug-in. Even if
it is still possible to write such a Makefile from scratch, \framac provides a
generic Makefile, called \texttt{Makefile.dynamic}\codeidx{Makefile.dynamic},
which helps the plug-in developer in this task. This file is installed in the
\framac share directory. So for writing your plug-in specific Makefile, you
have to:
\begin{enumerate}
\item set some variables for customizing your plug-in;
\item include \texttt{Makefile.dynamic}.
\end{enumerate}

\begin{example}
A minimal \texttt{Makefile} is shown below. That is the Makefile of the plug-in
\texttt{Hello World} presented in the tutorial (see
Section~\ref{tut2:hello}). Each variable set in this example has to be set
by any plug-in.
\codeidx{Makefile.dynamic}
\codeidx{FRAMAC\_SHARE}
\codeidx{FRAMAC\_LIBDIR}
\codeidx{PLUGIN\_CMO}
\codeidx{PLUGIN\_NAME}
\makefileinput{./tutorial/hello/generated/makefile_multiple/Makefile}
\texttt{FRAMAC\_SHARE} must be set to the \framac share directory, while
\texttt{FRAMAC\_LIBDIR} must be set to the \framac lib directory.
\texttt{PLUGIN\_NAME} is the capitalized name of your plug-in, while
\texttt{PLUGIN\_CMO} is the list of \texttt{.cmo} files generated from your
\caml sources.
Note that, by default, all warnings are disabled during the compilation of
your plug-in; it is recommended to define the environment variable
\texttt{DEVELOPMENT=yes} to enable the standard set of compilation warnings.
\end{example}
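For reference, such a minimal Makefile essentially boils down to a few variable
assignments followed by the inclusion of \texttt{Makefile.dynamic}. The sketch
below assumes a hypothetical plug-in \texttt{Hello} with a single source file
\texttt{hello\_world.ml}, and queries the installation paths from an installed
\framac.
\begin{makefilecode}
# sketch of a minimal plug-in Makefile (hypothetical plug-in Hello)
FRAMAC_SHARE  := $(shell frama-c -print-share-path)
FRAMAC_LIBDIR := $(shell frama-c -print-libpath)
PLUGIN_NAME   := Hello
PLUGIN_CMO    := hello_world
include $(FRAMAC_SHARE)/Makefile.dynamic
\end{makefilecode}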

\begin{important}
To run your specific Makefile, you must have properly installed \framac
beforehand, or compile your plug-in together with \framac, as described in
Section~\ref{dynamic-make:comp-same-time}.
\end{important}

You may need to run \texttt{make depend} before running
\texttt{make}.

Which variables can be set and how they are used is explained in
Section~\ref{make:plugin}. Furthermore, Section~\ref{make:dynamic} explains
the specific features of \texttt{Makefile.dynamic}.

%% \subsection{Calling a Plug-in Specific Makefile from the \framac Makefile}
%% \label{dynamic-make:integrating}
%% [JS 2012/05/09] nobody does this from a while.

%% \begin{target}
%%   kernel-integrated plug-in developers using the \svn repository of \framac.
%% \end{target}

%% Even if you are writing a kernel-integrated plug-in, it is useful to have your
%% plug-in specific Makefile. For instance, it allows you to easily release your
%% plug-in independently of \framac. However, if your version of \framac is
%% changing frequently, it is useful to compile together \framac and your plug-in
%% without installing \framac each time. To reach this goal, you have to mix the
%% integration in \texttt{Makefile} described in Section~\ref{adv:make} and the
%% solution presented in this section: in the section ``Plug-ins'' of
%% \texttt{Makefile} you just have to set few variables before including your
%% plug-in specific Makefile

%% \begin{example} For compiling the plug-in
%%   \texttt{Ltl\_to\_acsl}\index{Ltl\_to\_acsl}, the following lines are added
%%   into \texttt{Makefile}.
%% \codeidx{PLUGIN\_ENABLE}
%% \codeidx{PLUGIN\_DIR}
%% \codeidx{PLUGIN\_DYNAMIC}
%% \codeidx{DISTRIB\_FILES}
%% \codeidx{@ENABLE\_$plugin$}
%% \begin{makefilecode}
%% PLUGIN_ENABLE	:=@ENABLE_LTL_TO_ACSL@
%% PLUGIN_DIR	:=src/ltl_to_acsl
%% DISTRIB_FILES += $(PLUGIN_DIR)/Makefile
%% include $(PLUGIN_DIR)/Makefile
%% \end{makefilecode}
%% \end{example}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Compiling \framac and external plug-ins at the same time}
\label{dynamic-make:comp-same-time}
\begin{target}
  plug-in developers using the \svn repository of \framac.
\end{target}
It is also possible to get a completely independent plug-in
recompiled and tested together with \framac's kernel. For that,
\framac must be aware of the existence of the plug-in. This can be
done in two ways:
\begin{itemize}
\item All sub-directories of the \verb+src/plugins+ directory in \framac
  sources which are not known to \framac's kernel are assumed to be external
  plug-ins.
\item One can use the \verb+--enable-external+ option of \texttt{configure},
  which takes as argument the path to the plug-in.
\end{itemize}

In the first case, the plug-in behaves as any other built-in plug-in:
\verb+autoconf+ run in \framac's main directory will take care of it
and it can be \verb+enabled+ or \verb+disabled+ in the same way as the
others. If the plug-in has its own \verb+configure.in+ or
\verb+configure.ac+ file, the configuration instructions contained
in it (in particular additional dependencies) will be read as well.

In the second case, the plug-in is added to the list of external
plug-ins at configure time. If the plug-in has its own \texttt{configure}, it
is run as well.

Provided it properly uses the variables set by \texttt{Makefile.dynamic},
the plug-in's \texttt{Makefile} does not require
specific adaptations depending on whether it is compiled together with
the kernel or with respect to an already-existing \framac installation.
It is however possible to check the compilation mode with the
\codeidx{FRAMAC\_INTERNAL}\texttt{FRAMAC\_INTERNAL} variable, which is set to
\texttt{yes} when compiling together with \framac kernel and to \texttt{no}
otherwise.

\codeidx{clean-install}
\codeidx{install}
In addition, if a plug-in wishes to install custom files to
\texttt{FRAMAC\_LIBDIR} through the \texttt{install::} target,
this target must depend on \texttt{clean-install}. 
Indeed, \framac's main \texttt{Makefile} removes
all existing files in this directory before performing a fresh installation,
in order to avoid potential interference with an obsolete (and usually
incompatible) module from a previous installation. Adding the dependency thus
ensures that the removal will take place before any new file has been installed
in the directory.

\begin{example} If a plug-in wants to install \texttt{external/my\_lib.cm*}
in addition to the normal plug-in files, it should use the following code:
\begin{makefilecode}
install:: clean-install
        $(PRINT_INSTALL) "My beautiful library"
        $(MKDIR) $(FRAMAC_LIBDIR)
        $(CP) external/my_lib.cm* $(FRAMAC_LIBDIR)
\end{makefilecode}
\end{example} %$ %Fix emacs highlighting


\section{Testing}\label{adv:ptests}\index{Test|bfit}

In this section, we present \ptests, a tool provided by \framac in order to
perform non-regression and unit tests.

\ptests\index{Ptests|bfit} runs the \framac toplevel on each specified test
(usually a \C file). Specific directives can be used for each
test. Each result of the execution is compared against the previously saved
result (called the \emph{oracle}\index{Oracle|bfit}). A test is successful if
and only if there is no difference. Note that the number of results is actually
twice the number of tests, because standard and error outputs are compared
separately.

First, Section~\ref{ptests:use} shows how to use \ptests. Next,
Section~\ref{ptests:configuration} introduces how to use predefined directives
to configure tests, while Section~\ref{ptests:alternative} explains
how to set up various testing goals for the same test base. Last,
Section~\ref{ptests:options} details \ptests' options, while
Section~\ref{ptests:directives} describes \ptests' directives.

\subsection{Using \ptests}\label{ptests:use}

If you're using a \texttt{Makefile} written following the principles given in
Section~\ref{adv:dynamic-make}, the simplest way of using \ptests is through
\texttt{make tests}, which is
roughly equivalent to
\begin{shell}
\$ time ./bin/ptests.opt
\end{shell}
or
\begin{shell}
\$ time ptests.opt
\end{shell}
depending on whether you're inside \framac's sources or compiling a plug-in 
against an already installed \framac distribution.

A specific target \texttt{\$(PLUGIN\_NAME)\_TESTS} runs the tests
of the plug-in. One can add new tests as dependencies of this target, as
sketched below. The default tests are run by the target
\texttt{\$(PLUGIN\_NAME)\_DEFAULT\_TESTS}.
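For instance, an additional test script could be hooked into this target as
follows (a sketch, assuming \texttt{PLUGIN\_NAME} is \texttt{Hello} and a
hypothetical script \texttt{tests/extra.sh}):
\begin{makefilecode}
# hypothetical extra tests hooked into the plug-in's test target
Hello_TESTS: Hello_extra_tests

Hello_extra_tests:
	./tests/extra.sh
\end{makefilecode}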

\texttt{ptests.opt} runs the tests contained in the sub-directories
of directory \texttt{tests}\codeidx{tests} that are specified in the
\ptests configuration file. This configuration file,
\texttt{tests/ptests\_config}, is automatically generated by \framac's
\texttt{Makefile} from the options you set in your plug-in's \texttt{Makefile}.
\ptests also accepts specific \emph{test suites}
as arguments. A test suite\index{Test!Suite|bfit} is either the name of a
sub-directory of directory \texttt{tests} or a filename (with its path relative
to the current directory).

\begin{example}
  If you want to test plug-in
  \texttt{sparecode}\index{Plug-in!Sparecode|see{Sparecode}}\index{Sparecode}
  and specific test \texttt{tests/pdg/variadic.c}, just run
\begin{shell}
\$ ./bin/ptests.opt sparecode tests/pdg/variadic.c
\end{shell}
which should display (if there are 7 tests in directory
\texttt{tests/sparecode})
\begin{shell}
% Dispatch finished, waiting for workers to complete
% Comparisons finished, waiting for diffs to complete
% Diffs finished. Summary:
Run = 8
Ok  = 16 of 16
\end{shell}
\end{example}

\ptests accepts different options which are used to customize test
sequences. These options are detailed in Section~\ref{ptests:options}.

\begin{example}
If the code of plug-in \texttt{plug-in} has changed, a typical sequence of tests
is the following one.
\begin{shell}
\$ ./bin/ptests.opt plug-in
\$ ./bin/ptests.opt -update plug-in
\$ make tests
\end{shell}
So we first run the tests suite corresponding to \texttt{plug-in} in order to
display what tests have been modified by the changes. After checking the
displayed differences, we validate the changes by updating the
oracles\index{Oracle}. Finally we run all the test suites\index{Test!Suite} in
order to ensure that the changes do not break anything else in \framac.
\end{example}
\begin{example}
For adding a new test, the typical sequence of commands is the following.
\begin{shell}
\$ ./bin/ptests.opt -show tests/plug-in/new_test.c
\$ ./bin/ptests.opt -update tests/plug-in/new_test.c
\$ make tests
\end{shell}
We first ask \ptests to print the output of the test on the command line, check
that it corresponds to what we expect, and then take it as the initial oracle.
If some changes have been made to the code in order to let \texttt{new\_test.c}
pass, we must of course launch the whole test suite and check that all
existing tests are alright.
\end{example}

\begin{important}
If you're creating a whole new test suite \texttt{suite},
don't forget to create the sub-directories \texttt{suite/result} and
\texttt{suite/oracle} where \ptests will store the current results and
the oracles for all the tests in \texttt{suite}.

\subsection{Configuration}\label{ptests:configuration}
\index{Test!Configuration|bfit}

In order to perform exactly the tests that you wish, some directives can be set
in three different places. We first indicate these places and then the possible
directives.

The places are:
\begin{itemize}
\item inside file \texttt{tests/test\_config};\codeidx{test\_config}
\item inside file \texttt{tests/$subdir$/test\_config} (for each sub-directory
  $subdir$ of \texttt{tests}); or
\item inside each test file, in a special comment of the
  form\index{Test!Header}
\begin{listing-nonumber}
/* run.config
   ... directives ...
*/
\end{listing-nonumber}
\end{itemize}

In each of the above cases, the configuration is given by a list of
directives\index{Test!Directive}. Each directive has to be on one line and to
have the form
\begin{listing-nonumber}
CONFIG_OPTION:value
\end{listing-nonumber}
There is exactly one directive per line. The different directives (\emph{i.e.}
possibilities for \texttt{CONFIG\_OPTION}) are detailed in
Section~\ref{ptests:directives}.

\begin{important}
Note that some specific configurations require dynamic linking, which
is not available on all platforms for native code. {\tt ptests} takes
care of reverting to bytecode when it detects that the {\tt OPT},
{\tt EXECNOW}, or \texttt{EXEC} options of a test require dynamic linking. This
occurs currently in the following cases:
\begin{itemize}
\item {\tt OPT} contains the option {\tt -load-script}
\item {\tt OPT} contains the option {\tt -load-module}
\item {\tt EXECNOW} and \texttt{EXEC} use {\tt make} to create a {\tt .cmxs}
\end{itemize}
\end{important}

\begin{important}
  \textbf{Concurrency issues:}
  tests using compiled modules ({\tt -load-script} or {\tt -load-module}) may
  lead to concurrency issues when the same module is used in different test
  files, or in different test cases within the same file. One way to avoid
  issues is to serialize tests via \texttt{EXECNOW} directives, e.g. by using
  \texttt{make} to compile a \texttt{.cmxs} from the \texttt{.ml} file, and
  then loading the \texttt{.cmxs} in the test cases, as in the example below.

  \begin{listing-nonumber}
    EXECNOW: make -s @PTEST_DIR@/@PTEST_NAME@.cmxs
    STDOPT: #"-load-module @PTEST_DIR@/@PTEST_NAME.cmxs" ...
    STDOPT: #"-load-module @PTEST_DIR@/@PTEST_NAME.cmxs" ...
  \end{listing-nonumber}

  In addition, if the same script {\tt tests/suite/script.ml}
  is shared by several test files, the {\tt EXECNOW} directive should be put
  into {\tt tests/suite/test\_config}.

\end{important}

\begin{example}
  Test \texttt{tests/sparecode/calls.c} declares the following directives.
  \sscodeidx{Test}{Directive}{OPT}
\begin{listing-nonumber}
/* run.config
   OPT: -sparecode-analysis
   OPT: -slicing-level 2 -slice-return main -slice-print
*/
\end{listing-nonumber}
These directives state that we want to test sparecode and slicing analyses on
this file. Thus running the following instruction executes two test cases.
\begin{shell}
\$ ./bin/ptests.opt tests/sparecode/calls.c
% Dispatch finished, waiting for workers to complete
% Comparisons finished, waiting for diffs to complete
% Diffs finished. Summary:
Run = 2
Ok  = 4 of 4
\end{shell}
\end{example}

\subsection{Alternative Testing}\label{ptests:alternative}

You may want to set up different testing goals for the same test
base. Common cases include:
\begin{itemize}
\item checking the result of an analysis with or without an option;
\item checking a preliminary result of an analysis, in particular if the
      complete analysis is costly;
\item checking separately different results of an analysis.
\end{itemize}

This is possible with option \texttt{-config} of \ptests, which takes as
argument the name of a special test configuration, as in
\begin{shell}
\$ ./bin/ptests.opt -config <special_name> plug-in
\end{shell}

Then, the directives for this test can be found:
\begin{itemize}
\item inside file \texttt{tests/test\_config\_<special\_name>};
\item inside file \texttt{tests/$subdir$/test\_config\_<special\_name>} (for
  each sub-directory $subdir$ of \texttt{tests}); or
\item inside each test file, in a special comment of the
  form\index{Test!Header}
\begin{listing-nonumber}
/* run.config_<special_name>
   ... directives ...
*/
\end{listing-nonumber}
Multiple configurations may share the same set of directives:
\begin{listing-nonumber}
/* run.config, run.config_<special_name> [, ...]
   ... common directives ...
*/
\end{listing-nonumber}
The following wildcard is also supported, and accepts any
configuration: \lstinline+/* run.config* +.
\end{itemize}

All operations for this test configuration should take the option
\texttt{-config} as argument, as in
\begin{shell}
\$ ./bin/ptests.opt -update -config <special_name> plug-in
\end{shell}

\begin{important}
In addition, option \texttt{-config <special\_name>} requires the subdirectories
\texttt{result\_<special\_name>} and \texttt{oracle\_<special\_name>} to store
the results and oracles of the specific configuration.
\end{important}

\subsection{Detailed options}\label{ptests:options}

Figure~\ref{fig:ptests-options} details the options of \ptests.
\begin{figure}[ht]
\begin{center}
\begin{tabular}{|c|c|p{6cm}|c|}
\hline
\textbf{kind} & \textbf{Name} & \textbf{Specification} & \textbf{Default}\\
\hline
\hline \multirow{5}{16mm}{\centering{Toplevel}}
& \texttt{-add-options} &
  Additional options appended to the normal toplevel command line & \\
& \texttt{-add-options-pre} &
   Additional options prepended to the normal toplevel command line & \\
& \texttt{-byte} & Use bytecode toplevel & no \\
& \texttt{-opt} & Use native toplevel & yes \\
& \texttt{-gui} & Use GUI instead of console-based toplevel & no \\
\hline \multirow{4}{16mm}{\centering{Behavior}}
& \texttt{-run} & Delete current results; run tests and examine results & yes
\\
& \texttt{-examine} & Only examine current results; do not run tests & no \\
& \texttt{-show} & Run tests and show results, but do not examine
                   them; implies \texttt{-byte} &
no \\
& \texttt{-update} & Take current results as new oracles\index{Oracle}; do not
run tests & no \\
\hline \multirow{7}{16mm}{\centering{Misc.}}
& \texttt{-exclude suite} & Do not consider the given \texttt{suite} &
\\
& \texttt{-diff cmd} & Use \texttt{cmd} to show differences between
results and oracles when examining results
& \texttt{diff -u} \\
& \texttt{-cmp cmd} & Use \texttt{cmd} to compare results against
oracles when examining results
& \texttt{cmp -s} \\
& \texttt{-use-diff-as-cmp} & Use the same command for diff and cmp &
no \\
& \texttt{-j n} & Set level of parallelism to \texttt{n} & 4 \\
& \texttt{-v} & Increase verbosity (up to twice) & 0 \\
& \texttt{-help} & Display help & no \\
\hline
\end{tabular}
\end{center}
\caption{\ptests options.}\label{fig:ptests-options}
\end{figure}

The commands provided through the \texttt{-diff} and \texttt{-cmp}
options play two related but distinct roles. \texttt{cmp} is always
used for each test (in fact, it is used twice: once for the standard output
and once for the error output). Only its exit code is taken into
account by \ptests; the output of \texttt{cmp} is discarded.
An exit code of \texttt{1} means that the two files have
differences. The two files will then be analyzed by \texttt{diff},
whose role is to show the differences between the files. An exit code
of \texttt{0} means that the two files are identical; thus, they won't
be processed by \texttt{diff}. An exit code of \texttt{2} indicates an
error during the comparison (for instance because the corresponding
oracle does not exist). Any other exit code results in a fatal
error. It is possible to use the same command for both \texttt{cmp}
and \texttt{diff} with the \texttt{-use-diff-as-cmp} option, which
will use as \texttt{cmp} command the command used for \texttt{diff}.
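For instance, to re-examine the current results with a unified diff that
ignores whitespace changes, one could run (a hypothetical invocation):
\begin{shell}
\$ ./bin/ptests.opt -examine -diff "diff -u -b" plug-in
\end{shell}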

The \texttt{-exclude} option can take as argument a whole suite or an
individual test. It can be used with any behavior.

The \texttt{-gui} option will launch Frama-C's GUI instead of the console-based
toplevel. It can be combined with \texttt{-byte} to launch the bytecode GUI. 
In this mode the default level of parallelism is reduced to 1.

\subsection{Detailed directives}\label{ptests:directives}

Figure~\ref{fig:test-directives} shows all the directives that can be used in
the configuration header of a test (or a test suite).
\begin{figure}[ht]
\begin{center}
\begin{tabular}{|c|c|p{5cm}|l|}
\hline
\textbf{Kind} & \textbf{Name} & \textbf{Specification} & \textbf{default}\\
\hline
\hline \multirow{8}{23mm}{\centering{Command}}
& \texttt{CMD}\sscodeidxdef{Test}{Directive}{CMD}
& Program to run
& \texttt{./bin/toplevel.opt}
\\
& \texttt{OPT}\sscodeidxdef{Test}{Directive}{OPT}
& Options given to the program
& \texttt{-val -out -input -deps}
\\
& \texttt{STDOPT}\sscodeidxdef{Test}{Directive}{STDOPT}
& Add and remove options from the default set
& \textit{None}
\\
& \texttt{LOG}\sscodeidxdef{Test}{Directive}{LOG}
& Add an additional file to compare against an oracle
& \textit{None}
\\ 
& \texttt{EXECNOW}\sscodeidxdef{Test}{Directive}{EXECNOW}
& Run a command before the following commands. When specified in a configuration
file, run it only once.
& \textit{None}
\\
& \texttt{EXEC}\sscodeidxdef{Test}{Directive}{EXEC}
& Similar to \texttt{EXECNOW}, but run it once per testing file.
& \textit{None}
\\
& \texttt{MACRO}\sscodeidxdef{Test}{Directive}{MACRO}
& Define a new macro
& \textit{None}
\\
& \texttt{FILTER}\sscodeidxdef{Test}{Directive}{FILTER}
& Command used to filter results
& \textit{None}
\\
\hline \multirow{2}{23mm}{\centering{Test suite}}
& \texttt{DONTRUN}\sscodeidxdef{Test}{Directive}{DONTRUN}
& Do not execute this test
& \textit{None}
\\
& \texttt{FILEREG}\sscodeidxdef{Test}{Directive}{FILEREG}
& Select the files to test
& \texttt{.*\bss.\bss(c|i\bss)}
\\
\hline \multirow{2}{23mm}{\centering{Miscellaneous}}
& \texttt{COMMENT}\sscodeidxdef{Test}{Directive}{COMMENT}
& Comment in the configuration
& \textit{None}
\\
& \texttt{GCC}\sscodeidxdef{Test}{Directive}{GCC}
& Unused (compatibility only)
& \textit{None}
\\
\hline
\end{tabular}
\end{center}
\caption{Directives in configuration headers of test
  files.}\label{fig:test-directives}
\end{figure}
Any directive can identify a file using a relative path.
The default directory considered for \texttt{.} is always the parent
directory of directory \texttt{tests}\codeidx{tests}. The
\texttt{DONTRUN} directive does not need to have any content, but it
is useful to provide an explanation of why the test should not be run
({\it e.g.} test of a feature that is currently being developed and not fully
operational yet). If a test file is explicitly given on the command
line of \ptests, it is always executed, regardless of the presence of
a \texttt{DONTRUN} directive.

As said in Section~\ref{ptests:configuration}, these directives can be
found in different places:
\begin{enumerate}
\item default value of the directive (as specified in
  Fig.~\ref{fig:test-directives});
\item inside file \texttt{tests/test\_config};\codeidx{test\_config}
\item inside file \texttt{tests/$subdir$/test\_config} (for each sub-directory
  $subdir$ of \texttt{tests}); or
\item inside each test file
\end{enumerate}
As presented in Section~\ref{ptests:alternative}, alternative directives for
a test configuration \texttt{<special\_name>} can be found in slightly
different places:
\begin{itemize}
\item default value of the directive (as specified in
  Fig.~\ref{fig:test-directives});
\item inside file \texttt{tests/test\_config\_<special\_name>};
\item inside file \texttt{tests/$subdir$/test\_config\_<special\_name>} (for each sub-directory
  $subdir$ of \texttt{tests}); or
\item inside each test file.
\end{itemize}
For a given test \texttt{tests/suite/test.c}, each existing file in
the sequence above is read in order and defines a configuration level
(the default configuration level always exists).

\begin{itemize}
\item
  \texttt{CMD} allows changing the command that is used for the
  following \texttt{OPT} directives (until a new \texttt{CMD}
  directive is found). No new test case is generated
  if there is no further \texttt{OPT} directive. At a given
  configuration level, the default value for directive
  \texttt{CMD}\sscodeidxdef{Test}{Directive}{CMD} is the last
  \texttt{CMD} directive of the preceding configuration level.
\item \texttt{LOG} adds a file to be compared against an oracle in the
  next \texttt{OPT} directive. Several files can be monitored from a single
  \texttt{OPT} directive, through several \texttt{LOG} directives. 
  These files must be generated in the result
  directory of the corresponding suite (and potentially alternative
  configuration).
  After an \texttt{OPT} or \texttt{STDOPT} directive is encountered,
  the set of additional \texttt{LOG} files is reset to its default.
\item If there are several \texttt{OPT} directives in the same
  configuration level, they correspond to different test cases. The
  \texttt{OPT} directive(s) of a given configuration level replace(s)
  the ones of the preceding level.
\item The \texttt{STDOPT}\sscodeidxdef{Test}{Directive}{STDOPT}
  directive takes as its default set of options
  the last \texttt{OPT} directive(s) of the preceding configuration
  level. If the preceding configuration level contains several
  \texttt{OPT} directives, hence several test cases, \texttt{STDOPT}
  is applied to each of them, leading to the same number of test cases.
  The syntax for this directive is the following.
  \begin{code}
    STDOPT: [[+#-]"opt" ...]
  \end{code}
Options are always given between quotes. An option following a \texttt{+}
(resp. \texttt{\#}) is added to the end (resp. start) of the current set of
options, while an option following a \texttt{-} is removed from it. The
directive can be empty (meaning that the corresponding test will use the
standard set of options). As with \texttt{OPT}, each \texttt{STDOPT}
corresponds to a different (set of) test case(s). \texttt{LOG} directives
preceding an \texttt{STDOPT} are taken into account. A small example of a
header using \texttt{STDOPT} and \texttt{MACRO} is sketched after this list.
\item The syntax for directives
  \texttt{EXECNOW}\sscodeidxdef{Test}{Directive}{EXECNOW}
  and \texttt{EXEC}\sscodeidxdef{Test}{Directive}{EXEC} is the following.
  \begin{code}
    EXECNOW: [ [ LOG file | BIN file ] ... ] cmd
  \end{code}
or
  \begin{code}
    EXEC: [ [ LOG file | BIN file ] ... ] cmd
  \end{code}
  Files after \texttt{LOG} are log files generated by command \texttt{cmd} and
  compared to oracles, whereas files after \texttt{BIN} are binary files also
  generated by \texttt{cmd} but not compared to oracles. The full access paths
  to these files have to be specified only in \texttt{cmd}. All the commands
  described by directives \texttt{EXECNOW} or \texttt{EXEC} are executed in
  order and
  before running any of the other directives. If the execution of
  one \texttt{EXECNOW} or \texttt{EXEC} directive fails ({\it i.e.} has a
  non-zero return
  code), the remaining actions are not executed.
  \texttt{EXECNOW} or \texttt{EXEC}
  directives from a given level are added to the directives of the following
  levels.
  The distinction between \texttt{EXECNOW} and \texttt{EXEC} only matters when
  the command is put in a test configuration file:
  \texttt{EXECNOW} executes the command only once for the test suite, while
  \texttt{EXEC} executes it once per test file of the test suite.
\item The \texttt{MACRO}\sscodeidxdef{Test}{Directive}{MACRO} directive
  has the following syntax:
  \begin{code}
    MACRO: macro-name content
  \end{code}
  where \texttt{macro-name} is any sequence of characters containing neither
  a blank nor an \texttt{@}, and \texttt{content} extends until the end of the
  line. Once such a directive has been encountered, each occurrence of
  \texttt{@macro-name@} in a \texttt{CMD}, \texttt{LOG}, \texttt{OPT},
  \texttt{STDOPT} or \texttt{EXECNOW} directive at this configuration level
  or in any level below it will be replaced by \texttt{content}. Existing 
  pre-defined macros are listed in section~\ref{sec:ptests-macros}.
\item The \texttt{FILEREG}\sscodeidxdef{Test}{Directive}{FILEREG}
  directive contains a regular expression indicating which files in
  the directory containing the current test suite are actually part of
  the suite. This directive is only
  usable in a \texttt{test\_config}\codeidx{test\_config}
  configuration file.
\end{itemize}
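As announced above, here is a sketch of a test header combining \texttt{MACRO}
and \texttt{STDOPT} (the option names are hypothetical):
\begin{listing-nonumber}
/* run.config
   MACRO: SLEVEL -slevel 10
   STDOPT: +"@SLEVEL@"
   STDOPT: +"@SLEVEL@" -"-val"
*/
\end{listing-nonumber}
This header defines two test cases: both append \texttt{-slevel 10} to the
default set of options, and the second one additionally removes \texttt{-val}
from it.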

\begin{important}
  \textbf{Summary: ordering of test executions}

  There is no total ordering between the tests in a test file header.
  The only guaranteed order between test executions is the following:

  \begin{enumerate}
  \item \texttt{EXEC} and \texttt{EXECNOW} commands are executed sequentially,
    from top to bottom.
  \item Then, all \texttt{OPT}/\texttt{STDOPT} commands execute in an
    unspecified order (possibly simultaneously).
  \end{enumerate}

  A consequence of this ordering is that, if you need a test to produce output
  that will be consumed by another test, the producer \emph{must} be defined via
  \texttt{EXEC}/\texttt{EXECNOW} (e.g. using \texttt{@frama-c@} and explicitly
  giving its arguments), while the consumer can be either in a later
  \texttt{EXEC}/\texttt{EXECNOW}, or in an \texttt{OPT}/\texttt{STDOPT}
  directive.
\end{important}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Plug-in General Services}\label{adv:plugin-services}

Module \texttt{Plugin}\codeidxdef{Plugin} provides access to some general
services available for all plug-ins. The goal of this module is twofold. First,
it helps developers use general \framac services. Second, it provides to the
end-user a set of features common to all plug-ins. To access these services,
you have to apply the functor
\texttt{Plugin.Register}\scodeidx{Plugin}{Register}.

\begin{important}
Each plug-in must apply this functor exactly once.
\end{important}

\begin{example}
Here is how the plug-in \texttt{From} applies the functor
\texttt{Plugin.Register} for its own use.
\begin{ocamlcode}
include Plugin.Register
  (struct
     let name = "from analysis"
     let shortname = "from"
     let help = "functional dependencies"
   end)
\end{ocamlcode}
\end{example}

Applying this functor mainly provides two different services. First, it gives
access to functions for printing messages in a \framac-compliant way (see
Section~\ref{adv:log}). Second, it allows defining plug-in specific parameters
available as options on the \framac command line to the end-user (see
Section~\ref{adv:cmdline})\index{Command Line!Option}.
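As an illustration of the second service, here is a sketch of the declaration
of a boolean parameter (the option name is hypothetical; see
Section~\ref{adv:cmdline} for the actual functors and their signatures):
\begin{ocamlcode}
(* assuming [Self] is the module resulting from [Plugin.Register] *)
module Enabled =
  Self.False
    (struct
       let option_name = "-foo-run" (* hypothetical option *)
       let help = "run the foo analysis"
     end)

(* [Enabled.get ()] tells whether -foo-run was set on the command line *)
\end{ocamlcode}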

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Logging Services}\label{adv:log}
\index{Logging|see{Messages}}
\index{Messages}

Displaying the results of plug-in computations to users, warning them of the
hypotheses made by the static analyzers, reporting incorrect inputs: all
these tasks are easy to think about, but turn out to be difficult to
handle in a readable way. As soon as your plug-in is registered (see
Section~\ref{adv:plugin-services} above), though, you automatically
benefit from many logging facilities provided by the kernel. What is more,
when logging through these services, messages from your plug-in
combine with other messages from other plug-ins, in a consistent
and user-friendly way.

As a general rule, you should \emph{never} write to standard output
and error channels through \ocaml standard libraries. For instance,
you should never use \texttt{Pervasives.stdout} and
\texttt{Pervasives.stderr} channels, nor \texttt{Format.printf}-like
routines.

Instead, you should use \texttt{Format.fprintf} to implement
pretty-printers for your own complex data, and only the
\texttt{printf}-like routines of
\texttt{Log.Messages}\scodeidx{Log}{Messages} to display messages to
the user. All these routines are immediately available from your
plug-in general services.

\begin{example}
A minimal example of a plug-in using the logging services:
\scodeidx{Plugin}{Register}
\sscodeidx{Db}{Main}{extend}
\begin{ocamlcode}
module Self = Plugin.Register
  (struct
     let name = "foo plugin"
     let shortname = "foo"
     let help = "illustration of logging services"
   end)

let pp_dg out n =
  Format.fprintf out
    "you have at least debug %d" n

let run () =
  Self.result "Hello, this is Foo Logs !";
  Self.debug ~level:0 "Try higher debug levels (%a)" pp_dg 0;
  Self.debug ~level:1 "If you read this, %a." pp_dg 1;
  Self.debug ~level:3 "If you read this, %a." pp_dg 3

let () = Db.Main.extend run
\end{ocamlcode}
\end{example}

Running this example, you should see:
\begin{shell}
\$ frama-c -foo-debug 2
[foo] Hello, this is Foo Logs !
[foo] Try higher debug levels (you have at least debug 0).
[foo] If you read this, you have at least debug 1.
\end{shell}

Notice that your plug-in automatically benefits from its own debug
command line parameter, and that messages are automatically prefixed
with the name of the plug-in. We now get into more details for an
advanced usage of logging services.

\subsection{From \texttt{printf} to \texttt{Log}}

Below is a simple example of how to migrate \texttt{printf}-based code
towards being \texttt{Log}-compliant. The original code, extracted
from the \textsf{Occurrence} plug-in in the \framac-\textsf{Lithium}
version, is as follows:

\begin{ocamlcode}
let print_one v l =
  Format.printf "variable %s (%d):@\n" v.vname v.vid;
  List.iter
    (fun (ki, lv) ->
       Format.printf "  sid %a: %a@\n" d_ki ki d_lval lv) 
    l

let print_all () =
  compute ();
  Occurrences.iter print_one
\end{ocamlcode}

The transformation is straightforward. First you add to all your
pretty-printing functions an additional \texttt{Format.formatter}
parameter, and you call \texttt{fprintf} instead of \texttt{printf}:

\begin{ocamlcode}
let print_one fmt v l =
  Format.fprintf fmt "variable %s (%d):@\n" v.vname v.vid;
  List.iter
    (fun (ki, lv) ->
       Format.fprintf fmt "  sid %a: %a@\n" d_ki ki d_lval lv) 
    l
\end{ocamlcode}

Then, you delegate top-level calls to \texttt{printf} to an
appropriate logging routine, with a formatting string containing the
necessary \lstinline{"%t"} and \lstinline{"%a"} formatters:

\begin{ocamlcode}
let print_all () =
  compute ();
  result "%t" (fun fmt -> Occurrences.iter (print_one fmt))
\end{ocamlcode}

%That's all!

\subsection{Log Quick Reference}

The logging routines for your plug-in consist of an implementation of
the \lstinline{Log.Messages} interface, which is included in the
\lstinline{Plugin.S} interface returned by the registration of your
plug-in. The main routines of interest are:

\newcommand{\ppcmd}[2]%
{{\small\tt {\bf #1}#2}}

\newcommand{\ppcmdprintf}[2]%
{{\small{\tt {\bf #1}#2} $<${\it options}$>$ {\tt "..."}}}

\newcommand{\logprintf}[2]{%
\item[\ppcmdprintf{#1}{#2}]%
\sscodeidxdef{Log}{Messages}{#1}\mbox{}\\%
}
\newcommand{\logprintfvariant}[2]{%
\item[\begin{tabular}{l@{ }l}%
\ppcmdprintf{#1}{}\\\ppcmdprintf{#2}{}%
\end{tabular}]%
\sscodeidx{Log}{Messages}{#1}\sscodeidx{Log}{Messages}{#2}\mbox{}\\%
}

\newcommand{\logroutine}[2]{%
\item[\ppcmd{#1}{#2}]%
\scodeidx{Log}{#1}\mbox{}\\%
}

\scodeidx{Log}{Messages}
\begin{description}

%%item
  \logprintf{result}{}%
  Output most of your messages with this routine. You may specify
  the \lstinline{~level:n} option to discard too detailed messages in
  conjunction with the \emph{verbose} command line option. The default
  level is \lstinline{1}.

%%item
  \logprintf{feedback}{}%
  Reserved for \emph{short} messages that give feedback about the
  progression of long computations. Typically, entering a function
  body or iterating during a fixpoint computation. The level option can
  be used as for \lstinline{result}.

%%item
  \logprintf{debug}{}%
  To be used for plug-in development messages and internal error
  diagnosis. You may specify \lstinline{~level:n} option to discard
  too detailed messages in conjunction with the \emph{debug} command
  line option. The default message level is~\lstinline{1}, and the
  default debugging level is~\lstinline{0}. Hence, without any option,
  \lstinline{debug} discards all its messages.

%%item
  \logprintf{warning}{}%
  For reporting to the user important information about the
  validity of the analysis performed by your plug-in; for instance, if
  you locally assume the absence of arithmetic overflow on a given statement,
  \emph{etc}. Typical options include \lstinline{~current:true} to
  localize the message at the current source location.

%%item
  \logprintfvariant{error}{abort}%
  Use these routines for reporting an error in the user's
  inputs. They can be used for invalid parameters, for instance. They
  should \emph{not} be used for a not-yet-implemented feature,
  however.

  The \lstinline{abort} routine is a variant that raises an exception
  and thus immediately aborts the computation\footnote{The raised
  exception is not supposed to be caught by anything else than the
  main entry point of Frama-C.}.
  If you use \lstinline{error},
  execution will continue until the end of current stage or current
  group of the running phase (see section~\ref{adv:init}).

%%item
  \logprintfvariant{failure}{fatal}%
  Use these routines for reporting to the user that your plug-in is
  now in an inconsistent state or cannot continue its
  computation. Typically, you have just discovered a bug in your plug-in!

  The \lstinline{fatal} routine is a variant that raises an exception.
  \lstinline{failure} has a behavior similar to \lstinline{error} above,
  except that it denotes an internal error instead of a user error.

%%item
  \logprintf{verify}{ (condition)}%
  First, the routine evaluates the condition and the formatting
  arguments; then, it discards the message if the condition holds and
  displays it otherwise. Finally, it returns the condition value.

  A typical usage is for example:
  \begin{ocamlcode}
    assert (verify (x>0) "Expected a positive value (%d)" x)
  \end{ocamlcode}

\end{description}
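
As announced, here is a minimal usage sketch combining these routines. It
assumes that \texttt{Self} is the module returned by \texttt{Plugin.Register}
for your plug-in; the messages themselves are purely illustrative.
\begin{ocamlcode}
(* Sketch only: [Self] is assumed to be obtained via Plugin.Register. *)
let run () =
  Self.feedback "analysis started";
  Self.debug ~level:2 "internal counter: %d" 42;
  Self.warning ~current:true "assuming no arithmetic overflow here";
  Self.result "analysis done: %d alarm(s) found" 0
\end{ocamlcode}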

\subsection{Logging Routine Options}

Logging routines have optional parameters to modify their general
behavior; hence their rather involved types in \texttt{Log.mli}.

\paragraph{Level Option.} A minimal level of verbosity or debugging can be
specified for the message to be emitted. For the result and feedback
channels, the verbosity level is used; for the debug channel, the
debugging level is used.
\begin{itemize}
\item[] \lstinline{~level:$n$} minimal level required is $n$.
\end{itemize}

\paragraph{Category Option.} Debug, result, and feedback
output can be associated with a debugging key with the optional argument
\lstinline{~dkey}, which takes an argument of abstract type
\lstinline|category|.
Each category must be registered through the
\lstinline|register_category| function. You can define subcategories
by putting colons in the registered name. For instance, \lstinline|a:b:c| defines
a subcategory \lstinline|c| of \lstinline|a:b|, itself a subcategory of
\lstinline|a|. Users can then choose to
output debugging messages belonging to a given category (and its subcategories)
with the \lstinline{-plugin-msg-key <category>} option.
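
For instance, here is a hedged sketch of category registration and use, again
assuming a plug-in module \texttt{Self} obtained through
\texttt{Plugin.Register}; the category name is illustrative.
\begin{ocamlcode}
(* Sketch: register a subcategory, then emit a debugging message guarded
   by it; users enable it with -plugin-msg-key a:b (or simply a). *)
let dkey_ab = Self.register_category "a:b"
let () = Self.debug ~dkey:dkey_ab "detailed message about a:b"
\end{ocamlcode}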

In order to decide whether a message should be output, both level and category
options are examined:
\begin{itemize}
\item if neither \lstinline|~level| nor \lstinline|~dkey| is provided, the
  effect is the same as having a level of \lstinline|1| and no category;
\item if only \lstinline|~level| is provided, the message is output if the
  corresponding verbosity or debugging level is sufficient;
\item if only \lstinline|~dkey| is used, the message is output if the
  corresponding category is in use
  (even if the verbosity or debugging level is \lstinline|0|);
\item if both \lstinline|~level| and \lstinline|~dkey| are present,
  the message is output if the two conditions above
  (sufficient verbosity or debugging level and appropriate category in use)
  hold. As a rule of thumb, you should refrain from using both mechanisms to
  control the output of a message. If some messages of a category do not have
  the same importance, use subcategories to give the user a finer control on
  what they want to see on the output.
\end{itemize}

\paragraph{Warning Category Option.} Warning output can also
be associated with a category, via the \lstinline|~wkey| optional argument
that takes a value of the abstract type
\sscodeidxdef{Log}{Messages}{warn\_category}
\lstinline|warn_category|. Warning categories
are distinct from plain categories, and must be registered with the
\sscodeidxdef{Log}{Messages}{register\_warn\_category}
\lstinline|register_warn_category| function. As explained in the user
manual~\cite{userman}, each category can be associated with a status that
controls what will happen when a warning is triggered, from completely ignoring
it to aborting execution. The default is to emit the warning, but this can
be changed by using the
\sscodeidxdef{Log}{Messages}{set\_warn\_status}
\lstinline|set_warn_status| function.
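
A possible sketch, under the same assumptions as before (the category name and
the chosen status are illustrative):
\begin{ocamlcode}
(* Sketch: register a warning category, make it an error by default,
   and emit a warning guarded by it. *)
let wkey_overflow = Self.register_warn_category "overflow"
let () = Self.set_warn_status wkey_overflow Log.Werror
let warn_overflow s = Self.warning ~wkey:wkey_overflow "possible overflow in %s" s
\end{ocamlcode}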

\paragraph{Source Options.} By default, a message is not localized. You may
specify a source location, either specifically or by using the current
location of an \texttt{AST} visitor.
\begin{itemize}
\item[] \lstinline{~source:$s$} use the source location $s$ (see \texttt{Log.mli})
\item[] \lstinline{~current:true} use the current source location
  managed by \texttt{Cil.CurrentLoc}.
\end{itemize}

\paragraph{Emission Options.} By default, a message is echoed to the
user \emph{after} its construction, and it is sent to registered
callbacks when emitted. See Section~\ref{adv:log:events} below for
more details on how to globally modify such a behavior. During the
message construction, you can locally modify the emission process with
the following options:
\begin{itemize}
\item[] \lstinline{~emitwith:$f$} suppresses the echo and sends the
  emitted event \emph{only} to the callback function $f$. Listeners
  are not fired at all.
\item[] \lstinline{~once:true} finally discards the message if the
  same one was already emitted before with the \lstinline{~once}
  option.
\end{itemize}

\paragraph{Append Option.} All logging routines have the
\lstinline{~append:$f$} optional parameter, where $f$ is a function
taking a \lstinline{Format.formatter} as parameter and returning
unit. This function $f$ is invoked to append some text to the logged
message. Such a continuation-passing style is sometimes necessary for
defining new polymorphic formatting functions. It has been introduced
for the same purpose as the standard \lstinline{Format.kfprintf}-like
functions.
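
For example, a short sketch of \lstinline{~append} usage (with \texttt{Self}
as before):
\begin{ocamlcode}
(* Sketch: append a trailing note to the formatted message. *)
let report n =
  Self.result
    ~append:(fun fmt -> Format.fprintf fmt "@\n(see the reference manual)")
    "found %d issue(s)" n
\end{ocamlcode}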

\subsection{Advanced Logging Services}

\subsubsection{Message Emission}
\label{adv:log:events}

During message construction, the message content is echoed in the
terminal. This echo may be delayed until message completion when
\lstinline{~once} has been used. Upon message
completion, the message is \emph{emitted} and sent to all globally
registered hook functions, unless the
\lstinline{~emitwith} option has been used.

\scodeidx{Log}{add\_listener}
\scodeidx{Log}{set\_echo}
To interact with this general procedure, the plug-in developer can use
the following functions defined in module \texttt{Log}:

\begin{ocamlcode}
val set_echo:     ?plugin:string -> ?kinds:kind list -> bool -> unit
val add_listener: ?plugin:string -> ?kinds:kind list -> (event -> unit) -> unit
\end{ocamlcode}
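
For instance, a hedged sketch that counts the warnings emitted by any plug-in:
\begin{ocamlcode}
(* Sketch: listen to all Warning events, whatever their plug-in. *)
let warnings = ref 0
let () = Log.add_listener ~kinds:[ Log.Warning ] (fun _evt -> incr warnings)
\end{ocamlcode}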

\subsubsection{Continuations}

The logging routines take as argument a (polymorphic) formatting
string, followed by the formatting parameters, and finally return
unit. It is also possible to catch the generated message and pass
it to a continuation that returns a value other than unit.

For this purpose, you must use the \texttt{with\_<log>} routine
variants. These routines take a continuation $f$ as an additional
parameter. After emitting the corresponding message in the normal
way, the message is passed to the continuation $f$. Hence, $f$ has
type $\mathit{event}\rightarrow\alpha$, and the log routine returns
$\alpha$.

For instance, you typically use the following code fragment to return
a degenerated value while emitting a warning:
\begin{ocamlcode}
  let rec fact n =
    if n > 12 then
      with_warning (fun _ -> 0) "Overflow for %d, return 0 instead" n
    else if n <= 1 then 1 else n * fact (n-1)
\end{ocamlcode}

\subsubsection{Generic Routines}

The \lstinline{Log.Messages} interface provides two generic routines
that can be used instead of the basic ones:

\begin{description}
%%item
  \logprintf{log}{ ?kind ?verbose ?debug}%
  Emits a message with the given kind, when the verbosity and/or
  debugging level are sufficient.

%%item
  \logprintf{with\_log}{ f ?kind}%
  Emits a message like \texttt{log}, then passes the generated
  message to the continuation $f$ and returns its result.
\end{description}

The default kind is \lstinline{Result}, but all the other kinds of
messages can be specified. For verbosity and debugging levels, the
message is emitted when:

\begin{tabular}{ll}
  \lstinline|log "..."| & verbosity is at least $1$ \\
  \lstinline|log ~verbose:$n$| & verbosity is at least $n$ \\
  \lstinline|log ~debug:$n$| & debugging is at least $n$ \\
  \lstinline|log ~verbose:$v$ ~debug:$d$|
  & \emph{either} verbosity is at least $v$ \\
  & \emph{or} debugging is at least $d$.
\end{tabular}
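
For example (still assuming \texttt{Self} comes from \texttt{Plugin.Register}):
\begin{ocamlcode}
(* Sketch: the generic [log] routine with an explicit kind and levels. *)
let () =
  Self.log ~kind:Log.Warning "a warning through the generic routine";
  Self.log ~verbose:2 ~debug:1 "shown if verbosity >= 2 or debugging >= 1"
\end{ocamlcode}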

\subsubsection{Channel Management}

The logging services are built upon \emph{channels}, which are
basically buffered formatters to standard output, extended with
locking, delayed echo, and notification services.

A very convenient safety feature of the logging services is that recursive
calls \emph{are} protected: a message is only echoed upon termination, and a
channel buffer is stacked only when necessary, in order to preserve memory.

Services provided at plug-in registration are convenient shortcuts to
the low-level logging services on channels.

Basically, \emph{channels} ensure that message emissions do not interfere
with one another during echo on standard output; hence direct access
to \lstinline{Pervasives.stdout} is forbidden. The \lstinline{Log}
interface allows you to create such channels for your own purposes,
in addition to the one automatically created for your plug-in.

\begin{description}
%%item
  \logroutine{new\_channel}{ {\rm\it name}}%
  This creates a new channel. There is only one channel \emph{per}
  name, and the function returns the existing one if
  any. Plug-in channels are registered under their short-name, and
  the kernel channel is registered under \lstinline{Log.kernel_channel_name}.

%%item
  \logroutine{log\_channel}{ channel ?kind ?prefix}%
  This routine is similar to the \lstinline{log} one.

%%item
  \logroutine{with\_log\_channel}{ channel f ?kind ?prefix}%
  This routine is similar to the \lstinline{with_log} one.
\end{description}

With both logging routines, you may specify a prefix to be used during
echo. The available switches are:
\begin{itemize}
\item[] \lstinline|Label $t$|: use the string $t$ as a prefix for the
  first echoed line of text, then use an indentation of the same length
  for the subsequent lines.
\item[] \lstinline|Prefix $t$|: use the string $t$ as a prefix for all
  lines of text.
\item[] \lstinline|Indent $n$|: use an indentation of $n$ spaces for
  all lines of text.
\end{itemize}

When left unspecified, the prefix is computed from the message kind
and the channel name, like for plug-ins.
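
As an illustration, here is a hedged sketch of a dedicated channel; the channel
name and the prefix are arbitrary.
\begin{ocamlcode}
(* Sketch: create a channel and echo a message with a fixed prefix. *)
let chan = Log.new_channel "my-plugin-aux"
let () = Log.log_channel chan ~prefix:(Log.Prefix "[aux] ") "hello %s" "world"
\end{ocamlcode}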

\subsubsection{Output Management}

It is possible to ask \lstinline{Log} to redirect its output to
another channel:
\begin{description}
  %% item
  \logroutine{set\_output}{ out flush}%
  The parameters are the same as those of
  \lstinline{Format.make_formatter}: \lstinline{out} outputs a
  (sub)string and \lstinline{flush} actually writes the buffered
  text to the underlying device.
\end{description}

It is also possible to have a momentary direct access to
\lstinline{Pervasives.stdout}, or whatever its redirection is:
\begin{description}
%%item
  \logroutine{print\_on\_output}{ "..."}%
  The routine immediately locks the output of \lstinline{Log} and 
  prints the provided message. All message echoes are delayed until
  the routine actually returns. Notification to listeners is not
  delayed, however.

%%item
  \logroutine{print\_delayed}{ "..."}%
  This variant locks the output \emph{only} when the first character
  would be written to the output. This gives messages a chance to be
  echoed before your text is actually written.
\end{description}

Note that these two routines can \emph{not} be invoked recursively,
since they hold a lock on a non-delayed output channel. This
constraint is checked at runtime to avoid incorrect interleaving, and
you would get a fatal error should the situation occur.

\paragraph{Warning:} these routines are dedicated to \emph{expensive}
output only. You get the advantage of not buffering your text before
printing. On the other hand, any messages to be echoed while you are
printing must be stacked until the end of your printing.

\scodeidx{Kernel}{CodeOutput}%
\scodeidx{Log}{print\_delayed}
\index{Command Line!-ocode@\texttt{-ocode}}%
You get similar functionality with
\lstinline{Kernel.CodeOutput.output}. This routine prints your
text by calling \lstinline{Log.print_delayed}, unless the command
line option \texttt{-ocode} has been set. In this case, your text is
written to the specified file.
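
For instance, a minimal sketch of use, assuming a printing function
\texttt{pp\_ast} of type \texttt{Format.formatter \fl~unit}:
\begin{ocamlcode}
(* Sketch: print generated code on stdout, or in the -ocode file. *)
let dump pp_ast = Kernel.CodeOutput.output (fun fmt -> pp_ast fmt)
\end{ocamlcode}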

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \texttt{Datatype} library: Type Values and
  Datatypes}\label{adv:datatype}
\index{Type!Dynamic}\index{Datatype!Library}

\emph{Type values} and \emph{datatypes} are key notions of \framac. They are
both provided by the \texttt{Datatype} library. An overview as well as technical
details may also be found in a related article in
French~\cite{signoles:jfla11}. A short summary focusing on (un)marshaling is
described in another article~\cite{cuoq:ml11}. First,
Section~\ref{type:type-value} introduces type values. Then
Section~\ref{type:datatype} introduces datatypes built on top of type values.

\subsection{Type Value}\label{type:type-value}\index{Type!Value|bfit}

A \emph{type value} is an \caml value that dynamically represents a static
monomorphic \caml type $\tau$. It has type $\tau$
\texttt{Type.t}\scodeidx{Type}{t}. There is at most one type value that
represents the type $\tau$. Type values are used by \framac to ensure safety
when dynamic typing is required (for instance to access a dynamic plug-in
API, see Section~\ref{adv:dynamic-registration}).

Type values for standard \caml monomorphic types are provided in module
\texttt{Datatype}\codeidx{Datatype}.
\begin{example}
The type value for type \texttt{int} is
\texttt{Datatype.int}\scodeidx{Datatype}{int} while the one for type
\texttt{string} is \texttt{Datatype.string}\scodeidx{Datatype}{string}. The
former has type \texttt{int Type.t} while the latter has type \texttt{string
  Type.t}.
\end{example}

Type values are created when building datatypes (see
Section~\ref{type:datatype}). There is no type value for polymorphic
types. Instead, one has to be created for each instance of a polymorphic
type. Functions for accessing such type values for standard \caml polymorphic
types are provided in module \texttt{Datatype}\codeidx{Datatype}.
\begin{example}
The type value for type \texttt{int list} is \texttt{Datatype.list
  Datatype.int}\scodeidx{Datatype}{int}\scodeidx{Datatype}{list} while the one
for type \texttt{string \fl char \fl bool} is \texttt{Datatype.func2
  Datatype.string Datatype.char
  Datatype.bool}\scodeidx{Datatype}{string}\scodeidx{Datatype}{char}%
\scodeidx{Datatype}{bool}\scodeidx{Datatype}{func2}. The former has type
\texttt{int list Type.t} while the latter has type \texttt{(string \fl char \fl
  bool) Type.t}.
\end{example}

\subsection{Datatype}\label{type:datatype}\index{Datatype|bfit}

A \emph{datatype} provides in a single module a monomorphic type and usual
values over it. Its signature is \texttt{Datatype.S}\scodeidx{Datatype}{S}. It
contains the type itself, the type value corresponding to this type, its name,
functions \texttt{equal}, \texttt{compare}, \texttt{hash} and \texttt{pretty}
which may respectively be used to check equality, to compare, to hash and to
pretty print values of this type. It also contains some other values (for
instance required when marshaling or journalizing). Whenever possible, a
datatype implements an extensible version of \texttt{Datatype.S}, namely
\texttt{Datatype.S\_with\_collections}\scodeidx{Datatype}{S\_with\_collections}.
For a type $\tau$, this extended signature additionally provides modules
\texttt{Set}, \texttt{Map} and \texttt{Hashtbl} respectively implementing sets
over $\tau$, maps and hashtables indexed by elements of $\tau$.

Datatypes for \caml types of the standard library are provided in module
\texttt{Datatype}\codeidx{Datatype}, while those for the AST's types are
provided in module \texttt{Cil\_datatype}\codeidx{Cil\_datatype}. Furthermore,
when a kernel module implements a datastructure, it usually also implements
\texttt{Datatype.S}\scodeidx{Datatype}{S}.

\begin{example}
The following line of code pretty prints whether two statements are equal.
\sscodeidx{Cil\_datatype}{Stmt}{pretty}
\sscodeidx{Cil\_datatype}{Stmt}{equal}
\begin{ocamlcode}
(* assuming the type of [stmt1] and [stmt2] is Cil_types.stmt *)
Format.fprintf 
  fmt (* a formatter previously defined somewhere *)
  "statements %a and %a are %sequal" 
  Cil_datatype.Stmt.pretty stmt1 
  Cil_datatype.Stmt.pretty stmt2
  (if Cil_datatype.Stmt.equal stmt1 stmt2 then "" else "not ")
\end{ocamlcode}
\end{example}

\begin{example}
Module \texttt{Datatype.String}\scodeidx{Datatype}{String} implements
\texttt{Datatype.S\_with\_collections}. Thus you can initialize a set of strings
in the following way.
\sscodeidx{Datatype}{String}{Set}
\begin{ocamlcode}
let string_set = 
  List.fold_left 
    (fun acc s -> Datatype.String.Set.add s acc)
    Datatype.String.Set.empty 
    [ "foo"; "bar"; "baz" ]
\end{ocamlcode}
\end{example}

\subsubsection{Building Datatypes}

For each monomorphic type, the corresponding datatype may be created by applying
the functor \texttt{Datatype.Make}\scodeidx{Datatype}{Make}. In addition to the
type \texttt{t} corresponding to the datatype, several values must be provided
in the argument of the functor. These values are properly documented in the
\framac API. The following example introduces them in a practical way.
\begin{example}
Here is how to define in the more precise way the datatype corresponding to a
simple sum type.
\scodeidx{Datatype}{Make}
\scodeidx{Datatype}{identity}
\scodeidx{Datatype}{never\_any\_project}
\scodeidx{Type}{par}
\sscodeidx{Type}{precedence}{Basic}
\sscodeidx{Type}{precedence}{Call}
\sscodeidx{Structural\_descr}{t}{Structure}
\sscodeidx{Structural\_descr}{structure}{Sum}
\scodeidx{Structural\_descr}{p\_int}
\index{Journalisation}
\index{Marshaling}
\index{Project}
\begin{ocamlcode}
type ab = A | B of int
module AB =
  Datatype.Make
    (struct
       (* the type corresponding to the datatype *)
       type t = ab
       (* the unique name of the built datatype; usually the name of the
          type *)
       let name = "ab"
       (* representatives of the type: a non-empty list of values of this
          type. It is only used for safety checks: the better the list covers
          the different possible physical representations of the type, the
          better the check is. *)
       let reprs = [ A; B 0 ]
       (* structural descriptor describing the physical representation of the
          type. It is used when marshaling. *)
       let structural_descr = 
         Structural_descr.Structure 
           (Structural_descr.Sum [| [| Structural_descr.p_int |] |])
       (* equality, compare and hash are the standard OCaml ones *)
       let equal (x:t) y = x = y
       let compare (x:t) y = Pervasives.compare x y
       let hash (x:t) = Hashtbl.hash x
       (* the type ab is a standard functional type, thus copying and rehashing
          are simply identity. Rehashing is used when marshaling. *)
       let copy = Datatype.identity
       let rehash = Datatype.identity
       (* the type ab does never contain any value of type Project.t *)
       let mem_project = Datatype.never_any_project
       (* pretty printer *)
       let pretty fmt x = 
         Format.pp_print_string fmt
          (match x with A -> "a" | B n -> "b" ^ string_of_int n)
       (* printer which must produce a valid OCaml value in a given
          context. It is used when journalising. *)
       let internal_pretty_code prec_caller fmt = function
         | A ->
           Type.par 
             prec_caller 
             Type.Basic 
             fmt 
             (fun fmt -> Format.pp_print_string fmt "A")
         | B n ->
           Type.par 
             prec_caller 
             Type.Call 
             fmt 
             (fun fmt -> Format.fprintf fmt "B %d" n)
       (* A good prefix name to use for an OCaml variable of this type. *)
       let varname v = "ab" ^ (match v with A -> "_a_" | B _ -> "_b_")
     end)
\end{ocamlcode}
\end{example}
Providing an effective implementation is mandatory only for the values
\texttt{name} and \texttt{reprs}. For instance, if you know that you never
journalize a value of a type \texttt{t}, you can define the function
\texttt{internal\_pretty\_code} as the predefined function
\texttt{Datatype.pp\_fail}\scodeidx{Datatype}{pp\_fail}. Similarly, if you
never use values of type \texttt{t} as hashtable keys, you can define the
function \texttt{hash} as the function
\texttt{Datatype.undefined}\scodeidx{Datatype}{undefined}, and so on. To ease
this process, you can also use the predefined structure
\texttt{Datatype.Undefined}\scodeidx{Datatype}{Undefined}.
\begin{example}
  Here is a datatype where only the function \texttt{equal} is provided.
\scodeidx{Datatype}{Make}
\scodeidx{Datatype}{Undefined}
\begin{ocamlcode}
(* the same type as the one of the previous example *)
type ab = A | B of int
module AB =
  Datatype.Make
    (struct
       type t = ab
       let name = "ab"
       let reprs = [ A; B 0 ]
       include Datatype.Undefined
       let equal (x:t) y = x = y
    end)
\end{ocamlcode}
\end{example}

One weakness of \texttt{Datatype.Undefined} is that it cannot be used in a
projectified state (see Section~\ref{proj:states}) because its values cannot
be serialized. In such a case, you can use the very useful predefined
structure \texttt{Datatype.Serializable\_undefined}%
\scodeidxdef{Datatype}{Serializable\_undefined}, which behaves as
\texttt{Datatype.Undefined} but defines the values that are relevant for
(un)serialization.
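
As a hedged variant of the previous example, one may register a datatype
suitable for projectified states as follows.
\begin{ocamlcode}
(* Sketch: like Datatype.Undefined, but serializable; equality,
   comparison and hashing are the structural OCaml ones. *)
type ab = A | B of int
module AB =
  Datatype.Make
    (struct
       type t = ab
       let name = "ab"
       let reprs = [ A; B 0 ]
       include Datatype.Serializable_undefined
       let equal (x:t) y = x = y
       let compare (x:t) y = Pervasives.compare x y
       let hash (x:t) = Hashtbl.hash x
     end)
\end{ocamlcode}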

\subsubsection{Datatypes of Polymorphic Types}

As for type values, it is not possible to create a datatype corresponding to
polymorphic types, but it is possible to create them for each of their
monomorphic instances. 

\begin{important}
For building such instances, you \emph{must} not apply the functor
\texttt{Datatype.Make}\scodeidx{Datatype}{Make} since it will create two type
values for the same type (and with the same name): that is forbidden.
\end{important}

Instead, you must use the functor
\texttt{Datatype.Polymorphic}\scodeidx{Datatype}{Polymorphic} for types with one
type parameter and the functor
\texttt{Datatype.Polymorphic2}\scodeidx{Datatype}{Polymorphic2} for types with
two type parameters\footnote{%
\texttt{Polymorphic3}\scodeidx{Datatype}{Polymorphic3} and 
\texttt{Polymorphic4}\scodeidx{Datatype}{Polymorphic4}
also exist in case of polymorphic types with 3 or 4 type parameters.}.
These functors take as argument a description of how to build the datatype
corresponding to each monomorphic instance.

\begin{example}
Here is how to apply \texttt{Datatype.Polymorphic} corresponding to the type
\texttt{'a t} below.
\scodeidx{Datatype}{Polymorphic}
\scodeidx{Type}{name}
\sscodeidx{Structural\_descr}{t}{Structure}
\sscodeidx{Structural\_descr}{structure}{Sum}
\scodeidx{Structural\_descr}{pack}
\scodeidx{Structural\_descr}{p\_int}
\scodeidx{Datatype}{Int}
\sscodeidx{Datatype}{String}{Hashtbl}
\scodeidx{Datatype}{List}
\scodeidx{Type}{par}
\begin{ocamlcode}
type 'a ab = A of 'a | B of int
module Poly_ab = 
  Datatype.Polymorphic
    (struct
       type 'a t = 'a ab
       let name ty = Type.name ty ^ " ab"
       let module_name = "Ab"
       let reprs ty = [ A ty ]
       let structural_descr d = 
         Structural_descr.Structure 
          (Structural_descr.Sum
            [| [| Structural_descr.pack d |]; [| Structural_descr.p_int |] |])
       let mk_equal f x y = match x, y with
         | A x, A y -> f x y
         | B x, B y -> x = y
         | A _, B _ | B _, A _ -> false
       let mk_compare f x y =  match x, y with
         | A x, A y -> f x y
         | B x, B y -> Pervasives.compare x y
         | A _, B _ -> 1
         | B _, A _ -> -1
      let mk_hash f = function A x -> f x | B x -> 257 * x
      let map f = function A x -> A (f x) | B x -> B x
      let mk_internal_pretty_code f prec_caller fmt = function
        | A x ->
          Type.par 
            prec_caller 
            Type.Basic 
            fmt 
            (fun fmt -> Format.fprintf fmt "A %a" (f Type.Call) x)
        | B n ->
          Type.par 
            prec_caller 
            Type.Call 
            fmt 
            (fun fmt -> Format.fprintf fmt "B %d" n)
      let mk_pretty f fmt x = 
        mk_internal_pretty_code (fun _ -> f) Type.Basic fmt x
      let mk_varname _ = "ab"
      let mk_mem_project mem f = function
        | A x -> mem f x
        | B _ -> false
     end)
module Ab = Poly_ab.Make

(* datatype corresponding to the type [int ab] *)
module Ab_int = Ab(Datatype.Int)

(* datatype corresponding to the type [int list ab] *)
module Ab_int_list = Ab(Datatype.List(Datatype.Int))

(* datatype corresponding to the type [(string, int) Hashtbl.t ab] *)
module HAb = Ab(Datatype.String.Hashtbl.Make(Datatype.Int))
\end{ocamlcode}
\end{example}
Clearly, this is a bit painful. However, you will probably never apply this
functor yourself. It is already applied for the standard \ocaml polymorphic
types like list and function (respectively
\texttt{Datatype.List}\scodeidx{Datatype}{List} and
\texttt{Datatype.Function}\scodeidx{Datatype}{Function}).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Plug-in Registration and Access}\label{adv:plugin-registration}

In this section, we present how to register plug-ins and how to access them.
Actually there are three different ways, but the recommended one is through a
\texttt{.mli} file.

Section~\ref{adv:standard-registration} indicates how to register and access a
plug-in through a \texttt{.mli} file. Section~\ref{adv:static-registration}
indicates how to register and access a \emph{kernel-integrated} plug-in while
Section~\ref{adv:dynamic-registration} details how to register and access a
\emph{standard} plug-in.

\subsection{Registration through a \texttt{.mli} File}
\label{adv:standard-registration}

\begin{target}plug-in developers.\end{target}
\begin{prereq}
  Basic knowledge of \make.
\end{prereq}

Each plug-in is compiled into a module whose name is indicated by the variable
\texttt{PLUGIN\_NAME}\codeidx{PLUGIN\_NAME} of its \texttt{Makefile} (say
\texttt{Plugin\_A}). Its developer has to provide a \texttt{.mli} file for this
plug-in (following the previous example, a file \texttt{Plugin\_A.mli}). This
\texttt{.mli} file may thus contain the API of the plug-in.

Another plug-in may then access \texttt{Plugin\_A} as it accesses any
other \caml module, but it has to declare in its \texttt{Makefile} that it
depends on \texttt{Plugin\_A}, through the special variable
\texttt{PLUGIN\_DEPENDENCIES}.

\begin{example}
\texttt{Plugin\_A} declares and provides access to a function \texttt{compute}
in the following way. 

\listingname{File plugin\_a/my\_analysis\_a.ml}
\begin{ocamlcode}
let compute () = ...
\end{ocamlcode}

\listingname{File plugin\_a/Plugin\_A.mli}
\begin{ocamlcode}
module My_analysis_a: sig val compute: unit -> unit end
\end{ocamlcode}

\codeidx{PLUGIN\_NAME}
\codeidx{PLUGIN\_CMO}
\codeidx{Makefile.dynamic}
\listingname{File plugin\_a/Makefile}
\begin{makefilecode}
PLUGIN_NAME:=Plugin_A
PLUGIN_CMO:=... my_analysis_a ...
...
include Makefile.dynamic
\end{makefilecode}

Then, \texttt{Plugin\_B} may use this function \texttt{compute} as follows.

\listingname{File plugin\_b/my\_analysis\_b.ml}
\begin{ocamlcode}
let compute () = ... Plugin_A.My_analysis_a.compute () ...
\end{ocamlcode}

\codeidx{PLUGIN\_NAME}
\codeidx{PLUGIN\_CMO}
\codeidx{PLUGIN\_DEPENDENCIES}
\codeidx{Makefile.dynamic}
\listingname{File plugin\_b/Makefile}
\begin{makefilecode}
PLUGIN_NAME:=Plugin_B
PLUGIN_CMO:=... my_analysis_b ...
PLUGIN_DEPENDENCIES:=Plugin_A
...
include Makefile.dynamic
\end{makefilecode}
\end{example}

\subsection{Kernel-integrated Registration and Access}
\label{adv:static-registration}
\index{Plug-in!Kernel-integrated!Registration}
\index{Plug-in!Kernel-integrated!Access}

\begin{target}kernel-integrated plug-in developers.\end{target}

\begin{prereq}
  Accepting to modify the \framac kernel. Otherwise, you can still register
  your plug-in as any standard plug-in (see
  Section~\ref{adv:dynamic-registration} for details).
\end{prereq}

A database, called \texttt{Db}\codeidxdef{Db} (in directory
\texttt{src/kernel\_services/plugin\_entry\_points}), groups together
the API of all kernel-integrated
plug-ins, which permits easy plug-in collaboration. Each kernel-integrated
plug-in is only visible through \texttt{Db}. For example, if a plug-in
\texttt{A} wants to know the results of another plug-in \texttt{B}, it uses
the part of \texttt{Db} corresponding to \texttt{B}. A consequence of this
design is that each plug-in has to register in \texttt{Db}, by setting a
function pointer to the right value, in order to be usable from other
plug-ins.

\begin{example}
Plug-in \texttt{Impact} registers function \texttt{compute\_pragmas} in the
following way.

\listingname{src/plugins/impact/register.ml}
\scodeidx{Db}{Impact.compute\_pragmas}
\begin{ocamlcode}
let compute_pragmas () = ...
let () = Db.Impact.compute_pragmas := compute_pragmas
\end{ocamlcode}
Thus each developer who wants to use this function calls it by pointer
dereferencing like this.
\begin{ocamlcode}
let () = !Db.Impact.compute_pragmas ()
\end{ocamlcode}
\end{example}

If a kernel-integrated plug-in has to export some datatypes usable by other
plug-ins, such datatypes have to be visible from module
\texttt{Db}\codeidx{Db}. Thus they cannot be declared in the plug-in
implementation itself like any other plug-in declaration, because postponed
type declarations are not possible in \ocaml.

Such datatypes are called \emph{plug-in types}\index{Plug-in!Types|bfit}. The
solution is to put these plug-in types in some files linked before
\texttt{Db}; hence you have to put them in a directory other than the plug-in
directory. The best way is to create a directory dedicated to types.

\begin{convention}
  The suggested name for this directory is
  \texttt{$p$\_types}\index{plug-in\_types@\texttt{$plugin$\_types}|bfit} for a
  plug-in $p$.
\end{convention}

If you add such a directory, you also have to modify \texttt{Makefile} by
extending variable \texttt{FRAMAC\_SRC\_DIRS}\codeidx{FRAMAC\_SRC\_DIRS} (see
Section~\ref{make:plugin}).
\begin{example}
Suppose you are writing a plug-in \texttt{p} which exports a specific type
\texttt{t} corresponding to the result of the plug-in analysis. The standard way
to proceed is the following.

\listingname{src/plugins/p\_types/p\_types.mli}
\begin{ocamlcode}
type t = ...
\end{ocamlcode}

\listingname{src/kernel\_services/plugin\_entry\_points/db.mli}
\begin{ocamlcode}
module P : sig
  val run_and_get: (unit -> P_types.t) ref
     (** Run plugin analysis (if it was never launched before).
         @return result of the analysis. *)
end
\end{ocamlcode}

\listingname{share/Makefile.common}
\begin{makefilecode}
FRAMAC_SRC_DIRS= ... plugins/p_types
   # Extend this variable with the new directory
\end{makefilecode}
\end{example}

This design choice has a side effect: it reveals exported types.
You can always hide them by using a module to encapsulate the types
(and provide corresponding getters and setters to access them).

At this point, part of the plug-in code is outside the plug-in
implementation. This code should be linked before \texttt{Db}
\codeidx{Db}\footnote{A direct consequence is that you cannot use the whole
\framac functionalities, such as module \texttt{Db}, inside this code.}.

To this effect, the files containing the external plug-in code must be added to
the \texttt{Makefile} variable
\texttt{PLUGIN\_TYPES\_CMO}\codeidxdef{PLUGIN\_TYPES\_CMO} (see
Section~\ref{make:plugin}).

\subsection{Dynamic Registration and Access}
\label{adv:dynamic-registration}
\index{Plug-in!Registration}\index{Plug-in!Access}\index{Plug-in!API}

\begin{target}standard plug-ins developers.\end{target}

Registration of kernel-integrated plug-ins requires modifying module
\texttt{Db}\codeidx{Db}, which belongs to the \framac kernel. Such a
modification is not possible for standard plug-ins, which are fully independent
of \framac. Consequently, the \framac kernel provides another way of
registering a plug-in, through the module \texttt{Dynamic}\codeidxdef{Dynamic}.

In short, you have to use the function
\texttt{Dynamic.register}\scodeidxdef{Dynamic}{register} in order to register
a value from a dynamic plug-in, and the function
\texttt{Dynamic.get}\scodeidxdef{Dynamic}{get} in order to apply a function
previously registered with \texttt{Dynamic.register}.

\subsubsection{Registering a value}

The signature of \texttt{Dynamic.register}\scodeidx{Dynamic}{register} is as
follows.\scodeidx{Type}{t}
\begin{ocamlcode}
val register: plugin:string -> string -> 'a Type.t -> journalize:bool -> 'a ->
unit
\end{ocamlcode}
The first argument is the name of the plug-in registering the value and the
second one is a binding name for the registered \caml value. The pair (plug-in
name, binding name) must not be used for value registration anywhere else in
\framac. It is required in order for another plug-in to access this value
(see the next paragraph). The third argument is the \emph{type value}%
\index{Type!Value} of the registered value (see
Section~\ref{type:type-value}). It is required for safety reasons when
accessing the registered value (see the next paragraph). The labeled fourth
argument \texttt{journalize} indicates whether a total call to this function
must be written in the journal\index{Journalization} (see also
Section~\ref{adv:journalization}). The usual value for this argument is
\texttt{true}. The fifth argument is the value to register.

\begin{example}
  Here is how the function \texttt{run} of the plug-in
  \texttt{hello} of the tutorial is registered. The type of this function is
  \texttt{unit \fl~unit}\scodeidx{Dynamic}{register}%
  \scodeidx{Datatype}{func}\scodeidx{Datatype}{unit}.
\begin{ocamlcode}
let run () : unit = ...
let () =
  Dynamic.register 
    ~plugin:"Hello" 
    "run" 
    (Datatype.func Datatype.unit Datatype.unit)
    ~journalize:true
    run
\end{ocamlcode}
If the string \texttt{"Hello.run"} is already used to register a
dynamic value, then the exception
\texttt{Type.AlreadyExists}\scodeidx{Type}{AlreadyExists} is raised during
plug-in initialization (see Section~\ref{adv:init})\index{Initialization}.

The function call \texttt{Datatype.func Datatype.unit Datatype.unit} returns
the type value\index{Type!Value} representing \texttt{unit \fl~unit}. Note
that, because of the type of \texttt{Dynamic.register} and the types of its
arguments, the \caml type checker complains if the last argument (here the
value \texttt{run}) does not have type \texttt{unit \fl~unit}.
\end{example}

\subsubsection{Accessing a registered value}

The signature of function
\texttt{Dynamic.get}\scodeidxdef{Dynamic}{get}\scodeidx{Type}{t} is as
follows.
\begin{ocamlcode}
val get: plugin:string -> string -> 'a Type.t -> 'a
\end{ocamlcode}
The arguments must be the same as the ones used at value registration time
(with \texttt{Dynamic.register}\scodeidx{Dynamic}{register}). Otherwise,
depending on the case, you will get a compile-time or a runtime error.

\begin{example}
  Here is how the previously registered function \texttt{run} of
  \texttt{Hello} may be applied.
\begin{ocamlcode}
let () = 
  Dynamic.get
    ~plugin:"Hello" 
    "run" 
    (Datatype.func Datatype.unit Datatype.unit) 
    ()
\end{ocamlcode}
The given strings and the given type value\index{Type!Value} must be the same
as the ones used when registering the function. Otherwise, an error occurs at
runtime. Furthermore, the \caml type checker will complain either if the
argument applied to the retrieved function (here \texttt{()}) is not of type
\texttt{unit}, or if the returned value (here also \texttt{()}) is not of type
\texttt{unit}.
\end{example}

The above-mentioned mechanism requires access to the type
value\index{Type!Value} corresponding to the type of the registered value.
Thus it is not possible to access a value of a plug-in-defined type. To solve
this issue, \framac provides a way to access type values\index{Type!Value} of
plug-in-defined types in an abstract way, through the functor
\texttt{Type.Abstract}\scodeidx{Type}{Abstract}.

\begin{example}
There is no current example in the \framac open-source part, but consider a
plug-in which provides a dynamic API for callstacks as follows.
\scodeidx{Plugin}{Register}
\scodeidx{Kernel\_function}{t}
\scodeidx{Kernel\_function}{ty}
\scodeidx{Kernel\_function}{pretty}
\scodeidx{Kernel\_function}{dummy}
\sscodeidx{Cil\_datatype}{Stmt}{t}
\sscodeidx{Cil\_datatype}{Stmt}{ty}
\sscodeidx{Cil\_datatype}{Stmt}{pretty}
\scodeidx{Cil}{dummyStmt}
\scodeidx{Datatype}{Serializable\_undefined}
\ocamlinput{./examples/generated/callstack.ml}

You have to use the functor \texttt{Type.Abstract} to access the type value
corresponding to the type of callstacks (and thus to access the above
dynamically registered functions).
\scodeidx{Kernel\_function}{ty}
\sscodeidx{Cil\_datatype}{Stmt}{ty}
\scodeidx{Type}{Abstract}
\scodeidx{Dynamic}{get}
\scodeidx{Datatype}{func}
\scodeidx{Datatype}{func3}
\scodeidx{Datatype}{unit}
\ocamlinput{./examples/generated/use_callstack.ml}
\end{example}

\section{Journalization}\label{adv:journalization}

\todo

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Project Management System}\label{adv:project}
\begin{prereq}
  Knowledge of the \caml module system and labels.
\end{prereq}

In \framac, a key notion detailed in this section is the one of
\emph{project}. An overview as well as technical details may also be found in a
related article in French~\cite{project}. Section~\ref{proj:principle} first
introduces the general principle of project. Section~\ref{proj:states}
introduces the notion of \emph{states}. State registration is detailed in
Sections~\ref{proj:computation} and~\ref{proj:lowlevel}. The former is dedicated
to standard (high-level) registration, while the latter is dedicated to
low-level registration. Then Section~\ref{proj:use} explains how to use
projects. Finally Section~\ref{proj:selection} details state selections.

\subsection{Overview and Key Notions}\label{proj:principle}

A \emph{project}\index{Project|bfit} groups together an AST with the set of
global values attached to it. Such values are called
\emph{states}\index{State|bfit}. Examples of states are
parameters\index{Parameter} (see Section~\ref{adv:cmdline}) and
results of analyses (\framac extensively uses
memoization\index{Memoization}~\cite{michie67,michie68} in order to prevent
running analyses twice).

In a \framac session, several projects (and thus several ASTs) can exist at the
same time. The project library ensures project non-interference: modifying the
value of a state in a project does not impact any value of any state in any
other project. To ensure this property, each state must be registered in the
project library as explained in Sections~\ref{proj:computation}
and~\ref{proj:lowlevel}. Relations between states and projects are
summarized in Figure~\ref{fig:project}.
\begin{figure}[htbp]
\begin{center}
\begin{tabular}{|c|c|c|c|}
\hline
\diagbox{\textbf{States}}{\textbf{Projects}}
& Project $p_1$ & \dots & Project $p_n$ \\
\hline
AST $a$ & value of $a$ in $p_1$ & \dots & value of $a$ in $p_n$ \\
data $d_1$ & value of $d_1$ in $p_1$ & \dots & value of $d_1$ in $p_n$ \\
\dots & \dots & \dots & \dots \\
data $d_m$ & value of $d_m$ in $p_1$ & \dots & value of $d_m$ in $p_n$ \\
\hline
\end{tabular}
\end{center}
  \caption{Representation of the \framac State.\label{fig:project}}
\end{figure}

To ease development, \framac maintains a current project
\index{Project!Current|bfit} (\texttt{Project.current
  ()})\scodeidxdef{Project}{current}: all operations are automatically performed
on it. For instance, calling \texttt{Ast.get ()}\scodeidxdef{Ast}{get} returns the
\framac AST of the current project. It is also possible to access values in
other projects, as explained in Section~\ref{proj:use}.

\subsection{State: Principle}\label{proj:states}
\index{State|bfit}

\begin{important}
  If some data should be part of the state of \framac, you must register
  it in the project library\index{State!Registration|bfit} (see
  Sections~\ref{proj:computation} and~\ref{proj:lowlevel}).
\end{important}

Here we first explain what the functionalities of each state are, and
then we present the general principle of registration.

\subsubsection{State Functionalities}
\index{State!Functionalities|bfit}
Whenever you want to attach some data (\eg a table containing results of
an analysis) to an AST\index{AST}, you have to register it as an internal
state. The main functionalities provided to each internal state are the
following.
\begin{itemize}
\item It is automatically updated whenever the current project
  changes\index{Project!Current}. Your data are thus always
  consistent\index{Consistency} with the current project. More precisely, you
  still work with your global data (for instance, a hashtable or a reference) as
  usual in \ocaml. The project library silently changes the data when required
  (usually when the current project is changing). The extra cost due to the
  project system is usually an extra
  indirection. Figure~\ref{fig:proj-mechanism} summarizes these interactions
  between the project library and your state.
  \begin{figure}[htbp]
    \centering
    \includegraphics[viewport=0 0 440 246,width=0.99\textwidth]{mecanism.pdf}
    \caption{Interaction between the project library and your registered global
      data.}\label{fig:proj-mechanism}
  \end{figure}

\item It is part of the information saved on disk\index{Saving} for restoration
  in a later session\index{Loading}.
\item It may be part of a \emph{selection}\index{Selection} which is a
  consistent set of states\index{State!Selection|see{Selection}}.
  With such a selection,
  you can control on which states project operations are consistently applied
  (see Section~\ref{proj:selection}). For example, it is possible to clear all
  the states which depend on Value Analysis results.
\item It is possible to ensure inter-analysis consistency\index{Consistency} by
  setting state dependencies\index{State!Dependency}. For example, if the entry
  point\index{Entry Point} of the analyzed program is changed (using
  \texttt{Globals.set\_entry\_point}\scodeidx{Globals}{set\_entry\_point}), all
  the results of analyses depending on it (like value analysis' results) are
  automatically reset. If such a reset were not performed, the results of the
  value analysis would not be consistent anymore with the current entry point,
  leading to incorrect results.
\begin{example}
\sscodeidx{Db}{Value}{is\_computed}
~

\begin{ocamlcode}
!Db.Value.compute();
Kernel.feedback "%B" (Db.Value.is_computed ()); (* true *)
Globals.set_entry_point "f" true;
Kernel.feedback "%B" (Db.Value.is_computed ()); (* false *)
\end{ocamlcode}
As the value analysis has been automatically reset when setting the entry point,
the above code outputs
\begin{shell}
[kernel] true
[kernel] false
\end{shell}
\end{example}
\end{itemize}

\subsubsection{State Registration: Overview}
\index{State!Registration|bfit}

For registering a new state, functor
\texttt{State\_builder.Register}\scodeidx{State\_builder}{Register} is
provided. Its use is described in Section~\ref{proj:lowlevel} but it is a
low-level functor which is usually difficult to apply in a correct
way. Higher-level functors are provided to the developer in modules
\texttt{State\_builder}\codeidx{State\_builder} and
\texttt{Cil\_state\_builder}\codeidx{Cil\_state\_builder} that allow the
developer to register states in a simpler way. They internally apply the
low-level functor in the proper way. Module \texttt{State\_builder} provides
state builders for standard \caml datastructures like
hashtables\index{Hashtable} whereas \texttt{Cil\_state\_builder} does the same
for standard \cil datastructures (like hashtables indexed by AST
statements)\footnote{These datastructures are only mutable datastructures (like
  hashtables, arrays and references) because global states are always
  mutable.}. They are described in Section~\ref{proj:computation}.

\begin{important}
  Registering a new state must be performed when the plugin is
  initialized. Thus, using \caml \texttt{let module} construct to
  register the new state is forbidden (except if you really know what
  you are doing).
\end{important}

\subsection{Registering a New State}\label{proj:computation}
\index{State!Registration|bfit}

Here we explain how to register and use a state. Registration through the
low-level functor \texttt{State\_builder.Register} is postponed to
Section~\ref{proj:lowlevel} because it is trickier and rarely useful.

In most non-\framac applications, a state is a global mutable value. One can use
it to store results of analyses. For example, using this mechanism inside
\framac to create a \texttt{state} which would memoize\index{Memoization} some
information attached to statements would result in the following piece of code.
\scodeidx{Kernel\_function}{t} \scodeidx{Cil\_types}{varinfo}
\sscodeidx{Cil\_datatype}{Stmt}{Hashtbl} \sscodeidx{Db}{Value}{compute}
\begin{ocamlcode}
open Cil_datatype
type info = Kernel_function.t * Cil_types.varinfo
let state : info Stmt.Hashtbl.t = Stmt.Hashtbl.create 97
let compute_info stmt = ...
let memoize s =
  try Stmt.Hashtbl.find state s
  with Not_found ->
    let i = compute_info s in
    Stmt.Hashtbl.add state s i;
    i
let run () = ... !Db.Value.compute (); ... memoize some_stmt ...
\end{ocamlcode}

However, if one puts this code inside \framac, it does not work because this
state is not registered as a \framac state. For instance, it is never saved on
the disk\index{Saving} and its value is never changed when setting the current
project to a new one. For this purpose, one has to transform the above code into
the following one. \scodeidx{Cil\_state\_builder}{Stmt\_hashtbl}
\scodeidx{Datatype}{Pair} \codeidx{Kernel\_function}
\scodeidx{Cil\_datatype}{Varinfo} \sscodeidx{Db}{Value}{self}
\sscodeidx{Db}{Value}{compute} \codeidx{memo}
\begin{ocamlcode}
module State =
  Cil_state_builder.Stmt_hashtbl
    (Datatype.Pair(Kernel_function)(Cil_datatype.Varinfo))
    (struct
      let size = 97
      let name = "state"
      let dependencies = [ Db.Value.self ]
     end)
let compute_info stmt = ...
let memoize = State.memo compute_info
let run () = ... !Db.Value.compute (); ... memoize some_stmt ...
\end{ocamlcode}
A quick look at this code shows that the declaration of the state itself is
more complicated (it uses a functor application), but its use is
simpler. What has actually changed?
\begin{enumerate}
\item To declare a new internal state, apply one of the predefined functors in
  modules \texttt{State\_builder}\codeidx{State\_builder} or
  \texttt{Cil\_state\_builder}\codeidx{Cil\_state\_builder} (see interfaces of
  these modules for the list of available modules). Here we use
  \texttt{Cil\_state\_builder.Stmt\_hashtbl} which provides a hashtable indexed
  by statements. The type of values associated to statements is a pair of
  \texttt{Kernel\_function.t} and
  \texttt{Cil\_types.varinfo}\scodeidx{Cil\_types}{varinfo}. The first argument
  of the functor is then the datatype\index{Datatype} corresponding to this type
  (see Section~\ref{type:datatype}). The second argument provides some
  additional information: the initial size of the hashtable\index{Hashtable} (an
  integer similar to the argument of \texttt{Hashtbl.create}), an unique
  name\index{State!Name} for the resulting state and its
  dependencies\index{State!Dependency|bfit}. This list of dependencies is built
  upon values \texttt{self}\codeidxdef{self} which are called \emph{state
    kind}\index{State!Kind|see{Kind}} (or simply \emph{kind}\index{Kind|bfit})
  and are part of any state's module (part of the signature of the low-level
  functor \texttt{State\_builder.Register}%
  \scodeidx{State\_builder}{Register}). This value represents the state itself
  as first-class value (like type values for \caml types, see
  Section~\ref{type:type-value}).
\item From outside, a state actually hides its internal representation in order
  to ensure some invariants: operations on states implementing hashtables do
  not take hashtables as arguments because they implicitly use the hidden
  hashtable. In our example, a predefined memo function is used in order to
  memoize\index{Memoization} the computation of \texttt{compute\_info}. This
  memoization function implicitly operates on the hashtable hidden in the
  internal representation of \texttt{State}.
\end{enumerate}

%\todo{Computation.apply\_once (?)}

\paragraph{Postponed dependencies}\index{State!Dependency!Postponed|bfit}

Sometimes, you want to access a state kind before defining it. That is
usually the case when you have two mutually dependent states: the dependencies
of the first one, provided when registering it, must contain the state kind of
the second one, which is only created when that second state is registered.
But this second registration also requires a list of dependencies containing
the first state kind.

To solve this issue, it is possible to postpone the addition of a
state kind to dependencies until all modules have been initialized. However,
dependencies must be correct before anything serious is computed by \framac.
So the right way to do this is to use the function
\texttt{Cmdline.run\_after\_extended\_stage}%
\scodeidx{Cmdline}{run\_after\_extended\_stage} (see
Section~\ref{adv:init} for advanced explanation about the way \framac is
initialized).
\begin{example}
Plug-in \texttt{from}\index{From} exposes a reference to its state kind in the
following way. This reference is initialized at module initialization time.

\listingname{src/kernel\_services/plugin\_entry\_points/db.mli}
\codeidx{State}
\scodeidx{State}{dummy}
\scodeidx{Db}{From.self}
\begin{ocamlcode}
module From: sig
  ...
  val self: State.t ref
end
\end{ocamlcode}

\listingname{src/kernel\_services/plugin\_entry\_points/db.ml}
\begin{ocamlcode}
module From = struct
  ...
  let self = ref State.dummy (* postponed *)
end
\end{ocamlcode}

\scodeidx{Kernel\_function}{Make\_Table}
\sscodeidx{Db}{Value}{self}
\scodeidx{Db}{From.self}
\listingname{src/plugins/from/functionwise.ml}
\begin{ocamlcode}
module Tbl =
  Kernel_function.Make_Table
    (Function_Froms)
    (struct
       let name = "functionwise_from"
       let size = 97
       let dependencies = [ Db.Value.self ]
     end)
let () =
  (* performed at module initialization runtime. *)
  Db.From.self := Tbl.self
\end{ocamlcode}
Plug-in \texttt{pdg}\index{Plug-in!Pdg|see{Pdg}}\index{Pdg} uses
\texttt{from}\index{From} for computing its own internal state. So it declares
this dependency as follows.

\listingname{src/plugins/pdg/register.ml}
\scodeidx{Kernel\_function}{Make\_Table}
\scodeidx{State\_dependency\_graph}{S.add\_codependencies}
\scodeidx{Cmdline}{run\_after\_extended\_stage}
\begin{ocamlcode}
module Tbl =
  Kernel_function.Make_Table
    (PdgTypes.Pdg)
    (struct
       let name = "Pdg.State"
       let dependencies = [] (* postponed because !Db.From.self may
                                not exist yet *)
       let size = 97
    end)
let () =
  Cmdline.run_after_extended_stage
    (fun () ->
       State_dependency_graph.add_codependencies
         ~onto:Tbl.self
         [ !Db.From.self ])
\end{ocamlcode}
\end{example}

\paragraph{Dependencies over the AST}

Most internal states depend directly or indirectly on the AST of the current
project. However, the AST plays a special role as a state. Namely, it can be
changed in place, bypassing the project mechanism. In particular, it is possible
to add globals. Plugins that perform such changes should inform the kernel
when they are done using
\texttt{Ast.mark\_as\_changed}\scodeidxdef{Ast}{mark\_as\_changed} or 
\texttt{Ast.mark\_as\_grown}\scodeidxdef{Ast}{mark\_as\_grown}. The latter
must be used when the only changes are additions, leaving existing nodes
untouched, while the former must be used for more intrusive changes.
In addition, it is possible to tell the kernel that a state is ``monotonic''
with respect to AST changes, in the sense that it does not need to be cleared
when nodes are added (the information that should be associated to the new 
nodes will be computed as needed). This is done with the function
\texttt{Ast.add\_monotonic\_state}\scodeidxdef{Ast}{add\_monotonic\_state}.
\texttt{Ast.mark\_as\_grown} will not touch such a state, while
\texttt{Ast.mark\_as\_changed} will clear it.
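
For instance, here is a hedged sketch of an in-place addition of a global; the
global \texttt{g} is assumed to be built elsewhere.
\begin{ocamlcode}
(* Sketch: add a global to the current AST, then inform the kernel. *)
let add_global g =
  let file = Ast.get () in
  file.Cil_types.globals <- g :: file.Cil_types.globals;
  Ast.mark_as_grown ()
\end{ocamlcode}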

\subsection{Direct Use of Low-level Functor \texttt{State\_builder.Register}}
\label{proj:lowlevel}

Functor \texttt{State\_builder.Register}%
\scodeidxdef{State\_builder}{Register} is the only functor that really
registers a state; all the others use it internally. In some cases (\eg if you
define your own mutable record used as a state), you have to use it
directly. However, in the \framac kernel\index{Kernel}, there are only three
direct uses of this functor among thousands of state registrations: so you
will most likely never use it.

This functor takes three arguments. The first and the third ones respectively
correspond to the datatype\index{Datatype} and to information (name and
dependencies) of the states\index{State!Name}%
\index{State!Dependency}: they are similar to the corresponding
arguments of the high-level functors (see Section~\ref{proj:computation}).

The second argument explains how to handle the \emph{local
  version}\index{State!Local Version|bfit} of the state under
registration. Here is indeed the key point: from the outside, only this local
version is used, for efficiency purposes (remember
Figure~\ref{fig:proj-mechanism}); it would work even if projects did not
exist. Each project knows a \emph{global version}\index{State!Global
  Version|bfit}. The project management system \emph{automatically}
switches\index{Context Switch} the local version when the current
project\index{Project!Current} changes, in order to preserve physical
equality\index{Equality!Physical} between the local version and the current
global version. For this purpose, the second argument provides a type
\texttt{t} (the type of values of the state) and five functions:
\texttt{create} (creation of a new fresh state), \texttt{clear} (cleaning a
state), \texttt{get} (getting a state), \texttt{set} (setting a state) and
\texttt{clear\_some\_projects} (how to clear each value of type
\texttt{project} in the state, if any).

\begin{important}
The following invariants must
hold\index{Equality!Physical}\index{Equality!Structural}:\footnotemark
\begin{gather}
\texttt{create ()} \mbox{ returns a fresh value} \label{eq:create}\\
\forall \texttt{p} \mbox{ of type } \texttt{t},\,
\texttt{create () = (clear p; set p; get ())}\label{eq:clear}\\
\forall \texttt{p} \mbox{ of type } \texttt{t},\, \texttt{copy p} \mbox{
  returns a fresh value}
\label{eq:copy}\\
\forall \texttt{p1},\texttt{p2} \mbox{ of type } \texttt{t}
\mbox{ such that } \texttt{p1 != p2},\, \texttt{(set p1; get ()) != p2}
\label{eq:independance}
\end{gather}
\end{important}
\footnotetext{As usual in \caml, \texttt{=} stands for \emph{structural}
  equality while \texttt{==} (resp. \texttt{!=}) stands for \emph{physical}
  equality (resp. disequality).}

Invariant~\ref{eq:create} ensures that there is no sharing%
\index{State!Sharing} with any value of the same state: each new project thus
has its own fresh state. Invariant~\ref{eq:clear} ensures that cleaning
a state\index{State!Cleaning} resets it to its initial
value. Invariant~\ref{eq:copy} ensures that there is no sharing with any
copy. Invariant~\ref{eq:independance} is a local independence criterion which
ensures that modifying a local version does not affect any other version
(different from the global current one) by side effects\index{Side-Effect}.
\begin{example}
  To illustrate this, we show how functor
  \texttt{State\_builder.Ref}\scodeidx{State\_builder}{Ref} (registering a state
  corresponding to a reference) is implemented.
\scodeidx{Datatype}{S}
\begin{ocamlcode}
module Ref
  (Data: Datatype.S)
  (Info: sig include Info val default: unit -> Data.t end) =
struct
  type data = Data.t
  let create () = ref (Info.default ())
  let state = ref (create ())
\end{ocamlcode}
Here we use an additional reference: our local version%
\index{State!Local Version} is a reference on the right value. We can
use it in order to safely and easily implement \texttt{get} and \texttt{set}
required by the registration.
\scodeidx{State\_builder}{Register}
\scodeidx{Datatype}{Ref}
\begin{ocamlcode}
  include Register
    (Datatype.Ref(Data))
    (struct
      type t = data ref (* we register a reference on the given type *)
      let create = create
      let clear tbl = tbl := Info.default ()
      let get () = !state
      let set x = state := x
      let clear_some_projects f x =
        if Data.mem_project f !x then begin clear x; true end else false
    end)
   (Info)
\end{ocamlcode}
For users of this module, we export ``standard'' operations which hide the
local indirection required by the project management system.
\begin{ocamlcode}
  let set v = !state := v
  let get () = !(!state)
  let clear () = !state := Info.default ()
end
\end{ocamlcode}
As you can see, the above implementation is error-prone; in particular, it
uses a double indirection (a reference on a reference). Be glad that
higher-level functors like \texttt{State\_builder.Ref} are provided, which
hide such implementations from you.
\end{example}

\subsection{Using Projects}\label{proj:use}\index{Project!Use|bfit}

As said before, all operations are done by default on the current
project\index{Project!Current}. But sometimes plug-in developers have to
explicitly use another project, for example when the AST is
modified\index{AST!Modification} (usually through the use of a copy
visitor\index{Visitor!Copy}, see Section~\ref{adv:visitors}) or replaced (\eg if
a new one is loaded\index{Loading} from disk).

\begin{important}
  An AST must never be modified inside a
  project\index{AST!Modification}. If such an operation is required, you
  must either create a new project with a new AST, usually by using
  \texttt{File.init\_project\_from\_cil\_file}%
  \scodeidxdef{File}{init\_project\_from\_cil\_file}
  or \texttt{File.init\_project\_from\_visitor}%
  \scodeidxdef{File}{init\_project\_from\_visitor};
  or write the following line of code (see Section~\ref{proj:selection}):
  \scodeidx{Project}{clear}
  \scodeidx{State\_selection}{only\_dependencies}
  \scodeidx{Ast}{self}
  \begin{alltt}
  let selection = State_selection.only_dependencies Ast.self in
  Project.clear ~selection ()
  \end{alltt}
\end{important}

Operations over projects are grouped together in module
\texttt{Project}\codeidxdef{Project}. A project has type
\texttt{Project.t}\scodeidxdef{Project\_skeleton}{t}. Function
\texttt{Project.set\_current}\scodeidxdef{Project}{set\_current} sets the
current project on which all operations are implicitly performed.

\begin{example}\label{ex:set_current}
  Suppose that you saved\index{Saving} the current project into file
  \texttt{foo.sav} in a previous \framac session\index{Session|bfit}\footnote{A
    \emph{session} is one execution of \framac (through \texttt{frama-c} or
    \texttt{frama-c-gui}).} thanks to the following
  instruction.\scodeidx{Project}{save}
\begin{ocamlcode}
Project.save "foo.sav"
\end{ocamlcode}
In a new \framac session, executing the following lines of code (assuming the
value analysis has never been computed previously)
\sscodeidx{Db}{Value}{is\_computed}
\scodeidx{Project}{current}
\scodeidx{Project}{load}
\scodeidx{Project}{set\_current}
\sscodeidx{Db}{Value}{compute}
\scodeidx{Project}{IOError}
\begin{ocamlcode}
let print_computed () =
  Kernel.feedback "%B" (Db.Value.is_computed ())
in
print_computed ();   (* false *)
let old = Project.current () in
try
  let foo = Project.load ~name:"foo" "foo.sav" in
  Project.set_current foo;
  !Db.Value.compute ();
  print_computed ();   (* true *)
  Project.set_current old;
  print_computed ()    (* false *)
with Project.IOError _ ->
  Kernel.abort "error while loading"
\end{ocamlcode}
displays
\begin{shell}
[kernel] false
[kernel] true
[kernel] false
\end{shell}
This example shows that the value analysis has been computed only in project
\texttt{foo} and not in project \texttt{old}.
\end{example}

\begin{important}
An important invariant of \framac is: if $p$ is the current project before
running an analysis, then $p$ will be the current project after running it. It
is the responsibility of any plug-in developer to enforce this invariant for
his/her own analysis.
\end{important}

To be sure to enforce the above-mentioned invariant, the project library
provides an alternative to the use of
\texttt{Project.set\_current}\scodeidx{Project}{set\_current}:
\texttt{Project.on}\scodeidxdef{Project}{on} applies an operation on a
given project without changing the current project (\emph{i.e.} it locally
switches the current project in order to apply the given operation and,
afterwards, restores the initial context)\index{Context Switch}.
\begin{example}
The following code is equivalent to the one given in
Example~\ref{ex:set_current}.
\begin{ocamlcode}
let print_computed () =
  Kernel.feedback "%B" (Db.Value.is_computed ())
in
print_computed ();   (* false *)
try
  let foo = Project.load ~name:"foo" "foo.sav" in
  Project.on foo
    (fun () -> !Db.Value.compute (); print_computed () (* true *)) ();
  print_computed ()    (* false *)
with Project.IOError _ ->
  Kernel.abort "error while loading"
\end{ocamlcode}
It displays
\begin{shell}
[kernel] false
[kernel] true
[kernel] false
\end{shell}
\end{example}

\subsection{Selections}\label{proj:selection}\index{Selection|bfit}

Most operations working on a single project (\eg
\texttt{Project.clear}\scodeidx{Project}{clear} or
\texttt{Project.on}\scodeidx{Project}{on}) have an optional parameter
\texttt{selection} of type
\texttt{State\_selection.t}\codeidx{State\_selection}. This parameter allows the
developer to specify to which states\index{State} the operation
applies. A \emph{selection} is a set of states which allows the developer to
consistently handle state dependencies\index{State!Dependency}.
\begin{example}
  The following statement clears all the results of the value analysis and all
  its dependencies in the current project.\index{State!Cleaning}
  \scodeidx{Project}{clear}
  \scodeidx{State\_selection}{with\_dependencies}
  \sscodeidx{Db}{Value}{self}
\begin{ocamlcode}
let selection = State_selection.with_dependencies Db.Value.self in
Project.clear ~selection ()
\end{ocamlcode}
The selection explicitly indicates that we also want to clear all the states
which depend on the value analysis' results\index{State!Dependency}.
\end{example}

\begin{important}
  Use selections carefully: if you apply a function $f$ on a selection $s$ and
  $f$ handles a state which does not belong to $s$, then the result computed by
  \framac is potentially incorrect\index{Consistency}.
\end{important}

\begin{example}
The following statement applies a function \texttt{f} in the project \texttt{p}
(which is not the current one). For efficiency purposes, we restrict the
considered states to the command line options (see Section~\ref{adv:cmdline}).
\scodeidx{Project}{on}
\scodeidxdef{Parameter\_state}{get\_selection}
\begin{ocamlcode}
Project.on ~selection:(Parameter_state.get_selection ()) p f ()
\end{ocamlcode}
This statement only works if \texttt{f} only handles values of the command line
options\index{Command Line!Option}. If it tries to get the value of another
state, the result is unspecified \emph{and all actions using any state of the
  current project\index{Project!Current} and of project \texttt{p} also become
  unspecified}.
\end{example}
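Selections can be combined. For instance, the following sketch builds the
union of two selections, assuming we want to reset both the value analysis
results and everything depending on the kernel option \texttt{-main}
(represented by the state \texttt{Kernel.MainFunction.self}), together with
their dependencies.
\begin{ocamlcode}
(* a sketch: clearing two sets of states and their dependencies at once *)
let selection =
  State_selection.union
    (State_selection.with_dependencies Db.Value.self)
    (State_selection.with_dependencies Kernel.MainFunction.self)
in
Project.clear ~selection ()
\end{ocamlcode}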

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Command Line Options}\label{adv:cmdline}
\index{Command Line!Option|bfit}

\begin{prereq}
  Knowledge of the \caml module system.
\end{prereq}

Values associated with command line options are called
\emph{parameters}\index{Parameters|bfit}. The parameters of the \framac kernel
are stored in module \texttt{Kernel}\codeidxdef{Kernel} while the
plug-in specific ones have to be defined in the plug-in source
code.

\subsection{Definition}\label{options:definition}

In \framac, a parameter is represented by a value of type
\texttt{Typed\_parameter.t}\scodeidxdef{Typed\_parameter}{t} and by a module
implementing the signature
\texttt{Parameter\_sig.S}\scodeidxdef{Parameter\_sig}{S}. The first
representation is a low-level one required by emitters\index{Emitter} (see
Section~\ref{adv:annotations}) and the GUI. The second one provides a high-level
API: each parameter is indeed a state\index{State} (see
Section~\ref{proj:states}).  Several signatures extending
\texttt{Parameter\_sig.S} are provided in order to deal with the usual parameter
types. For example, there are signatures
\texttt{Parameter\_sig.Int}\scodeidxdef{Parameter\_sig}{Int} and
\texttt{Parameter\_sig.Bool}\scodeidxdef{Parameter\_sig}{Bool} for integer and
boolean parameters. Mostly, these signatures provide getters and setters for
modifying parameter values.

Implementing such an interface is very easy thanks to a set of functors provided
by the output module of
\texttt{Plugin.Register}\scodeidx{Plugin}{Register}. Indeed, you just have to
choose the right functor according to your option type and, potentially, the
desired default value. Below are some examples of such functors
(see the signature
\texttt{Parameter\_sig.Builder}\scodeidx{Parameter\_sig}{Builder} for an
exhaustive list).
\begin{enumerate}
\item \texttt{False}\sscodeidxdef{Parameter\_sig}{Builder}{False}
  (resp. \texttt{True}\sscodeidxdef{Parameter\_sig}{Builder}{True}) builds a
  boolean option initialized to \texttt{false} (resp. \texttt{true}).
\item \texttt{Int}\sscodeidxdef{Parameter\_sig}{Builder}{Int}
  (resp. \texttt{Zero}\sscodeidxdef{Parameter\_sig}{Builder}{Zero}) builds an
  integer option initialized to a specified value (resp. to \texttt{0}).
\item \texttt{String}\sscodeidxdef{Parameter\_sig}{Builder}{String}
  (resp. \texttt{Empty\_string}
  \sscodeidxdef{Parameter\_sig}{Builder}{Empty\_string}) builds a string option
  initialized to a specified value (resp. to the empty string \texttt{""}).
\item \texttt{String\_set}\sscodeidxdef{Parameter\_sig}{Builder}{String\_set}
  builds an option taking a set of strings in argument (initialized to the empty
  set).
\item
  \texttt{Kernel\_function\_set}%
\sscodeidxdef{Parameter\_sig}{Builder}{Kernel\_function\_set} builds an option
taking a set of kernel functions in argument (initialized to the empty set).
\end{enumerate}
Each functor takes as argument (at least) the name of the command line option
corresponding to the parameter and a short description for this option.

\begin{example}
  The parameter corresponding to the option \texttt{-occurrence} of the plug-in
  \texttt{occurrence} is the module \texttt{Print} (defined in the file
  \texttt{src/plugins/occurrence/options.ml}). It is implemented as follows.
  \sscodeidx{Parameter\_sig}{Builder}{False}
\begin{ocamlcode}
module Print =
  False
    (struct
       let option_name = "-occurrence"
       let help = "print results of occurrence analysis"
     end)
\end{ocamlcode}
So it is a boolean parameter initialized by default to
\texttt{false}. The declared interface for this module is simply
\scodeidx{Parameter\_sig}{Bool}
\begin{ocamlcode}
module Print: Parameter_sig.Bool
\end{ocamlcode}

Another example is the parameter corresponding to the option
\texttt{-impact-pragma} of the plug-in \texttt{impact}. This parameter is
defined by the module \texttt{Pragma} (defined in the file
\texttt{src/plugins/impact/options.ml}). It is implemented as follows.
\sscodeidx{Parameter\_sig}{Builder}{Kernel\_function\_set}
\begin{ocamlcode}
module Pragma =
  Kernel_function_set
    (struct
       let option_name = "-impact-pragma"
       let arg_name = "f1, ..., fn"
       let help = "use the impact pragmas in the code of functions f1,...,fn"
     end)
\end{ocamlcode}
Thus it is a set of \texttt{kernel\_function}s initialized by default to
the empty set. \framac uses
the field \texttt{arg\_name} in order to print the name of the argument when
displaying help. The field \texttt{help} is the help message itself. The
interface for this module is simply
\scodeidx{Parameter\_sig}{Kernel\_function\_set}
\begin{ocamlcode}
module Pragma: Parameter_sig.Kernel_function_set
\end{ocamlcode}
\end{example}

\begin{convention}
Parameters of a given plug-in \texttt{plugin} should belong to a module called
\texttt{Options}, \texttt{Plugin\_options}, \texttt{Parameters} or
\texttt{Plugin\_parameters} inside the plug-in directory.
\end{convention}

Using a kernel parameter or a parameter of your own plug-in is very simple:
you simply have to call the function \texttt{get} corresponding to your
parameter.
\scodeidx{Kernel}{Unicode}
\begin{example}
To know whether \framac uses unicode, just write
\begin{ocamlcode}
Kernel.Unicode.get ()
\end{ocamlcode}

Inside the plug-in \texttt{From}, just write
\scodeidx{From\_parameters}{ForceCallDeps}
\begin{ocamlcode}
From_parameters.ForceCallDeps.get ()
\end{ocamlcode}
in order to know whether callsite-wise dependencies have been required.
\end{example}

Using a parameter of a plug-in $p$ in another plug-in $p'$ requires the use of
module \texttt{Dynamic.Parameter}\scodeidx{Dynamic}{Parameter}: since the
module defining the parameter is not visible from the outside of its plug-in,
you have to use the dynamic API of plug-in $p$ in which $p$'s parameters are
automatically registered (see Section~\ref{adv:dynamic-registration}). The
module \texttt{Dynamic.Parameter} defines sub-modules which provide easy access
to parameters according to their \caml types.
\sscodeidx{Dynamic}{Parameter}{Bool}
\begin{example}
Outside the plug-in \texttt{From}, just write
\begin{ocamlcode}
Dynamic.Parameter.Bool.get "-calldeps" ()
\end{ocamlcode}
in order to know whether callsite-wise dependencies have been required.
\end{example}

\subsection{Tuning}\label{options:tuning}

It is possible to modify the default behavior of command line options in several
ways by applying functions just before or just after applying the functor
defining the corresponding parameter. 

Functions which can be applied afterwards are defined in the output signature of
the applied functor.
\begin{example}
Here is how the option \texttt{-slicing-level} restricts the range of its
argument to the interval $[0;3]$.
\begin{ocamlcode}
module Calls =
  Int
    (struct
       let option_name = "-slicing-level"
       let default = 2
       let arg_name = ""
       let help = "..." (* skipped here *)
     end)
let () = Calls.set_range ~min:0 ~max:3
\end{ocamlcode}
\end{example}

Functions which can be applied before applying the functor are defined in the
module \texttt{Parameter\_customize}\codeidxdef{Parameter\_customize}.
\begin{example}
Here is how the opposite of option \texttt{-safe-arrays} is renamed into
\texttt{-unsafe-arrays} (otherwise, by default, it would be
\texttt{-no-safe-arrays}).
\scodeidx{Parameter\_customize}{set\_negative\_option\_name} 
\scodeidx{Kernel}{SafeArrays}
\begin{ocamlcode}
let () = Parameter_customize.set_negative_option_name "-unsafe-arrays"
module SafeArrays =
  True
    (struct
       let module_name = "SafeArrays"
       let option_name = "-safe-arrays"
       let help = "for arrays that are fields inside structs, assume that €
                   accesses are in bounds"
     end)
\end{ocamlcode}
\end{example}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Initialization Steps}\label{adv:init}
\index{Initialization|bfit}
\index{Module Initialization|see{Initialization}}
\index{Plug-in!Initialization|see{Initialization}}

\begin{prereq}
  Knowledge of linking of \caml files.
\end{prereq}

\begin{important}
  As for any \caml program, \framac modules are initialized in link
  order\index{Linking}, which remains mostly unspecified, so you have to use
  side-effects\index{Side-Effect} at module initialization time carefully.
\end{important}

This section details the different stages of the \framac boot process
to help advanced plug-in developers interact more deeply with
the kernel process. It can also be useful for debugging initialization
problems.

As a general rule, plug-in routines must never be executed at link time. Any
useful code, be it for registration, configuration or \textsf{C}-code analysis,
should be registered as \emph{function hooks} to be executed at a proper time
during the \framac boot process. In general, registering and executing a hook is
tightly coupled with handling the command line parameters.

The parsing of the command line parameters is performed in several
\emph{phases} and \emph{stages}
\index{Command Line!Parsing}, 
each one dedicated to specific operations. For instance,
journal replays should be performed after loading dynamic plug-ins, and so
on. Following the general rule stated at the beginning of this section, even
the kernel services of \framac are internally registered as hook routines to
be executed at a specific stage of the initialization process, alongside
plug-in ones.

From the plug-in developer's point of view, hooks are registered by
calling the \texttt{run\_after\_$xxx$\_stage} routines of the
\texttt{Cmdline}\codeidx{Cmdline} module and the \texttt{extend} routine
of the \texttt{Db.Main}\sscodeidx{Db}{Main}{extend} module.
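For instance, the following sketch registers a main entry point together with
a hook executed at the configuring stage; the messages, emitted here through
\texttt{Kernel.feedback} for simplicity, would normally go through the
plug-in's own \texttt{Plugin.Register} output module.
\begin{ocamlcode}
(* a sketch: registering hooks instead of running code at link time *)
let main () = Kernel.feedback "running my analysis"
let () = Db.Main.extend main

let configure () = Kernel.feedback "configuring stage reached"
let () = Cmdline.run_after_configuring_stage configure
\end{ocamlcode}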

The initialization phases and stages of \framac are described below, in
their execution order.
\begin{enumerate}[A --]

\item \textbf{The Initialization Stage:}\index{Initialization} this stage
  initializes \framac compilation units, following some \emph{partially}
  specified order. More precisely:
  \begin{enumerate}[1.]
  \item the architecture dependencies depicted on
    Figure~\ref{fig:architecture} (cf.~p.~\pageref{archi:general}) are
    respected. In particular, the kernel services are linked first,
    \emph{then} the kernel integrated types for plug-ins, and
    \emph{finally} the plug-ins are linked in unspecified order;
  \item when the GUI\index{Plug-in!GUI} is present, for any plug-in
    $p$, the non-gui modules of $p$ are always linked \emph{before}
    the gui modules of $p$;
  \item finally, the module \texttt{Boot}\codeidxdef{Boot} is linked at the
    very end of this stage.
  \end{enumerate}

  Plug-in developers cannot customize this stage. In particular, the module
  \texttt{Cmdline}\codeidx{Cmdline} (one of the first linked modules, see
  Figure~\ref{fig:architecture}) performs some very early configuration steps,
  such as checking if journalization has to be activated
  (cf.~Section~\ref{adv:journalization}\index{Journalization}), or setting the
  global verbosity and debugging levels.

\item \label{stage:early} \textbf{The Early Stage:} this stage initializes the
  kernel services. More precisely:
  \begin{enumerate}[(a)]
  \item first, the journal name is set to its right value (according to the
    option \texttt{-journal-name}) and the default project is created;
  \item then, the command line options registered for the
    \texttt{Cmdline.Early}\sscodeidxdef{Cmdline}{stage}{Early} stage are
    parsed;
  \item finally, all functions registered through
    \texttt{Cmdline.run\_after\_early\_stage}%
    \scodeidxdef{Cmdline}{run\_after\_early\_stage} are executed in an
    unspecified order.
  \end{enumerate}

\item \label{stage:extending} \textbf{The Extending Stage:}\index{Linking} the
  search for and loading of dynamically linked plug-ins, journals, scripts and
  modules are performed at this stage. More precisely:

  \begin{enumerate}[(a)]
  \item the command line options registered for the
    \texttt{Cmdline.Extending}\sscodeidxdef{Cmdline}{stage}{Extending} stage
    are treated, such as \texttt{-load-script} and
    \texttt{-add-path};

  \item the hooks registered through
    \texttt{Cmdline.run\_during\_extending\_stage}
    \scodeidxdef{Cmdline}{run\_during\_extending\_stage} are executed. Such
    hooks include kernel function calls for searching, loading and linking the
    various plug-ins, journal and scripts compilation units, with respect to
    the command line options parsed during stages~\ref{stage:early}
    and~\ref{stage:extending}.
  \end{enumerate}

\item \textbf{The Running Phase:} the command line is split into several
  groups of command line arguments, each of them separated by an option
  \texttt{-then} or an option \texttt{-then-on $p$} (thus if there is $n$
  occurrences of \texttt{-then} or \texttt{-then-on $p$}, then there are $n+1$
  groups). For each group, the following stages are executed in sequence: all
  the stages are executed on the first group provided on the command line, then
  they are executed on the second group, and so on.

\begin{enumerate}[1.]

\item \textbf{The Extended Stage:} this step is reserved for commands which
  require that all plug-ins are loaded but which must be executed very
  early. More precisely:

  \begin{enumerate}[(a)]
  \item the command line options registered for the
    \texttt{Cmdline.Extended}\sscodeidxdef{Cmdline}{stage}{Extended} stage
    are treated, such as \texttt{-verbose-*} and \texttt{-debug-*};

  \item the hooks registered through
    \texttt{Cmdline.run\_after\_extended\_stage}%
    \scodeidxdef{Cmdline}{run\_after\_extended\_stage}.
    Most of these registered hooks come from postponed internal-state
    dependencies\index{State!Dependency!Postponed} (see
    Section~\ref{proj:computation}).
  \end{enumerate}

  Note that both statically and dynamically linked plug-ins have
  been loaded at this stage. Verbosity and debug level for each
  plug-in are determined during this stage.

\item \textbf{The Exiting Stage:} this step is reserved for commands
  that make \framac exit before starting any analysis at all, such as
  printing help information:
  \begin{enumerate}[(a)]

  \item the command line options registered for the
    \texttt{Cmdline.Exiting}\sscodeidxdef{Cmdline}{stage}{Exiting} stage
    are treated;

  \item the hooks registered through
    \texttt{Cmdline.run\_after\_exiting\_stage}
    \scodeidxdef{Cmdline}{run\_after\_exiting\_stage} are executed in
    an unspecified order. All these functions should do nothing (using
    \texttt{Cmdline.nop}\scodeidxdef{Cmdline}{nop}) or raise
    \texttt{Cmdline.Exit}\scodeidxdef{Cmdline}{Exit} for stopping \framac
    quickly.

  \end{enumerate}

\item \textbf{The Loading Stage:} this is where the initial state of \framac
  can be replaced by another one. Typically, it would be loaded from disk
  through the \texttt{-load} option\index{Loading} or computed by running a
  journal (see Section~\ref{adv:journalization})\index{Journalization}. As
  for the other stages:
  \begin{enumerate}[(a)]

  \item first, the command line options registered for the
    \texttt{Cmdline.Loading}\sscodeidxdef{Cmdline}{stage}{Loading} stage
    are treated;

  \item then, the hooks registered through
    \texttt{Cmdline.run\_after\_loading\_stage}
    \scodeidxdef{Cmdline}{run\_after\_loading\_stage} are executed in
    an unspecified order. These functions actually change the initial
    state of \framac with the specified one. The \framac kernel
    verifies as far as possible that only one new initial state has
    been specified.
  \end{enumerate}

  Normally, plug-ins should never register hooks for this stage unless they
  actually set a different initial state than the default one. In such a case:

  \begin{important}
    They must call the function
    \texttt{Cmdline.is\_going\_to\_load}
    \scodeidxdef{Cmdline}{is\_going\_to\_load} while initializing.
  \end{important}

\item \textbf{The Configuring Stage:} this is the usual place for
  plug-ins to perform special initialization routines if necessary,
  \emph{before} having their main entry points executed. As for
  previous stages:

  \begin{enumerate}[(a)]
  \item first, the command line options registered for the
    \texttt{Cmdline.Configuring}\sscodeidxdef{Cmdline}{stage}{Configuring} stage
    are treated. Command line parameters that do not begin with a hyphen
    (character \texttt{'-'}) are \emph{not} options and are treated as
    \textsf{C} files. Thus they are added to the list of files to be
    preprocessed or parsed for building the \texttt{AST} (on demand);
  \item then, the hooks registered through
    \texttt{Cmdline.run\_after\_configuring\_stage}
    \scodeidxdef{Cmdline}{run\_after\_configuring\_stage} are executed
    in an unspecified order.
  \end{enumerate}

\item \textbf{The Setting Files Stage:} this stage sets the \C files to analyze
  according to those indicated on the command line. More precisely:

  \begin{enumerate}[(a)]
  \item first, each argument of the command line which does not begin with a
    hyphen (character \texttt{'-'}) is registered for later analysis;
  \item then, the hooks registered through
    \texttt{Cmdline.run\_after\_setting\_files}
    \scodeidxdef{Cmdline}{run\_after\_setting\_files} are executed in an
    unspecified order.
  \end{enumerate}

\item \textbf{The Main Stage:} this is the step where plug-ins actually run
  their main entry points registered through
  \texttt{Db.Main.extend}\sscodeidx{Db}{Main}{extend}. For all intents and
  purposes, you should consider that this stage is the one where these hooks
  are executed.

\end{enumerate}

\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Customizing the AST creation}\label{sec:customizing-ast}
\begin{prereq}
  None.
\end{prereq}

Plug-ins may modify the way source files are transformed into the AST
over which the analyses are performed. Customization of the front-end of
\framac can be done at several stages.

\begin{enumerate}[A --]
\item\textbf{Parsing:} this stage takes care of converting an individual source
file into a parsed AST (a.k.a.\ Cabs, which differs from the type-checked AST on
which most analyses operate). By default, source files are treated as C files,
possibly needing a pre-processing phase. It is possible to tell Frama-C to use
another parser for files ending with a given suffix by registering this
parser with the \texttt{File.new\_file\_type}\scodeidxdef{File}{new\_file\_type}
function. Suffixes \texttt{.h}, \texttt{.i}, \texttt{.c} and \texttt{.ci}
are reserved for
Frama-C kernel. The registered parser is supposed to return a pair consisting
of a type-checked AST (\verb+Cil_types.file+\scodeidx{Cil\_types}{file}) and
a parsed AST (\verb+Cabs.file+\scodeidx{Cabs}{file}). The former can be obtained
from the latter with the \verb+Cabs2cil.convFile+\scodeidx{Cabs2cil}{convFile}
function, which guarantees that the resulting \verb+Cil_types.file+ respects
all invariants expected by the Frama-C kernel.
\item\textbf{Type-checking:} a normal \verb+Cabs.file+ ({\it i.e.} not obtained
through a custom parsing function) can be transformed before being 
type-checked. Transformation hooks are registered through
\verb+Frontc.add_syntactic_transformation+%
\scodeidxdef{Frontc}{add\_syntactic\_transformation}.
\item\textbf{After linking:} Once all source files have been processed, they
  are all linked together in a single AST. Transformations can be performed
  on the resulting AST at two stages:
  \begin{enumerate}[1.]
  \item before clean-up ({\it i.e.} removal of useless temporary variables
    and prototypes that are never called). At that stage, global tables indexing
    information related to the AST have not yet been filled.
  \item after clean-up. At this stage, index tables are filled and can thus
    be used. On the other hand, the transformation itself must take care of
    keeping the AST and the tables in sync.
  \end{enumerate}
  Registering a transformation for this stage is done through the function
  \verb+File.add_code_transformation_before_cleanup+%
  \scodeidxdefsmall{File}{add\_code\_transformation\_before\_cleanup}
  (respectively \verb+File.add_code_transformation_after_cleanup+%
  \scodeidxdefsmall{File}{add\_code\_transformation\_after\_cleanup}). If such a
  transformation modifies the control-flow graph of a function \texttt{f}, in
  particular by adding statements, it must call
  \verb|File.must_recompute_cfg|\scodeidxdef{File}{must\_recompute\_cfg}, in
  order to have the graph recomputed afterwards. A small sketch is given
  below.
\end{enumerate}
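For instance, the following sketch registers a transformation executed after
clean-up; the category name and the body of the transformation are
hypothetical.
\begin{ocamlcode}
(* a sketch: a transformation over the linked AST, run after clean-up *)
let my_category = File.register_code_transformation_category "my_transform"

let my_transform (ast: Cil_types.file) =
  (* inspect or rewrite [ast] here; if the control-flow graph of a
     function [f] is modified, [File.must_recompute_cfg f] must be called
     so that the graph is recomputed afterwards *)
  ignore ast

let () =
  File.add_code_transformation_after_cleanup my_category my_transform
\end{ocamlcode}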

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Customizing the machine model}\label{sec:customizing-machdep}
\index{Machine model}

\begin{prereq}
  None.
\end{prereq}

Several aspects of the C standard that are implementation-defined, such as
the width of standard integer types, endianness, signedness of the
\texttt{char} type, etc., as well as a few compiler and architecture specific
features, can be customized using a \texttt{machdep} configuration,
defining a new machine model.

To create a new machine model, define an instance of \verb+Cil_types.mach+%
\scodeidxdef{Cil\_types}{mach}. You can base it on the examples available in
\verb+tests/misc/custom_machdep/custom_machdep.ml+ and
\verb+src/kernel_internals/runtime/machdeps.ml+.
The new definition can be added to \framac's database using
\verb+File.new_machdep+\scodeidxdef{File}{new\_machdep}.

\begin{example}
  A custom machine description may be implemented as follows
  (the meaning of each field is presented later in this section):
\begin{ocamlcode}
let my_machine =
{
  version          = "generic C compiler for my machine";
  compiler         = "generic";
  cpp_arch_flags   = [];
  sizeof_short     = 2;
  sizeof_int       = 4;
  sizeof_long      = 4;
  sizeof_longlong  = 8;
  sizeof_ptr       = 4;
  sizeof_float     = 4;
  sizeof_double    = 8;
  sizeof_longdouble  = 12;
  sizeof_void      = 1;
  sizeof_fun       = 1;
  size_t = "unsigned long";
  wchar_t = "int";
  ptrdiff_t = "int";
  alignof_short = 2;
  alignof_int = 4;
  alignof_long = 4;
  alignof_longlong = 4;
  alignof_ptr = 4;
  alignof_float = 4;
  alignof_double = 4;
  alignof_longdouble = 4;
  alignof_str = 1;
  alignof_fun = 1;
  alignof_aligned = 16;
  char_is_unsigned = false;
  const_string_literals = true;
  little_endian = true;
  underscore_name = false ;
  has__builtin_va_list = true;
  __thread_is_keyword = true;
}

let () = File.new_machdep "my_machine" my_machine
\end{ocamlcode}
\end{example}

\index{Command Line!-machdep@\texttt{-machdep}}%
After this code is loaded, \framac can be instructed to use the new machine
model using the \texttt{-machdep} command line option.

If you intend to use \framac's standard library headers, you must also do the
following:

\begin{itemize}
\item define constant \verb+__FC_MACHDEP_<CUSTOM>+, replacing \verb+<CUSTOM>+
  with the name (in uppercase letters) of your created machdep;
  this can be done via \verb+-cpp-extra-args="-D__FC_MACHDEP_<CUSTOM>"+;
\item provide a header file with macro definitions corresponding to your \caml
  definitions. For the most part, these are macros prefixed by \verb+__FC_+,
  corresponding to standard C macro definitions, {\it e.g.},
  \verb+__FC_UCHAR_MAX+. These definitions are used by \framac's
  \verb+<limits.h>+ and other headers to provide the standard C definitions.
  The test file \verb+tests/misc/custom_machdep/__fc_machdep_custom.h+
  (reproduced below)
  contains a complete example of the required definitions. Other examples can
  be found in \verb+share/libc/__fc_machdep.h+.
\end{itemize}
Make sure that your custom header defines the \verb+__FC_MACHDEP+
include guard, and that the program you are analyzing includes this header
before all other headers. One way to ensure this without having to modify any
source files is to use an option such as \verb+-include+ in GCC.

\begin{example}
  Contents of \verb+tests/misc/custom_machdep/__fc_machdep_custom.h+, used as
  example for creating custom machdeps. Notice the unusual size for \verb+int+
  (3 bytes), selected for testing purposes only, and inconsistent with the
  the chosen values for \verb+INT_MIN+ and \verb+INT_MAX+, which do not fit
  in 3 bytes.
\lstinputlisting{../../tests/misc/custom_machdep/__fc_machdep_custom.h}
\end{example}

An example of the complete command-line is presented below, for a custom
machdep called \texttt{myarch}, defined in file \verb+my_machdep.ml+ and
with stdlib constants defined in \verb+machdep_myarch.h+:

\begin{listing-nonumber}
  frama-c -load-script my_machdep.ml -machdep myarch \
    -cpp-extra-args="-D__FC_MACHDEP_MYARCH -include machdep_myarch.h"
\end{listing-nonumber}

\section{Machdep record fields}\label{sec:machdep-fields}

Each field in the machdep record is succinctly described in the \verb+Cil_types+
module. We present below a thorough description of each field.

\begin{description}
\item[\texttt{version}]: human-readable textual description of the machdep.
\item[\texttt{compiler}]: defines whether special compiler-specific extensions
  will be enabled. It should be one of the strings below:
  \begin{itemize}
  \item[\texttt{msvc}]: enables \verb+Cil.msvcMode+, that is,
    MSVC (Visual Studio)-specific extensions;
  \item[\texttt{gcc}]: enables \verb+Cil.gccMode+, that is,
    GCC-specific extensions;
  \item[\texttt{generic}] (or any other string): no special
    compiler-specific extensions.
  \end{itemize}
  Note that some compiler extensions, such as attributes, are always enabled.
\item[\texttt{cpp\_arch\_flags}]: list of arguments to be added to the
  command-line when invoking the C preprocessor. Typically used to ensure that
  multiarch compilers apply the appropriate predefined macros\footnote{Note that
    the sizes of standard integer types are already defined in the machdep,
    so they do not depend on these flags.}.
  E.g. use \verb+["-m32"]+ for a 32-bit machdep when preprocessing with a
  64-bit multiarch GCC.
  Note that, in practice, very few programs rely on such predefined macros,
  such as \verb+__x86_64+ and \verb+__i386+.
  \item[\texttt{sizeof\_short}]: size (in bytes) of the \verb+short+ type.
  \item[\texttt{sizeof\_int}]: size (in bytes) of the \verb+int+ type.
  \item[\texttt{sizeof\_long}]: size (in bytes) of the \verb+long+ type.
  \item[\texttt{sizeof\_longlong}]: size (in bytes) of the \verb+long long+
    type.
    Note that machdeps (for compiler \verb+"gcc"+ in particular) must always
    have at least one type that is 8 bytes wide, which is typically
    \verb+long long+.
  \item[\texttt{sizeof\_ptr}]: size (in bytes) of an object (non-function)
    pointer.
  \item[\texttt{sizeof\_float}]: size (in bytes) of a single-precision floating
    point. In implementations compliant with ISO/IEC/IEEE 60559 (IEEE 754),
    this is always 4.
  \item[\texttt{sizeof\_double}]: size (in bytes) of a double-precision floating
    point. In implementations compliant with ISO/IEC/IEEE 60559 (IEEE 754),
    this is always 8.
  \item[\texttt{sizeof\_longdouble}]: size (in bytes) of a \verb+long double+
    floating point.
    Note: type \verb+long double+ is currently not supported by existing
    \framac plugins, but this field exists for future expansion, and
    to compute \verb+sizeof+ of aggregates properly.
  \item[\texttt{sizeof\_void}]: the result of evaluating \verb+sizeof(void)+
    by the compiler (or 0 if unsupported).
  \item[\texttt{sizeof\_fun}]: the result of evaluating \verb+sizeof(f)+, where
    \verb+f+ is a function ({\em not} a function pointer) by the compiler
    (or negative if unsupported).
  \item[\texttt{size\_t}]: a string containing the actual type that
    \verb+size_t+ expands to, e.g. \verb+"unsigned long"+.
  \item[\texttt{wchar\_t}]: a string containing the actual type that
    \verb+wchar_t+ expands to. If unsupported, you can use \verb+int+.
  \item[\texttt{ptrdiff\_t}]: a string containing the actual type that
    \verb+ptrdiff_t+ expands to. If unsupported, you can use \verb+int+.
  \item[\texttt{alignof\_short}]: the result of evaluating
    \verb+_Alignof(short)+.
  \item[\texttt{alignof\_int}]: the result of evaluating \verb+_Alignof(int)+.
  \item[\texttt{alignof\_long}]: the result of evaluating \verb+_Alignof(long)+.
  \item[\texttt{alignof\_longlong}]: the result of evaluating
    \verb+_Alignof(long long)+.
  \item[\texttt{alignof\_ptr}]: the result of evaluating \verb+_Alignof(char*)+
    (or any other pointer, including function pointers).
  \item[\texttt{alignof\_float}]: the result of evaluating
    \verb+_Alignof(float)+.
  \item[\texttt{alignof\_double}]: the result of evaluating
    \verb+_Alignof(double)+.
  \item[\texttt{alignof\_longdouble}]: the result of evaluating
    \verb+_Alignof(long double)+.
  \item[\texttt{alignof\_str}]: the result of evaluating \verb+_Alignof("a")+
    (a literal string).
  \item[\texttt{alignof\_fun}]: the result of evaluating \verb+_Alignof(f)+,
    where \verb+f+ is a function (or negative if unsupported).
  \item[\texttt{alignof\_aligned}]: the default alignment of a type having the
    \verb+aligned+ attribute (or 1 if unsupported). This corresponds to the
    default alignment when using \verb+#pragma packed()+ without a numeric
    argument.
  \item[\texttt{char\_is\_unsigned}]: whether type \verb+char+ is unsigned.
  \item[\texttt{const\_string\_literals}]: whether string literals have const
    chars, or are writable. If \verb+true+, the following code has undefined
    behavior, otherwise it is defined: \verb+char *s = "no"; s[0] = 'g';+.
  \item[\texttt{little\_endian}]: whether the machine is little endian.
  \item[\texttt{underscore\_name}]: whether the compiler generates assembly
    labels by prepending \verb+_+ to the identifier. That is, will function
    \verb+foo()+ have the label \verb+foo+, or \verb+_foo+?
  \item[\texttt{has\_\_builtin\_va\_list}]: whether \verb+__builtin_va_list+
    is a (built-in) type known by the preprocessor.
  \item[\texttt{\_\_thread\_is\_keyword}]: whether \verb+__thread+ is a
    keyword (otherwise, it can be used as a standard identifier).
\end{description}

\paragraph{Writing a new machdep}

Writing a machdep for a new architecture is not trivial, due to the fact
that some steps are hard to automate. If you have a working compiler for the
target architecture, you can use it to produce an executable that will print
the contents of expressions such as \verb+sizeof(long)+, \verb+_Alignof(int)+,
etc. You can also use the compiler to test for unsupported features.
In case \verb+printf+ is not available, you can use the exit code of the
program (return code of \verb+main+). In case you can only preprocess, but not
compile and run the program, the assembly code may provide some useful data.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Visitors}\label{adv:visitors}\index{Visitor|bfit}

\begin{prereq}
  Knowledge of \caml object programming.
\end{prereq}

Module \texttt{Cil}\codeidx{Cil} offers a visitor\index{Visitor!Cil|bfit},
\verb+Cil.cilVisitor+\scodeidxdef{Cil}{cilVisitor}, 
which allows traversing (parts of) an
AST\index{AST}. It is a class with one method per type of the AST, whose
default behavior is simply to visit the node's
children. This is a convenient way to perform local transformations over a
whole \verb+Cil_types.file+\scodeidx{Cil\_types}{file} by inheriting from it
and redefining a few methods. However, the original \cil visitor is of course
not aware of the internal state of \framac itself\index{State}. Hence,
there exists another visitor, \verb+Visitor.generic_frama_c_visitor+%
\scodeidxdef{Visitor}{generic\_frama\_c\_visitor}, which handles
projects\index{Project} in a transparent way for the user. There are very few
cases where the plain \cil visitor should be used.

\begin{important}
  Basically, as soon as the initial project\index{Project!Initial} has been
  built from the C source files (\emph{i.e.} one of the functions
  \texttt{File.init\_$*$}\scodeidx{File}{init\_from\_c\_files}%
  \scodeidx{File}{init\_project\_from\_cil\_file}%
  \scodeidx{File}{init\_project\_from\_visitor}%
  \scodeidx{File}{init\_from\_cmdline} has been applied), only the \framac
  visitor should be used.
\end{important}

There are a few differences between the two (the \framac visitor
inherits from the \cil one).  These differences are summarized in
Section~\ref{adv:sec:diff-betw-cil}, which the reader already familiar with
\cil is invited to read carefully.

\subsection{Entry Points}

Module \texttt{Cil}\codeidx{Cil} offers various entry points for the visitor%
\index{Visitor!Cil!Entry Point}. They are functions called
\verb+Cil.visitCil+\emph{AstType}\scodeidxdef{Cil}{visitCil$AstType$} where
\emph{astType} is a node type in the \cil AST. Such a function takes as
argument an instance of a \verb+cilVisitor+\scodeidx{Cil}{cilVisitor} and an
\emph{astType} and gives back an \emph{astType} transformed according to the
visitor. The entry points for visiting a whole
\verb+Cil_types.file+\scodeidx{Cil\_types}{file} (\verb+Cil.visitCilFileCopy+%
\scodeidxdef{Cil}{visitCilFileCopy}, \verb+Cil.visitCilFile+%
\scodeidxdef{Cil}{visitCilFile} and \verb+visitCilFileSameGlobals+%
\scodeidxdef{Cil}{visitCilFileSameGlobals}) are slightly different and do not
support all kinds of visitors.  See the documentation attached to them in
\verb+cil.mli+ for more details.
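As a minimal sketch, here is an in-place \framac visitor that merely logs the
name of every function definition; it is applied to the whole AST through
\texttt{Visitor.visitFramacFileSameGlobals}, the \framac counterpart of
\verb+Cil.visitCilFileSameGlobals+. The logging is purely illustrative, and
the \verb+vglob_aux+ method is presented in the next section.
\begin{ocamlcode}
(* a sketch: an in-place visitor logging every function definition *)
class log_functions = object
  inherit Visitor.frama_c_inplace
  method! vglob_aux g =
    (match g with
     | Cil_types.GFun (f, _) ->
       Kernel.feedback "found function %s" f.Cil_types.svar.Cil_types.vname
     | _ -> ());
    Cil.SkipChildren
end

let run () =
  Visitor.visitFramacFileSameGlobals (new log_functions) (Ast.get ())
\end{ocamlcode}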

\subsection{Methods}\label{adv:sec:methods}

As said above, there is a method for each type in the \cil AST\index{AST}
(including for logic annotation\index{Annotation}). For a given type
\emph{astType}, the method is called \texttt{v}\emph{astType}\footnote{This
  naming convention is not strictly enforced. For instance the method
  corresponding to \texttt{offset}\scodeidx{Cil\_types}{offset} is
  \texttt{voffs}\sscodeidxdef{Cil}{cilVisitor}{voffs}.}, and has type
\mbox{\emph{astType}$\rightarrow$\emph{astType'}~\texttt{visitAction}}, where
\emph{astType'} is either \emph{astType} or \emph{astType}~\texttt{list} (for
instance, one can transform a \verb+global+\scodeidx{Cil\_types}{global} into
several ones). \texttt{visitAction}\scodeidxdef{Cil}{visitAction} describes
what should be done for the children of the resulting AST node, and is
presented in the next section. In addition, some types have two modes
of visit: one for the declaration and one for use. This is the case for
\verb+varinfo+\scodeidx{Cil\_types}{varinfo}
(\verb+vvdec+\sscodeidxdef{Cil}{cilVisitor}{vvdec} and
\verb+vvrbl+\sscodeidxdef{Cil}{cilVisitor}{vvrbl}),
\verb+logic_var+\scodeidx{Cil\_types}{logic\_var}
(\verb+vlogic_var_decl+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_var\_decl} and
\verb+vlogic_var_use+\sscodeidx{Cil}{cilVisitor}{vlogic\_var\_use}),
\verb+logic_info+\scodeidx{Cil\_types}{logic\_info}
(\verb+vlogic_info_decl+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_info\_decl} and
\verb+vlogic_info_use+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_info\_use}),
\verb+logic_type_info+\scodeidx{Cil\_types}{logic\_type\_info}
(\verb+vlogic_type_info_decl+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_type\_info\_decl} and
\verb+vlogic_type_info_use+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_type\_info\_use}), and
\verb+logic_ctor_info+\scodeidx{Cil\_types}{logic\_ctor\_info}
(\verb+vlogic_ctor_info_decl+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_ctor\_info\_decl} and
\verb+vlogic_ctor_info_use+\sscodeidxdef{Cil}{cilVisitor}{vlogic\_ctor\_info\_use}).
More detailed information can be found in \verb+cil.mli+.

\begin{important}
  For the \framac visitor, two methods,
  \verb+vstmt+\sscodeidx{Cil}{cilVisitor}{vstmt} 
  and \verb+vglob+\sscodeidx{Cil}{cilVisitor}{vglob} take
  care of maintaining the coherence\index{Consistency} between the transformed
  AST\index{AST!Modification} and the internal state of \framac%
  \index{State}. Thus they must not be redefined. One should redefine
  \verb+vstmt_aux+\sscodeidxdef{Visitor}{frama\_c\_visitor}{vstmt\_aux} and
  \verb+vglob_aux+\sscodeidxdef{Visitor}{frama\_c\_visitor}{vglob\_aux} instead.
\end{important}

\subsection{Action Performed}\label{adv:sec:action-performed}

The return value of visiting methods indicates what should be done
next. There are eight possibilities:
\begin{itemize}
\item \verb+SkipChildren+\sscodeidx{Cil}{visitAction}{SkipChildren} the visitor
  does not visit the children;
\item \verb+ChangeTo v+\sscodeidx{Cil}{visitAction}{ChangeTo} the old node is
  replaced by \verb+v+ and the visit stops;
\item \verb+DoChildren+\sscodeidx{Cil}{visitAction}{DoChildren} the visit goes
  on with the children; this is the default behavior;
\item \verb+JustCopy+\sscodeidx{Cil}{visitAction}{JustCopy} is only meaningful
  for the copy visitor: it indicates that the visit should go on with the
  children, but only perform a fresh copy of the nodes;
\item \verb+ChangeToPost(v,f)+\sscodeidx{Cil}{visitAction}{ChangeToPost} the old
  node is replaced by \verb+v+, and \verb+f+ is applied to the result. This is
  however not exactly the same thing as returning \verb+ChangeTo(f(v))+. Namely,
  in the case of \verb+vglob_aux+, \verb+f+ will be applied to \verb+v+ only
  \emph{after} the operations needed to maintain the consistency of \framac's
  internal state with respect to the AST have been performed.  Thus,
  \verb+ChangeToPost+ should be used with extreme caution, as \verb+f+ could
  break some invariants of the kernel.
\item \verb+DoChildrenPost f+\sscodeidx{Cil}{visitAction}{DoChildrenPost}
  visits the children and applies the given function to the result;
\item \verb+JustCopyPost(f)+\sscodeidx{Cil}{visitAction}{JustCopyPost} is only
  meaningful for the copy visitor: it performs a fresh copy of the node and
  all of its children, and applies \verb+f+ to the copy;
\item
  \verb+ChangeDoChildrenPost(v,f)+%
  \sscodeidx{Cil}{visitAction}{ChangeDoChildrenPost}
  the old node is replaced by \verb+v+, the visit goes on with the children of
  \verb+v+, and when it is finished, \verb+f+ is applied to the result. In the
  case of \verb+vstmt_aux+, \verb+f+ is called after the annotations in the
  annotations table have been visited, but \emph{before} they are attached to
  the new statement, that is, they will be added to the result of
  \verb+f+. Similarly, \verb+vglob_aux+ will consider the
   result of \verb+f+ when filling the table of globals. Note that
   \verb+ChangeDoChildrenPost(x,f)+ where \verb+x+ is the current node
   is \textit{not} equivalent to \verb+DoChildrenPost f+, as in the latter
   case, the visitor mechanism knows that it still deals with the original node.
\end{itemize}
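The following sketch contrasts two of these actions within a single method:
constant expressions need no further visit, while other expressions are
visited and then post-processed (the debugging output is illustrative).
\begin{ocamlcode}
(* a sketch: choosing a visitAction depending on the visited node *)
class skip_constants = object
  inherit Visitor.frama_c_inplace
  method! vexpr e =
    match e.Cil_types.enode with
    | Cil_types.Const _ -> Cil.SkipChildren
    | _ ->
      Cil.DoChildrenPost
        (fun e' -> Kernel.debug "visited %a" Printer.pp_exp e'; e')
end
\end{ocamlcode}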

\subsection{Visitors and Projects}\label{sec:visitors-projects}\index{Project}

Copy visitors\index{Visitor!Copy} (see next section) implicitly take an
additional argument, which is the project in which the transformed
AST\index{AST!Modification} should be put.

Note that the tables of the new project are not filled immediately. Instead,
actions are queued, and performed when a whole
\verb+Cil_types.file+\scodeidx{Cil\_types}{file} has been visited. One can
access the queue with the
\verb+get_filling_actions+\sscodeidxdef{Cil}{cilVisitor}{get\_filling\_actions}
method, and perform the associated actions on the new project with the
\verb+fill_global_tables+\sscodeidxdef{Cil}{cilVisitor}{fill\_global\_tables}
method.

In-place visitors\index{Visitor!In-Place}  always operate on the
current project\index{Project!Current} (otherwise, two projects would
risk sharing\index{Sharing} the same AST\index{AST!Sharing|see{Sharing}}).

\subsection{In-place and Copy Visitors}\label{adv:sec:place-copy-visitors}
\index{Visitor!Behavior|bfit}

The visitors take as argument a
\verb+visitor_behavior+\scodeidx{Cil}{visitor\_behavior}, which comes in two
flavors: \verb+inplace_visit+\scodeidx{Cil}{inplace\_visit}%
\index{Visitor!In-Place|bfit} and \verb+copy_visit+%
\scodeidx{Cil}{copy\_visit}\index{Visitor!Copy|bfit}. In the in-place mode,
nodes are visited in place, while in the copy mode, nodes are copied and the
visit is done on the copy\index{AST!Copying}. For the nodes
shared\index{Sharing} across the AST
(\verb+varinfo+\scodeidx{Cil\_types}{varinfo},
\verb+compinfo+\scodeidx{Cil\_types}{compinfo},
\verb+enuminfo+\scodeidx{Cil\_types}{enuminfo},
\verb+typeinfo+\scodeidx{Cil\_types}{typeinfo},
\verb+stmt+\scodeidx{Cil\_types}{stmt},
\verb+logic_var+\scodeidx{Cil\_types}{logic\_var},
\verb+logic_info+\scodeidx{Cil\_types}{logic\_info} and
\verb+fieldinfo+\scodeidx{Cil\_types}{fieldinfo}), sharing is of course
preserved, and the mapping between the old nodes and their copy can be
manipulated explicitly through the following functions:
\begin{itemize}
\item
  \verb+reset_behavior_+\emph{name}
  \scodeidxdef{Cil}{reset\_behavior\_varinfo}
  resets the mapping corresponding to the type \emph{name}.
\item \verb+get_original_+\emph{name}\scodeidxdef{Cil}{get\_original\_varinfo}
  gets the original value corresponding to a copy (and behaves as the identity
  if the given value is not known).
\item \verb+get_+\emph{name}\scodeidxdef{Cil}{get\_varinfo} gets the copy
  corresponding to an old value. If the given value is not known, it behaves as
  the identity.
\item \verb+set_+\emph{name}\scodeidxdef{Cil}{set\_varinfo} sets a copy for a
  given value. Be sure to use it before any occurrence of the old value has
  been copied, or sharing will be lost.
\end{itemize}
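For instance, within a copy visitor, the original node corresponding to the
copy being visited can be retrieved through the behavior of the visitor, as in
the following sketch (the debugging output is illustrative).
\begin{ocamlcode}
(* a sketch: retrieving the original varinfo from within a copy visitor *)
class my_copy_visitor prj = object (self)
  inherit Visitor.frama_c_copy prj
  method! vvrbl vi =
    let orig = Cil.get_original_varinfo self#behavior vi in
    Kernel.debug "visiting a copy of %s" orig.Cil_types.vname;
    Cil.DoChildren
end
\end{ocamlcode}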

\begin{important}
  \verb+get_original_+\emph{name} functions allow retrieving additional
  information tied to the original AST nodes. Their result must not be modified
  in place\index{AST!Modification} (this would defeat the purpose of operating
  on a copy to leave the original AST untouched). Moreover, note that whenever
  the index used for \emph{name} is modified in the copy, the internal state of
  the visitor behavior\index{Visitor!Behavior} must be updated accordingly
  (\emph{via} the \verb+set_+\emph{name} function) for
  \verb+get_original_+\emph{name} to give correct results.\index{Consistency}
\end{important}

The list of such indices is given in Figure~\ref{fig:idx-visitor}.
\begin{figure}[htbp]
\begin{center}
\begin{tabular}{|l|l|}
  \hline
  Type & Index \\
  \hline
  \verb+varinfo+\scodeidx{Cil\_types}{varinfo} & \verb+vid+ \\
  \hline
  \verb+compinfo+\scodeidx{Cil\_types}{compinfo} & \verb+ckey+ \\
  \hline
  \verb+enuminfo+\scodeidx{Cil\_types}{enuminfo} & \verb+ename+ \\
  \hline
  \verb+typeinfo+\scodeidx{Cil\_types}{typeinfo} & \verb+tname+ \\
  \hline
  \verb+stmt+\scodeidx{Cil\_types}{stmt} & \verb+sid+ \\
  \hline
  \verb+logic_info+\scodeidx{Cil\_types}{logic\_info} &
  \verb+l_var_info.lv_id+ \\
  \hline
  \verb+logic_var+\scodeidx{Cil\_types}{logic\_var} & \verb+lv_id+ \\
  \hline
  \verb+fieldinfo+\scodeidx{Cil\_types}{fieldinfo} & \verb+fname+ and
  \verb+fcomp.ckey+ \\
  \hline
\end{tabular}
\end{center}
\caption{Indices of AST nodes.}\label{fig:idx-visitor}
\end{figure}

\begin{important}
  Last, when using a copy visitor\index{Visitor!Copy}, the actions (see previous
  section) \verb+SkipChildren+\sscodeidx{Cil}{visitAction}{SkipChildren} and
  \verb+ChangeTo+\sscodeidx{Cil}{visitAction}{ChangeTo} must be used with care,
  \emph{i.e.}  one has to ensure that the children are fresh. Otherwise, the new
  AST will share some nodes with the old one.\index{Sharing} Even worse, in such
  a situation the new AST might very well be left in an inconsistent state, with
  uses of a shared node ({\it e.g.} a \verb+varinfo+ for a function \verb+f+ in a
  function call) which do not match the corresponding declaration ({\it e.g.} the
  \verb+GFun+ definition of \verb+f+).

  When in doubt, a safe solution is to use
  \verb+JustCopy+\sscodeidx{Cil}{visitAction}{JustCopy} instead of
  \verb+SkipChildren+\sscodeidx{Cil}{visitAction}{SkipChildren} and
  \verb+ChangeDoChildrenPost(x,fun x -> x)+%
  \sscodeidx{Cil}{visitAction}{ChangeDoChildrenPost} instead of
  \verb+ChangeTo(x)+\sscodeidx{Cil}{visitAction}{ChangeTo}.
\end{important}
\subsection{Differences Between the \cil and \framac Visitors}
\label{adv:sec:diff-betw-cil}

As said in Section~\ref{adv:sec:methods}, \verb+vstmt+ and \verb+vglob+ should
not be redefined. Use \verb+vstmt_aux+ and \verb+vglob_aux+ instead. Be aware
that the entries corresponding to statements and globals in \framac tables are
considered more or less as children of the node. In particular, if the method
returns a \verb+ChangeTo+\sscodeidx{Cil}{visitAction}{ChangeTo} action (see
Section~\ref{adv:sec:action-performed}), it is assumed that it has taken care of
updating the tables accordingly, which can be a little tricky when copying a
\verb+file+\scodeidx{Cil\_types}{file}\index{AST!Copying} from one project to
another. Prefer
\verb+ChangeDoChildrenPost+\sscodeidx{Cil}{visitAction}{ChangeDoChildrenPost}.
On the other hand, a
\verb+SkipChildren+\sscodeidx{Cil}{visitAction}{SkipChildren} action implies
that the visit will stop, but the information associated to the old value will
be associated to the new one. If the children are to be visited, it is undefined
whether the table entries are visited before or after the children in the AST.

\subsection{Example}

Here is a small copy visitor that adds an assertion for each
division in the program, stating that the divisor is not zero:
\scodeidx{Visitor}{generic\_frama\_c\_visitor}
\scodeidx{Cil}{copy\_visit}
\scodeidx{Cil}{lzero}
\sscodeidx{Cil}{cilVisitor}{vexpr}
\sscodeidx{Cil\_types}{exp\_node}{BinOp}
\sscodeidx{Cil\_types}{relation}{Rneq}
\sscodeidx{Cil\_types}{binop}{Div}
\sscodeidx{Cil\_types}{binop}{Mod}
\scodeidx{Logic\_utils}{expr\_to\_term}
\scodeidx{Logic\_const}{prel}
\scodeidx{Extlib}{the}
\sscodeidx{Cil}{cilVisitor}{get\_filling\_actions}
\sscodeidx{Cil}{visitAction}{DoChildren}
\scodeidx{File}{create\_project\_from\_visitor}
\scodeidx{Annotations}{add\_assert}
\sscodeidx{Db}{Main}{extend}
\scodeidx{Cil}{get\_stmt}
\scodeidx{Cil}{get\_kernel\_function}
\sscodeidx{Cil}{cilVisitor}{behavior}
\scodeidx{Ast}{self}
\sscodeidx{Cil}{cilVisitor}{current\_kinstr}
\sscodeidx{Visitor}{frama\_c\_visitor}{current\_kf}
\ocamlinput{./examples/generated/syntactic_check.ml}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Logical Annotations}\label{adv:annotations}\index{Annotation|bfit}

\begin{prereq}
  None.
\end{prereq}

Logical annotations set by the users in the analyzed \C program are part of the
AST\index{AST}. However, other annotations (those generated by plug-ins) are
not directly in the AST because that would contradict the rule ``an AST must
never be modified inside a project'' (see Section~\ref{proj:use}).

So all the logical annotations (including those set by the users) are put in
global projectified tables maintained up-to-date by the \framac kernel. Anytime
a plug-in wants either to access or to add/delete an annotation, it
\emph{must} use the corresponding modules or functions and not the annotations
directly stored in the AST. These modules and functions are the following.
\begin{itemize}
\item Module \texttt{Annotations}\codeidxdef{Annotations} which contains the
  database of annotations related to the AST (global annotations, function
  contracts and code annotations). Adding or deleting an annotation first
  requires defining an emitter by \texttt{Emitter.create}\scodeidx{Emitter}{create}.
\item Module \texttt{Property\_status}\codeidxdef{Property\_status} should
  be used to get or to modify the validity status of logical
  properties. Modifying a property status also requires defining an emitter by
  \texttt{Emitter.create}\scodeidx{Emitter}{create} first. Key concepts and
  theoretical foundation of this module are described in an associated research
  paper~\cite{fmics12}.
\item Module \texttt{Property}\codeidxdef{Property} provides access to all
  logical properties on which property statuses can be emitted. In particular,
  an ACSL annotation has to be converted into a property if you want to access
  its property statuses.
\item Modules \texttt{Logic\_const}\codeidxdef{Logic\_const},
  \texttt{Logic\_utils}\codeidxdef{Logic\_utils} and
  \texttt{Db.Properties}\scodeidxdef{Db}{Properties} contain several
  operations over annotations.
\end{itemize}
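For instance, a plug-in that generates assertions first creates its own
emitter, then goes through \texttt{Annotations} to attach them. The sketch
below illustrates this; the emitter name is hypothetical and the exact profile
of \texttt{Annotations.add\_assert} should be checked against the API of the
\framac version in use.
\begin{ocamlcode}
(* a sketch: an emitter dedicated to a plug-in generating assertions *)
let emitter =
  Emitter.create "MyPlugin"
    [ Emitter.Code_annot ] ~correctness:[] ~tuning:[]

(* attach the trivially valid assertion \true to statement [stmt] of
   kernel function [kf] *)
let add_trivial_assert kf stmt =
  Annotations.add_assert emitter ~kf stmt Logic_const.ptrue
\end{ocamlcode}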

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Extending ACSL annotations}\label{sec:extend-acsl-annot}
\begin{prereq}
  Knowledge of the ACSL specification language.
\end{prereq}

\framac supports the possibility of adding specific ACSL annotations in the
form of special clauses.
Such clauses can be of different categories, as described by
\scodeidx{Cil\_types}{ext\_category}\texttt{Cil\_types.ext\_category}.
\begin{itemize}
\item A contract extension will be
stored in the \texttt{b\_extended}\sscodeidx{Cil\_types}{behavior}{b\_extended} field of
\texttt{Cil\_types.behavior}\scodeidx{Cil\_types}{behavior}.
\item A global extension will be found as a global ACSL annotation in the form of a
\scodeidx{Cil\_types}{Dextended}\texttt{Cil\_types.Dextended} constructor.
\item A code annotation extension will be stored with an
\scodeidx{Cil\_types}{AExtended}\texttt{Cil\_types.AExtended} constructor. Such an extension has itself
different flavors, determined by the
\scodeidx{Cil\_types}{ext\_code\_annot\_context}\texttt{Cil\_types.ext\_code\_annot\_context} type:
\begin{itemize}
\item it can be meant to be evaluated exactly at the current program point
 (like an ACSL \texttt{assert}), or
\item it can be related to the next statement (or block), like an ACSL statement contract, or
\item it can be a loop extension, or
\item it can be used both as a loop extension or be related to the next (non-loop) statement.
\end{itemize}
\end{itemize}

An extension is characterized by its introducing keyword \texttt{kw}, or \texttt{loop kw} for a loop
extension. It is not possible to have the same keyword for two distinct
extensions, even if they belong to different categories, as this would lead to
ambiguities in the parser.

An \texttt{acsl\_extension}\scodeidx{Cil\_types}{acsl\_extension} is a triple
\texttt{(id, kw, ext)} where \texttt{id} is its unique ID, used in annotation tables and generated
by \scodeidx{Logic\_const}{new\_acsl\_extension}\texttt{Logic\_const.new\_acsl\_extension},
\texttt{kw} identifies the extension, and \texttt{ext} is an
\texttt{acsl\_extension\_kind}\scodeidx{Cil\_types}{acsl\_extension\_kind}
and can take three forms:
\begin{itemize}
\item \texttt{Ext\_id id} with \texttt{id} an \texttt{int}
that the plugin can use to refer to the annotation in its internal state.
This identifier is under the full responsibility of the plugin
 and will never be used by the kernel.
\item \texttt{Ext\_preds preds} with \texttt{preds} a possibly empty list of
predicates.
\item \texttt{Ext\_terms terms} with \texttt{terms} a possibly empty list of
  terms.
\end{itemize}
For the latter two cases,
the corresponding list is traversed normally by the visitor
(see Section~\ref{adv:visitors}).

In order for the extension to be recognized by the parser, it must be
registered by one of the following functions, depending on its category:
\begin{itemize}
\item \texttt{Logic\_typing.register\_behavior\_extension}%
\scodeidx{Logic\_typing}{register\_behavior\_extension}
\item \texttt{Logic\_typing.register\_global\_extension}%
\scodeidx{Logic\_typing}{register\_global\_extension}
\item \texttt{Logic\_typing.register\_code\_annot\_extension}%
\scodeidx{Logic\_typing}{register\_code\_annot\_extension}
\item \texttt{Logic\_typing.register\_code\_annot\_next\_stmt\_extension}%
\scodeidx{Logic\_typing}{register\_code\_annot\_next\_stmt\_extension}
\item \texttt{Logic\_typing.register\_code\_annot\_next\_loop\_extension}%
\scodeidx{Logic\_typing}{register\_code\_annot\_next\_loop\_extension}
\item \texttt{Logic\_typing.register\_code\_annot\_next\_both\_extension}%
\scodeidx{Logic\_typing}{register\_code\_annot\_next\_both\_extension}
\end{itemize}
After a call to the appropriate registration function,
a clause of the form \verb|kw e1,...,en;|, where each \verb|ei| can be
any syntactically valid ACSL term or predicate,
will be treated by the parser
as belonging to the extension \verb|kw|. Contract extension clauses must occur after
\verb|assumes| and \verb|requires| clauses if any, but can be freely
mixed with other behavior clauses (post-conditions, \verb|assigns|,
\verb|frees| and \verb|allocates|).

Similarly, in a loop annotation,
\verb|loop kw e1, ..., en;| will be treated as belonging to the \verb|kw|
extension. In case the loop annotation has a \verb|loop variant|, the
extension must occur before it. Otherwise, there is no ordering constraint
with the other loop annotation clauses.

Global extensions can appear either alone in a global annotation, or as part of an axiomatic with
a set of other global annotations.

Finally, a code annotation extension must appear as a code annotation of its own, like any other code annotation.

Code (and loop) extensions can be made specific to a
set of existing behaviors using the standard ACSL \verb|for| construction.
Namely, \verb|for bhv: loop kw e1, ..., en;| will indicate that the
(loop) extension is supposed to be considered only when behavior \verb|bhv| is
active (although it is ultimately up to the plugin to decide what to do with
this information).

During type-checking,
the list \verb|[e1;...;en]| will be given to the type-checking function
\verb|f| provided at registration time, together
with the current typing environment (which allows discriminating
between contract and loop extensions and will have the appropriate logic labels
set in the local environment). \verb|f| must return the corresponding
\verb|acsl_extension_kind| (possibly adding an entry for key \verb|id|
in an internal table if it chooses to return \verb|Ext_id id|).
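For instance, the following sketch registers a contract extension
\verb|my_clause| that merely records its raw (untyped) arguments in an
internal table and only keeps an identifier in the AST. The helpers are
hypothetical, and the exact signature expected by the registration function
should be checked against \texttt{logic\_typing.mli}.

\begin{lstlisting}
(* Sketch: keep only an Ext_id in the AST; store the arguments aside. *)
let my_table = Hashtbl.create 7
let fresh_id = let c = ref 0 in fun () -> incr c; !c

let typer _ctxt _loc args =
  let id = fresh_id () in
  Hashtbl.add my_table id args; (* args: the parsed, untyped arguments *)
  Cil_types.Ext_id id

let () = Logic_typing.register_behavior_extension "my_clause" typer
\end{lstlisting}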

The first argument of \verb|f| is a \verb|Logic_typing.typing_context|%
\scodeidx{Logic\_typing}{typing\_context}
which provides lookup functions for the various kinds of identifiers that are
present in the environment, as well as extensible type-checking functions for
predicates, terms, and assigns clauses. Indeed, these functions themselves
take as an argument a \verb|typing_context| \verb|ctxt| and use the functions
of \verb|ctxt| to type-check the children of the current node. Extensions
can take advantage of this open recursion to recognize only subtrees of an
otherwise normal ACSL predicate or term. For instance, the following code will
let extension \verb|foo| replace all occurrences of \verb|\foo| by \verb|42|.

\ocamlinput{./examples/acsl_extension.ml}

With this extension enabled, \framac will interpret the following clause in
a given source file:
\begin{lstlisting}[language=C,alsolanguage=ACSL]
/*@ foo 84 == \foo + \foo; */
\end{lstlisting}
as the following type-checked AST fragment:
\begin{lstlisting}[language=C,alsolanguage=ACSL]
/*@ foo 84 == 42 + 42; */
\end{lstlisting}

If all the information of the extended clause is contained in
the predicate list \verb|preds|, no other registration is needed beyond
the parsing and type-checking: the pretty-printer will output \verb|preds|
as a comma-separated list preceded by \verb|kw| (or \verb|loop kw| if
the extension is a loop annotation), and the visitor will traverse
each \verb|preds| as well as any predicate present in the AST. However, if
some information is present in the internal state of the plug-in, two more
functions may be required, for pretty-printing and visiting the extended
clause respectively.

First,
\texttt{Cil\_printer.register\_behavior\_extension}%
\scodeidx{Cil\_printer}{register\_behavior\_extension}
registers a new pretty-printer \verb|pp| for a given extension \verb|kw|.
Together with the \verb|acsl_extension_kind| of the extended clause,
\verb|pp| is given the current pretty-printer and the formatter where to
output the result.
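As a hint, such a registration could look like the following sketch; the
argument order of the pretty-printing function follows the description above
and should be checked against \texttt{cil\_printer.mli}.

\begin{lstlisting}
(* Sketch: custom pretty-printer for an extension that only uses Ext_id. *)
let () =
  Cil_printer.register_behavior_extension "my_clause"
    (fun _printer fmt ext ->
      match ext with
      | Cil_types.Ext_id i ->
        (* a real plug-in would pretty-print the data bound to [i] in its
           internal table; here we simply print the identifier *)
        Format.fprintf fmt "%d" i
      | Cil_types.Ext_preds _ | Cil_types.Ext_terms _ ->
        () (* unreachable: "my_clause" only uses Ext_id *))
\end{lstlisting}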

Second,
\texttt{Cil.register\_behavior\_extension}%
\scodeidx{Cil}{register\_behavior\_extension}
registers a custom visitor \verb|vext| for a given extension \verb|kw|.
\verb|vext| is given the content of the extended clause and the current
visitor, and must return a \texttt{Cil.visitAction}
(if all the information is in the plugin's own table, it can
return \texttt{SkipChildren}).
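If all the information of the extension indeed lives in the plug-in's own
table, as in the sketch above, the registered visitor can be trivial:

\begin{lstlisting}
(* Sketch: nothing to visit in the AST for this extension. *)
let () =
  Cil.register_behavior_extension "my_clause"
    (fun _visitor _ext -> Cil.SkipChildren)
\end{lstlisting}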

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Locations}\label{adv:memory}\index{Location|bfit}

\begin{prereq}
  None.
\end{prereq}

In \framac, different representations of \C locations
exist. Section~\ref{memory:repr} presents them. Moreover, maps indexed by
locations are also provided. Section~\ref{memory:map} introduces them.

\subsection{Representations}\label{memory:repr}

There are four different representations of \C locations, although only three
of them are really relevant in practice. All of them are defined in module
\texttt{Locations}\codeidxdef{Locations}. They are introduced below. See the
documentation of \texttt{src/kernel\_services/abstract\_interp/locations.mli}
for details about the provided operations on these types.

\begin{itemize}
\item Type \texttt{Location\_Bytes.t}\scodeidxdef{Locations}{Location\_Bytes}
  is used to represent values of \C expressions like \texttt{2} or
  \texttt{((int) \&a) + 13}. With this representation, there is no way to know
  the size of a value, while it is still possible to join two values. Roughly
  speaking, it is represented as a mapping from \C variables to offsets in
  bytes.
\item Type \texttt{location}\scodeidxdef{Locations}{location}, equivalently
  \texttt{Location.t}\scodeidxdef{Locations}{Location}, is used to represent the
  left part of a \C assignment (including bitfields). It is represented by a
  \texttt{Location\_Bits.t} (see below) attached to a size. It is possible to
  join two locations \emph{if and only if they have the same size.}
\item Type \texttt{Location\_Bits.t}\scodeidxdef{Locations}{Location\_Bits} is
  similar to \texttt{Location\_Bytes.t} with offsets in bits instead of
  bytes. Actually it should only be used inside a location.
\item Type \texttt{Zone.t}\scodeidx{Locations}{Zone} is a set of bits (without
  any specific order). It is possible to join two zones \emph{even if they have
    different sizes}.
\end{itemize}
\begin{convention}
  Roughly speaking, locations and zones have the same purpose. You should use
  locations as long as you do not need to join locations of different sizes.
  If you need to convert locations to zones, use the function
  \texttt{Locations.enumerate\_valid\_bits}%
  \scodeidxdef{Locations}{enumerate\_valid\_bits}.
\end{convention}
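For instance, the following one-liner computes the over-approximated set of
bits that may be accessed for reading through a given location (assuming
\verb|~for_writing:false| indeed selects read accesses):

\begin{lstlisting}
(* Sketch: convert a sized location into a zone of bits. *)
let zone_of (loc: Locations.location) : Locations.Zone.t =
  Locations.enumerate_valid_bits ~for_writing:false loc
\end{lstlisting}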

As join operators are provided for these types, they can easily be used in
abstract interpretation analyses\index{Abstract Interpretation} (which can
themselves be implemented thanks to one of the functors of module
\texttt{Dataflow2}\codeidx{Dataflow2}).

\subsection{Map Indexed by Locations}\label{memory:map}

Modules \texttt{Lmap}\codeidxdef{Lmap} and
\texttt{Lmap\_bitwise}\codeidxdef{Lmap\_bitwise} provide functors implementing
maps indexed by locations and zones (respectively). The arguments of these
functors have to implement the values attached to the indices (respectively
locations or zones).

These implementations are significantly more complex than simple maps because
they automatically handle overlaps of locations (or zones). Such
implementations therefore require that the structures implementing the values
attached to indices be at least semi-lattices\index{Lattice}; see the
corresponding signatures
in module \texttt{Lattice\_type}\codeidx{Lattice\_type}. For
this purpose, functors of the abstract interpretation toolbox%
\index{Abstract Interpretation} can help (see in particular module
\texttt{Abstract\_interp}\codeidx{Abstract\_interp}).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{GUI Extension}\label{adv:gui}\index{GUI|bfit}\index{Plug-in!GUI|bfit}

\begin{prereq}
  Knowledge of \lablgtk.
\end{prereq}

Each plug-in can extend the \framac graphical user interface (aka \emph{GUI})
in order to support its own functionalities in the \framac viewer. For this
purpose, a plug-in developer has to register a function of type
\texttt{Design.main\_window\_extension\_points \fl~unit} thanks to
\texttt{Design.register\_extension}\scodeidx{Design}{register\_extension}. The
input value of type \texttt{Design.main\_window\_extension\_points} is an
object corresponding to the main window of the \framac GUI. It provides
access to the main widgets of the \framac GUI and to several plug-in extension
points. The documentation of the class type
\texttt{Design.main\_window\_extension\_points}%
\scodeidxdef{Design}{main\_window\_extension\_points} is accessible through the
source documentation (see Section~\ref{adv:documentation}).
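A minimal sketch of such an extension follows; a real extension would use the
methods of the \verb|window| object to attach its own \lablgtk widgets.

\begin{lstlisting}
(* Sketch: a GUI extension that does not add any widget yet. *)
let gui_main (_window: Design.main_window_extension_points) =
  () (* build lablgtk widgets and hook them into the main window here *)

let () = Design.register_extension gui_main
\end{lstlisting}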

The GUI plug-in code has to be put in separate files into the plug-in
directory\index{Plug-in!Directory}. Furthermore, in the
\texttt{Makefile}\codeidx{Makefile}, the variable
\texttt{PLUGIN\_GUI\_CMO}\codeidx{PLUGIN\_GUI\_CMO} has to be set in order to
compile the GUI plug-in code (see Section~\ref{make:plugin}).

Besides, time-consuming computations have to call the function
\texttt{!Db.progress}\scodeidx{Db}{progress} from time to time in order to
keep the GUI responsive.
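For instance (a sketch, where \verb|process| and \verb|items| are
hypothetical):

\begin{lstlisting}
(* Calling !Db.progress at each iteration lets the GUI refresh itself. *)
let analyze process items =
  List.iter (fun item -> !Db.progress (); process item) items
\end{lstlisting}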

The GUI implementation uses
\lablgtk~\cite{lablgtk}\index{Lablgtk}: you can use any \lablgtk-compatible
code in your GUI extension. A complete example of a GUI extension can be found in
the plug-in \texttt{Occurrence}\index{Occurrence} (see file
\texttt{src/plugins/occurrence/register\_gui.ml}).

\begin{important}
\paragraph{Potential issues}
All the GUI plug-in extensions share the same window and same
widgets\index{Sharing!Widget}. So conflicts can occur, especially if you
specify some attributes on a predefined object. For example, if one plug-in
wants to highlight\index{Highlighting} a statement $s$ in yellow and another
one wants to highlight $s$ in red at the same time, the behavior is not
specified and may be quite difficult for a user to understand.
\end{important}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Documentation}\label{adv:documentation}
\index{Documentation|bfit}

\begin{prereq}
  Knowledge of \ocamldoc.
\end{prereq}

Here we present some hints on the way to document your plug-in. First
Section~\ref{doc:rules} introduces a quick general overview about the
documentation process. Next Section~\ref{doc:plugin} focuses on the
plug-in source documentation.

\subsection{General Overview}\label{doc:rules}

Command \texttt{make doc}\codeidx{Makefile} produces the whole \framac source
documentation\index{Documentation!Source} in HTML format. The generated index
file is \texttt{doc/code/html/index.html}. A more general purpose index is
\texttt{doc/index.html}\codeidx{index.html} (from which the previous index is
accessible).

The previous command takes time. Thus the command \texttt{make doc-kernel} only
generates the kernel documentation\index{Documentation!Kernel} (\emph{i.e.}
\framac without any plug-in) while \texttt{make \$(PLUGIN\_NAME)\_DOC} (by
substituting the right value for
\texttt{\$(PLUGIN\_NAME)})\codeidx{PLUGIN\_NAME} generates
the documentation for a single plug-in\index{Plug-in!Documentation}%
\index{Documentation!Plug-in|see{Plug-in Documentation}}.

\subsection{Source Documentation}\label{doc:plugin}
\index{Plug-in!Documentation|bfit}

Each plug-in should be properly documented. \framac uses \ocamldoc and so you
can write any valid \ocamldoc comments.

\paragraph{\ocamldoc tags for \framac}\index{Documentation!Tags} The tag
\texttt{@since version} should document any element introduced after the very
first release, in order to easily know the required version of the \framac
kernel or specific plug-ins. In the same way, the \framac documentation
generator provides a custom tag \texttt{@modify version description} which
should be used to document any element whose semantics have changed since its
introduction.
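For instance, an interface item could be documented as follows (the function
and the version names are only illustrative):

\begin{lstlisting}
(** Compute the set of reachable statements.
    @since Neon-20140301
    @modify Sodium-20150201 takes the entry point into account *)
val compute: unit -> unit
\end{lstlisting}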

Furthermore, the special tag \texttt{@plugin developer guide} must be attached
to each function used in this document.

\paragraph{Plug-in API}

A plug-in should export functions in its plug-in interface or through modules
\texttt{Db}\codeidx{Db} or \texttt{Dynamic}\codeidx{Dynamic} as explained in
Section~\ref{adv:plugin-registration}.

\begin{important}
The interface of a plug-in \texttt{plugin} must be named \texttt{Plugin.mli}.
Be careful with the capitalization of the filename, which is unusual in \caml
but required here for compilation purposes.
\end{important}

\paragraph{Internal Documentation for Kernel Integrated Plug-ins}

The \framac documentation generator also produces an internal plug-in
documentation which may be useful for the plug-in developers themselves. This
internal documentation is available \emph{via} the file
\texttt{doc/code/$plugin$/index.html}\codeidx{index.html} for each plug-in
$plugin$. You can add an introduction to this documentation in a file. This
file has to be assigned to the variable
\texttt{PLUGIN\_INTRO}\codeidx{PLUGIN\_INTRO} of the
\texttt{Makefile}\codeidx{Makefile} (see Section~\ref{make:plugin}).

In order to ease access to this internal documentation, you have to manually
edit the file \texttt{doc/index.html}\codeidx{index.html} and add an entry for
your plug-in in the plug-in list.

\paragraph{Internal Documentation for External Plug-ins}

External plug-ins can be documented in the same way as plug-ins that are 
compiled together with \framac. However, in order to be able to compile the
documentation with \texttt{make doc}, you must have generated the documentation
of Frama-C's kernel (\texttt{make doc}, see above) and installed it with the 
\texttt{make install-doc-code}\codeidx{install-doc-code} command.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% \section{License Policy}\label{adv:copyright}\index{License|bfit}

%% \begin{target}developers with a SVN access.\end{target}

%% \begin{prereq}
%% knowledge of \make.
%% \end{prereq}

%% If you want to redistribute a plug-in inside \framac, you have to define a
%% proper license policy\index{Plug-in!License}. For this purpose, some help is
%% provided in the \texttt{Makefile}\codeidx{Makefile}. Mainly we distinguish two
%% cases described below.
%% \begin{itemize}
%% \item \textbf{If the wished license is already used inside \framac}, just
%%   extend the variable corresponding to the wished license in order to include
%%   files of your plug-in. Next run \texttt{make headers}\index{Header}.
%%   \begin{example}
%%     Plug-in \texttt{slicing}\index{Slicing}\index{Plug-in!Slicing|see{Slicing}}
%%     is released under LGPL\index{LGPL} and is proprietary of both CEA and
%%     INRIA\index{Copyright}. Thus, in the \texttt{Makefile}, there is the following line.
%%     \codeidx{CEA\_INRIA\_LGPL}
%%     \begin{makefilecode}
%% CEA_INRIA_LGPL= ... €
%%                  src/plugins/slicing/*.ml*
%%      \end{makefilecode}
%%   \end{example}
%% \item \textbf{If the wished license is unknown inside \framac}, you have to:
%%   \begin{enumerate}
%%   \item Add a new variable $v$ corresponding to it and assign files of your
%%     plug-in;
%%   \item Extend variable \texttt{LICENSES}\codeidx{LICENSES} with
%%     this variable;
%%   \item Add a text file in directory \texttt{licenses}\codeidx{licenses}
%%     containing your licenses
%%   \item Add a text file in directory \texttt{headers}\codeidx{headers}
%%     containing the headers\index{Header} to add into files of your plug-in
%%     (those assigned by $v$).

%%     \begin{important}
%%       The filename must be the same than the variable name $v$. Moreover this
%%       file should contain a reference to the file containing the whole license
%%       text.
%%     \end{important}
%%   \item Run \texttt{make headers}.
%%   \end{enumerate}
%% \end{itemize}

% Local Variables:
% ispell-local-dictionary: "english"
% TeX-master: "main"
% End: