Skip to content
Snippets Groups Projects
Commit 459eddb7 authored by DavidCok's avatar DavidCok Committed by Virgile Prevosto
Browse files

More editing of the user manual

parent 821a92a9
No related branches found
No related tags found
No related merge requests found
......@@ -4,7 +4,7 @@ C_CODE=$(wildcard examples/*.[ci])
VERSION_FILE=../../../../../VERSION
ifeq ("$(wildcard $(VERSION_FILE))", "")
VERSION_FILE=../../VERSION
FC_VERSION+=Chlorine+
FC_VERSION+=Potassium+
else
#internal mode
FC_VERSION=$(shell cat $(VERSION_FILE))
......@@ -13,6 +13,7 @@ FCLANG_VERSION= $(shell cat $(VERSION_FILE))
DEPS_MODERN=fclangversion.tex biblio.bib macros.tex \
introduction.tex \
description.tex \
limitations.tex \
changes.tex \
$(C_CODE) \
......@@ -68,7 +69,7 @@ fclangversion.tex: Makefile
%.ml: %.mll
ocamllex $<
%.pdf: %.tex
%.pdf: %.tex Makefile $(DEPS_MODERN)
pdflatex $*
makeindex $*
bibtex $*
......
The \fclang plug-in intends to provide a full translation of C++ and \acslpp into the \framac internal representation, and from there be able to be analyzed by other \framac plug-ins.
This is a work in progress.
The following sections describe the limitations of the current implementation.
\begin{itemize}
\item The plug-in aims for the TBD version of C++
\item \acslpp is described in the companion \acslpp reference manual, also a part of the \framac release.
\end{itemize}
Notes
\begin{itemize}
\item Which version of Clang?
\item Clang (8) support C++98 (except exported templates, later removed) and C++11 and current draft standard for C++1y
\item see https://clang.llvm.org/docs/UsersManual.html\#cxx for supported features in clang C++
\end{itemize}
\chapter{Installation}
......@@ -20,1131 +5,113 @@ TBD - installation instructions
\chapter{Running the plug-in}
TBD - including command-line options
\section{C++ files}
Once installed the plugin is run automatically by Frama-C on any C++ files listed on the command-line. C++ files are identified by the filename suffix. The default C++ suffixes are these:
.cpp, .C, .cxx, .ii, .ixx, .ipp, .i++, .inl, .h, .hh
\chapter{Current implementation details and limitations}
Currently this set of suffixes is compiled in the plugin (in file frama\_Clang\_register.ml) and can only be changed by recompiling and
reinstalling the plugin.
\section{Preprocessing}
\section{Frama-clang executable}
The plug-in operates by invoking the executable \irg
on each file identified as C++, in turn.
For each file it produces a temporary output file containing the C AST, which is then given as input to frama-c. This executable is a single-file-at-a-time command-line executable only.
Various options control its behavior.
As a refresher, the C/C++ preprocessor (CPP) (cf. \url{https://gcc.gnu.org/onlinedocs/cpp/}) conceptually implements the following operations on a C++ source file:
\begin{itemize}
\item The input is read and broken it into a sequence of physical lines according to the line terminators (ASCII character sequences \\r, \\n, or \\r\\n).
\item Each C trigraph is replaced by its corresponding character.
\item Any backslash-whitespace-line-terminator sequence is removed and the line that it ends is combined with the following line.
\item Comments are replaced by single spaces. This requires tokenizing the input to avoid recognizing comment markers within strings as indicating a comment. Note that this allows block comments to hide line terminations.
\item The input text is divided into preprocessing tokens grouped in logical lines. Each preprocessor token becomes a compiler token (except where \#\# concatenation occurs). However, \acslb tokens are slightly different, as described below.
\item The source text is transformed according to any preprocessing directives contained within it. Each preprocessing directive must be contained within one logical line
\end{itemize}
The result is a sequence of preprocessing tokens that is passed on to the
remaining compiler phases.
The file-system path identifying the executable is provided by the \textbf{-cxx-clang-command <cmd>}
option and is \irg by default. The path may be absolute; if it is a relative path, it is found relative to the TODO directory. [ TODO - check this]
\subsection{Trigraphs}
TBD - are these supported by clang?
\section{Frama-clang options}
\subsection{Preprocessor tokens}
Preprocessor tokens (per CPP) belong to one of five categories. White space (space, tab, TBD) serves only to separate tokens; it is not needed between tokens whose concatenation is not a single token. Line terminators also separate tokens and also delineate certain features: preprocessing directives and string literals do not extend over more than one (logical) line.
\begin{itemize}
\item identifiers: character sequences that match the regular expression \texttt{[a-zA-Z\_][a-zA-Z\_0-9]*} . Dollar signs are allowed as non-digit identifier characters if the clang option \texttt{-fdollars-in-identifiers} is used. (TBD - on by default?)
\item number: character sequences that match the regular expression \texttt{[.]?[0-9]([0-9a-zA-Z.]|[eEpP][+-]))*} . (TBD - dollar signs also allowed?)
\item string literals: character sequence enclosed in " " or ' ' or < >, with \textbackslash " for " in a double-quoted literal (that is not a header file name) and \' for ' in a single-quoted literal.
\item punctuator: all single non-alphanumeric printable characters except \@, \# and `, and all multi-punctuation sequences that meaningful to C/C++ (e.g. >>>= ), but not an arbitrary multi-punctuation character sequence.
\\item other tokens: \@, \#, ` and all non-ASCII single characters.
\end{itemize}
TBD - unicode characters
Note that not all preprocessor tokens are valid C/C++ parser tokens. Tokens in the other category have no meaning to C/C++ and the \texttt{number} category allows many sequences that are not legal C/C++ numeric tokens. These tokens will generally provoke compiler errors. For example in C/C++, 0..2 is one token and is not interpreted as two consecutive numeric tokens.
The options known to the \fcl plugin are processed by the \fc kernel when listed on the \fc command-line.
The use of the frama-c command-line is described in the core frama-c
user guide.
There are many kernel options that affect all plugins and many options specific to frama-clang.
The command \\
\lstinline|frama-c -kernel-h| \\
shows all kernel options; the command\\
\lstinline|frama-c -fclang-h| \\
shows all \fcl-specific options.
\acsl and \acslpp have slightly different tokens than the above, so the preprocessor tokens need to be re-tokenized in some cases:
The most important of the options are these:
\begin{itemize}
\item The \@ token is a whitespace character in \acslb.
\item There are some \acslb multi-character punctuator tokens that are not
preprocessor tokens:
\begin{itemize}
\item[] $==>$ (logical implies)
\item[] $-->$ (bit-wise implies)
\item[] $<==>$ (logical equality)
\item[] $<-->$ (bit-wise equality)
\item[] TBD any more?
\end{itemize}
These \acslb tokens need to be assembled from multiple CPP tokens (and those CPP tokens must not be separated by white space)
\item A CPP numeric token that contains .. will not be a legal C/C++ number, but may be multiple legal \acslb tokens with the .. representing the range operator. For example, the single CPP token \texttt{0..last} is retokenized for \acslb as if it were written \texttt{0 .. last} .
\item \acslb allows certain built-in non-ASCII symbols, namely
\begin{itemize}
\item[] $\forall$ (unicode 0x2200) - universal quantifier
\item[] $\exists$ (unicode 0x2203) - existential quantifier
% \item[] $\equal$ (unicode 0x2261) - equality
% \item[] $\notequal$ (unicode 0x2262) - inequality
\item[] $\leq$ (unicode 0x2264) - less than or equal
\item[] $\geq$ (unicode 0x2265) - greater than or equal
% \item[] $\minus$ (unicode 0x2212) - minus
% \item[] $\implies$ (unicode 0x21D2) - implies
% \item[] $\equiv$ (unicode 0x21D4) - equivalence
\item[] $\bigwedge$ (unicode 0x2227) - logical and
\item[] $\bigvee$ (unicode 0x2228) - logical or
\item[] $\neg$ (unicode 0x00AC) - logical negation
% \item[] $\logicalxor$ (unicode 0x22BB) - logical inequivalence
\item[] $\subset$ (unicode 0x2208) - subset
% \item[] $\Boolean$ (unicode 0x1D539) - set of booleans
% \item[] $\Integer$ (unicode 0x2124) - set of integers
% \item[] $\Real$ (unicode 0x211D) - set of reals
\item TBD - more - ensure unicode equivalent
\end{itemize}
\item \option{-cpp-extra-args <string>} -- the single string argument to this option is appended to the command-line when
\lstinline|frama-clang| is invoked internally. It is particularly
important for adding include directories (\lstinline|-I|) and
other options to be passed on to the clang compiler or the \irg executable.
Multiple instances of this option have a cumulative effect (rather
than later instances replacing earlier ones). [TODO - check this]
\item \option{-machdep <arg>} -- TODO
\item \option{-arg <arg>} -- TODO
\item \option{-fclang-msg-key <categories>} -- TODO
\item \option{-fclang-warn-key <categories>} -- TODO
\item \option{-fclang-verbose <n>} -- TODO
\item \option{-fclang-debug <n>} -- TODO
\item TODO
\end{itemize}
\subsection{Preprocessor directives}
A preprocessing directive consists of a line (after the previous preprocessing phaseshave been completed) that begins with optional white space, the '\#' character, additional optional white space, and a preprocessor directive identifier.
The preprocessing language contains a fixed set of preprocessing directive identifiers:
\begin{itemize}
\item \texttt{define}, \texttt{undef}
\item \texttt{if}, \texttt{ifdef}, \texttt{ifndef}, \texttt{elif}, \texttt{else}, \texttt{endif}
\item \texttt{warning}, \texttt{error}
\item \texttt{include}
\item \texttt{line}
\item \texttt{pragma}
\end{itemize}
In addition, identifiers that have been defined (by a \#define directive) as macros are expanded according to the macro expansion rules (not described here).
\section{Include directories}
Now \acslb annotations are contained in C/C++ comments.
Consequently, any directives contained in the annotation are not seen when the source file is processed simply as a C/C++ source file. However, the effect of some directives lasts until the end of the source file.
Accordingly, \acslpp imposes constraints on the directives that may be present within annotations:
\begin{itemize}
\item \texttt{define} and \texttt{undef} are not permitted in an annotation
\item \texttt{if}, \texttt{ifdef}, \texttt{ifndef}, \texttt{elif}, \texttt{else}, \texttt{endif} are permitted but must be completely nested within the annotation in which they appear (an \#endif and its corresponding \#if/\#ifdef/\#ifndef must be in the same annotation comment.)
\item \texttt{warning} and \texttt{error} are permitted
\item \texttt{include} is permitted, but will cause errors if it contains, as is almost always the case, either \texttt{define} or \texttt{pragma} directives
\item \texttt{line} - TBD
\item \texttt{pragma} is not permitted
\end{itemize}
TODO
\section{32 and 64-bit targets}
TODO
\section{\acslpp constructs}
\textit{This section is not yet written}
\chapter{Running the frama-clang front-end standalone}
%This chapter is the core of this manual and describes how to use the \fclang
%plug-in.
In normal use within \fc, the \irg executable is
invoked automatically. However, it can also be run standalone.
In this mode it accepts command-line options and a single file;
it produces a C AST representing the translated C++, in a text format
of the CIL (C Intermediate language). [TODO - or is it CABS]
\section{Clang options}
%% You can still call the option \shortopt{e-acsl-help} to get the list of
%% available options with few lines of documentation.
%First, Section~\ref{sec:simple} shows how to run the plug-in on a toy example,
%compile the generated code with the \gcc compiler and detect invalid
%annotations at runtime. Further, Section~\ref{sec:wrapper} describes \eacslgcc\
%-- a convenience wrapper script around \framac and \gcc. The aim of this script
%is to simplify instrumentation and compilation of the instrumented code.
%Section~\ref{sec:exec-env} gives additional details on the execution of the
%generated code. Next, Section~\ref{sec:incomplete} focuses on how to deal with
%incomplete programs, \emph{i.e.} in which some functions are not defined or in
%which there are no main function. Section~\ref{sec:combine} explains how to
%combine the \eacsl plug-in with other \framac plug-ins. Finally,
%Section~\ref{sec:custom} introduces how to customize the plug-in, and
%Section~\ref{sec:verbose} details the verbosing policy of the plug-in.
Frama-Clang is built on the Clang C++ parser.
As such, the \irg executable accepts the clang
compiler options and passes them on to clang. There are many of these.
Many of these are irrelevant to \fcl as they relate to lstinline|framaCIRGen|
code generation, whereas \fcl only uses clang for parsing, name
and type resolution.
You can see a list of options by running
\lstinline|framaCIRGen -help|
The most significant options for \irg are these:
\begin{itemize}
\item \option{-I} -- adds a directory to the include file search path
\item \option{-o <file>} -- specifies the location of the output file (in this case, the file to contain the generated AST)
\item \option{-std=...} -- the input programming language (default is \lstinline|-std=c++11|)
\item \option{-w} -- suppress warnings
\item \option{-{-}version} -- print version information
\end{itemize}
\section{Frama-Clang specific options}
There are also options that are specific to \fcl.
These are listed below:
\begin{itemize}
\item \option{-{-}stop-annot-error} -- if set, then parsing stops on the first error; default is off
\item \option{-{-}gen-impl-method} -- TODO
\item \option{-{-}exit-code} -- if set, then the exit code of \irg is 1 if errors occur; this is not the default because then \lstinline|frama-c| would terminate upon
any error in \lstinline|framaCIRGen|
\item \option{-{-}no-exit-code} -- disables \option{-{-}exit-code}
\item \option{-{-}info=<n>} -- sets the level of informational messages to \textbf{n}; 0 is completely quiet; increasing values are
more verbose. \option{-{-}info} sets the level to 1;
\option{-{-}no-info} sets the level to 0. The \lstinline|frama-c| option \option{-fclang-msg-key=parse} is equivalent to setting a value of 1.
\item \option{-{-}warn=<n>} -- sets the level of parser warning messages to \textbf{n}; 0 is completely quiet; increasing values are
more verbose. \option{-{-}warn} sets the level to 1;
\option{-{-}no-warnyyp} sets the level to 0. The \lstinline|frama-c| option \option{-fclang-warn-key=parse} is equivalent to setting a value of 1.
\item \option{-{-}debug=<n>} -- sets the level of parser debug messages to \textbf{n}; 0 is completely quiet; increasing values are
more verbose. \option{-{-}debug} sets the level to 1;
\option{-{-}no-debug} sets the level to 0. The \lstinline|frama-c| option \option{-fclang-debug=<n>} is equivalent to setting a value of \textbf{n}.
In particular, a debug value of 1 shows the command-line that invokes
\irg.
\item TODO: -b -x -o -v
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\section{Simple Example} % <<<
%\label{sec:simple}
%
%This section is a mini-tutorial which explains from scratch how to use the
%\eacsl plug-in to detect whether an \eacsl annotation is violated at runtime.
%
%% All tool invocations and code listings shown in this section assume a Linux
%% distribution with system-wide \framac installation.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Running \eacsl}
%\label{sec:run}
%
%Consider the following program containing two \eacsl assertions such that
%the first assertion is valid, whereas the second one is not.
%
%\listingname{first.i}
%\cinput{examples/first.i}
%
%Running \eacsl on \texttt{first.i} consists of adding \shortopt{e-acsl} option
%to the \framac command line:
%\begin{shell}
%\$ frama-c -e-acsl first.i
%[kernel] Parsing FRAMAC_SHARE/libc/__fc_builtin_for_normalization.i (no preprocessing)
%[kernel] Parsing FRAMAC_SHARE/e-acsl/e_acsl.h (with preprocessing)
%[kernel] Parsing FRAMAC_SHARE/e-acsl/e_acsl_gmp.h (with preprocessing)
%[kernel] Parsing FRAMAC_SHARE/e-acsl/e_acsl_mmodel_api.h (with preprocessing)
%[kernel] Parsing first.i (no preprocessing)
%[e-acsl] beginning translation.
%<skip a warning when translating the Frama-C libc>
%[e-acsl] translation done in project "e-acsl".
%\end{shell}
%
%Even though \texttt{first.i} has already been pre-processed, \eacsl first asks
%the \framac kernel to process and combine it against several header files
%provided by the \eacsl plug-in. Further \eacsl translates the annotated code
%into a new \framac project named \texttt{e-acsl}\footnote{The notion of
%\emph{project} is explained in Section 8.1 of the \framac user
%manual~\cite{userman}.}. By default, the option \shortopt{e-acsl} does nothing
%more. It is however possible to have a look at the generated code in the
%\framac GUI. This is also possible through the command line thanks to the
%kernel options \shortopt{then-last} and \shortopt{print}. The former
%switches to the last generated project, while the latter pretty prints the \C
%code~\cite{userman}:
%
%\input{examples/instrumented_first.c}
%
%As you can see, the generated code contains additional type declarations,
%constant declarations and global \acsl annotations not present in the initial
%file \texttt{first.i}. They are part of the \eacsl monitoring library. You can
%safely ignore them for now. The translated \texttt{main} function of
%\texttt{first.i} is displayed at the end. After each \eacsl annotation,
%a call to \texttt{\_\_e\_acsl\_assert}\codeidx{e\_acsl\_assert} has been added.
%
%\begin{minipage}{\linewidth}
%\begin{ccode}
% /*@ assert x == 0; */
% __e_acsl_assert(x == 0,(char *)"Assertion",(char *)"main",(char *)"x == 0",3);
% /*@ assert x == 1; */
% __e_acsl_assert(x == 1,(char *)"Assertion",(char *)"main",(char *)"x == 1",4);
%\end{ccode}
%\end{minipage}
%
%Each call to \texttt{\_\_e\_acsl\_assert}\codeidx{e\_acsl\_assert}, function
%defined by the \eacsl monitoring library, dynamically verifies the validity of
%the corresponding assertion. In other words, it checks that its first argument
%(here \texttt{x == 0} or \texttt{x == 1}) is not equal to 0 and fails
%otherwise. The extra arguments are used to display error reports as shown in
%Section~\ref{sec:exec}.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Executing the generated code}
%\label{sec:exec}
%
%Using \shortopt{ocode} \framac~\cite{userman} option, the code generated by the
%\eacsl plug-in can be redirected into a file as follows:
%\begin{shell}
%\$ frama-c -e-acsl first.i -then-last -print -ocode monitored_first.c
%\end{shell}
%
%\framac uses architecture-dependent configuration which
%affects sizes of integer types, endianness and potentially other features. It
%can be seen that the code generated from \texttt{first.i} (shown in the
%previous section) defines \C type \texttt{size\_t} as \texttt{unsigned int}, whereas
%in 64-bit architectures \texttt{size\_t} is typically defined as
%\texttt{unsigned long}. Architecture used during \framac translation is
%controlled through \framac \shortopt{machdep} option that specifies the
%architecture type to use during translation. The default value of
%\shortopt{machdep} is \texttt{x86\_32} (a generic 32-bit x86 architecture).
%Note that since code generated by \eacsl is aimed at being compiled it is
%important that the architecture used by \framac matches the architecture
%corresponding to your compiler and your system. For instance, in a 64-bit
%machine you should also pass
%\shortopt{machdep} \texttt{x86\_64} option as follows:
%\begin{shell}
% \$ frama-c -machdep x86_64 -e-acsl first.i -then-last \
% -print -ocode monitored_first.c
%\end{shell}
%
%This file can be compile with a \C compiler (for instance \gcc), as follows:
%
%\lstset{escapechar=£}
%\begin{shell}
%\$ gcc -c -Wno-attributes monitored_first.c
%\end{shell}
%
%However, creating an executable through a proper invokation to \gcc is painful
%and is not documented. Instead, \eacsl comes with a companion wrapper script for
%this purpose.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{\eacsl Wrapper Script} % <<<
%\label{sec:wrapper}
%
%\begin{important}\index{Gcc}
% Presently, \eacslgcc is a recommended way of running the \eacsl plug-in.
% This manual further describes features of the \eacsl plug-in using \eacslgcc
% as its main driver for source code instrumentation and compilation of the
% instrumented programs.
%\end{important}
%
%In this section we describe \eacslgcc and provide several examples of its use.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsection{Standard Usage}
%
%The default behaviour of \eacslgcc is to instrument an annotated C program
%with runtime assertions via a run of \framac.
%
%\begin{shell}
% \$ e-acsl-gcc.sh -omonitored_first.i first.i
%\end{shell}
%
%The above command instruments \texttt{first.i} with runtime assertions and
%outputs the generated code to \texttt{monitored\_first.i}. Unless the name of
%the output file is specified via the \shortopt{o} option (i.e.,
%\shortopt{-omonitored\_first.i} in the above example), the generated
%code is placed to a file named \texttt{a.out.frama.c}.
%
%Compilation and Linking of the original and the instrumented sources
%is enabled using the \shortopt{c} option. For instance, command
%
%\begin{shell}
% \$ e-acsl-gcc.sh -c -omonitored_first.i first.i
%\end{shell}
%
%instruments the code in \texttt{first.i}, outputs it to
%\texttt{monitored\_first.i} and produces 2 executables: \texttt{a.out} and
%\texttt{a.out.e-acsl}, such that \texttt{a.out} is an executable compiled from
%\texttt{first.i}, while \texttt{a.out.e-acsl} is an executable compiled from
%\texttt{monitored\_first.i} generated by \eacsl.
%
%You can execute the generate binaries, in particular \texttt{a.out.e-acsl} which
%detects at runtime the incorrect annotation.
%\begin{shell}
%\$ ./a.out.e-acsl
%Assertion failed at line 4 in function main.
%The failing predicate is:
%x == 1.
%Aborted
%\$ echo \$?
%134
%\end{shell}
%The execution aborts with a non-zero exit code (134) and an error message
%indicating \eacsl annotation that has been violated. There is no output for the
%valid \eacsl annotation. That is, the run of an executable generated from the
%instrumented source file (i.e., \texttt{monitored\_first.i}) detects that the
%second annotation in the initial program is invalid, whereas the first
%annotation holds for this execution.
%
%Named executables can be created using the \shortopt{O} option. This is such
%that a value supplied to the \shortopt{O} option is used to name the executable
%generated from the original program and the executable generated from the
%\eacsl-instrumented code is suffixed \texttt{.e-acsl}.
%
%For example, command
%\begin{shell}
% \$ e-acsl-gcc.sh -c -omonitored_first.i -Omonitored_first first.i
%\end{shell}
%names the executables generated from \texttt{first.i} and
%\texttt{monitored\_first.i}: \texttt{monitored\_first} and
%\texttt{monitored\_first.e-acsl} respectively.
%
%The \eacslgcc \shortopt{C} option allows to skip the instrumentation step and
%compile a given program as if it was already instrumented by the \eacsl
%plug-in. This option is useful if one makes changes to an already
%instrumented source file by hand and only wants to recompile it.
%
%\begin{shell}
% \$ e-acsl-gcc.sh -C -Omonitored_first monitored_first.i
%\end{shell}
%
%The above command generates a single executable named
%\texttt{monitored\_first.e-acsl}. This is because \texttt{monitored\_first.i} is
%considered to be instrumented code and thus no original program is provided.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsection{Customising \framac and \gcc Invocations}
%
%By default \eacslgcc uses \T{frama-c} and \T{gcc} executables by looking them
%up in a system's path. If either of the tools are not found the script fails
%with an appropriate error message. Paths to \framac and \gcc executables can
%also be passed using \shortopt{I} and \shortopt{G} options
%respectively, for instance as follows:
%
%\begin{shell}
% \$ e-acsl-gcc.sh -I/usr/local/bin/frama-c -G/usr/bin/gcc6 foo.c
%\end{shell}
%
%Runs of \framac and \gcc issued by \eacslgcc can further be customized by using
%additional flags. \framac flags can be passed using the \shortopt{F} option,
%while \shortopt{l} and \shortopt{e} options allow for passing additional
%pre-processor and linker flags during \gcc invocations. For example, command
%
%\begin{shell}
% \$ e-acsl-gcc.sh -c -F"-unicode" -e"-I/bar -I/baz" foo.c
%\end{shell}
%
%yields an instrumented program \texttt{a.out.frama.c} where ACSL formulas
%are output using the UTF-8 character set. Further, during the compilation of
%\texttt{a.out.frama.c} \gcc adds directories \texttt{/bar} and
%\texttt{/baz} to the list of directories to be searched for header files.
%
%It is worth mentioning that \eacslgcc implements several convenience flags for
%setting some of the \framac options. For instance, \shortopt{E} can be used to
%pass additional arguments to the \framac pre-processor (see
%Section~\ref{sec:eacsl-gcc:doc}).
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsection{Verbosity Levels and Output Redirection}
%
%By default \eacslgcc prints the executed \framac and \gcc commands and pipes
%their output to the terminal. The verbosity of \framac output can be changed
%using \shortopt{v} and \shortopt{d} options used to pass values to
%\shortopt{verbose} and \shortopt{debug} flags of \framac respectively. For
%instance, to increase the verbosity of plug-ins but suppress the verbosity of
%the \framac kernel while instrumenting \texttt{foo.c} one can use the following
%command:
%
%\begin{shell}
% \$ e-acsl-gcc.sh -v5 -F"-kernel-verbose 0" foo.c
%\end{shell}
%
%Verbosity of the \eacslgcc output can also be reduced using the \shortopt{q}
%option that suppresses any output except errors or warnings issued by \gcc,
%\framac, and \eacslgcc itself.
%
%The output of \eacslgcc can also be redirected to a specified file using the
%\shortopt{s} option. For example, the following command redirects all
%output during instrumentation and compilation of \texttt{foo.c} to the
%file named \texttt{/tmp/log}.
%
%\begin{shell}
% \$ e-acsl-gcc.sh -c -s/tmp/log foo.c
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsection{Architecture Dependency}
%
%As pointed out in Section~\ref{sec:exec} the execution of a \C program depends
%on the underlying machine architecture it is executed on. The program must be
%compiled on the very same architecture (or cross-compiled for it) for the
%compiler being able to generate a correct binary.
%
%\framac makes assumptions about the machine architecture when analyzing source
%code. By default, it assumes an X86 32-bit platform, but it can be customized
%through \shortopt{machdep} switch~\cite{userman}. This option is of primary
%importance when using the \eacsl plug-in: it must be set to the value
%corresponding to the machine architecture which the generated code will be
%executed on, otherwise the generated program is likely to fail to link against
%the \eacsl library. Note that the library is built using the default
%machine architecture.
%
%One of the benefits of using the wrapper script is that it makes sure that
%\framac is invoked with the correct \shortopt{machdep} option. By default
%\eacslgcc uses \shortopt{machdep gcc\_x86\_64} for 64-bit architectures and
%\shortopt{machdep gcc\_x86\_32} for 32-bit ones. Compiler invocations are
%further passed \shortopt{m64} or \shortopt{m32} options to generate code using
%64-bit or 32-bit ABI respectively.
%
%At the present stage of implementation \eacsl does not support generating
%executables with ABI different to the default one.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\subsection{Documentation}\label{sec:eacsl-gcc:doc}
%
%For full up-to-date documentation of \eacslgcc see its man page:
%
%\begin{shell}
% \$ man e-acsl-gcc.sh
%\end{shell}
%
%Alternatively, you can also use the \shortopt{h} option of \eacslgcc:
%
%\begin{shell}
% \$ man e-acsl-gcc.sh -h
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{Execution Environment of the Generated Code} %<<<
%\label{sec:exec-env}
%
%The environment in which the code is executed is not necessarily the same as
%the one assumed by \framac. You should take care of that when running the \eacsl
%plug-in and when compiling the generated code with \gcc. In this aspect, the
%plug-in offers a few possibilities of customization.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Runtime Errors in Annotations}
%\label{sec:runtime-error}
%\index{Runtime Error}
%
%The major difference between \acsl~\cite{acsl} and \eacsl~\cite{eacsl}
%specification languages is that the logic is total in the former while it is
%partial in the latter one: the semantics of a logic construct $c$ denoting a \C
%expression $e$
%is undefined if $e$ leads to a runtime error and, consequently, the semantics of
%any term $t$ (resp. predicate $p$) containing such a construct $c$ is
%undefined as soon as $e$ has to be evaluated in order to evaluate $t$
%(resp. $p$). The \eacsl Reference Manual also states that \emph{``it is the
%responsibility of each tool which interprets \eacsl to ensure that an
%undefined term is never evaluated''}~\cite{eacsl}.
%
%Accordingly, the \eacsl plug-in prevents an undefined term from being
%evaluated. If an annotation contains such a term, \eacsl will report a proper
%error at runtime instead of evaluating it.
%
%Consider for instance the following annotated program.
%
%\listingname{rte.i}
%\cinput{examples/rte.i}
%
%The terms and predicates containing the modulo `\%' in the \T{assumes} clause
%are undefined in the context of the \T{main} function call since the second
%argument is equal to 0. However, we can generate an instrumented code and
%compile it using the following command:
%
%\begin{shell}
%\$ e-acsl-gcc.sh -c -omonitored_rte.i -Orte rte.i
%\end{shell}
%
%Now, when \T{rte.e-acsl} is executed, you get the following output indicating
%that your function contract is invalid because it contains a division by zero.
%
%\begin{shell}
%\$ ./rte.e-acsl
%RTE failed at line 5 in function is_dividable.
%The failing predicate is:
%division_by_zero: y != 0.
%Aborted
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Libc}\index{Libc}
%
%By default, \framac uses its own standard library (\emph{aka} libc) instead of
%the one of your system. If you do not want to use it, you have to add the option
%\shortopt{no-framac-stdlib}\optionidx{-}{framac-stdlib} to the \framac command
%line.
%
%It might be particularly useful in one of the following contexts:
%\begin{itemize}
%\item several libc functions used by the analyzed program are still not defined
% in the \framac libc; or
%\item your system libc and the \framac libc mismatch about several function
% types (for instance, your system libc is not 100\% POSIX compliant); or
%\item several \framac lib functions get a contract and you are not interested in
% verifying them (for instance, only checking undefine behaviors matters).
%\end{itemize}
%
%\begin{important} \index{eacslgcc}
%Notably, \eacslgcc disables \framac libc by default. This is because most of
%its functions are annotated with \eacsl contracts and generating code for
%these contracts results in additional runtime overheads. To enforce default
%\framac contracts with \eacslgcc you can pass \shortopt{L} option to the wrapper
%script.
%\end{important}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Integers}
%\label{sec:integers}
%\index{GMP}
%
%\eacsl uses an \texttt{integer}\codeidx{integer} type corresponding to
%mathematical integers which does not fit in any integral \C type. To circumvent
%this issue, \eacsl uses the \gmp library\footnote{\url{http://gmplib.org}}.
%Thus, even if \eacsl does its best to use standard integral types instead of
%\gmp~\cite{sac13,jfla15}, it may generate such integers from time to time. It is
%notably the case if your program contains \lstinline|long long|, either signed
%or unsigned.
%
%Consider for instance the following program.
%
%\listingname{gmp.i}
%\cinput{examples/gmp.i}
%
%Even on a 64-bit machine, it is not possible to compute the assertion with a
%standard \C type. In this case, the \eacsl plug-in generates \gmp code. Note
%that E-ACSL library already includes \gmp, thus no additional arguments need to
%be provided. We can generate and execute the instrumented code as usual via
%the following command:
%
%\begin{shell}
%\$ e-acsl-gcc.sh -omonitored_gmp.i -Ogmp gmp.i
%\$ ./gmp.e-acsl
%\end{shell}
%
%Since the assertion is valid, there is no output in this case.
%
%Note that it is possible to use \gmp arithmetic in all cases (even if not
%required). This option is controlled through \shortopt{g} switch of \eacslgcc
%which passes the \shortopt{e-acsl-gmp-only} option to the \eacsl
%plug-in. However it could slow down the execution of the instrumented program
%significantly.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Memory-related Annotations}
%\label{sec:memory}
%
%The \eacsl plug-in handles memory-related annotations such as
%\lstinline|\valid|. Consider for instance the following program.
%
%\listingname{valid.c}
%\cinput{examples/valid.c}
%
%Even though the first annotation holds, the second annotation (labelled
%\I{freed}) is invalid. This is because at the point of its execution memory
%allocated via \texttt{malloc} has been deallocated and \T{x} is a stale pointer
%referencing unallocated memory. The execution of the instrumented program thus
%reports the violation of the second annotation:
%
%\begin{shell}
%\$ e-acsl-gcc.sh -c -Ovalid valid.c
%\$ ./valid.e-acsl
%Assertion failed at line 11 in function main.
%The failing predicate is:
%freed: \valid(x).
%Aborted
%\end{shell}
%
%To check memory-related predicates (such as \lstinline|\valid|) \eacsl tracks
%memory allocated by a program at runtime. \eacsl records memory blocks in a
%meta-storage tracking each block's boundaries (i.e., its base address and
%length), per-byte initialization and the notion of an address being freeable
%(i.e., whether it can be safely passed to the \texttt{free} function). During
%an execution of the instrumented program code injected by \eacsl
%transformations queries the meta-storage to identify properties of an address
%in question. For instance, during the execution of the above program the
%monitor injected by \eacsl first records information about the memory block
%allocated via a call to \texttt{malloc}. Further, during the execution of the
%first assertion the monitor queries the meta-storage about the address of the
%memory location pointed to by \texttt{x}. Since the location belongs to an
%allocated memory block, the meta-storage identifies it as belonging to the
%tracked allocation. Consequently, the assertion evaluates to true and the
%monitors allows the execution to continue. The second assertion, however,
%fails. This is because the block pointed to by \T{x} has been removed from the
%meta-storage by a call to \T{free}, and thus a meta-storage query identifies
%location \T{x} as belonging unallocated memory space.
%
%\subsubsection{\eacsl Memory Models}
%
%Memory tracking implemented by the \eacsl library comes in two flavours dubbed
%\I{bittree-based} and \I{segment-based} memory models. With the bittree-based
%model meta-storage is implemented as a compact prefix trie called Patricia
%trie~\cite{rv13}, whereas segment-based memory model
%\footnote{Segment-based model of \eacsl has not yet appeared in the literature.}
%is based on shadow memory~\cite{pldi16}. The functionality of the provided
%memory models is equivalent, however they differ in performance. We further
%discuss performance considerations.
%
%The \eacsl models allocate heap memory using a customized version of the
%\dlmalloc\footnote{\url{http://dlmalloc.net/}} memory allocator replacing
%system-wide \T{malloc} and similar functions (e.g., \T{calloc} or \T{free}).
%
%At the level of \eacslgcc you can choose between different memory models using
%the \shortopt{m} switch followed by the name of the memory model to link
%against. For instance, command
%\begin{shell}
%\$ e-acsl-gcc.sh -mbittree -c -Ovalid valid.c
%\end{shell}
%links a program generated from \T{valid.c} against the library implementing the
%bittree-based memory model, whereas the following invocation uses shadow-based
%tracking:
%\begin{shell}
%\$ e-acsl-gcc.sh -msegment -c -Ovalid valid.c
%\end{shell}
%It is also possible to generate executables using both models as follows:
%\begin{shell}
%\$ e-acsl-gcc.sh -msegment,bittree -c -Ovalid valid.c
%\end{shell}
%In this case, \eacslgcc produces 3 executables: \T{valid},
%\T{valid.e-acsl-segment} and \T{valid.e-acsl-bittree} corresponding to an
%executable generated from the original source code, and the executables
%generated from the \eacsl-instrumented sources linked against
%segment-based and bittree-based memory models.
%
%Unless specified, \eacsl defaults to the segment-based memory model.
%
%\subsubsection{Performance Considerations}
%\label{sec:perf}
%
%As pointed out in the previous section the functionalities provided by both
%segment-based and bittree-based memory models are equivalent but differ in
%performance. The bittree-based model uses compact memory representation of
%metadata. The segment-based model on the other hand uses significantly more
%memory. You can expect over 2x times memory overheads when using it. However It
%is often an order of magnitude faster than the bittree-based
%model~\cite{pldi16}.
%
%%[Need ref here with comparison of the performance. Ideally a reference to the PLDI paper].
%
%\subsubsection{Maximized Memory Tracking}
%
%By default \eacsl uses static analysis to reduce instrumentation and therefore
%runtime and memory overheads~\cite{scp16}. With respect to memory tracking this
%means that
%\eacsl may omit recording memory blocks irrelevant for runtime analysis. It
%is, however, still possible to systematically instrument the code for handling
%potential memory-related annotations even when it is not required. This
%feature can be enabled using the \shortopt{M} switch of \eacslgcc or
%\shortopt{e-acsl-full-mmodel} option of the \eacsl plug-in.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% \subsection{Debug Libraries}
%%% \label{sec:runtime-debug}
%
%%% \eacsl memory models mentioned in Section~\ref{sec:memory} are optimized for
%%% performance (via \shortopt{-O2} compiler flag). An \eacsl installation also
%%% provides unoptimized versions of these libraries (suffixed \T{-dbg.a}) aimed at
%%% being used during debugging.
%
%%% The main difference between unoptimized (debug) and optimized (production)
%%% libraries is that debug libraries include assertions validating the correctness
%%% of the RTL itself. For instance, during memory tracking such assertions ensure
%%% that during a run of a program all tracked memory blocks are disjoint.
%
%%% Consider the following program that violates an annotation
%%% in function \T{foo} via a call \T{bar(p+1)} in the main function:
%
%%% \listingname{rte\_debug.i}
%%% \cinput{examples/rte_debug.i}
%
%%% A monitored executable linked against a debug version of \eacsl RTL
%%% can ge generated using
%%% \longopt{rt-debug} flag of \eacslgcc as follows:
%
%%% \begin{shell}
%%% \$ e-acsl-gcc.sh --rt-debug -c -omonitored_rte_debug.c -Orte_debug rte_debug.i
%%% \end{shell}
%
%%% A run of the monitored executable shows violation of the \eacsl annotation.
%%% \begin{shell}
%%% \$ rte_debug.e-acsl
%%% Assertion failed at line 2 in function foo.
%%% The failing predicate is:
%%% \valid(p).
%%% /** Backtrace **************************/
%%% trace at e_acsl_trace.h:45
%%% - exec_abort at e_acsl_assert.h:58
%%% - runtime_assert at e_acsl_assert.h:93
%%% - foo at monitored_rte_debug.c:92
%%% - bar at monitored_rte_debug.c:102
%%% - main at monitored_rte_debug.c:119
%%% /***************************************/
%%% Aborted
%%% \end{shell}
%
%%% A run of an executable linked against an unoptimized RTL shows a stack trace on
%%% the instrumented program saved to \T{monitored\_rte\_debug.c}. In the example
%%% above calls to \T{exec\_abort} and \T{runtime\_assert} belong to the \eacsl
%%% RTL, whereas calls to \T{foo}, \T{bar} and \T{main} belong to the input
%%% program.
%
%%% It should be noted that because debug versions of \eacsl RTL are compiled
%%% without compile-time optimizations and execute additional assertions the
%%% runtime overheads of the instrumented programs linked against the debug
%%% versions of \eacsl RTL can be up to 10 times greater than those linked against
%%% the production versions.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Runtime Monitor Behavior}
%
%When a predicate is checked at runtime function
%\texttt{\_\_e\_acsl\_assert}\codeidxdef{e\_acsl\_assert} is called. By
%default this function does nothing if the predicate is valid, and prints
%an error message and aborts the execution (by raising an \texttt{ABORT}
%signal via a call to the \C function \texttt{abort}) if the predicate is invalid.
%
%The default behavior of \texttt{\_\_e\_acsl\_assert}\codeidxdef{e\_acsl\_assert}
%can be altered using \longopt{fail-with-code=CODE} option of \eacslgcc
%that uses \texttt{exit(CODE)} instead of \texttt{abort}.
%
%For instance, the program generated using the following command
%exits with code \texttt{5} on a predicate violation.
%\begin{shell}
% \$ e-acsl-gcc.sh -c --fail-with-code=5 foo.c
%\end{shell}
%
%Forceful termination of a monitored program can be disabled using
%\longopt{keep-going} option. If specified, \eacslgcc generates code that
%reports each property violated at runtime but allows the monitored program to
%continue execution. \longopt{keep-going} should be used with caution:
%such unterminated executions may lead to incorrect verification results.
%
%\eacslgcc also provides a way to use an alternative definition of
%\texttt{\_\_e\_acsl\_assert}\codeidxdef{e\_acsl\_assert}.
%
%Custom implementation of \texttt{\_\_e\_acsl\_assert}\codeidxdef{e\_acsl\_assert}
%should be provided in a separate \C (or object file) and respect the
%following signature:
%
%\cinput{examples/assert_sign.c}
%
%Below is an example which prints the validity status of each property but never
%exits.
%
%\listingname{my\_assert.c}
%\cinput{examples/my_assert.c}
%
%A monitored program that uses custom definition of
%\texttt{\_\_e\_acsl\_assert}\codeidxdef{e\_acsl\_assert} can then be generated
%as follows using \longopt{external-assert} option of \eacslgcc to replace the
%default implementation of \texttt{\_\_e\_acsl\_assert}\codeidxdef{e\_acsl\_assert}
%with the customized one specified in \texttt{my\_assert.c}
%
%\begin{shell}
%\$ e-acsl-gcc.sh -c first.i my_assert.c -Ofirst --external-assert=my\_assert.c
%\$ ./first.e-acsl
%Assertion at line 3 in function main is valid.
%The verified predicate was: `x == 0'.
%Assertion at line 4 in function main is invalid.
%The verified predicate was: `x == 1'.
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{Additional Verifications} % <<<
%
%In addition to checking annotations at runtime, \eacsl is able to detect a few
%errors at runtime.
%
%\subsection{Format String Vulnerabilities}
%\index{Format String Vulnerabilities}
%
%Whenever option \longopt{validate-format-strings} of \eacslgcc is set, \eacsl
%detects format-string vulnerabilities in \texttt{printf}-like function. Below is
%an example which incorrectly tries to print a string through a \texttt{\%d}
%format.
%
%\listingname{printf.c}
%\cinput{examples/printf.c}
%
%This error can be detected by \eacsl as follows.
%
%\begin{shell}
%\$ e-acsl-gcc.sh -Oprintf -c --validate-format-strings printf.c
%\$ ./printf.e-acsl
%printf: directive 1 ('%d') expects argument of type 'int' but the corresponding
%argument has type 'char*'
%Abort
%\end{shell}
%
%\subsection{Incorrect Usage of Libc Functions}
%\index{Libc}
%
%Whenever option \longopt{libc-replacements} of \eacslgcc is set, \eacsl is
%able to detect incorrect usage of the following libc functions:
%\begin{itemize}
%\item \texttt{memcmp}
%\item \texttt{memcpy}
%\item \texttt{memmove}
%\item \texttt{memset}
%\item \texttt{strcat}
%\item \texttt{strncat}
%\item \texttt{strcmp}
%\item \texttt{strcpy}
%\item \texttt{strncpy}
%\item \texttt{strlen}
%\end{itemize}
%
%For instance, the program below incorrectly uses \texttt{strcpy} because the
%destination string should contain at least 7 \texttt{char} (and not only 6).
%
%\listingname{strcpy.c}
%\cinput{examples/strcpy.c}
%
%This error can be reported by \eacsl as follows.
%
%\begin{shell}
%\$ e-acsl-gcc.sh -Ostrcpy -c --libc-replacements strcpy.c
%\$ ./strcpy.e-acsl
%strlen: insufficient space in destination string, at least 7 bytes required
%Abort
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{Incomplete Programs} % <<<
%\label{sec:incomplete}
%
%Executing a \C program requires to have a complete application. However, the
%\eacsl plug-in does not necessarily require to have it to generate the
%instrumented code. It is still possible to instrument a partial program with no
%entry point or in which some functions remain undefined.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Programs without Main}
%\label{sec:no-main}
%\index{Program!Without main}
%
%The \eacsl plug-in can instrument a program without the \T{main} function.
%Consider for instance the following annotated program without \texttt{main}.
%
%\listingname{no\_main.i}
%\cinput{examples/no_main.i}
%
%The instrumented code is generated as usual, even though you get an additional
%warning.
%\begin{shell}
%\$ e-acsl-gcc.sh no_main.i -omonitored_no_main.i
%<skip preprocessing command line>
%[e-acsl] beginning translation.
%[e-acsl] warning: cannot find entry point `main'.
% Please use option `-main' for specifying a valid entry point.
% The generated program may miss memory instrumentation.
% if there are memory-related annotations.
%[e-acsl] translation done in project "e-acsl".
%\end{shell}
%
%This warning indicates that the instrumentation could be incorrect if the
%program contains memory-related annotations (see
%Section~\ref{sec:limits:no-main}). That is not the case here, so you can
%safely ignore it. Now, it is possible to compile the generated code with a \C
%compiler in a standard way, and even link it against additional files such as
%the following:
%
%\listingname{main.c}
%\cinput{examples/main.c}
%
%\begin{shell}
%\$ e-acsl-gcc.sh -C monitored_no_main.i main.c -Owith_main
%\$ ./with_main.e-acsl
%x = 65536
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\
%\subsection{Undefined Functions}
%\label{sec:no-code}
%\index{Function!Undefined}
%
%Similar to the above section \eacsl is capable of instrumenting a program which
%has definitions of arbitrary functions missing. Consider for instance the
%following annotated program for which the implementation of the function
%\texttt{my\_pow} is not provided.
%
%\listingname{no\_code.c}
%\cinput{examples/no_code.c}
%
%The instrumented code is generated as usual, even though you get an additional
%warning.
%\begin{shell}
%\$ e-acsl-gcc.sh -omonitored_no_code.i no_code.c
%[e-acsl] beginning translation.
%[e-acsl] warning: annotating undefined function `my_pow':
% the generated program may miss memory instrumentation
% if there are memory-related annotations.
%no_code.c:9:[kernel] warning: No code nor implicit assigns clause for function my_pow,
%generating default assigns from the prototype
%[e-acsl] translation done in project "e-acsl".
%\end{shell}
%
%Similar to the previous Section the warning indicates that the instrumentation
%could be incorrect if the program contains memory-related annotations (see
%Section~\ref{sec:limits:no-code}). That is still not the case here, so you can
%safely ignore it.
%
%\listingname{pow.i}
%\cinput{examples/pow.i}
%
%Similar to the example shown in the previous section you can generate the
%executable by providing definition of \T{my\_pow}.
%
%\begin{shell}
%\$ e-acsl-gcc.sh -C -Ono_code pow.i monitored_no_code.i
%\$ ./no_code.e-acsl
%Postcondition failed at line 5 in function my_pow.
%The failing predicate is:
%\old(n % 2 == 0) ==> \result >= 1.
%Aborted
%\end{shell}
%
%The execution of the corresponding binary fails at runtime: actually, our
%implementation of the function \texttt{my\_pow} overflows in case of large
%exponentiations.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{Combining E-ACSL with Other Plug-ins} % <<<
%\label{sec:combine}
%
%As the \eacsl plug-in generates a new \framac project, it is easy to run any
%plug-in on the generated program, either in the same \framac session (thanks to
%the option \shortopt{then} or through the GUI), or in another one. The only
%issue might be that, depending on the plug-in, the analysis may be imperfect if
%the generated program uses \gmp or the dedicated memory library: both make
%intensive use of dynamic structures which are usually difficult to handle by
%analysis tools.
%
%Another way to combine \eacsl with other plug-ins is to run \eacsl last. For
%instance, the \rte plug-in~\cite{rte} may be used to generate annotations
%corresponding to runtime errors. Then the \eacsl plug-in may generate an
%instrumented program to verify that there are no such runtime errors during the
%execution of the program.
%
%Consider the following program.
%
%\listingname{combine.i}
%\cinput{examples/combine.i}
%To check at runtime that this program does not perform any runtime error (which
%are potential overflows in this case), just do:
%
%\begin{shell}
%\$ frama-c -rte combine.i -then -e-acsl -then-last -print
% -ocode monitored_combine.i
%\$ e-acsl-gcc.sh -C -Ocombine monitored_combine.i
%\$ ./combine.
%\end{shell}
%
%Automatic assertion generation can also be enabled via \eacslgcc directly via
%the following command:
%\begin{shell}
%\$ e-acsl-gcc.sh -c -Ocombine -omonitored_combine.i --rte=all
%\end{shell}
%
%Note the use of \eacslgcc \longopt{rte} option which asks \framac to combine
%\eacsl with \rte plug-in. This plug-in automatically instruments the input
%program with runtime
%assertions before running the \eacsl plug-in on it. \longopt{rte} option of
%\eacslgcc also accepts a comma-separated list of arguments indicating the types
%of assertions to generate. Consult the \eacslgcc man page for a detailed
%list of arguments accepted by \longopt{rte}.
%
%The present implementation of \eacslgcc does not fully support combining \eacsl
%with other \framac plug-ins. However it is still possible to instrument programs
%with \eacsl directly and use \eacslgcc to compile the generated code.
%
%If you run the \eacsl plug-in after another one, it first generates
%a new temporary project in which it links the analyzed program against its own
%library in order to generate the \framac internal representation of the \C
%program (\emph{aka} AST), as explained in Section~\ref{sec:run}. Consequently,
%even if the \eacsl plug-in keeps the maximum amount of information, the results
%of already executed analyzers (such as validity status of annotations) are not
%known in this new project. If you want to keep them, you have to set the option
%\shortopt{e-acsl-prepare} when the first analysis is asked for.
%
%In this context, the \eacsl plug-in does not generate code for annotations
%proven valid by another plug-in, except if you explicitly set the option
%\shortopt{e-acsl-valid}. For instance, \Eva~\cite{eva} is able to
%prove that there is no potential overflow in the previous program, so the \eacsl
%plug-in does not generate additional code for checking them if you run the
%following command.
%\begin{shell}
%\$ frama-c -e-acsl-prepare -rte combine.i -then -val -then -e-acsl \
% -then-last -print -ocode monitored_combine.i
%\end{shell}
%The additional code will be generated with one of the two following commands.
%\begin{shell}
%\$ frama-c -e-acsl-prepare -rte combine.i -then -val -then -e-acsl \
% -e-acsl-valid -then-last -print -ocode monitored_combine.i
%\$ frama-c -rte combine.i -then -val -then -e-acsl \
% -then-last -print -ocode monitored_combine.i
%\end{shell}
%In the first case, that is because it is explicitly required by the option
%\shortopt{e-acsl-valid} while, in the second case, that is because the option
%\shortopt{e-acsl-prepare} is not provided on the command line which results in
%the fact that the result of the value analysis are unknown when the \eacsl
%plug-in is running.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{Customization} % <<<
%\label{sec:custom}
%
%There are several ways to customize the \eacsl plug-in.
%
%First, the name of the generated project -- which is \texttt{e-acsl} by default
%-- may be changed by setting the option \shortopt{e-acsl-project} option of the
%\eacsl plug-in. While there is not direct support for this option in \eacslgcc
%you can pass it using \shortopt{F} switch:
%
%\listingname{check.i}
%\cinput{examples/check.i}
%
%\begin{shell}
%\$ e-acsl-gcc.sh -F"-e-acsl-project foobar" check.i
%<skip preprocessing commands>
%[e-acsl] beginning translation.
%check.i:4:[e-acsl] warning: E-ACSL construct `right shift' is not yet supported.
% Ignoring annotation.
%[e-acsl] translation done in project "foobar".
%\end{shell}
%
%Further, the \eacsl plug-in option \shortopt{e-acsl-check} does not generate a
%new project but verifies that each annotation is translatable. Then it produces
%a summary as shown in the following example (left or right shifts in annotations
%are not yet supported by the \eacsl plug-in~\cite{eacsl-implem}).
%
%\begin{shell}
%\$ frama-c -e-acsl-check check.i
%<skip preprocessing commands>
%check.i:4:[e-acsl] warning: E-ACSL construct `left/right shift' is not yet supported.
% Ignoring annotation.
%[e-acsl] 0 annotation was ignored, being untypable.
%[e-acsl] 1 annotation was ignored, being unsupported.
%\end{shell}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%>>>
%\section{Verbosity Policy} % <<<
%\label{sec:verbose}
%
%By default, \eacsl does not provide much information when it is running. Mainly,
%it prints a message when it begins the translation, and another one when the
%translation is done. It may also display warnings when something requires the
%attention of the user, for instance if it is not able to translate an
%annotation. Such information is usually enough but, in some cases, you might
%want to get additional control on what is displayed. As quite usual with
%\framac plug-ins, \eacsl offers two different ways to do this: the verbosity
%level which indicates the \emph{amount} of information to display, and the
%message categories which indicate the \emph{kind} of information to display.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Verbosity Level}
%
%The amount of information displayed by the \eacsl plug-in is settable by the
%option \shortopt{e-acsl-verbose}. It is 1 by default. Below is indicated
%which information is displayed according to the verbosing level. The level $n$
%also displays the information of all the lower levels.
%
%\begin{center}
%\begin{tabular}{|l|l|}
%\hline
%\shortopt{e-acsl-verbose 0}& only warnings and errors\\
%\hline
%\shortopt{e-acsl-verbose 1}& beginning and ending of the translation\\
%\hline
%\shortopt{e-acsl-verbose 2}& different parts of the translation and
% functions-related information\\
%\hline
%\shortopt{e-acsl-verbose 3}& predicates- and statements-related information\\
%\hline
%\shortopt{e-acsl-verbose 4}& terms- and expressions-related information
%\\
%\hline
%\end{tabular}
%\end{center}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%\subsection{Message Categories}
%
%The kind of information to display is settable by the option
%\shortopt{e-acsl-msg-key} (and unsettable by the option
%\shortopt{e-acsl-msg-key-unset}). The different keys refer to different
%parts of the translation, as detailed below.
%
%\begin{center}
%\begin{tabular}{|l|l|}
%\hline
%analysis & minimization of the instrumentation for memory-related annotation
%(section~\ref{sec:memory}) \\
%\hline
%duplication & duplication of functions with contracts
%(section~\ref{sec:no-code}) \\
%\hline
%translation & translation of an annotation into \C code\\
%\hline
%typing & minimization of the instrumentation for integers
%(section~\ref{sec:integers}) \\
%\hline
%\end{tabular}
%\end{center}
% >>>
\newcommand{\eacslversion}{Chlorine-20180501+dev\xspace}
\newcommand{\fcversion}{Chlorine\xspace}
\newcommand{\fclangversion}{Chlorine+\xspace}
\newcommand{\fcversion}{Chlorine+\xspace}
\newcommand{\fcversion}{Potassium+\xspace}
......@@ -8,6 +8,22 @@ The \fclang plug-in supports the \acslpp extension of \acsl for C++ programs and
it is built on the Clang\footnote{https://clang.llvm.org/} compiler infrastructure and uses Clang for
parsing C++.
The \fclang plug-in intends to provide a full translation of C++ and \acslpp into the \framac internal representation, and from there be able to be analyzed by other \framac plug-ins.
This is a work in progress.
The following sections describe the limitations of the current implementation.
\begin{itemize}
\item The plug-in aims for the C++11 version of C++
\item \acslpp is described in the companion \acslpp reference manual, also a part of the \framac release.
\end{itemize}
Notes
\begin{itemize}
\item Which version of Clang?
\item Clang (8) support C++98 (except exported templates, later removed) and C++11 and current draft standard for C++1y
\item see https://clang.llvm.org/docs/UsersManual.html\#cxx for supported features in clang C++
\end{itemize}
%The \fclang version you are using is indicated by the
%command \texttt{frama-c -e-acsl-version}\optionidx{-}{e-acsl-version}. \fclang
%automatically translates an annotated C program into another program that fails
......
\chapter{Known Limitations}
The development of the \fclang plug-in is still ongoing.
%First, the \eacsl
%reference manual~\cite{eacsl} is not yet fully supported. Which annotations can
%already be translated into \C code and which cannot is defined in a separate
%document~\cite{eacsl-implem}. Second, even though we do our best to avoid them,
%bugs may exist. If you find a new one, please report it on the bug tracking
%system\footnote{\url{http://bts.frama-c.com}} (see Chapter 10 of the \framac
%User Manual~\cite{userman}). Third, there
%are some additional known limitations, which could be annoying for the user in
%some cases, but are tedious to lift. Please contact us if you are interested in
%lifting these limitations\footnote{Read \url{http://frama-c.com/support.html}
% for additional details.}.
%\section{Uninitialized Values}
%\index{Uninitialized value}
%
%As explained in Section~\ref{sec:runtime-error}, the \eacsl plug-in should never
%translate an annotation into \C code which can lead to a runtime error. This is
%enforced, except for uninitialized values which are values read before having
%been written.
%
%\listingname{uninitialized.i}
%\cinput{examples/uninitialized.i}
%
%If you generate the instrumented code, compile it, and finally execute it, you
%may get no runtime error depending on your \C compiler, but the behavior is
%actually undefined because the assertion reads the uninitialized variable
%\lstinline|x|. You should be caught by the \eacsl plug-in, but that is not
%the case yet.
%\begin{shell}
%
%\$ e-acsl-gcc.sh uninitialized.i -c -Omonitored_uninitialized
%monitored_uninitialized.i: In function 'main':
%monitored_uninitialized.i:44:16: warning: 'x' is used uninitialized in this function
%[-Wuninitialized]
%\$ ./monitored_uninitialized.e-acsl
%\end{shell}
%
%This is more a design choice than a limitation: should the \eacsl plug-in
%generate additional instrumentation to prevent such values from being evaluated,
%the generated code would be much more verbose and slower.
%
%If you really want to track such uninitializations in your annotation, you have
%to manually add calls to the \eacsl predicate
%\lstinline|\initialized|~\cite{eacsl}.
%
%\section{Incomplete Programs}
%
%Section~\ref{sec:incomplete} explains how the \eacsl plug-in is able to handle
%incomplete programs, which are either programs without main, or programs
%containing undefined functions (\emph{i.e.} functions without body).
%
%However, if such programs contain memory-related annotations, the generated code
%may be incorrect. That is made explicit by a warning displayed when the \eacsl
%plug-in is running (see examples of Sections~\ref{sec:no-main} and
%\ref{sec:no-code}).
%
%\subsection{Programs without Main}
%\index{Program!Without main}
%\label{sec:limits:no-main}
%
%The instrumentation in the generated program is partial for every program
%without main containing memory-related annotations, except if the option
%\optionuse{-}{e-acsl-full-mmodel} or the \eacsl plug-in (of \shortopt{M} option
%of \eacslgcc) is provided. In that case, violations of such annotations are
%undetected.
%
%Consider the following example.
%
%\listingname{valid\_no\_main.c}
%\cinput{examples/valid_no_main.c}
%
%You can generate the instrumented program as follows.
%\begin{shell}
%\$ e-acsl-gcc.sh -ML -omonitored_valid_no_main.i valid_no_main.c
%<skip preprocessing commands>
%[e-acsl] beginning translation.
%<skip warnings about annotations from the Frama-C libc
% which cannot be translated>
%[kernel] warning: no entry point specified:
% you must call function `__e_acsl_memory_init' by yourself.
%[e-acsl] translation done in project "e-acsl".
%\end{shell}
%
%The last warning states an important point: if this program is linked against
%another file containing \texttt{main} function, then this main function must
%be modified to insert a call to the function \texttt{\_\_e\_acsl\_memory\_init}
%\index{e\_acsl\_memory\_init@\texttt{\_\_e\_acsl\_memory\_init}} at the very
%beginning. This function plays a very important role: it initializes metadata
%storage used for tracking of memory blocks. Unless this call is inserted the
%run of a modified program is likely to fail.
%
%While it is possible to add such intrumentation manually we recommend using
%\eacslgcc. Consider the following incomplete program containing \T{main}:
%
%\listingname{modified\_main.c}
%\cinput{examples/modified_main.c}
%
%Then just compile and run it as explained in Section~\ref{sec:memory}.
%
%\begin{shell}
%\$ e-acsl-gcc.sh -M -omonitored_modified_main.i modified_main.c
%\$ e-acsl-gcc.sh -C -Ovalid_no_main monitored_modified_main.i monitored_valid_no_main.i
%\$ ./valid_no_main.e-acsl
%Assertion failed at line 11 in function f.
%The failing predicate is:
%freed: \valid(x).
%Aborted
%\end{shell}
%
%Also, if the unprovided main initializes some variables, running the
%instrumented code (linked against this main) could print some warnings from the
%\eacsl memory library\footnote{see
% \url{https://bts.frama-c.com/view.php?id=1696} for an example}.
%
%\subsection{Undefined Functions}
%\label{sec:limits:no-code}
%\index{Function!Undefined}
%
%The instrumentation in the generated program is partial for a program $p$ if and
%only if $p$ contains a memory-related annotation $a$ and an undefined function
%$f$ such that:
%\begin{itemize}
%\item either $f$ has an (even indirect) effect on a left-value occurring in $a$;
%\item or $a$ is one of the post-conditions of $f$.
%\end{itemize}
%A violation of such an annotation $a$ is undetected. There is no workaround yet.
%
%Also, the option \optionuse{-}{e-acsl-check} does not verify the annotations of
%undefined functions. There is also no workaround yet.
%
%\section{Recursive Function}
%\index{Function!Recursive}
%
%Programs containing recursive functions have the same limitations as the ones
%containing undefined functions (Section~\ref{sec:limits:no-code}) and
%memory-related annotations.
%
%%% JS: this issue should have been fixed:
%%% Also, even though there is no such annotations, the generated code may call a
%%% function before it is declared. When this behavior appears remains
%%% unspecifed. The generated code is however easy to fix by hand.
%
%\section{Variadic Functions}
%\index{Function!Variadic}
%
%Programs containing undefined variadic functions with contracts
%are not yet supported. Using the \variadic plug-in of \framac could be a
%solution in some cases, but its generated code cannot always be compiled.
%
%\section{Function Pointers}
%\index{Function!Pointer}
%
%Programs containing function pointers have the same limitations on
%memory-related annotations as the ones containing undefined or recursive
%functions.
%
%\section{Requirements to Input Programs}
%\index{Function!Input}
%
%\subsection{\eacsl Namespace}
%While \eacsl uses source-to-source transformations and not binary
%instrumentations it is important that the source code provided at input does
%not contain any variables or functions prefixed \T{\_\_e\_acsl\_}. \eacsl
%reserves this namespace for its transformations, and therefore an input program
%containing such symbols beforehand may fail to be instrumented or compiled.
%
%\subsection{Memory Management Functions}
%Programs providing custom definitions of \T{syscall}, \T{mmap} or
%\T{sbrk} should be rejected. Also, an input programs should not modify
%memory-management functions namely \T{malloc}, \T{calloc}, \T{realloc},
%\T{free}, \T{cfree}, \T{posix\_memalign} and \T{aligned\_alloc}. \eacsl relies
%on these functions in order to track heap memory. Further, correct heap memory
%monitoring requires to limit allocation and deallocation of heap memory to
%POSIX-compliant memory management functions
%listed above. Monitoring of programs that allocate memory using non-standard or
%obsolete functions (e.g., \T{valloc}, \T{memalign}, \T{pvalloc}) may not work
%correctly.
%
%\section{Limitations of E-ACSL Monitoring Libraries}
%\index{Function!Libraries}
%
%\eacsl monitoring library implementing segment-based memory model is fairly
%recent and presently has very limited support for 32-bit architectures. When
%using a 32-bit machine we recommend using the bittree-based memory model
%instead.
......@@ -44,6 +44,10 @@
\newcommand{\changeinsection}[2]{\textbf{Section \ref{sec:#1}:} #2.}
\newcommand{\todo}[1]{{\large \textbf{TODO: #1.}}}
\newcommand{\option}[1]{\textbf{#1}}
\newcommand{\irg}{\lstinline|framaCIRGen|i\xspace}
\newcommand{\fcl}{\lstinline|frama-clang|\xspace}
\newcommand{\fc}{\lstinline|frama-c|\xspace}
\newcommand{\markdiff}[1]{{\color{blue}{#1}}}
\newenvironment{markdiffenv}[1][]{%
......
......@@ -64,12 +64,23 @@ evolve in the future.
We gratefully thank the people who contributed to this document:
David R. Cok, Virgile Prevosto, Armand Pucetti, Franck Verdrine.
TBD-VESSEDIA
This work is done in the context of project VESSEDIA,
which received funding from the European Union's 2020
Research and Innovation Program under grant agreement
No. 731453.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\include{introduction}
\include{description}
\include{limitations}
\include{preprocessing}
\section{\acslpp constructs}
\textit{This section is not yet written}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
......
\section{Preprocessing}
As a refresher, the C/C++ preprocessor (CPP) (cf. \url{https://gcc.gnu.org/onlinedocs/cpp/}) conceptually implements the following operations on a C++ source file:
\begin{itemize}
\item The input is read and broken it into a sequence of physical lines according to the line terminators (ASCII character sequences \\r, \\n, or \\r\\n).
\item Each C trigraph is replaced by its corresponding character.
\item Any backslash-whitespace-line-terminator sequence is removed and the line that it ends is combined with the following line.
\item Comments are replaced by single spaces. This requires tokenizing the input to avoid recognizing comment markers within strings as indicating a comment. Note that this allows block comments to hide line terminations.
\item The input text is divided into preprocessing tokens grouped in logical lines. Each preprocessor token becomes a compiler token (except where \#\# concatenation occurs). However, \acslb tokens are slightly different, as described below.
\item The source text is transformed according to any preprocessing directives contained within it. Each preprocessing directive must be contained within one logical line
\end{itemize}
The result is a sequence of preprocessing tokens that is passed on to the
remaining compiler phases.
\subsection{Trigraphs}
TBD - are these supported by clang?
\subsection{Preprocessor tokens}
Preprocessor tokens (per CPP) belong to one of five categories. White space (space, tab, TBD) serves only to separate tokens; it is not needed between tokens whose concatenation is not a single token. Line terminators also separate tokens and also delineate certain features: preprocessing directives and string literals do not extend over more than one (logical) line.
\begin{itemize}
\item identifiers: character sequences that match the regular expression \texttt{[a-zA-Z\_][a-zA-Z\_0-9]*} . Dollar signs are allowed as non-digit identifier characters if the clang option \texttt{-fdollars-in-identifiers} is used. (TBD - on by default?)
\item number: character sequences that match the regular expression \texttt{[.]?[0-9]([0-9a-zA-Z.]|[eEpP][+-]))*} . (TBD - dollar signs also allowed?)
\item string literals: character sequence enclosed in " " or ' ' or < >, with \textbackslash " for " in a double-quoted literal (that is not a header file name) and \' for ' in a single-quoted literal.
\item punctuator: all single non-alphanumeric printable characters except \@, \# and `, and all multi-punctuation sequences that meaningful to C/C++ (e.g. >>>= ), but not an arbitrary multi-punctuation character sequence.
\\item other tokens: \@, \#, ` and all non-ASCII single characters.
\end{itemize}
TBD - unicode characters
Note that not all preprocessor tokens are valid C/C++ parser tokens. Tokens in the other category have no meaning to C/C++ and the \texttt{number} category allows many sequences that are not legal C/C++ numeric tokens. These tokens will generally provoke compiler errors. For example in C/C++, 0..2 is one token and is not interpreted as two consecutive numeric tokens.
\acsl and \acslpp have slightly different tokens than the above, so the preprocessor tokens need to be re-tokenized in some cases:
\begin{itemize}
\item The \@ token is a whitespace character in \acslb.
\item There are some \acslb multi-character punctuator tokens that are not
preprocessor tokens:
\begin{itemize}
\item[] $==>$ (logical implies)
\item[] $-->$ (bit-wise implies)
\item[] $<==>$ (logical equality)
\item[] $<-->$ (bit-wise equality)
\item[] TBD any more?
\end{itemize}
These \acslb tokens need to be assembled from multiple CPP tokens (and those CPP tokens must not be separated by white space)
\item A CPP numeric token that contains .. will not be a legal C/C++ number, but may be multiple legal \acslb tokens with the .. representing the range operator. For example, the single CPP token \texttt{0..last} is retokenized for \acslb as if it were written \texttt{0 .. last} .
\item \acslb allows certain built-in non-ASCII symbols, namely
\begin{itemize}
\item[] $\forall$ (unicode 0x2200) - universal quantifier
\item[] $\exists$ (unicode 0x2203) - existential quantifier
% \item[] $\equal$ (unicode 0x2261) - equality
% \item[] $\notequal$ (unicode 0x2262) - inequality
\item[] $\leq$ (unicode 0x2264) - less than or equal
\item[] $\geq$ (unicode 0x2265) - greater than or equal
% \item[] $\minus$ (unicode 0x2212) - minus
% \item[] $\implies$ (unicode 0x21D2) - implies
% \item[] $\equiv$ (unicode 0x21D4) - equivalence
\item[] $\bigwedge$ (unicode 0x2227) - logical and
\item[] $\bigvee$ (unicode 0x2228) - logical or
\item[] $\neg$ (unicode 0x00AC) - logical negation
% \item[] $\logicalxor$ (unicode 0x22BB) - logical inequivalence
\item[] $\subset$ (unicode 0x2208) - subset
% \item[] $\Boolean$ (unicode 0x1D539) - set of booleans
% \item[] $\Integer$ (unicode 0x2124) - set of integers
% \item[] $\Real$ (unicode 0x211D) - set of reals
\item TBD - more - ensure unicode equivalent
\end{itemize}
\end{itemize}
\subsection{Preprocessor directives}
A preprocessing directive consists of a line (after the previous preprocessing phaseshave been completed) that begins with optional white space, the '\#' character, additional optional white space, and a preprocessor directive identifier.
The preprocessing language contains a fixed set of preprocessing directive identifiers:
\begin{itemize}
\item \texttt{define}, \texttt{undef}
\item \texttt{if}, \texttt{ifdef}, \texttt{ifndef}, \texttt{elif}, \texttt{else}, \texttt{endif}
\item \texttt{warning}, \texttt{error}
\item \texttt{include}
\item \texttt{line}
\item \texttt{pragma}
\end{itemize}
In addition, identifiers that have been defined (by a \#define directive) as macros are expanded according to the macro expansion rules (not described here).
Now \acslb annotations are contained in C/C++ comments.
Consequently, any directives contained in the annotation are not seen when the source file is processed simply as a C/C++ source file. However, the effect of some directives lasts until the end of the source file.
Accordingly, \acslpp imposes constraints on the directives that may be present within annotations:
\begin{itemize}
\item \texttt{define} and \texttt{undef} are not permitted in an annotation
\item \texttt{if}, \texttt{ifdef}, \texttt{ifndef}, \texttt{elif}, \texttt{else}, \texttt{endif} are permitted but must be completely nested within the annotation in which they appear (an \#endif and its corresponding \#if/\#ifdef/\#ifndef must be in the same annotation comment.)
\item \texttt{warning} and \texttt{error} are permitted
\item \texttt{include} is permitted, but will cause errors if it contains, as is almost always the case, either \texttt{define} or \texttt{pragma} directives
\item \texttt{line} - TBD
\item \texttt{pragma} is not permitted
\end{itemize}
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment