diff --git a/doc/userman/description.tex b/doc/userman/description.tex index e63d16166492e35102dd077b08ac5a9bdb2b262b..7f6a4d0a75a6d119d96dc0efe7c44e89c17cde54 100644 --- a/doc/userman/description.tex +++ b/doc/userman/description.tex @@ -66,7 +66,8 @@ The launching of \irg by \framac includes the following options by default. The \item \verb|-D__FC_MACHDEP_86_32| -- also set according to the chosen architecture \item \verb|-std=c++11| -- target C++11 features \item \verb|-nostdinc| -- use \fcl and \framac system header files, and not the compiler's own header files -\item \verb|-I$FRAMAC_SHARE/frama-clangs/libc++ -I$FRAMAC_SHARE/libc| -- include the \fclang and \framac header files, which contain system library definitions with \acslpp annotations (the paths used are controlled by the \fc options \lstinline|-cxx-c++stdlib-path| annd \lstinline|-cxx-cstdlib-path|). +\item \verb|-I$FRAMAC_SHARE/frama-clangs/libc++ -I$FRAMAC_SHARE/libc| -- include the \fclang and \framac header files, which contain system library definitions with \acslpp annotations (the paths used are controlled by the \fc options \lstinline|-cxx-c++stdlib-path| and \lstinline|-cxx-cstdlib-path|). +\item \verb|--annot| or \verb|--no-annot| according to the \verb|-annot| or \verb|-no-annot| \Framac kernel option \item \verb|-stop-annot-error| if the corresponding option (\lstinline|-fclang-warn-key=annot-error=abort|) is given to \framac \item options to set the level of info messages and warning messages, based on options on the \fc command-line \end{itemize} @@ -105,13 +106,17 @@ The most important of the options are these: Multiple instances of this option have a cumulative effect, in order (rather than later instances replacing earlier ones). \item \option{-print} -- prints out the input file seen by \fc; when \fcl is being used this is the input file after pre-processing and translation from \cpp to \C. Thus this output can be useful to see (and debug) the results of \fcl's transformations. + \item \option{-kernel-warn-key=annot-error=<val>} sets the behavior of \framac, including \fclang, when a parsing error is encountered. The default value (set by the kernel) is \texttt{abort}, which terminates processing upon the first error; a more useful alternative is \texttt{active}, which reports errors but continues processing further annotations. \item \option{-machdep <arg>} -- sets the target machine architecture, cf. \S\ref{sec:bit} \item \option{-fclang-msg-key <categories>} -- sets the amount of informational messages \item \option{-fclang-warn-key <categories>} -- sets the amount and behavior of warnings \item \option{-fclang-verbose <n>} -- sets the amount of information from the \fclang plug-in \item \option{-fclang-debug <n>} -- sets the amount of debug information from the \fclang plug-in + \item \option{-annot} -- enables processing \NAME annotations (enabled by default) + \item \option{-no-annot} -- disables processing \NAME annotations \end{itemize} +Note that the \framac option \verb|-no-pp-annot| is ignored by \fclang. Preprocessing is always performed on the source input (unless annotations are ignored entirely using \verb|-no-annot|). \section{Include directories} By default \irg is given the paths to the two directories containing the \fcl and \fc header files, which include \acslpp specifications for the \cpp library functions. The default paths to these directories are set by the \lstinline|-cxx-c++stdlib-path| and \lstinline|-cxx-cstdlib-path| options. @@ -135,29 +140,33 @@ As an example, to perform wp checking of files \lstinline|a.cpp| and \lstinline| \begin{itemize} \item The \textbf{-machdep} option to \framac. See the allowed values using the command\\ \centerline{ \lstinline|frama-c -machdep help|.} - For example, with a value of \lstinline|x86\_32|, \lstinline|sizeof(long)| has a value of 4, whereas with the option \textbf{-machdep x86\_64}, \lstinline|sizeof(long)| has a value of 8. -\item Alternately, the value of \textbf{-machdep} can be set instead using an environment variable: \lstinline|\_\_FC\_MACHDEP|. The variable can be set either in the shell environment or on the command line with \lstinline|-D\_\_FC\_MACHDEP=...| + For example, with a value of \lstinline|x86_32|, \lstinline|sizeof(long)| has a value of 4, whereas with the option \textbf{-machdep x86\_64}, \lstinline|sizeof(long)| has a value of 8. +\item Alternately, the value of \textbf{-machdep} can be set instead using an environment variable: \lstinline|__FC_MACHDEP|. The variable can be set either in the shell environment or on the command line with \lstinline|-D__FC_MACHDEP=...| \end{itemize} \section{Warnings, errors, and informational output} -Output messages arise from multiple places: from the \fcl plugin, from the \irg lexer and parser, from the \clang parser, and from the \framac kernel (as well as from any other plugins that may be invoked, such as wp). They are controlled by a number of options within the \fc kernel and each plugin. Remember that \cl and \irg options must be put in the \lstinline|-cpp-extra-args| option. +Output messages arise from multiple places: from the \fcl plugin, from the \irg lexer and parser, from the \clang parser, and from the \framac kernel (as well as from any other plugins that may be invoked, such as the \texttt{wp} plug-in). +They are controlled by a number of options within the \framac kernel and each plugin. Remember that \cl and \irg options must be put in the \lstinline|-cpp-extra-args| option. + +Output messages, including errors, are written to standard out, not to standard error, with the exception of a few fatal errors that cause immediate termination, such as the inabvility to launch \irg. These fatal errors are typically caused by misconfiguration of the installation. \subsection{Errors} -Error messages are always output. The key question is whether processing stops or continues u[on encountering an error. +Error messages are always output. +The key question is whether processing stops or continues upon encountering an error. Continuing can result in a cascade of unhelpful error messages, but stopping immediately can hide errors that occur later in source files. \begin{itemize} \item \lstinline|-Werror| is a clang and \irg option that causes any parser warnings to be treated as errors -\item \lstinline|--stop-annot-error| is a \irg option that causes prompt termination on errors (default is to continue) -\item \lstinline|-fclang-warn-key=annot-error=abort| is a \fcl plug-in option that will invoke \irg with \lstinline|--stop-annot-error|. \lstinline|error| and \lstinline|error_once| (instead of \lstinline|abort|) have the same effect; other values for the key will allow continuing after errors. +\item \lstinline|--stop-annot-error| is a \irg option that causes prompt termination on errors (the \irg default is to continue) +\item \lstinline|-fclang-warn-key=annot-error=abort| is a \fcl plug-in option that will invoke \irg with \lstinline|--stop-annot-error|. \lstinline|error| and \lstinline|error_once| (instead of \lstinline|abort|) have the same effect; other values for the key will allow continuing after errors. The default is \texttt{abort}. \end{itemize} \subsection{Warnings} -The various categories of warnings from \fcl can be seen with the command \lstinline|frama-c -fclang-warn-key help|. +The various categories of warnings from \fcl can be seen with the command \\ \centerline{\lstinline|frama-c -fclang-warn-key help|} Warning messages are emitted by default. Warning messages from \irg can be controlled with the \lstinline|-warn| option of \irg, or, equivalently, the \lstinline|-fclang-warn-key=parse| option of \fc. @@ -170,7 +179,7 @@ Warning messages from \irg can be controlled with the \lstinline|-warn| option o \item the \fc option \lstinline|-fclang-warn-key=parse=active| is the same as the \irg option \lstinline|--warn=1| \end{itemize} -\textit{The clang options are not currently integrated with the \fc warning and error key system.} +\textit{The \clang options are not currently integrated with the \fc warning and error key system.} \subsection{Informational output} @@ -186,6 +195,14 @@ invoked automatically. However, it can also be run standalone. In this mode it accepts command-line options and a single input file; it produces a C AST representing the translated \cpp, in the Cil (C Intermediate language) text format. +The exit code from \irg (and the \fclang plug-in) is +\begin{itemize} +\item 0 if processing is successful, including if only warnings or informational messages are emitted +\item 0 if there are no-fatal errors but \option{--no-exit-code} is enabled (the default) +\item 1 if there are non-fatal errors but \option{--exit-code} is enabled +\item 2 if there are fatal errors +\end{itemize} + \section{\irg specific options} \label{sec:fcloptions} @@ -215,6 +232,9 @@ In particular, a debug value of 1 shows the command-line that invokes \irg. \item \option{-{-}exit-code} -- if set, then the exit code of \irg is 1 if errors occur; this is not the default because then \lstinline|frama-c| would terminate upon any error in \lstinline|framaCIRGen| \item \option{-{-}no-exit-code} -- disables \option{-{-}exit-code}, so that the exit code is always 0. + + \item \option{-{-}annot} -- enables processing \NAME annotations (enabled by default) + \item \option{-{-}no-annot} -- disables processing \NAME annotations \end{itemize} @@ -224,7 +244,7 @@ Frama-Clang is built on the \clang \cpp parser. As such, the \irg executable accepts the clang compiler options and passes them on to clang. There are many of these. Many of these are irrelevant to \fcl as they relate to -code generation, whereas \fcl only uses clang for parsing, name +code generation, whereas \fcl only uses \clang for parsing, name and type resolution, and producing the AST. You can see a list of options by running \lstinline|framaCIRGen -help| diff --git a/doc/userman/fclangversion.tex b/doc/userman/fclangversion.tex index 2da020d30f373d9e8374c72b8606d738643b91fb..0695e3220c9a65cc9ec6d5b0429fdd2de1e29642 100644 --- a/doc/userman/fclangversion.tex +++ b/doc/userman/fclangversion.tex @@ -1,4 +1,4 @@ \newcommand{\version}{0.1-REVIEW\xspace} \newcommand{\fclangversion}{0.1.0-REVIEW\xspace} \newcommand{\fcversion}{Potassium+\xspace} -\newcommand{\clangversion}{7.0} +\newcommand{\clangversion}{7.0\xspace} diff --git a/doc/userman/grammar.tex b/doc/userman/grammar.tex index 2378f22fd679f82d179427556ac3d31ebdf5f3d2..fbe4ed558cdf762909ce04ca9398ef793fd6ace2 100644 --- a/doc/userman/grammar.tex +++ b/doc/userman/grammar.tex @@ -8,13 +8,13 @@ This material may be useful for developers and maintainers of \fclang; it is not Recall that \fclang uses clang to parse \lang and a custom parser to parse \acslpp annotations, jointly producing a representation of the \lang and \acslpp source input in the Frama-C intermediate language. The first version of the \acslpp custom parser, written during the STANCE project, used a hand-written bison-like parser, but with function pointers to record state and actions rather than using a tool-generated table to drive the parsing. This design proved to be too brittle and difficult to efficiently evolve as new features were added to \acslpp. -Consequently during the VESSEDIA project, the scanner and parser were completely rewritten, largely retaining the previous design's connections to clang, token definitions, name lookup and type resolution, ad AST generation. +Consequently during the VESSEDIA project, the scanner and parser were completely rewritten, largely retaining the previous design's connections to clang, token definitions, name lookup and type resolution, and AST generation. The new parser uses a recursive descent design in which the names of functions doing the parsing can match the names of non-terminals in the grammar. Consequently the implementation of the parser is much more readable, human checkable, and modifiable as the \acslpp language evolves. The drawback of this design is that \acslpp is not LL(1); it is not even LL(k) for fixed k. Thus some amount of backtracking is required. -The next few subsections describe the problematic aspects of \acslpp and how they are addressed. +The bulleted paragraphs below describe the problematic aspects of \acslpp and how they are addressed. The principal goal of an LL(k) formulation of a grammar is to be able to predict which grammar production is being seen in the input stream from a small amount of look-ahead. Most \acslpp constructs start with a unique keyword (e.g., clauses begin with \lstinline|requires|, \lstinline|ensures| etc.) which serves this purpopse. diff --git a/doc/userman/introduction.tex b/doc/userman/introduction.tex index 85aa5ac16df7461aa69b54eeeadb0b0fb177a9ca..3996d2201afec90b5e8293a487cf236780922c29 100644 --- a/doc/userman/introduction.tex +++ b/doc/userman/introduction.tex @@ -6,7 +6,7 @@ language~\cite{acsl}. This manual documents the \fclang plug-in of \framac, version \fclangversion. The \fclang plug-in supports the \acslpp extension of \acsl for \cpp programs and specifications; it is built on the \clang\footnote{https://clang.llvm.org/} compiler infrastructure and uses \clang for -parsing C++. The plug-in extends clang to parse \acslpp, translating it into an intermediate language for \C and \acsl. +parsing C++. The plug-in extends \clang to parse \acslpp, translating source files containing \cpp and \acslpp into an intermediate language for \C and \acsl. The \fclang plug-in intends to provide a full translation of \cpp and \acslpp into the \framac internal representation, and from there to allow \cpp programs and \acslpp specifications to be analyzed by other \framac plug-ins. \textit{This is a work in progress.} diff --git a/doc/userman/limitations.tex b/doc/userman/limitations.tex index 31f686f5d503120b71137e6db823efc7e308ae1e..588716fb1302c3c9c5c7ae294ec6114ab89d682e 100644 --- a/doc/userman/limitations.tex +++ b/doc/userman/limitations.tex @@ -2,13 +2,13 @@ The development of the \fclang plug-in is still ongoing. \fclang does not implement all of current C++ nor all of -\acslpp as defined in its language definition\cite{acslpp}. +\acslpp as defined in its language definition~\cite{acslpp}. The most important such limitations are listed in this section. -\section{Implementation of C++} - \textit{These lists are not (nearly) complete} +\section{Implementation of C++} + The following C++ features are not implemented in \acslpp. \begin{itemize} \item preprocessing is restricted within \acslpp annotations (cf. \S\ref{sec:preprocessing}) @@ -20,11 +20,38 @@ These \acslpp features are not yet implemented \begin{itemize} \item \fclang cannot process annotations that are separate from the source file +\item \acslpp specifications for standard \cpp library functions are still quite limited \item \acslpp definitions within template declarations -\item \acslpp specification for standard \cpp library functions are still quite limited \item ghost declarations \item model declarations \item set comprehensions +\item using (namespace) declarations (parsed but has no effect) +\item pure functions (parsed but incompletely implemented) +\item throws clause (parsed but not implemented in Frama-C) +\item interaction of throws and noexcept +\item parallel \textbackslash let \item frama-clang info/warn/error messages are not yet properly categorized and integrated with -fclang-log, -fclang-msg-key, fclang-warn-key. In particular, clang messages are completely independent +\item \textbackslash count and \textbackslash data are parsed but not yet implemented in \framac +\item semantics of multiple inheritance for annotations +\item formal parameters that are references have pre and post states + +\end{itemize} + +\section{\acslpp implementation} + +\begin{itemize} +\item parsing routines need work to improve robustness, to improve accuracy of locations, and to guard against leaking memory when parses fail +\item the term/predicate parsing methods should be refactored to avoid deep call stacks \end{itemize} + +%% Types for ternary, +/-, * etc. , unary & and * +%% AST for range +%% location types and AST +%% place of ext quantifiers in parsing -- should be a kind of primary I think? +%% method call disambiguation +%% use of isMethod +%% Refactor backtracking to reuse clang tokens instead of ACSL tokens + + +at 4100 \ No newline at end of file diff --git a/doc/userman/macros.tex b/doc/userman/macros.tex index 9de4399cbebcd8d46269ce8e3ae0534253f7d392..654de0cebfe611f5be088a7dd086ea71772f19a1 100644 --- a/doc/userman/macros.tex +++ b/doc/userman/macros.tex @@ -53,7 +53,7 @@ \newcommand{\changeinsection}[2]{\textbf{Section \ref{sec:#1}:} #2.} \newcommand{\todo}[1]{{\large \textbf{TODO: #1.}}} -\newcommand{\option}[1]{\textbf{#1}} +\newcommand{\option}[1]{\texttt{#1}} \newcommand{\markdiff}[1]{{\color{blue}{#1}}} \newenvironment{markdiffenv}[1][]{% diff --git a/doc/userman/main.tex b/doc/userman/main.tex index ecd4b8b730984c7200a702fd7b8cfb7be7fa95be..57c5a51dbf32369fc8a94997624c26cb9c082283 100644 --- a/doc/userman/main.tex +++ b/doc/userman/main.tex @@ -100,32 +100,34 @@ These items are not yet addressed \item Add/revise list of contributors/reviewers \item (Introduction) Verify the version of \framac and \clang supported \item Sort out the interactions among the options -verbose -debug -quiet -fclang-debug -fclang-verbose -fclang-msg-key -fclang-warn-key -\item Sort out and describe interactions among -annot -pp-annot and frama-clang options +\item -no-pp-annot is not recognized by \fclang \item Add comments about using system header files -\item Remove \$FRAMAC\_SHARE from the default command-line, and builtin.h from \$FRAMAC\_SHARE \item Comment on the demangling options \item Review the clang and \irg options -\item decribe the framaCIRGen options: -x -b -v --gen-impl-meth +\item describe the framaCIRGen options: -x -b -v --gen-impl-meth \item Comment on handling of unicode in the preprocessor \item Review the preprocessor tokens \item must the argument of -cpp-extra-args always be given with an = and quoted string \item Interactions between \_\_FC\_MACHDEP and -machdep \item Check whether static or dynamic linking is used in building +\item check behavior of -stop-annot-error +\item \lstinline|__FC_MACHDEP| vs. the option +\item fix where version option is described +\item Need to test that the various kinds of static_cast actually work as expected. +\item dynamic casting not yet working +\item rounding mode; additional builtin functions \end{itemize} Bugs \begin{itemize} \item Various parts of the tool use stdout and stderr differently -\item framaCIRGen -help expects an argument \item non-clang options not included in framaCIRGen -help \item The -o option is required -\item What is the -w option \item Fix how to do custom ASTs; The custom build instructions with a prefix to a temp directory fail -The only -\item framaCIRGen -o prints help summary twice \item it appears that trigraph processing cannot be turned off, is on by default for \irg but off by default for clang++; not sure why \item it does not seem that clang options are passed through, e.g. -fdollars-in-identifiers; \irg defaults are different than clang \item preprocessor implementation does not concatenate string literals +\item spurious error when running framaCIRGen -help \end{itemize} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% diff --git a/doc/userman/preprocessing.tex b/doc/userman/preprocessing.tex index d3a53d890df80c2850c1ed0e6f476c1ec4a61f1e..421c930d0cfea7270f230b83f1da65765b7c665f 100644 --- a/doc/userman/preprocessing.tex +++ b/doc/userman/preprocessing.tex @@ -31,23 +31,23 @@ The \fclang implementation operates as follows, on each \acslpp annotation comme \section{Trigraphs} -Trigraphs are defined for \cpp but will currently be removed in C++17. Since trigraph processing by clang occurs before any recognition of comments, trigraphs in \acslpp annotations are translated, if enabled in \clang. As they will be removed from \cpp, they are not recommended for use in \acslpp annotations. +Trigraphs are defined for \cpp but will currently be removed in C++17. Since trigraph processing by clang occurs before any recognition of comments, trigraphs in \acslpp annotations are translated, if enabled in \clang. As they will be removed from \cpp, they are not recommended for use in \acslpp annotations. Preprocessing of trigraphs is enabled by default. \section{Digraphs} Digraphs are alternate spellings of preprocessor tokens, in particular, of -punctuation character sequences. Digraphs in \acslpp annotations are translated just as they are in \cpp (by clang). +punctuation character sequences. Digraphs in \acslpp annotations are translated just as they are in \cpp (by \clang). \section{Preprocessor tokens} Preprocessor tokens (per CPP) belong to one of five categories. White space (space, tab) serves only to separate tokens; it is not needed between tokens whose concatenation is not a single token. Line terminators also separate tokens and also delineate certain features: preprocessing directives and string literals do not extend over more than one (logical) line. \begin{itemize} \item identifiers: character sequences that match the regular expression \\ \centerline{\texttt{[a-zA-Z\_][a-zA-Z\_0-9]*}} \\ -Dollar signs are also allowed as non-digit identifier characters if the clang option\\ \texttt{-fdollars-in-identifiers} is enabled, which it is by default. -Enable with \texttt{-fdollars-in-identifiers} ; disable with \texttt{-fno-dollars-in-identifiers} . +Dollar signs are also allowed as non-digit identifier characters if the clang option\\ \texttt{-fdollars-in-identifiers} is enabled, which it is by default. \\ +Enable with \texttt{-fdollars-in-identifiers} ; \\disable with \texttt{-fno-dollars-in-identifiers} . \item number: character sequences that match the regular expression \\ \centerline{\texttt{[.]?[0-9]([0-9a-zA-Z.]|[eEpP][+-]))*}} \\ - \item string literals: character sequence enclosed in " " or ' ' or < >, with \textbackslash " for " in a double-quoted literal (that is not a header file name) and \textbackslash ' for ' in a single-quoted literal, and \textbackslash\textbackslash for \textbackslash in either. After all other preprocessor phases are complete, the preprocessor concatenates adjacent string literals. + \item string literals: character sequence enclosed in " " or ' ' or < >, with \lstinline|\"| for \lstinline|"| in a double-quoted literal (that is not a header file name) and \lstinline|\'| for \lstinline|'| in a single-quoted literal, and \lstinline|\\| for \lstinline|\| in either. After all other preprocessor phases are complete, the preprocessor concatenates adjacent string literals. \item punctuator: all single non-alphanumeric printable characters except \verb|@|, \verb|#| and \verb|`|, and all multi-punctuation sequences that meaningful to \C/\cpp (e.g. > and >{>}= ), but not an arbitrary multi-punctuation character sequence. \item other tokens: \verb|@|, \verb|#|, \verb|`| and all non-ASCII single characters. @@ -114,8 +114,9 @@ Accordingly, \acslpp imposes constraints on the directives that may be present w \item the standard predefined macro names are permitted: \texttt{\_\_cpluscplus} (in \cpp compilers), \texttt{\_\_DATE\_\_}, - \texttt{\_\_LINE\_\_}, \texttt{\_\_TIME\_\_}, + \texttt{\_\_FILE\_\_}, + \texttt{\_\_LINE\_\_}, \texttt{\_\_STDC\_HOSTED\_\_} \end{itemize}