Commit 2089a6d2 authored by Michele Alberti

Merge branch 'doc/julien/acas-xu' into 'master'

[DOC] CAISAR by Examples

Closes #12

See merge request laiser/caisar!53
parents 032e30d8 5684a139
Showing with 796 additions and 54 deletions
......@@ -19,7 +19,7 @@ v2.1 or higher, which is typically avaible in all major GNU/Linux distributions.
To install CAISAR via [opam](https://opam.ocaml.org/), do the following:
```
$ opam install caisar
opam install caisar
```
### Docker image
......@@ -28,20 +28,20 @@ A ready-to-use [Docker](https://www.docker.com/) image of CAISAR is available on
[Docker Hub](https://hub.docker.com). To retrieve such an image, do the
following:
```
$ docker pull laiser/caisar:pub
docker pull laiser/caisar:pub
```
Alternatively, a [Docker](https://www.docker.com/) image for CAISAR can be
created locally by proceeding as follows:
```
$ git clone https://git.frama-c.com/pub/caisar
$ cd caisar
$ make docker
git clone https://git.frama-c.com/pub/caisar
cd caisar
make docker
```
To run the CAISAR [Docker](https://www.docker.com/) image, do the following:
```
$ docker run -it laiser/caisar:pub sh
docker run -it laiser/caisar:pub sh
```
### From source code
......@@ -52,24 +52,24 @@ v2.1 or higher, which is typically avaible in all major GNU/Linux distributions.
To build and install CAISAR, do the following:
```
$ git clone https://git.frama-c.com/pub/caisar
$ cd caisar
$ opam switch create --yes --no-install . 4.13.1
$ opam install . --deps-only --with-test --yes
$ make
$ make install
git clone https://git.frama-c.com/pub/caisar
cd caisar
opam switch create --yes --no-install . 4.13.1
opam install . --deps-only --with-test --yes
make
make install
```
To run the tests:
```
$ make test
make test
```
## Usage
To start using CAISAR, please run the command:
```
$ caisar --help
caisar --help
```
### Property verification
......@@ -79,7 +79,7 @@ machines (SVM).
The prototype command is:
```
$ caisar verify --prover=PROVER FILE
caisar verify --prover=PROVER FILE
```
`FILE` defines the property to verify, and it must be
......@@ -97,7 +97,7 @@ path to the prover executables should appear in the environment variable
Run the following command to confirm that CAISAR detects the installed provers:
```
$ caisar config --detect
caisar config --detect
```
The following are the provers for which support is provided in CAISAR:
......
......@@ -74,6 +74,17 @@ command = "%e -mp %{nnet-onnx} -pp %f --timeout %t --domain zono poly --split"
driver = "caisar_drivers/pyrat_vnnlib.drv"
use_at_auto_level = 1
[ATP pyrat-acas]
name = "PyRAT"
alternative = "ACAS"
exec = "pyrat.py"
version_switch = "--version"
version_regexp = "PyRAT \\([0-9.]+\\)"
version_ok = "1.1"
command = "%e -mp %{nnet-onnx} -pp %f --timeout %t --domain zono --split --scorer coef"
driver = "caisar_drivers/pyrat.drv"
use_at_auto_level = 1
[ATP nnenum]
name = "nnenum"
exec = "nnenum.sh"
......
doc/_static/media/acas_xu.png (17.6 KiB)
.. _acas_xu:
Functional properties of ACAS-Xu
********************************
ACAS-Xu stands for Aircraft Collision Avoidance System. Introduced for instance
in [Manfredi2016]_, it is the specification of a program that aims to output
advisory signals for an aircraft in a situation where there is a potential for
collision. In the rest of this tutorial, we will use the flavour of ACAS-Xu
defined in [Katz2017]_, where the authors aim to verify a neural network
implementing part of the ACAS-Xu specification. Its low dimensionality and
well-defined semantics make it a *de facto* benchmark for machine learning
verification.
Use case presentation
=====================
The system considers a 2D plane with two entities: the monitored airplane (the
"ownship") and an incoming airplane (the "intruder").
.. image:: _static/media/acas_xu.png
:alt: A picture with two aircraft seen from above
In the original implementation, the system state has seven inputs:
* :math:`v_{own}`: speed of ownship
* :math:`v_{int}`: speed of intruder
* :math:`\rho`: distance from ownship to intruder
* :math:`\theta`: angle to intruder relative to ownship heading direction
* :math:`\psi`: heading angle of intruder relative to ownship heading direction
* :math:`\tau`: time until vertical separation
* :math:`a_{prev}`: previous advisory
It has five outputs, which correspond to the different direction advisories the
system can give:
* :math:`COC`: Clear Of Conflict
* :math:`WL`: Weak Left
* :math:`SL`: Strong Left
* :math:`WR`: Weak Right
* :math:`SR`: Strong Right
In the paper, the authors consider :math:`45` neural networks, one for each of
several values of :math:`\tau` and :math:`a_{prev}`, that operate on the five
remaining inputs only while maintaining the same number of outputs. We will
consider such five-input networks in the remainder of this tutorial.
Properties
----------
There are several functional properties one may want to verify on this system,
for instance:
* Guarantee that the system will never output a COC advisory when the intruder is
nearby,
* Guarantee that the system will never output an advisory that may result in a
collision,
* Guarantee that the system will not output a strong advisory where a weak
variant would be enough.
The authors of [Katz2017]_ propose ten properties to verify. We will reproduce the
first and third properties here, and then show how to use CAISAR for verifying
whether a given neural network respects them.
**Property** :math:`\phi_1`
* **Definition.**
If the intruder is distant and is significantly slower than
the ownship, the score of a COC advisory will always be below a certain fixed
threshold.
* **Input constraints:**
* :math:`\rho \geq 55947.691`,
* :math:`v_{own} \geq 1145`,
* :math:`v_{int} \leq 60`.
* **Desired output property:**
* :math:`COC \leq 1500`.
**Property** :math:`\phi_3`
* **Definition.**
If the intruder is directly ahead and is moving towards the
ownship, the score for COC will not be minimal.
* **Input constraints:**
* :math:`1500 \leq \rho \leq 1800`,
* :math:`-0.06 \leq \theta \leq 0.06`,
* :math:`\psi \geq 3.10`,
* :math:`v_{own} \geq 980`,
* :math:`v_{int} \geq 960`.
* **Desired output property:**
* :math:`COC` is not the minimal score.
Modelling the problem using WhyML
---------------------------------
The first step for verifying anything with CAISAR is to write a specification
file that describes the problem to verify as a so-called *theory*. A theory can
be seen as a namespace inside which are defined logical terms, formulas and
verification goals. In particular, being based on the `Why3
<https://why3.lri.fr>`_ platform for deductive program verification, CAISAR
supports the Why3 specification language
`WhyML <https://why3.lri.fr/doc/syntaxref.html>`_, and inherits the Why3 standard
library of logical theories (integer, float and real arithmetic, *etc.*) and
basic programming data structures (arrays, queues, hash tables, *etc.*).
Let us try to model the property :math:`\phi_1` defined earlier. We will call
our theory ``ACASXU_P1``.
.. code-block:: whyml
theory ACASXU_P1
end
We will need to write down some numerical values. As of now, CAISAR allows
writing values using floating-point arithmetic only. Why3 defines a float type
and the relevant arithmetic operations according to the IEEE floating-point
standard in a theory, astutely called ``ieee_float``. Specifically, we will
import the ``Float64`` subtheory, which defines everything we need for 64-bit
precision floating-point numbers. We import it into our theory using the
``use`` keyword.
Our file looks like this so far:
.. code-block:: whyml
theory ACASXU_P1
use ieee_float.Float64
end
We would like to verify our property given a certain neural network. To do this,
CAISAR extends the Why3 standard library for recognizing neural networks in ONNX
and NNet formats. Given a file in one of these formats, CAISAR internally builds a
theory named after the original neural network file, which contains the subtheories
``AsTuple`` and ``AsArray`` that provide logical symbols for describing the
input-output interface of a neural network as tuples and arrays, respectively. We
will only consider the ``AsTuple`` subtheory for this tutorial.
In particular, the theory built by CAISAR is equivalent to the following WhyML
file:
.. code-block:: whyml
theory NeuralNetworkFilename
theory AsTuple
type t
(* Tuple with as many elements as there are inputs *)
function nn_apply (t,_,...)
(* Tuple with as many elements as there are outputs *)
: (t,_,...)
end
(* Other stuff *)
end
Note how the ``AsTuple`` theory defines the ``nn_apply`` function symbol that
logically describes the input-output interface of a neural network using tuples.
More importantly, CAISAR defines this function to take as input a tuple with as
many elements as the inputs expected by the original neural network, and to return
a tuple with as many elements as the outputs provided by the original neural
network.
As our neural network takes five inputs and provides five outputs, adding ``use
filename.AsTuple`` to our theory will provide a ``nn_apply`` function symbol
that takes a 5-element tuple as input, and provides a 5-element tuple as
output. Assuming we have a neural network named ``ACASXU_1_1.onnx``, the WhyML
file looks like this:
.. code-block:: whyml
theory ACASXU_P1
use ACASXU_1_1.AsTuple
use ieee_float.Float64
end
Now is the time to define our verification goal, which we will call ``P1_1_1`` for
property :math:`\phi_1` on the neural network :math:`N_{1,1}`.
We first model the inputs of the neural network :math:`\rho, \theta, \psi,
v_{own}, v_{int}` respectively as the floating-point constants :math:`x_i` for
:math:`i \in [0..4]`. Moreover, we constrain these to the range of
floating-point values each may take. According to the original authors, values
were normalized during the training of the network, and so we adapt the values
they provide in their `repository
<https://github.com/NeuralNetworkVerification/Marabou/tree/master/resources/properties>`_.
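As an illustration of how these values are obtained (the normalization constants
below are an assumption recalled from the original ACAS-Xu release, not something
CAISAR requires), a mean of :math:`19791.091` and a range of :math:`60261.0` for
:math:`\rho`, a mean of :math:`650` and a range of :math:`1100` for
:math:`v_{own}`, and a mean of :math:`600` and a range of :math:`1200` for
:math:`v_{int}` would yield the bounds used in the goal below:

.. math::

   x_0 \geq \frac{55947.691 - 19791.091}{60261.0} \approx 0.6, \quad
   x_3 \geq \frac{1145 - 650}{1100} = 0.45, \quad
   x_4 \leq \frac{60 - 600}{1200} = -0.45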
Then, we define the result of the application of ``nn_apply`` on the inputs by
taking advantage of WhyML pattern matching, and define the output constraint
we want to enforce on the floating-point constant :math:`y_0` that we use to
model the advisory *COC*.
The final WhyML file looks like this:
.. code-block:: whyml
theory ACASXU_P1
use ACASXU_1_1.AsTuple
use ieee_float.Float64
goal P1_1_1: forall x0 x1 x2 x3 x4.
(0.5999999999999999777955395074968691915273666381835937500000000000:t) .<= x0 .<= (0.6798577687000000313588543576770462095737457275390625000000000000:t) ->
(-0.5:t) .<= x1 .<= (0.5:t) ->
(-0.5:t) .<= x2 .<= (0.5:t) ->
(0.4500000000000000111022302462515654042363166809082031250000000000:t) .<= x3 .<= (0.5:t) ->
(-0.5:t) .<= x4 .<= (-0.4500000000000000111022302462515654042363166809082031250000000000:t) ->
let (y0, _, _, _, _) = nn_apply x0 x1 x2 x3 x4 in
y0 .<= (3.9911256458999999630066213285317644476890563964843750000000000000:t)
end
This file is available, as is, under the ``/examples/acasxu/`` folder as
`property_1.why
<https://git.frama-c.com/pub/caisar/-/blob/master/examples/acasxu/property_1.why>`_.
The corresponding neural network in ONNX format is available under the
``/examples/acasxu/nets/onnx/`` folder as `ACASXU_1_1.onnx
<https://git.frama-c.com/pub/caisar/-/blob/master/examples/acasxu/nets/onnx/ACASXU_1_1.onnx>`_.
Verifying the property with CAISAR
----------------------------------
Once formalized, the specified property can be assessed by using CAISAR. We will
use the *open-source* provers CAISAR supports for verifying properties of neural
networks, so as to take advantage of its federating approach: whenever one prover
cannot provide an answer, another one may. In particular, we will use
`Marabou <https://github.com/NeuralNetworkVerification/Marabou>`_ and `nnenum
<https://github.com/stanleybak/nnenum>`_.
Assuming the prover locations are available in ``PATH``, the following are the
CAISAR verification invocations using Marabou first and nnenum afterwards, for
verifying the ACAS-Xu property :math:`\phi_1`:
.. code-block:: console
$ caisar verify --prover Marabou -L examples/acasxu/nets/onnx --format whyml examples/acasxu/property_1.why -t 10m
[caisar] Goal P1_1_1: Timeout
.. code-block:: console
$ caisar verify --prover nnenum -L examples/acasxu/nets/onnx --format whyml examples/acasxu/property_1.why -t 10m
[caisar] Goal P1_1_1: Valid
Note that the previous commands set the verification time limit to 10 minutes
(*cf.* ``-t`` option), and the additional location ``examples/acasxu/nets/onnx``
(*cf.* ``-L`` option) for letting CAISAR correctly locate the neural network
file ``ACASXU_1_1.onnx`` that is used by the ``ACASXU_P1`` theory in
``property_1.why``.
Marabou is not able to prove the property valid within the specified time limit,
while nnenum is. In general, the result of a CAISAR verification is
either ``Valid``, ``Invalid``, ``Unknown`` or ``Timeout``. CAISAR may output
``Failure`` whenever the verification process fails for whatever reason
(typically, a prover internal failure).
Using more advanced WhyML constructs
------------------------------------
Let us model the ACAS-Xu property :math:`\phi_3`, and verify it for the neural
networks :math:`N_{1,1}` and :math:`N_{2,7}` [Katz2017]_.
From the modelling standpoint, the most evident difference concerns the desired
output property, meaning that *COC* should not be the minimal value. A
straightforward way to express this property is that the corresponding
floating-point constant :math:`y_0` is greater than or equal to at least one of
the other four outputs. This can be formalized in first-order logic as a
disjunction of clauses, which can be directly encoded into WhyML as follows:
.. code-block:: whyml
y0 .>= y1 \/ y0 .>= y2 \/ y0 .>= y3 \/ y0 .>= y4
The delicate point is how to model the same property for two different neural
networks. Of course, we could define a theory with two identical but distinct
verification goals or two entirely distinct theories in the same WhyML file.
However, these two solutions are not advisable in terms of clarity and
maintainability.
Fortunately, WhyML provides all the necessary features to come up with a
better solution. First, WhyML allows for naming used (sub)theories in order to
distinguish identical logic symbols coming from different theories. This is
critical for identifying the correct ``nn_apply`` symbols in the two
verification goals we will define. Second, WhyML allows the hypotheses on
the floating-point constants modelling the neural network inputs to be exported
from the verification goal into the general context of the theory as axioms.
In the end, the WhyML file looks like this:
.. code-block:: whyml
theory ACASXU_P3
use ACASXU_1_1.AsTuple as N11
use ACASXU_2_7.AsTuple as N27
use ieee_float.Float64
constant x0:t
constant x1:t
constant x2:t
constant x3:t
constant x4:t
axiom H0:
(-0.3035311560999999769272506000561406835913658142089843750000000000:t) .<= x0 .<= (-0.2985528118999999924731980627257144078612327575683593750000000000:t)
axiom H1:
(-0.0095492965999999998572000947660853853449225425720214843750000000:t) .<= x1 .<= (0.0095492965999999998572000947660853853449225425720214843750000000:t)
axiom H2:
(0.4933803236000000036476365039561642333865165710449218750000000000:t) .<= x2 .<= (0.5:t)
axiom H3:
(0.2999999999999999888977697537484345957636833190917968750000000000:t) .<= x3 .<= (0.5:t)
axiom H4:
(0.2999999999999999888977697537484345957636833190917968750000000000:t) .<= x4 .<= (0.5:t)
goal P3_1_1:
let (y0, y1, y2, y3, y4) = N11.nn_apply x0 x1 x2 x3 x4 in
y0 .>= y1 \/ y0 .>= y2 \/ y0 .>= y3 \/ y0 .>= y4
goal P3_2_7:
let (y0, y1, y2, y3, y4) = N27.nn_apply x0 x1 x2 x3 x4 in
y0 .>= y1 \/ y0 .>= y2 \/ y0 .>= y3 \/ y0 .>= y4
end
Note how the two verification goals ``P3_1_1`` and ``P3_2_7`` are almost
identical, except for the ``nn_apply`` logic symbol used, which identifies
respectively the ``ACASXU_1_1.onnx`` and ``ACASXU_2_7.onnx`` neural networks.
We can then verify the resulting verification problem as before:
.. code-block:: console
$ caisar verify --prover Marabou -L examples/acasxu/nets/onnx --format whyml examples/acasxu/property_3.why -t 10m
[caisar] Goal P3_1_1: Timeout
[caisar] Goal P3_2_7: Valid
.. code-block:: console
$ caisar verify --prover nnenum -L examples/acasxu/nets/onnx --format whyml examples/acasxu/property_3.why -t 10m
[caisar] Goal P3_1_1: Valid
[caisar] Goal P3_2_7: Valid
.. [Manfredi2016] G. Manfredi and Y. Jestin, *An introduction to ACAS Xu and the
challenges ahead*, 2016 IEEE/AIAA 35th Digital Avionics Systems Conference
(DASC), 2016, pp. 1-9, doi: 10.1109/DASC.2016.7778055
.. [Katz2017] Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.
(2017). *Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks.*
CAV 2017, doi: 10.1007/978-3-319-63387-9_5
......@@ -14,10 +14,13 @@
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import os
import sys
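# Make the project-local Sphinx extensions under ./ext (the 'why3' domain and
# the WhyML lexer) importable.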
sys.path.append(os.path.abspath('./ext'))
# -- Project information -----------------------------------------------------
project = 'Caisar'
project = 'CAISAR'
copyright = '2022, Michele Alberti, Julien Girard, François Bobot'
author = 'Michele Alberti, Julien Girard, François Bobot'
......@@ -28,6 +31,10 @@ author = 'Michele Alberti, Julien Girard, François Bobot'
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.mathjax',
'sphinx.ext.graphviz',
# 'sphinxcontrib.bibtex',
'why3'
]
# Add any paths that contain templates here, relative to this directory.
......@@ -49,4 +56,4 @@ html_theme = 'alabaster'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
\ No newline at end of file
html_static_path = ['_static']
Table of contents
==================
.. toctree::
:maxdepth: 2
:caption: Contents:
installation
usage
modules
.. _examples:
CAISAR by Examples
==================
This page gathers some example use cases for CAISAR. Each example describes
the use case, its formalization using the WhyML specification language, and the
corresponding CAISAR execution.
.. toctree::
:maxdepth: 2
:caption: CAISAR by Examples:
acas_xu
mnist
from pygments.lexer import RegexLexer, words
from pygments.token import Text, Comment, Operator, Keyword, Name, String, Number, Punctuation, Error
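# Pygments lexer for the WhyML specification language; it is registered with
# Sphinx below so that 'code-block:: whyml' snippets get highlighted.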
class WhyMLLexer(RegexLexer):
name = 'WhyML'
aliases = 'whyml'
keywords = (
'abstract', 'absurd', 'alias', 'any', 'as', 'assert', 'assume', 'at', 'axiom',
'begin', 'break', 'by', 'check', 'clone', 'coinductive', 'constant', 'continue',
'diverges', 'do', 'done', 'downto',
'else', 'end', 'ensures', 'epsilon', 'exception', 'exists', 'export',
'false', 'float', 'for', 'forall', 'fun', 'function', 'ghost', 'goal',
'if', 'import', 'in', 'inductive', 'invariant', 'label', 'lemma', 'let',
'match', 'meta', 'module', 'mutable', 'not', 'old',
'partial', 'predicate', 'private', 'pure',
'raise', 'raises', 'range', 'reads', 'rec', 'ref', 'requires', 'return', 'returns',
'scope', 'so', 'then', 'theory', 'to', 'true', 'try', 'type', 'use', 'val', 'variant',
'while', 'with', 'writes',
)
tokens = {
'root': [
(r'\s+', Text),
(r'\(\*\)', Operator),
(r'\(\*', Comment, 'comment'),
(r'\[@[^]]*\]', Comment),
(words(keywords, suffix=r'\b'), Keyword),
(r'[-~!%^&*+=|?<>/\\]', Operator),
(r'[][{};:.,()]', Punctuation),
(r"[^\W\d][\w']*", Name),
(r'\bresult\b', Name.Builtin.Pseudo),
(r'-?\d\d*([.]\d*)?([eE][+-]?\d\d*)', Number.Float),
(r'-?0[xX][\da-fA-F][\da-fA-F]*([.][\da-fA-F]*)?([pP][+-]?\d\d*)', Number.Float),
(r'0[xX][\da-fA-F][\da-fA-F_]*', Number.Hex),
(r'0[oO][0-7][0-7_]*', Number.Oct),
(r'0[bB][01][01_]*', Number.Bin),
(r'\d[\d_]*', Number.Integer),
(r"'", Keyword),
(r'"', String.Double, 'string'),
],
'comment': [
(r'[^(*)]+', Comment),
(r'\(\*', Comment, '#push'),
(r'\*\)', Comment, '#pop'),
(r'[(*)]', Comment),
],
'string': [
(r'[^\\"]+', String.Double),
(r'\\[\\"\'ntbr]', String.Escape),
(r'\\[0-9]{3}', String.Escape),
(r'\\x[0-9a-fA-F]{2}', String.Escape),
(r'\\\n', String.Double),
(r'"', String.Double, '#pop'),
],
}
from sphinx.highlighting import lexers
lexers['whyml'] = WhyMLLexer(startinline=True)
import re
from docutils import nodes
from sphinx import addnodes
from sphinx.directives import ObjectDescription
from sphinx.domains import Domain
from sphinx.roles import XRefRole
from sphinx.util.docutils import SphinxDirective
from sphinx.util.nodes import make_refnode
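# Sphinx 'why3' domain: cross-referencing roles and directives for Why3 tools,
# attributes, debug flags, metas, and transformations.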
ws_re = re.compile(r'\s+')
class Why3ToolRole(XRefRole):
def process_link(self, env, refnode, has_explicit_title, title, target):
target = ws_re.sub(' ', target)
if target.startswith('why3 '):
target = target[5:]
return title, target
attr_re = re.compile(r'\[@(.*)\]')
class Why3AttributeRole(XRefRole):
def process_link(self, env, refnode, has_explicit_title, title, target):
target = ws_re.sub(' ', target)
m = attr_re.match(target)
if m:
target = m.group(1)
return title, target
class Why3ToolDirective(SphinxDirective):
has_content = False
required_arguments = 1
optional_arguments = 0
final_argument_whitespace = True
option_spec = {}
def run(self):
fullname = ws_re.sub(' ', self.arguments[0].strip())
targetname = '%s-%s' % (self.name, fullname)
node = nodes.target('', '', ids = [targetname])
self.state.document.note_explicit_target(node)
indexentry = '%s; command' % (fullname,)
inode = addnodes.index(entries = [('pair', indexentry, targetname, '', None)])
domain = self.env.get_domain('why3')
domain.add_object('tool', fullname, targetname)
return [inode, node]
class Why3Thing(ObjectDescription):
has_content = True
required_arguments = 1
optional_arguments = 0
final_argument_whitespace = True
option_spec = {}
thing_index = ''
thing_kind = ''
def handle_signature(self, sig, signode):
signode += addnodes.desc_name(text = sig)
return sig
def add_target_and_index(self, name, sig, signode):
targetname = '%s-%s' % (self.name, name)
signode['ids'].append(targetname)
indexentry = '%s; %s' % (name, self.thing_index)
self.indexnode['entries'].append(('pair', indexentry, targetname, '', None))
domain = self.env.get_domain('why3')
domain.add_object(self.thing_kind, name, targetname)
class Why3Attribute(Why3Thing):
thing_index = 'attribute'
thing_kind = 'attribute'
class Why3Debug(Why3Thing):
thing_index = 'debug flag'
thing_kind = 'debug'
class Why3Meta(Why3Thing):
thing_index = 'meta'
thing_kind = 'meta'
class Why3Transform(Why3Thing):
thing_index = 'transformation'
thing_kind = 'transform'
class Why3Domain(Domain):
name = 'why3'
label = 'Why3'
roles = {
'attribute': Why3AttributeRole(),
'debug': XRefRole(),
'meta': XRefRole(),
'tool': Why3ToolRole(),
'transform': XRefRole(),
}
directives = {
'attribute': Why3Attribute,
'debug': Why3Debug,
'meta': Why3Meta,
'tool': Why3ToolDirective,
'transform': Why3Transform,
}
indices = {
}
initial_data = {}
initial_data['objects'] = {
'attribute': {},
'debug': {},
'meta': {},
'tool': {},
'transform': {},
}
def get_objects(self):
for role, objects in self.data['objects'].items():
prio = 0 # self.object_types[role].attrs['searchprio']
for name, (docname, targetname) in objects.items():
yield (name, name, role, docname, targetname, prio)
def resolve_xref(self, env, fromdocname, builder, role, targetname, node, contnode):
resolved = self.data['objects'][role].get(targetname)
if resolved:
(todocname, targetname) = resolved
return make_refnode(builder, fromdocname, todocname, targetname, contnode, targetname)
return None
def add_object(self, role, name, targetname):
self.data['objects'][role][name] = (self.env.docname, targetname)
def setup(app):
app.add_domain(Why3Domain)
return {
'version': '0.1',
'parallel_read_safe': True,
'parallel_write_safe': True,
}
......@@ -21,16 +21,15 @@ Content
=======
.. toctree::
:maxdepth: 2
:caption: Contents:
:maxdepth: 3
:caption: Contents
installation
usage
modules
examples
Indices and tables
==================
* :ref:`search`
..
Indices and tables
==================
* :ref:`search`
......@@ -18,7 +18,9 @@ GNU/Linux distributions.
To install CAISAR via opam, do the following:
``$ opam install caisar``
.. code-block:: console
$ opam install caisar
Install through Docker
----------------------
......@@ -28,33 +30,41 @@ A ready-to-use Docker image of CAISAR is available on
`Docker Hub <https://hub.docker.com>`_. To retrieve it, do the
following:
``$ docker pull laiser/caisar:pub``
.. code-block:: console
$ docker pull laiser/caisar:pub
Alternatively, a Docker image for CAISAR can be
created locally by proceeding as follows:
::
.. code-block:: console
$ git clone https://git.frama-c.com/pub/caisar
$ cd caisar
$ make docker
$ git clone https://git.frama-c.com/pub/caisar
$ cd caisar
$ make docker
To run the CAISAR Docker image, do the following:
``$ docker run -it laiser/caisar:pub sh``
.. code-block:: console
$ docker run -it laiser/caisar:pub sh
Compile from source
-------------------
To build and install CAISAR, do the following:
::
.. code-block:: console
$ git clone https://git.frama-c.com/pub/caisar
$ cd caisar
$ opam switch create --yes --no-install . 4.13.1
$ opam install . --deps-only --with-test --yes
$ make
$ make install
$ git clone https://git.frama-c.com/pub/caisar
$ cd caisar
$ opam switch create --yes --no-install . 4.13.1
$ opam install . --deps-only --with-test --yes
$ make
$ make install
To run the tests:
``$ make test``
.. code-block:: console
$ make test
.. _mnist:
(Local) Robustness of MNIST dataset
***********************************
CAISAR provides a convenient way for verifying (local) robustness properties of
neural networks on datasets, for classification problems only, in a specific
``CSV`` format. In particular, each of the ``CSV`` lines is interpreted as
providing the classification label in the first column, and the dataset element
features in the other columns.
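For illustration only, a hypothetical (and heavily truncated) excerpt of such a
file could look as follows, with the label in the first column followed by the
pixel values of the corresponding image:

.. code-block:: text

   7,0,0,0,84,185,159,151,...,0
   2,0,0,0,0,0,116,125,171,...,0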
We recall that a neural network is deemed robust on a dataset element whenever
it classifies with the same label all other elements that are at an
:math:`l_\infty`-distance of at most :math:`\epsilon \geq 0` from it. More
generally, a neural network is deemed (locally) robust on a dataset whenever the
former property holds on all the dataset elements. The CAISAR standard
library specifies such a property in terms of the predicate ``robust``, which
CAISAR implements as a builtin.
In the following, we will describe how to use CAISAR for verifying that a neural
network is robust on (a fragment of) the MNIST dataset.
Use case presentation
=====================
MNIST is a dataset of handwritten digits normalized and centered to fit into
grayscale images of :math:`28 \times 28` pixels, along with the classification
labels [LiDeng2012]_. Although it is mostly irrelevant as a dataset for
benchmarking machine learning models on computer vision tasks, MNIST is still
valuable for assessing robustness properties by means of formal methods tools.
CAISAR provides in `mnist_test.csv
<https://git.frama-c.com/pub/caisar/-/blob/master/examples/mnist/csv/mnist_test.csv>`_
a fragment (:math:`100` images) of the MNIST dataset under the
``examples/mnist/csv`` folder. Each line in this file represents an MNIST image:
in particular, the first column represents the classification label, and the
remaining :math:`784` columns represent the greyscale value of the respective
pixels.
Properties
----------
Generally speaking, the property we are interested in verifying is the local
robustness of a machine learning model on the elements of a set. That is, the
model classifies with the same label all elements of a set that are at an
:math:`l_\infty`-distance of at most :math:`\epsilon \geq 0` from each other. A
general formulation of the latter states that, given a classifier :math:`C`, a set :math:`X`, and some
perturbation :math:`\epsilon \geq 0`, it must hold that :math:`\forall x,x'
\in X. \ \lVert x - x' \rVert_\infty \leq \epsilon \Rightarrow C(x) = C(x')`.
Since we actually deal with a finite *dataset* of elements for which we also
know the expected labels, we will instead verify a slightly different property:
given a classifier :math:`C`, an element :math:`x \in X` such that :math:`C(x) =
y`, and some perturbation :math:`\epsilon \geq 0`, it must hold that
:math:`\forall x'. \ \lVert x - x' \rVert_\infty \leq \epsilon \Rightarrow C(x)
= y = C(x')`. Obviously, such a property must be verified for all elements of a
dataset.
Modelling the problem using WhyML
---------------------------------
As described for the example on :ref:`acas_xu`, we first need to write a
specification file containing a WhyML theory to describe the verification
problem. In principle, we need to formalize the local robustness property as
well as the notions of classifier and dataset.
The CAISAR standard library `caisar.mlw
<https://git.frama-c.com/pub/caisar/-/blob/master/stdlib/caisar.mlw>`_ provides
some utilities for dealing with verification problems about classification
datasets. Of particular interest for us is the ``robust`` predicate, defined in
the theory ``DataSetProps`` as follows:
.. code-block:: whyml
predicate robust (m: model) (d: dataset) (eps: t) =
forall i: int. 0 <= i < d.data.length -> robust_at m d.data[i] eps
Note that the predicate is defined over a ``model``, a ``dataset`` and a
floating-point value ``eps``. The latter determines the perturbation
:math:`\epsilon`. The other two are custom WhyML types that respectively
formalize the notions of classifier and dataset in CAISAR. These types are
respectively defined in the ``Model`` and ``DataSetClassification`` theories.
Moreover, the ``robust`` predicate is itself defined in terms of the predicate
``robust_at``, which formalizes the local robustness property on a single dataset element:
.. code-block:: whyml
predicate robust_at (m: model) (d: datum) (eps: t) =
forall x': features.
let (x, _) = d in
linfty_distance x x' eps ->
predict m x = predict m x'
Note that a ``datum`` is a dataset element given as a pair of *features* and
(classification) *label*. Moreover, ``linfty_distance`` is a predicate that
describes when two arrays of floating-point values (*i.e.* ``features``) are
considered close up to a perturbation ``eps``, while ``predict`` is a function
that formalizes the execution of a model on some features to obtain the
corresponding classification label.
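To give a rough intuition (the following is only an illustrative sketch, not
necessarily the exact definition found in ``caisar.mlw``), ``linfty_distance``
can be thought of as requiring every feature of ``x'`` to stay within ``eps`` of
the corresponding feature of ``x``:

.. code-block:: whyml

   (* Illustrative sketch only: each component of x' lies within eps
      of the corresponding component of x. *)
   predicate linfty_distance (x: features) (x': features) (eps: t) =
     forall i: int. 0 <= i < x.length ->
       x[i] .- eps .<= x'[i] .<= x[i] .+ eps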
In order to use the ``robust`` predicate in our WhyML specification, we need
values of types ``model`` and ``dataset``. For the former, CAISAR makes
available the constant ``model`` upon interpreting the ``AsArray`` subtheory
that is built by the extension of the Why3 standard library for recognizing
neural network ONNX and NNet formats. For the latter, the CAISAR standard
library provides the constant ``dataset`` in ``DataSetClassification``, which will
be interpreted as the actual dataset the user needs to provide via the
command-line interface when launching a CAISAR verification.
Assuming we have a neural network named ``MNIST_256_2.onnx`` for MNIST
classification, the final WhyML file for specifying its local robustness on a
dataset, with each element's features perturbed by :math:`1 \%`, looks like this:
.. code-block:: whyml
theory MNIST
use MNIST_256_2.AsArray
use ieee_float.Float64
use caisar.DataSetClassification
use caisar.DataSetProps
goal robustness:
let normalized_dataset = min_max_scale true (0.0:t) (1.0:t) dataset in
robust model normalized_dataset (0.0100000000000000002081668171172168513294309377670288085937500000:t)
end
Note the presence of the ``min_max_scale`` function defined in
``DataSetClassification`` for normalizing all feature values in :math:`[0,1]`.
Besides classic *Min-Max scaling*, CAISAR also provides the ``z_norm`` function
for *Z-normalization*.
This file is available, as is, under the ``/examples/mnist/`` folder as
`property.why
<https://git.frama-c.com/pub/caisar/-/blob/master/examples/mnist/property.why>`_.
The corresponding neural network in ONNX format is available under the
``/examples/mnist/nets/`` folder as `MNIST_256_2.onnx
<https://git.frama-c.com/pub/caisar/-/blob/master/examples/mnist/nets/MNIST_256_2.onnx>`_.
Verifying the property with CAISAR
----------------------------------
Now we may verify whether the previous robustness specification holds on the
MNIST fragment ``mnist_test.csv`` by means of the nnenum prover. This can be
done via CAISAR as follows:
.. code-block:: console
$ caisar verify --prover nnenum -L examples/mnist/nets --format whyml --dataset=examples/mnist/csv/mnist_test.csv examples/mnist/property.why
[caisar] Goal robustness: Invalid
The result tells us that there exists at least one image in ``mnist_test.csv``
for which nnenum is sure that the model ``MNIST_256_2.onnx`` is not robust with
respect to a :math:`1 \%` perturbation. At the moment, CAISAR is not able to tell
which images in the dataset cause such a result.
.. [LiDeng2012] Li Deng, *The MNIST Database of Handwritten Digit Images for
Machine Learning Research*, IEEE Signal Process. Mag., 2012, pp.
141-142, doi: 10.1109/MSP.2012.2211477
......@@ -7,10 +7,10 @@ Property verification
---------------------
CAISAR can be used to verify properties on neural networks.
The prototype command is ``caisar verify --solver=SOLVER property.mlw``
File property.mlw defines the goal you want to verify,
The prototype command is ``caisar verify --prover=PROVER property.mlw``.
File ``property.mlw`` defines the goal you want to verify,
written in the Whyml language.
Example of .mlw files can be seen in tests folder.
Examples of ``.mlw`` files can be seen in the ``tests`` folder.
External verifiers detection
----------------------------
......
File added (×7)