diff --git a/doc/interpretation.rst b/doc/interpretation.rst
index dd39947ba8b09c03c8c86f0facb69425931aa4ef..ab77147e0c86aeef26a75edf2e6f91fddaf3e65c 100644
--- a/doc/interpretation.rst
+++ b/doc/interpretation.rst
@@ -108,7 +108,7 @@ Types
     (** Type describing a neural network. CAISAR is able to interpret a symbol of type nn
-        to get its calculus graph. **)
+        to get its computation graph. **)
     type nn
     type format = ONNX | NNet
@@ -117,16 +117,11 @@ Functions
 .. code-block:: whyml

-    (** Returns a symbol of type nn from the neural network located at n.
-        TODO: what happens when file located at n is not a neural network?
-        What happens if n is an invalid path (permission, non-existing file)?
-    **)
+    (** Returns a symbol of type nn from the neural network located at n. **)
     function read_neural_network (n: string) (f: format) : nn

-    (** Returns a vector that represents the application of neural network nn
-        on input vector v.
-        TODO: what happens when v does not fit the size of neural network input?
-    **)
+    (** Returns a vector that represents the application of neural network nn
+        on input vector v. **)
     function (@@) (n: nn) (v: vector 'a) : vector 'a
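+
+For instance, a goal may read a network and constrain its outputs on all
+well-sized inputs. A minimal sketch (the file name, input length and output
+bound are hypothetical; the ``Vector`` and ``ieee_float.Float64`` theories
+are assumed to be in scope):
+
+.. code-block:: whyml
+
+    goal example:
+      let n = read_neural_network "net.onnx" ONNX in
+      forall v: vector t. has_length v 5 -> (n @@ v)[0] .> (0.0:t)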

 Dataset
diff --git a/doc/mnist.rst b/doc/mnist.rst
index c80180c615e20f6cf9aeed8fa74a23ae3ac00930..9c5525726c08336b1e97e41f2ec3977cec87b415 100644
--- a/doc/mnist.rst
+++ b/doc/mnist.rst
@@ -33,7 +33,8 @@ valuable for assessing robustness properties by means of formal method tools.
 CAISAR provides in `mnist_test.csv
 <https://git.frama-c.com/pub/caisar/-/blob/master/examples/mnist/csv/mnist_test.csv>`_
 a fragment (:math:`100` images) of the MNIST dataset under the
-``examples/mnist/csv`` folder. Each line in this file represents an MNIST image:
+``examples/mnist/csv`` folder.
+Each line in this file represents an MNIST image:
 in particular, the first column represents the classification label, and the
 remaining :math:`784` columns represent the greyscale value of the respective
 pixels, rescaled into :math:`[0, 1]`.
@@ -53,6 +54,8 @@ given :math:`\epsilon \in \mathbb{R}`, a classifier is
 property is the following: :math:`\forall x,x' \in X. \ \lVert x - x' \rVert_\infty
 \leq \epsilon \Rightarrow C(x) = C(x')`.

+.. _property:
+
 Since we actually deal with a *dataset* of finite elements, we will instead
 verify a slightly different property: given a classifier :math:`C`, an element
 :math:`x \in X`, and some perturbation :math:`\epsilon \geq 0`, it must hold
@@ -67,68 +70,82 @@ As described for the example on :ref:`acas_xu`, we first need to write a
 specification file containing a WhyML theory to describe the verification
 problem. In principle, we need to formalize the local robustness property as
 well as the notions of classifier and dataset.
+The CAISAR interpretation library `interpretation.mlw
+<https://git.frama-c.com/pub/caisar/-/blob/master/stdlib/interpretation.mlw>`_
+provides theories that define these concepts.
+We will import the relevant theories with the ``use`` keyword.
+As described in :ref:`interpretation`, the ``Vector`` theory provides a
+vector type, a getter operation (``[]``) and a ``valid_index`` predicate that
+checks whether an index is within the bounds of a vector.
+The ``NeuralNetwork`` theory defines the ``nn`` type and an application
+function (``@@``), while the ``Dataset`` theory provides the ``dataset`` type
+and the ``read_dataset`` function.
+We will also need integers and floating-point numbers to define
+:math:`\epsilon`.
+
+.. code-block:: whyml
+
+   use ieee_float.Float64
+   use int.Int
+   use interpretation.Vector
+   use interpretation.NeuralNetwork
+   use interpretation.Dataset
+
+   type image = vector t
+   type label_ = int

-The CAISAR standard library `caisar.mlw
-<https://git.frama-c.com/pub/caisar/-/blob/master/stdlib/caisar.mlw>`_ provides
-some utilities for dealing with verification problems about classification
-datasets. Of particular interest for us is the ``robust`` predicate, defined in
-the theory ``DatasetClassificationProps`` as follows:
+We first write some predicates to take into account the facts that only three
+labels appear in the dataset sample (0, 1 and 2), and that the input images
+are normalized (floating-point values between 0. and 1.).
+We also define a predicate ``advises`` that, given a neural network ``n``, an
+image ``i`` and a label ``l``, checks whether the network indeed advises
+label ``l`` for image ``i``.

 .. code-block:: whyml

-   predicate robust (m: model) (d: dataset) (eps: t) =
-     forall i: int. 0 <= i < d.data.length -> robust_at m d.data[i] eps
+   predicate valid_image (i: image) =
+     forall v: index. valid_index i v -> (0.0: t) .<= i[v] .<= (1.0: t)
+
+   predicate valid_label (l: label_) = 0 <= l <= 2
+
+   predicate advises (n: nn) (i: image) (l: label_) =
+     valid_label l ->
+     forall j: int. valid_label j -> j <> l -> (n@@i)[l] .> (n@@i)[j]
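+
+For instance, unfolding ``advises`` for label ``1`` over the three valid
+labels yields the following equivalent formula:
+
+.. code-block:: whyml
+
+   (* advises n i 1 is equivalent to: *)
+   (n@@i)[1] .> (n@@i)[0] /\ (n@@i)[1] .> (n@@i)[2]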

-Note that the predicate is defined over a ``model``, a ``dataset`` and a
-floating-point value ``eps``. The latter determines the perturbation
-:math:`\epsilon`. The other two are custom WhyML types that respectively
-formalize the notions of classifier and dataset in CAISAR. These types are
-both defined in the ``DatasetClassification`` theory.
+We encode the bound :math:`\lVert x - x' \rVert_\infty \leq \epsilon` with
+another predicate, meant to be applied to the difference of the two images:
+
+.. code-block:: whyml
+
+   predicate bounded_by_epsilon (i: image) (eps: t) =
+     forall v: index. valid_index i v -> .- eps .<= i[v] .<= eps
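+
+Applied to the element-wise difference of two images :math:`x` and :math:`x'`
+(computed with the vector subtraction used below), this predicate amounts to
+requiring:
+
+.. math::
+
+   \forall v. \ -\epsilon \leq (x' - x)[v] \leq \epsilon
+   \quad \Longleftrightarrow \quad
+   \lVert x - x' \rVert_\infty \leq \epsilon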

-Moreover, it is defined in terms of the predicate ``robust_at`` that formalizes
-the local robustness property:
+
+We can now define the property to check, which is a straightforward
+description of the property_ stated above:

 .. code-block:: whyml

-   predicate robust_at (m: model) (d: record) (eps: t) =
-     forall x': features.
-       let (x, _) = d in
-       linfty_distance x x' eps ->
-       predict m x = predict m x'
-
-Note that a ``record`` is a dataset element given as a pair of *features* and
-(classification) *label*. Morever, ``linfty_distance`` is a predicate that
-describes how two arrays of floating-point values (*i.e.* ``features``) are
-considered close up-to a pertubation ``eps``, while ``predict`` is a function
-that formalizes the execution of a model on some features to obtain the
-corresponding classification label.
-
-In order to use the ``robust`` predicate in our WhyML specification, we need
-values of types ``model`` and ``dataset``. For the former, CAISAR makes
-available the constant ``model`` upon interpreting the ``AsArray`` sub-theory
-that is built by the extension of the Why3 standard library for recognizing
-neural network ONNX and NNet formats. For the latter, the CAISAR standard
-library provides the constant ``dataset`` in ``DatasetClassification`` that will
-be interpreted as the actual dataset the user needs to provide via the
-command-line interface when launching a CAISAR verification.
-
-Assuming we have a neural network named ``MNIST_256_2.onnx`` for MNIST
-classification, the final WhyML file for specifying its local robustness on a
-dataset, with each element's feature pertubed by :math:`1 \%`, looks like this:
+   predicate robust_around (n: nn) (eps: t) (l: label_) (i: image) =
+     forall perturbed_image: image.
+       has_length perturbed_image (length i) ->
+       valid_image perturbed_image ->
+       let perturbation = perturbed_image - i in
+       bounded_by_epsilon perturbation eps ->
+       advises n perturbed_image l
+
+   predicate robust (n: nn) (d: dataset label_ image) (eps: t) =
+     Dataset.forall_ d (robust_around n eps)
+
+Finally, to instantiate this property on a concrete neural network and data
+samples, we define a goal stating the property to verify:

 .. code-block:: whyml

-   theory MNIST
-     use MNIST_256_2.AsArray
-     use ieee_float.Float64
-     use caisar.DatasetClassification
-     use caisar.DatasetClassificationProps
+   goal robustness:
+     let nn = read_neural_network "MNIST_256_2" ONNX in
+     let dataset = read_dataset "mnist_test.csv" CSV in
+     let eps = (0.375:t) in
+     robust nn dataset eps

-     goal robustness:
-       robust model dataset (0.0100000000000000002081668171172168513294309377670288085937500000:t)
-   end

-This file is available, as is, under the ``/examples/mnist/`` folder as
+The final property file is available, as is, under the ``/examples/mnist/``
+folder as
 `property.why
 <https://git.frama-c.com/pub/caisar/-/blob/master/examples/mnist/property.why>`_.
 The corresponding neural network in ONNX format is available under the
@@ -139,12 +156,12 @@ Verifying the property with CAISAR
 ==================================

 Now we may verify whether the previous robustness specification holds on the
-MNIST fragment ``mnist_test.csv`` by means of the nnenum prover. This can be
+MNIST fragment by means of the nnenum prover. This can be
 done via CAISAR as follows:

 .. code-block:: console

-   $ caisar verify --prover nnenum -L examples/mnist/nets --format whyml --dataset=examples/mnist/csv/mnist_test.csv examples/mnist/property.why
+   $ caisar verify --prover nnenum -L examples/mnist/nets --format whyml examples/mnist/property.why
    [caisar] Goal robustness: Invalid

 The result tells us that there exists at least one image in ``mnist_test.csv``
diff --git a/doc/usage.rst b/doc/usage.rst
index 8cb9fd6829ab28f2d057fcdf93f5fb00416163b1..dce8d75beafe22e0b73ca191940270fac24d56d3 100644
--- a/doc/usage.rst
+++ b/doc/usage.rst
@@ -10,6 +10,25 @@ The CAISAR executable provide several subcommands, called using
 ``caisar [COMMAND]``. ``caisar --help`` provides a list of all available
 commands. Additionally, ``caisar [COMMAND] --help`` gives a synopsis for each
 available command.

+Logging
+*******
+
+CAISAR provides different log verbosity levels. To change the log verbosity
+level, pass the option ``--verbosity`` to the ``verify`` command. Available
+levels are ``quiet``, ``error``, ``warning``, ``info`` and ``debug`` (defaults
+to ``warning``).
+
+CAISAR also provides log *tags*. Each tag adds specific information to the
+log, independently of the log verbosity level. Note that the ``debug``
+verbosity level enables all tags. To enable a particular tag, use the option
+``--ltag=TAG``, with ``TAG`` one of the following:
+
+* ``Autodetect``: display the output of the CAISAR prover autodetection
+  routine described in :ref:`prover-registration`
+* ``ProverSpec``: display the actual specification provided to the prover
+* ``ProverCall``: display the actual command used to call the prover
+* ``InterpretGoal``: display the goal interpretation
+* ``StackTrace``: display the stack trace upon a CAISAR error
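+
+For instance, one may display the specification actually provided to the
+prover when verifying the MNIST example of this documentation (a sketch;
+adapt the paths to your own setup):
+
+.. code-block:: console
+
+   $ caisar verify --prover nnenum -L examples/mnist/nets --format whyml --ltag=ProverSpec examples/mnist/property.why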
+
 .. _prover-registration:

 Prover registration
@@ -19,9 +38,7 @@ CAISAR relies on external provers to work. You must install them first, then
 point CAISAR to their location. Please refer to each prover documentation to
 install them.

-CAISAR
-already support several provers, listed in
-:ref:`supported-provers`.
+CAISAR already supports several provers, listed in :ref:`supported-provers`.

 Assuming you have installed a prover with an entry in the
 ``caisar-detection.conf`` file, you can register the prover to CAISAR using the
 following command: ``PATH=$PATH:/path/to/solver/executable caisar config -d``.
@@ -49,25 +66,6 @@ Internally, CAISAR will translate the goals into a form that is acceptable by
 the targeted prover, generating a file (the ``%f`` defined in the previous
 section).

-Logging
-*******
-
-CAISAR provides different log verbosity levels. To change the log verbosity
-level, provide the option ``--verbosity`` to the ``verify`` command. Available
-levels are ``quiet``, ``error``, ``warning``, ``info`` and ``debug`` (defaults
-to ``warning``).
-
-CAISAR also provides log *tags*. Each tag adds additional information in the
-log, independently of the log verbosity level. However, note that the ``debug``
-verbosity level allows logging with all tags enabled. To enable a particular
-tag, use the option ``--ltag=TAG``, with ``TAG`` one of the following:
-
-* ``Autodetect``: display the output of the autodetect routine of CAISAR described in :ref:`prover-registration`
-* ``ProverSpec``: display the actual specification provided to the prover
-* ``ProverCall``: display the actual command used to call the prover
-* ``InterpretGoal``: display the goal interpretation
-* ``StackTrace``: display the stack trace upon a CAISAR error
-
 Built-in properties
 -------------------
 In addition to all the theories provided by Why3, CAISAR
@@ -135,33 +133,3 @@ request on our `forge <https://git.frama-c.com/pub/caisar/-/merge_requests>`_!
 Assuming you have installed a prover and you filled the
 ``caisar-detection.conf`` file, you can register the prover to CAISAR using the
 following command: ``PATH=$PATH:/path/to/solver/executable caisar config -d``.
-
-Property verification
----------------------
-A prominent use case of CAISAR is to model a specification for an artificial intelligence
-system, and to verify its validity for a given system.
-
-The modelling uses `WhyML <https://why3.lri.fr/doc/syntaxref.html>`_,
-a typed first-order logic language.
-Example of WhyML are in the `source code <https://git.frama-c.com/pub/caisar/-/tree/master/examples/>`_.
-You may also read the :ref:`examples` section of this
-documentation to get a first grasp on using WhyML.
-
-Provided a file `trust.why` containing a goal to verify, the command ``caisar verify --prover=PROVER trust.why``
-will verify the goals using the specified prover. A list of
-supported provers is available in :ref:`supported-provers`. The prover
-must already be registered by CAISAR. Currently, only one
-prover can be selected at a time; future work will allow
-selecting multiple provers and orchestrate verification
-strategies.
-Internally, CAISAR will translate the goals into a form that
-is acceptable by the targeted prover, generating a file (the
-``%f`` defined in the previous section).
-
-Built-in properties
--------------------
-In addition to all the theories provided by Why3, CAISAR
-provide additional theories that model commonly desirable properties for machine learning programs.
-All those predicates are located in the file ``stdlib/caisar.mlw``.
-Among them are theories to describe classification datasets,
-local and global robustness around a neighborhood.
diff --git a/examples/mnist/property.why b/examples/mnist/property.why
index 1642bd92159e0cd3e8ddf3eed7584d4e6b32df27..24cfa42801b4645b83ad85c6ffa5ff16313199ef 100644
--- a/examples/mnist/property.why
+++ b/examples/mnist/property.why
@@ -1,9 +1,39 @@
 theory MNIST
-  use MNIST_256_2.AsArray
   use ieee_float.Float64
-  use caisar.DatasetClassification
-  use caisar.DatasetClassificationProps
+  use int.Int
+  use interpretation.Vector
+  use interpretation.NeuralNetwork
+  use interpretation.Dataset
+
+  type image = vector t
+  type label_ = int
+
+  predicate valid_image (i: image) =
+    forall v: index. valid_index i v -> (0.0: t) .<= i[v] .<= (1.0: t)
+
+  predicate valid_label (l: label_) = 0 <= l <= 2
+
+  predicate advises (n: nn) (i: image) (l: label_) =
+    valid_label l ->
+    forall j: int. valid_label j -> j <> l -> (n@@i)[l] .> (n@@i)[j]
+
+  predicate bounded_by_epsilon (i: image) (eps: t) =
+    forall v: index. valid_index i v -> .- eps .<= i[v] .<= eps
+
+  predicate robust_around (n: nn) (eps: t) (l: label_) (i: image) =
+    forall perturbed_image: image.
+      has_length perturbed_image (length i) ->
+      valid_image perturbed_image ->
+      let perturbation = perturbed_image - i in
+      bounded_by_epsilon perturbation eps ->
+      advises n perturbed_image l
+
+  predicate robust (n: nn) (d: dataset label_ image) (eps: t) =
+    Dataset.forall_ d (robust_around n eps)

   goal robustness:
-    robust model dataset (0.0100000000000000002081668171172168513294309377670288085937500000:t)
+    let nn = read_neural_network "MNIST_256_2" ONNX in
+    let dataset = read_dataset "mnist_test.csv" CSV in
+    let eps = (0.375:t) in
+    robust nn dataset eps
 end