---
layout: post
author: A. Maroneze (Frama-C), V. Yaglamunov & M. Beemster (Solid Sands)
date: 2022-11-16 17:00 +0100
categories: collaboration undefined-behavior
title: "SuperTest and Frama-C: a Clash of Titans"
---
*This post has been co-written with [Solid Sands](https://solidsands.com/);
it is also available at the
[Solid Sands blog](https://solidsands.com/supertest-and-frama-c-a-clash-of-titans).*

Some time ago, the Frama-C team entered into a partner agreement with
[Solid Sands](https://solidsands.com/) to make use of
[SuperTest](https://solidsands.com/products/supertest), a very thorough test
suite for C compilers and libraries. From basic syntactic tests to
comprehensive libc functions, this suite helps to ensure the quality of
compilers and libraries as well as their conformance to different standards.
Naturally, we asked ourselves: what happens if we use SuperTest and Frama-C
to verify each other? This blog post summarizes the results of this exchange.

## Testing a verifier, verifying a test suite
Frama-C is not a compiler, but it has many similarities: it parses programs,
types them, and produces an AST. More importantly, it can analyze tests
_semantically_.

SuperTest is not a normal program, but a collection of (sometimes contrived)
tests that touch every aspect of C. It will stress any tool that processes C.
Its positive tests should be strictly conforming to the C standard.

## Given enough C programs, cracks will appear
Solid Sands has analyzed many implementations of C and C analysis tools.
It has never seen any implementation that is free of error.
That is not to say that there are no good implementations. It is just
that in the corners of the C language there are things that you can easily
trip over. Another difficult issue to get right is the precision of
diagnostics. Thus, some errors in Frama-C were to be expected as well.

From Frama-C's previous experience with test suites and benchmarks, at least
_some_ semantic bugs in SuperTest were to be expected: given a sufficiently
large set of C programs, and given the large amount of subtle semantic aspects
in the language, it is statistically unlikely to obtain such a suite completely
free of *undefined behaviors*, be they portability issues, non-deterministic
cases, or simply bugs. This was already the case with NIST's
[Juliet C/C++ testsuite](https://samate.nist.gov/SARD/test-suites/112),
[J. Moerman’s 2018 bachelor's thesis test suite (JM2018TS)](https://github.com/JMoerman/JM2018TS),
[TrailOfBits’ version of the DARPA Challenge Binaries](https://github.com/trailofbits/cb-multios),
and other test suites and benchmarks we ran Frama-C on.

## Eva to the rescue
The [Eva](https://www.frama-c.com/fc-plugins/eva.html) plug-in of Frama-C is
particularly efficient for this kind of scenario: mostly automatic execution,
no (or few) specifications needed, great precision due to small test sizes.
After an initial setup to configure parsing and adjust a few stubs (e.g. so
that Eva will consider the most general case), it is often easy to run Frama-C
and obtain precise results in a few minutes. In SuperTest, interpreting the
results is made easier by the fact that test cases are named and structured
according to the corresponding C standard features they test.

## Given enough C programs, Frama-C will fail to parse some
Before starting the test campaign, the Frama-C team expected Frama-C _not_ to
pass it completely. Frama-C's parser is not 100% conformant with
C99/C11/etc.: it tends to be too lenient, for two main reasons:

1. Frama-C does not compile the code, so the user _will_ have to write
syntactically valid C code in the end; historically, Frama-C delegates some
checks to the compiler;
2. Frama-C focuses on _semantic_ analyses, leaving the thorny bits of
syntactical issues to other, better-suited tools. For instance, Frama-C does
not even have its own preprocessor.

Nowadays, we strive to make Frama-C as syntactically compliant as possible,
but there are still a few remaining spots requiring some polish.
This effort certainly helps with that.

## Results: unsurprising, but interesting nonetheless
Unsurprisingly, we found both kinds of issues: some semantic issues in
SuperTest, and some syntactic shortcomings in the case of Frama-C.
Actually, we found more of the latter than the former.

Fortunately, we found no major problems. The issues found in SuperTest are of
little relevance for critical and embedded systems; for instance, issues
related to [wide characters](https://en.wikipedia.org/wiki/Wide_character)
(`wchar.h`). We believe there are two reasons for this:

1. Many tools and frameworks decide not to support these functions, since
they are rarely used in the domain of embedded systems;
2. The issues were related to portability, and thus undetectable in some
architectural configurations.

## Very wide coverage
SuperTest is exhaustive with respect to the C standard library, while Frama-C's
ACSL specifications are iteratively added on a best-effort basis. SuperTest
contains tests for features which are not covered by Frama-C, such as complex
numbers, some C11 constructs, and several library files and functions for
which Frama-C does not yet provide an ACSL specification. Thus SuperTest is
likely to remain relevant for future Frama-C evolutions; reciprocally,
improving Frama-C and keeping SuperTest up to date can help uncover new
findings.

## Final score: win-win
Overall, this exchange between CEA LIST and Solid Sands has proven
beneficial to both parties: setting up a recurring run of Frama-C on SuperTest
prevents regressions and benefits from evolutions to Frama-C itself;
conversely, Frama-C can improve its syntactic support thanks to SuperTest.