itc-benchmarks
This is a modified version of the itc-benchmarks Github repository, with tests from the Toyota InfoTechnology Center, made available online by John Regehr.
Several patches have been applied to the original sources to deal with:
- Issues found in the original code (e.g. copy-paste typos and undefined behaviors (UBs));
- Recursive calls (commented out for running Eva);
- Portability issues (e.g. casts
pthread_t -> int
).
Renamings
-
The
.
in directory names was replaced with_
to avoid issues with analysis scripts; -
02_wo_Defects/free_nondynamically_allocated_memory.c
was renamed to02_wo_Defects/free_nondynamic_allocated_memory.c
, to match the equivalent file in01_w_Defects
.
Tests out of scope
Concurrency and recursive calls are not yet handled by Eva.
Therefore, all tests whose defects concern threads and concurrency are not
handled by Eva.
Also, some kinds of defects that do not imply undefined behavior, such as
redundant conditions in tests, are out of the scope of Frama-C/Eva.
However, such tests are still left in the code, since they allow us to detect
unexpected undefined behaviors (e.g. as detailed later for redundant_code.c
).
Applied patches
The list of applied patches is separated into three categories:
- Likely unintentional typos;
- Unforeseen undefined behaviors;
- Limitations for Eva.
The first category comprises errors which look like simple unintended typos.
The second contains some code that invokes undefined behavior, but does not seem caused by a simple typo.
Finally, the last category contains mostly commented out calls to recursive functions.
Typos
-
buffer_underrun_dynamic.c
, functiondynamic_buffer_underrun_037
:doubleptr[0][0]='T'; // the first '0' should have been 'i'
-
littlemem_st.c
, functionslittlemem_st_008_func_002
,littlemem_st_009_func_002
,littlemem_st_010_func_002
andlittlemem_st_011_func_002
: the same copy-paste error was made in these four functions, in which the global variable name copied from function 007 was not updated. This results in code referring to a different variable, leading to possible runtime errors during execution. Fixing each reference to the corresponding global variable (as was clearly intended in the code) prevents false alarms. -
redundant_code.c
, functionredundant_cond_009
:The comment before the function suggests the loop condition, in the
02_wo_Defects
version, should bea > 10
(or10 < a
). This is also closer to the code in the01_w_Defects
version. Interestingly, the inversion leads to undefined behavior, since the value ofa
will cycle through all negative values, before being decremented and wrapping around, causing a signed overflow. Fixing the condition is necessary to prevent UB in the02_wo_Defects
version.
Undefined behavior
-
data_lost.c
, functiondata_lost_006
, and alsodata_overflow
, functionsdata_overflow_024
anddata_overflow_025
:Float to int conversions which overflow (larger than the maximum integer value for the corresponding type) are undefined behavior. While the UB is deliberate in the
01_w_Defects
directory, there is also UB in the02_wo_Defects
directory. The code was patched to a value that fits in the integer type. -
invalid_memory_access.c
, functioninvalid_memory_access_003
:A char pointer
c
is passed tomalloc
thenfree
, and then assigned to another pointer (psink = c
). This last assignment (afterc
has been freed) entails the access to an indeterminate value (the dangling pointer), which Eva detects as UB. It is not necessary to keep the assignment in the01_w_Defects
directory, since there is already an UB earlier in the code. The instruction was removed from both versions of the file. -
overrun_st.c
, several functionsoverrun_st_XXX
:These functions all follow the same pattern, which results in accessing an uninitialized variable. As an example, here is the code of
overrun_st_001
(without defects):void overrun_st_001 () { char buf[5]; buf[4] = 1; sink = buf[idx]; }
idx
is a global variable, default-initialized to 0. Therefore, the last statement,sink = buf[idx]
(likely introduced to avoid compiler warnings about "unused variable buf"), will access uninitialized memory. This situation happens 37 times in the file. A simple patch in all cases consists in 0-initializing the arrays by adding= {0}
to their declarations. this removes the unintentional errors in the02_wo_defects
version, without removing the intended errors in01_w_defects
. -
st_overflow.c
, functionst_overflow_006_func_001
:Same issue as previously: a local array without initializer has its element accessed, even in the
02_wo_defects
version. The same solution is applied: the array is 0-initialized by adding= {0}
to its declaration. -
bit_shift.c
, functionbit_shift_009
:The code performs a left-shift of a signed value,
1 << (rand() % 32)
. However,1 << 31
(the maximum possible value) has typesigned int
. In a 32-bit architecture (as is assumed for this set of benchmarks), this leads to a signed overflow, which is undefined behavior. The patch consists in replacingrand() % 32
withrand() % 31
. -
st_underrun.c
, functionsst_underrun_002_func_001
andst_underrun_007_func_001
(only the02_wo_defects
version):The no-defects versions of these functions actually still contain some undefined behavior. This is the code of
st_underrun_002_func_001
:void st_underrun_002_func_001 (st_underrun_002_s_001 s) { int len = strlen(s.buf) - 1; for (;s.buf[len] != 'Z';len--) { if ( len < 0 ) break; } }
s.buf
initially contains"STRING"
(therefore no 'Z's). Whenlen
reaches 0, the test inside the loop is still false, so we return to the loop decrement,len
becomes -1, and then we tests.buf[-1] != 'Z'
. This causes an out-of-bounds access, leading to undefined behavior. The code likely intended to return -1 when the character is not found, but since the value oflen
is not actually used here, we can simply replace the inner test withlen <= 0
to avoid the undefined behavior. The same patch is applicable tost_underrun_007_func_001
. -
st_underrun.c
, functionst_underrun_007
:This function also suffers from the uninitialized access issue mentioned in file
overrun_st.c
: variablest_underrun_007_s_001
only has the first element of arraybuf
initialized (to 1), while the remaining of the array is left uninitialized. Whenst_underrun_007_func_001
is called, there is an access to the uninitialized values ofs.buf
insidestrlen
. The same patch as foroverrun_st.c
(adding= {0}
to the declaration) also works here. -
ptr_subtraction.c
, functionptr_subtraction_001
:The no-defects version contains a subtraction between pointers of different bases (
buf1 - buf2
) which itself leads to undefined behavior. It is necessary to comment out this instruction (and the one using its result).
Limitations for Eva
Eva does not handle recursive calls yet, so a few tests containing them had the recursive calls commented out:
-
invalid_memory_access.c
, functioninvalid_memory_access_005
; -
st_overflow.c
, functionst_overflow_005_func_001
; -
st_underrun.c
, functionst_underrun_005
.
Also, 2 test files had deliberate parsing errors (concerning function pointers)
which prevent Frama-C from parsing them. They were removed from the list of
files to be parsed and their functions commented out from main
.
Also, 5 tests concerning concurrency were removed because their source files
perform a non-portable cast from pthread_t
to int
(POSIX states that
the type of pthread_t
is not required to be an arithmetic type. Frama-C's
libc defines it with a struct
, which is allowed by the standard, and chosen
to deliberately help identify such non-conformance violations.
Lastly, since the tests are out of the scope of Eva anyway, removing them has
no impact in the end.
Remaining alarms
A few alarms remain in the code, due to:
-
Imprecision in the handling of the variadic function
snprintf
: this function assigns to a buffer, which is then passed tostrlen
. However, there is currently no way to emulate the precise behavior of whatsnprintf
produces, therefore we use a safe approximation in which the string is possibly non-terminated. This leads to 2 alarms (one in the caller and another in the callee). -
Signed overflows in
02_wo_Defects/redundant_code.c
, functionsredundant_cond_009
andredundant_cond_0013
:Even after fixing the loop condition as mentioned previously, Eva is not able to infer that there is no signed overflow in the loop. This is because the value of
a
is imprecise (the range[10..32767]
inside the loop) and very large, so a huge number of iterations would be necessary to obtain a precise result (that is, that the sum32767 + 32766 + ... + 10
, stored inb
, does not overflow a 32-bit int). Frama-C/Eva performs a widening which leads to a spurious alarm about the signed overflow.
"False negatives"
A few test cases in the 01_w_Defects
do not lead to alarms in Frama-C/Eva.
These test cases are related to situations in which, from the point of view
of the user, it might make sense to consider them as "defects", but from the
point of view of the C standard, they are perfectly valid situations.
One such case is in file memory_allocation_failure.c
, function
memory_allocation_failure_002
:
long long int *ptr=(long long*) malloc(MAX_VAL *sizeof(long long));
MAX_VAL
is defined as UINT_MAX
, therefore an unsigned integer type.
The test intention was that the size should "overflow".
However, unsigned overflow is allowed by the C standard, therefore
Frama-C by default does not raise any alarms.
It is possible to use option -warn-unsigned-overflow
to trigger an alarm
in this situation, but then the rest of the code needs to be patched to avoid
triggering it in other situations.
In particular, the same file memory_allocation_failure.c
contains the
following variable declaration:
static unsigned int static_var = MAX_VAL*2;
Which makes the analysis stop during global variable initialization, since a
definite alarm has been found.
Fixing such issues should allow Frama-C to detect the overflow in the call
to malloc
.