Skip to content
Snippets Groups Projects
2010-10-07-Value-analysis-tutorial-part-2.html 20.6 KiB
Newer Older
---
layout: post
author: Pascal Cuoq
date: 2010-10-07 18:23 +0200
categories: skein value
format: xhtml
title: "Value analysis tutorial, part 2"
summary: 
---
{% raw %}
<p>In order to create a tradition of providing solutions to previous quizzes, this post is a partial answer to the question in <a href="/skein/value/2010/09/30/Progress-of-the-value-analysis-tutorial2" hreflang="en">this one about Frama-C's value analysis</a>.</p> 
<p>To recap: at the end of the Boron tutorial we arrive at the <code>main</code> function below. This function is useful as an analysis context for the verification of hashing a 80-char message with Skein-256:</p> 
<pre>#include "skein.h" 
#define HASHLEN (32) 
u08b_t msg[80]; 
void main(void) 
{ 
  int i; 
  u08b_t hash[HASHLEN]; 
  for (i=0; i&lt;80; i++) 
    msg[i] = Frama_C_interval(0  255); 
  Skein_256_Ctxt_t skein_context;  
  Skein_256_Init(&amp;skein_context  HASHLEN * 8); 
  Skein_256_Update(&amp;skein_context  msg  80); 
  Skein_256_Final( &amp;skein_context  hash); 
} 
</pre> 
<p>The context analysis above allocates as a global array and initializes a message <code>msg</code> with arbitrary contents. It also allocates a structure <code>skein_context</code> as a local variable to be used by the library for ongoing computations.</p> 
<p>Our goal is now to generalize the analysis to arbitrary numbers of <code>Skein_256_Update</code> calls. Let us replace the single <code>Skein_256_Update</code> call in the above context by the loop below. This loop uses <code>Frama_C_interval(0 1)</code> as its condition. The value analysis is guaranteed to take into account the possible results 0 and 1 for all these calls. This ensures that the results of the analysis apply to all the different execution paths we are now interested in  although these paths differ in length.</p> 
<pre>  while (Frama_C_interval(0 1)) 
    { 
      for (i=0; i&lt;80; i++) 
        msg[i] = Frama_C_interval(0  255); 
      Skein_256_Update(&amp;skein_context  msg  80); 
    } 
</pre> 
<p>Using the same command-line as in the tutorial generates warnings about possible undefined behaviors:</p> 
<p><code>frama-c -val -slevel 100 *.c .../share/builtin.c &gt; log</code></p> 
<pre>grep ssert log 
lib.c:10:[kernel] warning: out of bounds write. assert \valid(tmp); 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + n): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )1)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )2)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )3)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )4)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )5)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )6)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )7)): assert(TODO) 
</pre> 
<p>It is possible to increase the argument of option <code>-slevel</code> in order to unroll more iterations and postpone the moment in the analysis when precision is lost and alarms occur. However  since the <code>while</code> loop is infinite  there is no value for <code>-slevel</code> that lets this loop be analyzed with precision to its end.</p> 
<p>In fact  we do not want this <code>while</code> loop unrolled. We want the value analysis to do standard value analysis magic and extract properties that are true at all iterations of the loop without studying each iteration individually. In other words  we want the value analysis not to unroll the while loop (but to continue unrolling the other  finite loops so as to remain as precise as it was for the single <code>Skein_256_Update</code> call).</p> 
<p>One way to obtain this result is to move the <code>for</code> loop outside function <code>main</code>  so that the <code>while</code> loop is the only one that remains in <code>main</code>  and to use the option <code>-slevel-function</code> introduced in Frama-C Boron. This option allows "semantic unrolling" to be tuned function by function. Here  we can use it to tell the value analysis to unroll loops except inside <code>main</code>.</p> 
<p>The complete analysis context becomes:</p> 
<pre>#include "skein.h" 
#define HASHLEN (32) 
u08b_t msg[80]; 
void arbitrarize_msg(void) 
{ 
  for (int i=0; i&lt;80; i++) msg[i]=Frama_C_interval(0  255); 
} 
u08b_t hash[HASHLEN]; 
void main(void) 
{ 
  int i; 
  u08b_t hash[HASHLEN]; 
  Skein_256_Ctxt_t skein_context;  
  Skein_256_Init(&amp;skein_context  HASHLEN * 8); 
  while (Frama_C_interval(0 1)) 
    { 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
    } 
  Skein_256_Final( &amp;skein_context  hash); 
} 
</pre> 
<p>The new command-line to use is:</p> 
<p><code>frama-c -val -slevel 100 -slevel-function main:0 *.c &gt; log</code></p> 
<p>The analysis takes a minute or so.</p> 
<pre>grep ssert log 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + n): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )1)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )2)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )3)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )4)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )5)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )6)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )7)): assert(TODO) 
lib.c:10:[kernel] warning: out of bounds write. assert \valid(tmp); 
lib.c:18:[kernel] warning: out of bounds write. assert \valid(tmp); 
</pre> 
<p>The warnings from the previous attempt are still there  and they have a surprise guest at lib.c  line 18 (in function <code>memset</code>). But since the only thing we have changed since the last attempt is to prevent a loop from being unrolled  this was to be expected (you may have noticed that the analysis was a little bit faster  also as a result of not unrolling the <code>while</code> loop).</p> 
<p>Now  let us try to get rid of these warnings for good. For this purpose  we need information about what is causing them. Plunging in the code with <code>frama-c-gui</code> is a good way to obtain some of that  but this approach lacks the ability to display values of variables for different calls to <code>Skein_256_Update</code> separately. Purely as an experiment  in order to have a place to click to see the state at the first  second  ...  fifth iteration  let us change the <code>while</code> loop to:</p> 
<pre>      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
  while (Frama_C_interval(0 1)) 
    { 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
    } 
</pre> 
<p>Using the command-line <code>frama-c-gui -val -slevel 100 -slevel-function main:0 *.c</code>  the first surprise is that we have accidentally eliminated most of the alarms by hand-unrolling the <code>while</code> loop. Indeed  only the following now remains:</p> 
<pre>lib.c:10:[kernel] warning: out of bounds write. assert \valid(tmp); 
</pre> 
<p>We can stick to the initial plan and observe the states after each of the first five iterations to understand what happened. This is no time to celebrate victory yet. For starters  the verification is not done until all alarms have been eliminated. Also  our investigative instrumentation has eliminated some execution paths — the executions paths with less than five iterations — that would still need to be studied separately before the verification can be called finished.</p> 
<p>Here are the contents of variable <code>skein_context</code> before the first  second and third calls:</p> 
<pre>skein_context.h.hashBitLen ∈ {256; } 
             .h{.bCnt; .T[0]; } ∈ {0; } 
             .h.T[1] ∈ {8070450532247928832; } 
             .X[0] ∈ {4072681676153946182; } 
             .X[1] ∈ {5436642852965252865; } 
             .X[2] ∈ {2889783295839482789; } 
             .X[3] ∈ {6109786322067550404; } 
             .b[0..31] ∈ UNINITIALIZED 
</pre> 
<pre>skein_context.h.hashBitLen ∈ {256; } 
             .h.bCnt ∈ {16; } 
             .h.T[0] ∈ {64; } 
             .h.T[1] ∈ {3458764513820540928; } 
             {.X[0..3]; .b[0..15]; } ∈ [--..--] 
             .b[16..31] ∈ UNINITIALIZED 
</pre> 
<pre>skein_context.h.hashBitLen ∈ {256; } 
             .h.bCnt ∈ {32; } 
             .h.T[0] ∈ {128; } 
             .h.T[1] ∈ {3458764513820540928; } 
             {.X[0..3]; .b[0..31]; } ∈ [--..--] 
</pre> 
<p>Based on this information  it is now clear where the "accessing uninitialized left-value" warnings came from: there is inside the structure <code>skein_context</code> a buffer that only becomes initialized after the first two calls to <code>Skein_256_Update</code>. And these alarms are now gone because these iterations  when the buffer is not yet completely initialized  are handled separately from the latter iterations when the buffer is initialized and presumably read from at some point.</p> 
<p>Quiz 1: continuing the study of the first five calls to <code>Skein_256_Update</code>  what used to cause the warning below  and what made it disappear?</p> 
<pre>lib.c:18:[kernel] warning: out of bounds write. assert \valid(tmp); 
</pre> 
<p>Quiz 2: how should we get rid of the last alarm (that occurs in our file lib.c at line 10  function <code>memcpy</code> although it is caused by imprecision in the callers of that function)?</p> 
<p>The files in their state at the end of this blog post are available <a href="/assets/img/blog/imported-posts/skein_verification.zip">here</a>. To rewind to earlier stages of the verification process  simply erase <code>main.c</code>  which is the only file we have modified so far.</p>
 <p>In order to create a tradition of providing solutions to previous quizzes, this post is a partial answer to the question in <a href="/skein/value/2010/09/30/Progress-of-the-value-analysis-tutorial2" hreflang="en">this one about Frama-C's value analysis</a>.</p> 
<p>To recap: at the end of the Boron tutorial we arrive at the <code>main</code> function below. This function is useful as an analysis context for the verification of hashing a 80-char message with Skein-256:</p> 
<pre>#include "skein.h" 
#define HASHLEN (32) 
u08b_t msg[80]; 
void main(void) 
{ 
  int i; 
  u08b_t hash[HASHLEN]; 
  for (i=0; i&lt;80; i++) 
    msg[i] = Frama_C_interval(0  255); 
  Skein_256_Ctxt_t skein_context;  
  Skein_256_Init(&amp;skein_context  HASHLEN * 8); 
  Skein_256_Update(&amp;skein_context  msg  80); 
  Skein_256_Final( &amp;skein_context  hash); 
} 
</pre> 
<p>The context analysis above allocates as a global array and initializes a message <code>msg</code> with arbitrary contents. It also allocates a structure <code>skein_context</code> as a local variable to be used by the library for ongoing computations.</p> 
<p>Our goal is now to generalize the analysis to arbitrary numbers of <code>Skein_256_Update</code> calls. Let us replace the single <code>Skein_256_Update</code> call in the above context by the loop below. This loop uses <code>Frama_C_interval(0 1)</code> as its condition. The value analysis is guaranteed to take into account the possible results 0 and 1 for all these calls. This ensures that the results of the analysis apply to all the different execution paths we are now interested in  although these paths differ in length.</p> 
<pre>  while (Frama_C_interval(0 1)) 
    { 
      for (i=0; i&lt;80; i++) 
        msg[i] = Frama_C_interval(0  255); 
      Skein_256_Update(&amp;skein_context  msg  80); 
    } 
</pre> 
<p>Using the same command-line as in the tutorial generates warnings about possible undefined behaviors:</p> 
<p><code>frama-c -val -slevel 100 *.c .../share/builtin.c &gt; log</code></p> 
<pre>grep ssert log 
lib.c:10:[kernel] warning: out of bounds write. assert \valid(tmp); 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + n): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )1)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )2)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )3)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )4)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )5)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )6)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t_0 )7)): assert(TODO) 
</pre> 
<p>It is possible to increase the argument of option <code>-slevel</code> in order to unroll more iterations and postpone the moment in the analysis when precision is lost and alarms occur. However  since the <code>while</code> loop is infinite  there is no value for <code>-slevel</code> that lets this loop be analyzed with precision to its end.</p> 
<p>In fact  we do not want this <code>while</code> loop unrolled. We want the value analysis to do standard value analysis magic and extract properties that are true at all iterations of the loop without studying each iteration individually. In other words  we want the value analysis not to unroll the while loop (but to continue unrolling the other  finite loops so as to remain as precise as it was for the single <code>Skein_256_Update</code> call).</p> 
<p>One way to obtain this result is to move the <code>for</code> loop outside function <code>main</code>  so that the <code>while</code> loop is the only one that remains in <code>main</code>  and to use the option <code>-slevel-function</code> introduced in Frama-C Boron. This option allows "semantic unrolling" to be tuned function by function. Here  we can use it to tell the value analysis to unroll loops except inside <code>main</code>.</p> 
<p>The complete analysis context becomes:</p> 
<pre>#include "skein.h" 
#define HASHLEN (32) 
u08b_t msg[80]; 
void arbitrarize_msg(void) 
{ 
  for (int i=0; i&lt;80; i++) msg[i]=Frama_C_interval(0  255); 
} 
u08b_t hash[HASHLEN]; 
void main(void) 
{ 
  int i; 
  u08b_t hash[HASHLEN]; 
  Skein_256_Ctxt_t skein_context;  
  Skein_256_Init(&amp;skein_context  HASHLEN * 8); 
  while (Frama_C_interval(0 1)) 
    { 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
    } 
  Skein_256_Final( &amp;skein_context  hash); 
} 
</pre> 
<p>The new command-line to use is:</p> 
<p><code>frama-c -val -slevel 100 -slevel-function main:0 *.c &gt; log</code></p> 
<p>The analysis takes a minute or so.</p> 
<pre>grep ssert log 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + n): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )1)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )2)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )3)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )4)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )5)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )6)): assert(TODO) 
skein.c:53:[kernel] warning: accessing uninitialized left-value *(src + (n + (size_t )7)): assert(TODO) 
lib.c:10:[kernel] warning: out of bounds write. assert \valid(tmp); 
lib.c:18:[kernel] warning: out of bounds write. assert \valid(tmp); 
</pre> 
<p>The warnings from the previous attempt are still there  and they have a surprise guest at lib.c  line 18 (in function <code>memset</code>). But since the only thing we have changed since the last attempt is to prevent a loop from being unrolled  this was to be expected (you may have noticed that the analysis was a little bit faster  also as a result of not unrolling the <code>while</code> loop).</p> 
<p>Now  let us try to get rid of these warnings for good. For this purpose  we need information about what is causing them. Plunging in the code with <code>frama-c-gui</code> is a good way to obtain some of that  but this approach lacks the ability to display values of variables for different calls to <code>Skein_256_Update</code> separately. Purely as an experiment  in order to have a place to click to see the state at the first  second  ...  fifth iteration  let us change the <code>while</code> loop to:</p> 
<pre>      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
  while (Frama_C_interval(0 1)) 
    { 
      arbitrarize_msg(); 
      Skein_256_Update(&amp;skein_context  msg  80); 
    } 
</pre> 
<p>Using the command-line <code>frama-c-gui -val -slevel 100 -slevel-function main:0 *.c</code>  the first surprise is that we have accidentally eliminated most of the alarms by hand-unrolling the <code>while</code> loop. Indeed  only the following now remains:</p> 
<pre>lib.c:10:[kernel] warning: out of bounds write. assert \valid(tmp); 
</pre> 
<p>We can stick to the initial plan and observe the states after each of the first five iterations to understand what happened. This is no time to celebrate victory yet. For starters  the verification is not done until all alarms have been eliminated. Also  our investigative instrumentation has eliminated some execution paths — the executions paths with less than five iterations — that would still need to be studied separately before the verification can be called finished.</p> 
<p>Here are the contents of variable <code>skein_context</code> before the first  second and third calls:</p> 
<pre>skein_context.h.hashBitLen ∈ {256; } 
             .h{.bCnt; .T[0]; } ∈ {0; } 
             .h.T[1] ∈ {8070450532247928832; } 
             .X[0] ∈ {4072681676153946182; } 
             .X[1] ∈ {5436642852965252865; } 
             .X[2] ∈ {2889783295839482789; } 
             .X[3] ∈ {6109786322067550404; } 
             .b[0..31] ∈ UNINITIALIZED 
</pre> 
<pre>skein_context.h.hashBitLen ∈ {256; } 
             .h.bCnt ∈ {16; } 
             .h.T[0] ∈ {64; } 
             .h.T[1] ∈ {3458764513820540928; } 
             {.X[0..3]; .b[0..15]; } ∈ [--..--] 
             .b[16..31] ∈ UNINITIALIZED 
</pre> 
<pre>skein_context.h.hashBitLen ∈ {256; } 
             .h.bCnt ∈ {32; } 
             .h.T[0] ∈ {128; } 
             .h.T[1] ∈ {3458764513820540928; } 
             {.X[0..3]; .b[0..31]; } ∈ [--..--] 
</pre> 
<p>Based on this information  it is now clear where the "accessing uninitialized left-value" warnings came from: there is inside the structure <code>skein_context</code> a buffer that only becomes initialized after the first two calls to <code>Skein_256_Update</code>. And these alarms are now gone because these iterations  when the buffer is not yet completely initialized  are handled separately from the latter iterations when the buffer is initialized and presumably read from at some point.</p> 
<p>Quiz 1: continuing the study of the first five calls to <code>Skein_256_Update</code>  what used to cause the warning below  and what made it disappear?</p> 
<pre>lib.c:18:[kernel] warning: out of bounds write. assert \valid(tmp); 
</pre> 
<p>Quiz 2: how should we get rid of the last alarm (that occurs in our file lib.c at line 10  function <code>memcpy</code> although it is caused by imprecision in the callers of that function)?</p> 
<p>The files in their state at the end of this blog post are available <a href="/assets/img/blog/imported-posts/skein_verification.zip">here</a>. To rewind to earlier stages of the verification process  simply erase <code>main.c</code>  which is the only file we have modified so far.</p>
{% endraw %}