2010-12-04-Unspecified-behaviors-and-derived-analyses-part-2.html

---
layout: post
author: Pascal Cuoq
date: 2010-12-04 08:56 +0200
categories: derived-analysis unspecified-behavior value
format: xhtml
title: "Unspecified behaviors and derived analyses, part 2"
summary: 
---
{% raw %}
<h2>Context</h2> 
<p>This post is a sequel and conclusion to <a href="/derived/analysis/unspecified/behavior/value/2010/10/17/Unspecified-behaviors-and-derived-analyses" hreflang="en">this remark</a>.</p> 
<h2>Example of derived analysis: slicing</h2> 
<p>When writing a Frama-C plug-in to assist in 
reverse-engineering source code  it does not really make sense to expect 
the user to check the alarms that are emitted by the value analysis. 
Consider for instance Frama-C's slicing plug-in. This plug-in produces 
a simplified version of a program. It is often applied to large unfamiliar 
codebases; if the user is at the point where he needs a slicer to make 
sense of the codebase  it's probably a bad time to require 
him to check alarms on the original unsliced version.</p> 
<p>The slicer and other code comprehension plug-ins work around this problem 
by defining the results they provide as ``valid for well-defined executions''. 
In the case of the slicer  this is really the only definition that makes 
sense. Consider the following code snippet:</p> 
<pre>  x = a; 
  y = *p; 
  x = x+1; 
  // slice for the value of x here. 
</pre> 
<p>This piece of program is begging for its second line to be removed  but 
if <code>p</code> can be the <code>NULL</code> pointer  the sliced 
program behaves differently from the original: the original program 
exits abruptly on most architectures  whereas the sliced version 
computes the value of <code>x</code>.</p> 
<p>It is fine to ignore alarms in this context  
but the user of a code comprehension plug-in based on the value analysis 
should study the categorization of alarms in the value analysis' reference manual with particular care. 
Because the value analysis is more aggressive 
in trying to extract precise information from the program than other 
analyses  the user is more likely to observe incorrect results if there 
is a misunderstanding between him and the tool about what assumptions 
should be made.</p> 
<h2>Example of benign alarm: testing <code>p&lt;=q</code></h2> 
<p>Everybody agrees that accessing an invalid pointer is an unwanted behavior  
but what about comparing two pointers with <code>&lt;=</code> in an undefined manner or 
assuming that a signed overflow wraps around in 2's complement 
representation? Function <code>memmove</code>  for instance  typically does the 
former when applied to two addresses with different bases.</p> 
<p>Currently  if there appears to be an undefined pointer comparison  the 
value analysis propagates a state that contains all possible results 
for the comparison and for the pointers. This blog post was just trying 
to scare you. It is possible to take advantage of the value analysis 
for program comprehension  and all existing program comprehension 
tools need to make assumptions about undefined behaviors. Most tools 
do not tell you if they had to make assumptions or not. The value analysis 
does: each alarm  in general  
is also an assumption. Still  as implementation progresses and the value analysis 
becomes able to extract more information from the alarms it emits  
one or several options 
to configure it either not to emit some alarms  or not to make the corresponding 
assumptions  will certainly become necessary.</p>
 <h2>Context</h2> 
<p>This post is a sequel and conclusion to <a href="/derived/analysis/unspecified/behavior/value/2010/10/17/Unspecified-behaviors-and-derived-analyses" hreflang="en">this remark</a>.</p> 
<h2>Example of derived analysis: slicing</h2> 
<p>When writing a Frama-C plug-in to assist in 
reverse-engineering source code  it does not really make sense to expect 
the user to check the alarms that are emitted by the value analysis. 
Consider for instance Frama-C's slicing plug-in. This plug-in produces 
a simplified version of a program. It is often applied to large unfamiliar 
codebases; if the user is at the point where he needs a slicer to make 
sense of the codebase  it's probably a bad time to require 
him to check alarms on the original unsliced version.</p> 
<p>The slicer and other code comprehension plug-ins work around this problem 
by defining the results they provide as ``valid for well-defined executions''. 
In the case of the slicer  this is really the only definition that makes 
sense. Consider the following code snippet:</p> 
<pre>  x = a; 
  y = *p; 
  x = x+1; 
  // slice for the value of x here. 
</pre> 
<p>This piece of program is begging for its second line to be removed  but 
if <code>p</code> can be the <code>NULL</code> pointer  the sliced 
program behaves differently from the original: the original program 
exits abruptly on most architectures  whereas the sliced version 
computes the value of <code>x</code>.</p> 
<p>It is fine to ignore alarms in this context  
but the user of a code comprehension plug-in based on the value analysis 
should study the categorization of alarms in the value analysis' reference manual with particular care. 
Because the value analysis is more aggressive 
in trying to extract precise information from the program than other 
analyses  the user is more likely to observe incorrect results if there 
is a misunderstanding between him and the tool about what assumptions 
should be made.</p> 
<h2>Example of benign alarm: testing <code>p&lt;=q</code></h2> 
<p>Everybody agrees that accessing an invalid pointer is an unwanted behavior  
but what about comparing two pointers with <code>&lt;=</code> in an undefined manner or 
assuming that a signed overflow wraps around in 2's complement 
representation? Function <code>memmove</code>  for instance  typically does the 
former when applied to two addresses with different bases.</p> 
<p>Currently  if there appears to be an undefined pointer comparison  the 
value analysis propagates a state that contains all possible results 
for the comparison and for the pointers. This blog post was just trying 
to scare you. It is possible to take advantage of the value analysis 
for program comprehension  and all existing program comprehension 
tools need to make assumptions about undefined behaviors. Most tools 
do not tell you if they had to make assumptions or not. The value analysis 
does: each alarm  in general  
is also an assumption. Still  as implementation progresses and the value analysis 
becomes able to extract more information from the alarms it emits  
one or several options 
to configure it either not to emit some alarms  or not to make the corresponding 
assumptions  will certainly become necessary.</p>
{% endraw %}