--- layout: fc_discuss_archives title: Message 15 from Frama-C-discuss on May 2020 ---
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Frama-c-discuss] Issues with analyzing the nginx webserver using the eva plugin



Hello David and André,

thank you both for your detailed responses.
After reading your advice and tinkering around some more, I figured that
Frama-C probably is (in its current state and with its focus on
verification of small(er) embedded programs) not the ideal tool to analyze
nginx *for my specific use case*. Myself having barely any experience with
static analysis and the fact that nginx is full of dynamic allocations
doesn't help either ;)

Just in case it proofes useful, I have uploaded the exact commands and
custom include stubs on the frama-c branch in this repository:
https://github.com/cmfcmf/nginx/tree/frama-c (note that this is based on
nginx release-1.17.5)
You should be able to reproduce my attempts by executing the frama.sh
script.
I had to make changes to two nginx source files related to the itimerval
struct to successfully parse the source (see
https://github.com/cmfcmf/nginx/commit/d7eb963d40aec4e67d245fb6e0bbd997aa1b6d47).
I couldn't figure out why Frama-C wouldn't recognize the it_interval
and it_value fields.

Best regards
Christian

Am Mo., 4. Mai 2020 um 15:59 Uhr schrieb Andre Maroneze <
Andre.MARONEZE at cea.fr>:

> Hello Christian,
>
>
> I'd just like to complement David's excellent answer:
>
>
> - Eva was initially developed with embedded code in mind; we are
> continuously working towards general-purpose code, including dynamic
> allocation, recursion (in the future) and library calls. Dynamic allocation
> is supported, but due to its nature, a static analyzer is forced to
> compromise: the user often has to choose between analysis efficiency,
> precision and complexity of usage. By default, Eva minimizes precision, to
> keep it efficient and relative simple to use; with more precise knowledge
> of the code, and of Eva itself, it is possible to improve efficiency,
> precision, or both; but usually dynamic memory exacerbates small losses of
> precision elsewhere, even when it is not their direct cause.
>
>
> - We are constantly adding C11 and POSIX library function specifications
> to help analyze code using them; but third-party libraries such as
> cryptographic components and other code still require some effort. Since
> Eva performs a "white-box" interprocedural analysis by default, external
> code needs to be brought into the analysis somehow; either by writing some
> stubs or by including the third-party code in the analysis.
>
>
> - Recursion is one of our future targets, but it requires considerable
> effort to adapt the tool.
>
>
> - In practice, especially for first-time users, David's remarks stand:
> nginx itself is too large and complex to be precisely analyzed by Eva.
> Spending hours on it is rarely (if ever) going to improve results. A more
> interesting approach might be to toy with option `-eva-stop-at-nth-alarm
> <N>`, which stops the analysis when the <N>-th alarm is found. You can then
> open the GUI and, in some cases, obtain some "red" (Invalid) alarms, which
> will hopefully indicate issues deserving more immediate attention. David's
> recommendation of using kill with the USR1 signal results in something
> similar: the analysis is halted and its state is saved in a file. Running
> "frama-c-gui -load" on that file will show the partial result of the
> analysis. The most common issue I've found in such case are initialization
> problems, since they happen quite early in the analysis.
>
>
> - Another thing to take into account in large analyses is the generation
> of garbled mix: there is a special warning key, `eva:garbled-mix`, which
> warns when they are formed. In some cases this is inevitable, and in a few
> cases they are harmless; but in the majority of situations, they indicate
> the analysis has already lost precision and might snowball from there. In
> some cases, tracking them can help identify early cases of imprecise
> specifications or critical loops where extra precision is needed. Note that
> removing garbled mix often has a positive impact on the efficiency of the
> analysis.
>
> - Remember that Eva, by default, runs a context akin to a "test case": the
> analyzed functions correspond to those reachable during execution of a
> 'main' function. If you have a single test file with dozens or hundreds of
> test functions, a useful approach is to split them into separate analyses,
> e.g. with #ifdefs and corresponding `-cpp-extra-args=-DTEST1`,
> `-cpp-extra-args=-DTEST2`, etc. This allows running analyses in parallel
> and prevents losses of precision from one part of the analysis to spread
> into the rest.
>
>
> Overall, I'd say that your efforts to prepare the sources and ensure Eva
> can run are themselves already something remarkable. If by chance they are
> available somewhere, we'd be interested to take a look at them and see if
> we can provide some help into how to better parametrize Eva. If so, we'll
> work on the documentation to ensure such options become more prominent in
> the future.
>
>
>
> Best regards,
>
>
>
> On 03/05/2020 18:28, David MENTRÉ wrote:
>
> Hello Christian,
>
> In short, you are starting with a very complex project : nginx is big and
> complex code and EVA is not designed to analyze such kind of code (EVA
> target is safety critical embedded code, with very few library calls, no
> dynamic memory allocation and no recursion). You might be able to analyze
> and get result on some parts of nginx, although due to the use of many
> library calls it might not be that easy.
>
> Nonetheless, here are some hints:
>
>    - At any time, you can stop an analysis using "kill -USR1
>    pid-of-frama-c-process". It saves the intermediate result in
>    analyzed.sav.errors file and you can then use "frama-c-gui -load
>    analyzed.sav.errors" to load the analyze and try to see what is going on
>    (not analyzed code appear as red unreachable code and the degeneration
>    points appear as yellow & orange)
>    - When you are seeing message like "Assigning imprecise value to ...",
>    it means EVA is losing precision on the way the program is doing memory
>    copies. Most of the time, it means that EVA has completely lost track of
>    what the program is doing and you won't get any meaningful result afterwards
>    - In the same way, if EVA runs more than a few hours, most of the time
>    you won't get much of it
>    - As far as I know, the default parameters make the quickest analysis,
>    at the expense of lack of precision. You probably won't get better than that
>    - EVA is inherently single core application. The only way to use
>    multiple core is to cut your program into smaller parts and analyze
>    separately each part
>
> If nginx has some parts of code that does not do dynamic memory allocation
> and do not manipulate too much pointers, maybe you can use EVA on them.
>
>
> Best regards,
> david
>
> Le 30/04/2020 à 14:50, Flach, Christian a écrit :
>
> Hello everyone,
>
> I am trying to use Frama-C for analyzing the nginx webserver. I have not
> used Frama-C before.
>
> My goal is to be able to answer questions like "why would the webserver
> respond with status code 200, given an incoming request and the webserver
> configuration?". On a C-level, that would mean questions like: "what
> configuration struct fields will lead to the status field of the response
> struct being set to 200?" or "which configuration struct fields influence
> the response header field in the response struct".
>
> I would hope to be able to answer these questions by using static analysis
> of nginx. Frama-C (and its eva plugin) should be able to answer these
> questions, if I'm not mistaken.
>
> I was able to successfully parse the source code with Frama-C. I got most
> of it working by following this blog post [1] and copied the other
> necessary parts from header files on my system.
>
> I am stuck with running the eva plugin. I was able to successfully run it
> when I parsed only a subset of nginx's source files. Running eva took less
> than 3 hours and seemed to work. However, when I try to run the eva plugin
> after parsing a bigger set of nginx source files, it never finishes:
>
> I have let eva run for 40 hours on a 4,4GHz CPU and it didn't finish.
> While I see lot's of logging happening in the first ~3 hours, there is a
> gap between log messages of 15-60 minutes after that.
>
> My issue is that I have no idea
> - whether the analysis is actually ongoing when running it that long or
> stuck in some sort of loop
> - how close the analysis is to finishing / how much progress has been made
> - whether there is anything I can configure to make the analysis finish
> faster or run on more than one core
>
> For reference, this is how I parse the code and execute the eva plugin
> [2]. I also see hundreds, of lines like this [3]. Maybe Frama-C is choking
> on nginx's heavy use of dynamic allocations and memory pooling?
>
> Do you have any advice for running eva on a big project like nginx? Are
> there command line options that I am missing that might help with figuring
> out if something is wrong? Are there any additional options to make the
> analysis faster by giving up some precision? I also want to make clear that
> I am not trying to proof or verify any nginx properties; approximations are
> fine to some degree.
>
>
> Best
> Christian
> (apologies if you received this email twice - it appears like I first used
> the wrong email address)
>
>
> [1]
> http://blog.frama-c.com/index.php?post/2018/07/06/Parsing-realistic-code-bases-with-Frama-C
>
>
> [2]
>
> export FRAMA_C_MEMORY_FOOTPRINT=10
>
> # Parsing
>
> CPP_EXTRA_ARGS="-include../ext-include/__fc_stubs.h -I ../ext-include -I
> src/core -I src/event -I src/event/modules -I src/os/unix -I objs -I
> src/http -I src/http/modules -DNGX_HAVE_GMTOFF=0 -DNGX_HAVE_O_PATH=0
> -DNGX_ZLIB=0 -DNGX_HAVE_TCP_INFO=0 -DNGX_HAVE_MEMALIGN=0
> -DNGX_HAVE_POSIX_MEMALIGN=0 -DNGX_HAVE_IP_PKTINFO=0
> -DNGX_HAVE_GNU_CRYPT_R=0 -DNGX_HAVE_EPOLL=0 -DNGX_HAVE_EPOLLEXCLUSIVE=0
> -DNGX_HAVE_ACCEPT4=0"
>
> frama-c -metrics -cpp-extra-args="$CPP_EXTRA_ARGS" \
>         -kernel-warn-key annot:missing-spec=abort \
>         -kernel-warn-key typing:implicit-function-declaration=abort \
>         -save=$OUTPATH/parsed.sav $FILES | tee $OUTPATH/parse.log
>
> # EVA
>
> frama-c -load $OUTPATH/parsed.sav \
>         -main main \
>         -eva \
>         -eva-ignore-recursive-calls \
>         -eva-use-spec=ngx_log_error_core \
>         -eva-warn-key builtins:missing-spec=abort \
>         -eva-warn-key alarm=inactive \
>         -save=$OUTPATH/analyzed.sav | tee $OUTPATH/analyze.log
>
> [3]
>
> [eva] src/core/ngx_string.c:273:
> Assigning imprecise value to *tmp_4
>   (pointing to ngx_errlog_module with offsets [0..792],0%8;
>   ngx_core_module with offsets [0..792],0%8;
>   init_cycle with offsets [0..2488],0%8;
>   ngx_conf_module with offsets [0..792],0%8;
>   ngx_http_header_filter_module with offsets [0..792],0%8;
>   ngx_debug_points with offsets [0..280],0%8;
>   ngx_core_commands with offsets [0..3800],0%8;
>   ngx_core_module_ctx with offsets [0..120],0%8;
>   ngx_errlog_commands with offsets [0..440],0%8;
>   ngx_errlog_module_ctx with offsets [0..120],0%8;
>   ngx_log with offsets [0..312],0%8; ngx_log_file with offsets
> [0..152],0%8;
>   ngx_conf_commands with offsets [0..440],0%8;
>   ngx_syslog_dummy_log with offsets [0..312],0%8;
>   ngx_syslog_dummy_event with offsets [0..408],0%8;
>   ngx_http_header_filter_module_ctx with offsets [0..248],0%8;
>   __malloc_ngx_alloc_l22 with offsets [0..8184];
>   __malloc_ngx_alloc_l22_2 with offsets [0..408];
>
>    ... 100s to 1000s more malloc lines go here ...
>
>   __malloc_ngx_alloc_l22_7196; __malloc_ngx_alloc_l22_7197;
>   __malloc_w_ngx_alloc_l22_7198; __malloc_w_ngx_alloc_l22_7199;
>   __malloc_w_ngx_alloc_l22_7200}
>   because of Misaligned.
>
>
>
> _______________________________________________
> Frama-c-discuss mailing listFrama-c-discuss at lists.gforge.inria.frhttps://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
>
>
>
> _______________________________________________
> Frama-c-discuss mailing listFrama-c-discuss at lists.gforge.inria.frhttps://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
>
> --
> André Maroneze
> Researcher/Engineer CEA/List
> Software Reliability and Security Laboratory
>
> _______________________________________________
> Frama-c-discuss mailing list
> Frama-c-discuss at lists.gforge.inria.fr
> https://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gforge.inria.fr/pipermail/frama-c-discuss/attachments/20200510/7991193c/attachment-0001.html>