Home

The CProver User Manual - Systems Verification Group

image

Contents

1. The default library for these functions is not an exact model it rather only provides an over approximation which means that it may allow behavior not present in a real implementation The goal of the over approximation is to track the length of the strings as opposed to their content Any detail that relates to the content of zero terminated strings is therefore lost The over approximation is realized as follows the input model is augmented by additional variables that track the termination status the buffer size and the length of a string These variables may be accessed by using the following three macros 37 Example strlen e Bool __CPROVER_is_zero_string const void str Returns true if str is zero terminated e unsigned _CPROVER_zero_string_length const void str Returns the position of a zero character in str Note that this is only an upper bound for the string length e unsigned _CPROVER_buffer_size const void buffer Returns the capacity of buffer These macros are automatically inserted into the model wherever necessary This means that all three properties of strings are soundly tracked through out the program in an abstract fashion While this not only increases the precision of the analysis compared to the absence of a model for the string functions it also improves the verification performance over full concrete implementations of the string functions Figure 6 1 shows the principle of the abs
2. if top i C counter next_timeframe 60 8 2 5 Driving Inputs Similarly the C model has to synchronize on the choice of the initial value of registers if the Verilog module does not perform initialization The C program can also restrict the choice of inputs of the Verilog module This is useful for adding environment constraints As an example consider a Verilog module that has a signal reset as an input which is active low The following C fragment drives this input to be active in the first cycle and not active in any subsequent cycle CPROVER assume top resetn 0 for i 1 i lt bound i CPROVER assume top resetn next_timeframe Care must be taken to avoid passing the property vacuously This happens if no trace actually satisfies the assumptions Mapping Variables within the Module Hierarchy Verilog modules are hierarchical The extern declarations shown above only allow reading the values of signals and registers that are in the top module In order to read values from sub modules CBMC uses structures As an example consider the following Verilog file module counter input clk input 7 0 increment reg 7 0 counter initial counter 0 always Q posedge clk counter counter increment endmodule module top input clk counter cl clk 1 counter c2 clk 2 endmodule The file has two modules a top module and a counter module The counter module is in
3. a 4294967295 11111111111111111111111111111111 State 2 file div_by_zero c line 5 function main b 1 00000000000000000000000000000001 Failed assertion division by zero file div_by_zero c line 8 function main 7 2 3 Floating Point Arithmetic Operators CBMC and SATABS support the following operators on variables of the types float double and long double Operator Description a unary minus negation atb sum a b subtraction a b multiplication a b division a lt b relation a lt b relation a gt b relation a gt b relation Note that the multiplication and division operators are very expensive with respect to the size of the equation that is passed to the SAT solver Fur thermore the equations are hard to solve for all SAT solvers known to us CBMC and SATABS also support type conversions to and from integer types Different rounding modes are currently not supported 41 7 2 4 The Comma Operator 7 2 5 Type Casts 7 2 6 Side Effects CBMC and SATABS support the comma operator a b The operands are evaluated for potential side effects The result of the operator is the right operand CBMC and SATABS have full support for arithmetic type casts As an example the expression unsigned char i for an integer i is guaranteed to be between 0 and 255 in case of an eight bit character type Properties Checked For the unsigned data types the ANSI C standard requires modulo semantics i e
4. 37 37 37 39 39 39 39 39 41 42 42 42 43 44 44 44 45 45 45 45 46 46 AT 48 48 49 49 49 7 9 3 The Relational Operators on Pointers 7 9 4 Pointer Type CastS o 7 9 5 String Constants e rg a gaki 7 9 6 Pointers to Functions 7 10 Dynamic Memory e Gil Concurrency roer ora a A A a 8 Hardware and Software Equivalence and Co Verification 8 1 Introduction a e e i e p a ro ae ee Re 8 2 A Small Tutorial s saas aai aa g k e a e ee ea 8 2 1 Verilog and ANSI C o o o 8 2 2 Counterexamples 8 2 3 Using the Bound s cs corc o o 8 2 4 Synchronizing Inputs B 200 Driving Inputs s sisin ees Ga ad ei ith ed A CBMC and SATABS License Agreement B Programming APIs B 1 Language Frontends o e e B 1 1 Scanning and Parsing oi a d bpa o B 1 2 TReP iii po ee be REPS A B13 CIYDES seeretari o A ade as B 1 4 Subtypes of typet s na e aaria padma i da ie BL LOCO se m aaa a Re RES B 1 6 Expressions ooe s s Toron dk ee po a A SE B 1 7 Subtypes of exprt o pa pd dia B 1 8 Symbols and the Symbol Table B 2 Goto Programs x 0 00 pas sed a oa B 3 Static Analysis 2 4654 4408 24 48 a eo eae AG B 3 1 A Brief Introduction to Abstract Interpretation B 3 2 Class Interfaces o B33 Exampl sc0i sas aaa ia
5. B 3 4 Using the parametrized static_analysist class B 4 Propositional LOgiC eiee s o o e 56 56 57 57 58 59 60 61 63 64 64 64 65 68 69 70 70 72 73 74 TT 77 78 82 84 84 1 Introduction 1 1 Motivation Correctness of computer systems is critical in today s information society Modern computer systems consist of both hardware and software compo nents The complexity of the software embedded in devices we use everyday has risen dramatically and correctness of such embedded software is often a bigger problem than that of the underlying hardware This is especially true of software that runs on computers controlling our transportation and communication infrastructure Examples of serious software errors are easy to find Manual inspection of complex software is infeasible and costly so tool sup port is in dire need Many tools rely on engineers to provide test vectors to uncover design flaws Formal verification on the other hand is automated and tools that implement it can check the behavior of a design for any vector of inputs Numerous tools to hunt down functional design flaws in silicon have been available for many years mainly due to the enormous cost of hardware bugs The use of such tools is wide spread In contrast the market for tools that address the need for quality software is still in its infancy Research in software quality has an enormous breadth and due to space restr
6. that no overflow exception occurs Thus overflow is not checked For signed data types an overflow exception is permitted Optionally CBMC and SATABS check for such arithmetic over flows CBMC and SATABS allow all side effect operators with their respective semantics This includes the assignment operators etc and the pre and post increment and decrement operators As an example consider the following program fragment unsigned int i j i j After the execution of the program the variable i will contain the initial value of j and j will contain the initial value of j plus one CBMC generates the following equation from the program i Jo j Jo l CBMC and SATABS perform the implicit type cast as required by the ANSI C standard As an example consider the following program fragment char c int i long 1 The value of i is converted to the type of the assignment expression c i that is char type The value of this expression is then converted to the type of the outer assignment expression that is Long int type Ordering of Evaluation The ANSI C standard allows arbitrary order ings for the evaluations of expressions and the time the side effect becomes visible The only exceptions are the operators amp amp and the ternary oper ator For the Boolean operators the standard requires strict evaluation 42 7 2 7 Function Calls from left to right and that the evaluation aborts once the
7. to facilitate localization and repair of the program In order to give a brief overview of the capabilities of SATABS we start with a series of small examples The description of the installation of the tool is postponed to Chapter 2 The issue of buffer overflows has brought wide public attention A buffer is a contiguous allocated chunk of memory represented by an array or a pointer in C Programs written in C do not provide automatic bounds checking on the buffer which means a program can accidentally or maliciously write past a buffer The following example is a perfectly valid C program in the sense that a compiler compiles it without any errors int main int buffer 10 buffer 20 10 However the write access to an address outside the allocated memory region can lead to unexpected behavior In particular such bugs can be exploited to overwrite the return address of a function thus enabling the execution of arbitrary user induced code SATABS is able to detect this problem and reports that the upper bound property of the buffer is violated SATABS is capable of checking the lower and upper bounds even for arrays with dynamic size 1 5 Hardware Software Co Verification Software programs often interact with hardware in a non trivial manner and many properties of the overall design only arise from the interplay of both components CBMC and SATABS therefore support Co Verification i e are able to r
8. 2 7 1 7 2 7 3 7 4 7 5 7 6 7 7 7 8 69 5 The Eclipse User Interface 6 Build Systems and Libraries Integration into Build Systems with GOTO CC 6 1 1 Example Building wu ftpd 6 1 2 Important Notes s vss assos o o ll A O 6 2 1 Linking libraries o 6 2 2 Abstractions for Zero Terminated Strings 7 ANSI C C Language Features Basic Datatypes o y eya nsen u o cms Operators 2466 pna ra ea e A 7 2 1 Boolean Operators aoaaa 7 2 2 Integer Arithmetic Operators 7 2 3 Floating Point Arithmetic Operators 7 2 4 The Comma Operator 04 Zo Type Casts 464 644 ae eee ee ee eS 2 6 Side Hflects 2 naa ak ae as BE OE A a E ES 2 7 Function Calls s s 260044 rosa Control Flow Statements 000 7 3 1 Conditional Statement laoa E CUEN e 8 on Whee Oe eee a dw ae a a dio BOBO on Sate he ee ue ee ee SRE Se ee a a 7 3 4 break and continue aac BHICC woos teu poi oe Oe eG AN a ae 3 00 LOOPS sora dim ee woe Se ee ae le ee ech Non Determinism nee a ee ws AA Assumptions and Assertions o e o ATAS es ea a roca Oe ee dl lo ee eee DIPMCUULCS a eng ee os Oe Gs hn Re ee ROS UNIONS sa aed e ee Ee Re eS A aw eS POINTES e e ae ic o ee Se ae EA 7 9 1 The Pointer Data Type 7 9 2 Pointer Arithmetic 31 36 36 36 36
9. 32 usecount tsk X Common launch options CBMC launch options SATABS launch options V Check assertions I Check array bounds I Check division by zero I Check pointers J Check arithmetic overflow Machine word width 16 bits 32 bits 64 bits Endianess Unspecified Little endian Big endian Main function main Restore defaults Files Selection Includes and Defines Options Figure 5 3 Specifying the command line options for SATABS After some time SATABS should report that the claim can be violated see Figure 5 7 12 SATABS provides a detailed execution trace that explains how the assertion can be violated The Trace view shown in Figure 5 7 can be used to step through the counterexample and provides the values of the variables in each step Since the program is not actually executed you can step forwards as well as backwards a feature which makes it much easier to understand the counterexample and find the error in the program 3 Therefore you cannot change the values of the variables as you can do in a debugger 33 Create manage and run configurations y E cbmcsatabs Eclipse Application ES Java Applet y Ejava Application Figure 5 4 Running SATABS in Eclipse int dummy_ open struct inode inode struct file filp 1 assert MAJOR inode gt i_rdev dummy major MOD_INC_USE_COUNT Check Selection Check by File Check by
10. 4 3 2 it is highly recommendable to use formal verificat as early as possible in your development cycle Unit testing is used in most software development projects and static verification with SAT ABS can be very well combined with this task Unit testing relies on a number test cases that yield the desired code coverage Such test cases are imlemented by a software testing engineer in terms of a test harness aka test driver and a set of function stubs Typically a slight modification to the test harness allows it to be used with SATABS Replacing the explicit input values with non deterministic inputs as explained in sections 4 3 1 and 4 3 2 guarantees that SATABS will try to achieve full path and state coverage due to the fact that predicate abstraction implicitely detects equivalence classes However it is not guaranteed that SATABS terminates in all cases Keep in mind that you must not make any assumptions about the validity of the claims if SATABS did not run to completion 26 or WN FR Oo OoN 10 11 12 13 14 15 16 Iy 18 19 20 21 22 23 24 25 26 27 28 29 30 31 extern int dummy_major int locked int init_module void locked FALSE int dummy_open struct inode inode struct file filp assert MAJOR inode gt i_rdev dummy major MOD_INC_USE_COUNT if locked return 1 locked TRUE return 0 success unsigned int dummy_read struct file filp char buf int max
11. Ng Arjuna Ekanayake and Barry Pangrle RTL C based methodology for designing and verifying a multi threaded processor In Proc of the 39th Design Automation Conference pages 123 128 ACM Press 2002 85
12. aborts with an unwinding assertion violation In order to disable this test run CBMC with the parameter no unwinding assertions For an unwinding bound of one no bug is found But already for a bound of two CBMC detects a trace that violates an assertion Without unwinding assertions CBMC does not prove the program correct but it can be helpful to find program bugs 3 1 5 A Note About Compilers and the ANSI C Library Most C programs make use of functions provided by a library instances are functions from the standard ANSI C library such as malloc or printf The verification of programs that use such functions has two requirements 1 Apppropriate header files have to be provided These header files contain declarations of the functions that are to be used 2 Appropriate definitions have to be provided Most C compilers come with header files for the ANSI C library functions We briefly discuss how to obtain install these library files Linux Linux systems that are able to compile software are usually equipped with the appropriate header files Consult the documentation of your dis tribution on how to install the compiler and the header files First try to compile some siginificant program before attempting to verify it Windows On Microsoft Windows CBMC SATABS are preconfigured to use the compiler shipped with Microsoft s Visual Studio Visual Studio Express is sufficient and is available for download for free from the Mi
13. declaration of the interface class symbolt public typet type exprt value std string name 73 Symbol Table B 2 Goto Programs std string base_name n Symbol names are unique Scopes are handled by adding prefixes to sym bols int main int argc char argv Symbol name c main 0 alice char alice 0 Symbol base alice Symbol name c main 1 alice int alice 0 Symbol base alice A symbol table is an object of class contextt This class is declared in util context h The code below shows a partial declaration of the interface class contextt public Insert the symbol bool add const symbolt amp symb Insert symb into the table and erase it New_symbol points to the newly inserted element bool move symbolt amp symbol symbolt amp new_symbol Insert symb into the table Then symb is erased bool move symbolt amp symb Return the entry of the symbol with given name const irept amp value const std string amp mame const J Goto programs are a representation of the control flow graph of a pro gram that uses only guarded goto and assume statements to model non sequential flow The main definition can be found in goto programs goto_program_template h which is a template class The concrete instan tiation of the template that is used in the framework can be found in goto programs goto program h A si
14. execution of programs for comparison with RTL is common practice 3 1 The previous work focuses on a small subset of ANSI C that is particu larly close to register transfer language Thus the designer is often re quired to rewrite the C program manually in order to comply with these constraints We extend the methodology to handle the full set of ANSI C language features This is a challenge in the presence of complex dynamic data structures and pointers that may dynamically point to multiple objects Furthermore our methodology allows arbitrary loop constructs The following Verilog module implements a 4 bit counter module top input clk reg 3 0 counter initial counter 0 always Q posedge clk counter counter l1 endmodule CBMC can take Verilog modules as the one above as additional input Similar as in co simulation the data in the Verilog modules is available to the C program by means of global variables For the example above the following C fragment shows the definition of the variable that holds the value of the counter register struct module_top unsigned int counter J extern const struct module_top top Using this definition the value of the counter register in the Verilog frag ment above can be accessed as top counter Please note that the name of the variable must match the name of the top module The C program only has a view of one state of the Verilog model The Verilog model makes a tra
15. if statements The guard is then set to the negated condition of the statement and goto target is set to bypass the conditionally executed code if this guard evaluates to true ASSUME An assumption statement that restricts viable paths reaching the in struction location to the ones that make the expression guard evaluate to true ASSERT An assertion whose guard is checked for validity when the instruction is reached 75 RETURN FUNCTION END ASSIGN SKIP OTHER A A return statement in a function Denotes the end of a function A variable assignment No operation Any operation not covered by enum goto_program_instruction_typet number of convenience functions in instructiont such as is_goto is_assume etc simplify type queries The following code segment shows a partial interface declaration of goto_program_template and instructiont template lt class codeT class guardT gt class goto_program_templatet public list of instruction type typedef std list lt class instructiont gt instructionst a reference to an instruction in the list typedef typename std list lt class instructiont gt iterator targett Sequential list of instructions representing sequential program flow instructionst instructions typedef typename std map lt const _targett unsigned gt target_numberst A map containing the unique number of each target target_ numberst target_numbers Get
16. input i this pointer is redirected to a local variable y The memory pointed to by p is then deallocated using free CBMC detects that there is an illegal execution trace in case that the input i is true void malloc unsigned void f _Bool i int xp int y p malloc sizeof int 10 if i p ky x error tf p points to y free p Notice that the standard semantics of malloc allow a return value of NULL if the allocation fails for any reason This is not part of the model that ships with CBMC and SATABS as too many programs rely on such faults not occurring Thus bugs relating to out of memory scenarios may be missed by CBMC or SATABS A simple but effective way to mend this is to replace malloc by a custom function say my_malloc that returns NULL non deterministically void malloc unsigned Bool nondet_bool void my_malloc unsigned s if nondet_bool return 0 return malloc s SATABS is able to verify concurrent multi threaded programs with shared variable communication between the threads Threads are created with the functions provided by the PTHREAD library As an example consider the following program include lt pthread h gt int g void thread void xarg g 2 54 int main pthread_t idl pthread_create amp idl NULL thread NULL this may fail g l assert g 1 SATABS also supports mutual exclusion via the pthread_m
17. instead of SMV SPIN Available from http spinroot com spin whatispin html SPIN is an explicit state model checker Since predicate abstrac tion generates abstract programs with many uninitialized vari ables explicit state model checking algorithms do not scale very well for this purpose We recommend to use one of the symbolic model checking tools mentioned above 4 Now you can execute SATABS Try running satabs on the small gt 14 examples presented in the tutorial Section 1 4 If you use the SMV model checker the only parameters you have to specify are the names of the files that contain your program 2 3 Installing the Eclipse Plugin As mentioned above we provide a graphical user interface which is re alized as a plugin to the Eclipse framework Eclipse is available at http wiw eclipse org We do not provide installation instructions for Eclipse basically you only have to download the current version and extract the files to your hard disk and assume that you have already installed the cur rent version You need version 3 2 or better the plugin does not work with version 3 1 In case you are running Windows make sure that the path containing the Visual Studio is in the PATH environment variable To install the Eclipse SATABS plugin perform following steps 1 In Eclipse open the menu Help Software Updates Find and Install 2 Select the radio button Search for new features to install 3 In the
18. int n if locked n nondet_int assume n gt 0 amp amp n lt max writing to the buffer is not modeled here return n return 1 int dummy_release struct inode inode struct file filp if locked MOD_DEC_USE_COUNT locked FALSE return 0 return 1 Figure 4 5 Sources driver c for the dummy device driver 27 1 int main int argc char argv 2 int rval size 3 struct file my_file 4 char buffer we do not model this buffer 5 struct inode inode 6 unsigned char random dummy_major register_chrdev 0 dummy 8 inode i_rdev dummy_major lt lt MINORBITS 9 init_module 10 assign arbitrary values 11 my_file f_mode nondet_uint 12 my_file f_pos nondet_uint 13 do 14 random nondet_uchar I5 assume 0 lt random amp amp random lt 3 16 switch random 17 case 1 18 rval dummy_open amp inode amp my file 19 break 20 case 2 21 count dummy read amp my file buffer BUF_SIZE 22 break 23 default 24 dummy_release amp inode amp my_file 25 26 while random locked 27 cleanup module 28 unregister_chrdev dummy_major dummy 29 return 0 30 Figure 4 6 Sources for the model checking harness 28 1 2 3 4 5 6 7 8 9 10 OCmANDOTKWNHY EH 10 11 12 13 14 15 int main int argc char argv char settings MAX SETTINGS M
19. now of type T bool has location locationt 1 returns true if the location 1 is found in the state map clear O this member function clears the state_map and sets the attribute initialized to false statet casts and duplicates the state s to make_temporary_ state statet amp s a state of type T specialization of abstract_domain baset bool merge statet a statet kb modifies the state a so to be the least upper bound of both states a and b i e U a b where a b C Da It is needed to obtain the new state after a join in the control flow of a goto program for instance The return value is true if a has been modified generate state locationt 1 sets the state associated to the location 1 to the infinmum of the abstract do main concrete class of type T which is a specialization of the abstract class abstract_domain_baset 81 B 3 3 Examples Member function Description fixedpoint computes the least fixed point of the sys goto programt tem of equations defined by the constraints goto_program on each state of each location of the goto goto_functionst program goto_program goto_functions Member Functions Before running a static analysis one has to specify the abstract domain D which will approximate a concrete domain De To do so we need to refine the abstract_domain_baset class with some specialization This instanti atable class will be passed via the
20. result is known The operands of the expression c x y must be evaluated as follows first c is evaluated If c is true x is evaluated and y otherwise As an example assume that a pointer p in the following fragment may point to either NULL or a valid active object Then an if statement as follows is valid since the evaluation must be done from left to right and if p points to NULL the result of the Boolean AND is known to be false and the evaluation aborts before p is dereferenced if p NULL amp amp p 5 For other operators such as addition no such fixed ordering exists As an example consider the following fragment int g int f g 1 if f0 g 1 In this fragment a global variable g is assigned to by a function f and just before an if statement Furthermore g is used in an addition expression in the condition of the if statement together with a call to fO If fO is evaluated first the value of g in the sum will be one while it is two if g is evaluated first The actual result is architecture dependent Properties Checked CBMC and SATABS model this problem as follows One option allows setting a fixed ordering of evaluation for all operators The other option allows checking for such artifacts CBMC and SATABS can assert that no side effect affects the value of any variable that is evaluated with equal priority This includes changes made indirectly by means of pointers In the example thi
21. return middle return 1 If you run CBMC on this function you will notice that the unwinding does not stop The built in simplifier is not able to determine a run time bound for this loop The unwinding bound has to be given as a command line argument cbmc binsearch c function binsearch unwind 5 CBMC not only verifies the array bounds note that this actually depends on the result of the right shift but also checks that enough unwinding is done i e it proves arun time bound For any lower unwinding bound there are traces that require more loop iterations Thus CBMC will produce an appropriate counterexample However CBMC can also be used for programs with unbounded loops In this case CBMC is used for bug hunting only CBMC does not attempt to find all bugs Consider the following program Bool nondet_bool Bool LOCK 0 Bool lock if nondet_bool assert LOCK LOCK 1 return 1 return 0 void unlock assert LOCK LOCK 0 int main unsigned got_lock 0 int times while times gt 0 if lock got_lock 4 x critical section 13 if got_lock 0 unlock got_lock times ES The while loop in the main function has no useful run time bound Thus the unwind parameter has to be used in order to prevent infinite unwinding However you will note that CBMC will detect that not enough unwinding is done and
22. some of the GCC header files In a later section we provide a more complex example where this is necessary 7 The Options tab see Figure 5 3 allows you to specify the command line options for SATABS This tab reflects exactly the parameters presented in Section 4 2 Make sure that checking of assertions is activated 8 Create a launch configuration by clicking the green run button 07 Figure 5 4 shows the dialog that allows you to create a new run con figuration for SATABS Name the new configuration Satabs and make sure that the satabs executable is selected not cbmc 9 Now run SATABS on usecount tsk by clicking oO again SATABS generates two claims for the driver example one for each assertion 10 Verify the first assertion by clicking on driver c with the right mouse button and selecting Check Selection As in the previous section SAT ABS takes a few seconds the verify that the assertion cannot be vi olated This is indicated by the symbol Y that is displayed in the left most column of the claim Furthermore assertions that have al ready been verified are highlighted in green in the source file 11 Repeat the same step for the second assertion This time SATABS takes a bit longer to come to a conclusion The log view in Figure 5 6 shows that SATABS needs several refinement abstraction iterations 2 In the source file view the assertion is highlighted yellow indicating that it has not been checked so far
23. table A call to parse generates the parse tree of the program The conversion into the symbol table is performed during type checking which is done by a call to the typecheck method The symbol table is a map from identifiers to the symbolt data structure include lt iostream gt include lt fstream gt include lt sstream gt include lt string gt include lt ansi c ansi_c_language h gt include lt util cmdline h gt include lt util config h gt int main int argc const charx argv Command line parse I incl_dir filet cmdlinet cmdl cmdl parse argc argv I config init 64 B 1 2 IRep if cmdl isset I config ansi_c include_paths cmdl get_values I Set language to C std auto_ptr lt languaget gt clang new_ansi_c language Symbol table contextt my _context for cmdlinet argst iterator sit cmdl args begin sit cmdl args end sit 4 Source code stream std ifstream in sit gt c_str Parse clang gt parse in std cerr Typecheck clang gt typecheck my_context sit std cerr Do some final adjustements clang gt final my_context std cerr my context show std cout return 0 The parse trees are implemented using a class called irept Its declaration and definiton can be found in the files util irep h and util irep cpp The code below shows some details o
24. the HDL implementation i e the product using the ANSI C imple mentation as a reference 4 Establishing the consistency does not require a formal specification However formal methods to verify either the hardware or software design are still desirable Related Work There have been several attempts in the past to tackle the problem In 7 a tool for verifying the combinational equivalence of RTL C and an HDL is described They translate the C code into HDL and use standard equivalence checkers to establish the equivalence The C code has to be very close to a hardware description RTL level which implies that the source and target have to be implemented in a very similar way There are also variants of C specifically for this purpose The SystemC standard defines a subset of C that can be used for synthesis 6 Further variants of ANSI C for specifying hardware are SpecC and Handel C among others The concept of verifying the equivalence of a software implementation and a synchronous transition system was introduced by Pnueli Siegel and Shtrich man 5 The C program is required to be in a very specific form since a mechanical translation is assumed In 2 Currie Hu and Rajan transform DSP assembly language into an equation for the Stanford Validity Checker However problems involving bit vector overflow are not detected and while loops are not supported 56 8 2 A Small Tutorial 8 2 1 Verilog and ANSI C The symbolic
25. the current exprt instance into a typecast The old value of the instance is appended as the single operand of the typecast i e the result is a typecast expression of the old expression to the indicated type void make_not Turns the current exprt instance into an expression with label not of the same type as the original expression The old value of the instance is appended as the operand of the not node If the original expression is of type bool the result represents the negation of the original expression with the following simplifications possibly applied e aay J e true false e false true void negate Turns the current exprt instance into a negation of itself depending on its type e For boolean expressions make_not is called 71 B 1 7 Subtypes of exprt e For integers the current instance is turned into a numeric negation expression unary of its old value Chains of unary nodes and negations of integer constants are simplified e For all other types irept make_nil is called bool sum const exprt amp expr bool mul const exprt amp expr bool subtract const exprt amp expr Expect the this object and the function argument to be constants of the same numeric type Turn the current exprt instance into a constant expression of the same type whose value edge points to the result of the sum product or difference of the two expressions If the operatio
26. window that pops up select New remote site and enter the URL http www cprover org satabs plugin lin for Linux or http www cprover org satabs plugin win for Windows or http www cprover org satabs plugin osx for MacOS X into the URL field Provide a name for the SATABS update site e g SATABS plugin see Figure 2 1 Install Update sites to visit Select update sites to visit while looking for new features Sites to include in search O A Eclipse org update site New Remote Site New Local Site New Update Site New Archived Site Name SATABS plugin URL http www inf ethz ch personal julah projects plugins Ed Import sites Export sites IV Ignore features not applicable to this environment Figure 2 1 Installing the Eclipse plugin for SATABS 4 Select the newly added update site and press Finish gt 5 2 4 Installing goto cc 5 Select the feature org feature CProver_version and clock Next Read the license thoroughly before you agree see also Chapter A and install the plugin by clocking Finish You will see a warning that the plugin is not digitally signed confirm with Install The plugin will be downloaded automatically It contains the Windows and Linux executables for SATABS Note that we do not provide the SMV executable nor the preprocessors Unless you have already added the model checker of your choice e g SMV to yo
27. 4 Double click the element usecount tsk in the Navigator to show the Eclipse view that provides the settings for this task Figure 5 2 shows the Files Selection tab which allows you to select the source files you want to run SATABS on Click on the Change button to select the directory where the source files you want to verify are located Choose the driver example described in the previous section 5 Select the source files spec c and driver c see Figure 5 2 Do not select the header files as the C preprocessor already includes the files 1 Eclipse is an open sourced integrated development environment and can be downloaded from http www eclipse org 31 CbmcSatabs usecount tsk Eclipse SDK File Edit Navigate Search Project Run Window Help Ci i al O 7 9 2 Alle e e e IF amp alee Project root directory rgw Developer cprover docjmanual driver change 2 gt y Compiler2 localhost Y DummyDriver v E shome georgw Developer cprover doc manual driver M spec c B project usecount tsk F driver h I kdew_t h F modules h Files Options Figure 5 2 Selecting source files using the Eclipse plugin for SATABS 6 The Includes and Defines tab can be used to specify include paths and add definitions This is useful if you want to replace the standard header files by a specific SATABS enhanced version Due to the GNU extensions to C SATABS fails to parse
28. AX LEN numSet getConfig settings if numSet 1 logEntry Missing config file exit 1 Figure 4 7 First few lines of main of AEON aeon c reading rc file handling missing options int getConfig char settings MAX_SETTINGS MAX_LEN char home MAX_LEW FILE fp rc file handler int numSet 0 number of settings strcpy home getenv HOME get home path strcat home aeonrc full path to rc file fp fopen home r if fp NULL return 1 no cfg ERROR while fgets settings numSet MAX_LEN 1 fp amp amp numSet lt MAX_SETTINGS numSet fclose fp return numSet Figure 4 8 Function getConfig of AEON lib_aeon c 29 OCOmAnNaOoaKWN Fe 10 11 12 13 14 15 16 17 18 19 20 21 N e oonan AWW 10 11 12 char strcpy char dest const char src int i for i 0 i destli src i if src i 0 break char strcat char dest const char src int i Jj i 0 J 05 while dest i 0 i do dest i srcl jl itt Je while src j 0 return dest Figure 4 9 Functions strcpy and strcat stubs c unsigned int nondet_uint void define SATABS_MAX BUF_LEN 65535 char getenv const char name char buffer size_t buf_size nondet_uint assume buf_size gt 1 amp amp buf_size lt SATABS_MAX_BUF_LEN 1 buffer char mal
29. BS reports VERIFICATION SUCCESSFUL after a few iterations How ever if we try to verify claim 2 SATABS reports that the property in line 14 in file spec c Figure 4 4 function unregister chrdev is violated i e the assertion is FALSE therefore the VERIFICATION FAILED Furthermore SATABS provides a detailed description of the counterexample i e the execution trace that violates the property On this trace dummy_open is called twice leading to a usecount of 2 the second call of course fails with rval 1 but the counter is increased nevertheless Then dummy_release is called to release the lock on the device Finally the loop is left and the call to unregister_chrdev results in a violation of the assertion since usecount is still 1 even though locked 0 4 3 2 Example Buffer Overflow in a Mail Transfer Agent The example presented in Section 4 3 1 is obviously a toy example and can hardly be used to convince your project manager to use static verification in your next project Even though we recommend to use formal verification and specification already in the early phases of your project the sad truth is that in most projects verification of any kind is still pushed to the very end of the development cycle Therefore this section is dedicated to the verification of legacy code However the techniques presented here can also be used for unit testing We explain how to model check AEON version 0 2a a small mail transfer ag
30. Property Check All Stop Selection Stop All Stop Session Terminate Session Reset Session Figure 5 5 Using SATABS to verify an assertion 34 int unregister_chrdev unsigned int major const char name if MOD_IN_USE ERROR assert 0 Claims SATABS usecount tsk Check Selection Check by File Check by Property Check All Stop Selection Stop All Stop Session Terminate Session Reset Session O rie Property Description expression O assertion assertion unsigned int inode i_rdev gt gt 8 unsigned intid assertion assertion Problems fe Log gt Ae CEGAR Loop Iteration 28 Running Cadence SMY smy force sift Cadence SMY produced counterexample Simulating abstract transitions of counterexample on concrete program Spurious transition Found Trace is spurious Refining transition CEGAR Loop Iteration 29 Running Cadence SMY smv force sift Figure 5 6 SATABS is checking an assertion iterations shown in log case 2 assume lock_held count dummy read my file buffer BUF_SIZE break case 3 dummy release inode my file lock_held FALSE E EU Claims SATABS usecount tsk 23 Check Selection Check by File Check by Property Check All Stop Selection Stop All Stop Session Terminate Session Reset Session E File Proper Description ression driver c assertion assertion unsigned int i
31. The CPROVER User Manual SATABS Predicate Abstraction with SAT CBMC Bounded Model Checking Contents Introduction 4 Wed Motivation socios a Rd air da Gem do cd 4 4 1 2 Bounded Model Checking with CBMC 5 1 3 Automatic Program Verification with SATABS 5 Ld AcShort Tutorial J44 4449 04 08 0 Hae ou aS 5 1 5 Hardware Software Co Verification 6 Installation 7 2 1 Installing CBM Oor i ti h da AOS T 2 2 Installing SATABS 7 2 3 Installing the Eclipse Plugin o 9 2 4 Installing GOTO 0C oooooooooo ooo o 10 CBMC Bounded Model Checking for C C 11 31 A Short Tutorial a socorro Oe gee eds 11 3 1 1 First Steps e esos Behe e a A 11 3 1 2 Verifying Modules 12 3 1 3 Loop Unwinding i enis ka dre pa y e e 12 3 1 4 Unbounded Loops oaoa e e 13 3 1 5 A Note About Compilers and the ANSI C Library 14 3 2 Command Line Interface o o o 15 Verifying C C Programs with SatAbs 16 4 1 Background mission 16 4 1 1 Abstraction and Refinement 16 AN2 Properties s koe nk Oe Shae Bo hee Shaye ee SY 18 4 2 Using the Command Line Interface o 19 43 Tutorials a e AA A 20 4 3 1 Example Reference Counting in Linux Device Drivers 21 4 3 2 Example Buffer Overflow in a Mail Transfer Agent 23 4 33 Unit Testing with SATABS 26 6 1 6
32. abel of the node is equal to nil The second method returns false if the label of the node is equal to nil 66 const irep idt amp id void id const irep_idt amp _data The first method returns a constant reference to the label of the node The second method sets the label of the node const irept amp find const irep_namet amp name const irept amp add const irep_namet amp name const irep_idt amp get const irep_namet amp name const e The first method looks for an edge with label name and returns the corresponding child If no edge with label name is found then nil_rep is returned e The second method does the same as the first except that if no edge with label name if found then a new child is created and returned e The third method does the same as the first except that the label of the child is returned instead of a reference If no edge with label name is found then an empty string is returned void set const irep_namet amp name const irep_idt amp value void set const irep_namet name const long value void set const irep_namet amp name const irept amp irep These methods create a new edge with label name If the second argument is an object of class irept then it is assigned to the new child If the second argument is a string then it is set as node label of the new child If the second argument is a number then it is converted to a string and set as node label of the ne
33. alysis baset class 79 Attributes Member function Description initialize goto program for each location of the goto_program this member function generates an associated state see generate_state below and sets the attribute initialized to true initialize goto function for each location of each function in goto functions this member func tion generates an associated state see generate_state below and sets the attribute initialized to true locationt successor 1 returns the successor location of 1 in the control flow graph CFG bool visit 1 kworking set goto_program amp goto_functions marks the state associated to the location 1 as visited then crawls through the suc cessor locations of 1 and for each pair con sisting of the location 1 and one of its suc cessor merges their corresponding states abstract_domain_baset element If the state of 1 has been changed or its successor location has not yet been seen then it puts this successor location in the working set Returns true if the state of 1 has been changed See the definition of merge for more details statet get_state 1 pure member function defined in special izations of this class see the interface of the specialization static_analysist be low generate_states amp goto_program for each location of the goto_program this member function sets the corresponding state to the infin
34. amic allocation This is described in section 7 10 CBMC and SATABS allow the use of unions to use the same storage for multiple data types Internally CBMC actually shares the literals used to represent the variables values among the union members 48 7 9 Pointers Properties Checked CBMC and SATABS do not permit the use of unions for type conversion as this would result in architecture dependent behavior Specifically if a member is read the same member must have been used for writing to the union the last time 7 9 1 The Pointer Data Type 7 9 2 Pointer Arithmetic Pointers are commonly used in ANSI C programs In particular pointers are required for call by reference and for dynamic data structures CBMC and SATABS provide extensive support for programs that use pointers according to rules set by the ANSI C standard including pointer type casts and pointer arithmetic The size of a pointer e g sizeof void is by default 4 bytes This can be adjusted using a command line option Conversion of pointers from and to integers The ANSI C standard does not provide any guarantees for the conversion of pointers into integers However CBMC and SATABS ensure that the conversion of the same ad dress into an integer yields the same integer The ANSI C standard does not guarantee that the conversion of a pointer into an integer and then back yields a valid pointer CBMC and SATABS do not allow this construct CBMC and SATABS supp
35. at ends in a state which violates the property In our example the program trace ends in the faulty array access It also shows the values the input variables must have for the bug to occur In this example argc must be one to trigger the out of bounds array access If you change the branch condition in the example to argc gt 2 the bug is fixed and CBMC will report a successful verification run In the example above we used a program that starts with a main func tion However CBMC is aimed at embedded software and these kinds of programs usually have different entry points Furthermore CBMC is also useful for verifying program modules Consider the following example called file2 c int array 10 int sum unsigned i sum sum 0 for i 0 i lt 10 i sum array i In order to set the entry point to the sum function run cbmc file2 c function sum You will note that CBMC unwinds the for loop in the program As CBMC performs Bounded Model Checking all loops have to have a finite upper run time bound in order to guarantee that all bugs are found CBMC actually checks that enough unwinding is performed As an example consider the program binsearch c int binsearch int x int a 16 signed low 0 high 16 while low lt high signed middle low high low gt gt 1 if a middle lt x high middle else if a middle gt x 12 3 1 4 Unbounded Loops low middle 1 else a middle x
36. ation_exprt predicate_exprt Convenience functions to create and ma nipulate binary expressions of type bool equality_exprt binary_relation_exprt Convenience functions to create and ma nipulate equality expressions such as a b ieee_float_equal_exprt binary_relation_exprt Convenience functions to create and manipulate equality expressions between floating point numbers index_exprt Represents an array access expression such as alij Convenience functions array and index for accessing the array ex pressions and indexing expression typecast_exprt Represents a cast to the type of the expres sion and_exprt implies_exprt or_exprt not_exprt Representations of logical operators with convenience constructors address_of_exprt Representation of a C style La address of operation Convenience function object for accessing operand dereference_exprt Representation of a C style a pointer dereference operation Convenience func tion object for accessing operand if_exprt Representation of a conditional expre sion with convenience functions cond true_case and false_case for access ing operands member_exprt Represents a some_struct some_field member access B 1 8 Symbols and the Symbol Table Symbol A symbol is an object of class symbolt This class is declared in util sym bol h The code below shows a partial
37. bmc filei c show claims CBMC will print the list of properties it checks Note that it prints a claim labeled with array argv upper bound together with the location of the faulty array access As you can see CBMC largely determines the property it needs to check itself Examples for user specified properties are given in Sec 7 5 Note that these claims need not necessarily correspond to bugs these are just potential flaws Whether one of these claims corresponds to a bug needs to be determined by further analysis The first step of this analysis is symbolic simulation which corresponds to a translation of the program into a formula The formula is then combined with the property Let s run the symbolic simulation cbmc filei c show vcc With this option CBMC performs the symbolic simulation and prints the verification conditions on the screen A verification condition needs to be proven to be valid in order to assert that the corresponding property holds Let s run the verification 1 This is realized by means of a preliminary static analysis which relies on computing a fixed point on an abstract domain 11 3 1 2 Verifying Modules 3 1 3 Loop Unwinding cbmc filel c CBMC transforms the equation you have seen before into CNF and passes it to a SAT solver It can now detect that the equation is actually not valid and thus there is a bug in the program It prints a counterexample trace i e a program trace th
38. can choose between following alternatives a Cadence SMV Available from http www kenmcmil com smv html1 Cadence SMV is a commercial model checker The free version that is available on the homepage above must not be used for commercial purposes read the license agreement thoroughly be fore you download the tool The documentation for SMV can be found in the directory where you unzip untar SMV under smv doc smv We recommend to add the smv binary lo cated in smv bin relative to the path where you unpacked it to your PATH SATABS uses Cadence SMV by default NuSMV Available from http nusmv irst itc it NuSMV is the open source alternative to SMV Installation in structions and documentation can be found on the NuSMV hom peage We recommend to use NuSMV on platforms where SMV is not available e g MAC OS X Again the NuSMV binary should be added to your PATH Use the option modelchecker nusmv to instruct SATABS to use NuSMV BOPPO Available from http www cprover org satabs BOPPO is a model checker that uses SAT solving algorithms BOPPO relies on a built in SAT solver and Quantor a solver for quantified boolean formulas that is currently bundled with BOPPO but also available separately from http fmv jku at quantor We recommend to add both tools to your PATH By default SAT ABS uses the Cadence SMV model checker Use the option modelchecker boppo when you call SATABS and want it to use BOPPO
39. ce shows a value for size less then 10 Furthermore CBMC and SATABS check that the size of arrays with dy namic size is non negative As an example consider the following fragment 47 7 7 Structures 7 8 Unions signed size nondet_int char a size For this fragment CBMC produces a counterexample in which the size of the array a is negative CBMC and SATABS allow arbitrary structure types The structures may be nested and may contain arrays The sizeof operator applied to a structure type yields the sum of the sizes of the components However the ANSI C standard allows arbitrary padding between components In order to reflect this padding the sizeof operator should return the sum of the sizes of the components plus a nondeterminis tically chosen non negative value Recursive Structures Structures may be recursive by means of pointers to the same structure As an example consider the following fragment void malloc unsigned struct nodet struct nodet xn int payload 3 int main unsigned i struct nodet xlist void x 0 struct nodet xnew_node for i 0 i lt 10 i new_node malloc sizeof new_node new_node gt n list list new_node The fragment builds a linked list with ten dynamically allocated elements Structures with Dynamic Array The last component of an ANSI C structure may be an incomplete array an array without size This incomplete array is used for dyn
40. crosoft webpage Visual Studio installs the usual set of header files together with the compiler However Visual Studio requires a large set of environment variables for the compiler to function correctly It is therefore recommended to run CBMC or SATABS from the Visual Studio Command Prompt which can be found in the menu Visual Studio Tools Note that in both cases only header files are available CBMC SATABS only come with a small set of definitions which includes functions such as malloc Detailed information about the built in definitions is in Chapter 6 14 3 2 Command Line Interface This section describes the command line interface of CBMC Like a C com piler CBMC takes the names of the c source files as arguments Additional options allow to customize the behavior of CBMC Option Description I path set include path C C D macro define preprocessor macro C C program only only show program expression function name set main function name all claims keep all claims unwind nr unwind nr times unwindset nr unwind given loop nr times show claims only show claims dimacs generate CNF in DIMACS format document subgoals generate subgoals documentation slice remove unrelated assignments no assertions ignore assertions no unwinding assertions do not generate unwinding assertions no bounds ch
41. e Example satabs show claims filel c file2 c This call instructs SATABS to generate and display a list of claims for the program that is contained in filet c and filel c If for instance you wanted to verify the 27 claim in this list you could do this by calling satabs claim 2 filel c file2 c Verifying claims separately is highly recommended for reasons of scalability SatAbs Parameters There are several command line parameters that allow you to control the behaviour of SATABS We divide them into two groups The switches in the first group have an impact on how SATABS actually behaves The remaining switches tell SATABS to provide more information and are therefore useful for debugging or inspecting the verifi cation results Calling satabs help lists the switches described below 1 This restriction is not imposed by the verification algorithms that are used by SAT ABs they also work on assembly code The reason is simply that so far no assembly language frontend is available for SATABS 19 4 3 Tutorials Parameters to modify behavior 16 32 function name claims lt n gt modelchecker name iterations The int type of the system you run your pro gram on is assumed to have a size of 16 bits and 32 bits respectively This information is neces sary in order to model overflows accurately Set the name of the main function i e the entry point of the program By default main will b
42. e considered to be the main function Verify only a single claim lt n gt denotes an index into the list of claims that SATABS prints when called with the parameter show claims It is recommended to verify claims separately in order to avoid scalability problems Specify the model checker that SATABS shall use Currently the values boppo cadence smv nusmv cmu smv spin and satmc are supported Allows you to specify the maximum number of refinement iterations By default the number of iterations is 50 Parameters to increase verbosity show claims show goto program show value sets Lists the claims see Section 4 1 2 that SATABS will try to verify for the given program Allows you to inspect the goto program that corresponds to your input program Goto programs correspond to the original input pro gram but the function calls have been inlined and control flow statements have been replaced by conditional jumps guareded gotos to ba sic blocks Show the data used for pointer analysis Structured output can be obtained from SATABS using the option xml ui Any output from SATABS e g counterexamples will then use an XML representation Similar to unit testing the model checking approach requires the user to clearly define what parts of the program should be tested This requirement has following reasons e Despite recent advances the size of the programs that model checkers can cope wi
43. e name given must match the name of the module in the Verilog file Multiple Verilog files can be given on the command line The bound parameter is not to be confused with the unwind parameter While the unwind parameter specifies the maximum unwinding depth for loops within the C program the bound parameter specifies the number of times the transition relation of the Verilog module is to be unwound For the given example the verification is successful If the first assertion is changed to assert top counter 10 and the bound on the command line is changed to 6 CBMC will produce a counterexample CBMC produces two traces One for the C program which matches the traces described earlier and a separate trace for the Verilog module The values of the registers in the Verilog module are also shown in the C trace as part of the initial state 58 8 2 3 Using the Bound Initial State bound 6 00000000000000000000000000000110 counter 0 1 2 3 4 5 6 Failed assertion assertion line 6 function main Transition system state 0 counter 0 0000 Transition system state 1 counter 1 0001 Transition system state 2 counter 2 0010 Transition system state 3 counter 3 0011 Transition system state 4 counter 4 0100 Transition system state 5 counter 5 0101 Transition system state 6 counter 6 0110 The following program is using the bound variable to check the counter value in all cycle
44. e equations are hard to solve for all SAT solvers known to us As an example consider the following program int main unsigned char a b unsigned int result 0 i a nondet_uchar b nondet uchar for i 0 i lt 8 i if b gt gt i amp 1 result a lt lt i assert result axb The program nondeterministically selects two 8 bit unsigned values and then uses shift and add to multiply them It then asserts that the result i e the sum matches a b Although the resulting SAT instance has only about 1400 variables it takes 12 minutes to solve using Chaff Properties Checked Optionally CBMC and SATABS allow checking for arithmetic overflow in case of signed operands In case of the division and the remainder operator CBMC and SATABS check for division by zero This check can be disabled using a command line option As an example the following program nondeterministically selects two un signed integers a and b It then checks that either of them is non zero and then computes the inverse of a b 40 int main unsigned int a b c a nondet_uint b nondet_uint if a gt 0 b gt 0 c 1 atb However due to arithmetic overflow when computing the sum the division can turn out to be a division by zero CBMC generates a counterexample as follows for the program above Initial State c 0 00000000000000000000000000000000 State 1 file div_by_zero c line 4 function main
45. e function It then asserts that the right function was called int global int f global 0 int g global 1 int h global 2 ptr f gt typedef int xf g h y fptr table void select unsigned x if x lt 2 table x assert global x 7 10 Dynamic Memory CBMC and SATABS allow programs that make use of dynamic memory al location e g for dynamically sized arrays or data structures such as lists or graphs As an example the following fragment allocates a variable number of integers using malloc writes one value into the last array element and then deallocates the array void malloc unsigned void f unsigned int n int xp p malloc sizeof int n p n 1 0 free p Properties Checked CBMC and SATABS check array bounds of dynam ically allocated arrays and it checks that a pointer pointing to a dynamic object is pointing to an active object i e that the object has not yet been freed and that it is not a static object Furthermore CBMC and SATABS check that an object is not freed more than once In addition to that CBMC can check that all dynamically allocated mem ory is deallocated before exiting the program i e CBMC can prove the absence of memory leaks 53 7 11 Concurrency As an example the following fragment dynamically allocates memory and stores the address of that memory in a pointer p Depending on an
46. e value 0 may fail while _ CPROVER _assume value 0 assert value lt 10 passes When using the _ CPROVER_assume statement it must be ensured that there still exists a program trace that satisfies the condition Otherwise any prop erty will pass vacuously This should be checked by replacing the property by false If no counterexample is produced the assumptions eliminate all program paths CBMC and SATABS allow arrays as defined by the ANSLC standard This includes multi dimensional arrays and dynamically sized arrays Dynamic Arrays The ANSI C standard allows arrays with non constant size as long as the array does not have static storage duration i e is a non static local variable Even though such a construct has a potentially huge state space CBMC and SATABS provide full support for arrays with non constant size The size of the Boolean equation that is generated does not depend on the array size but rather on the number of read or write accesses to the array Properties Checked CBMC and SATABS check both lower and upper bound of arrays even for arrays with dynamic size As an example consider the following fragment unsigned size nondet_uint char a size a 10 0 In this fragment an array a is defined which has a nondeterministically chosen size The code then accesses the array element with index 10 CBMC produces a counterexample with an upper array bound error on array a The tra
47. eason about a C C program together with a circuit description given in Verilog These co verification capabilities can also be applied to perform refinement proofs Software programs are often used as high level descriptions of cir cuitry While both describe the same functionality the hardware implemen tation usually contains more detail It is highly desirable to establish some form for equivalence between the two descriptions Hardware Software co verification and equivalence checking with CBMC and SATABS are described in Chapter 8 2 Installation 2 1 Installing CBMC 2 2 Installing SatAbs This Chapter provides step by step installation instructions for CBMC and SATABS CBMC and SATABS are command line tools but a graphical user interface is also available Both CBMC and SATABS require a code pre processing environment com prising of a suitable preprocessor and an a set of header files These pro grams and files usually come with a compiler In addition to that the SATABS verification system relies on an additional model checker for the abstract models It is modular and can be used with different model checking tools Currently SATABS supports the model checkers Cadence SMV NuSMV CMU SMV SPIN and BOPPO The license for CBMC and SATABS is provided in Chapter A Note that the Windows and Linux x86 binaries are also contained in the Eclipse plugin Therefore if you intend to run CBMC or SATABS exclu sively within Ec
48. eck do not do array bounds check no div by zero check do not do division by zero check no pointer check do not do pointer check bound nr number of transitions beautify greedy beautify the counterexample greedy heuristic beautify pbs beautify the counterexample PBS cvc output subgoals in CVC syntax smt output subgoals in SMT syntax outfile Filename output subgoals to given file 16 32 64 set width of machine word little endian allow little endian word byte conver sions big endian allow big endian word byte conversions show symbol table show symbol table show goto functions show goto program floatbv use genuine IEEE floating point arith metic ppc macos set MACOS PPC architecture 1386 macos set MACOS 1386 architecture 1386 linux set Linux 1386 architecture default no arch don t set up an architecture arrays uf none never turn arrays into uninterpreted functions arrays uf always always turn arrays into uninterpreted functions interpeter do concrete execution Structured output can be obtained from CBMC using the option xml ui Any output from CBMC e g counterexamples will then use an XML representation 15 4 Verifying C C Programs with SatAbs 4 1 Background This section provides background information on how SATABS opera
49. ecked for satisfiability by using an efficient SAT procedure If the formula is satisfiable a counterexample is extracted from the output of the SAT procedure If the formula is not satisfiable the program can be unwound more to determine if a longer counterexample exists In many engineering domains real time guarantees are a strict requirement An example is software embedded in automotive controllers As a conse quence the loop constructs in these types of programs often have a strict bound on the number of iterations CBMC is able to formally verify such bounds by means of unwinding assertions Once this bound is established CBMC is able to prove the absence of errors A more detailed description of how to apply CBMC to program verification is in Chapter 3 1 3 Automatic Program Verification with SatAbs gt 4 1 1 4 A Short Tutorial gt 2 In many cases lightweight properties such as array bounds do not rely on the entire program A large fraction of the program is rrelevant to the property SATABS exploits this observation and computes an abstraction of the program in order to handle large amounts of code In order to use SATABS it is not necessary to understand the abstraction refinement process For the interested reader a high level introduction is provided in Chapter 4 1 Just as CBMC SATABS attempts to build counterexamples that refute the property If such a counterexample is found it is presented to the engineer
50. ed by the number of possible execution paths and the undecidability caused by infinite models however is a major hurdle toward scalability to large software systems To overcome these difficulties a particular approach is to approximate the concrete domain De via a so called abstract domain Da i e a concrete domain of values De is replaced by an abstract domain of descriptions of values D Even if an abstract domain is not as precise as its concrete counterpart it still permits to answer some questions using static analysis for instance The framework of abstract interpretation was introduced and formalized by P Cousot and R Cousot in 1977 1 It relies on lattice theory and Ga lois connections and we refer to 2 for an in depth description of abstract interpretation Nevertheless we will now explain the basics needed to un derstand the class interface of abstract_domain_baset static_analysis and its base class static_analysis_baset TT Definitions To be a useful abstraction an abstract domain D should form a lattice The latter is a partially ordered set with the pre order relation C where every subset S C Da has a least upper bound denoted by US as well as a greatest lower bound MS The supremum T of D is the least upper bound of Da and the infimum L of Da is the greatest lower bound of Da The abstract domain is then denoted by D E 1 T U M In the context of abstract interpretation the pre order relation E can intuit
51. elements and MAX_LEN is defined to be 512 in aeon h Therefore SATABS would have to run through at least 512 iterations only to verify or reject one of the more than 700 claims Does this fact defeat the purpose of static verification SATABS provides a mechanism to bypass this problem by providing a de tection for deep loops For this purpose SATABS has to rely on a close cooperation with the abstract model checker Currently the only model 25 checker that provides loop detection is BOPPO please refer to Chapter 2 to find out where to obtain and how to install BOPPO The switch loop detection tells SATABS to activate the detection of deep loops Furthermore we tell SATABS to use BOPPO instead of the default model checker SMV satabs claim 5 modelchecker boppo loop detection I include aeon c base64 c lib_aeon c stubs stubs c This time SATABS will tell us that it found a potential buffer overflow e Violated property file stubs c line 8 column 10 function c strepy dereference failure array home upper bound dest amp home 0 i gt 512 Furthermore SATABS provides a counterexample trace that demonstrates how the buffer overflow be reproduced If you use the Eclipse plugin as described in Chapter 5 you can step through this counterexample although it might be a bit tedious to step through 512 iterations of the loop in strcpy 4 3 3 Unit Testing with SatAbs As mentioned in Section
52. ent written by Piotr Benetkiewicz freshmeat net claims that AEON is a good choice for hardened or minimalistic boxes Our first naive attempt to verify AEON using satabs show claims aeon c base64 c lib_aeon c fails due to the following reasons 1 Parsing errors On some of the architectures that SATABS supports the frontend of SATABS fails to parse the ANSI C standard header files 3 available for download on the SATABS homepage and on http freshmeat net projects aeon 23 that are distributed with your compiler of choice e g GCC makes use of extensions of the C programming language that cannot be handled by SATABS 2 Missing library functions As already stated on page 19 SATABS is unable to find the source code for library functions like strcmp getenv and strtok To solve the first of these problems we provide a set of header files that are suitable for the SATABS C parser The header files can be downloaded from the SATABS homepage Assuming that you have unpacked these files to the directory include relative to the source files of AEON you can now instruct SATABS to use these header files as follows satabs show claims I include aeon c base64 c lib_aeon c After complaining that the bodies of approximately 30 library functions are missing SATABS will provide more than 150 claims for AEON Now do you have to provide a body for all missing library functions There is no easy answer to this qu
53. ently not interested in the potential buffer overflow that may result from a call to this function Note the usage of the function nondet_int in line 17 This is an internal SATABS function that nondeterministi cally returns an arbitrary integer value The function assume line 18 2 http research microsoft com SLAM 21 1 int usecount int dummy_major extern int locked 4 int register chrdev unsigned int major const char name 5 1 6 usecount 0 T if major 0 8 return MAJOR NUMBER 9 return major 10 11 int unregister_chrdev unsigned int major const char name 12 13 if MOD_IN_USE 14 ERROR assert 0 15 16 else 17 return 0 18 Figure 4 4 Sources spec c for the un registering of device drivers tells SATABS to ignore all traces that do not adhere to the given as sumption Therefore whenever the lock is held dummy_read will re turn a value between 0 and max If the lock is not held then dummy_read returns 1 5 dummy release in driver c If the lock is held then dummy_release decreases the usecount releases the lock and returns 0 Otherwise the function returns 1 We now want to check if any valid sequence of calls of the dispatch func tions Figure 4 5 can lead to the violation of the assertion in line 14 in Figure 4 4 Obviously a call to dummy_open that is immediately followed by a call to unregister_chrdev violates the assertion Therefore we rule out this
54. estion but a viable answer would be most likely not It is necessary to understand how SATABS handles functions without bodies It simply assumes that such a function returns an arbitrary value but that no other locations than the one on the left hand side of the assignment are changed Obviously there are cases in which this assumption is unsound since the function potentially modifies all memory locations that it can somehow address Consider the first few lines of the main function of AEON in Figure 4 7 and the function getConfig in Figure 4 8 The function getConfig makes calls to strcpy strcat getenv fopen fgets and fclose It is very easy to provide an implementation for the functions from the string library string h and we do so in Figure 4 9 The implementation of getenv is not so straight forward The man page of getenv which we obtain by entering man 3 getenv in a Unix or cygwin command prompt tells us getenv searches the list of environment variable names and values us ing the global pointer char environ for a variable whose name matches the string at NAME If a variable name matches getenv returns a pointer to the associated value SATABS has no information whatsoever about the content of environ Even if SATABS could access the environment variables on your computer a suc cessful verification of AEON would then only guarantee that the claims for this program hold on your computer with a
55. f class rept class irept public typedef std vector lt irept gt subt typedef std map lt irep name_string irept gt named subt public class dt public unsigned ref_count dstring data named_subt named sub named_subt comments subt sub 65 Interface of Class irept pa protected dt x data oe Every node of any tree is an object of class irept Each node has a pointer to an object of class dt The dt objects are used for storing the actual content of nodes Objects of class dt are dynamically allocated and can be shared between nodes A reference counter mechanism is implemented to automatically free unreachable dt objects Copying a tree is an O 1 operation The field data of class dt is a hashed string representing the label of the nodes The fields named_sub comments and sub are links to childs Edges are either labeled with a string or ordered The string labeled edges are stored in the map comments if their first character is Otherwise they are stored in the map named_sub The labels of edges are unique for a given node however their ordering is not preserved The field sub is a vector of nodes that is used for storing the ordered children The order of edges of this kind is preserved during copy operands Figure B 2 Tree for the expression a b with int a char 6 virtual bool is_nil const virtual bool is_not_nil const The first method returns true if the l
56. he successor loca tion to_l Generally these guards vary de pending on the location type i e if it is a conditional assignment or declaration lo cation exprt get_guard returns the guard expression of a condi from to tional location from in the control flow of the corresponding goto program jumping to to exprt get_return _lhs if to is a location after a function call re to turns the right hand side of the expression that assigns the returned values to the cor responding variables of the caller Member Functions The base class static_analysis baset The class static_analysist and its base static_analysis_baset are used to perform the static analysis with the appropriate abstract domain Type definition Description locationt a location is an instruction list of a goto program statet as already mentioned above a location is an iterator pointing to an instruction of an instruction list of a goto program working sett set of locations that still have to be vis ited by the visit member function during the calculation of the fixed point by the fixed_point member function Type Definitions Attribute Description namespacet ns namespace that is to be considered during the analysis of the corresponding goto pro gram bool initialized specifies if for each location of the goto pro gram a state has been generated see the generate_states member function of the static an
57. ictions we focus the presentation using two criteria 1 We believe that any form of quality requires a specific guarantee in theory and practice 2 The sheer size of software designs requires techniques that are highly automated In practice quality guarantees usually do not refer to total correctness of a design as ensuring the absence of all bugs is too expensive for most applications In contrast a guarantee of the absence of specific flaws is achievable and is a good metric of quality This manual documents two programs that try to achieve formal guarantees of the absence of specific problems CBMC and SATABS The algorithms implemented by CBMC and SATABS are complementary and often one tool is able to solve a problem that the other cannot solve Both CBMC and SATABS are verification tools for ANSI C C programs They verify array bounds buffer overflows pointer safety exceptions and user specified assertions Both tools model integer arithmetic accurately and are able to reason about machine level artifacts such as integer overflow CBMC and SATABS are therefore able to detect a class of bugs that has so far gone unnoticed by many formal verification tools 1 2 Bounded Model Checking with CBMC gt 3 CBMC implements a technique called Bounded Model Checking BMC In BMC the transition relation for a complex state machine and its speci fication are jointly unwound to obtain a Boolean formula which is then ch
58. inates There fore SATABS allows to specify an upper bound for the number of iterations Note that whenever this upper bound is reached and no counterexample was found that does not necessarily mean that there is none In this case you cannot make any conclusions at all with respect to the correctness of the 17 4 1 2 Properties m m m refine m m b refine m m m m 1 2 3 level of abstraction abstract gt iterations Figure 4 3 Iterative Abstraction Refinement input program We have mentioned properties several times so far but we never explained what kind of properties SATABS can verify We cover this topic in more detail in this section While users of SATABS almost never have to be concerned about the underlying refinement abstraction algorithms under standing the classes of properties that can be verified is crucial SATABS allows the verification of following properties e Buffer overflows For each array SATABS checks whether the upper and lower bounds are violated whenever the array is accessed i e whenever the program reads from or writes to the buffer e Pointer safety SATABS searches for null pointer dereferences e Divison by zero SATABS checks whether there is a path in the program that executes a divison by zero e User specified assertions This is the most generic class of sup ported properties SATABS checks for assertion violations The user can u
59. ines like CC gcc by CC goto cc A helpful command that accomplishes this task successfully for many projects is the following for iin find name Makefile do sed e s 7 s CC t 1goto cc g i i done Some software projects come with their own libraries also the goal may be to analyze a library by itself For this purpose it is possible to use GOTO CC to link multiple model files into a library of model files An object file can then be linked against this model library For this purpose GOTO CC also supports a pure linker mode To apply this linker mode create a link to the GOTO CC binary by the name of goto 1d alternatively copy the GOTO CC binary if your system does not support links The goto 1d tool can now be used as a seamless replacement for the 1d tool present on most Unix based systems The default linker may need to be replaced by goto 1d in the build script which can be achieved in much the same way as replacing the compiler see Section 6 1 2 6 2 2 Abstractions for Zero Terminated Strings Overview CBMC SATABS and GOTO CC come with a built in library for some of the string functions defined in string h In case the input files do not provide function definitions for any of those functions the built in libraries are automatically added to improve the precision of the analysis Also the abstract function bodies contain assertions enforcing preconditions of the string functions
60. invalid sequence of calls by ensuring see line 26 in Figure 4 6 that no device is unregistered while still being locked The model checking harness that calls the dispatching functions is shown in Figure 4 6 The function main in spec c gives an example of how these functions are called First a character device dummy is registered The major number is stored in the inode structure of the device The values for the file structure are assigned non deterministically In line 14 the variable random is assigned non deterministically Subsequently in line 15 the value of random is re stricted to be 0 lt random lt 3 by a call to assume Whenever the value of random is not in this interval the corresponding execution trace is simply 22 pruned by SATABS Depending on the value of random the harness calls either dummy_open dummy_read or dummy_close Therefore if there is a sequence of calls to these three functions that leads to a violation of the assertion in unregister_chrdev in line 28 then SATABS will eventually consider it If we ask satabs to show us the verification claims show claims for our example we obtain 1 Claim 1 In File driver c Figure 4 5 line 7 assertion Grinode i_rdev gt gt 8 dummy_major 2 Claim 2 In File spec c Figure 4 4 line 14 assertion FALSE It seems obvious that claim 1 can never be violated SATABS confirms this assumption We call satabs claim 1 driver c spec c and SATA
61. ively be interpreted as a precision pre ordering which tells us if a description of values a Da is more less or equally precise than another description of values b Da B 3 2 Class Interfaces The base class abstract_domain_baset This abstract class establishes the base for the data structure needed to abstract the concrete computation domain Other classes will derive from it in order to refine the abstract domain we would like to implement Type definition Description locationt A location is an iterator pointing to an in struction of an instruction list of a goto program Type Definitions Attribute Description bool seen specifies if an instance element of the abstract domain abstract_domain_baset has already been visited by the visit member function of the abstract static analysis baset class Attributes Member function Description initialize amp ns 1 pure member function that will be defined in specializations of this class It is in tended to set the state see below statet associated to the location 1 to the infin mum L of the lattice Da abstract do main 78 Member function Description transform ns pure virtual member function that will from 1 to_1 be defined in specializations of this class It is intended to strengthen the state of this with respect to the guard that ap plies when the program transitions from the location from_1 to t
62. lipse you can skip the installation of the command line tools However you still have to install a model checking tool as described below 1 Download CBMC The binaries are available from http www cs cmu edu modelcheck cbmc The Windows and Linux x86 binaries are also contained in the Eclipse plugin for CBMC Therefore if you intend to run CBMC exclusively within Eclipse you can skip the extraction of the binaries from the archive 2 Unzip untar the archive in a directory of your choice We recommend to add this directory to your PATH The Windows version of CBMC requires the preprocessor cl exe which is part of Visual Studio The path to cl exe must be part of the PATH environment variable of your system 1 Download SATABS The binaries are available from http www cprover org satabs The Windows and Linux x86 binaries are also contained in the Eclipse plugin for SATABS Therefore if you intend to run SATABS exclusively within Eclipse you can skip the extraction of the binaries from the archive You still have to install a model checking tool as described below 2 Unzip untar the archive in a directory of your choice We recommend to add this directory to your PATH The Windows version of SATABS requires the preprocessor cl exe which is part of Visual Studio The path to cl exe must be part of the PATH environment variable of your system 3 You need to install a Model Checker in order to be able to run SATABS You
63. loc buf_size sizeof char buffer buf_size 1 0 return buffer Figure 4 10 Function getenv stubs c 30 5 The Eclipse User Interface gt 2 4 3 1 Many readers will agree that command line interfaces are a bit archaic nowa days Therefore we provide a user interface for CBMC and SATABS that integrates the command line tools seamlessly into Eclipse t We assume that you have already installed Eclipse as well as the SATABS plugin The installation of the plugin is covered by Chapter 2 and installation instruc tions for Eclipse can be found on the Eclipse homepage We assume some basic familiarity with Eclipse This section provides a short step by step introduction to the user interface and is based on the SATABS example in Section 4 3 1 The files used for the example are available for download at http www cprover org satabs examples linux_toy_driver 1 Start Eclipse 2 Create a new SATABS project as shown in Figure 5 1 Name your project DummyDriver CbmcSatabs Eclipse SDK S Edit Navigate Search Project Run Window Help ESA E Project New Project trl HUE New Task Open File ES Other Ctrl N Figure 5 1 Creating a new SATABS project 3 Once you have created a SATABS project right click on the element DummyDriver in the Eclipse Navigator or alternatively click on the File menu and select New New Task to create a new SATABS task Name the task usecount
64. mum of the abstract domain D see the member function generate_state of the static_analysist class below generate_states amp goto_ functions for each location of each function in goto functions this member function sets the corresponding state to the infinmum L of the abstract domain D insert 1 generates the state that is an element of the abstract domain at location 1 see generate_state below Member Functions The class static_analysist This class performs the static analysis with the abstract domain specified through its parameter T The latter should be an instantiatable class inher 80 iting from abstract_domain_baset Type definition Description state_mapt A state map is a location associated with its abstract domain element i e an instance of the abstract domain baset class through some specialization subtyp ing of type T Type Definitions Attribute Description state_mapt state map Maps an abstract domain element to a lo cation of a goto program see the type def inition above Attributes Member function Description statet get_state locationt 1 searches for the location 1 in the state map and returns the associated state which is an element of the abstract_domain_baset_class T operator locationt 1 searches for the location 1 in the state_map like get_state does but returns the asso ciated state which is
65. n For example the following fragment uses a void pointer to store the addresses of both char and int type objects int nondet_int void xp int i char c int main int inputl input2 z inputl nondet_int input2 nondet_int p inputl void amp i void x amp c 50 if input2 z x int p else z char x p CBMC produces the following counterexample Initial State c 0 00000000 i 0 00000000000000000000000000000000 p NULL State 1 file line 10 function main input1 0 00000000000000000000000000000000 State 2 file line 11 function main input2 1 00000000000000000000000000000001 State 3 line 13 function main Failed assertion dereference failure wrong object type line 16 function main Note that the ANSI C standard allows the conversion of pointers to struc tures to another pointer to a prefix of the same structure As an example the following program performs a valid pointer conversion typedef struct int i char j 83 typedef struct int i prefix int main S x prefix xp p prefix amp x p gt i 1 5l 7 9 5 String Constants ANSI C implements strings of characters as an array Strings are then often represented by means of a pointer pointing to the array Array bounds violations of string arrays are the leading cause of security holes in Internet software such as servers or web browsers CBMC and SATABS provide full s
66. n fails for some reason e g the types are different true is returned bool is_constant const Returns true if the expression label is constant bool is_boolean const Returns true if the label of the type is bool bool is_false const bool is_true const The first function returns true if the expression is a boolean constant with value false The second function returns true for any boolean constant that is not of value false bool is_zero const bool is_one const The first function returns true if the expression represents a zero numeric constant or if the expression represents a null pointer The second function returns true if the expression represents a numeric constant with value 1 A number of subtypes of exprt provide further convenience functions for edge access or other specialized behaviour Class Description util std_expr h transt Represents a SMV style transition sys tem with invariants invar initial state init and transition function trans true_exprt Boolean constant true expression false_exprt Boolean constant false expression 72 Class Description symbol_exprt Represents a symbol e g a variable oc currence convenience function for manip ulating identifier edge set_identifier and get_identifier predicate_exprt Convenience constructors to create expres sions of type bool binary_rel
67. ngle instruction in a goto program is rep 74 resented by the class goto_programt instructiont whose definition can be found again in goto programs goto_program_template h In the class goto _programt the control flow graph is represented as a mix ture of sequential transitions between nodes and non sequential transitions at goto nodes The sequential flow of the program is captured by the list instructions that is a field of the class goto_programt Transitions via goto statements are represented in the list targets which is a field of the class goto programt instructiont i e each goto instruction carries a list of possible jump destinations The latter list targets is a list of iter ators which point to elements of the list instructions An illustration is given in Figure B 6 goto _programt list lt list lt instructiont gt iterator gt targets non sequential program flow Figure B 6 Representation of program flow in goto_programt Instructions can have a number of different types as represented by enum goto_program_instruction_typet and can be accessed via the field type in instructiont These include GOTO Represents a non deterministic branch to the instructions given in the list targets Goto statements are guarded i e the non deterministic branch is only taken if the expression in guard evaluates to true other wise the program continues sequentially Guarded gotos can be used for example to model
68. node i_rdev gt gt 8 unsigned int d X spec c assertion assertion FALSE idummy_release filp 8my_file y nusecount 1 Variable locked 0 gt cilocked int 3 j struct inode main andom 0 s cimain O my f i s struct file uunregister_chrdev major 42 c dummy_open inode i struct inode tunregister_chrdev name 8 dummy 0 chmainiO lock held int failure assertion Figure 5 7 Stepping through a counterexample in Eclipse 35 6 Build Systems and Libraries 6 1 Integration into Build Systems with goto cc Existing software projects usually do not come in a single source file that may simply be passed to a model checker They rather come in a multitude of source files in different directories and refer to external libraries and system wide options A build system then collects the configuration options from the system and compiles the software according to build rules The most prevalent build tool on Unix based systems surely is the make utility This tool uses build rules given in a Makefile that comes with the software sources Running software verification tools on projects like these is greatly simplified by a compiler that first collects all the necessary models into a single model file GOTO CC is such a model file extractor which can seamlessly replace gcc in Makefiles The normal build system for the project may be used to build the software but the result will be a model file wi
69. nsigned char type a command line option allows to change this setting There is also support for the floating point data types float double and long double By default CBMC and SATABS use fixed point arithmetic for these types Variables of type float have by default 32 bits 16 bits integer part 16 bits fractional part variables of type double and long double have 64 bits In addition to the types defined by the ANSI C standard CBMC and SAT ABS support the following types which are Microsoft C extensions __int8 _ int16 _int32 and _int64 These types define a bit vector with the given number of bits CBMC and SATABS support all ANSI C Boolean operators on scalar vari ables a b Operator Description la negation a amp amp b and a llb or 7 2 2 Integer Arithmetic Operators CBMC and SATABS support all integer arithmetic operators on scalar vari ables a b 39 Operator Description a unary minus negation a b sum a b subtraction axb multiplication a b division ab remainder a lt lt b bit wise left shift a gt gt b bit wise right shift akb bit wise and alb bit wise or ab bit wise xor a lt b relation a lt b relation a gt b relation a gt b relation Note that the multiplication division and reminder operators are very ex pensive with respect to the size of the equation that is passed to the SAT solver Furthermore th
70. nsition once the function next_timeframe is called As CBMC performs Bounded Model Checking the number of timeframes available for analysis must be bounded SATABS has no such restriction As it is desirable to change the bound to adjust it to the available computing capacity the bound is given on the command line and not as part of the C program This makes it easy to use only one C program for arbitrary bounds The actual bound is available in the C program using the following declaration 57 8 2 2 Counterexamples extern const unsigned int bound Also note that the fragment above declares a constant variable of struct type Thus the C program can only read the trace values and is not able to modify them We will later on describe how to drive inputs of the Verilog module from within the C program As described in previous chapters assertions can be used to verify properties of the Verilog trace As an example the following program checks two values of the trace of the counter module void next_timeframe struct module_top unsigned int counter extern const struct module_top top int main next_timeframe next_timeframe assert top counter 2 next_timeframe assert top counter 3 The following CBMC command line checks these assertions with a bound of 20 hw cbmc counter c counter v module top bound 20 Note that a specific version of CBMC is used called hw cbmc The modul
71. nst_iterator map_itt for map_itt map_it other sign_lattice begin map_it other sign_lattice end map_it if the lattice map also has an entry for the term vairable if amp sign_lattice map_it gt first NULL 10 switch map_it gt second select the corresponding lattice element least upper bound of the pair 4 this other Report any change case BOTTOM sign_lattice map_it gt first other sign_lattice map_it gt first any _changes true break 20 83 case TOP do nothing break case PLUS if sign_lattice map_it gt first MINUS sign_lattice map_it gt first TOP any changes true 30 else if sign_lattice map_it gt first TOP sign_lattice map_it gt first TOP any _changes true case MINUS if sign lattice map _it gt first PLUS sign lattice map it gt first TOP 40 any changes true other cases do nothing if the lattice map has no such entry then simply add it else sign_lattice map_it gt first other sign_lattice map_it gt first 50 return any_changes Defining the initialize member function void sign_mapt initialize const namespacet amp ns locationt 1 typedef std set lt irep_idt gt const_iterator irep_itt for each local variable at location add an entry variable BOTTOM in the sign_map we initialize to the infimum of the lattice for irep_itt irep_it l gt local_va
72. on column Figure B 4 A location tree Expressions The class exprt inherits from class irept Expressions have operands and a type This is modeled with two edges labeled operands and type respectively The class exprt only adds specialized methods for accessing operands and type information to the interface of irept operands Figure B 5 A binary expression Interface of class exprt explicit exprt const irep_idt amp id Creates an exprt object with a given label and no type exprt const irep_idt id const typet amp type Creates an exprt object with a given label and type const typet amp type const typet amp type Return a reference to the type node 70 bool has_operands const Return true if the expression has operands const operandst amp operands const Return a reference to the vector of operands const exprt amp op0 const exprt kopl const exprt amp op2 const exprt amp op3 exprt amp op0 exprt amp opl exprt amp op2 exprt amp op3 gt J Return a reference to a specific operand void make_true void make_false void make _bool bool value Turn the current exprt instance into a expression of type bool with label constant and a single edge labeled value which points to a new node with label either true or false void make_typecast const typet amp type Turns
73. ort the ANSI C pointer arithmetic operators As an example consider the following fragment int array 10 xp int main array 1 1 p amp array 0 ptt assert p 1 7 9 3 The Relational Operators on Pointers The ANSI C standard allows comparing to pointers using the relational operators lt lt gt lt Properties Checked The standard restricts the use of these operators to pointers that point to the same object CBMC and SATABS enforce this restriction by means of an automatically generated assertion 49 7 9 4 Pointer Type Casts CBMC and SATABS provide full support for pointer type casts as described by the ANSI C standard As an example it is a common practice to convert a pointer to e g an integer into a pointer to void and then back Note that pointer type casts are frequently used for architecture specific type conversions e g to write an integer byte wise into a file or to send it over a socket int i char xp p char x 4 i for j 0 j lt 4 j x write xp p The result is architecture dependent In particular it exposes the endianess of the architecture CBMC and SATABS support these constructs when enabled by a command line option The command line option specifies the memory model little endian or big endian Properties Checked CBMC and SATABS check that the type of the object being accessed matches the type of the dereferencing expressio
74. parameter T of the static_analysist class In the following sections we will show what has to be done in order to implement the simple following example borrowed from 3 Sign Analysis The goal of sign analysis is to track the sign of each expression in a program To do so we define the lattice L 0 T where each element stands for subsets of the concrete domain of signed integers represents the set of strictly positive the set of strictly negative values whereas 0 represents the singleton set 0 containing only zero T denotes any value i e the expression is known to have a varying sign and L denotes no value i e the expression has an unknown value Since we want to track the sign of each expression at a program location the full lattice is give by the map lattice Vars gt 1 0 T Hence at each program location 1 we assign a table that associates a sign to each variable Refining the abstract_domain_baset class Class interface A derivation of the base class abstract_domain_baset implementing our sign analysis should have a data structure representing the sign lattice and the full map lattice we defined above Furthermore it should define the following pure virtual member functions from the base class e transform namespacet amp ns locationt from_1 locationt to_1 and e initialize namespacet amp ns locationt 1 which purposes are described above Since the merge member function of the static_analysis
75. ram can be inspected by using the show claims option of SATABS SATABS is able to check several claims at once However as soon as a violation of one claim is found it is reported A single claim can be verifie by using the claim lt n gt option of SATABS where lt n gt denotes the index of the claim in the list obtained by calling SATABS with the show claims flag Whenever a claim is violated SATABS reports a feasible path that leads to a state in which the predicate that corresponds to the violated claim evaluates to false SATABS cannot check programs that use functions that are only available in binary compiled form At the moment libarary functions for which no C source code is available have to be replaced by stubs The usage of stubs and harnesses as known from unit testing also allows to check more complicated properties like for example if function fopen is always called before fclose This technique is explained in detail in Section 4 3 4 2 Using the Command Line Interface This section explains the usage of the command line version of SATABS The executable satabs is called with a set of optional parameters which are described below followed by the file s that shall be verified Currently SATABS cannot verify the files of the program under test separately If a program comprises more than one file then all these files have to be verified in one step by providing all file names in one single command lin
76. reference to the subtype node The second method returns a reference to the vector of subtypes A number of subtypes of typet exist which allow convenient creation and manipulation of typet objects for special types Class Description util std_types h bool_typet Boolean type symbol_typet Symbol type Has edge identifier to a string value which can be accessed with get_identifier and set_identifier struct_typet Represent a struct resp union types Con union typet venience functions to access components components code_typet The type of a function procedure Con venience functions to access arguments and return_type array_typet Convenience function size to access size of the array pointer_typet Pointer type subtype stores the type of the object pointed to reference typet Represents a reference type subtype stores the type of the object referenced to bv_typet Represents a bit vector type with variable width fixed_bv_typet Represents a bit vector that encodes a fixed point number floatbv_typet Represents a bit vector that encodes a floating point number string _typet Represents a string type 69 B 1 5 B 1 6 Location The class locationt inherits from the class irept It is used to store locations in text files It adds specialized methods to manipulate the edges named file line column function functi
77. replacing the goto statement by assert FALSE The break and continue statements are replaced by equivalent goto state ments as described in the ANSI C standard CBMC and SATABS provide full support for switch statements including fall through In Bounded Model Checking the transition system is unwound up to a finite depth In case of C programs this means that for and while are unwound 45 up to a certain depth In many cases CBMC is able to automatically detect the maximum number of times a loop can be executed This includes while loops and loops with modifications to the loop counter inside the loop body even when done indirectly using a pointer However in case of loops that have no pre set bound e g loops iterating on dynamic data structures the user must specify a bound by means of the unwind command line argument CBMC will then unwind the loops up to that bound and check that the number is large enough by means of an unwinding assertion SATABS uses abstraction and thus requires no depth bound The verifica tion result is valid for any number of iterations 7 4 Non Determinism CBMC and SATABS allow to model user input by means of non deterministic choice functions The names of these functions have the prefix nondet_ The value range generated is determined by the return type of the function As an example int nondet_int returns a nondeterministically chosen value of type int The functions are b
78. responding device can be accessed through special files in the filesystem they are conventionally located in the dev directory If a process accesses a device file the kernel calls the corresponding open read and write functions of the device driver Since a driver must not be released by the kernel as long as it is used by at least one process the device driver must maintain a usage counter in mod ern Linux kernels this is done automatically however drivers that must maintain backward compatibility have to adjust this count We provide a skeleton of such a driver The driver contains following func tions 1 register chrdev in spec c Figure 4 4 Registers a character de vice In our implementation the function sets the variable usecount to zero and returns a major number for this device a constant if the user provides 0 as argument for the major number and the value specified by the user otherwise 2 unregister_chrdev in spec c Figure 4 4 Unregisters a character device This function asserts that the device is not used by any process anymore we use the macro MOD_IN_USE to check this 3 dummy_open in driver c Figure 4 5 This function increases the usecount If the device is locked by some other process dummy_open returns 1 Otherwise it locks the device for the caller 4 dummy_read in driver c Figure 4 5 This function simulates a read access to the device In fact it does nothing since we are cur r
79. riables begin irep_it l gt local_variables end irep it sign_lattice irep_it BOTTOM 10 B 3 4 Using the parametrized static_analysist class Once the abstract domain D has been implemented through the special ization sign_mapt we apply it by refining the parametrized class static_analysist lt sign_mapt gt B 4 Propositional Logic 84 Bibliography 1 C Blank H Eveking J Levihn and G Ritter Symbolic simulation techniques state of the art and applications In International Work shop on High Level Design Validation and Test pages 45 50 IEEE 2001 David W Currie Alan J Hu and Sreeranga Rajan Automatic formal verification of dsp software In Proceedings of the 37th Design Automa tion Conference DAC 2000 pages 130 135 ACM Press 2000 Kiyoharu Hamaguchi Symbolic simulation heuristics for high level de sign descriptions with uninterpreted functions In International Work shop on High Level Design Validation and Test pages 25 30 IEEE 2001 Carl Pixley Guest Editor s Introduction Formal Verification of Com mercial Integrated Circuits IEEE Design amp Test of Computers 18 4 4 5 2001 A Pnueli M Siegel and O Shtrichman The code validation tool CVT automatic verification of a compilation process Int Journal of Software Tools for Technology Transfer STTT 2 2 192 201 1998 http www systemc org Luc S m ria Andrew Seawright Renu Mehra Daniel
80. rogram However whenever the model checker m is unable to find an exe cution trace that violates the given property p we can conclude that there is no such trace in the original program either The feasibility of counterexamples is checked by symbolic simulation per formed by component in Figure 4 1 If the counterexample is indeed feasible SATABS found a bug in the original program and reports it to the user Infeasible counterxamples that originate from abstract behaviours that re sult from the omission of details and are not present in the original program are never reported to the user Instead the information is used in order to refine the abstraction such that the spurious counterexample is not part of the refined model anymore For instance the reason for the infeasibility of Y X Y in Figure 4 2 is that neither y nor ya can be reached from z Therefore the abstraction can be refined by partitioning X The refinement steps are illustrated in Figure 4 3 The first step 1 is to generate a very coarse abstraction with a very small state space This abstraction is then successively refined 2 3 until either a feasible coun terexample is found or the abstract program is detailed enough to show that there is no path that leads to a violation of the given property The problem is that this point is not necessarily reached for every input program i e it is possible that the the abstraction refinement loop never term
81. s void next_timeframe extern const unsigned int bound struct module_top unsigned int counter J extern const struct module_top top int main unsigned cycle for cycle 0 cycle lt bound cycle assert top counter cycle amp 15 next_timeframe 59 CBMC performs bounds checking and restricts the number of times that next_timeframe can be called SATABS does not require a bound and thus next_timeframe can be called aribrarily many times 8 2 4 Synchronizing Inputs The example above is trivial as there is only one possible trace The ini tial state is deterministic and there is only one possible transition so the verification problem can be solved by mere testing Consider the following Verilog module module top input clk input i reg 3 0 counter initial counter 0 always Q posedge clk if i counter counter 1 endmodule Using the C program above will fail as the Verilog module is free to use zero as value for the input i This implies that the counter is not incremented The C program has to read the value of the input i in order to be able to get the correct counter value void next_timeframe extern const unsigned int bound struct module_top unsigned int counter Bool i y extern const struct module_top top int main unsigned cycle unsigned C_counter 0 for cycle 0 cycle lt bound cycle assert top counter C_counter 15
82. s is realized by write protecting the variable g during the execution of f This rules out programs that show architecture dependent behavior due to the ordering of evaluation While such programs are still valid ANSI C programs we do not believe that programs showing architecture dependent behavior are desirable CBMC and SATABS support functions by inlining No modular approach is done CBMC and SATABS preserve the locality of the parameters and the non static local variables by renaming 43 As an example the following program calls the functions f and g twice While f uses a static variable which is not renamed between calls gO uses a true local variable which gets a new value for each call int f static int s 0 S int main assert f first call to f assert f second call to f sl leal assert assert first call to g second call to y CBMC supports Recursion by finite unwinding as done for while loops CBMC checks that enough unwinding is done by means of an unwinding assertion section 7 3 6 provides more details SATABS does not support recursion 7 3 Control Flow Statements 7 3 1 Conditional Statement 7 3 2 return CBMC and SATABS allow the use of the conditional statement as described in the ANSI C standard Properties Checked CBMC and SATABS generate a warning if the as signment operator is used as condition of a control flow statement such as if or
83. se the assert function to specify arbitrary conditions that have to hold at certain points in the program A more detailed list of properties checked by SATABS can be found in Chapter 7 All the properties described above are reachability properties They are always of the form Is there a path such that property is violated The counterexamples to such properties are always paths Users of the Eclipse plugin for SATABS can step through counterexamples in a way that is similar to debugging programs The installation of this plugin is explained in Section 2 In general properties are specified by predicates Examples for such predi cates are e i gt 0 and gt MAX 1 Such predicates are automatically in troduced by SATABS whenever an array a of size MAX is accessed with index i i e whenever a i occurs in the program They make sure that the lower and upper array bounds are not violated 18 e m 2 0 This predicate is introduced if the user has stated by means of an assertion that m has to be odd i e if assert m 2 occurs in the program SATABS calls these predicates claims Each claim is associated to a specific line of code i e a claim is violated when the predicate can become false at the corresponding program location Claims are implicitely stated within the program Currently there is no possibility to introduce claims other than by adding assertions to the program The claims for a prog
84. specific set of environment vari ables We have to assume that environ contains environment variables that have an arbitrary content of arbitrary length The content of environment variables is not only arbitrary but could be malefic since it can be modified by the user SATABS provides several functions to model user input one of which namely nondet_uint is used in the implementation of getenv in Figure 4 10 The prototype declaration of the function nondet_uint tells us that nondet_uint will return an unsigned integer Since we do not provide a body to this function SATABS will consider all possible return values In Figure 4 10 we use nondet_uint to determine the length of the the string that getenv 24 returns The subsequent call to assume guarantees that buf_size is at least one since we need to zero terminate the string and is bounded by SATABS_MAX_BUF_LEN In our first approximation of the behaviour of getenv we completely ignore the content of the string We could model the function fgets in a similar manner but before we do this let us have another look at the claims that Satabs generates if we provide the implementations from the string library and for getenv The file stubs c can be downloaded from the SATABS homepage satabs show claims I include aeon c base64 c lib_aeon c stubs stubs c Now SATABS will generate approximately 700 claims Most of these claims require that we verify that the upper and lower bo
85. ssion 5 We must be notified by email at kroening cs cmu edu after you install the program for any purpose THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FIT NESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT INDIRECT INCIDENTAL SPECIAL EXEMPLARY OR CONSEQUENTIAL DAMAGES INCLUDING BUT NOT LIMITED TO PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES LOSS OF USE DATA OR PROFITS OR BUSINESS INTERRUPTION HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY WHETHER IN CONTRACT STRICT LIABILITY OR TORT INCLUD ING NEGLIGENCE OR OTHERWISE ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE 63 Appendix B Programming APIs B 1 Language Frontends B 1 1 Scanning and Parsing IRep Trees scan S gt A K Parse Tree file c Figure B 1 From C source code file to IRep The sources of the C frontend are located in the ansi c directory It uses a standard Flex Bison setup for scanning and parsing the files The Bison grammar produces a tree representation of the input program The typechecker annotates this parse tree with types and generates a symbol table The following code illustrates how to use the frontend for parsing files and for translating them into a symbol
86. stantiated twice within the top module A reference to the register counter within the C program would be ambiguous as the two 61 module instances have separate instances of the register CBMC and SAT ABS use the following data structures for this example void next_timeframe extern const unsigned int bound struct counter unsigned char increment unsigned char counter E struct module_top struct module_counter cl c2 y extern const struct module_top top int main next_timeframe next_timeframe next_timeframe assert top cl counter 3 assert top c2 counter 6 The main function reads both counter values for cycle 3 A deeper hierarchy modules in modules is realized by using additional structure members Writing these data structures for large Verilog designs is error prone and thus CBMC can automatically generate them The declarations above are generated using the command line hw cbmc gen interface module top hierarchy v Mapping Verilog Vectors to Arrays or Scalars In Verilog a defini tion such as wire 31 0 x can be used for arithmetic e g x 10 and as array of Booleans e g x 2 ANSI C does not allow both so when mapping variables from Verilog to C the user has to choose one option for each such variable As an example the C declaration unsigned int x will allow using x in arithmetic expressions while the C declaration Bool x 32 will allo
87. t class will call a merge member function of our abstract_domain_baset class we also have to define such a merge function with the following interface e bool merge statet amp other as one can guess from the interface description above after a call to this member function the receiver this becomes the least upper bound of the abstract_domaint element other and the receiver this One possible implementation of such a derivation from the base class is showed in the following listing 82 ifndef SIGN_MAP_H_ define SIGN_MAP_H_ ttinclude expr h ttinclude irep h ttinclude static_analysis h ttinclude goto_functions h ttinclude lt map gt class sign_mapt public abstract_domain_baset 10 public typedef enum signt ZERO PLUS MINUS TOP BOTTOM sign 20 typedef std map lt irep_idt signt gt sign_latticet an abstract domain element maps a sign to a given set of expressions typically those handling integers at a given location sign_latticet sign_lattice virtual void initialize const namespacet amp ns locationt 1 30 virtual void transform const namespacet amp ns locationt from_l locationt to_l bool merge sign mapt amp other endif SIGN_MAP_H_ Defining the merge member function bool sign_mapt merge sign_mapt amp other boolean tracking the changes to the state of this bool any_changes false typedef std map lt irep_idt signt gt co
88. tes The reader who is only interested in running SATABS may skip to the next section Even for very trivial C programs it is impossible to exhaustively examine their state space which is potentially unbounded However not all details in a C program necessarily contribute to a bug so in theory it is sufficient to examine the parts of the program that are somehow related to a bug In practice many static verification tools such as lint try to achieve this goal by applying heuristics This approach comes at a cost bugs might be overlooked because the heuristics do not cover all relevant aspects of the program Therefore the conclusion that a program is correct whenever such a static verification tool is unable to find an error is invalid 4 1 1 Abstraction and Refinement counter example 0 property violated feasible property refinement feasibility Pa abstract p to be details d infeasibl analysis 6 explanation checked AN infeasible y E violation m orig a calculate ire abstract Tr check found model abstraction model m property property holds OK Figure 4 1 Counterexample Guided Abstraction Refinement Scheme A more sophisticated approach that has been very successful recently is to generate a sound abstraction of the original program see Figure 4 1 In this context soundness refers to the fact that the abstract program contains at least all relevant behavio
89. th suitable detail for verification as opposed to a flat executable program 6 1 1 Example Building wu ftpd 6 1 2 Important Notes 1 Download the sources of wu ftpd from ftp ftp wu ftpd org pub wu ftpd wu ftpd current tar gz 2 Unpack the sources by running tar xfz wu ftpd current tar gz 3 Change to the source directory by entering e g cd wu ftpd 2 6 2 4 Configure the project for verification by running configure YACC byacc CC goto cc host none none none 5 Build the project by running make This creates multiple model files in the src directory Among them is a model for the main executable ftpd 6 Run a model checker e g CBMC on the model file cbmc binary src ftpd More elaborate build or configuration scripts often make use of features of the compiler or the system library to detect configuration options automat ically e g in a configure script Replacing gcc by GOTO CC at this stage may confuse the script or detect wrong options For example missing library functions do not cause GOTO cc to throw an error only to issue a warning Because of this configuration 36 6 2 Libraries 6 2 1 Linking libraries scripts sometimes falsely assume the availability of a system function or library In the case of this or similar problems it is more advisable to configure the project using the normal routine and replacing the compiler setting manually in the generated Makefiles e g by replacing l
90. th is still restricted e Typically you want to verify your program and not the libraries that it uses these are usually assumed to be correct e SATABS cannot verify binary libraries e SATABS does not provide a model for the hardware e g hard disk input output devices the tested program runs on Since SATABS is supposed to examine the behaviour of the tested program for all 20 4 3 1 Example Reference possible inputs and outputs it is reasonable to model input and output by non deterministic choice This section provides an introduction to model checking real C programs with SATABS It starts with a small example that is based on Linux device drivers Counting in Linux Device Drivers Microsoft s SLAM toolkit has been successfully used to find bugs in Win dows device drivers SLAM automatically verifies device driver whether a device driver adheres to a specifications SLAM provides a test harness for device drivers that calls the device driver dispatch routines in a non deterministic order Therefore the model checker examines all combinations of calls Motivated by the success this approach we provide an example based on Linux device drivers Dynamically loadable modules enable the Linux Kernel to load device drivers on demand and to release them when they are not needed anymore When a device driver is registered the kernel provides a major number that is used to uniquely identify the device driver The cor
91. the successors of a given instruction void get_successors targett target targetst amp successors class instructiont public codeT code identifier of enclosing function irep_idt function location in the source file locationt location type of instruction goto _program instruction _typet type Guard statement for gotos assume assert guardT guard 76 targets for gotos targetst targets set of all predecessors sequential and gotos std set lt targett gt incoming_edges a globally unique number to identify a program location It is guaranteed to be ordered in program order within one goto_program unsigned location_number a globally unique number to identify loops unsigned loop_number true if this is a goto jumping back to an earlier instruction in the sequential program flow bool is_backwards_goto const B 3 Static Analysis B 3 1 A Brief Introduction to Abstract Interpretation The theory of abstract interpretation approximates the semantics of for mulas and in our case of programs Usually the semantics of a program is defined by a concrete domain D and the relationships within this domain changing over time by executing some instructions Given an interpreta tion I based on the domain D the question of whether a property expressed by a logical formula p holds i e De I E p can be answered via model checking The state explosion entail
92. tract string functions in the built in libraries 1 inline unsigned strlen const char s 2 3 _ CPROVER_HIDE 4 _ CPROVER_assert __CPROVER_is_zero_string s 5 strlen zero termination 6 return __CPROVER_zero_string_length s 7 Figure 6 1 The abstract definition of strlen Programs may call the standard library function strlen without linking to a model file of the system library The abstract definition of the function is automatically added to the input model Note that in this example the return value of the strlen function is not computed by looping over the string The function simply returns a value read modeled by additional program variables which may be non deterministic Disabling built in libraries and string abstraction The addition of the built in libraries or the abstraction of strings can can be disabled using the following two command line options e no library Prevents built in libraries from being added e no string abstraction Disables abstract tracking of string properties 38 7 ANSI C C Language Features 7 1 Basic Datatypes 7 2 Operators 7 2 1 Boolean Operators CBMC and SATABS support the scalar data types as defined by the ANSI C standard including Boo1 By default int is 32 bits wide short int is 16 bits wide and char is 8 bits wide Using a command line option these default widths can be changed By default char is signed Since some architectures use an u
93. uilt in i e the prototype is sufficient CBMC and SATABS will evaluate all traces arising from the possible choices 7 5 Assumptions and Assertions CBMC and SATABS check assertions as defined by the ANSI C standard The assert statement takes a Boolean condition and the tools check that this condition is true for all runs of the program The logic for assertions is the usual ANSI C expression logic In addition to the assert statement the _ CPROVER_assert statement can be used to annotate the assertions with a comment CPROVER assert x amp 1 x divisible by 2 CBMC and SATABS also provide the _CPROVER assume statement The _ CPROVER_assume statement restricts the program traces that are consid ered and allows assume guarantee reasoning Similar to an assertion an as sumption takes a Boolean expression as argument Intuitively the _CPROVER_assume statement aborts the program successfully if the condition evaluates to false If the condition evaluates to true the execution continues As an example the following function first nondeterministically picks an integer value It then assumes that the integer is in a specific range and returns the value int one_to_ten int value nondet_int CPROVER assume value gt 1 amp amp value lt 10 return value 46 7 6 Arrays Note that the assume statement is not retro active with respect to assertions E g assert value lt 10 _ CPROVER _assum
94. unds of buffers or arrays are not violated Let us look at the first few claims that SATABS generates e Claim 4 file stubs c line 8 column 10 function c strepy dereference failure array home lower bound i lt 0 dest amp home 0 e Claim 5 file stubs c line 8 column 10 function c strepy dereference failure array home upper bound dest amp home 0 i gt 512 The variable home looks familiar We encountered it line 7 of the func tion getConfig in Figure 4 8 The function getenv in combination with functions strcpy strcat or sprintf is indeed often the source for buffer overflows Therefore we try to use SATABS to check if the upper bound of the array home satabs claim 5 I include aeon c base64 c lib_aeon c stubs stubs c SATABS runs for quite a while and will eventually give up telling us that it s upper bound for abstraction refinement iterations has been exceeded This is not exactly the result we were hoping for and we could now increase the bound for iterations with help of the iterations command line switch of SATABS Before we do this let us investigate why SATABS failed to provide a sat isfying result The function strcpy contains a loop that counts from 1 to the length of the input string Predicate abstraction the mechanism SAT ABS is based on is unable to detect such loops and will therefore unroll the loop body as often as necessary The array home has MAX_LEN
95. upport for string constants usable either in initializers or as a constant As an example the following fragment contains a string array s which is initialized using a string constant Then a pointer p is initialized with the address of s and the second character of s is modified indirectly by dereferencing p The program then asserts this change to s char s abc int main char p s x write to p 1 py assert s l J y Properties Checked CBMC and SATABS perform bounds checking for string constants as well as for normal arrays In the following fragment a pointer p is pointing to a string constant of length three Then an input i is used as address of an array index operation CBMC and SATABS assert that the input i is not greater than four the string constant ends with an implicit zero character char p abc void f unsigned int i char ch results in bounds violation with i gt 4 ch p i In addition to that CBMC and SATABS check that string constants are never written into by means of pointers pointing to them 7 9 6 Pointers to Functions CBMC and SATABS allow pointers to functions and calls through such a pointer The function pointed to may depend on nondeterministically chosen inputs As an example the following fragment contains a table of three function pointers The program uses a function argument to index 52 the table and then calls th
96. ur PATH you should do so now In the Eclipse window select the menu point Windows Preferences choose the CBMC and SATABS preferences and add the corresponding PATH environment vari able A small tutorial on how to use the Eclipse plugin is provided in Chapter 5 1 Download GOTO CC The binaries are available from http www cprover org goto cc Unzip untar the archive in a directory of your choice We recommend to add this directory to your PATH The Windows version of GOTO CC requires the preprocessor cl exe which is part of Visual Studio Express The path to cl exe must be part of the PATH environment variable of your system Chapter 6 covers the integration of GOTO CC into build systems 10 3 CBMC Bounded Model Checking for C C 3 1 A Short Tutorial 3 1 1 First Steps Like a compiler CBMC takes the names of c files as command line ar guments CBMC then translates the program and merges the function definitions from the various c files just like a linker But instead of pro ducing a binary for execution CBMC performs symbolic simulation on the program As an example consider the following simple program named file1 c int puts const char xs int main int argc char argv int i if argc gt 1 puts argv 2 Of course this program is faulty as the argv array might have only one element and then the array access argv 2 is out of bounds Now run CBMC as follows c
97. urs i e bugs that are present in the original program In Figure 4 1 the component labelled a is responsible for strip ping details from the original program The number of possible behaviours increases as the number of details in the abstract program decreases Intu itively the reason is that whenever the model checking tool lacks the infor mation that is necessary to make an accurate decision on whether a branch 16 of an control flow statement can be taken or not both branches have to be considered In the abstract program a set of concrete states is subsumed by means of a single abstract state Consider Figure 4 2 The concrete states xz and zz are mapped to an abstract state X and similarly Y subsumes y and y2 However all transitions that are possible in the concrete program are also possible in the abstract model The abstract transition X Y summarizes the concrete transitions x gt y and z gt y2 and Y X corresponds to ya gt t2 The behaviour X Y X is feasible in the original program because it maps to 11 gt y2 gt 12 However Y X gt Y is feasible only in the abstract model Figure 4 2 An Example Mapping from Concrete States to Abstract States The consequence is that the model checker labeled as m in Figure 4 1 pos sibly reports a spurious counterexample We call a counterexample spurious whenever it is feasible in the current abstract model but not in the original p
98. utex_lock and pthread_mutex_unlock functions It provides an option that automati cally generates assertions for data races CBMC currently does not support concurrency 55 8 Hardware and Software Equivalence and Co Verification 8 1 Introduction A common hardware design approach employed by many companies is to first write a quick prototype that behaves like the planned circuit in a lan guage like ANSI C This program is then used for extensive testing and debugging in particular of any embedded software that will later on be shipped with the circuit An example is the hardware of a cell phone and its software After testing and debugging of the program the actual hard ware design is written using hardware description languages like VHDL or Verilog Thus there are two implementations of the same design one written in ANSI C which is written for simulation and one written in register transfer level HDL which is the actual product The ANSI C implementation is usually thoroughly tested and debugged Due to market constraints companies aim to sell the chip as soon as possi ble i e shortly after the HDL implementation is designed There is usually little time for additional debugging and testing of the HDL implementa tion Thus an automated or nearly automated way of establishing the consistency of the HDL implementation is highly desirable This motivates the verification problem we want to verify the consistency of
99. w accessing the individual bits of x using the syntax x bit The gen interface option of CBMC will generate the first variant if the vector has the same size as one of the standard integer types and the second option if not so This choice can be changed by adjusting the declaration accordingly 62 Appendix A CBMC and SATABS License Agreement C 2001 2008 Daniel Kroening Edmund Clarke Computer Systems Institute ETH Zurich Computer Science Department Carnegie Mellon University All rights reserved Redistribution and use in source and binary forms with or without modification are permitted provided that the following conditions are met 1 Redistributions of source code must retain the above copyright notice this list of conditions and the following disclaimer 2 Redistributions in binary form must reproduce the above copyright notice this list of conditions and the following disclaimer in the doc umentation and or other materials provided with the distribution 3 All advertising materials mentioning features or use of this software must display the following acknowledgement This product includes software developed by Daniel Kroening Edmund Clarke Computer Systems Institute ETH Zurich Computer Science Department Carnegie Mellon University 4 Neither the name of the University nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permi
100. w child void remove const irep_namet amp name This method looks for an edge with label name and removes it 67 B 1 3 Types Interface of class typet void move_to_sub irept kirep void move_to_named_sub const irep_namet amp name irept amp irep The first method creates a new ordered edge with a child equal to irep Then it sets rep to nil The index of the edge is equal to the size of vector sub before the call The second method does the same but for labeled edges void swap irept amp irep Exchange the content of the invoked node with the one of irep void make_nil Set the label of the node to nil and remove all outgoing edges const subt amp get_sub const named_subt amp get_named _sub const named _subt amp get_comments Return a constant reference to sub named_sub and comments respectively The class typet inherits from irept Types may have subtypes This is modeled with two edges named subtype and subtypes The class typet only add specialized methods for accessing the subtype information to the interface of rept subtype subtypes Figure B 3 A Type Tree 68 B 1 4 Subtypes of typet bool has_subtype const bool has_subtypes const The first method returns true if the a subtype node exists is not nil The second method returns true is a subtypes node exists typet amp subtype typest amp subtypes The first method returns a
101. while The return statement without value is transformed into an equivalent goto statement The target is the end of the function The return statement with value is transformed into an assignment of the value returned and the goto statement to the end of the function 44 7 3 3 goto 7 3 4 break and continue 7 3 5 switch 7 3 6 Loops Properties Checked CBMC and SATABS enforce that functions with a non void return type return a value by means of the return statement The execution of the function must not end by reaching the end of the function This is realized by inserting assert FALSE at the end of the function CBMC and SATABS report an error trace if this location is reachable As an example consider the following fragment int f int c nondet_int if c 1 return c int main int i i f i In this fragment f may exit without returning a value CBMC produces the following counterexample State 1 file no return c line 2 function f c 1 00000000000000000000000000000001 Failed assertion end of function assertion file no return c line 6 function f While only few C programs make use of goto statements CBMC and SAT ABS provide full support for such programs CBMC distinguishes forward and backward jumps In case of backward jumps the same technique used for loops is applied the loop is unwound a given number of times and then we check that this number of times is sufficient by

Download Pdf Manuals

image

Related Search

Related Contents

Manuel de maintenanceTwister/Twister-Junior  APW Wyott GGT-18H User's Manual  Bose Acoustimass 15 Speakers  1603 Series  Manual_HD1040B_18109700    8月号 - JA高岡  BDI Avion 8925  News & Information 新商品  

Copyright © All rights reserved.
Failed to retrieve file