Validating Hare’s Sort Module using Symbolic Execution
Hare is a minimalist systems programming language built around the QBE compiler infrastructure. Similar to LLVM, QBE provides an intermediate representation, which is emitted by the Hare compiler. As a little hobby project of mine, I have written a software analysis framework for the QBE intermediate representation called quebex. The following illustrate utilization of quebex to prove properties about Hare programs.
Hare’s Sort Module
The Hare programming language provides a small standard library.
This standard library contains a sort module that implements different sorting algorithms.
Among other things, it provides a function called sort::inplace that sorts a “slice of items in place” without performing any memory allocations.
A simple program sorting an array of integers, using this function, looks as follows:
use sort;
use sort::cmp;
export fn main() void = {
let ints: [6]int = [23, 1, 5, 42, 0, 7];
sort::inplace(ints, size(int), &cmp::ints);
};
Now, let’s assume we want to test the implement of sort::inplace to make sure it works as intended.
We could write several unit tests.
However, we would have to write these tests manually and don’t know when we have written “enough” tests.
Alternatively, we could go into a property testing direction: we specify a property that needs to hold for each output, automatically generate inputs, and check if the property holds for each generated input.
For a sorting algorithm implementation, the central property we aim to satisfy is that the resulting list is sorted.
That is, assuming ascending sort, each element must be greater than its predecessor.
Conveniently, Hare’s sort module already provides a sort::sorted predicate for this purpose.
To abort program execution when the property doesn’t hold, we can invoke the function as follows:
assert(sort::sorted(ints, size(int), &cmp::ints));
We now have our property, but how do we generate the inputs?
Symbolic Execution
Existing property testing frameworks like QuickCheck and Hypothesis generate inputs using random testing. In our case, these frameworks would generate random integers. Conceptually, this is very similar to fuzzing and has the same drawback: the input generation does not take the program structure into account. Ideally, we want to generate inputs in a way that they trigger code in the software under test that we haven’t tested with a previous input.
This sort of input generation can be achieved using a software testing technique called symbolic execution. Contrary to fuzzing, symbolic execution generates inputs based on the branch points in a software under test. Every time it encounters a branch condition, it checks for a satisfying assignment of this condition that triggers an execution path that hasn’t been tested previously. Under the hood, it performs formal reasoning about the branch condition using a constraint solver.
Symbolic execution presupposes a transformation of the program’s semantics to a representation that can be understood by the solver.
At this point, we come back to quebex: it transforms QBE programs to such a constraint solver representation.
Since the Hare compiler emits QBE, we can use symbolic execution via quebex to validate Hare’s sort::inplace implementation.
For this purpose, we need to write a test harness that identifies the inputs:
use sort;
use sort::cmp;
use types::c;
export @symbol("quebex_make_symbolic") fn make_symbolic(
ptr: *opaque,
nelem: size,
elsiz: size,
name: const *c::char
) void;
export fn main() void = {
let ints: [6]int = [0...];
make_symbolic(&ints, len(ints), size(int), c::nulstr("ints\0"));
sort::inplace(ints, size(int), &cmp::ints);
assert(sort::sorted(ints, size(int), &cmp::ints));
};
This is very similar to the code we seen above, expect the ints array is now passed to a make_symbolic function.
This function is intercepted by quebex and essentially causes it to formally reason about branch conditions depending on these values.
Thereby, quebex will explore all reachable program paths through the sort::inplace function and check the sort::sorted property for each.
Execution of Hare using Quebex
We now need to run this test harness with quebex.
For this purpose, we first need to emit a QBE representation of this Hare program, which also includes the utilized standard library modules.
For the details, refer to a previous note of mine, but with Hare 0.26.0 it boils down to the following compile.sh shell script:
#!/bin/sh
set -e -x
hare build -t ssa main.ha
cat $(grep -Frl 'function w $sort.cmp.ints' ~/.cache/hare) \
$(grep -Frl 'function $sort.inplace' ~/.cache/hare) \
$(grep -Frl 'function l $types.c.nulstr' ~/.cache/hare) \
$(grep -Frl 'function $rt.memcpy' ~/.cache/hare) \
main.ssa > main.std.ssa
This gives us a standalone main.std.ssa file which we can then pass to quebex-symex for symbolic execution:
$ time -p quebex-symex \
--write-tests /tmp/inplace-test-cases \
--dump-smt2 /tmp/solver-representation.smt2 \
main.std.ssa
---
Amount of paths: 4683
real 33.40
user 18.15
sys 1.44
This tells us that quebex found 4683 execution paths through the sort::inplace function for an int-array of size 6 in 33 seconds; the defined property holds on each of them 🎉.
For each execution path, it generated a test input that is available in the KTest format within /tmp/inplace-test-cases.
We could, for example, now generate unit tests from these files.
Further, it also stored the constraint solver representation in /tmp/solver-representation.smt2 this is written to disk in the SMT-LIB format.
Limitations
Since, in this case, symbolic execution performs a complete program analysis, we now have proof that the sort::inplace function from the Hare standard library does indeed return a sorted array.
However, due to the setup we have used, this proof comes with the following caveats:
- The illustrated setup only validates sorting of integers, the function could be incorrect for other types.
- The setup only checks the sort module with a fixed-size array, it may be incorrect for other array sizes.
- The QBE semantics implemented by quebex may be incorrect and may not match those of the compiler backend.