A Description of KLEE’s KTest Format
KTest is the format used by the KLEE symbolic execution engine to describe generated test inputs. The format is very simple but (as far as I know) not documented outside the KLEE source code. However, the format is also useful for other test generator tools because they can:
- Reuse the existing tooling provided by KLEE.
- Utilize inputs generated by KLEE, or run KLEE with their generated inputs.
This is an attempt to document this format, both in a human-readable and machine-readable form. Ideally, contributing to further spread and utilization of the KTest format. The formal, machine-readable description leverages Kaitai Struct to describe the format using YAML and can be used to automatically generate parsers for different programming languages.
The Format
The KTest format is a binary format, encoded in big-endian byte order. The format defines the following fields:
- Header:
- The magic ASCII value ‘KLEE’.
- The version number (see below).
- Arguments: List of command-line arguments passed to the test program.
- The number of arguments as an unsigned 32-bit integer.
- The specified amount of arguments, each represented as a string.
- symArgvs: Don’t know what this does,
ktest-tooldoesn’t use it. - symArgvLen: Don’t know what this does,
ktest-tooldoesn’t use it. - Objects: List of test inputs.
- The amount of objects as an unsigned 32-bit integer.
- The specified amount of objects (see below).
Objects are encoded as follows:
- The name of the object, represented as a string.
- The size of the object’s value in bytes, encoded as a 32-bit unsigned integer.
- The value of the object (i.e., the specified amount of raw bytes).
Strings are encoded as follows:
- The size of the string in bytes.
- The given amount of bytes, each representing an ASCII character.
Defined Versions
As apparent from the prior description, different versions of the format exist:
- Version 1: This was probably the original version of the format.
- Version 2: Added the
symArgvsandsymArgvLenfields to the binary format. - Version 3: Changed the format’s header from
BOUT\ntoKTest(see commit 489f3d4).
The initial check-in of the KLEE source code already used version 2 of the format.
Appendix: Formal Description
The quebex symbolic execution engine provides a Haskell implementation of the format in Data.KTest.
The following YAML formally describes the format using Kaitai Struct:
Description of the KTest format using Kaitai Struct.
meta:
id: ktest
file-extension: ktest
license: Unlicense
ks-version: 0.9
endian: be
seq:
- id: header
type: header
- id: arguments
type: arguments
- id: sym_argvs
type: u4
if: header.version >= 2
- id: sym_argvs_len
type: u4
if: header.version >= 2
- id: objects
type: objects
types:
header:
seq:
- id: magic
# TODO: Also support 'BOUT\n` (see above).
contents: 'KTEST'
- id: version
type: u4
valid:
min: 0
max: 3
arguments:
seq:
- id: num_args
type: u4
- id: value
type: str_with_len
repeat: expr
repeat-expr: num_args
objects:
seq:
- id: amount
type: u4
- id: value
type: object
repeat: expr
repeat-expr: amount
object:
seq:
- id: name
type: str_with_len
- id: bytes
type: bytes_with_len
bytes_with_len:
seq:
- id: len
type: u4
- id: value
size: len
str_with_len:
seq:
- id: len
type: u4
- id: value
type: str
encoding: ASCII
size: len