-
Notifications
You must be signed in to change notification settings - Fork 1
Converting assembly to litmus
This page contains notes on how to convert the assembly output of compilers such as gcc and clang to litmus tests that herdtools7 can understand. Eventually, we'll use them to build an automatic conversion as part of act
.
We assume that the assembly is the result of compiling a memalloy witness (using the latest dev branch and -DNO_PTHREADS
, giving us the following structural information:
- the program contains n functions, each labelled P0, P1, ..., Pn;
- the program has no external dependencies (it'll call into
stdatomic.h
, but these calls usually get inlined in the assembly output); - there is an expected postcondition, which we can extract from the corresponding litmus test if needed (or the pthreads stub in the C witness).
The general process for litmus-ifying assembly seems to be this:
- Find the start and end of each thread function, and split out the threads accordingly;
- Remove metadata, spurious labels, stack manipulation, and calling-convention instructions;
- Simplify any syntactic sugar that
herd7
doesn't understand; - Replace variable references with ones
herd7
can understand, removing any offset calculating code accordingly; - Normalise syntax (capitalise/prefix register names, remove dots in labels, etc.);
- Insert header;
- Tabulate threads;
- Insert postcondition.
In act
, at time of writing, most of this (that has been implemented) happens in the Sanitiser.ml
module.
herd7
doesn't seem to model call stacks in the general case. Since (at time of writing) memalloy
models thread-local variables on the heap, the stacks don't actually get used, and so we can eliminate anything that touches them. This might need to change if the upstream behaviour changes.
To make step 5 easier, we need to tell the compiler not to generate position-independent code. This corresponds to -fno-pic
on GCC-likes.
When in -fpic
mode, compilers calculate the position of heap variables by calculating offsets from some combination of offset tables, the program counter, and suchlike. This'll make it much harder to detect heap variable uses when litmusifying the assembly.
When in -fno-pic
mode, the compiler tends to just use fixed offsets from a specific label (and, in some architectures, even mentions the variable by name when referencing it). This is much easier to work with.
Herd doesn't understand the following syntactic sugar:
- Identifiers starting with anything other than a letter, or containing
$
(mangle);
-
sub
(useadd
instead?); - Most of AT&T syntax (output Intel instead);
-
gas
-style compiler directives (things starting with.
) (remove); - Hexadecimal literals (AT&T:
0xDEADBEEF
, Intel:DEADBEEFh
) (convert to decimal); - Register names that aren't
UPPERCASE
(lowercase them); -
gas
-style comments (remove or convert to OCaml-style).
Herd doesn't (seemingly) implement stacks, so anything that manipulates the stack needs rewriting. While memalloy
witnesses don't use stack variables, the compiler (in all its wisdom) might do, so we need to handle them in a rudimentary way.
At time of writing, the right way to do this (not yet implemented) seems to be:
- Remove anything that mentions the stack pointer (eg.
%esp
in x86).memalloy
witnesses don't call anything once stripped of any test harness code. - Remove anything that involves calling or returning from a call (eg.
call
,enter
,leave
,ret
in x86?) - Convert every reference to an automatic variable into a per-thread heap location. This will probably depend a lot on how the particular compiler handles automatic variables. In GCC x86, for example, they tend to be offsets from the base pointer
%ebp
, which is easy enough to detect and convert (eg,-8(%ebp)
on thread 0 could becomet0s8
). This, of course, goes out of the window quickly if the emitted code modifies the base pointer.
The architecture notes below mention possible compilers for each architecture. These are from the perspective of an x86-64
machine running Debian GNU/Linux buster.
To install a GCC cross-compiler foo-bar-baz-gcc
on Debian, use sudo apt install gcc-foo-bar-baz
.
Known as X86
in litmus tests.
Possible compilers: clang
, gcc
- Pass
-m32
to GCC-likes to force 32-bit compilation. - Need to output Intel syntax: Herd's understanding of AT&T syntax only stretches as far as understanding
$
-prefixed numbers and size suffixes. -
herd7
doesn't understand the 8-bit sub-registers (AL
,AH
, etc.), the stack pointers (EBP
,ESP
, etc), or the 64-bit registers (RAX
,RBX
, etc). It probably doesn't understand MMX, x87, etc, but I haven't checked.
Variable references are easy to handle, so long as -fpic
is off (they become DWORD PTR varname
in Intel syntax, and just the variable name otherwise).
As of time of writing, herd7
doesn't understand CMPXCHG
, which makes translating read-modify-write programs impossible.
Herd doesn't seem to support x86-64 yet.
Possible compilers: clang
, gcc
Possible compilers: arm-linux-gnueabihf-gcc
arm-none-eabi-gcc
(but see below). Others, such as armel
compilers, might also work.
- The
arm-none-eabi
version of GCC emits cmpxchg as a call into the C library, but doesn't do this for atomic loads and stores.arm-linux-gnueabihf
doesn't, and is probably the compiler to use. - Heap variable references don't mention their variable by name, and instead become register+immediate offset references. In
gcc
, the register gets loaded (eventually, modulo register spilling) with the value of an 'anchor' label, and metadata below it stores information about which variable corresponds to which offset. We can probably reconstitute the variable references from this.
Not tested yet.
Known as PPC
in litmus tests.
Possible compilers: powerpc-linux-gnu-gcc
-
gcc
emits variable references in two steps: it'll shift half of the variable address into a register, then add the other half as part of the load or store instruction. It isn't immediately clear how these translate intoherd7
. -
herd7
doesn't understand some syntactic sugarings thatgcc
is fond of emitting. -
gcc
(and probablyclang
) seems to emit instructions thatherd7
doesn't understand, even when desugared. For example, the above idiom generateslis
(load immediate shifted), which is sugar foraddis
(add immediate shifted);herd7
doesn't understand the shifted form ofaddi
. -
herd7
requires registers to have aR
prefix;gcc
emits them without (as raw numbers).