We want our pass to add a function to the binary that performs this logic:
void traceMemory(void *Addr, uint64_t Value, bool IsLoad) {
  if (IsLoad)
    fprintf(_MemoryTraceFP, "[Read] Read value 0x%lx from address %p\n", Value, Addr);
  else
    fprintf(_MemoryTraceFP, "[Write] Wrote value 0x%lx to address %p\n", Value, Addr);
}
Notice how the calls to fprintf write to a FILE *_MemoryTraceFP. Where should this file pointer come from? We could insert fopen/fclose logic directly into traceMemory, but then we would be opening and closing our log on each memory access, which is wasteful.
The preferable solution is to define a global FILE *_MemoryTraceFP and initialize it just once. Much like C, LLVM IR only accepts constant initializers for global variables, so a global cannot be initialized with the result of a function call, i.e. we cannot write
FILE *_MemoryTraceFP = fopen(...);
at global scope. So where should we initialize the file pointer?
The appropriate place for this initialization is right at the beginning of main. Since we’re implementing a module pass, we’ll need to identify the module that defines main and add the initialization instructions to main's entry block.
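At the source level, the behaviour we are aiming for looks roughly like the sketch below. openMemoryTrace and closeMemoryTrace are hypothetical helpers that are not part of the pass; they only stand in for the instructions we want to end up at the start (and end) of main. The sketch also uses PRIx64 rather than %lx, an assumption made here so that the uint64_t prints portably.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* `FILE *_MemoryTraceFP = fopen(...);` would not compile at file scope,
 * because C requires a constant initializer here. Start with NULL instead. */
FILE *_MemoryTraceFP = NULL;

/* Hypothetical helper: the one-time initialization we want at the start of main. */
bool openMemoryTrace(const char *Path) {
  _MemoryTraceFP = fopen(Path, "w");
  return _MemoryTraceFP != NULL;
}

/* Same logic as traceMemory above; it relies on the global being initialized. */
void traceMemory(void *Addr, uint64_t Value, bool IsLoad) {
  assert(_MemoryTraceFP != NULL && "trace file must be opened first");
  if (IsLoad)
    fprintf(_MemoryTraceFP, "[Read] Read value 0x%" PRIx64 " from address %p\n", Value, Addr);
  else
    fprintf(_MemoryTraceFP, "[Write] Wrote value 0x%" PRIx64 " to address %p\n", Value, Addr);
}

/* Hypothetical helper: flush and close the log once, e.g. before main returns. */
void closeMemoryTrace(void) {
  if (_MemoryTraceFP != NULL) {
    fclose(_MemoryTraceFP);
    _MemoryTraceFP = NULL;
  }
}
```

A program instrumented this way would call openMemoryTrace once as the first thing in main, and traceMemory on every load or store after that, so the log is opened a single time rather than per access.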
main Module
All we want to do is add a file pointer to the global scope of the module that contains main, so that we can use it in our traceMemory function.
Let’s start fleshing out our pass’s run function:
llvm::PreservedAnalyses run(llvm::Module &M,
                            llvm::ModuleAnalysisManager &) {
  // Only the module that contains main gets the global file pointer.
  Function *main = M.getFunction("main");
  if (main) {
    addGlobalMemoryTraceFP(M);
    errs() << "Found main in module " << M.getName() << "\n";
    // The module was modified, so no analyses are preserved.
    return llvm::PreservedAnalyses::none();
  } else {
    errs() << "Did not find main in " << M.getName() << "\n";
    // Nothing changed, so every analysis is still valid.
    return llvm::PreservedAnalyses::all();
  }
}
And let’s define addGlobalMemoryTraceFP as follows:
const std::string FilePointerVarName = "_MemoryTraceFP";
void addGlobalMemoryTraceFP(llvm::Module &M) {
  auto &CTX = M.getContext();
  // FILE * is opaque to the pass, so we model it as a plain i8 *.
  M.getOrInsertGlobal(FilePointerVarName, PointerType::getUnqual(Type::getInt8Ty(CTX)));
  GlobalVariable *namedGlobal = M.getNamedGlobal(FilePointerVarName);
  namedGlobal->setLinkage(GlobalValue::ExternalLinkage);
}
In essence, all we do is declare an externally linked i8 *_MemoryTraceFP and add it to our module using llvm::Module::getOrInsertGlobal.
<aside>
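💡 What linkage should our global file pointer have? We’ll only need it inside traceMemory, which we can add to the same compilation module as main. traceMemory will need external linkage, because we’ll want to call it from all modules, but the file pointer itself should have internal linkage.
Why then do we use GlobalValue::ExternalLinkage and not GlobalValue::InternalLinkage? Because with InternalLinkage (I used the LLVM 13 toolchain) we get this error:
"Global is external, but doesn't have external or weak linkage!"
This looks like a bug at first, but it is more likely the IR verifier doing its job: getOrInsertGlobal creates the variable without an initializer, which makes it a declaration, and a declaration may only have external or weak linkage. Giving the global an initializer (for example a null pointer, via GlobalVariable::setInitializer) should turn it into a definition that can take internal linkage.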
</aside>
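The C analogue of the two linkage choices may make that error message easier to read. In this sketch (all names are hypothetical), an extern declaration has no storage of its own, whereas a static variable is always a definition. C has no way to spell a variable that has internal linkage but is only declared, which is the same combination the error rejects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Declaration only: no storage is reserved in this translation unit.
 * When referenced, clang lowers this shape to an `external global`.
 * It is unused below, so no other module has to define it for linking. */
extern FILE *SharedTraceFP;

/* Definition with internal linkage: visible only in this translation unit.
 * It has an initializer, so it lowers to an `internal global` set to null. */
static FILE *LocalTraceFP = NULL;

/* Hypothetical helper: point the internal global at an already-open stream. */
void setLocalTrace(FILE *FP) {
  assert(FP != NULL);
  LocalTraceFP = FP;
}

/* Hypothetical helper: report whether the internal global has been set. */
bool localTraceIsOpen(void) {
  return LocalTraceFP != NULL;
}
```

Because LocalTraceFP is static, other translation units cannot name it directly; they can only reach it through functions such as the two helpers, which is the arrangement the aside describes for traceMemory and its file pointer.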
We can see the effects of this pass:
> cat main.c
int main() { return 0; }
> clang -S -emit-llvm main.c
> opt -load-pass-plugin ./lib/libMemoryTrace.so -passes=memory-trace main.ll -S
...
@_MemoryTraceFP = external global i8*
...
Cool! Now let’s add an initialization of this global variable to our main function:
main