We want our pass to add a function to the binary that performs this logic:

void traceMemory(void *Addr, uint64_t Value, bool IsLoad) {
    if (IsLoad)
        fprintf(_MemoryTraceFP, "[Read] Read value 0x%lx from address %p\n", Value, Addr);
    else
        fprintf(_MemoryTraceFP, "[Write] Wrote value 0x%lx to address %p\n", Value, Addr);
}

Notice how the calls to fprintf write to a FILE *_MemoryTraceFP - where should this file pointer come from? We could insert fopen/fclose logic directly into traceMemory - but then we would be opening and closing our log on each memory access, which is wasteful.

The preferable solution is to define a global FILE *_MemoryTraceFP - and initialize it just once. Much like C, LLVM IR requires the initializer of a global variable to be a constant, so it cannot be the result of a function call, i.e. we cannot define

FILE *_MemoryTraceFP = fopen(...);

on a global scope. So where should we initialize the file pointer?

The appropriate place for this initialization is right at the beginning of main. Since we’re implementing a module pass, we’ll need to identify the module that contains main - and add the initialization instructions to main's entry block.
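To make the target concrete, here is a minimal C sketch of how the instrumented program should behave at the source level. The helper names initMemoryTrace and closeMemoryTrace, and the idea of passing the log path as a parameter, are assumptions made for illustration only; the pass itself would emit the equivalent IR rather than C.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Global file pointer: a constant initializer (NULL) is allowed,
   a call to fopen() at global scope is not. */
FILE *_MemoryTraceFP = NULL;

/* Hypothetical helper: stands in for the code we want at the very
   beginning of main. Returns 0 on success, -1 if the log cannot be opened. */
int initMemoryTrace(const char *Path) {
    _MemoryTraceFP = fopen(Path, "w");
    return _MemoryTraceFP ? 0 : -1;
}

/* Same logic as the traceMemory shown earlier; PRIx64 is used so the
   format matches uint64_t on every platform. */
void traceMemory(void *Addr, uint64_t Value, bool IsLoad) {
    if (IsLoad)
        fprintf(_MemoryTraceFP, "[Read] Read value 0x%" PRIx64 " from address %p\n", Value, Addr);
    else
        fprintf(_MemoryTraceFP, "[Write] Wrote value 0x%" PRIx64 " to address %p\n", Value, Addr);
}

/* Hypothetical helper: flushes and closes the log once, at program exit. */
void closeMemoryTrace(void) {
    if (_MemoryTraceFP) {
        fclose(_MemoryTraceFP);
        _MemoryTraceFP = NULL;
    }
}
```

A program would call initMemoryTrace once as the first statement of main, call traceMemory on every access, and call closeMemoryTrace before returning, so the log is opened and closed exactly once.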

Step One - Add A Global File Pointer to Our main Module

All we want to do in this step is add a file pointer to the global scope of the module that contains main - so that we can use it in our traceMemory function.

Let’s start fleshing out our pass’s run function:

llvm::PreservedAnalyses run(llvm::Module &M,
                            llvm::ModuleAnalysisManager &) {
    // Only a module that contains main gets the global file pointer.
    Function *main = M.getFunction("main");
    if (main) {
        addGlobalMemoryTraceFP(M);
        errs() << "Found main in module " << M.getName() << "\n";
        // We modified the module, so no analyses are preserved.
        return llvm::PreservedAnalyses::none();
    } else {
        errs() << "Did not find main in " << M.getName() << "\n";
        return llvm::PreservedAnalyses::all();
    }
}

And let’s define addGlobalMemoryTraceFP as follows:

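const std::string FilePointerVarName = "_MemoryTraceFP";

void addGlobalMemoryTraceFP(llvm::Module &M) {
    auto &CTX = M.getContext();

    // FILE is opaque to us, so we model the pointer as an i8*.
    // If no global with this name exists yet, getOrInsertGlobal creates one.
    M.getOrInsertGlobal(FilePointerVarName, PointerType::getUnqual(Type::getInt8Ty(CTX)));

    // The global has no initializer, so it is a declaration and needs external linkage.
    GlobalVariable *namedGlobal = M.getNamedGlobal(FilePointerVarName);
    namedGlobal->setLinkage(GlobalValue::ExternalLinkage);
}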

In essence, all we do is declare an externally-linked i8 *_MemoryTraceFP and add it to our module using llvm::Module::getOrInsertGlobal.

<aside> 💡 What linkage should our global file pointer have? We’ll only need it inside traceMemory - which we can add to the same compilation module as main.

traceMemory will need external linkage - because we’ll want to call it from all modules - but the file pointer itself should have internal linkage.

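Why then do we use GlobalValue::ExternalLinkage and not GlobalValue::InternalLinkage? Because with InternalLinkage the IR verifier rejects our module (I used the LLVM 13 toolchain) with this error:

“Global is external, but doesn't have external or weak linkage!”

This is the verifier doing its job rather than a bug: our global has no initializer, which makes it a declaration, and a declaration must have external or extern_weak linkage. Giving the global an initializer (a null pointer constant, for example) turns it into a definition, which may then have internal linkage.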

</aside>
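One way to read the verifier message above is through the C distinction between declaring and defining a global. The sketch below is only an analogy, and the names ExternalTraceFP, InternalTraceFP and internalTraceFPIsUnset are made up for illustration. The IR mentioned in the comments is what clang typically emits, with the exact pointer type depending on the LLVM version.

```c
#include <stdio.h>

/* Declaration only: no storage is reserved here, so another translation
   unit has to define the symbol. When such a declaration is used, clang
   lowers it to an `external global` with no initializer. */
extern FILE *ExternalTraceFP;

/* Definition with internal linkage: C zero-initializes it, which clang
   lowers to an `internal global` whose initializer is null. */
static FILE *InternalTraceFP;

/* Returns 1 while the internal pointer still holds its implicit NULL. */
int internalTraceFPIsUnset(void) {
    return InternalTraceFP == NULL;
}
```

A global without an initializer behaves like the extern line, and a symbol that lives elsewhere cannot also be private to this module, which is why internal linkage only becomes valid once the global has an initializer.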

We can see the effects of this pass:

> cat main.c
int main() { return 0; }

> clang -S -emit-llvm main.c

> opt -load-pass-plugin ./lib/libMemoryTrace.so -passes=memory-trace main.ll -S
...

@_MemoryTraceFP = external global i8*

...

Cool! Now let’s add an initialization of this global variable to our main function:

Step Two - Initializing the Global File Pointer in main