Step 1 - What do we want to do?

LLVM Passes work in the context of a pass manager - which is documented as a manager that “Manages a sequence of passes over a particular unit of IR.” There are four types of pass managers:

  1. FunctionPassManager
  2. ModulePassManager
  3. LoopPassManager
  4. CGSCCPassManager - which is used to analyze and manipulate the call graph’s SCCs (strongly connected components)

Since we want to integrate our pass into one of these pass managers, we need to decide what sort of pass we’re implementing.

To implement memory traces, we really just want to walk over instructions - so the most natural fit seems to be a function pass, which lets us visit every instruction of each function like so:

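llvm::PreservedAnalyses OurPass::run(llvm::Function &Func,
                                     llvm::FunctionAnalysisManager &) {
    for (auto &BB : Func) {        // every basic block in the function
        for (auto &Inst : BB) {    // every instruction in the block
            // Run custom logic on the instruction
        }
    }
    // Nothing was modified here, so all analyses remain valid
    return llvm::PreservedAnalyses::all();
}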

This is almost what we want, but not quite. We also want to do the tracing itself, which involves a call to an external function like fprintf. To use a utility function like this, which might not even be declared in the code we’re compiling, we need LLVM’s Module::getOrInsertFunction. It either retrieves the function from the module being compiled or adds a declaration with the prototype we specify.

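Adding a declaration changes the module itself, and a function pass is only supposed to modify the function it is given, not the enclosing module. Thus, to support calls to C library functions, we need to implement a module pass.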

Our pass will essentially do three things:

  1. Make sure that fprintf is declared in the compilation unit
  2. Go over every function in the module
    1. Go over every instruction in the function
      1. If the instruction is a memory access - generate a call to fprintf that traces this access
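
To make the goal concrete, here is roughly what the instrumentation amounts to if we did it by hand at the source level. This is only an illustration: traceAccess, copyTraced, and the output format are made up for this sketch. The real pass inserts the calls at the IR level, where it would also see loads and stores to local variables in unoptimized code.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the fprintf call the pass would insert.
// Kind is "load" or "store"; the output format is made up for this sketch.
static void traceAccess(std::FILE *Out, const char *Kind, const void *Addr,
                        std::size_t Bytes) {
    std::fprintf(Out, "[%s] %zu bytes at %p\n", Kind, Bytes,
                 const_cast<void *>(Addr));
}

// Copies Count ints from Src to Dst, tracing each array access by hand,
// the way the pass would if it only instrumented these loads and stores.
void copyTraced(int *Dst, const int *Src, int Count, std::FILE *Out) {
    for (int I = 0; I < Count; ++I) {
        traceAccess(Out, "load", &Src[I], sizeof(int));
        int Value = Src[I];
        traceAccess(Out, "store", &Dst[I], sizeof(int));
        Dst[I] = Value;
    }
}
```

Calling copyTraced(Dst, Src, 3, stderr) on two three-element arrays would print six trace lines: one load and one store per element.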

Step 2 - Boilerplate

LLVM passes involve a lot of boilerplate - both in the surrounding compile-and-run infrastructure and in the code itself. We’ll hide all of the ugly CMake details inside a toggle:

The LLVM Pass Implementation