LLVM Pass System

The LLVM Pass framework is the infrastructure that enables code analysis and transformation. Understanding the pass system is essential for implementing and using obfuscation techniques in Obfussor.

What is an LLVM Pass?

An LLVM Pass is a unit of compilation work that performs analysis or transformation on LLVM IR. Passes are:

  • Modular: Self-contained units of functionality
  • Composable: Can be combined in sequences
  • Reusable: Can be applied to different modules
  • Dependency-aware: Can declare dependencies on other passes

Pass Types

1. Module Pass

Operates on entire modules (all functions and globals):

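struct MyModulePass : public ModulePass {
  static char ID;
  MyModulePass() : ModulePass(ID) {}

  bool runOnModule(Module &M) override {
    bool Changed = false;
    // Process all functions in the module
    for (Function &F : M) {
      // Process function; set Changed = true if the IR is modified
    }
    return Changed; // Report true only if the module was modified
  }
};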

Use Cases:

  • Inter-procedural analysis
  • Global transformations
  • Call graph construction

2. Function Pass

Operates on individual functions:

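struct MyFunctionPass : public FunctionPass {
  static char ID;
  MyFunctionPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    bool Changed = false;
    // Process all basic blocks
    for (BasicBlock &BB : F) {
      // Process basic block; set Changed = true if the IR is modified
    }
    return Changed; // Report true only if the function was modified
  }
};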

Use Cases:

  • Intra-procedural optimizations
  • Function-level obfuscation
  • Local analysis

3. BasicBlock Pass

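Operates on individual basic blocks. BasicBlockPass belongs to the legacy pass manager and was removed in LLVM 10; on newer releases, iterate over the basic blocks inside a function pass instead:

struct MyBasicBlockPass : public BasicBlockPass {
  static char ID;
  MyBasicBlockPass() : BasicBlockPass(ID) {}

  bool runOnBasicBlock(BasicBlock &BB) override {
    bool Changed = false;
    for (Instruction &I : BB) {
      // Process instruction; set Changed = true if the IR is modified
    }
    return Changed; // Report true only if the basic block was modified
  }
};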

Use Cases:

  • Local optimizations
  • Instruction-level transformations

4. Loop Pass

Operates on loop structures:

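struct MyLoopPass : public LoopPass {
  static char ID;
  MyLoopPass() : LoopPass(ID) {}

  bool runOnLoop(Loop *L, LPPassManager &LPM) override {
    bool Changed = false;
    // Process loop
    for (BasicBlock *BB : L->blocks()) {
      // Process blocks in loop; set Changed = true if the IR is modified
    }
    return Changed; // Report true only if the loop was modified
  }
};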

Use Cases:

  • Loop optimizations
  • Loop obfuscation
  • Loop vectorization

Pass Manager

The Pass Manager orchestrates pass execution:

Legacy Pass Manager (default before LLVM 13)

// Requires "llvm/IR/LegacyPassManager.h"; M is the llvm::Module to transform
legacy::PassManager PM;
PM.add(createPromoteMemoryToRegisterPass());
PM.add(new MyCustomPass());
PM.run(M);

New Pass Manager (default since LLVM 13)

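// Analysis managers must be created and registered before any pass runs
LoopAnalysisManager LAM;
FunctionAnalysisManager FAM;
CGSCCAnalysisManager CGAM;
ModuleAnalysisManager MAM;

PassBuilder PB;
PB.registerModuleAnalyses(MAM);
PB.registerCGSCCAnalyses(CGAM);
PB.registerFunctionAnalyses(FAM);
PB.registerLoopAnalyses(LAM);
PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

// Add function passes
FunctionPassManager FPM;
FPM.addPass(SimplifyCFGPass());
FPM.addPass(InstCombinePass());

// Add function pass manager to module pass manager
ModulePassManager MPM;
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));

// Run passes on the llvm::Module M
MPM.run(M, MAM);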

Pass Dependencies

Passes can declare dependencies on other passes:

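void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
  // This pass requires the dominator tree
  AU.addRequired<DominatorTreeWrapperPass>();

  // Use this if the pass may change instructions but never adds,
  // removes, or redirects edges between basic blocks
  AU.setPreservesCFG();

  // Use this instead only if the pass modifies nothing at all
  // (for example, a pure analysis); it subsumes setPreservesCFG()
  // AU.setPreservesAll();
}

// Using the analysis
bool MyPass::runOnFunction(Function &F) {
  DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
  // Use dominator tree...
  return false; // Return true only if F was modified
}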
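
The interplay between required and preserved analyses can be modeled outside LLVM. The following is a minimal sketch in plain C++, not LLVM's API: ToyAnalysisCache and its member names are hypothetical, and analysis results are reduced to integers. It shows the idea behind getAnalysis (compute once, then reuse) and behind preservation (drop whatever a transformation did not declare as preserved).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy model (not LLVM API): caches analysis results by name and drops
// the ones a transformation did not declare as preserved.
class ToyAnalysisCache {
public:
  using Compute = std::function<int()>;

  void registerAnalysis(const std::string &Name, Compute Fn) {
    Computes[Name] = std::move(Fn);
  }

  // Loosely analogous to getAnalysis<T>(): compute on first use, then reuse.
  int get(const std::string &Name) {
    auto Cached = Results.find(Name);
    if (Cached != Results.end())
      return Cached->second;
    auto It = Computes.find(Name);
    assert(It != Computes.end() && "analysis was never registered");
    ++Recomputations;
    int Value = It->second();
    Results[Name] = Value;
    return Value;
  }

  // Loosely analogous to what happens after a transformation runs:
  // keep only the cached results listed as preserved.
  void invalidateAllExcept(const std::set<std::string> &Preserved) {
    for (auto It = Results.begin(); It != Results.end();) {
      if (Preserved.count(It->first))
        ++It;
      else
        It = Results.erase(It);
    }
  }

  int recomputations() const { return Recomputations; }

private:
  std::map<std::string, Compute> Computes;
  std::map<std::string, int> Results;
  int Recomputations = 0;
};
```

A pass that preserves an analysis saves the recomputation that the next consumer would otherwise trigger, which is why accurate preservation declarations matter for compile time.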

Common Analysis Passes

Dominator Tree

Computes dominance relationships:

DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();

if (DT.dominates(BB1, BB2)) {
  // BB1 dominates BB2
}

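// The entry block has no immediate dominator, so getIDom() can return null
// (getNode(BB) itself is null for blocks unreachable from the entry)
BasicBlock *IDom = nullptr;
if (DomTreeNode *IDomNode = DT.getNode(BB)->getIDom())
  IDom = IDomNode->getBlock();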
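
To make the dominance relation concrete, here is a small sketch in plain C++ that computes dominator sets for a toy CFG with the classic iterative data-flow formulation. It is not how LLVM's DominatorTree is implemented; computeDominators and dominates are hypothetical names, blocks are plain integers, block 0 is taken to be the entry, and every block is assumed to be reachable from it.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy CFG: block B has the successor list Succs[B]; block 0 is the entry.
// Returns Dom, where Dom[B] is the set of blocks that dominate B.
// Assumes every block is reachable from the entry.
std::vector<std::set<int>>
computeDominators(const std::vector<std::vector<int>> &Succs) {
  const int N = static_cast<int>(Succs.size());
  assert(N > 0 && "CFG needs an entry block");

  // Build predecessor lists from the successor lists
  std::vector<std::vector<int>> Preds(N);
  for (int B = 0; B < N; ++B)
    for (int S : Succs[B])
      Preds[S].push_back(B);

  std::set<int> All;
  for (int B = 0; B < N; ++B)
    All.insert(B);

  // Start from "everything dominates everything" and shrink to a fixed point
  std::vector<std::set<int>> Dom(N, All);
  Dom[0] = {0};

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int B = 1; B < N; ++B) {
      // Dom[B] = {B} united with the intersection of Dom[P] over all preds P
      std::set<int> New = All;
      for (int P : Preds[B]) {
        std::set<int> Common;
        for (int D : New)
          if (Dom[P].count(D))
            Common.insert(D);
        New = Common;
      }
      New.insert(B);
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// True if block A dominates block B
bool dominates(const std::vector<std::set<int>> &Dom, int A, int B) {
  return Dom[B].count(A) != 0;
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, both of which join at 3), neither 1 nor 2 dominates 3, because each can be bypassed through the other arm.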

Loop Information

Analyzes loop structure:

LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();

for (Loop *L : LI) {
  BasicBlock *Header = L->getHeader();
  unsigned Depth = L->getLoopDepth();
  // Process loop
}
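
As a rough illustration of what a loop header is, the sketch below finds headers in a toy CFG by looking for back edges during a depth-first search. This is plain C++ with hypothetical names (findLoopHeaders, visit), not LLVM's LoopInfo, which identifies loops through dominance; the two views agree on reducible control flow, which is assumed here.

```cpp
#include <cassert>
#include <set>
#include <vector>

namespace {
// State values: 0 = unvisited, 1 = on the DFS stack, 2 = finished
void visit(int B, const std::vector<std::vector<int>> &Succs,
           std::vector<int> &State, std::set<int> &Headers) {
  State[B] = 1;
  for (int S : Succs[B]) {
    if (State[S] == 1)
      Headers.insert(S); // B -> S is a back edge, so S heads a loop
    else if (State[S] == 0)
      visit(S, Succs, State, Headers);
  }
  State[B] = 2;
}
} // namespace

// Toy CFG: block B has the successor list Succs[B]; block 0 is the entry.
// Returns the blocks that are targets of a back edge.
std::set<int> findLoopHeaders(const std::vector<std::vector<int>> &Succs) {
  assert(!Succs.empty() && "CFG needs an entry block");
  std::vector<int> State(Succs.size(), 0);
  std::set<int> Headers;
  visit(0, Succs, State, Headers);
  return Headers;
}
```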

Alias Analysis

Determines memory aliasing:

AliasAnalysis &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();

if (AA.alias(Ptr1, Ptr2) == AliasResult::NoAlias) {
  // Pointers don't alias
}

Call Graph

Represents function call relationships:

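CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();

for (auto &Node : CG) {
  // Null for the synthetic node that represents external callers
  const Function *F = Node.first;
  for (auto &CallRecord : *Node.second) {
    // Null when the call target is unknown (e.g. an indirect call)
    Function *Callee = CallRecord.second->getFunction();
  }
}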
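
One typical use of a call graph is to find which functions are reachable from an entry point, for example to decide which functions are worth transforming. The sketch below models that in plain C++; ToyCallGraph and reachableFrom are hypothetical names, functions are identified by name, and only direct calls are represented.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: caller name -> names of the functions it calls directly.
using ToyCallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function reachable from Root by following call edges,
// including Root itself.
std::set<std::string> reachableFrom(const ToyCallGraph &CG,
                                    const std::string &Root) {
  assert(!Root.empty() && "root function name must not be empty");
  std::set<std::string> Seen;
  std::vector<std::string> Worklist{Root};
  while (!Worklist.empty()) {
    std::string F = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(F).second)
      continue; // already visited, which also handles recursion
    auto It = CG.find(F);
    if (It == CG.end())
      continue; // no recorded callees (leaf or external function)
    for (const std::string &Callee : It->second)
      Worklist.push_back(Callee);
  }
  return Seen;
}
```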

Writing a Custom Pass

Step 1: Define Pass Class

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
  struct CountInstructionsPass : public FunctionPass {
    static char ID;
    CountInstructionsPass() : FunctionPass(ID) {}
    
    bool runOnFunction(Function &F) override {
      unsigned Count = 0;
      for (BasicBlock &BB : F) {
        Count += BB.size();
      }
      errs() << "Function " << F.getName() 
             << " has " << Count << " instructions\n";
      return false; // Didn't modify the function
    }
  };
}

char CountInstructionsPass::ID = 0;

Step 2: Register the Pass

static RegisterPass<CountInstructionsPass> X(
  "count-instructions",
  "Count instructions in functions",
  false,  // CFGOnly: false, the pass looks at more than the CFG
  false   // is_analysis: false, other passes do not query its result
);

Step 3: Build and Load

# Build pass as shared library
clang++ -shared -fPIC MyPass.cpp -o MyPass.so \
  `llvm-config --cxxflags --ldflags`

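# Load and run pass (legacy pass manager syntax; on LLVM 13 through 16 this
# typically also needs -enable-new-pm=0, and later releases dropped it from opt)
opt -load ./MyPass.so -count-instructions < input.bc > output.bc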

Pass Scheduling

The pass manager groups consecutive function passes so that all of them run on one function before it moves on to the next:

Module Pass 1
  Function Pass A (on each function)
  Function Pass B (on each function)
Module Pass 2
  Function Pass C (on each function)

This reduces:

  • Redundant analysis
  • Cache misses
  • Compilation time
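
The nesting shown above can be sketched in plain C++. This is a toy model with hypothetical names (ToyFunction, ToyModule, runFunctionPipeline), not LLVM's pass manager; it only demonstrates the execution order, where the whole function pipeline finishes on one function before the next function is touched.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct ToyFunction {
  std::string Name;
};

struct ToyModule {
  std::vector<ToyFunction> Functions;
};

// A toy function pass returns true when it changed the function.
using ToyFunctionPass = std::function<bool(ToyFunction &)>;

// Runs every pass in Pipeline on one function, then moves to the next
// function. Returns true if any pass reported a change.
bool runFunctionPipeline(ToyModule &M,
                         const std::vector<ToyFunctionPass> &Pipeline) {
  bool Changed = false;
  for (ToyFunction &F : M.Functions) {
    for (const ToyFunctionPass &P : Pipeline) {
      assert(P && "pipeline contains an empty pass");
      Changed |= P(F);
    }
  }
  return Changed;
}
```

With passes A and B and functions f and g, the order is A on f, B on f, A on g, B on g, so each function's data is worked on in one stretch rather than revisited once per pass.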

Obfuscation Passes

Control Flow Flattening Pass

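struct FlatteningPass : public FunctionPass {
  static char ID;
  FlatteningPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    // Declarations and single-block functions have nothing to flatten
    if (F.isDeclaration() || F.size() < 2) return false;

    // Don't flatten already flat functions
    // (isAlreadyFlat is a helper defined elsewhere)
    if (isAlreadyFlat(&F)) return false;

    // Collect the original blocks before any new ones are added
    std::vector<BasicBlock*> Blocks;
    for (BasicBlock &BB : F) {
      Blocks.push_back(&BB);
    }

    // Create switch variable as a stack slot in the entry block
    Type *Int32Ty = Type::getInt32Ty(F.getContext());
    BasicBlock &Entry = F.getEntryBlock();
    IRBuilder<> EntryBuilder(&Entry, Entry.getFirstInsertionPt());
    AllocaInst *SwitchVar =
      EntryBuilder.CreateAlloca(Int32Ty, nullptr, "switch.var");

    // Create dispatcher block and a default block for invalid states
    BasicBlock *Dispatcher =
      BasicBlock::Create(F.getContext(), "dispatcher", &F);
    BasicBlock *DefaultBlock =
      BasicBlock::Create(F.getContext(), "default", &F);
    IRBuilder<>(DefaultBlock).CreateUnreachable();

    // Build switch instruction on the loaded state value
    IRBuilder<> DispatchBuilder(Dispatcher);
    Value *State =
      DispatchBuilder.CreateLoad(Int32Ty, SwitchVar, "switch.state");
    SwitchInst *Switch = DispatchBuilder.CreateSwitch(
      State, DefaultBlock, Blocks.size());

    // Update blocks to branch to dispatcher
    for (unsigned i = 0; i < Blocks.size(); ++i) {
      // Add a switch case for Blocks[i], then modify its terminator
      // to store the next state and branch to dispatcher
      // ... implementation details ...
    }

    return true;
  }
};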
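
The effect of flattening is easiest to see at source level. The sketch below, in plain C++ with the hypothetical names sumOriginal and sumFlattened, shows a small loop and a hand-flattened equivalent in which a state variable and a dispatcher switch replace the structured control flow. It illustrates the shape of the transformation only; the pass itself works on LLVM IR basic blocks.

```cpp
#include <cassert>

// Original: structured loop computing 1 + 2 + ... + N
unsigned sumOriginal(unsigned N) {
  unsigned Sum = 0;
  for (unsigned I = 1; I <= N; ++I)
    Sum += I;
  return Sum;
}

// Flattened by hand: every original block becomes a switch case, and
// State (the role SwitchVar plays in the pass) selects the next block.
unsigned sumFlattened(unsigned N) {
  unsigned Sum = 0;
  unsigned I = 0;
  unsigned State = 0;
  for (;;) { // dispatcher loop
    switch (State) {
    case 0: // entry block
      Sum = 0;
      I = 1;
      State = 1;
      break;
    case 1: // loop condition
      State = (I <= N) ? 2 : 3;
      break;
    case 2: // loop body
      Sum += I;
      ++I;
      State = 1;
      break;
    case 3: // exit block
      return Sum;
    default: // no block ever stores another state
      assert(false && "unreachable dispatcher state");
      return 0;
    }
  }
}
```

Both functions compute the same result, but the flattened version no longer exposes the loop structure directly: every transition goes through the dispatcher.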

String Encryption Pass

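struct StringEncryptionPass : public ModulePass {
  static char ID;
  StringEncryptionPass() : ModulePass(ID) {}

  bool runOnModule(Module &M) override {
    bool Changed = false;
    for (GlobalVariable &GV : M.globals()) {
      if (!GV.hasInitializer()) continue;

      Constant *Init = GV.getInitializer();
      if (ConstantDataArray *CDA = dyn_cast<ConstantDataArray>(Init)) {
        if (CDA->isString()) {
          // Encrypt the string (encryptString is a helper defined elsewhere)
          std::string Original = CDA->getAsString().str();
          std::vector<uint8_t> Encrypted = encryptString(Original);

          // Replace with encrypted version; the byte count must match
          // so that the new initializer keeps the global's type
          Constant *NewInit = ConstantDataArray::get(
            M.getContext(), Encrypted);
          GV.setInitializer(NewInit);

          // Insert decryption code at usage sites
          // (insertDecryptionCode is a helper defined elsewhere)
          insertDecryptionCode(&GV, M);
          Changed = true;
        }
      }
    }
    return Changed; // True only if at least one string was encrypted
  }
};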
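
The pass above leaves encryptString unspecified. As one possible stand-in, the sketch below uses a repeating-key XOR in plain C++; xorEncrypt and xorDecrypt are hypothetical names, and XOR is chosen here only because it is length-preserving and trivially reversible, not because it is the scheme Obfussor uses. It hides strings from a casual scan of the binary but is not cryptographically strong.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for encryptString: repeating-key XOR.
// The output has exactly as many bytes as the input.
std::vector<uint8_t> xorEncrypt(const std::string &Plain,
                                const std::vector<uint8_t> &Key) {
  assert(!Key.empty() && "key must not be empty");
  std::vector<uint8_t> Out(Plain.size());
  for (std::size_t I = 0; I < Plain.size(); ++I)
    Out[I] = static_cast<uint8_t>(static_cast<uint8_t>(Plain[I]) ^
                                  Key[I % Key.size()]);
  return Out;
}

// The inverse that inserted decryption code would perform at run time.
std::string xorDecrypt(const std::vector<uint8_t> &Cipher,
                       const std::vector<uint8_t> &Key) {
  assert(!Key.empty() && "key must not be empty");
  std::string Out(Cipher.size(), '\0');
  for (std::size_t I = 0; I < Cipher.size(); ++I)
    Out[I] = static_cast<char>(Cipher[I] ^ Key[I % Key.size()]);
  return Out;
}
```

Keeping the ciphertext the same length as the plaintext matters for the pass, since the replacement initializer has to fit the existing global's array type.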

Pass Options and Configuration

Passes can accept options:

static cl::opt<unsigned> ObfuscationLevel(
  "obf-level",
  cl::desc("Obfuscation intensity level (1-5)"),
  cl::init(3)
);

struct ConfigurablePass : public FunctionPass {
  bool runOnFunction(Function &F) override {
    unsigned Level = ObfuscationLevel;
    // Apply obfuscation based on level
    return true;
  }
};

Use from command line:

opt -load ObfPass.so -my-pass -obf-level=5 < input.bc > output.bc

Pass Debugging

Print IR

# Print IR after each pass
opt -print-after-all -O2 input.ll -S -o output.ll

# Print only specific pass
opt -print-after=my-pass input.ll -S -o output.ll

Verify IR

# Run verifier after each pass
opt -verify-each -O2 input.ll -S -o output.ll

Debug Pass Execution

#define DEBUG_TYPE "my-pass"

LLVM_DEBUG(dbgs() << "Processing function: " << F.getName() << "\n");
LLVM_DEBUG(dbgs() << "Found " << Count << " instructions\n");

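Enable debug output (LLVM_DEBUG is compiled out of release builds, so this requires an LLVM build with assertions enabled):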

opt -debug -debug-only=my-pass -my-pass < input.bc > output.bc

Best Practices

1. Preserve Analysis When Possible

void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesCFG(); // If CFG unchanged
  AU.addPreserved<LoopInfoWrapperPass>(); // If loops unchanged
}

2. Update Analysis After Modification

DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();

// Modify IR
BasicBlock *NewBB = SplitBlock(BB, I, &DT);

// SplitBlock updates DT in place because it was passed in

3. Use LLVM IR Builder

IRBuilder<> Builder(Context);
Builder.SetInsertPoint(InsertBefore);

Value *Sum = Builder.CreateAdd(A, B, "sum");
Value *Product = Builder.CreateMul(Sum, C, "product");

4. Handle Edge Cases

bool runOnFunction(Function &F) override {
  // Skip declarations
  if (F.isDeclaration()) return false;
  
  // Skip functions with specific attributes
  if (F.hasFnAttribute("no-obfuscate")) return false;
  
  // Process function
  return true;
}

Integration with Obfussor

Obfussor uses custom passes for each obfuscation technique:

Source Code
    ↓
  LLVM IR
    ↓
  Control Flow Flattening Pass
    ↓
  String Encryption Pass
    ↓
  Bogus Control Flow Pass
    ↓
  Instruction Substitution Pass
    ↓
  Optimization Passes
    ↓
  Obfuscated Binary

Each pass:

  • Operates on LLVM IR
  • Preserves semantics
  • Can be enabled/disabled
  • Has configurable intensity
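
Two of the pipeline stages, instruction substitution and bogus control flow, rest on simple arithmetic identities. The sketch below shows them in plain C++ rather than LLVM IR; all function names are hypothetical, and the particular identities are common textbook choices, not necessarily the ones Obfussor applies. Unsigned 32-bit arithmetic is used so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Instruction substitution: a + b rewritten as a - (0 - b)
uint32_t addSubstituted(uint32_t A, uint32_t B) {
  return A - (0u - B);
}

// Instruction substitution: a ^ b rewritten as (a | b) - (a & b)
uint32_t xorSubstituted(uint32_t A, uint32_t B) {
  return (A | B) - (A & B);
}

// Opaque predicate: x * (x + 1) is a product of two consecutive numbers,
// so it is always even, including after 32-bit wraparound.
bool opaquelyTrue(uint32_t X) {
  return (X * (X + 1u)) % 2u == 0u;
}

// Bogus control flow: the real computation sits behind a branch that is
// always taken; the other branch is dead code an analyst must rule out.
uint32_t addWithBogusBranch(uint32_t A, uint32_t B, uint32_t X) {
  if (opaquelyTrue(X))
    return addSubstituted(A, B);
  assert(false && "dead branch guarded by the opaque predicate");
  return 0;
}
```

Each rewrite preserves semantics for every input, which is the property the passes in the pipeline must maintain.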

Summary

The LLVM Pass system:

  • Provides a modular transformation framework
  • Enables analysis and optimization
  • Supports custom passes for obfuscation
  • Manages dependencies automatically
  • Schedules passes efficiently

Key concepts:

  • Different pass types (Module, Function, BasicBlock, Loop)
  • Pass Manager orchestrates execution
  • Analysis passes provide information
  • Transformation passes modify IR
  • Dependencies ensure correct ordering

Next Steps


The pass system is the engine that powers LLVM obfuscation.