LLVM Pass System
The LLVM Pass framework is the infrastructure that enables code analysis and transformation. Understanding the pass system is essential for implementing and using obfuscation techniques in Obfussor.
What is an LLVM Pass?
An LLVM Pass is a unit of compilation work that performs analysis or transformation on LLVM IR. Passes are:
- Modular: Self-contained units of functionality
- Composable: Can be combined in sequences
- Reusable: Can be applied to different modules
- Analyzable: Can depend on other passes
Pass Types
1. Module Pass
Operates on entire modules (all functions and globals):
struct MyModulePass : public ModulePass {
static char ID;
bool runOnModule(Module &M) override {
// Process all functions in module
for (Function &F : M) {
// Process function
}
return true; // Module was modified
}
};
Use Cases:
- Inter-procedural analysis
- Global transformations
- Call graph construction
2. Function Pass
Operates on individual functions:
struct MyFunctionPass : public FunctionPass {
static char ID;
bool runOnFunction(Function &F) override {
// Process all basic blocks
for (BasicBlock &BB : F) {
// Process basic block
}
return true; // Function was modified
}
};
Use Cases:
- Intra-procedural optimizations
- Function-level obfuscation
- Local analysis
3. BasicBlock Pass
Operates on individual basic blocks:
struct MyBasicBlockPass : public BasicBlockPass {
static char ID;
bool runOnBasicBlock(BasicBlock &BB) override {
for (Instruction &I : BB) {
// Process instruction
}
return true; // Basic block was modified
}
};
Use Cases:
- Local optimizations
- Instruction-level transformations
4. Loop Pass
Operates on loop structures:
struct MyLoopPass : public LoopPass {
static char ID;
bool runOnLoop(Loop *L, LPPassManager &LPM) override {
// Process loop
for (BasicBlock *BB : L->blocks()) {
// Process blocks in loop
}
return true;
}
};
Use Cases:
- Loop optimizations
- Loop obfuscation
- Loop vectorization
Pass Manager
The Pass Manager orchestrates pass execution:
Legacy Pass Manager (Pre-LLVM 14)
legacy::PassManager PM;
PM.add(createPromoteMemoryToRegisterPass());
PM.add(new MyCustomPass());
PM.run(Module);
New Pass Manager (LLVM 14+)
ModulePassManager MPM;
FunctionPassManager FPM;
// Add function passes
FPM.addPass(SimplifyCFGPass());
FPM.addPass(InstructionCombiningPass());
// Add function pass manager to module pass manager
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
// Run passes
ModuleAnalysisManager MAM;
MPM.run(Module, MAM);
Pass Dependencies
Passes can declare dependencies on other passes:
void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
// This pass requires dominator tree
AU.addRequired<DominatorTreeWrapperPass>();
// This pass preserves CFG
AU.setPreservesCFG();
// This pass doesn't modify anything
AU.setPreservesAll();
}
// Using the analysis
bool MyPass::runOnFunction(Function &F) {
DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
// Use dominator tree...
}
Common Analysis Passes
Dominator Tree
Computes dominance relationships:
DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
if (DT.dominates(BB1, BB2)) {
// BB1 dominates BB2
}
BasicBlock *IDom = DT.getNode(BB)->getIDom()->getBlock();
Loop Information
Analyzes loop structure:
LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
for (Loop *L : LI) {
BasicBlock *Header = L->getHeader();
unsigned Depth = L->getLoopDepth();
// Process loop
}
Alias Analysis
Determines memory aliasing:
AliasAnalysis &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
if (AA.alias(Ptr1, Ptr2) == AliasResult::NoAlias) {
// Pointers don't alias
}
Call Graph
Represents function call relationships:
CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();
for (auto &Node : CG) {
Function *F = Node.first;
for (auto &CallRecord : *Node.second) {
Function *Callee = CallRecord.second->getFunction();
}
}
Writing a Custom Pass
Step 1: Define Pass Class
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
namespace {
struct CountInstructionsPass : public FunctionPass {
static char ID;
CountInstructionsPass() : FunctionPass(ID) {}
bool runOnFunction(Function &F) override {
unsigned Count = 0;
for (BasicBlock &BB : F) {
Count += BB.size();
}
errs() << "Function " << F.getName()
<< " has " << Count << " instructions\n";
return false; // Didn't modify the function
}
};
}
char CountInstructionsPass::ID = 0;
Step 2: Register the Pass
static RegisterPass<CountInstructionsPass> X(
"count-instructions",
"Count instructions in functions",
false, // Only looks at CFG
true // Analysis pass
);
Step 3: Build and Load
# Build pass as shared library
clang++ -shared -fPIC MyPass.cpp -o MyPass.so \
`llvm-config --cxxflags --ldflags`
# Load and run pass
opt -load MyPass.so -count-instructions < input.bc > output.bc
Pass Scheduling
The pass manager schedules passes optimally:
Module Pass 1
Function Pass A (on each function)
Function Pass B (on each function)
Module Pass 2
Function Pass C (on each function)
This minimizes:
- Redundant analysis
- Cache misses
- Compilation time
Obfuscation Passes
Control Flow Flattening Pass
struct FlatteningPass : public FunctionPass {
bool runOnFunction(Function &F) override {
// Don't flatten already flat functions
if (isAlreadyFlat(&F)) return false;
// Split basic blocks
std::vector<BasicBlock*> Blocks;
for (BasicBlock &BB : F) {
Blocks.push_back(&BB);
}
// Create switch variable
AllocaInst *SwitchVar =
new AllocaInst(Type::getInt32Ty(F.getContext()));
// Create dispatcher block
BasicBlock *Dispatcher =
BasicBlock::Create(F.getContext(), "dispatcher", &F);
// Build switch instruction
SwitchInst *Switch = SwitchInst::Create(
SwitchVar, DefaultBlock, Blocks.size(), Dispatcher);
// Update blocks to branch to dispatcher
for (unsigned i = 0; i < Blocks.size(); ++i) {
// Modify terminator to update state and branch to dispatcher
// ... implementation details ...
}
return true;
}
};
String Encryption Pass
struct StringEncryptionPass : public ModulePass {
bool runOnModule(Module &M) override {
for (GlobalVariable &GV : M.globals()) {
if (!GV.hasInitializer()) continue;
Constant *Init = GV.getInitializer();
if (ConstantDataArray *CDA = dyn_cast<ConstantDataArray>(Init)) {
if (CDA->isString()) {
// Encrypt the string
std::string Original = CDA->getAsString().str();
std::vector<uint8_t> Encrypted = encryptString(Original);
// Replace with encrypted version
Constant *NewInit = ConstantDataArray::get(
M.getContext(), Encrypted);
GV.setInitializer(NewInit);
// Insert decryption code at usage sites
insertDecryptionCode(&GV, M);
}
}
}
return true;
}
};
Pass Options and Configuration
Passes can accept options:
static cl::opt<unsigned> ObfuscationLevel(
"obf-level",
cl::desc("Obfuscation intensity level (1-5)"),
cl::init(3)
);
struct ConfigurablePass : public FunctionPass {
bool runOnFunction(Function &F) override {
unsigned Level = ObfuscationLevel;
// Apply obfuscation based on level
return true;
}
};
Use from command line:
opt -load ObfPass.so -my-pass -obf-level=5 < input.bc > output.bc
Pass Debugging
Print IR Before/After
# Print IR after each pass
opt -print-after-all -O2 input.ll -S -o output.ll
# Print only specific pass
opt -print-after=my-pass input.ll -S -o output.ll
Verify IR
# Run verifier after each pass
opt -verify-each -O2 input.ll -S -o output.ll
Debug Pass Execution
#define DEBUG_TYPE "my-pass"
LLVM_DEBUG(dbgs() << "Processing function: " << F.getName() << "\n");
LLVM_DEBUG(dbgs() << "Found " << Count << " instructions\n");
Enable debug output:
opt -debug -debug-only=my-pass -my-pass < input.bc > output.bc
Best Practices
1. Preserve Analysis When Possible
void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG(); // If CFG unchanged
AU.addPreserved<LoopInfoWrapperPass>(); // If loops unchanged
}
2. Update Analysis After Modification
DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
// Modify IR
BasicBlock *NewBB = SplitBlock(BB, I, &DT);
// DT is automatically updated
3. Use LLVM IR Builder
IRBuilder<> Builder(Context);
Builder.SetInsertPoint(InsertBefore);
Value *Sum = Builder.CreateAdd(A, B, "sum");
Value *Product = Builder.CreateMul(Sum, C, "product");
4. Handle Edge Cases
bool runOnFunction(Function &F) override {
// Skip declarations
if (F.isDeclaration()) return false;
// Skip functions with specific attributes
if (F.hasFnAttribute("no-obfuscate")) return false;
// Process function
return true;
}
Integration with Obfussor
Obfussor uses custom passes for each obfuscation technique:
Source Code
↓
LLVM IR
↓
Control Flow Flattening Pass
↓
String Encryption Pass
↓
Bogus Control Flow Pass
↓
Instruction Substitution Pass
↓
Optimization Passes
↓
Obfuscated Binary
Each pass:
- Operates on LLVM IR
- Preserves semantics
- Can be enabled/disabled
- Has configurable intensity
Summary
The LLVM Pass system:
- Provides modular transformation framework
- Enables analysis and optimization
- Supports custom passes for obfuscation
- Manages dependencies automatically
- Schedules passes efficiently
Key concepts:
- Different pass types (Module, Function, BasicBlock, Loop)
- Pass Manager orchestrates execution
- Analysis passes provide information
- Transformation passes modify IR
- Dependencies ensure correct ordering
Next Steps
- Compilation Pipeline: See passes in action
- Obfuscation Techniques: Obfuscation passes
- Custom Passes: Write your own passes
The pass system is the engine that powers LLVM obfuscation.