Introduction

Welcome to the Obfussor documentation, your comprehensive guide to LLVM-based code obfuscation techniques and the Obfussor framework.

What is Obfussor?

Obfussor is a code obfuscation framework that uses LLVM's compiler infrastructure to transform source code into hardened binaries that resist reverse engineering. Built for scenarios where intellectual property protection is non-negotiable, Obfussor provides a desktop interface built with Angular and Tauri, backed by a Rust backend for LLVM integration.

What is Code Obfuscation?

Code obfuscation is a technique used to make software difficult to understand and reverse engineer while preserving its original functionality. Unlike encryption, which makes code unreadable until decrypted, obfuscation transforms code into a functionally equivalent but significantly more complex form that hinders analysis and comprehension.

Why Code Obfuscation Matters

In today's software landscape, protecting intellectual property is crucial. Developers and organizations face numerous threats:

Reverse Engineering: Competitors or malicious actors analyzing your code to steal algorithms, business logic, or proprietary techniques
Code Theft: Direct copying of your codebase or critical components
License Violations: Unauthorized use or distribution of licensed software
Security Vulnerabilities: Easier exploitation when attackers understand your code structure
Intellectual Property Loss: Loss of competitive advantage when proprietary methods are exposed

Code obfuscation provides a critical defense layer against these threats, making your software significantly harder to analyze, understand, and exploit.

LLVM-Based Obfuscation Advantages

LLVM (originally short for Low Level Virtual Machine, though the project no longer treats the name as an acronym) provides unique advantages for code obfuscation:

Compiler-Level Transformation

Unlike binary obfuscation tools that work on compiled executables, LLVM obfuscation operates at the Intermediate Representation (IR) level during compilation. This provides:

Better Integration: Seamless integration with the compilation process
Platform Independence: Apply obfuscation once, compile for multiple targets
Optimization Compatibility: Works alongside compiler optimizations
Granular Control: Fine-grained control over which parts of code to obfuscate

Architecture-Agnostic Approach

LLVM IR serves as a universal intermediate language between source code and machine code:

Cross-Platform Support: The same obfuscation techniques work across x86, ARM, MIPS, and other architectures
Consistent Results: Predictable obfuscation behavior regardless of target platform
Maintainability: Single codebase for obfuscation logic

Advanced Transformation Capabilities

LLVM's rich IR and pass infrastructure enable sophisticated obfuscation techniques:

Control Flow Analysis: Deep understanding of program structure enables complex control flow transformations
Data Flow Tracking: Precise data flow information allows for effective instruction substitution
Type System: LLVM IR's strong type system and verifier catch many malformed transformations early, which helps passes preserve program semantics

Project Features and Capabilities

Obfussor provides a comprehensive suite of obfuscation techniques:

Core Obfuscation Techniques

Control Flow Flattening
- Transforms natural program flow into opaque, non-linear execution paths
- Implements switch-based dispatch mechanisms
- Creates state machine-like control structures
String Encryption
- Automatic encryption of string literals (subject to configured exclusions)
- Runtime decryption mechanisms
- Multiple encryption algorithm support
Bogus Code Injection
- Insertion of dead code paths designed to be hard to distinguish from real logic
- Opaque predicate construction
- Code bloating with semantic preservation
Instruction Substitution
- Replaces simple instructions with semantically equivalent but complex alternatives
- Arithmetic transformation patterns
- Mixed boolean-arithmetic operations
Function Inlining/Outlining
- Strategic manipulation of function boundaries
- Call graph obfuscation
- Program structure obscuration

Advanced Features

Configurable Intensity Levels: Fine-tune the security/performance tradeoff for your specific needs
Selective Obfuscation: Choose which functions, modules, or code sections to obfuscate
Comprehensive Reporting: Detailed metrics on obfuscation coverage, complexity increase, and performance impact
Custom Pass Integration: Extend Obfussor with your own LLVM obfuscation passes

Performance Characteristics

Zero-Overhead Abstractions

Obfussor's design philosophy prioritizes minimal runtime overhead:

Compile-Time Transformation: All obfuscation passes run during compilation; only the code they emit (for example, string decryption routines) executes at run time
No Runtime Dependencies: No additional libraries or runtime components required
Optimized Output: Obfuscated code can still be optimized by standard compiler optimizations

Resource Efficiency

Memory Efficient: Optimized for constrained environments
Fast Compilation: Parallel pass execution when possible
Scalable: Handles large codebases efficiently

Performance Metrics

Obfussor generates comprehensive reports including:

Control Flow Complexity: Cyclomatic complexity increase factor
String Protection Coverage: Percentage of strings encrypted
Code Inflation Ratio: Size increase due to obfuscation
Bogus Code Distribution: Statistical analysis of injected code
Entropy Analysis: Information-theoretic metrics of output randomness

Cross-Platform Support

Obfussor is built with modern technologies ensuring broad platform support:

Desktop Platforms

Windows: Full support for Windows 10/11 (x64, ARM64)
macOS: Support for macOS 10.15+ (Intel and Apple Silicon)
Linux: Debian, Ubuntu, Fedora, Arch, and other major distributions

Architecture Support

Through LLVM's architecture-agnostic approach:

x86/x86_64: Full support for Intel and AMD processors
ARM/ARM64: Support for ARM-based systems including Apple M1/M2
RISC-V: Experimental support for RISC-V architectures
WebAssembly: Can obfuscate code compiled to WebAssembly

Technology Stack

Frontend: Angular 20.x - Modern, responsive web-based UI
Backend: Rust - Safe, fast, and reliable LLVM integration
Desktop Framework: Tauri - Lightweight, secure desktop application framework
Build System: Integration with standard compilation toolchains

Target Audience

Obfussor is designed for:

Software Developers

Developers wanting to protect their intellectual property
Teams building commercial software requiring reverse engineering protection
Open source developers protecting sensitive algorithms

Security Professionals

Security researchers studying obfuscation and deobfuscation techniques
Penetration testers understanding obfuscated code analysis
Security engineers implementing defense-in-depth strategies

Organizations

Companies protecting proprietary software and algorithms
Financial institutions securing trading algorithms and business logic
Gaming companies preventing cheating and piracy
Mobile app developers protecting against app cloning

Researchers and Academics

Computer science researchers studying program transformation
Students learning about compiler design and code protection
Academic institutions teaching software security

License Information

Obfussor is released under the MIT License, which means:

Free to Use: Use Obfussor for personal, academic, or commercial projects
Modification Rights: Modify the source code to suit your needs
Distribution: Distribute original or modified versions
No Warranty: Software is provided "as is" without warranty
Attribution: Keep the original copyright notice in distributions

For complete license details, see the LICENSE file in the repository.

Getting Started

Ready to protect your code? Here's what's next:

Installation: Set up Obfussor on your system
Quick Start: Obfuscate your first program
Configuration: Learn about configuration options
LLVM Fundamentals: Understand the underlying technology

Documentation Structure

This documentation is organized into several sections:

Getting Started: Installation, quick start guide, and basic configuration
LLVM Fundamentals: Understanding LLVM architecture, IR, and passes
Obfuscation Techniques: Detailed explanation of each obfuscation method
Implementation Details: Architecture and implementation of Obfussor
Advanced Topics: Custom passes, optimization, and security analysis
Use Cases: Real-world applications and scenarios
API Reference: Complete API documentation for CLI and programmatic use
Troubleshooting: Common issues and solutions
Contributing: How to contribute to Obfussor development

Community and Support

GitHub Repository: https://github.com/matrixbytes/Obfussor
Issue Tracker: Report bugs and request features on GitHub Issues
Discussions: Join community discussions on GitHub Discussions
Contributing: See Contributing Guidelines to get involved

Let's begin your journey into LLVM-based code obfuscation!

Installation

This guide will walk you through installing Obfussor and all its prerequisites on your system.

Prerequisites

Before installing Obfussor, ensure you have the following tools installed:

Required Tools

1. Node.js (v18.0.0 or later)

Node.js is required for the Angular frontend.

Windows:

# Download and install from nodejs.org
# Or use Chocolatey
choco install nodejs

# Verify installation
node --version
npm --version

macOS:

# Using Homebrew
brew install node

# Verify installation
node --version
npm --version

Linux:

# Ubuntu/Debian
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Fedora
sudo dnf install nodejs

# Verify installation
node --version
npm --version

2. Bun (Latest Version)

Bun is a fast JavaScript runtime and package manager used in this project.

Windows:

# Using PowerShell
powershell -c "irm bun.sh/install.ps1 | iex"

# Verify installation
bun --version

macOS/Linux:

# Using curl
curl -fsSL https://bun.sh/install | bash

# Verify installation
bun --version

3. Rust (Latest Stable)

Rust is required for the Tauri backend and LLVM integration.

All Platforms:

# Install rustup (Rust installer)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# On Windows, download and run rustup-init.exe from rustup.rs

# Follow the prompts and choose default installation

# Restart your terminal, then verify
rustc --version
cargo --version

Post-Installation:

# Update Rust to latest version
rustup update

# Add common components
rustup component add rustfmt clippy

4. Tauri CLI

Tauri CLI is required to build and run the desktop application.

# Install Tauri CLI via Cargo
cargo install tauri-cli --version "^2.0"

# Verify installation
cargo tauri --version

5. LLVM (Version 14.0 or later)

LLVM is the core dependency for obfuscation functionality.

Windows:

# Download pre-built binaries from llvm.org
# Or use Chocolatey
choco install llvm

# Add to PATH: C:\Program Files\LLVM\bin

macOS:

# Using Homebrew
brew install llvm

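# Add to PATH (add to ~/.zshrc or ~/.bash_profile)
# Homebrew's prefix is /usr/local on Intel and /opt/homebrew on Apple Silicon,
# so ask brew for the LLVM location instead of hard-coding it
echo 'export PATH="$(brew --prefix llvm)/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc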

# Verify installation
llvm-config --version

Linux:

# Ubuntu/Debian
sudo apt-get update
sudo apt-get install llvm-14 llvm-14-dev clang-14

# Fedora
sudo dnf install llvm llvm-devel clang

# Verify installation
# (on Ubuntu/Debian the llvm-14 package installs the versioned binary llvm-config-14)
llvm-config --version

Optional Tools

Git

Version control for cloning the repository:

# Windows (Chocolatey)
choco install git

# macOS
brew install git

# Linux (Ubuntu/Debian)
sudo apt-get install git

# Verify
git --version

Visual Studio Code

Recommended IDE with excellent Rust, Angular, and TypeScript support:

# Download from code.visualstudio.com

# Recommended Extensions:
# - rust-analyzer
# - Angular Language Service
# - Tauri
# - ESLint
# - Prettier

Platform-Specific Requirements

Windows

Visual Studio Build Tools (required for Rust compilation):

Download Visual Studio Build Tools
Install with "Desktop development with C++" workload
Ensure the following components are selected:
- MSVC v143 - VS 2022 C++ x64/x86 build tools
- Windows 10/11 SDK
- C++ CMake tools for Windows

WebView2 (required for Tauri):

Windows 11: Pre-installed
Windows 10: Download WebView2 Runtime

macOS

Xcode Command Line Tools:

xcode-select --install

Linux

Build Dependencies:

Debian/Ubuntu:

sudo apt-get update
sudo apt-get install -y \
    libwebkit2gtk-4.1-dev \
    build-essential \
    curl \
    wget \
    file \
    libssl-dev \
    libgtk-3-dev \
    libayatana-appindicator3-dev \
    librsvg2-dev

Fedora:

sudo dnf install \
    webkit2gtk4.1-devel \
    openssl-devel \
    curl \
    wget \
    file \
    gtk3-devel \
    libappindicator-gtk3-devel \
    librsvg2-devel

Arch Linux:

sudo pacman -S \
    webkit2gtk-4.1 \
    base-devel \
    curl \
    wget \
    file \
    openssl \
    gtk3 \
    libappindicator-gtk3 \
    librsvg

Installing Obfussor

Method 1: Clone from GitHub

# Clone the repository
git clone https://github.com/matrixbytes/Obfussor.git

# Navigate to the directory
cd Obfussor

# Install dependencies
bun install

# Verify installation
bun ng version
cargo tauri info

Method 2: Download Release Binary

Visit GitHub Releases
Download the latest release for your platform:
- Windows: Obfussor-{version}-x64-setup.exe
- macOS: Obfussor-{version}-x64.dmg or Obfussor-{version}-aarch64.dmg
- Linux: Obfussor-{version}-amd64.AppImage or .deb/.rpm
Install following platform-specific instructions

Post-Installation Verification

Verify all components are correctly installed:

# Check Node.js
node --version  # Should be >= 18.0.0

# Check Bun
bun --version

# Check Rust
rustc --version
cargo --version

# Check Tauri
cargo tauri --version

# Check LLVM
llvm-config --version  # Should be >= 14.0

# Check Clang
clang --version

Building from Source

If you cloned from GitHub, build Obfussor:

# Development build
cargo tauri dev

# Production build
cargo tauri build

The production build will create installers in:

Windows: src-tauri/target/release/bundle/msi/ or src-tauri/target/release/bundle/nsis/
macOS: src-tauri/target/release/bundle/dmg/ or src-tauri/target/release/bundle/macos/
Linux: src-tauri/target/release/bundle/appimage/ or src-tauri/target/release/bundle/deb/

Environment Configuration

Setting up LLVM Environment Variables

Windows:

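# Add to System Environment Variables
# (the "140" in the variable name corresponds to LLVM 14)
setx LLVM_SYS_140_PREFIX "C:\Program Files\LLVM"
# Caution: setx truncates values longer than 1024 characters; if your PATH is long,
# edit it through System Properties > Environment Variables instead
setx PATH "%PATH%;C:\Program Files\LLVM\bin"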

macOS/Linux:

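# Add to ~/.zshrc or ~/.bashrc
# macOS (Homebrew): the prefix is /usr/local on Intel and /opt/homebrew on Apple Silicon
export LLVM_SYS_140_PREFIX="$(brew --prefix llvm)"
# Linux: ask llvm-config for the prefix instead
# export LLVM_SYS_140_PREFIX="$(llvm-config --prefix)"
export PATH="$LLVM_SYS_140_PREFIX/bin:$PATH"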

# Apply changes
source ~/.zshrc  # or source ~/.bashrc

Rust Environment

Ensure Rust environment is properly configured:

# Verify cargo is in PATH
which cargo

# If not found, add to PATH
echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Troubleshooting Installation Issues

Common Problems

1. Bun Installation Fails on Windows

Error: PowerShell execution policy prevents installation

Solution:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

2. Rust/Cargo Not Found

Error: cargo: command not found

Solution:

Restart terminal
Manually add to PATH: $HOME/.cargo/bin (Unix) or %USERPROFILE%\.cargo\bin (Windows)
Re-run Rust installer

3. LLVM Not Found

Error: could not find native static library 'LLVM'

Solution:

# Verify LLVM is installed
llvm-config --version

# Set LLVM_SYS_140_PREFIX environment variable
export LLVM_SYS_140_PREFIX=$(llvm-config --prefix)

# Try installation again

4. WebKit2GTK Missing (Linux)

Error: Package webkit2gtk-4.1 was not found

Solution:

# Ubuntu/Debian
sudo apt-get install libwebkit2gtk-4.1-dev

# Older systems may only package webkit2gtk-4.0. Note that Tauri 2 requires 4.1,
# so this fallback is only useful with Tauri 1.x:
sudo apt-get install libwebkit2gtk-4.0-dev

5. Tauri CLI Installation Fails

Error: Compilation errors during cargo install tauri-cli

Solution:

# Update Rust
rustup update

# Install with specific version, using the dependency versions pinned by the release
cargo install tauri-cli --version "^2.0" --locked

# Windows: Ensure Visual Studio Build Tools are installed

Getting Help

If you encounter issues not covered here:

Check Troubleshooting Guide
Review GitHub Issues
Consult Tauri Prerequisites
Open a new issue with detailed error messages

Next Steps

Once installation is complete:

Quick Start Guide: Learn to obfuscate your first program
Configuration: Understand configuration options
LLVM Overview: Learn about LLVM fundamentals

Congratulations! You now have Obfussor installed and ready to use.

Quick Start

This guide will help you obfuscate your first program using Obfussor. By the end of this tutorial, you'll understand the basic workflow and be able to apply obfuscation to your own projects.

Prerequisites

Before you begin, ensure you have:

Completed the Installation guide
Basic knowledge of C/C++ programming
A simple C/C++ program to obfuscate
LLVM toolchain properly configured

Your First Obfuscation

Step 1: Create a Sample Program

Let's start with a simple C program:

// hello.c
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    int x = 5;
    int y = 10;
    int result = add(x, y);
    
    printf("The sum of %d and %d is %d\n", x, y, result);
    printf("Hello from Obfussor!\n");
    
    return 0;
}

Save this as hello.c in your working directory.

Step 2: Launch Obfussor

Start the Obfussor application:

# If built from source
cargo tauri dev

# Or run the installed application
./Obfussor  # Linux/macOS
# Or launch from Applications menu/Start menu

The Obfussor GUI will open, presenting you with the main interface.

Step 3: Configure Obfuscation Settings

In the Obfussor interface:

Select Input File: Click "Browse" and select your hello.c file
Choose Output Directory: Specify where the obfuscated output should be saved
Select Obfuscation Techniques:
- ✓ Control Flow Flattening
- ✓ String Encryption
- ✓ Instruction Substitution
Set Intensity Level: Choose "Medium" for this example
Configure Compiler Options:
- Compiler: clang
- Optimization Level: -O2
- Target Architecture: Auto-detect

Step 4: Run Obfuscation

Click the "Obfuscate" button to start the process.

Obfussor will:

Compile your source code to LLVM IR
Apply selected obfuscation passes
Generate obfuscated LLVM IR
Compile to native binary
Generate a detailed report

Step 5: Review Results

After completion, you'll see:

Obfuscation Report:

Obfuscation Summary
===================
Input File: hello.c
Output File: hello_obfuscated
Techniques Applied:
  - Control Flow Flattening: ✓
  - String Encryption: ✓
  - Instruction Substitution: ✓

Metrics:
  - Functions Obfuscated: 2/2
  - Strings Encrypted: 2/2
  - Instructions Substituted: 15
  - Code Size Increase: 45%
  - Cyclomatic Complexity Increase: 3.2x

Status: Success ✓

Step 6: Test the Obfuscated Program

Run the obfuscated binary to verify it works correctly:

# Navigate to output directory
cd output/

# Run the obfuscated program
./hello_obfuscated

# Expected output:
# The sum of 5 and 10 is 15
# Hello from Obfussor!

The program should function identically to the original!

Step 7: Compare Original and Obfuscated

Original LLVM IR (simplified):

define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ret i32 %sum
}

define i32 @main() {
entry:
  %x = alloca i32
  %y = alloca i32
  store i32 5, i32* %x
  store i32 10, i32* %y
  %call = call i32 @add(i32 5, i32 10)
  ; ... printf calls ...
  ret i32 0
}

Obfuscated LLVM IR (simplified):

define i32 @add(i32 %a, i32 %b) {
entry:
  %switch.var = alloca i32
  ; Values that cross flattened blocks are kept on the stack, because
  ; block.1 no longer dominates block.2
  %result = alloca i32
  store i32 0, i32* %switch.var
  br label %switch.dispatch

switch.dispatch:
  %state = load i32, i32* %switch.var
  switch i32 %state, label %unreachable [
    i32 0, label %block.0
    i32 1, label %block.1
    i32 2, label %block.2
  ]

block.0:
  ; Bogus code
  %bogus1 = add i32 %a, 42
  store i32 1, i32* %switch.var
  br label %switch.dispatch

block.1:
  ; Actual computation (obfuscated): a - (0 - b) == a + b
  %t1 = sub i32 0, %b
  %t2 = sub i32 %a, %t1
  store i32 %t2, i32* %result
  store i32 2, i32* %switch.var
  br label %switch.dispatch

block.2:
  %ret = load i32, i32* %result
  ret i32 %ret

unreachable:
  unreachable
}

Notice how the control flow is flattened into a dispatch loop and the single addition is replaced with an equivalent pair of subtractions, a - (0 - b).
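For readers more comfortable with C than with LLVM IR, the following is a hand-written sketch of what the flattened add function amounts to. It is an illustration only, not output produced by Obfussor; the names add_plain and add_flattened are hypothetical, and unsigned arithmetic is used so that wrapping matches the IR's sub instruction.

```c
#include <stdint.h>

/* Straightforward version, corresponding to the original IR. */
int32_t add_plain(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Flattened version: a dispatcher loop picks the next block from a state
 * variable, mirroring the switch.dispatch block in the IR. */
int32_t add_flattened(int32_t a, int32_t b) {
    uint32_t state = 0;   /* plays the role of %switch.var */
    uint32_t result = 0;  /* value carried between blocks */

    for (;;) {
        switch (state) {
        case 0: {
            /* Bogus computation whose result is never used. */
            uint32_t bogus = (uint32_t)a + 42u;
            (void)bogus;
            state = 1;
            break;
        }
        case 1: {
            /* a - (0 - b) equals a + b in wrapping arithmetic. */
            uint32_t t1 = 0u - (uint32_t)b;
            result = (uint32_t)a - t1;
            state = 2;
            break;
        }
        case 2:
            return (int32_t)result;
        default:
            return 0; /* corresponds to the unreachable block */
        }
    }
}
```

Calling add_flattened(5, 10) takes three trips through the dispatcher before returning the same value as add_plain(5, 10).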

Command Line Interface (CLI)

For automation and scripting, use the CLI:

Basic Usage

obfussor-cli obfuscate \
  --input hello.c \
  --output hello_obfuscated \
  --techniques cff,str,sub \
  --intensity medium

CLI Options

obfussor-cli obfuscate [OPTIONS]

OPTIONS:
  -i, --input <FILE>          Input source file
  -o, --output <FILE>         Output file name
  -t, --techniques <LIST>     Comma-separated list of techniques
                              (cff, str, bog, sub, inl)
  --intensity <LEVEL>         Obfuscation intensity (low, medium, high)
  --compiler <COMPILER>       Compiler to use (clang, gcc)
  -O <LEVEL>                  Optimization level (0, 1, 2, 3, s)
  --target <ARCH>             Target architecture
  --config <FILE>             Configuration file
  --report <FILE>             Output report file
  --ir-only                   Generate LLVM IR only (no compilation)
  -v, --verbose               Verbose output
  -h, --help                  Show help message

Example: Maximum Obfuscation

obfussor-cli obfuscate \
  --input myprogram.c \
  --output myprogram_protected \
  --techniques cff,str,bog,sub,inl \
  --intensity high \
  -O2 \
  --report obfuscation-report.json \
  --verbose

Example: Configuration File

Create obfuscation-config.json:

{
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "medium",
      "preserve_functions": ["main"]
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "aes128",
      "exclude_patterns": ["debug_*"]
    },
    "instruction_substitution": {
      "enabled": true,
      "complexity": 3
    }
  },
  "compiler": {
    "name": "clang",
    "optimization": "O2",
    "flags": ["-fno-inline"]
  },
  "output": {
    "ir_file": "output.ll",
    "report_file": "report.json",
    "preserve_symbols": false
  }
}

Use the configuration:

obfussor-cli obfuscate \
  --input myprogram.c \
  --config obfuscation-config.json

Working with Projects

Single File Projects

obfussor-cli obfuscate \
  --input main.c \
  --output main_obf \
  --techniques cff,str

Multiple Files

Obfuscate each file separately and link:

# Obfuscate each source file to LLVM IR
obfussor-cli obfuscate --input file1.c --output file1_obf.ll --ir-only
obfussor-cli obfuscate --input file2.c --output file2_obf.ll --ir-only

# Compile IR files to object files
clang -c file1_obf.ll -o file1_obf.o
clang -c file2_obf.ll -o file2_obf.o

# Link obfuscated object files
clang file1_obf.o file2_obf.o -o program_obfuscated

Integration with Build Systems

Makefile Example

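CC = clang
OBFUSSOR = obfussor-cli

SOURCES = main.c utils.c
OBJECTS = $(SOURCES:.c=.o)
OBFUSCATED_IR = $(SOURCES:.c=_obf.ll)
OBFUSCATED = $(SOURCES:.c=_obf.o)

.PHONY: all clean

all: program_obfuscated

%.o: %.c
	$(CC) -c $< -o $@

# Emit obfuscated LLVM IR, then compile the IR to an object file
%_obf.ll: %.c
	$(OBFUSSOR) obfuscate --input $< --output $@ --techniques cff,str --ir-only

%_obf.o: %_obf.ll
	$(CC) -c $< -o $@

program_obfuscated: $(OBFUSCATED)
	$(CC) $(OBFUSCATED) -o $@

clean:
	rm -f $(OBJECTS) $(OBFUSCATED_IR) $(OBFUSCATED) program_obfuscated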

CMake Example

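# Obfuscate each source to LLVM IR, compile the IR to an object file,
# and link the resulting objects into an executable
function(add_obfuscated_executable target)
    set(OBFUSCATED_OBJECTS "")

    foreach(source ${ARGN})
        get_filename_component(source_name ${source} NAME_WE)
        set(obf_ir "${CMAKE_CURRENT_BINARY_DIR}/${source_name}_obf.ll")
        set(obf_obj "${CMAKE_CURRENT_BINARY_DIR}/${source_name}_obf.o")

        add_custom_command(
            OUTPUT ${obf_obj}
            COMMAND obfussor-cli obfuscate
                --input ${CMAKE_CURRENT_SOURCE_DIR}/${source}
                --output ${obf_ir}
                --techniques cff,str
                --ir-only
            COMMAND clang -c ${obf_ir} -o ${obf_obj}
            DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${source}
            COMMENT "Obfuscating ${source}"
            VERBATIM
        )

        list(APPEND OBFUSCATED_OBJECTS ${obf_obj})
    endforeach()

    add_executable(${target} ${OBFUSCATED_OBJECTS})
    # The inputs are prebuilt objects, so tell CMake how to link them
    set_source_files_properties(${OBFUSCATED_OBJECTS} PROPERTIES EXTERNAL_OBJECT TRUE)
    set_target_properties(${target} PROPERTIES LINKER_LANGUAGE C)
endfunction()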

# Usage
add_obfuscated_executable(my_program main.c utils.c)

Verifying Obfuscation

Visual Inspection

Compare the disassembly of original and obfuscated binaries:

# Build and disassemble the original (assumes hello.c from Step 1)
clang -O2 hello.c -o hello
objdump -d hello > hello_original.asm

# Disassemble obfuscated
objdump -d hello_obfuscated > hello_obfuscated.asm

# Compare
diff hello_original.asm hello_obfuscated.asm

Using Analysis Tools

Analyze with tools like Ghidra or IDA Pro:

Load the original binary
Note the control flow graph structure
Load the obfuscated binary
Compare the complexity and readability

Automated Testing

Ensure functionality is preserved:

# Create test script
cat > test.sh << 'EOF'
#!/bin/bash

# Test original
./hello > original_output.txt

# Test obfuscated
./hello_obfuscated > obfuscated_output.txt

# Compare outputs
if diff original_output.txt obfuscated_output.txt; then
    echo "✓ Functionality preserved"
else
    echo "✗ Output differs - obfuscation error!"
    exit 1
fi
EOF

chmod +x test.sh
./test.sh

Understanding the Report

Obfussor generates detailed JSON reports:

{
  "timestamp": "2024-01-15T10:30:00Z",
  "input_file": "hello.c",
  "output_file": "hello_obfuscated",
  "techniques": [
    {
      "name": "control_flow_flattening",
      "status": "applied",
      "functions_affected": 2,
      "metrics": {
        "blocks_added": 15,
        "complexity_increase": 3.2
      }
    },
    {
      "name": "string_encryption",
      "status": "applied",
      "strings_encrypted": 2,
      "encryption_algorithm": "xor"
    }
  ],
  "overall_metrics": {
    "original_size": 8432,
    "obfuscated_size": 12227,
    "size_increase_percent": 45,
    "original_complexity": 5,
    "obfuscated_complexity": 16
  }
}
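The derived fields in overall_metrics follow from the raw values in the same report. The sketch below shows one plausible way to compute them; the function names are hypothetical, and the rounding rule (nearest integer) is an assumption rather than documented Obfussor behavior.

```c
/* Percentage growth from the original size to the obfuscated size,
 * rounded to the nearest integer. Returns 0 when the original size is
 * zero or the output did not grow (a simplification for this sketch). */
unsigned size_increase_percent(unsigned long original_size,
                               unsigned long obfuscated_size) {
    if (original_size == 0 || obfuscated_size < original_size) {
        return 0;
    }
    unsigned long growth = obfuscated_size - original_size;
    return (unsigned)((growth * 100 + original_size / 2) / original_size);
}

/* Ratio of obfuscated to original cyclomatic complexity. */
double complexity_increase(unsigned original_complexity,
                           unsigned obfuscated_complexity) {
    if (original_complexity == 0) {
        return 0.0;
    }
    return (double)obfuscated_complexity / (double)original_complexity;
}
```

With the sample report's values, size_increase_percent(8432, 12227) yields 45 and complexity_increase(5, 16) yields 3.2, matching size_increase_percent and complexity_increase in the JSON.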

Next Steps

Now that you've obfuscated your first program:

Configuration Guide: Learn about advanced configuration options
Obfuscation Techniques: Understand each technique in detail
LLVM Fundamentals: Learn how LLVM powers obfuscation
Advanced Topics: Create custom obfuscation passes

Common Pitfalls

1. Over-Obfuscation

Problem: Applying all techniques at maximum intensity
Solution: Start with medium intensity and specific techniques based on threat model

2. Breaking Debug Symbols

Problem: Obfuscation removes debug information
Solution: Keep separate debug builds; use --preserve-symbols for development

3. Performance Degradation

Problem: High intensity obfuscation significantly slows execution
Solution: Profile your application; selectively obfuscate critical functions only

4. Compilation Errors

Problem: Obfuscated IR fails to compile
Solution: Check LLVM version compatibility; verify input code compiles without obfuscation first

Tips for Success

Start Simple: Begin with one technique, verify it works, then add more
Test Thoroughly: Always test obfuscated binaries match original behavior
Version Control: Keep original source separate from obfuscated versions
Document Configuration: Save your obfuscation configs for reproducibility
Benchmark Performance: Measure performance impact before deploying

Congratulations! You've successfully obfuscated your first program with Obfussor.

Configuration

Obfussor provides flexible configuration options to customize obfuscation behavior for your specific needs. This guide covers all configuration methods and available options.

Configuration Methods

Obfussor supports three configuration methods:

GUI Configuration: Interactive configuration through the desktop application
Configuration Files: JSON-based configuration files for reproducible builds
Command-Line Arguments: Direct configuration via CLI flags

Priority Order

When multiple configuration methods are used:

CLI Arguments > Configuration File > GUI Settings > Default Values
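The priority order can be read as "take the first layer that supplies a value". The following sketch illustrates that rule for a single string setting; resolve_setting and its parameters are hypothetical names, and a NULL pointer stands for "not set at this layer".

```c
#include <stddef.h>

/* Returns the value from the highest-priority layer that set one:
 * CLI argument, then configuration file, then GUI setting, then default. */
const char *resolve_setting(const char *cli_value,
                            const char *config_file_value,
                            const char *gui_value,
                            const char *default_value) {
    if (cli_value != NULL) {
        return cli_value;
    }
    if (config_file_value != NULL) {
        return config_file_value;
    }
    if (gui_value != NULL) {
        return gui_value;
    }
    return default_value;
}
```

For example, if the configuration file sets intensity to "medium" and the command line passes --intensity high, the resolved value is "high".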

Configuration File Format

Basic Structure

Create a JSON configuration file (e.g., obfussor.json):

{
  "version": "1.0",
  "input": {
    "files": ["src/main.c", "src/utils.c"],
    "include_dirs": ["include/"],
    "defines": ["RELEASE_BUILD"]
  },
  "output": {
    "directory": "build/obfuscated",
    "basename": "program",
    "generate_ir": true,
    "generate_report": true,
    "report_format": "json"
  },
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "medium",
      "options": {}
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "aes128",
      "options": {}
    },
    "bogus_control_flow": {
      "enabled": false
    },
    "instruction_substitution": {
      "enabled": true,
      "complexity": 3
    },
    "function_inlining": {
      "enabled": false
    }
  },
  "compiler": {
    "name": "clang",
    "optimization_level": "O2",
    "target_architecture": "x86_64",
    "additional_flags": ["-fno-inline", "-fno-unroll-loops"]
  },
  "advanced": {
    "preserve_symbols": false,
    "strip_debug_info": true,
    "seed": null
  }
}

Using Configuration Files

# CLI
obfussor-cli obfuscate --config obfussor.json

# Or specify additional overrides
obfussor-cli obfuscate --config obfussor.json --intensity high

Configuration Sections

Input Configuration

Controls what source files to obfuscate and how to process them.

{
  "input": {
    "files": [
      "src/main.c",
      "src/module1.c",
      "src/module2.c"
    ],
    "include_dirs": [
      "include/",
      "third_party/include/"
    ],
    "defines": [
      "RELEASE_BUILD",
      "ENABLE_OBFUSCATION",
      "VERSION=1.0"
    ],
    "exclude_patterns": [
      "*_test.c",
      "debug_*.c"
    ]
  }
}

Options:

files: Array of source files to obfuscate
include_dirs: Include directories for compilation
defines: Preprocessor definitions
exclude_patterns: Glob patterns for files to exclude

Output Configuration

Controls output generation and reporting.

{
  "output": {
    "directory": "build/obfuscated",
    "basename": "myapp",
    "generate_ir": true,
    "generate_report": true,
    "report_format": "json",
    "report_file": "obfuscation-report.json",
    "ir_directory": "build/ir/",
    "preserve_structure": false
  }
}

Options:

directory: Output directory for obfuscated files
basename: Base name for output files
generate_ir: Generate intermediate LLVM IR files
generate_report: Create obfuscation report
report_format: Report format (json, html, text)
report_file: Custom report file name
ir_directory: Directory for IR files
preserve_structure: Maintain input directory structure

Technique Configuration

Each obfuscation technique can be configured individually.

Control Flow Flattening

{
  "control_flow_flattening": {
    "enabled": true,
    "intensity": "medium",
    "options": {
      "split_basic_blocks": true,
      "dispatch_type": "switch",
      "state_variable_type": "i32",
      "bogus_states": 5,
      "preserve_functions": ["main", "init_*"],
      "min_block_size": 3
    }
  }
}

Options:

enabled: Enable/disable the technique
intensity: Obfuscation intensity (low, medium, high)
split_basic_blocks: Split basic blocks before flattening
dispatch_type: Dispatch mechanism (switch, indirect)
state_variable_type: LLVM type for state variable
bogus_states: Number of unreachable bogus states
preserve_functions: Functions to exclude (glob patterns supported)
min_block_size: Minimum instructions per block to flatten
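To make the glob behavior of preserve_functions concrete, here is a small sketch that checks a function name against a pattern list using POSIX fnmatch. Whether Obfussor uses fnmatch-style matching internally is an assumption; is_preserved is a hypothetical helper, and the code requires a POSIX system.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 if function_name matches any glob pattern in the list,
 * meaning the function would be skipped by the flattening pass. */
int is_preserved(const char *function_name,
                 const char *const *patterns,
                 size_t pattern_count) {
    for (size_t i = 0; i < pattern_count; i++) {
        if (fnmatch(patterns[i], function_name, 0) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With the patterns "main" and "init_*" from the example configuration, main and init_network would be preserved, while compute_hash would be flattened.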

String Encryption

{
  "string_encryption": {
    "enabled": true,
    "algorithm": "aes128",
    "options": {
      "key_generation": "random",
      "encryption_key": null,
      "decrypt_function": "inline",
      "exclude_patterns": [
        "debug_*",
        "test_*"
      ],
      "min_length": 4,
      "encrypt_wide_strings": true
    }
  }
}

Options:

algorithm: Encryption algorithm (xor, aes128, aes256, custom)
key_generation: Key generation method (random, derived, fixed)
encryption_key: Fixed encryption key (hex string, null for random)
decrypt_function: Decryption function placement (inline, function, constructor)
exclude_patterns: String patterns to exclude
min_length: Minimum string length to encrypt
encrypt_wide_strings: Also encrypt wide character strings
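As a rough illustration of the simplest listed algorithm, xor, the sketch below applies a repeating-key XOR to a buffer. Because XOR is its own inverse, the same routine serves for build-time encryption and runtime decryption. The function name and key layout are hypothetical and do not describe Obfussor's actual implementation.

```c
#include <stddef.h>

/* XORs each byte of data with a repeating key. Calling it twice with the
 * same key restores the original bytes. */
void xor_crypt(unsigned char *data, size_t length,
               const unsigned char *key, size_t key_length) {
    if (key_length == 0) {
        return; /* nothing sensible to do without a key */
    }
    for (size_t i = 0; i < length; i++) {
        data[i] ^= key[i % key_length];
    }
}
```

An obfuscated binary would store only the encrypted bytes and call a routine like this just before the string is used. Repeating-key XOR is easy to reverse, which is why stronger options such as aes128 are offered.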

Bogus Control Flow

{
  "bogus_control_flow": {
    "enabled": true,
    "intensity": "medium",
    "options": {
      "injection_probability": 0.3,
      "max_bogus_blocks": 5,
      "opaque_predicate_complexity": 3,
      "use_external_functions": false,
      "preserve_semantics": true
    }
  }
}

Options:

injection_probability: Probability of injecting bogus code (0.0-1.0)
max_bogus_blocks: Maximum bogus blocks per function
opaque_predicate_complexity: Complexity of opaque predicates (1-5)
use_external_functions: Call external functions in bogus code
preserve_semantics: Ensure bogus code doesn't affect semantics
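
Bogus control flow relies on opaque predicates: conditions whose outcome is fixed but is not obvious to a static analyzer. The sketch below shows one well-known predicate and how it guards a block that can never run. It is a hand-written illustration; opaque_true and add_ten_with_bogus_branch are hypothetical names, and the predicates Obfussor actually generates may differ.

```cpp
#include <cassert>
#include <cstdint>

// Opaquely true predicate: x * (x + 1) is a product of two consecutive
// integers, so it is always even. Unsigned arithmetic wraps modulo 2^32,
// which preserves parity, so the result holds for every 32-bit x.
bool opaque_true(std::uint32_t x) {
    return (x * (x + 1u)) % 2u == 0u;
}

// The real computation sits on the branch that is always taken; the bogus
// block on the other branch can never execute.
std::uint32_t add_ten_with_bogus_branch(std::uint32_t x) {
    if (opaque_true(x)) {
        return x + 10u;         // original code
    }
    return (x << 3) ^ 0xDEADu;  // bogus block, never executed
}
```

A higher opaque_predicate_complexity would presumably select predicates that are harder to simplify than this one; the principle stays the same.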

Instruction Substitution

{
  "instruction_substitution": {
    "enabled": true,
    "complexity": 3,
    "options": {
      "substitute_arithmetic": true,
      "substitute_boolean": true,
      "mixed_boolean_arithmetic": true,
      "max_substitution_depth": 3,
      "preserve_performance": false
    }
  }
}

Options:

complexity: Substitution complexity level (1-5)
substitute_arithmetic: Replace arithmetic operations
substitute_boolean: Replace boolean operations
mixed_boolean_arithmetic: Use MBA (Mixed Boolean-Arithmetic) expressions
max_substitution_depth: Maximum recursive substitution depth
preserve_performance: Limit substitutions affecting performance
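
The identities below are standard Mixed Boolean-Arithmetic rewrites and show the kind of replacement instruction substitution performs. The function names are hypothetical and the exact rewrites chosen by Obfussor are not specified here; the sketch only demonstrates that each substituted form computes the same result as the original operator, including on overflow.

```cpp
#include <cassert>
#include <cstdint>

// x + y rewritten with the identity x + y == (x ^ y) + 2 * (x & y).
std::uint32_t add_mba(std::uint32_t x, std::uint32_t y) {
    return (x ^ y) + 2u * (x & y);
}

// x - y rewritten with the identity x - y == (x ^ y) - 2 * (~x & y).
std::uint32_t sub_mba(std::uint32_t x, std::uint32_t y) {
    return (x ^ y) - 2u * (~x & y);
}

// x ^ y rewritten with the identity x ^ y == (x | y) - (x & y).
std::uint32_t xor_mba(std::uint32_t x, std::uint32_t y) {
    return (x | y) - (x & y);
}

// One more level of substitution: the addition inside add_mba is itself
// replaced, which is what a substitution depth of 2 would mean here.
std::uint32_t add_mba_depth2(std::uint32_t x, std::uint32_t y) {
    return add_mba(x ^ y, 2u * (x & y));
}
```

Each extra level of depth multiplies the number of emitted instructions, which is the cost that preserve_performance is meant to limit.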

Function Inlining/Outlining

{
  "function_inlining": {
    "enabled": true,
    "strategy": "mixed",
    "options": {
      "inline_threshold": 100,
      "outline_threshold": 50,
      "inline_functions": ["small_*"],
      "outline_functions": ["compute_*"],
      "preserve_abi": true
    }
  }
}

Options:

strategy: Strategy (inline, outline, mixed, random)
inline_threshold: Maximum size for inlining (IR instructions)
outline_threshold: Minimum size for outlining
inline_functions: Function patterns to inline
outline_functions: Function patterns to outline
preserve_abi: Preserve ABI for external calls

Compiler Configuration

Configure the compilation process:

{
  "compiler": {
    "name": "clang",
    "version": "14.0",
    "optimization_level": "O2",
    "target_architecture": "x86_64",
    "target_os": "linux",
    "additional_flags": [
      "-fno-inline",
      "-fno-unroll-loops",
      "-fno-vectorize"
    ],
    "link_flags": [
      "-static",
      "-s"
    ],
    "emit_llvm": false
  }
}

Options:

name: Compiler executable (clang, gcc, clang++)
version: Required compiler version (optional)
optimization_level: Optimization level (O0, O1, O2, O3, Os, Oz)
target_architecture: Target architecture (x86_64, arm64, i386)
target_os: Target operating system (linux, windows, macos)
additional_flags: Extra compiler flags
link_flags: Linker flags
emit_llvm: Emit LLVM bitcode instead of native binary

Advanced Configuration

Advanced options for fine-tuning:

{
  "advanced": {
    "preserve_symbols": false,
    "strip_debug_info": true,
    "seed": 12345,
    "parallelism": 4,
    "cache_enabled": true,
    "cache_directory": ".obfussor-cache/",
    "verify_output": true,
    "log_level": "info",
    "dry_run": false
  }
}

Options:

preserve_symbols: Keep symbol names in output
strip_debug_info: Remove debug information
seed: Random seed for reproducible obfuscation (null for random)
parallelism: Number of parallel threads (0 for auto)
cache_enabled: Enable compilation cache
cache_directory: Cache directory location
verify_output: Verify obfuscated IR validity
log_level: Logging level (debug, info, warn, error)
dry_run: Perform dry run without generating output

Preset Configurations

Minimal Obfuscation

For development and debugging:

{
  "version": "1.0",
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "low"
    },
    "string_encryption": {
      "enabled": false
    }
  },
  "compiler": {
    "optimization_level": "O0"
  },
  "advanced": {
    "preserve_symbols": true,
    "strip_debug_info": false
  }
}

Balanced Obfuscation

For most production use cases:

{
  "version": "1.0",
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "medium"
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "aes128"
    },
    "instruction_substitution": {
      "enabled": true,
      "complexity": 3
    }
  },
  "compiler": {
    "optimization_level": "O2"
  },
  "advanced": {
    "preserve_symbols": false,
    "strip_debug_info": true
  }
}

Maximum Obfuscation

For maximum protection (performance impact):

{
  "version": "1.0",
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "high",
      "options": {
        "bogus_states": 10
      }
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "aes256"
    },
    "bogus_control_flow": {
      "enabled": true,
      "intensity": "high",
      "options": {
        "injection_probability": 0.5
      }
    },
    "instruction_substitution": {
      "enabled": true,
      "complexity": 5
    },
    "function_inlining": {
      "enabled": true,
      "strategy": "mixed"
    }
  },
  "compiler": {
    "optimization_level": "O3"
  },
  "advanced": {
    "preserve_symbols": false,
    "strip_debug_info": true
  }
}

Configuration Validation

Validate your configuration file:

obfussor-cli validate-config obfussor.json

Output:

✓ Configuration file is valid
✓ All techniques are properly configured
✓ Compiler settings are compatible
⚠ Warning: High intensity may significantly impact performance
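
The kind of range checking a validator performs can be sketched as follows. This is a hypothetical illustration, not the implementation behind validate-config: TechniqueSettings and validate are invented names, and only the value ranges (low/medium/high, 0.0-1.0, 1-5) are taken from the option lists in this chapter.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory view of a few documented settings.
struct TechniqueSettings {
    std::string intensity = "medium";     // low, medium, high
    double injection_probability = 0.3;   // 0.0-1.0
    int opaque_predicate_complexity = 3;  // 1-5
    int substitution_complexity = 3;      // 1-5
};

// Returns one message per violated range; an empty result means valid.
std::vector<std::string> validate(const TechniqueSettings& s) {
    std::vector<std::string> errors;
    if (s.intensity != "low" && s.intensity != "medium" && s.intensity != "high") {
        errors.push_back("intensity must be low, medium, or high");
    }
    if (s.injection_probability < 0.0 || s.injection_probability > 1.0) {
        errors.push_back("injection_probability must be within 0.0-1.0");
    }
    if (s.opaque_predicate_complexity < 1 || s.opaque_predicate_complexity > 5) {
        errors.push_back("opaque_predicate_complexity must be within 1-5");
    }
    if (s.substitution_complexity < 1 || s.substitution_complexity > 5) {
        errors.push_back("complexity must be within 1-5");
    }
    return errors;
}
```

Collecting every violation before reporting, instead of stopping at the first one, lets a user fix a configuration file in a single pass.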

Environment Variables

Override configuration with environment variables:

# Set default obfuscation intensity
export OBFUSSOR_INTENSITY=high

# Set compiler
export OBFUSSOR_COMPILER=clang-14

# Set parallelism
export OBFUSSOR_PARALLELISM=8

# Run with the environment overrides applied
obfussor-cli obfuscate --input main.c

GUI Configuration

Interactive Configuration

1. Launch the Obfussor application
2. Navigate to the Settings tab
3. Configure techniques:
   - Toggle each technique on/off
   - Adjust intensity sliders
   - Configure technique-specific options
4. Save the configuration:
   - Click Save Configuration
   - Choose a location for the config file
5. Load a configuration:
   - Click Load Configuration
   - Select a saved config file

Configuration Profiles

The GUI supports multiple named profiles:

Create Profile: Settings → New Profile
Switch Profile: Select from dropdown
Export Profile: Settings → Export → JSON/YAML
Import Profile: Settings → Import

Best Practices

1. Version Control Configuration

Store configuration files in version control:

project/
├── src/
├── obfussor-dev.json      # Development config
├── obfussor-release.json  # Release config
└── obfussor-max.json      # Maximum protection config

2. Incremental Configuration

Start minimal and add techniques incrementally:

# Start with basic
obfussor-cli obfuscate --config obfussor-basic.json --input main.c

# Test, then increase
obfussor-cli obfuscate --config obfussor-medium.json --input main.c

# Finally, apply maximum if needed
obfussor-cli obfuscate --config obfussor-max.json --input main.c

3. Performance Testing

Always measure performance impact:

# Benchmark original
time ./program_original

# Benchmark obfuscated
time ./program_obfuscated

# Compare and adjust configuration

4. Selective Obfuscation

Obfuscate only critical code:

{
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "options": {
        "preserve_functions": [
          "*",
          "!critical_*",
          "!secret_*"
        ]
      }
    }
  }
}

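A leading ! negates a pattern: matching functions are NOT preserved (i.e., they are obfuscated). In the example above, * preserves every function, while the negated patterns exempt critical_* and secret_* from preservation, so only those functions are flattened.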
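
One plausible way to evaluate such a pattern list is sketched below. The evaluation order is an assumption made for this example (patterns are applied in sequence and the last match wins); the documentation does not specify how Obfussor resolves conflicts. glob_match and is_preserved are hypothetical names, and the matcher supports only the * wildcard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal glob matcher supporting only '*' (any run of characters).
bool glob_match(const std::string& pattern, const std::string& text,
                std::size_t p = 0, std::size_t t = 0) {
    if (p == pattern.size()) return t == text.size();
    if (pattern[p] == '*') {
        // '*' matches zero characters, or consumes one character and retries.
        return glob_match(pattern, text, p + 1, t) ||
               (t < text.size() && glob_match(pattern, text, p, t + 1));
    }
    return t < text.size() && pattern[p] == text[t] &&
           glob_match(pattern, text, p + 1, t + 1);
}

// Assumed semantics: patterns are applied in order and the last matching
// pattern wins; a leading '!' removes the function from the preserved set.
bool is_preserved(const std::vector<std::string>& patterns,
                  const std::string& function_name) {
    bool preserved = false;
    for (const std::string& raw : patterns) {
        const bool negated = !raw.empty() && raw[0] == '!';
        const std::string pattern = negated ? raw.substr(1) : raw;
        if (glob_match(pattern, function_name)) preserved = !negated;
    }
    return preserved;
}
```

Under these assumptions, the list "*", "!critical_*", "!secret_*" preserves main but not critical_check, which matches the intent of obfuscating only critical code.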

5. Reproducible Builds

Use fixed seeds for reproducible obfuscation:

{
  "advanced": {
    "seed": 42
  }
}
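
The reason a fixed seed yields reproducible output is that every randomized choice is drawn from a deterministic pseudo-random generator. The sketch below demonstrates the principle with the standard std::mt19937 engine; which generator Obfussor uses internally is not documented here, and draw_decisions is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draw n raw decisions from a seeded generator.
// std::mt19937 produces a sequence fully determined by its seed, so two
// runs with the same seed make the same choices.
std::vector<std::uint32_t> draw_decisions(std::uint32_t seed, std::size_t n) {
    std::mt19937 rng(seed);
    std::vector<std::uint32_t> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<std::uint32_t>(rng()));
    }
    return out;
}
```

Two calls with seed 42 return identical sequences, while a different seed gives a different sequence. The same reasoning explains why a null seed produces a different binary on every build.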

Configuration Examples

Example 1: Mobile Application

{
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "medium"
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "xor"
    },
    "instruction_substitution": {
      "enabled": true,
      "complexity": 2
    }
  },
  "compiler": {
    "optimization_level": "Os",
    "target_architecture": "arm64"
  }
}

Example 2: Server Application

{
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "high"
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "aes256"
    },
    "bogus_control_flow": {
      "enabled": true
    }
  },
  "compiler": {
    "optimization_level": "O3"
  },
  "advanced": {
    "parallelism": 16
  }
}

Example 3: Embedded System

{
  "techniques": {
    "control_flow_flattening": {
      "enabled": true,
      "intensity": "low"
    },
    "string_encryption": {
      "enabled": true,
      "algorithm": "xor"
    }
  },
  "compiler": {
    "optimization_level": "Os",
    "target_architecture": "arm",
    "additional_flags": ["-mthumb"]
  },
  "advanced": {
    "verify_output": true
  }
}

Troubleshooting Configuration

Configuration Not Applied

Problem: Configuration seems ignored

Solution:

# Verify configuration is loaded
obfussor-cli obfuscate --config config.json --verbose

# Check for CLI argument overrides
# Ensure no conflicting environment variables

Invalid Configuration

Problem: Configuration validation fails

Solution:

# Validate JSON syntax
jq . config.json

# Use schema validation
obfussor-cli validate-config config.json --schema

Unexpected Results

Problem: Obfuscation doesn't match expectations

Solution:

# Enable detailed logging
obfussor-cli obfuscate --config config.json --log-level debug

# Generate detailed report
obfussor-cli obfuscate --config config.json --report-format html

Next Steps

Obfuscation Techniques: Learn about each technique
CLI Reference: Complete CLI documentation
Advanced Topics: Optimize your configuration
Troubleshooting: Solve common problems

With proper configuration, you can balance security, performance, and maintainability for your specific use case.

LLVM Overview

LLVM is a powerful compiler infrastructure that provides a modern, modular approach to compiler design. The name began as an initialism for "Low Level Virtual Machine", but the project no longer treats it as an acronym. Understanding LLVM is essential for grasping how Obfussor performs code obfuscation at the compiler level.

What is LLVM?

LLVM is not just a compiler, but a comprehensive collection of modular and reusable compiler and toolchain technologies. Despite its name containing "Virtual Machine," LLVM is not a traditional virtual machine - it's a compiler infrastructure designed around a language-independent intermediate representation (IR).

Key Characteristics

Modular Design: LLVM's architecture separates concerns into distinct, reusable components
Language Independence: Frontend-agnostic approach supports multiple source languages
Target Independence: Backend supports multiple target architectures
Optimization Framework: Sophisticated optimization infrastructure built on SSA form
Active Development: Continuously evolving with strong industry and academic support

LLVM Architecture

LLVM follows a three-phase design that separates compilation into distinct stages:

Source Code → Frontend → LLVM IR → Optimizer → LLVM IR → Backend → Machine Code

Three-Phase Architecture

1. Frontend

The frontend translates source code into LLVM IR:

Lexical Analysis: Tokenization of source code
Syntax Analysis: Parse tree construction
Semantic Analysis: Type checking and validation
IR Generation: Translation to LLVM IR

Popular frontends include:

Clang: C, C++, Objective-C
Swift: Swift language
Rust: Rust language (via rustc)
Julia: Julia language

2. Optimizer (Middle-End)

The optimizer transforms LLVM IR to improve performance:

Analysis Passes: Gather information about the code
Transformation Passes: Modify the IR to optimize it
Utility Passes: Provide helper functionality

Key optimizations:

Dead code elimination
Constant folding and propagation
Loop optimizations
Inlining
Scalar optimizations
Vectorization

3. Backend

The backend translates optimized IR to machine code:

Instruction Selection: Map IR to target instructions
Register Allocation: Assign virtual registers to physical registers
Instruction Scheduling: Optimize instruction order
Code Emission: Generate final machine code

Supported architectures:

x86/x86_64
ARM/ARM64 (AArch64)
RISC-V
PowerPC
MIPS
WebAssembly
And many more

Core Components

LLVM Intermediate Representation (IR)

The IR is the heart of LLVM - a low-level, typed, assembly-like language:

Example:

define i32 @add(i32 %a, i32 %b) {
  %result = add i32 %a, %b
  ret i32 %result
}

Characteristics:

Static Single Assignment (SSA) form
Strongly typed
Largely platform independent (the target triple and data layout are still recorded in each module)
Suitable for optimization
Readable and writable

PassManager

The PassManager orchestrates optimization and transformation passes:

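PassBuilder PB;  // C++ API example (new pass manager); M is an llvm::Module
LoopAnalysisManager LAM; FunctionAnalysisManager FAM;
CGSCCAnalysisManager CGAM; ModuleAnalysisManager MAM;
PB.registerModuleAnalyses(MAM);
PB.registerCGSCCAnalyses(CGAM);
PB.registerFunctionAnalyses(FAM);
PB.registerLoopAnalyses(LAM);
PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);  // analyses must be registered before running
ModulePassManager MPM;
MPM.addPass(createModuleToFunctionPassAdaptor(SimplifyCFGPass()));
MPM.addPass(createModuleToFunctionPassAdaptor(InstCombinePass()));
MPM.run(M, MAM);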

Types of Passes:

Module Passes: Operate on entire module
Function Passes: Operate on individual functions
BasicBlock Passes: Operate on basic blocks (legacy pass manager only; removed in LLVM 10)
Loop Passes: Operate on loop structures

Analysis Infrastructure

LLVM provides rich analysis capabilities:

Dominator Trees: Control flow dominance
Loop Information: Loop structure analysis
Alias Analysis: Memory dependency analysis
Call Graph: Function call relationships
Data Flow: Value flow analysis

LLVM Toolchain

Essential Tools

1. clang

C/C++/Objective-C compiler frontend:

clang -O2 -S -emit-llvm source.c -o source.ll

2. llc

LLVM IR to native assembly compiler:

llc -O2 source.ll -o source.s

3. opt

LLVM IR optimizer:

opt -O3 source.ll -S -o source_opt.ll

4. llvm-link

LLVM IR linker:

llvm-link module1.ll module2.ll -S -o combined.ll

5. llvm-dis

LLVM bitcode disassembler:

llvm-dis source.bc -o source.ll

6. llvm-as

LLVM IR assembler:

llvm-as source.ll -o source.bc

7. lli

LLVM IR interpreter and JIT compiler:

lli source.ll

Analysis and Debug Tools

llvm-objdump

Object file dumper:

llvm-objdump -d binary

llvm-nm

Symbol table viewer:

llvm-nm library.a

llvm-readobj

Object file reader:

llvm-readobj -h binary

llvm-config

LLVM configuration tool:

llvm-config --cxxflags --ldflags --libs core

LLVM in Compilation Pipeline

Typical Compilation Flow

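Preprocessing:

clang -E source.c -o source.i

Compilation to IR (without an optimization level clang marks every function optnone, so later opt runs would skip them; -Xclang -disable-O0-optnone avoids that):

clang -S -emit-llvm -Xclang -disable-O0-optnone source.i -o source.ll

Optimization:

opt -O3 source.ll -S -o source_opt.ll

Backend Compilation:

llc source_opt.ll -o source.s

Assembly:

as source.s -o source.o

Linking (through the clang driver, which adds the C runtime startup files and libc that a bare ld invocation would omit):

clang source.o -o executable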

Obfuscation Integration Point

Obfussor integrates into this pipeline at the IR level:

Source Code
    ↓
  Clang Frontend
    ↓
  LLVM IR ← ← ← Obfuscation Happens Here
    ↓
  Optimizer (opt)
    ↓
  Backend (llc)
    ↓
  Machine Code

Advantages:

Platform-independent obfuscation
Works with optimizations
Access to full program analysis
Language-agnostic

LLVM Design Principles

1. Static Single Assignment (SSA) Form

Every variable is assigned exactly once:

; SSA Form
define i32 @example(i32 %x) {
  %1 = add i32 %x, 1
  %2 = mul i32 %1, 2
  %3 = add i32 %2, 3
  ret i32 %3
}

Benefits:

Simplified optimization algorithms
Easier data flow analysis
Clearer def-use relationships

2. Type System

Strong, static typing throughout the IR:

; Type examples
i32                    ; 32-bit integer
i8*                    ; Pointer to 8-bit integer
[10 x i32]            ; Array of 10 32-bit integers
{i32, i8*, double}    ; Structure type
<4 x float>           ; Vector of 4 floats

3. Explicit Memory Model

Memory operations are explicit:

%ptr = alloca i32              ; Allocate stack memory
store i32 42, i32* %ptr        ; Store value
%val = load i32, i32* %ptr     ; Load value

4. Control Flow Representation

Control flow is expressed as a graph of basic blocks connected by explicit terminator instructions:

define i32 @max(i32 %a, i32 %b) {
entry:
  %cmp = icmp sgt i32 %a, %b
  br i1 %cmp, label %if.then, label %if.else

if.then:
  ret i32 %a

if.else:
  ret i32 %b
}

LLVM and Obfuscation

Why LLVM is Ideal for Obfuscation

IR-Level Transformations
- Platform-independent obfuscation
- Rich semantic information available
- Can leverage existing analyses
Modular Pass System
- Easy to add custom obfuscation passes
- Compose multiple techniques
- Integrate with standard optimizations
Strong Analysis Infrastructure
- Control flow analysis
- Data flow analysis
- Type information
- Aliasing information
Preservation of Semantics
- Type system catches many malformed transformations
- SSA form simplifies transformations
- Built-in verification passes

Common Obfuscation Strategies

LLVM enables various obfuscation approaches:

Control Flow Obfuscation
- Manipulate basic block structure
- Insert opaque predicates
- Flatten control flow
Data Obfuscation
- Encrypt constant values
- Transform data types
- Obscure memory access patterns
Instruction-Level Obfuscation
- Substitute instructions
- Insert dead code
- Use complex instruction patterns
Function-Level Obfuscation
- Inline/outline strategically
- Split or merge functions
- Obscure call graphs

Integration with Other Tools

Clang Integration

Obfussor works seamlessly with Clang:

# Compile with Clang to IR
clang -S -emit-llvm source.c -o source.ll

# Apply obfuscation
obfussor-cli obfuscate --input source.ll --output obfuscated.ll

# Continue compilation
llc obfuscated.ll -o obfuscated.s
clang obfuscated.s -o program

Build System Integration

Makefile:

%.obf.ll: %.ll
	obfussor-cli obfuscate --input $< --output $@

%.s: %.obf.ll
	llc $< -o $@

CMake:

add_custom_command(
    OUTPUT obfuscated.ll
    COMMAND obfussor-cli obfuscate --input source.ll --output obfuscated.ll
    DEPENDS source.ll
)

LLVM Version Compatibility

Obfussor supports LLVM versions:

LLVM Version	Support Status	Notes
14.x	Full Support	Recommended
15.x	Full Support	Current
16.x	Full Support	Latest
13.x	Limited	Some features unavailable
< 13.x	Not Supported	Too old

Learning Resources

Books

"Getting Started with LLVM Core Libraries" by Bruno Cardoso Lopes
"LLVM Essentials" by Mayur Pandey and Suyog Sarda
"LLVM Cookbook" by Mayur Pandey and Suyog Sarda

Summary

LLVM provides the foundation for Obfussor's obfuscation capabilities:

Modular Architecture: Clean separation of concerns
IR-Level Transformations: Platform-independent obfuscation
Rich Analysis: Deep understanding of code structure
Extensible Pass System: Easy integration of custom transformations
Strong Type System: Helps catch invalid transformations early
Industry Standard: Wide adoption and active development

Understanding LLVM is crucial for:

Configuring obfuscation effectively
Writing custom obfuscation passes
Debugging obfuscation issues
Optimizing obfuscation performance

Next Steps

LLVM IR Basics: Deep dive into LLVM IR structure
LLVM Pass System: Understanding the pass infrastructure
Compilation Pipeline: Complete compilation workflow
Obfuscation Techniques: How obfuscation leverages LLVM

With this foundation, you're ready to explore how Obfussor leverages LLVM for code protection.

LLVM IR Basics

LLVM Intermediate Representation (IR) is the core language that LLVM uses for program analysis and transformation. Understanding LLVM IR is essential for working with obfuscation techniques, as all transformations operate on this representation.

What is LLVM IR?

LLVM IR is a low-level, typed, assembly-like language that serves as a universal intermediate format between high-level source code and machine code. It combines:

Low-level operations: Close to machine instructions but platform-independent
Type safety: Strong static typing prevents invalid operations
SSA form: Static Single Assignment for optimization
Readability: Human-readable text format

Three Representations

LLVM IR exists in three equivalent forms:

1. Human-Readable Assembly (.ll files)

define i32 @add(i32 %a, i32 %b) {
  %result = add i32 %a, %b
  ret i32 %result
}

2. Bitcode (binary .bc files)

Compact binary format for storage and transmission:

llvm-as source.ll -o source.bc
llvm-dis source.bc -o source.ll

3. In-Memory Representation

C++ objects used by the compiler:

Function *F = ...;
BasicBlock *BB = ...;
Instruction *I = ...;

Basic Structure

Module

The top-level container representing a compilation unit:

; ModuleID = 'example.c'
source_filename = "example.c"
target datalayout = "..."
target triple = "x86_64-unknown-linux-gnu"

; Global variables
@global_var = global i32 42

; Function declarations
declare i32 @external_func(i32)

; Function definitions
define i32 @my_function(i32 %param) {
  ; ... function body ...
}

Functions

Functions are the primary unit of code:

define <return_type> @function_name(<parameters>) {
  ; function body
}

Example:

define i32 @multiply(i32 %x, i32 %y) {
entry:
  %result = mul i32 %x, %y
  ret i32 %result
}

Basic Blocks

Basic blocks are sequences of instructions with single entry and exit:

define i32 @example(i32 %n) {
entry:                              ; First basic block
  %cmp = icmp sgt i32 %n, 0
  br i1 %cmp, label %positive, label %negative

positive:                           ; Second basic block
  %pos_result = add i32 %n, 1
  ret i32 %pos_result

negative:                           ; Third basic block
  %neg_result = sub i32 0, %n
  ret i32 %neg_result
}

Rules:

Must have exactly one entry (label)
Must have exactly one terminator (ret, br, switch, etc.)
No branches except at the end
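
These rules are mechanical enough to check in a few lines. The sketch below models a block as a list of opcode names and tests the terminator rules; it is a simplified illustration with hypothetical function names, and it recognizes only a subset of LLVM's terminator instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Opcodes treated as terminators in this sketch (not the full LLVM list).
bool is_terminator(const std::string& opcode) {
    return opcode == "ret" || opcode == "br" || opcode == "switch" ||
           opcode == "unreachable";
}

// A block is well formed when it is non-empty, its last instruction is a
// terminator, and no terminator appears earlier in the block.
bool is_well_formed_block(const std::vector<std::string>& opcodes) {
    if (opcodes.empty()) return false;
    for (std::size_t i = 0; i + 1 < opcodes.size(); ++i) {
        if (is_terminator(opcodes[i])) return false;
    }
    return is_terminator(opcodes.back());
}
```

LLVM's own verifier enforces the same invariants, along with many others, every time a module is checked.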

Instructions

Instructions are operations within basic blocks:

%result = add i32 %x, %y          ; Arithmetic
%ptr = getelementptr i32, i32* %base, i32 %offset  ; Address computation (no memory access)
store i32 %value, i32* %ptr       ; Memory write
%loaded = load i32, i32* %ptr     ; Memory read
br label %next                     ; Control flow

Type System

LLVM IR has a rich, strongly-typed type system:

Primitive Types

Integer Types

i1      ; Boolean (1 bit)
i8      ; Byte (8 bits)
i16     ; Short (16 bits)
i32     ; Int (32 bits)
i64     ; Long (64 bits)
i128    ; 128-bit integer

Floating Point Types

half    ; 16-bit floating point
float   ; 32-bit floating point (IEEE 754)
double  ; 64-bit floating point (IEEE 754)
x86_fp80 ; 80-bit floating point (x87)
fp128   ; 128-bit floating point

Special Types

void    ; No value (for functions)
label   ; Basic block labels
metadata ; Metadata for debug info

Derived Types

Pointers

i32*           ; Pointer to 32-bit integer
i8**           ; Pointer to pointer to 8-bit integer
void (i32)*    ; Pointer to function taking i32, returning void
ptr            ; Opaque pointer (the default from LLVM 15; replaces the typed forms above)

Arrays

[10 x i32]           ; Array of 10 32-bit integers
[5 x [3 x double]]   ; 2D array of doubles

Structures

{i32, i8*, double}              ; Literal (anonymous) structure
<{i32, i8*, double}>            ; Packed structure (no padding between members)
{i32, [10 x i8], i32*}         ; With array member
%struct.Point = type {float, float}  ; Named structure

Vectors

<4 x i32>      ; Vector of 4 32-bit integers (SIMD)
<8 x float>    ; Vector of 8 floats

Static Single Assignment (SSA)

Every value in LLVM IR is assigned exactly once:

Non-SSA (C-like):

int x = 5;
x = x + 1;
x = x * 2;

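LLVM IR as emitted without optimization (each register is defined exactly once; the mutable variable itself lives in stack memory until the mem2reg pass promotes it to SSA values):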

%x1 = alloca i32
store i32 5, i32* %x1
%x2 = load i32, i32* %x1
%x3 = add i32 %x2, 1
store i32 %x3, i32* %x1
%x4 = load i32, i32* %x1
%x5 = mul i32 %x4, 2
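
The renaming idea behind SSA can be sketched for straight-line code in a few lines of C++. This is a simplified illustration, not LLVM's algorithm: Stmt and to_ssa are hypothetical names, operands are treated as literals when they start with a digit, and branches (which would require phi nodes) are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// One straight-line statement: dest = lhs op rhs.
struct Stmt {
    std::string dest, lhs, op, rhs;
};

// Rename straight-line code into SSA: each assignment creates a fresh
// version of its destination, and each use refers to the latest version.
// Branches (and therefore phi nodes) are out of scope for this sketch.
std::vector<Stmt> to_ssa(const std::vector<Stmt>& code) {
    std::map<std::string, int> version;
    auto use = [&](const std::string& operand) {
        const bool is_literal =
            !operand.empty() && std::isdigit(static_cast<unsigned char>(operand[0]));
        if (is_literal) return operand;
        return operand + std::to_string(version[operand]);
    };
    std::vector<Stmt> out;
    for (const Stmt& s : code) {
        Stmt r{"", use(s.lhs), s.op, use(s.rhs)};  // rename uses first
        r.dest = s.dest + std::to_string(++version[s.dest]);
        out.push_back(r);
    }
    return out;
}
```

Applied to the three C statements above (writing the initialization as x = 5 + 0), the result is x1 = 5 + 0, x2 = x1 + 1, x3 = x2 * 2: every name is defined once and each use points at exactly one definition.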

Phi Nodes

Phi nodes merge values from different control flow paths:

define i32 @select_max(i32 %a, i32 %b) {
entry:
  %cmp = icmp sgt i32 %a, %b
  br i1 %cmp, label %if.then, label %if.else

if.then:
  br label %if.end

if.else:
  br label %if.end

if.end:
  %result = phi i32 [ %a, %if.then ], [ %b, %if.else ]
  ret i32 %result
}

The phi node selects:

%a if coming from %if.then
%b if coming from %if.else

Instruction Categories

Arithmetic Instructions

; Integer arithmetic
%sum = add i32 %a, %b
%diff = sub i32 %a, %b
%prod = mul i32 %a, %b
%quot = sdiv i32 %a, %b    ; Signed division
%rem = srem i32 %a, %b     ; Signed remainder

; Floating point arithmetic
%fsum = fadd float %x, %y
%fdiff = fsub float %x, %y
%fprod = fmul float %x, %y
%fquot = fdiv float %x, %y

Bitwise Instructions

%and_result = and i32 %a, %b
%or_result = or i32 %a, %b
%xor_result = xor i32 %a, %b
%shl_result = shl i32 %a, 2      ; Shift left
%lshr_result = lshr i32 %a, 2    ; Logical shift right
%ashr_result = ashr i32 %a, 2    ; Arithmetic shift right

Comparison Instructions

; Integer comparisons
%eq = icmp eq i32 %a, %b      ; Equal
%ne = icmp ne i32 %a, %b      ; Not equal
%sgt = icmp sgt i32 %a, %b    ; Signed greater than
%slt = icmp slt i32 %a, %b    ; Signed less than
%ugt = icmp ugt i32 %a, %b    ; Unsigned greater than

; Float comparisons
%feq = fcmp oeq float %x, %y  ; Ordered equal
%fgt = fcmp ogt float %x, %y  ; Ordered greater than

Memory Instructions

; Stack allocation
%ptr = alloca i32
%arr = alloca [10 x i32]

; Store
store i32 42, i32* %ptr
store i32 %value, i32* %ptr, align 4

; Load
%value = load i32, i32* %ptr
%aligned = load i32, i32* %ptr, align 4

; Pointer arithmetic
%elem_ptr = getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 5

Control Flow Instructions

; Unconditional branch
br label %target

; Conditional branch
br i1 %condition, label %true_bb, label %false_bb

; Switch
switch i32 %value, label %default [
  i32 0, label %case0
  i32 1, label %case1
  i32 2, label %case2
]

; Return
ret i32 %result
ret void

Call Instructions

; Direct call
%result = call i32 @function(i32 %arg1, i32 %arg2)

; Indirect call through function pointer
%fn_ptr = load i32 (i32, i32)*, i32 (i32, i32)** %fptr_var
%result = call i32 %fn_ptr(i32 %arg1, i32 %arg2)

; Tail call (optimization)
%result = tail call i32 @function(i32 %arg)

Conversion Instructions

; Integer truncation/extension
%trunc = trunc i32 %value to i8
%zext = zext i8 %byte to i32      ; Zero extend
%sext = sext i8 %byte to i32      ; Sign extend

; Float conversions
%to_float = sitofp i32 %int to float
%to_int = fptosi float %f to i32

; Pointer/integer conversions
%int = ptrtoint i8* %ptr to i64
%ptr = inttoptr i64 %int to i8*

; Bitcast (reinterpret bits)
%float_bits = bitcast i32 %int to float
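
For readers more familiar with C++ than with IR, the sketch below shows C++ expressions with the same effect as four of these conversions. The function names are hypothetical. The narrowing cast inside sext_i8_to_i32 is implementation-defined before C++20, although mainstream compilers all produce the two's complement result.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// trunc i32 -> i8: keep the low 8 bits.
std::uint8_t trunc_i32_to_i8(std::uint32_t v) { return static_cast<std::uint8_t>(v); }

// zext i8 -> i32: fill the upper bits with zeros.
std::uint32_t zext_i8_to_i32(std::uint8_t b) { return static_cast<std::uint32_t>(b); }

// sext i8 -> i32: copy the sign bit into the upper bits.
std::int32_t sext_i8_to_i32(std::uint8_t b) {
    return static_cast<std::int32_t>(static_cast<std::int8_t>(b));
}

// bitcast i32 -> float: same 32 bits, different type.
float bitcast_i32_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

The byte 0xFF illustrates the difference between the two extensions: zero extension yields 255, sign extension yields -1.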

Constants

Integer Constants

i32 42
i32 -17
i1 true
i1 false

Floating Point Constants

float 3.14
double 2.718281828

Null and Undefined

i32* null               ; Null pointer
i32 undef              ; Undefined value
i32 poison             ; Poison value (LLVM 12+)

Aggregate Constants

[3 x i32] [i32 1, i32 2, i32 3]
{i32, float} {i32 42, float 3.14}
<4 x i32> <i32 1, i32 2, i32 3, i32 4>

Constant Expressions

@global = global i32* getelementptr (i32, i32* @array, i32 5)
@ptr = global i8* bitcast (i32* @value to i8*)

Attributes

Attributes provide additional information:

Function Attributes

define i32 @example() nounwind readnone {
  ret i32 42
}

; Common attributes:
; - nounwind: doesn't throw exceptions
; - readnone: doesn't read/write memory
; - readonly: doesn't write memory
; - alwaysinline: force inline
; - noinline: prevent inlining

Parameter Attributes

define void @example(i32* noalias %ptr, i32 signext %value) {
  ; ...
}

; Common attributes:
; - noalias: pointer doesn't alias
; - readonly: parameter not modified
; - nocapture: pointer not captured
; - signext/zeroext: sign/zero extended

Calling Conventions

define fastcc i32 @fast_function(i32 %arg) {
  ; ...
}

; Conventions:
; - ccc: C calling convention (default)
; - fastcc: Fast calling convention
; - coldcc: Cold calling convention

Metadata

Metadata provides debugging and optimization hints:

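define i32 @example(i32 %n) !dbg !3 {
  %result = add i32 %n, 1, !dbg !6
  ret i32 %result
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2}
!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1)
!1 = !DIFile(filename: "example.c", directory: "/path")
!2 = !{i32 2, !"Debug Info Version", i32 3}
; A function's !dbg attachment must be a DISubprogram, and a location's scope
; must be a local scope such as that subprogram (not the DIFile).
!3 = distinct !DISubprogram(name: "example", scope: !1, file: !1, line: 4, type: !4, unit: !0, spFlags: DISPFlagDefinition)
!4 = !DISubroutineType(types: !5)
!5 = !{null}
!6 = !DILocation(line: 5, column: 12, scope: !3)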

Example: Complete Function

Here's a complete example showing various IR features:

; Function: Compute factorial
define i64 @factorial(i64 %n) {
entry:
  ; Check if n <= 1
  %cmp = icmp sle i64 %n, 1
  br i1 %cmp, label %base_case, label %recursive_case

base_case:
  ; Base case: return 1
  ret i64 1

recursive_case:
  ; Recursive case: n * factorial(n-1)
  %n_minus_1 = sub i64 %n, 1
  %rec_result = call i64 @factorial(i64 %n_minus_1)
  %result = mul i64 %n, %rec_result
  ret i64 %result
}

Working with LLVM IR

Generating IR from C

# Generate human-readable IR
clang -S -emit-llvm example.c -o example.ll

# Generate optimized IR
clang -S -emit-llvm -O2 example.c -o example_opt.ll

# Generate bitcode
clang -c -emit-llvm example.c -o example.bc

Inspecting IR

# View IR
cat example.ll
less example.ll

# Disassemble bitcode
llvm-dis example.bc -o example.ll

# View with syntax highlighting
vim example.ll  # or your preferred editor

Validating IR

# Check IR is well-formed
opt -passes=verify example.ll -disable-output

# Re-run the verifier after every pass of a pipeline
opt -passes='default<O2>' -verify-each example.ll -disable-output

IR in Obfuscation

Understanding IR is crucial for obfuscation:

Why IR Level?

Platform Independence: Transform once, compile anywhere
Rich Information: Type and structure information available
Analysis Power: Leverage LLVM's analysis passes
Composability: Combine with standard optimizations

Transformation Examples

Original:

define i32 @simple(i32 %x) {
  %result = add i32 %x, 10
  ret i32 %result
}

After Control Flow Flattening:

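define i32 @simple(i32 %x) {
entry:
  %state = alloca i32
  %result.slot = alloca i32       ; values that cross blocks are demoted to memory
  store i32 0, i32* %state
  br label %dispatcher

dispatcher:
  %s = load i32, i32* %state
  switch i32 %s, label %exit [
    i32 0, label %block0
    i32 1, label %block1
  ]

block0:
  %result = add i32 %x, 10
  store i32 %result, i32* %result.slot
  store i32 1, i32* %state
  br label %dispatcher

block1:
  ; block0 no longer dominates block1, so the value is reloaded from memory
  %ret = load i32, i32* %result.slot
  ret i32 %ret

exit:
  unreachable
}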

Common Patterns

Allocating and Using Local Variables

define void @local_vars() {
  %x = alloca i32
  store i32 42, i32* %x
  %val = load i32, i32* %x
  ; use %val...
  ret void
}

Array Access

define i32 @array_access() {
  %arr = alloca [10 x i32]
  %elem_ptr = getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 5
  store i32 42, i32* %elem_ptr
  %val = load i32, i32* %elem_ptr
  ret i32 %val
}

Structure Access

%struct.Point = type { float, float }

define float @get_x(%struct.Point* %p) {
  %x_ptr = getelementptr %struct.Point, %struct.Point* %p, i32 0, i32 0
  %x = load float, float* %x_ptr
  ret float %x
}
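
The getelementptr indices in the two patterns above translate directly into byte offsets. The sketch below computes them in C++ for comparison; the helper names are hypothetical, and the offsets assume the usual layout in which float and i32 are 4 bytes with no padding between the two members of Point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Point {
    float x;
    float y;
};

// Byte offset selected by "getelementptr [10 x i32], ..., i32 0, i32 idx":
// the first index steps over whole arrays, the second over elements.
std::size_t array_elem_offset(std::size_t array_index, std::size_t elem_index) {
    return array_index * sizeof(std::int32_t[10]) + elem_index * sizeof(std::int32_t);
}

// Byte offset of struct field 1 (y), as "i32 0, i32 1" would select.
std::size_t point_y_offset() { return offsetof(Point, y); }
```

The leading i32 0 in both patterns is the index that steps over whole objects of the pointee type; it is zero because the pointer refers to a single array or struct, not to an array of them.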

Summary

LLVM IR is:

Low-level but platform-independent
Strongly typed, which catches many invalid transformations
In SSA form simplifying analysis
Human-readable for debugging
The foundation for LLVM transformations

Key concepts:

Modules contain functions
Functions contain basic blocks
Basic blocks contain instructions
All values are typed
SSA form with phi nodes
Rich instruction set for operations

Next Steps

LLVM Pass System: Learn about transformation passes
Compilation Pipeline: See IR in the full compilation flow
Obfuscation Techniques: How techniques transform IR

Mastering LLVM IR is essential for understanding and customizing obfuscation techniques.

LLVM Pass System

The LLVM Pass framework is the infrastructure that enables code analysis and transformation. Understanding the pass system is essential for implementing and using obfuscation techniques in Obfussor.

What is an LLVM Pass?

An LLVM Pass is a unit of compilation work that performs analysis or transformation on LLVM IR. Passes are:

Modular: Self-contained units of functionality
Composable: Can be combined in sequences
Reusable: Can be applied to different modules
Analyzable: Can depend on other passes

Pass Types

1. Module Pass

Operates on entire modules (all functions and globals):

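struct MyModulePass : public ModulePass {
  static char ID;
  MyModulePass() : ModulePass(ID) {}

  bool runOnModule(Module &M) override {
    bool Changed = false;
    // Process all functions in module
    for (Function &F : M) {
      // Process function; set Changed = true if the IR is modified
    }
    return Changed; // true only if the module was modified
  }
};
char MyModulePass::ID = 0;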

Use Cases:

Inter-procedural analysis
Global transformations
Call graph construction

2. Function Pass

Operates on individual functions:

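struct MyFunctionPass : public FunctionPass {
  static char ID;
  MyFunctionPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    bool Changed = false;
    // Process all basic blocks
    for (BasicBlock &BB : F) {
      // Process basic block; set Changed = true if the IR is modified
    }
    return Changed; // true only if the function was modified
  }
};
char MyFunctionPass::ID = 0;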

Use Cases:

Intra-procedural optimizations
Function-level obfuscation
Local analysis

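3. BasicBlock Pass (Legacy)

Operates on individual basic blocks. BasicBlockPass was removed in LLVM 10; the example below applies to older releases, and newer code should iterate over the blocks inside a function pass instead:

struct MyBasicBlockPass : public BasicBlockPass {
  static char ID;
  MyBasicBlockPass() : BasicBlockPass(ID) {}
  
  bool runOnBasicBlock(BasicBlock &BB) override {
    bool Changed = false;
    for (Instruction &I : BB) {
      // Process instruction; set Changed = true when the IR is modified
    }
    return Changed; // True only if the basic block was modified
  }
};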

Use Cases:

Local optimizations
Instruction-level transformations

4. Loop Pass

Operates on loop structures:

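struct MyLoopPass : public LoopPass {
  static char ID;
  MyLoopPass() : LoopPass(ID) {}
  
  bool runOnLoop(Loop *L, LPPassManager &LPM) override {
    bool Changed = false;
    // Process loop
    for (BasicBlock *BB : L->blocks()) {
      // Process blocks in loop; set Changed = true when the IR is modified
    }
    return Changed; // True only if the loop was modified
  }
};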

Use Cases:

Loop optimizations
Loop obfuscation
Loop vectorization

Pass Manager

The Pass Manager orchestrates pass execution:

Legacy Pass Manager (Default Before LLVM 13)

legacy::PassManager PM;
PM.add(createPromoteMemoryToRegisterPass());
PM.add(new MyCustomPass());
PM.run(M); // M is the llvm::Module to transform

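New Pass Manager (Default Since LLVM 13)

// Create the analysis managers and register the standard analyses
LoopAnalysisManager LAM;
FunctionAnalysisManager FAM;
CGSCCAnalysisManager CGAM;
ModuleAnalysisManager MAM;

PassBuilder PB;
PB.registerModuleAnalyses(MAM);
PB.registerCGSCCAnalyses(CGAM);
PB.registerFunctionAnalyses(FAM);
PB.registerLoopAnalyses(LAM);
PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

// Add function passes
FunctionPassManager FPM;
FPM.addPass(SimplifyCFGPass());
FPM.addPass(InstCombinePass());

// Add function pass manager to module pass manager
ModulePassManager MPM;
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));

// Run passes (M is the llvm::Module to transform)
MPM.run(M, MAM);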

Pass Dependencies

Passes can declare dependencies on other passes:

void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
  // This pass requires dominator tree
  AU.addRequired<DominatorTreeWrapperPass>();
  
  // This pass preserves CFG
  AU.setPreservesCFG();
  
  // Alternatively, for a pass that never modifies the IR:
  // AU.setPreservesAll();
}

// Using the analysis
bool MyPass::runOnFunction(Function &F) {
  DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
  // Use dominator tree...
  return false; // Return true only if the function was modified
}

Common Analysis Passes

Dominator Tree

Computes dominance relationships:

DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();

if (DT.dominates(BB1, BB2)) {
  // BB1 dominates BB2
}

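// The entry block has no immediate dominator, so check for null
if (DomTreeNode *IDomNode = DT.getNode(BB)->getIDom()) {
  BasicBlock *IDom = IDomNode->getBlock();
}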
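
To make the dominance relation concrete, here is a minimal sketch of the classic iterative dominator computation on a toy CFG. It uses only standard-library containers, not LLVM's DominatorTree. The name computeDominators and the predecessor-list representation are assumptions made for illustration, every block is assumed reachable from the entry, and LLVM itself uses a more efficient algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// preds[b] lists the predecessors of block b; block 0 is the entry block.
// Returns dom, where dom[b][a] == true means "block a dominates block b".
std::vector<std::vector<bool>> computeDominators(
    const std::vector<std::vector<std::size_t>> &preds) {
  const std::size_t n = preds.size();
  // Start from "every block dominates every block" and shrink the sets.
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  if (n == 0) return dom;

  // The entry block is dominated only by itself.
  dom[0].assign(n, false);
  dom[0][0] = true;

  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Dom(b) = {b} union (intersection of Dom(p) over all predecessors p)
      std::vector<bool> next(n, true);
      for (std::size_t p : preds[b]) {
        assert(p < n && "predecessor index out of range");
        for (std::size_t a = 0; a < n; ++a) next[a] = next[a] && dom[p][a];
      }
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, both of which join at 3), the entry block dominates block 3 but neither branch does, because block 3 can be reached through either one.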

Loop Information

Analyzes loop structure:

LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();

// Iterating LoopInfo visits top-level loops only; use L->getSubLoops() for nested loops
for (Loop *L : LI) {
  BasicBlock *Header = L->getHeader();
  unsigned Depth = L->getLoopDepth();
  // Process loop
}

Alias Analysis

Determines memory aliasing:

AliasAnalysis &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();

if (AA.alias(Ptr1, Ptr2) == AliasResult::NoAlias) {
  // Pointers don't alias
}

Call Graph

Represents function call relationships:

CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();

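for (auto &Node : CG) {
  const Function *F = Node.first; // Null for the external calling node
  for (auto &CallRecord : *Node.second) {
    // Null when the callee is not known statically (e.g., an indirect call)
    Function *Callee = CallRecord.second->getFunction();
  }
}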

Writing a Custom Pass

Step 1: Define Pass Class

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
  struct CountInstructionsPass : public FunctionPass {
    static char ID;
    CountInstructionsPass() : FunctionPass(ID) {}
    
    bool runOnFunction(Function &F) override {
      unsigned Count = 0;
      for (BasicBlock &BB : F) {
        Count += BB.size();
      }
      errs() << "Function " << F.getName() 
             << " has " << Count << " instructions\n";
      return false; // Didn't modify the function
    }
  };
}

char CountInstructionsPass::ID = 0;

Step 2: Register the Pass

static RegisterPass<CountInstructionsPass> X(
  "count-instructions",
  "Count instructions in functions",
  false,  // CFGOnly: the pass looks at more than the CFG
  true    // is_analysis: the pass only inspects the IR
);

Step 3: Build and Load

# Build pass as shared library
clang++ -shared -fPIC MyPass.cpp -o MyPass.so \
  `llvm-config --cxxflags --ldflags`

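# Load and run pass (legacy pass manager syntax; the new pass manager
# loads plugins with -load-pass-plugin and selects passes with -passes=...)
opt -load MyPass.so -count-instructions < input.bc > output.bc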

Pass Scheduling

The pass manager groups consecutive function passes so that each function is processed by the whole group before the next function is visited:

Module Pass 1
  Function Pass A (on each function)
  Function Pass B (on each function)
Module Pass 2
  Function Pass C (on each function)

This reduces:

Redundant analysis
Cache misses
Compilation time
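
The grouping described above can be sketched with a toy scheduler. The code below is not LLVM's pass manager. The names scheduleGrouped and scheduleUngrouped are hypothetical, and functions and passes are represented by plain strings so that the execution order is easy to see.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Function-major order (what grouping achieves): every pass in the group
// runs on one function before the next function is visited.
std::vector<std::string> scheduleGrouped(
    const std::vector<std::string> &functions,
    const std::vector<std::string> &passes) {
  std::vector<std::string> order;
  for (const std::string &fn : functions)
    for (const std::string &pass : passes)
      order.push_back(pass + "(" + fn + ")");
  assert(order.size() == functions.size() * passes.size());
  return order;
}

// Pass-major order, shown only for contrast: each pass sweeps over all
// functions, so every function is revisited once per pass.
std::vector<std::string> scheduleUngrouped(
    const std::vector<std::string> &functions,
    const std::vector<std::string> &passes) {
  std::vector<std::string> order;
  for (const std::string &pass : passes)
    for (const std::string &fn : functions)
      order.push_back(pass + "(" + fn + ")");
  assert(order.size() == functions.size() * passes.size());
  return order;
}
```

With functions f and g and passes A and B, the grouped order is A(f), B(f), A(g), B(g), so each function is still in cache when the second pass reaches it.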

Obfuscation Passes

Control Flow Flattening Pass

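struct FlatteningPass : public FunctionPass {
  static char ID;
  FlatteningPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    // Skip declarations and already flat functions
    // (isAlreadyFlat is a helper defined elsewhere)
    if (F.isDeclaration() || isAlreadyFlat(&F)) return false;
    
    LLVMContext &Ctx = F.getContext();
    BasicBlock &Entry = F.getEntryBlock();
    
    // Collect the blocks to dispatch (everything except the entry block)
    std::vector<BasicBlock*> Blocks;
    for (BasicBlock &BB : F) {
      if (&BB != &Entry) Blocks.push_back(&BB);
    }
    if (Blocks.empty()) return false;
    
    // Create switch variable (allocas belong in the entry block)
    IRBuilder<> EntryBuilder(&Entry, Entry.getFirstInsertionPt());
    AllocaInst *SwitchVar = 
      EntryBuilder.CreateAlloca(EntryBuilder.getInt32Ty(), nullptr, "state");
    
    // Create dispatcher block and an unreachable default target
    BasicBlock *Dispatcher = 
      BasicBlock::Create(Ctx, "dispatcher", &F);
    BasicBlock *DefaultBlock = 
      BasicBlock::Create(Ctx, "default", &F);
    IRBuilder<>(DefaultBlock).CreateUnreachable();
    
    // Build switch instruction on the loaded state value
    IRBuilder<> DispatchBuilder(Dispatcher);
    Value *State = DispatchBuilder.CreateLoad(
      DispatchBuilder.getInt32Ty(), SwitchVar, "state.val");
    SwitchInst *Switch = DispatchBuilder.CreateSwitch(
      State, DefaultBlock, Blocks.size());
    
    // Update blocks to branch to dispatcher
    for (unsigned i = 0; i < Blocks.size(); ++i) {
      Switch->addCase(DispatchBuilder.getInt32(i), Blocks[i]);
      // Modify terminator to update state and branch to dispatcher
      // ... implementation details ...
    }
    
    return true;
  }
};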
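
The effect of flattening is easier to see at source level than in IR. The sketch below shows a small loop next to a hand-flattened equivalent in which a state variable and a dispatcher switch replace the structured control flow. The function names and the state encoding are hypothetical. A real pass performs this rewrite on basic blocks and typically uses less predictable state values.

```cpp
#include <cassert>

// Original, structured version: sum of 1..n.
unsigned sumTo(unsigned n) {
  unsigned total = 0;
  for (unsigned i = 1; i <= n; ++i) total += i;
  return total;
}

// Flattened version: each original block becomes a switch case, and the
// state variable decides which block runs next.
unsigned sumToFlattened(unsigned n) {
  enum State { Init, Check, Body, Exit };
  State state = Init;
  unsigned total = 0;
  unsigned i = 0;
  for (;;) {               // dispatcher loop
    switch (state) {
      case Init:  total = 0; i = 1; state = Check; break;
      case Check: state = (i <= n) ? Body : Exit; break;
      case Body:  total += i; ++i; state = Check; break;
      case Exit:  return total;
    }
  }
}

// Flattening must preserve behavior; this checks a range of small inputs.
void checkFlatteningPreservesBehavior() {
  for (unsigned n = 0; n < 100; ++n) assert(sumTo(n) == sumToFlattened(n));
}
```

Both versions compute the same result, but the flattened one no longer exposes the loop structure directly: every transition goes through the dispatcher.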

String Encryption Pass

struct StringEncryptionPass : public ModulePass {
  static char ID;
  StringEncryptionPass() : ModulePass(ID) {}

  bool runOnModule(Module &M) override {
    bool Changed = false;
    for (GlobalVariable &GV : M.globals()) {
      if (!GV.hasInitializer()) continue;
      
      Constant *Init = GV.getInitializer();
      if (ConstantDataArray *CDA = dyn_cast<ConstantDataArray>(Init)) {
        if (CDA->isString()) {
          // Encrypt the string (encryptString is a helper defined elsewhere)
          std::string Original = CDA->getAsString().str();
          std::vector<uint8_t> Encrypted = encryptString(Original);
          
          // Replace with encrypted version
          Constant *NewInit = ConstantDataArray::get(
            M.getContext(), Encrypted);
          GV.setInitializer(NewInit);
          // If decryption happens in place, the global must be writable
          GV.setConstant(false);
          
          // Insert decryption code at usage sites
          insertDecryptionCode(&GV, M);
          Changed = true;
        }
      }
    }
    return Changed; // True only if at least one string was encrypted
  }
};
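
The pass above relies on an encryption helper and on matching decryption code at run time. As a rough illustration of that pairing, the following sketch uses a rolling XOR key. The functions xorEncrypt and xorDecrypt are hypothetical stand-ins, not Obfussor's actual cipher, and XOR is chosen only because it is short and symmetric.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encrypts each byte with a key that advances by one per position.
std::vector<std::uint8_t> xorEncrypt(const std::string &plain, std::uint8_t key) {
  std::vector<std::uint8_t> out;
  out.reserve(plain.size());
  for (std::size_t i = 0; i < plain.size(); ++i) {
    const std::uint8_t k = static_cast<std::uint8_t>(key + i);  // rolling key byte
    out.push_back(static_cast<std::uint8_t>(
        static_cast<std::uint8_t>(plain[i]) ^ k));
  }
  return out;
}

// Decryption applies the same XOR stream, which restores the original bytes.
std::string xorDecrypt(const std::vector<std::uint8_t> &enc, std::uint8_t key) {
  std::string out;
  out.reserve(enc.size());
  for (std::size_t i = 0; i < enc.size(); ++i) {
    const std::uint8_t k = static_cast<std::uint8_t>(key + i);
    out.push_back(static_cast<char>(enc[i] ^ k));
  }
  return out;
}

// Encrypting and then decrypting must return the input unchanged.
void checkRoundTrip(const std::string &s, std::uint8_t key) {
  assert(xorDecrypt(xorEncrypt(s, key), key) == s);
}
```

The encrypted bytes are what would be stored in the binary, while the decryption routine corresponds to the code inserted at usage sites.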

Pass Options and Configuration

Passes can accept options:

static cl::opt<unsigned> ObfuscationLevel(
  "obf-level",
  cl::desc("Obfuscation intensity level (1-5)"),
  cl::init(3)
);

struct ConfigurablePass : public FunctionPass {
  static char ID;
  ConfigurablePass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    unsigned Level = ObfuscationLevel; // Value of -obf-level (3 by default)
    // Apply obfuscation based on level
    return true;
  }
};

Use from command line:

opt -load ObfPass.so -my-pass -obf-level=5 < input.bc > output.bc
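
One way an intensity level could drive a pass is by controlling how many basic blocks get transformed. The sketch below is an assumption about how such a mapping might look, not Obfussor's actual policy: levelToProbability and selectBlocks are hypothetical names, and the linear level-to-probability mapping is chosen only for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical mapping from an intensity level (1-5) to the probability
// that a given block is obfuscated. Out-of-range levels are clamped.
double levelToProbability(unsigned level) {
  if (level < 1) level = 1;
  if (level > 5) level = 5;
  return static_cast<double>(level) / 5.0;
}

// Chooses which of numBlocks blocks to transform. The same seed always
// yields the same selection, which keeps builds reproducible.
std::vector<std::size_t> selectBlocks(std::size_t numBlocks, unsigned level,
                                      unsigned seed) {
  std::mt19937 rng(seed);
  std::bernoulli_distribution pick(levelToProbability(level));
  std::vector<std::size_t> chosen;
  for (std::size_t i = 0; i < numBlocks; ++i)
    if (pick(rng)) chosen.push_back(i);
  assert(chosen.size() <= numBlocks);
  return chosen;
}
```

At level 5 every block is selected, while lower levels select a seed-dependent subset.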

Pass Debugging

Print IR Before/After

# Print IR after each pass
opt -print-after-all -O2 input.ll -S -o output.ll

# Print only specific pass
opt -print-after=my-pass input.ll -S -o output.ll

Verify IR

# Run verifier after each pass
opt -verify-each -O2 input.ll -S -o output.ll

Debug Pass Execution

#include "llvm/Support/Debug.h"

#define DEBUG_TYPE "my-pass"

LLVM_DEBUG(dbgs() << "Processing function: " << F.getName() << "\n");
LLVM_DEBUG(dbgs() << "Found " << Count << " instructions\n");

Enable debug output (requires an LLVM build with assertions enabled):

opt -debug -debug-only=my-pass -my-pass < input.bc > output.bc

Best Practices

1. Preserve Analysis When Possible

void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesCFG(); // If CFG unchanged
  AU.addPreserved<LoopInfoWrapperPass>(); // If loops unchanged
}
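
Why preservation matters can be shown with a toy analysis cache. The class below is a simplified model, not LLVM's analysis manager. ToyAnalysisCache and its methods are hypothetical, and the "analysis" is just a block count so that recomputations are easy to track.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Caches one analysis result and recomputes it only after a pass
// reports that it did not preserve the analysis.
class ToyAnalysisCache {
public:
  // Returns the cached result, recomputing it if it was invalidated.
  std::size_t getBlockCount(const std::vector<int> &blocks) {
    if (!valid_) {
      cached_ = blocks.size();  // stand-in for an expensive analysis
      valid_ = true;
      ++computations_;
    }
    assert(valid_);
    return cached_;
  }

  // Called after each transformation pass with what the pass declared.
  void afterPass(bool preservedAnalysis) {
    if (!preservedAnalysis) valid_ = false;
  }

  unsigned computations() const { return computations_; }

private:
  bool valid_ = false;
  std::size_t cached_ = 0;
  unsigned computations_ = 0;
};
```

A pass that declares the analysis preserved lets the next consumer reuse the cached result, while a pass that does not forces a recomputation.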

2. Update Analysis After Modification

DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();

// Modify IR
BasicBlock *NewBB = SplitBlock(BB, I, &DT);

// SplitBlock keeps DT up to date because it was passed in

3. Use LLVM IR Builder

IRBuilder<> Builder(Context);
Builder.SetInsertPoint(InsertBefore);

Value *Sum = Builder.CreateAdd(A, B, "sum");
Value *Product = Builder.CreateMul(Sum, C, "product");

4. Handle Edge Cases

bool runOnFunction(Function &F) override {
  // Skip declarations
  if (F.isDeclaration()) return false;
  
  // Skip functions with specific attributes
  if (F.hasFnAttribute("no-obfuscate")) return false;
  
  // Process function
  return true;
}

Integration with Obfussor

Obfussor uses custom passes for each obfuscation technique:

Source Code
    ↓
  LLVM IR
    ↓
  Control Flow Flattening Pass
    ↓
  String Encryption Pass
    ↓
  Bogus Control Flow Pass
    ↓
  Instruction Substitution Pass
    ↓
  Optimization Passes
    ↓
  Obfuscated Binary

Each pass:

Operates on LLVM IR
Preserves semantics
Can be enabled/disabled
Has configurable intensity
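
As an illustration of what "preserves semantics" means for instruction substitution, the sketch below replaces an addition and a subtraction with longer but equivalent bitwise expressions. The function names are hypothetical, and the identities are standard two's-complement facts on 32-bit unsigned integers rather than the specific substitutions Obfussor applies.

```cpp
#include <cassert>
#include <cstdint>

// Straightforward operations.
std::uint32_t addPlain(std::uint32_t a, std::uint32_t b) { return a + b; }
std::uint32_t subPlain(std::uint32_t a, std::uint32_t b) { return a - b; }

// a + b == (a ^ b) + 2 * (a & b): XOR adds without carries, AND finds the carries.
std::uint32_t addSubstituted(std::uint32_t a, std::uint32_t b) {
  return (a ^ b) + ((a & b) << 1);
}

// a - b == a + ~b + 1: two's-complement negation of b.
std::uint32_t subSubstituted(std::uint32_t a, std::uint32_t b) {
  return a + ~b + 1u;
}

// The substituted forms must agree with the plain ones, including on overflow.
void checkSubstitutions(std::uint32_t a, std::uint32_t b) {
  assert(addPlain(a, b) == addSubstituted(a, b));
  assert(subPlain(a, b) == subSubstituted(a, b));
}
```

Both forms produce identical results for every input, including wraparound cases, but the substituted versions no longer look like a single add or sub to someone reading the disassembly.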

Summary

The LLVM Pass system:

Provides modular transformation framework
Enables analysis and optimization
Supports custom passes for obfuscation
Manages dependencies automatically
Schedules passes efficiently

Key concepts:

Different pass types (Module, Function, BasicBlock, Loop)
Pass Manager orchestrates execution
Analysis passes provide information
Transformation passes modify IR
Dependencies ensure correct ordering

Next Steps

Compilation Pipeline: See passes in action
Obfuscation Techniques: Obfuscation passes
Custom Passes: Write your own passes

The pass system is the engine that powers LLVM obfuscation.

Compilation Pipeline

Understanding the complete LLVM compilation pipeline is essential for knowing where and how obfuscation fits into the build process. This chapter explains the end-to-end compilation flow and shows where Obfussor hooks into it.

Standard LLVM Compilation Pipeline

Complete Flow

Source Code (.c, .cpp)
        ↓
    Preprocessor
        ↓
Preprocessed Source (.i)
        ↓
Frontend (Clang)
        ↓
   LLVM IR (.ll or .bc)
        ↓
   Optimizer (opt)
        ↓
 Optimized LLVM IR
        ↓
   Backend (llc)
        ↓
Assembly Code (.s)
        ↓
Assembler (as)
        ↓
Object File (.o)
        ↓
  Linker (ld)
        ↓
Executable/Library

Phase-by-Phase Breakdown

1. Preprocessing

clang -E source.c -o source.i

What Happens:

Includes header files (#include)
Expands macros (#define)
Processes conditionals (#ifdef)
Removes comments

Output: Preprocessed source code

2. Compilation to IR

clang -S -emit-llvm source.i -o source.ll

What Happens:

Lexical analysis (tokenization)
Syntax analysis (parsing)
Semantic analysis (type checking)
IR generation

Output: LLVM IR (human-readable .ll or bitcode .bc)

3. Optimization

opt -O3 source.ll -S -o source_opt.ll

What Happens:

Analysis passes gather information
Transformation passes modify IR
Dead code elimination
Function inlining
Loop optimizations
Constant propagation

Output: Optimized LLVM IR

4. Backend Compilation

llc -O2 source_opt.ll -o source.s

What Happens:

Instruction selection
Register allocation
Instruction scheduling
Code emission

Output: Assembly code for target architecture

5. Assembly

as source.s -o source.o

What Happens:

Convert assembly to machine code
Generate object file format (ELF, Mach-O, COFF)

Output: Object file

6. Linking

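# Using the compiler driver, which adds the startup files and the C library:
clang source.o -o executable
# Or invoking the linker directly (startup files and libraries must be passed manually):
ld source.o -o executable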

What Happens:

Resolve symbols
Combine object files
Link libraries
Generate executable

Output: Final executable or library

Obfuscation-Enhanced Pipeline

Where Obfuscation Fits

Source Code
        ↓
    Frontend
        ↓
   LLVM IR
        ↓
┌───────────────────┐
│ OBFUSCATION LAYER │ ← Obfussor operates here
│                   │
│ • Control Flow    │
│ • String Encrypt  │
│ • Bogus Code      │
│ • Inst. Subst.    │
└───────────────────┘
        ↓
Obfuscated LLVM IR
        ↓
   Optimizer
        ↓
    Backend
        ↓
Obfuscated Binary
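
The "Bogus Code" entry in the diagram relies on opaque predicates: conditions whose outcome is fixed but not obvious to an analyst. The sketch below is a source-level illustration under that assumption. The names opaqueTrue and squareWithBogusFlow are hypothetical, and the predicate shown is deliberately simple; production obfuscators use predicates that are harder for both humans and optimizers to resolve.

```cpp
#include <cassert>
#include <cstdint>

// x * (x + 1) is a product of two consecutive integers, so it is always
// even. Unsigned wraparound modulo 2^32 does not change the parity.
bool opaqueTrue(std::uint32_t x) { return (x * (x + 1u)) % 2u == 0u; }

// Plain computation.
std::uint32_t square(std::uint32_t x) { return x * x; }

// Same computation guarded by the opaque predicate. The else path is
// never taken, but it adds a branch that static analysis has to consider.
std::uint32_t squareWithBogusFlow(std::uint32_t x) {
  if (opaqueTrue(x)) {
    return x * x;                 // real path
  }
  return (x ^ 0xDEADBEEFu) + 7u;  // bogus path, never executed
}

// The bogus branch must not change observable behavior.
void checkBogusFlow(std::uint32_t x) {
  assert(opaqueTrue(x));
  assert(square(x) == squareWithBogusFlow(x));
}
```

Because the predicate always holds, the transformed function returns the same value as the original for every input.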

Obfuscation Pipeline

# 1. Compile to IR
clang -S -emit-llvm source.c -o source.ll

# 2. Apply obfuscation passes
opt -load ObfuscatorPass.so \
    -control-flow-flattening \
    -string-encryption \
    -bogus-control-flow \
    source.ll -S -o obfuscated.ll

# 3. Optimize obfuscated IR
opt -O2 obfuscated.ll -S -o obfuscated_opt.ll

# 4. Compile to binary
llc obfuscated_opt.ll -o obfuscated.s
clang obfuscated.s -o program

Integration Methods

Method 1: Standalone Pass

Apply obfuscation as separate compilation step:

# Standard pipeline with obfuscation inserted
clang -S -emit-llvm source.c -o source.ll
obfussor-cli obfuscate --input source.ll --output obf.ll
opt -O2 obf.ll -S -o obf_opt.ll
llc obf_opt.ll -o obf.s
clang obf.s -o program

Method 2: Integrated with opt

Load obfuscation passes into opt:

opt -load /path/to/ObfuscatorPass.so \
    -control-flow-flattening \
    -string-encryption \
    -O2 \
    source.ll -o obfuscated.bc

Method 3: Compiler Plugin

Use Clang plugin interface:

clang -fplugin=/path/to/ObfuscatorPlugin.so \
      -mllvm -obfuscate \
      source.c -o program

Method 4: LTO (Link-Time Optimization)

Apply obfuscation during link time:

# Compile with LTO
clang -flto -c source1.c -o source1.o
clang -flto -c source2.c -o source2.o

# Link with obfuscation
clang -flto source1.o source2.o \
      -Wl,-mllvm=-obfuscate \
      -o program

Obfussor CLI Integration

Basic Usage

obfussor-cli obfuscate \
  --input source.c \
  --output obfuscated \
  --techniques cff,str,bog

Internal Pipeline:

source.c → clang → IR → Obfuscation Passes → opt → llc → Binary
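
The step from the --techniques list to a sequence of passes could look roughly like the sketch below. The mapping from the abbreviations cff, str, and bog to the opt-style pass flags used earlier in this chapter is an assumption inferred from those names, and expandTechniques is a hypothetical helper, not part of obfussor-cli.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Expands a comma-separated technique list into pass flags, keeping the
// order given by the user. Unknown abbreviations are rejected.
std::vector<std::string> expandTechniques(const std::string &list) {
  // Assumed abbreviation table; the real tool may differ.
  static const std::map<std::string, std::string> known = {
      {"cff", "-control-flow-flattening"},
      {"str", "-string-encryption"},
      {"bog", "-bogus-control-flow"},
  };

  std::vector<std::string> flags;
  std::istringstream stream(list);
  std::string abbrev;
  while (std::getline(stream, abbrev, ',')) {
    const auto it = known.find(abbrev);
    if (it == known.end())
      throw std::invalid_argument("unknown technique: " + abbrev);
    assert(!it->second.empty());  // table entries are never empty
    flags.push_back(it->second);
  }
  return flags;
}
```

Rejecting unknown names early avoids silently building a binary with fewer protections than requested.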

Advanced Configuration

obfussor-cli obfuscate \
  --input source.c \
  --output obfuscated \
  --config obf-config.json \
  --ir-output obfuscated.ll \
  --optimization-level O2

With Custom Passes:

obfussor-cli obfuscate \
  --input source.c \
  --output obfuscated \
  --custom-pass /path/to/MyPass.so \
  --pass-options "level=5,seed=42"
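
A --pass-options string such as the one above has to be split into individual settings before a pass can use it. The following is a minimal parsing sketch. The PassOptions struct, the parsePassOptions function, and the rule that only level and seed are accepted are assumptions for illustration; the actual option handling in obfussor-cli is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical settings a custom pass might accept.
struct PassOptions {
  unsigned level = 3;  // default intensity
  unsigned seed = 0;   // default random seed
};

// Parses "key=value,key=value" into PassOptions.
// Throws std::invalid_argument on malformed or unknown entries.
PassOptions parsePassOptions(const std::string &spec) {
  PassOptions options;
  std::istringstream stream(spec);
  std::string item;
  while (std::getline(stream, item, ',')) {
    const std::size_t eq = item.find('=');
    if (eq == std::string::npos)
      throw std::invalid_argument("expected key=value: " + item);
    assert(eq < item.size());
    const std::string key = item.substr(0, eq);
    // std::stoul throws std::invalid_argument if the value is not numeric.
    const unsigned long value = std::stoul(item.substr(eq + 1));
    if (key == "level") {
      options.level = static_cast<unsigned>(value);
    } else if (key == "seed") {
      options.seed = static_cast<unsigned>(value);
    } else {
      throw std::invalid_argument("unknown option: " + key);
    }
  }
  return options;
}
```

Passing a fixed seed is what makes a randomized obfuscation run repeatable, which helps when comparing builds or reproducing a bug.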

Build System Integration

Makefile Integration

CC = clang
OBFUSSOR = obfussor-cli
OPT_LEVEL = -O2

# Cancel the built-in rule so objects are built from obfuscated IR, not directly from .c
%.o: %.c

# Obfuscation rules
%.ll: %.c
	$(CC) -S -emit-llvm $< -o $@

%.obf.ll: %.ll
	$(OBFUSSOR) obfuscate --input $< --output $@

%.o: %.obf.ll
	$(CC) $(OPT_LEVEL) -c $< -o $@

# Link
program: main.o utils.o
	$(CC) $^ -o $@

.PHONY: clean
clean:
	rm -f *.ll *.o program

CMake Integration

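# Find LLVM
find_package(LLVM REQUIRED CONFIG)
include_directories(${LLVM_INCLUDE_DIRS})

# Custom command for obfuscation (assumes CMAKE_C_COMPILER is clang)
function(add_obfuscated_executable target)
    set(sources ${ARGN})
    set(obfuscated_objects "")
    
    foreach(src ${sources})
        # Generate IR
        set(ir_file "${CMAKE_BINARY_DIR}/${src}.ll")
        add_custom_command(
            OUTPUT ${ir_file}
            COMMAND ${CMAKE_C_COMPILER} -S -emit-llvm 
                    ${CMAKE_SOURCE_DIR}/${src} -o ${ir_file}
            DEPENDS ${src}
        )
        
        # Obfuscate IR
        set(obf_file "${CMAKE_BINARY_DIR}/${src}.obf.ll")
        add_custom_command(
            OUTPUT ${obf_file}
            COMMAND obfussor-cli obfuscate 
                    --input ${ir_file} --output ${obf_file}
            DEPENDS ${ir_file}
        )
        
        # Compile obfuscated IR to an object file (CMake has no built-in rule for .ll sources)
        set(obj_file "${CMAKE_BINARY_DIR}/${src}.obf.o")
        add_custom_command(
            OUTPUT ${obj_file}
            COMMAND ${CMAKE_C_COMPILER} -c ${obf_file} -o ${obj_file}
            DEPENDS ${obf_file}
        )
        
        list(APPEND obfuscated_objects ${obj_file})
    endforeach()
    
    # Link the prebuilt objects; the linker language must be set explicitly
    add_executable(${target} ${obfuscated_objects})
    set_target_properties(${target} PROPERTIES LINKER_LANGUAGE C)
endfunction()

# Usage
add_obfuscated_executable(my_program main.c utils.c)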

Bazel Integration

# BUILD file
load("//tools:obfuscation.bzl", "obfuscated_cc_binary")

obfuscated_cc_binary(
    name = "my_program",
    srcs = ["main.c", "utils.c"],
    obfuscation_config = "obf-config.json",
)

Multi-File Projects

Approach 1: Individual File Obfuscation

# Obfuscate each file separately
for src in *.c; do
    base="${src%.c}"
    clang -S -emit-llvm "$src" -o "$base.ll"
    obfussor-cli obfuscate --input "$base.ll" --output "$base.obf.ll"
done

# Compile and link
clang *.obf.ll -o program

Approach 2: Whole Program Obfuscation

# Combine the IR files of all sources (generate one .ll per source file first)
llvm-link $(find . -name "*.ll") -S -o combined.ll

# Obfuscate combined IR
obfussor-cli obfuscate --input combined.ll --output obfuscated.ll

# Compile to binary
llc obfuscated.ll -o obfuscated.s
clang obfuscated.s -o program

Approach 3: LTO-based

# Compile with LTO
clang -flto -c *.c

# Link with obfuscation at link time
clang -flto -fuse-ld=gold -Wl,-plugin-opt=obfuscate *.o -o program

Cross-Compilation

Targeting Different Architectures

# Compile for ARM64
clang -target aarch64-linux-gnu -S -emit-llvm source.c -o source.ll

# Obfuscate (the passes are target-independent; the IR already carries the ARM64 triple)
obfussor-cli obfuscate --input source.ll --output obf.ll

# Compile for ARM64
llc -march=aarch64 obf.ll -o obf.s
aarch64-linux-gnu-gcc obf.s -o program-arm64

Multi-Target Build

#!/bin/bash

TARGETS=("x86_64-linux-gnu" "aarch64-linux-gnu" "arm-linux-gnueabi")

for target in "${TARGETS[@]}"; do
    # Generate IR for this target (the IR embeds the target triple and data layout)
    clang -target "$target" -S -emit-llvm source.c -o "source-${target}.ll"
    
    # Obfuscate (the same passes apply to every target's IR)
    obfussor-cli obfuscate --input "source-${target}.ll" --output "obf-${target}.ll"
    
    # Compile for specific target
    llc -mtriple="$target" "obf-${target}.ll" -o "obf-${target}.s"
    "${target}-gcc" "obf-${target}.s" -o "program-${target}"
done

Optimize Before Obfuscation

# Optimize first
opt -O3 source.ll -S -o optimized.ll

# Then obfuscate
obfussor-cli obfuscate --input optimized.ll --output obf.ll

Pros:

Better performance
Cleaner IR for obfuscation

Cons:

Overhead introduced by obfuscation is not optimized afterwards

Optimize After Obfuscation

# Obfuscate first
obfussor-cli obfuscate --input source.ll --output obf.ll

# Then optimize
opt -O2 obf.ll -S -o obf_opt.ll

Pros:

Reduces the overhead added by obfuscation
Can optimize obfuscated code

Cons:

Optimizations may simplify or undo parts of the obfuscation
May have performance impact

Recommended: Both

# Light optimization before
opt -O1 source.ll -S -o pre_opt.ll

# Obfuscate
obfussor-cli obfuscate --input pre_opt.ll --output obf.ll

# Optimize after (carefully)
opt -O2 -disable-simplify-cfg obf.ll -S -o final.ll

Debugging Obfuscated Code

Preserve Debug Info

# Compile with debug info
clang -g -S -emit-llvm source.c -o source.ll

# Obfuscate while preserving debug metadata
obfussor-cli obfuscate --input source.ll --output obf.ll \
    --preserve-debug-info

# Compile with debug info
llc -filetype=obj obf.ll -o obf.o
clang -g obf.o -o program

Separate Debug and Release Pipelines

# Debug build (minimal obfuscation)
obfussor-cli obfuscate \
    --input source.ll \
    --output debug.ll \
    --config debug-config.json  # Minimal obfuscation

# Release build (maximum obfuscation)
obfussor-cli obfuscate \
    --input source.ll \
    --output release.ll \
    --config release-config.json  # Maximum obfuscation

Performance Profiling

Measure Compilation Time

#!/bin/bash

echo "Baseline compilation:"
time clang -O2 source.c -o baseline

echo "With obfuscation:"
time obfussor-cli obfuscate \
    --input source.c \
    --output obfuscated \
    --config obf-config.json

Measure Runtime Impact

# Build both versions
clang -O2 source.c -o baseline
obfussor-cli obfuscate --input source.c --output obfuscated

# Benchmark
echo "Baseline:"
time ./baseline

echo "Obfuscated:"
time ./obfuscated
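
When the comparison needs to be repeated or automated, the same measurement can be done in-process. The sketch below is one possible harness, with averageMillis and overheadPercent as hypothetical helper names. It measures wall-clock time with std::chrono::steady_clock and reports the slowdown of the obfuscated variant relative to the baseline.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `runs` times and returns the average wall-clock time
// per run in milliseconds.
template <typename Work>
double averageMillis(Work work, int runs) {
  assert(runs > 0);
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < runs; ++i) work();
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / runs;
}

// Relative slowdown of the obfuscated build in percent:
// 0 means no overhead, 100 means twice as slow.
double overheadPercent(double baselineMs, double obfuscatedMs) {
  assert(baselineMs > 0.0);
  return (obfuscatedMs - baselineMs) / baselineMs * 100.0;
}
```

For example, a baseline of 2 ms and an obfuscated time of 3 ms corresponds to an overhead of 50 percent. Averaging over several runs reduces noise from caches and scheduling.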

Summary

The LLVM compilation pipeline:

Transforms source code through multiple stages
Accepts obfuscation at the IR level
Lets obfuscation run at several points (standalone step, opt, compiler plugin, LTO)
Supports multiple build systems
Works with cross-compilation
