From Finding to Proof: How Automated Exploit Generation Works
A vulnerability finding that can be confirmed with a working proof-of-concept is categorically different from an unvalidated alert. The former is actionable; the latter is noise. The smart contract security tooling ecosystem has historically been good at generating alerts and poor at separating real exploitability from theoretical possibility.
This post explains how automated exploit generation works, what it can and cannot do, and why bytecode-level analysis is the foundation it requires.
Why Exploit Generation Is Hard
Generating a working exploit is harder than it sounds, even when the vulnerability is known.
Consider a classic reentrancy finding. Static analysis identifies that withdraw() calls msg.sender before updating balances[msg.sender]. The analysis is correct. But turning that into a working exploit requires:
- Deploying an attacker contract with a receive() function that calls withdraw() recursively
- Sizing the initial deposit so the recursive drain is profitable
- Respecting gas limits — the recursive depth that maximizes profit while fitting in a block
- Handling protocol-specific guards — withdrawal limits, cooldowns, circuit breakers
- Accounting for the execution environment — actual balances at a specific block height
Each of these requires knowledge that static analysis alone does not provide. This is why exploit generation requires combining static analysis with symbolic execution and dynamic simulation.
The Pipeline: Finding to Proof
The exploit generation pipeline operates in three stages.
flowchart LR
classDef process fill:#1a2233,stroke:#7ea8d4,stroke-width:2px,color:#c0d8f0
classDef data fill:#332a1a,stroke:#d4b870,stroke-width:2px,color:#f0e0c0
classDef success fill:#1a331a,stroke:#a8c686,stroke-width:2px,color:#c8e8b0
classDef highlight fill:#332519,stroke:#e8a87c,stroke-width:2px,color:#f0d8c0
F["Static Finding"]:::data
C["Constraint Synthesis"]:::process
S["Symbolic Execution"]:::process
P["Proof Generation"]:::process
V["Validation"]:::process
R["Confirmed Exploit"]:::success
X["Unconfirmed / Adjusted Confidence"]:::highlight
F --> C --> S --> P --> V
V --> R
V --> X
Stage 1: Constraint Synthesis
The first stage takes a finding and extracts the conditions required for exploitation. For each vulnerability class, the synthesis engine knows the structural requirements.
For reentrancy, the constraints are:
- A caller-controlled address receives ETH or a token callback
- The callback can call back into the vulnerable function before the balance is updated
- A minimum profitable balance exists in the contract
For access control (missing owner check on a sensitive function):
- The function is callable from any address (no CALLER comparison before the sensitive operation)
- The sensitive operation has meaningful impact (selfdestruct, arbitrary transfer, implementation change)
For oracle manipulation (instant spot price from an AMM):
- A flash loan source exists with sufficient liquidity to move the oracle price
- The vulnerable protocol reads the oracle in the same transaction window the attacker controls
- The oracle-dependent operation produces a profitable outcome when price is manipulated
These constraints become the search specification for the symbolic execution stage.
Stage 2: Symbolic Execution
Symbolic execution explores program paths without concrete inputs, maintaining symbolic expressions for unknown values. The exploit generator uses it to find concrete inputs that satisfy the exploitation constraints.
Example: reentrancy in withdraw(uint256 amount)
Symbolic state at withdraw() entry:
balances[ATTACKER] = 1 ether (concrete — from storage at current block)
amount = α (symbolic — attacker-controlled)
Path constraint for profitable drain:
α ≤ 1 ether (balance check passes)
msg.sender.call{value: α}("") succeeds (recursive call returns true)
balances[ATTACKER] is read again before SSTORE completes
Solver query:
∃ α such that the above path is reachable and ETH drained > gas cost
Solution:
α = 1 ether
Recursive depth: 3 (given current contract balance of 3 ether)
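The solver query can be approximated concretely. The sketch below enumerates candidate withdrawal sizes and picks the (α, depth) pair with the highest net profit; the flat per-call gas cost is an assumption, and a real engine derives candidates symbolically rather than by enumeration:

```python
ETHER = 10**18

def find_drain_parameters(contract_balance: int, attacker_deposit: int,
                          gas_cost_per_call: int) -> tuple[int, int, int]:
    """Concrete stand-in for the solver query: return the
    (alpha, depth, profit) triple with the largest net profit.
    The flat per-call gas model is an illustrative assumption."""
    best = (0, 0, 0)
    # Candidate withdrawal sizes in 0.1-ether steps, capped by the
    # attacker's deposit (the balance check must pass).
    for alpha in range(ETHER // 10, attacker_deposit + 1, ETHER // 10):
        depth = contract_balance // alpha  # recursive calls that succeed
        profit = depth * alpha - attacker_deposit - depth * gas_cost_per_call
        if profit > best[2]:
            best = (alpha, depth, profit)
    return best

# The worked numbers: 3 ether contract balance, 1 ether deposit.
alpha, depth, profit = find_drain_parameters(3 * ETHER, ETHER, 10**15)
```

With these inputs the search lands on α = 1 ether at depth 3, matching the solution above: larger withdrawals drain the same total in fewer calls, so gas overhead is minimized.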
The symbolic engine handles path explosion — the exponential growth in paths when branching conditions are complex — by applying heuristics that prioritize paths leading to security-critical operations. Paths that do not reach CALL, DELEGATECALL, SSTORE, or SELFDESTRUCT are deprioritized early.
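A minimal version of that heuristic, scoring a path by whether its opcode trace reaches a security-critical operation, might look like the following sketch (prioritize is an illustrative name):

```python
# Opcodes whose reachability marks a path as security-critical.
CRITICAL_OPCODES = {"CALL", "DELEGATECALL", "SSTORE", "SELFDESTRUCT"}

def prioritize(paths: list[list[str]]) -> list[list[str]]:
    """Sort candidate paths so those reaching a critical opcode are
    explored first; others are deprioritized, not discarded."""
    return sorted(paths, key=lambda ops: not CRITICAL_OPCODES.intersection(ops))
```

Deprioritized paths remain in the queue; they are only explored once the budget allows.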
Stage 3: Proof Generation and Validation
A candidate exploit is a transaction sequence. For reentrancy:
- Deploy attacker contract with recursive receive() hook
- Call deposit() with a small initial balance (to satisfy the balance check)
- Call withdraw() — trigger the recursive drain
- Measure the final balance difference
The validation stage simulates this transaction sequence against a fork of the chain at the target block, replaying the exact state — token balances, protocol configuration, oracle readings — at the block where the vulnerability was analyzed.
If the simulation produces a positive profit for the attacker address (ETH drained exceeds gas cost plus initial deposit), the exploit is confirmed and the finding is upgraded to Exploitable status. If the simulation reverts, partial drain occurs, or the profit is negative, the finding retains its static confidence score with a note that automated generation did not succeed.
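The validation rule reduces to a strict net-profit check. A hedged sketch follows; SimulationResult and classify are illustrative names, not the pipeline's real types:

```python
from dataclasses import dataclass

@dataclass
class SimulationResult:
    reverted: bool
    eth_drained: int      # wei received by the attacker
    gas_cost: int         # wei spent on gas
    initial_deposit: int  # wei the attacker had to stake

def classify(result: SimulationResult) -> str:
    """Confirm only on strictly positive net profit; otherwise the
    finding retains its static confidence score."""
    if result.reverted:
        return "unconfirmed"
    profit = result.eth_drained - result.gas_cost - result.initial_deposit
    return "exploitable" if profit > 0 else "unconfirmed"
```

A partial drain that fails to cover gas plus deposit therefore counts as unconfirmed, even though some ETH moved.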
Vulnerability Classes and Generation Success Rates
Different vulnerability classes have different exploit generation success rates. The rates reflect both technical difficulty and how precisely the vulnerability is described by static patterns.
flowchart TD
classDef success fill:#1a331a,stroke:#a8c686,stroke-width:2px,color:#c8e8b0
classDef process fill:#1a2233,stroke:#7ea8d4,stroke-width:2px,color:#c0d8f0
classDef highlight fill:#332519,stroke:#e8a87c,stroke-width:2px,color:#f0d8c0
subgraph HighSuccess ["High Generation Success (70-90%)"]
A["Classic Reentrancy"]:::success
B["Missing Access Control on Selfdestruct"]:::success
C["Arbitrary Storage Write"]:::success
D["Unprotected Initialization"]:::success
end
subgraph MedSuccess ["Medium Generation Success (40-65%)"]
E["Cross-Function Reentrancy"]:::process
F["Flash Loan Oracle Manipulation"]:::process
G["Integer Overflow (pre-0.8)"]:::process
H["Unsafe Delegatecall"]:::process
end
subgraph LowSuccess ["Context-Dependent (20-40%)"]
I["Read-Only Reentrancy"]:::highlight
J["Multi-Protocol Attack Chains"]:::highlight
K["Governance Flash Loan Attacks"]:::highlight
end
Classic reentrancy has the highest success rate because the attack structure is well-defined: deploy attacker, deposit, withdraw, observe balance. The symbolic engine quickly finds the profitable recursive depth given the contract’s current balance.
Multi-protocol attack chains have lower success rates because they require modeling interactions between multiple deployed contracts — the exploit depends on the specific state of external protocols (liquidity pool balances, oracle readings) that change block-by-block.
A Worked Example: Access Control on Initialization
Consider an upgradeable contract with an unprotected initialize() function:
contract VaultImpl {
address public owner;
bool public initialized;
function initialize(address _owner) external {
require(!initialized, "Already initialized");
owner = _owner;
initialized = true;
}
function drain(address payable recipient) external {
require(msg.sender == owner, "Not owner");
recipient.transfer(address(this).balance);
}
}
The static finding: initialize() has no access control. For a proxy-based deployment, the implementation contract may have initialized = false because initialization was only called through the proxy.
The exploit generator synthesizes:
Constraint: initialize(ATTACKER) succeeds without revert
Proof of concept:
1. Call initialize(attacker_address) on the implementation
2. Call drain(attacker_address)
3. Implementation balance transfers to attacker
The simulation confirms: on a fresh deployment where the implementation has not been directly initialized, this drains the contract. Confidence upgrades from 0.91 (static) to 1.0 (exploitable) with a concrete transaction record from the simulation.
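To make the PoC steps concrete, here is a small Python model of VaultImpl replaying them against a fresh, never-initialized implementation; it mirrors the Solidity above rather than any real simulator API:

```python
class VaultImplModel:
    """Minimal Python model of VaultImpl for replaying the PoC steps."""
    def __init__(self, balance_wei: int):
        self.owner = None
        self.initialized = False
        self.balance = balance_wei

    def initialize(self, sender: str, _owner: str) -> None:
        assert not self.initialized, "Already initialized"  # require(!initialized)
        self.owner = _owner
        self.initialized = True

    def drain(self, sender: str, balances: dict, recipient: str) -> None:
        assert sender == self.owner, "Not owner"  # require(msg.sender == owner)
        balances[recipient] = balances.get(recipient, 0) + self.balance
        self.balance = 0

# Replay against a fresh implementation (balance is illustrative).
ATTACKER = "0xattacker"
impl = VaultImplModel(balance_wei=5 * 10**18)
balances: dict = {}
impl.initialize(ATTACKER, ATTACKER)       # step 1: unprotected init
impl.drain(ATTACKER, balances, ATTACKER)  # step 2: owner check now passes
```

After the replay the attacker holds the full implementation balance, which is exactly the positive-profit signal the validation stage looks for.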
What Exploit Generation Cannot Do
Automated exploit generation is not a complete exploitability oracle. A failed generation attempt does not mean the contract is safe — it means the automated pipeline could not construct a working exploit within its search budget and model assumptions.
The pipeline cannot:
- Model arbitrary off-chain conditions (API availability, user behavior)
- Reason about MEV-dependent exploits that require specific transaction ordering provided by a validator
- Exhaustively explore protocols with dozens of interacting functions and complex state invariants
- Account for future state changes that make currently safe contracts vulnerable after a governance action
A finding with high static confidence and a failed exploit generation attempt should still be treated seriously. The finding informs remediation; the exploit generation provides an extra confirmation signal when it succeeds.
Integration with the Analysis Workflow
Exploit generation runs automatically as part of the full analysis pipeline. For each critical and high-severity finding, the system attempts generation within a configurable time budget (typically 30 seconds per finding). Findings that exceed the budget retain their static confidence score; findings with a confirmed exploit receive the exploitable label.
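One way to sketch the per-finding budget is a thread-pool timeout around a generator callable; this is illustrative, not the pipeline's actual scheduler:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def attempt_generation(finding: dict, generator, budget_seconds: float = 30.0) -> dict:
    """Run exploit generation under a wall-clock budget. On timeout or
    failure the finding keeps its static confidence score; on success
    it is labeled exploitable."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(generator, finding)
        try:
            exploit = future.result(timeout=budget_seconds)
        except FutureTimeout:
            return {**finding, "exploitable": False}
    if exploit is not None:
        return {**finding, "exploitable": True, "exploit": exploit,
                "confidence": 1.0}
    return {**finding, "exploitable": False}
```

Note that a thread-based timeout stops waiting for the result but cannot forcibly kill the worker; a production scheduler would run generation in a separate process it can terminate.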
The results appear in the findings output:
{
"detector_id": "reentrancy",
"severity": "CRITICAL",
"confidence": 0.97,
"exploitable": true,
"exploit": {
"type": "reentrancy_drain",
"initial_deposit": "1000000000000000000",
"recursive_depth": 4,
"profit_wei": "3000000000000000000",
"simulation_block": 19847231,
"attacker_contract_bytecode": "0x..."
}
}
The attacker contract bytecode field contains the compiled exploit contract — the actual artifact that would drain the target. This is provided for security teams to test in controlled environments.
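Consumers of this output should note that wei amounts are serialized as strings, avoiding precision loss from JSON numbers. A minimal parsing sketch (the bytecode field is omitted here for brevity):

```python
import json

# The findings record from above, minus attacker_contract_bytecode.
FINDING_JSON = """
{
  "detector_id": "reentrancy",
  "severity": "CRITICAL",
  "confidence": 0.97,
  "exploitable": true,
  "exploit": {
    "type": "reentrancy_drain",
    "initial_deposit": "1000000000000000000",
    "recursive_depth": 4,
    "profit_wei": "3000000000000000000",
    "simulation_block": 19847231
  }
}
"""

finding = json.loads(FINDING_JSON)
# Wei amounts arrive as strings; convert before arithmetic.
profit_eth = int(finding["exploit"]["profit_wei"]) / 10**18
```

A tool ingesting these records should gate any automated response on both the exploitable flag and the presence of the exploit object, since a finding can be critical without a confirmed exploit.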
Why Bytecode Is the Required Foundation
Automated exploit generation only works correctly when the analysis operates on bytecode rather than source code.
Source-code tools flag patterns in an abstraction of the program. Actual exploitability depends on what the compiler produced: the jump targets, the gas-forwarding amounts, the exact storage slot layout that determines which balances are affected.
Two real-world cases demonstrate the difference:
The Curve Finance 2023 exploit involved a Vyper reentrancy guard that was present in source code and absent in bytecode due to a compiler bug. A source-level tool would have seen the guard and reported the contract safe. Bytecode analysis sees no guard in the compiled output and correctly identifies the reentrancy window. Exploit generation on the bytecode would have produced a working attack before deployment.
Proxy storage collisions are another class where source analysis fails. When a proxy’s storage layout overlaps with its implementation’s variables, the vulnerability is visible only when comparing concrete storage slot computations from both contracts’ bytecode. Source analysis sees two separate, clean contracts; bytecode analysis sees that slot 0 of one is slot 0 of the other.
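The collision check itself is mechanical once slot layouts are recovered from bytecode. The sketch below assigns sequential slots and ignores solc's value-type packing, pairing a hypothetical naive proxy layout with VaultImpl's variables from the worked example:

```python
def layout_slots(variables: list[str]) -> dict[str, int]:
    """Assign one sequential storage slot per variable. Simplified:
    real solc packs small adjacent value types into a shared slot."""
    return {name: slot for slot, name in enumerate(variables)}

# Hypothetical proxy that keeps its implementation pointer in slot 0,
# alongside VaultImpl's own layout.
proxy_layout = layout_slots(["implementation_address"])
impl_layout = layout_slots(["owner", "initialized"])

# A collision: the same slot means different things to each contract.
collisions = {
    slot: (proxy_var, impl_var)
    for proxy_var, slot in proxy_layout.items()
    for impl_var, impl_slot in impl_layout.items()
    if impl_slot == slot
}
```

Here slot 0 is simultaneously the proxy's implementation pointer and the implementation's owner, so an attacker who becomes owner can also redirect the proxy; well-designed proxies avoid this by reserving dedicated high slots for their own state.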
Summary
Automated exploit generation changes the signal-to-noise ratio of security analysis. A confirmed exploit is a stronger signal than a static finding alone — it collapses the distance between “a tool thinks this looks vulnerable” and “this contract can be drained.”
The pipeline — constraint synthesis, symbolic execution, simulation validation — provides that confirmation for vulnerability classes where automated construction is tractable. For more complex multi-protocol attacks, it provides a partial search that still informs the confidence score.
The foundation is bytecode. Only by analyzing what actually executes can the system construct transactions that prove exploitability against the real deployed contract, rather than against a source code model that may diverge from ground truth.