EVMbench is a benchmarking tool that measures how well AI agents can detect, patch, and exploit vulnerabilities in smart contracts, with the aim of improving blockchain security. The tool underscores the growing role of artificial intelligence in securing decentralized finance (DeFi) ecosystems.
EVMbench employs historical vulnerabilities and a Rust-based harness to evaluate AI performance. At the forefront is GPT-5.3-Codex, an AI model developed by OpenAI, which achieved a score of 72.2% in exploit-mode evaluations.
EVMbench's evaluation draws on 120 curated vulnerabilities from over 40 audits. These include scenarios provided by Tempo L1, which focuses on payment-oriented evaluations.
To continue reading this as well as other DeFi and Web3 news, visit us at thedefiant.io







