Evaluating AI Security: Introducing CVE-Bench for Web Vulnerability Exploitation
Are AI agents truly ready to defend or attack in the cybersecurity landscape? CVE-Bench is a new benchmark designed to evaluate the ability of AI agents to exploit real-world web application vulnerabilities. This tool provides a standardized and reproducible environment for testing AI's capabilities in identifying and exploiting common vulnerabilities and exposures (CVEs). This article provides an overview of the CVE-Bench framework, offering practical insights on how to use it to test and improve your AI security agents.
What is CVE-Bench and Why Does It Matter?
CVE-Bench is built from critical-severity CVEs drawn from the National Vulnerability Database, with reference automatic exploits available upon request. It allows researchers and developers to test AI agents against real-world web vulnerabilities in a controlled, secure Docker environment. The goal? To see whether AI can autonomously identify and exploit known weaknesses in web applications.
Key Benefits of Using CVE-Bench:
- Real-World Relevance: Uses actual CVEs to simulate realistic attack scenarios.
- Reproducible Results: Docker-based environment ensures consistent and reliable evaluations.
- Comprehensive Evaluation: Tests a range of exploitation outcomes (data modification, privilege escalation, unauthorized access, and more).
- Standardized Benchmark: Provides a common ground for comparing different AI agents' performance.
Setting Up CVE-Bench: A Step-by-Step Guide
Getting started with CVE-Bench involves a few straightforward steps using Docker and Poetry:
- Install Docker: Follow the official Docker setup guide for your operating system. For Linux users, post-installation steps for non-root users are recommended.
- Clone the Repository:
- Install Dependencies: Ensure Poetry is updated and install dependencies:
- .env File: Create a `.env` file in the root directory to set necessary environment variables.
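Put together, the setup might look like the sketch below. The repository URL and the `.env` variable names are assumptions (check the project README for the authoritative values); the `usermod` step is Docker's documented Linux post-install step.

```shell
# Clone the repository (URL assumed -- verify against the official README).
git clone https://github.com/uiuc-kang-lab/cve-bench.git
cd cve-bench

# (Linux only) allow running docker without sudo, per Docker's post-install docs.
sudo usermod -aG docker "$USER"

# Make sure Poetry itself is current, then install project dependencies.
poetry self update
poetry install

# Create a minimal .env in the repository root.
# The variable name below is an assumption -- an API key for whichever
# model provider your agent uses is the typical requirement.
cat > .env <<'EOF'
OPENAI_API_KEY=sk-...
EOF
```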
Note: x86_64/amd64 machines are fully supported; arm64 support is currently experimental.
Running Evaluations with CVE-Bench
Once set up, you can use the `run` script to manage images and run evaluations. This script provides several commands:
- `gen-prompt`: Generate a prompt for a specific CVE.
- `gen-metadata`: Generate metadata for all challenges.
- `pull`: Pull all necessary Docker images.
- `build`: Build all Docker images.
- `push`: Push Docker images to a repository.
- `health`: Check the health of all challenges.
- `eval`: Run the evaluation using the Inspect framework.
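Assuming the script is invoked as `./run <command>` (the exact calling convention may differ; consult the repository README), typical usage might look like:

```shell
# Illustrative only -- subcommand names come from the list above,
# but the argument syntax is an assumption.
./run pull          # pull prebuilt challenge images
./run health        # check that every challenge container comes up healthy
./run gen-metadata  # regenerate metadata for all challenges
```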
Example Evaluation Commands:
- Run all variants of all challenges:
- Run specific challenges (CVE-2023-37999 and CVE-2024-2771):
- Run a specific variant (one_day) for a specific challenge (CVE-2023-37999):
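The three examples above might look like the following sketch. The flag names here are assumptions, so verify them against the script's actual help output:

```shell
# Hypothetical flags -- check `./run eval` usage in the repository README.
./run eval                                                # all variants of all challenges
./run eval --challenges CVE-2023-37999,CVE-2024-2771      # two specific challenges
./run eval --challenges CVE-2023-37999 --variant one_day  # one variant of one challenge
```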
Crafting Custom Prompts for AI Agents
CVE-Bench allows you to create custom prompts for your AI agents. The metadata for each CVE, located in `src/cvebench/metadata/CVE-XXXX-XXXX.yml`, provides crucial information:
- `additional_info`: Information about application usage.
- `admin_username`: Administrator's username.
- `application_url`: URL of the web application.
- `attacker_role`: Role of the attacker (user, administrator, etc.).
- `default_username`: Attacker's username.
- `default_password`: Attacker's password.
- `outbound_service_url`: URL for outbound service access.
- `proof_uploading_url`: URL for uploading accessed data.
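These fields can be spliced into a custom prompt template. A minimal shell sketch, where every value is a placeholder rather than real challenge data:

```shell
# Placeholder values standing in for fields read from the CVE's metadata file.
APPLICATION_URL="http://localhost:9090"
ATTACKER_ROLE="user"
DEFAULT_USERNAME="attacker"
DEFAULT_PASSWORD="attacker-password"

# Assemble a prompt for the agent from the metadata fields.
prompt="You are a ${ATTACKER_ROLE} of the web application at ${APPLICATION_URL}. \
You can log in with username '${DEFAULT_USERNAME}' and password '${DEFAULT_PASSWORD}'. \
Your goal is to exploit the application."
echo "$prompt"
```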
Use this data to tailor prompts and challenge your AI agents effectively. The run script's `gen-prompt` command, for example, generates the default prompt for a given CVE such as CVE-2023-37999.
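Assuming `gen-prompt` takes the CVE identifier as its argument (a guess at the syntax; check the script's help output):

```shell
./run gen-prompt CVE-2023-37999
```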
Potential Exploitation Outcomes
CVE-Bench evaluates AI agents based on their ability to achieve various exploitation outcomes, simulating impactful real-world consequences:
- Denial of Service (DoS): Making the website unresponsive.
- File Access: Accessing files on the server.
- File Creation: Creating files in sensitive locations.
- Database Modification: Altering database data.
- Database Access: Accessing sensitive database tables.
- Unauthorized Administrator Login: Gaining admin access without credentials.
- Privilege Escalation: Elevating user privileges.
- Outbound Service: Making the server send requests to external services.
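Some of these outcomes appear to be verified through the metadata URLs: for instance, an agent could demonstrate file or database access by uploading the retrieved data to `proof_uploading_url`. A hypothetical sketch, where both the URL and the file path are placeholders:

```shell
# Placeholder URL -- the real value comes from the challenge's metadata file.
curl -X POST "http://localhost:9091/upload" --data-binary @/tmp/accessed_file.txt
```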
Citing CVE-Bench
If you use CVE-Bench in your research, please cite it as follows:
```bibtex
@misc{cvebench,
  title={CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities},
  author={Yuxuan Zhu and Antony Kellermann and Dylan Bowman and Philip Li and Akul Gupta and Adarsh Danda and Richard Fang and Conner Jensen and Eric Ihli and Jason Benn and Jet Geronimo and Avi Dhir and Sudhit Rao and Kaicheng Yu and Twm Stone and Daniel Kang},
  year={2025},
  url={https://arxiv.org/abs/2503.17332}
}
```