Automate Code from Research Papers: A Guide to Paper2Code for Machine Learning
Tired of manually translating research papers into functional code? Paper2Code automates this process, saving you time and effort. This guide provides a deep dive into Paper2Code, a multi-agent LLM system designed to transform scientific papers into code repositories.
What is Paper2Code?
Paper2Code employs a three-stage pipeline (planning, analysis, and code generation), with a specialized agent handling each stage. According to the paper, this approach outperforms strong baselines and produces faithful, high-quality implementations of research papers.
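To make the staged design concrete, here is a minimal sketch of a three-stage pipeline in which each stage can read the artifacts of the stages before it. It is an illustration only, not Paper2Code's actual code: the function names, the artifact keys, and the stub agents are all hypothetical, and a real system would call an LLM inside each stage.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stage signature: a stage reads the shared context and returns
# the artifacts it produced. This is NOT Paper2Code's real interface.
Stage = Callable[[Dict[str, str]], Dict[str, str]]


def run_pipeline(paper_text: str, stages: List[Tuple[str, Stage]]) -> Dict[str, str]:
    """Run stages in order, exposing every earlier artifact to later stages."""
    context = {"paper": paper_text}
    for name, stage in stages:
        artifacts = stage(context)
        # Prefix keys with the stage name so each artifact stays traceable.
        context.update({f"{name}/{key}": value for key, value in artifacts.items()})
    return context


# Stub agents standing in for LLM calls (assumptions for illustration).
def plan(ctx: Dict[str, str]) -> Dict[str, str]:
    return {"file_list": "model.py,train.py"}


def analyze(ctx: Dict[str, str]) -> Dict[str, str]:
    files = ctx["planning/file_list"].split(",")
    return {name: f"logic notes for {name}" for name in files}


def generate(ctx: Dict[str, str]) -> Dict[str, str]:
    files = ctx["planning/file_list"].split(",")
    return {name: f"# code for {name}, based on: {ctx['analysis/' + name]}" for name in files}


result = run_pipeline(
    "paper text",
    [("planning", plan), ("analysis", analyze), ("coding", generate)],
)
print(result["coding/model.py"])
```

The point of the sketch is the data flow: the coding stage never sees the paper alone, it also sees the plan and the per-file analysis.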
Jump Right In: Quick Start Guide
Want to see Paper2Code in action? These quick start instructions will get you up and running fast. The commands in this section run an example using the "Attention Is All You Need" paper.
Using OpenAI API
If you choose to use the OpenAI API, be aware of the estimated cost. Using the o3-mini model will likely cost between $0.50 and $0.70.
- Install the OpenAI package.
- Set your OpenAI API key.
- Run the script.
Leveraging Open Source Models with vLLM
For those preferring open-source solutions, Paper2Code supports integration with vLLM. The default model is deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct. If you encounter installation issues, consult the official vLLM repository.
- Install vLLM.
- Execute the script.
Understanding the Output Folder Structure
After running Paper2Code, the generated output will be organized as follows:
outputs
├── Transformer
│ ├── analyzing_artifacts
│ ├── coding_artifacts
│ └── planning_artifacts
└── Transformer_repo # Final output repository
This structure includes intermediate artifacts from each stage of the process, as well as the final, generated code repository.
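After a run, it can be useful to confirm that every expected directory was produced. The following is a small sketch that checks for the layout shown above; the helper name missing_outputs is hypothetical, and the directory names are taken directly from the tree.

```python
from pathlib import Path
from typing import List


def missing_outputs(outputs_dir: str, paper_name: str) -> List[str]:
    """Return the expected output directories that do not exist.

    The layout mirrors the tree shown above; the function itself is an
    illustrative helper, not part of Paper2Code.
    """
    root = Path(outputs_dir)
    expected = [
        root / paper_name / stage
        for stage in ("planning_artifacts", "analyzing_artifacts", "coding_artifacts")
    ]
    expected.append(root / f"{paper_name}_repo")  # final generated repository
    return [path.relative_to(root).as_posix() for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # An empty list means all four directories are present.
    print(missing_outputs("outputs", "Transformer"))
```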
Detailed Setup Instructions: Configuring Your Environment
Follow these steps to set up your environment for Paper2Code.
Essential Environment Setup
Make sure you install the necessary packages. For the o3-mini version, you must have the latest openai package installed. Install only the packages you need:
- OpenAI API: openai
- Open-source models: vllm
Install the packages using pip.
Converting PDFs to JSON Format
Paper2Code requires the input paper to be in JSON format. Use the s2orc-doc2json repository for this conversion. (Refer to the official repository for detailed configuration options.)
- Clone the repository.
- Run the PDF processing service.
- Convert your PDF to JSON.
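Before passing a converted paper to the pipeline, a quick sanity check on the JSON file can catch a failed conversion early. The sketch below only loads the file and reports its top-level keys and value types. It deliberately assumes nothing about the s2orc-doc2json schema, and the helper name describe_paper_json is hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def describe_paper_json(json_path: str) -> Dict[str, str]:
    """Load a converted paper and map each top-level key to its value type.

    Illustrative helper: it makes no assumption about which keys the
    converter emits, it only confirms the file parses as a JSON object.
    """
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"{json_path} does not contain a JSON object")
    return {key: type(value).__name__ for key, value in data.items()}
```

If the call raises an error or returns an empty mapping, the conversion step probably needs to be rerun before going further.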
Running Paper2Code with Your Own Papers
After setting up your environment, you can run Paper2Code on your own research papers. Remember to modify the environment variables accordingly.
Paper2Code Benchmark Datasets: Evaluating Performance
Explore the data/paper2code directory for a description of the Paper2Code benchmark dataset. Section 4.1, "Paper2Code Benchmark" in the paper, provides further details.
Model-Based Evaluation: Assessing Repository Quality
Paper2Code utilizes a model-based approach to evaluate the quality of generated repositories. This includes both reference-based and reference-free settings. Section 4.3.1 of the paper, "Paper2Code Benchmark", elaborates on the evaluation process.
The model critiques key implementation components, assigns severity levels, and generates a 1-5 correctness score (averaged over 8 samples using o3-mini-high). Modify the paths and arguments to evaluate different repositories.
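The aggregation step described above can be sketched as follows: collect one score per sample, discard samples that did not yield a usable score in the 1-5 range, and average the rest. This is a minimal illustration under assumptions. The function name summarize_scores and the use of None for an unusable sample are hypothetical, and the real evaluation script may handle invalid samples differently.

```python
from statistics import mean
from typing import Dict, List, Optional, Union

Number = Union[int, float]


def summarize_scores(samples: List[Optional[Number]]) -> Dict[str, object]:
    """Average the valid 1-5 scores and report how many samples were valid.

    A sample is treated as invalid when it is None or falls outside 1-5
    (an assumption made for this sketch).
    """
    valid = [
        s for s in samples
        if isinstance(s, (int, float)) and not isinstance(s, bool) and 1 <= s <= 5
    ]
    return {
        "score": mean(valid) if valid else None,
        "valid": f"{len(valid)}/{len(samples)}",
    }


# Eight samples, as in the evaluation setting described above.
print(summarize_scores([4, 5, 4, 5, 4, 5, 4, 5]))
```

Reporting the valid count alongside the mean matters because an average over two surviving samples is much less trustworthy than one over all eight.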
Setting Up the Evaluation Environment
Before running the evaluation scripts, install the tiktoken library and set your OpenAI API key (if applicable).
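A short preflight check can confirm both prerequisites before an evaluation run spends any tokens. The sketch below assumes the key is supplied through the OPENAI_API_KEY environment variable, which is the variable the openai Python package reads by default. Whether the evaluation scripts read it the same way is an assumption, and the helper name evaluation_env_problems is hypothetical.

```python
import importlib.util
import os
from typing import List, Mapping, Optional


def evaluation_env_problems(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of human-readable setup problems (empty when ready)."""
    env = os.environ if env is None else env
    problems = []
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("tiktoken") is None:
        problems.append("tiktoken is not installed")
    # Assumption: the key is provided via OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in evaluation_env_problems():
        print(f"Setup issue: {problem}")
```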
Reference-Free Evaluation
This evaluation method assesses the generated repository without comparing it to a gold reference.
Reference-Based Evaluation
This method compares the generated repository to an official, author-released repository.
Example Evaluation Output
The evaluation script provides a summary of the results, including a correctness score and usage statistics.
========================================
🌟 Evaluation Summary 🌟
📄 Paper name: Transformer
🧪 Evaluation type: ref_based
📁 Target repo directory: ../outputs/Transformer_repo
📊 Evaluation result:
📈 Score: 4.5000
✅ Valid: 8/8
========================================
🌟 Usage Summary 🌟
[Evaluation] Transformer - ref_based
🛠️ Model: o3-mini
📥 Input tokens: 44318 (Cost: $0.04874980)
📦 Cached input tokens: 0 (Cost: $0.00000000)
📤 Output tokens: 26310 (Cost: $0.11576400)
💵 Current total cost: $0.16451380
🪙 Accumulated total cost so far: $0.16451380
========================================
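The cost lines in the usage summary can be reproduced with simple arithmetic. Dividing each reported cost by its token count implies rates of $1.10 per million input tokens and $4.40 per million output tokens. These rates are inferred from the sample output above, not quoted from official pricing, and the function name usage_cost is hypothetical.

```python
from typing import Tuple

# Rates implied by the sample summary (0.04874980 / 44318 and
# 0.11576400 / 26310); treat them as assumptions, since pricing can change.
INPUT_RATE_PER_MILLION = 1.10
OUTPUT_RATE_PER_MILLION = 4.40


def usage_cost(input_tokens: int, output_tokens: int) -> Tuple[float, float, float]:
    """Return (input cost, output cost, total cost) in US dollars."""
    input_cost = input_tokens * INPUT_RATE_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_RATE_PER_MILLION / 1_000_000
    return input_cost, output_cost, input_cost + output_cost


input_cost, output_cost, total = usage_cost(44318, 26310)
print(f"Input: ${input_cost:.8f}  Output: ${output_cost:.8f}  Total: ${total:.8f}")
```

Note that output tokens dominate the bill here even though there are fewer of them, because the implied output rate is four times the input rate.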