Benchmark Environment Setup

This post collects installation guides for commonly used benchmark environments. Each one can be followed step by step without errors.

GenEval

Environment setup

1. Create the conda environment:

`conda create -n geneval -y python=3.8.10`
`conda activate geneval`

2. Install the torch dependencies:

`pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121`

3. Install the remaining dependencies:

pip install networkx==2.8.8
pip install open-clip-torch==2.26.1
pip install clip-benchmark
pip install -U openmim
pip install einops
python -m pip install lightning
pip install diffusers transformers
pip install tomli
pip install platformdirs
pip install setuptools==60.2.0

4. Install the mmcv dependencies:

`mim install mmengine mmcv-full==1.7.2`

5. Install mmdet from source, using the 2.x branch:

git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
git checkout 2.x
pip install -v -e .

Running the evaluation

1. Clone the repository:

`git clone https://github.com/djghosh13/geneval.git`
`cd geneval`

2. Download the Mask2Former object detector:

`./evaluation/download_models.sh ./object_detector`

3. Download CLIP ViT-L/14 by opening an interactive Python session in the terminal and running:

`import open_clip`
`open_clip.create_model_and_transforms("ViT-L-14", pretrained="openai", device="cpu")`

4. Generate images: read the prompts line by line from ./prompts/evaluation_metadata.jsonl, generate 4 images per prompt, and store them in the layout below. Each metadata.jsonl contains the corresponding line of the prompt file.

<IMAGE_FOLDER>/
    00000/
        metadata.jsonl
        samples/
            0000.png
            0001.png
            0002.png
            0003.png
    00001/
        ...
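
The layout above can be produced with a short helper. This is only a minimal sketch: `write_geneval_layout` and the `generate` callback are hypothetical names, and the sketch assumes each line of the prompt file is a JSON object with a `prompt` field and that your own `generate(prompt, path)` function saves one PNG to the given path.

```python
import json
from pathlib import Path
from typing import Callable


def write_geneval_layout(
    prompt_file: str,
    image_folder: str,
    generate: Callable[[str, Path], None],  # hypothetical: saves one image to the path
    n_samples: int = 4,
) -> int:
    """Create <image_folder>/<index>/{metadata.jsonl, samples/*.png} per prompt line."""
    text = Path(prompt_file).read_text(encoding="utf-8")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for index, line in enumerate(lines):
        metadata = json.loads(line)
        prompt_dir = Path(image_folder) / f"{index:05d}"
        sample_dir = prompt_dir / "samples"
        sample_dir.mkdir(parents=True, exist_ok=True)
        # metadata.jsonl holds the matching line of the prompt file.
        (prompt_dir / "metadata.jsonl").write_text(
            json.dumps(metadata) + "\n", encoding="utf-8"
        )
        for k in range(n_samples):
            # Assumption: the prompt text lives under the "prompt" key.
            generate(metadata["prompt"], sample_dir / f"{k:04d}.png")
    return len(lines)
```

Calling `write_geneval_layout("./prompts/evaluation_metadata.jsonl", "<IMAGE_FOLDER>", generate)` with your model's sampling function then fills the folder that step 5 evaluates.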

5. Run the evaluation:

`python evaluation/evaluate_images.py "<IMAGE_FOLDER>" --outfile "<RESULTS_FOLDER>/results.jsonl" --model-path ./object_detector`

6. Compute the final scores:

`python evaluation/summary_scores.py "<RESULTS_FOLDER>/results.jsonl"`

DPG-Bench

Official repository (commit 3c228f1) | Key reference: issue #65

Environment setup

1. Create the conda environment:

`conda create -n dpg python=3.13 -y`
`conda activate dpg`

2. Install the torch dependencies:

`pip install torch==2.11.0 torchvision==0.26.0 torchaudio==2.11.0 --index-url https://download.pytorch.org/whl/cu126`

3. Create requirements-for-dpg_bench-issues65.txt with the following contents:

accelerate==1.10.1
addict==2.4.0
cloudpickle==3.1.1
datasets==2.21.0
decord==0.6.0
diffusers==0.35.1
# fairseq==0.12.2
ftfy==6.0.3
librosa==0.10.1
modelscope==1.30.0
numpy==2.3.3
opencv-python==4.11.0.86
pandas==2.3.2
pillow==11.3.0
rapidfuzz==3.14.1
rouge-score<=0.0.4
safetensors==0.6.2
simplejson==3.20.1
sortedcontainers==2.4.0
soundfile==0.13.1
taming-transformers-rom1504==0.0.6
tiktoken==0.11.0
timm==1.0.19
tokenizers==0.22.1
tqdm==4.67.1
transformers==4.56.2
transformers-stream-generator==0.0.5
unicodedata2==16.0.0
zhconv==1.4.3

Then install the dependencies from this file:

`pip install -r requirements-for-dpg_bench-issues65.txt`
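
After the install finishes, it can be worth confirming that pip really resolved the pinned versions. The sketch below is one way to do that with the standard library only. `parse_pins` and `find_mismatches` are hypothetical helper names, and only `name==version` lines are checked, so commented-out entries and range specifiers such as `rouge-score<=0.0.4` are skipped.

```python
from importlib import metadata
from typing import Dict, List, Tuple


def parse_pins(text: str) -> Dict[str, str]:
    """Return {package: version} for every `name==version` line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue  # skip blanks and non-exact specifiers
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """List (package, wanted, installed) for pins the environment does not satisfy."""
    mismatches = []
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != wanted:
            mismatches.append((name, wanted, installed))
    return mismatches
```

Reading requirements-for-dpg_bench-issues65.txt, passing its text through `parse_pins`, and printing the result of `find_mismatches` inside the `dpg` environment should give an empty list if every exact pin was honored.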

4. Install the fairseq dependency:

`pip install git+https://github.com/One-sixth/fairseq.git`

Running the evaluation

1. Clone the repository:

`git clone https://github.com/TencentQQGYLab/ELLA.git`
`cd ELLA`

2. Download the mPLUG VQA model by opening an interactive Python session in the terminal and running:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
ckpt = "damo/mplug_visual-question-answering_coco_large_en"
pipeline(Tasks.visual_question_answering, model=ckpt, device="cpu")

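3. Generate images: for each prompt in the repository's ./dpg_bench/prompts directory, generate 4 images and tile them into a single 2x2 grid image. The image filename must match the filename of the prompt's txt file.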
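
The tiling step can be sketched with Pillow, which is already pinned in the requirements file. This is an illustration under assumptions, not the repository's own code: `grid_offsets` and `save_grid` are hypothetical names, the four tiles are assumed to share one size, and the row-major tile order is an assumption as well.

```python
from typing import List, Sequence, Tuple


def grid_offsets(
    tile_w: int, tile_h: int, cols: int = 2, rows: int = 2
) -> List[Tuple[int, int]]:
    """Top-left (x, y) of each tile, in row-major order."""
    return [(c * tile_w, r * tile_h) for r in range(rows) for c in range(cols)]


def save_grid(tile_paths: Sequence[str], out_path: str) -> None:
    """Paste four equally sized images into one 2x2 image and save it."""
    from PIL import Image  # imported here so grid_offsets works without Pillow

    if len(tile_paths) != 4:
        raise ValueError("expected exactly 4 images")
    tiles = [Image.open(p).convert("RGB") for p in tile_paths]
    w, h = tiles[0].size
    if any(t.size != (w, h) for t in tiles):
        raise ValueError("all tiles must share one size")
    canvas = Image.new("RGB", (2 * w, 2 * h))
    for tile, offset in zip(tiles, grid_offsets(w, h)):
        canvas.paste(tile, offset)
    canvas.save(out_path)
```

For a prompt file named `<name>.txt`, the grid would then be saved under the same `<name>` with an image extension; the exact extension the evaluation script expects is not stated here, so check it against the repository.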

4. Run the evaluation:

`PROCESSES=N bash dpg_bench/dist_eval.sh $YOUR_IMAGE_PATH $RESOLUTION`


https://xyfjason.github.io/blog-main/2026/04/07/Benchmark-Environment-Setup/

Author: xyfJASON

Published: April 7, 2026
