CDD-IIE-Bench 📊

Latest Updates 🚀

2026.03.31 Released the CDD-IIE-Bench evaluation set and evaluation standards
2026.03.31 Released manual evaluation results for the latest open-source models
Manual evaluation of the latest closed-source models is in progress...

Dataset Introduction 📚

We are excited to share our latest research: CDD-IIE-Bench v1.0! This is an open-source evaluation suite for Instruction-based Image Editing (IIE) tasks, dedicated to providing Comprehensive, in-Depth, and Diagnostic (CDD) evaluation standards. The dataset is organized into 2 major categories, 5 intermediate categories, 21 sub-categories, and 33 fine-grained categories, totaling 1,341 test cases.

Evaluation Standards ⭐

To ensure evaluation accuracy, we use manual evaluation as the gold-standard metric. Twelve vision experts rated the images generated by different models for the same instruction, with the images presented in random order. Each of the 21 evaluation tasks has 3 task-specific evaluation dimensions, and each dimension is scored on a 5-point scale (1 = Poor, 5 = Excellent) with clear criteria for every level. For instance, the "Object Addition" task includes three dimensions: "Instruction Adherence," "Visual Naturalness," and "Physical and Detail Consistency." The 5-point standard for "Instruction Adherence" is as follows:

Score 5: All specified attributes are correct and scene logic is coherent; only minor microscopic imperfections.
Score 4: Main attributes correct; only slight deviations in details or 1-2 small features missing.
Score 3: Correct category but key attributes (position, color, size, quantity, etc.) are incorrect.
Score 2: Added object category is wrong or unrelated to the instruction.
Score 1: No content added, or added content is damaged/invalid.

For more specific standards, please refer to the Detailed Evaluation Standards.
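
As a minimal sketch of how ratings collected under this rubric might be summarized, the snippet below averages 1-5 scores per model and dimension across raters and test cases. The record layout `(model, dimension, score)` and the function name `aggregate_ratings` are assumptions made for illustration; they are not the benchmark's actual file format or tooling, and the benchmark may aggregate scores differently.

```python
from collections import defaultdict
from statistics import mean

# Dimension names taken from the "Object Addition" example above.
DIMENSIONS = (
    "Instruction Adherence",
    "Visual Naturalness",
    "Physical and Detail Consistency",
)


def aggregate_ratings(ratings):
    """Average 1-5 ratings per (model, dimension) pair.

    `ratings` is an iterable of (model, dimension, score) tuples.
    This layout is hypothetical, chosen only for illustration.
    """
    buckets = defaultdict(list)
    for model, dimension, score in ratings:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension!r}")
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
        buckets[(model, dimension)].append(score)
    # One mean per (model, dimension), pooled over raters and test cases.
    return {key: mean(scores) for key, scores in buckets.items()}


if __name__ == "__main__":
    sample = [
        ("model_a", "Instruction Adherence", 5),
        ("model_a", "Instruction Adherence", 4),
        ("model_a", "Visual Naturalness", 3),
    ]
    for (model, dimension), avg in aggregate_ratings(sample).items():
        print(model, dimension, avg)
```

Keeping the three dimensions separate, rather than collapsing them into one number, preserves the diagnostic intent of the rubric: a model can follow the instruction well and still score poorly on naturalness.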

Evaluation Objects 🎯

We manually evaluated the latest open-source instruction-based editing models, and the results are shown below:

References 📖

@article{zang2026instruction,
  title={Instruction-based image editing: a survey on data, models, evaluation, and applications},
  author={Zang, Xianghao and Jiang, Zijian and Cheng, Jiarong and others},
  journal={Vicinagearth},
  volume={3},
  number={1},
  pages={3},
  year={2026},
  publisher={Springer} 
}
