[ICCV 2025]
Holistic Unlearning Benchmark:
A Multi-Faceted Evaluation for
Text-to-Image Diffusion Model Unlearning

Saemi Moon*¹, Minjong Lee*¹, Sangdon Park¹,², Dongwoo Kim¹,²

¹CSE, POSTECH; ²GSAI, POSTECH
*Equal contribution


HUB: Holistic Unlearning Benchmark

HUB systematically evaluates unlearning methods from six perspectives. It covers 33 target concepts grouped into four categories: Celebrity, Style, IP, and NSFW. For each concept, HUB provides 16,000 prompts.

No single method performs best across all perspectives

For each method and task, we compute the average performance across all concept categories as shown in the table below, and we then use the averages to rank the methods.
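The averaging-then-ranking procedure described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: the method names, task names, and scores are hypothetical placeholders, and the assumption that a higher score is better may not hold for every task.

```python
from statistics import mean

# Hypothetical placeholder scores, indexed as scores[method][task][category].
# These numbers are invented for illustration and are NOT benchmark results.
scores = {
    "method_a": {
        "task_x": {"Celebrity": 90.0, "Style": 70.0, "IP": 80.0, "NSFW": 60.0},
        "task_y": {"Celebrity": 40.0, "Style": 50.0, "IP": 60.0, "NSFW": 50.0},
    },
    "method_b": {
        "task_x": {"Celebrity": 60.0, "Style": 60.0, "IP": 60.0, "NSFW": 60.0},
        "task_y": {"Celebrity": 80.0, "Style": 70.0, "IP": 90.0, "NSFW": 80.0},
    },
}


def average_over_categories(scores):
    """For each method and task, average the score across concept categories."""
    return {
        method: {task: mean(by_cat.values()) for task, by_cat in tasks.items()}
        for method, tasks in scores.items()
    }


def rank_methods(averages, higher_is_better=True):
    """Rank methods per task (1 = best). Ties are not handled specially."""
    tasks = next(iter(averages.values())).keys()
    ranks = {}
    for task in tasks:
        ordered = sorted(
            averages,
            key=lambda method: averages[method][task],
            reverse=higher_is_better,
        )
        ranks[task] = {method: i + 1 for i, method in enumerate(ordered)}
    return ranks


averages = average_over_categories(scores)
ranks = rank_methods(averages)
print(ranks)
```

With these toy inputs, each method ranks first on one task and second on the other, which is the kind of pattern the heading of this section refers to.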

Leaderboard for unlearning methods

We report the average performance across concepts for each category. Overall represents the average performance across all categories.
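One way to read the two-level aggregation in this caption is sketched below: scores are first averaged over the concepts within each category, and Overall is then the unweighted mean of those category averages. That unweighted reading is an assumption, and the concept names and scores are hypothetical placeholders rather than benchmark results.

```python
from statistics import mean

# Hypothetical per-concept scores for a single method, grouped by category.
# Values are invented for illustration only.
per_concept = {
    "Celebrity": {"concept_1": 80.0, "concept_2": 60.0},
    "Style": {"concept_3": 90.0},
    "IP": {"concept_4": 50.0, "concept_5": 70.0, "concept_6": 60.0},
    "NSFW": {"concept_7": 40.0},
}


def category_averages(per_concept):
    """Average the scores over the concepts inside each category."""
    return {cat: mean(vals.values()) for cat, vals in per_concept.items()}


def overall_score(cat_avgs):
    """Assumed definition: unweighted mean of the category averages."""
    return mean(cat_avgs.values())


cat_avgs = category_averages(per_concept)
overall = overall_score(cat_avgs)
print(cat_avgs)
print(overall)
```

Under this reading, every category contributes equally to Overall regardless of how many concepts it contains, so the result can differ from a plain average over all concepts.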

Comparison with previous methods and benchmarks

We use the abbreviations I, S, C, N, and O to denote Intellectual Property (IP), artist style (Style), Celebrity, NSFW, and Object, respectively. ✓ indicates that the method quantitatively evaluates the corresponding task.

Citation


        @article{moon2024holistic,
          title={Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning},
          author={Moon, Saemi and Lee, Minjong and Park, Sangdon and Kim, Dongwoo},
          journal={arXiv preprint arXiv:2410.05664},
          year={2024}
        }
