Conditional diffusion models (CDMs) have shown impressive performance on generative tasks by modeling the full data distribution. That same strength, however, can entangle class-defining features with irrelevant context, making it difficult to extract robust, interpretable representations.
To address this issue, Canonical Latent Representations (CLAReps) have been introduced: latent codes in CDMs that preserve essential categorical information while discarding non-discriminative signals.
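The summary does not spell out how a CLARep is computed. As a minimal sketch of the intuition, assuming per-class latent codes are already available (e.g., obtained via DDIM inversion), one could average them so that structure shared across the class survives while sample-specific context cancels out; the paper's actual construction may differ.

```python
import torch

def canonical_latent(class_latents: torch.Tensor) -> torch.Tensor:
    """Illustrative CLARep stand-in: average the latent codes of one class.

    class_latents: (num_samples, latent_dim) codes from the CDM for images
    of a single class. The mean keeps the component common to the class
    (the categorical signal) and washes out per-sample context. This is a
    hypothetical simplification, not the paper's stated procedure.
    """
    return class_latents.mean(dim=0)
```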
Leveraging CLAReps, a novel diffusion-based feature-distillation paradigm called CaDistill has been developed, which transfers core class knowledge from the teacher to the student CDM through these canonical codes.
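As an illustration of such a distillation objective (not the paper's exact loss), the student can be trained to match the teacher's features on samples decoded from CLAReps; `teacher_feats` and `student_feats` below are hypothetical intermediate activations of the two models.

```python
import torch
import torch.nn.functional as F

def cadistill_loss(teacher_feats: torch.Tensor,
                   student_feats: torch.Tensor) -> torch.Tensor:
    """Cosine feature-matching loss on CLARep-decoded samples.

    Both inputs are (batch, feature_dim) activations computed on images
    generated from CLAReps, so the matching signal is dominated by core
    class features rather than background context. The cosine form is one
    common distillation choice, not necessarily the one used in CaDistill.
    """
    t = F.normalize(teacher_feats, dim=-1)
    s = F.normalize(student_feats, dim=-1)
    return (1.0 - (t * s).sum(dim=-1)).mean()
```

Normalizing before matching makes the loss insensitive to feature scale, so the student is pushed toward the teacher's feature directions rather than its magnitudes.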
CLAReps also enable the generation of representative samples for each class, providing a compact, interpretable summary of its core semantics.
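Such representative samples would come from running the CDM's reverse process on each canonical code. The sketch below assumes a deterministic class-conditional sampler `decode` (e.g., DDIM); both the name and signature are illustrative rather than the paper's API.

```python
import torch

@torch.no_grad()
def class_prototypes(decode, clareps):
    """Decode one representative image per class from its CLARep.

    decode: callable standing in for the CDM's deterministic reverse
    process (e.g., DDIM sampling conditioned on the class label).
    clareps: dict mapping class id -> canonical latent of shape (latent_dim,).
    Returns a dict mapping class id -> generated image tensor.
    """
    return {c: decode(z.unsqueeze(0), class_id=c) for c, z in clareps.items()}
```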
A student trained with CaDistill achieves strong adversarial robustness and generalization, as it learns to focus on class signals rather than spurious background cues.
The study indicates that CDMs can serve not only as image generators but also as compact, interpretable teachers for driving robust representation learning.