<ul data-eligibleForWebStory="false"><li>UniF$^2$ace is a unified multimodal model (UMM) tailored for fine-grained face understanding and generation, addressing the limitations of existing research in the face domain.</li><li>The model is trained on a specialized dataset, UniF$^2$ace-130K, which contains image-text pairs and question-answering pairs covering a wide range of facial attributes.</li><li>UniF$^2$ace combines diffusion techniques with a mixture-of-experts architecture to improve both understanding and generation, and it outperforms existing UMMs and generative models.</li><li>Extensive experiments on UniF$^2$ace-130K demonstrate the model's superior performance on fine-grained facial attributes in both understanding and generation tasks.</li></ul>

UniF$^2$ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models
