This research paper focuses on the Open-set Video-based Facial Expression Recognition (OV-FER) task.
The goal is to recognize known facial expressions while also identifying unknown ones that were not encountered during training.
The proposed approach, Human Expression-Sensitive Prompting (HESP), enhances CLIP's ability to model facial expression details in video.
Experimental results show that HESP significantly improves CLIP's performance and outperforms other state-of-the-art methods.
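
To make the open-set setting concrete, the sketch below shows a generic baseline decision rule, maximum-softmax thresholding, and not the HESP method itself. It assumes a model has already produced one similarity score per known expression class for a video; the label names, scores, and the 0.5 threshold are hypothetical values chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits, known_labels, threshold=0.5):
    """Return (label, confidence) for one video.

    Generic max-softmax baseline, NOT the paper's method: if the most
    likely known class falls below `threshold`, the sample is rejected
    as "unknown" instead of being forced into a known class.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return known_labels[best], probs[best]


# Hypothetical known classes and per-class scores for two videos.
labels = ["happy", "sad", "angry"]
confident_label, _ = open_set_predict([4.0, 0.5, 0.2], labels)
ambiguous_label, _ = open_set_predict([1.0, 1.1, 0.9], labels)
```

In this sketch the first video has one clearly dominant score and is assigned to a known class, while the second has nearly uniform scores and is rejected as unknown. In practice the threshold would be tuned on held-out data.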