This study examines the use of multimodal machine learning to detect deception in dyadic interactions, integrating data from both the deceiver and the deceived.
It compares early and late fusion approaches on audio and video data, with the visual modality represented by facial Action Units and gaze information.
Results show that combining speech and facial data enhances deception detection, with the best performance (71% accuracy) achieved through late fusion across both modalities and participants.
Conducted on a Swedish cohort, the study suggests that including data from both participants improves detection accuracy and lays the groundwork for future research on dyadic interactions, particularly in psychotherapy settings.
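For readers unfamiliar with the two fusion strategies, the minimal sketch below illustrates the distinction: early fusion concatenates per-modality features before training a single classifier, while late fusion trains one classifier per modality and combines their scores. The synthetic data, feature dimensions, logistic-regression classifiers, and probability-averaging rule are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch of early vs. late fusion for two modalities (assumed setup, not the
# paper's exact features or models).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
audio = rng.normal(size=(n, 20))   # hypothetical speech features
video = rng.normal(size=(n, 35))   # hypothetical Action Unit + gaze features
y = rng.integers(0, 2, size=n)     # 1 = deceptive segment, 0 = truthful

idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.25, random_state=0)

# Early fusion: concatenate modalities, then train a single classifier.
X = np.hstack([audio, video])
early = LogisticRegression(max_iter=1000).fit(X[idx_tr], y[idx_tr])
early_pred = early.predict(X[idx_te])

# Late fusion: one classifier per modality; average predicted probabilities.
clf_a = LogisticRegression(max_iter=1000).fit(audio[idx_tr], y[idx_tr])
clf_v = LogisticRegression(max_iter=1000).fit(video[idx_tr], y[idx_tr])
p = (clf_a.predict_proba(audio[idx_te])[:, 1]
     + clf_v.predict_proba(video[idx_te])[:, 1]) / 2
late_pred = (p >= 0.5).astype(int)
```

The same late-fusion scheme extends naturally to the dyadic setting: a further per-participant score (e.g., from the interlocutor's features) can be averaged into the final decision in the same way.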