Source: arXiv

HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios

  • Action segmentation is a core challenge in high-level video understanding: it partitions a video into temporal segments and assigns each segment a predefined action label.
  • Existing methods mainly address single-person activities and overlook multi-person scenarios.
  • A new dataset, RHAS133, is introduced for Referring Human Action Segmentation in multi-person settings. It is built from 133 movies and annotated with 137 actions, together with textual descriptions.
  • Benchmarks of existing methods on RHAS133 show limited performance, in particular when aggregating visual cues for the target individual.
  • To improve action segmentation in multi-person scenarios, a new framework called HopaDIFF is proposed.
  • HopaDIFF leverages a holistic-partial aware Fourier-conditioned diffusion approach and a novel cross-input gate attentional xLSTM for enhanced long-range reasoning.
  • The framework introduces a Fourier condition that gives more control over the generated action segmentation and thereby improves it.
  • HopaDIFF achieves state-of-the-art results on the RHAS133 dataset across various evaluation scenarios.
  • The code for HopaDIFF is available at https://github.com/KPeng9510/HopaDIFF.git.
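
To make the task definition in the summary concrete, here is a minimal Python sketch of what an action segmentation output looks like: per-frame labels collapsed into contiguous segments. It is an illustration only and not taken from the HopaDIFF code. The function name `frames_to_segments` and the example labels are hypothetical and are not drawn from RHAS133.

```python
from itertools import groupby
from typing import List, Tuple


def frames_to_segments(frame_labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame labels into (label, start_frame, end_frame) segments.

    Frame indices are zero-based and end_frame is inclusive.
    """
    segments = []
    start = 0
    # groupby yields one run per stretch of identical consecutive labels.
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length - 1))
        start += length
    return segments


# Hypothetical per-frame predictions for the one person the text query refers to.
frame_labels = ["walk", "walk", "walk", "sit", "sit", "talk", "talk", "talk", "talk"]
segments = frames_to_segments(frame_labels)
print(segments)
# [('walk', 0, 2), ('sit', 3, 4), ('talk', 5, 8)]
```

In the referring setting described above, a sequence like this would be produced for the single person named by the textual description, rather than for the scene as a whole.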
