Video diffusion models (VDMs) facilitate the generation of high-quality videos.
Most VDM research has focused on scaling at training time, whereas scaling at inference time has received far less attention.
Recent findings suggest that guiding the inference-time search of VDMs with reward signals can enhance video quality by identifying better noise candidates.
The proposed ScalingNoise strategy improves global content consistency and visual diversity in long video generation.
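
To make the idea of reward-guided inference-time search concrete, the following is a minimal sketch of a generic best-of-N search over noise candidates. It is not the ScalingNoise method itself: `toy_generate`, `toy_reward`, and `best_of_n_noise` are hypothetical stand-ins introduced only for illustration, where a real system would use a video diffusion sampler and a learned reward model.

```python
import random
from typing import Callable, List, Tuple

Noise = List[float]
Video = List[float]


def sample_noise(dim: int, rng: random.Random) -> Noise:
    """Draw one Gaussian noise candidate of length `dim`."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def toy_generate(noise: Noise) -> Video:
    """Hypothetical stand-in for a diffusion sampler (assumption):
    a deterministic map from noise to a 1-D 'video' of frame values."""
    return [0.5 * x for x in noise]


def toy_reward(video: Video) -> float:
    """Hypothetical reward (assumption): higher when adjacent frames
    differ less, as a crude proxy for temporal consistency."""
    return -sum((a - b) ** 2 for a, b in zip(video, video[1:]))


def best_of_n_noise(
    n: int,
    dim: int,
    generate: Callable[[Noise], Video],
    reward: Callable[[Video], float],
    seed: int = 0,
) -> Tuple[Noise, float]:
    """Sample `n` noise candidates, generate from each, and keep the
    candidate whose output receives the highest reward."""
    rng = random.Random(seed)
    best_noise: Noise = []
    best_score = float("-inf")
    for _ in range(n):
        noise = sample_noise(dim, rng)
        score = reward(generate(noise))
        if score > best_score:
            best_noise, best_score = noise, score
    return best_noise, best_score


if __name__ == "__main__":
    noise, score = best_of_n_noise(16, 8, toy_generate, toy_reward)
    print(f"best reward over 16 candidates: {score:.3f}")
```

In this sketch, increasing `n` spends more inference-time compute and can only keep or raise the best reward found for a fixed seed, which is the basic trade-off that inference-time scaling exploits.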