Source: Towards Data Science

Adding Training Noise To Improve Detections In Transformers

  • Modern vision transformers use noise addition during training to improve object detection performance.
  • Early vision transformers such as DETR used learned decoder queries for object detection but converged slowly.
  • More recent transformer architectures add deformable aggregation and spatial anchors to improve detection results.
  • The Hungarian algorithm matches predictions to ground-truth boxes in these detectors, but the one-to-one assignment can flip between training steps, which makes the training objective unstable (a minimal matching sketch follows this list).
  • DN-DETR addresses the unstable matching by adding noise to ground-truth boxes and feeding them as extra decoder queries whose targets are already known, improving stability and convergence speed (see the noising sketch after this list).
  • DINO strengthens denoising with a contrastive scheme that adds heavily noised negative queries, improving detection performance even further.
  • Temporal models such as Sparse4Dv3 use denoising, including temporal denoising groups, for object tracking across frames.
  • Denoising in vision transformers accelerates convergence and boosts detection results, especially when anchors are learnable.
  • The use of denoising raises questions about whether learnable anchors are still necessary, and how denoising affects models with non-learnable anchors.
  • While denoising stabilizes gradient descent, its relevance to models with spatially constrained queries remains open for further exploration.
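
As a rough illustration of the matching step mentioned above, the sketch below pairs predictions with ground-truth boxes using scipy's Hungarian solver on an L1 box cost plus a class-probability cost. The function name and cost weights are illustrative assumptions, not the exact DETR recipe (which also includes a generalized IoU term).

```python
# Sketch of prediction-to-ground-truth matching with the Hungarian algorithm.
# Cost terms and weights are assumptions; DETR's actual cost differs.
import torch
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_boxes, pred_logits, gt_boxes, gt_labels,
                    box_weight=5.0, cls_weight=1.0):
    """pred_boxes: (Q, 4), pred_logits: (Q, C), gt_boxes: (G, 4), gt_labels: (G,)."""
    prob = pred_logits.softmax(-1)                      # (Q, C) class probabilities
    cls_cost = -prob[:, gt_labels]                      # (Q, G): higher prob => lower cost
    box_cost = torch.cdist(pred_boxes, gt_boxes, p=1)   # (Q, G): L1 distance between boxes
    cost = cls_weight * cls_cost + box_weight * box_cost
    # The optimal one-to-one assignment minimizes total cost; small changes in
    # the cost matrix between training steps can flip which query is matched to
    # which box, which is the instability that denoising queries work around.
    q_idx, g_idx = linear_sum_assignment(cost.detach().cpu().numpy())
    return q_idx, g_idx
```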

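The next sketch shows the denoising idea in the DN-DETR and DINO bullets: ground-truth boxes are jittered and fed to the decoder as extra queries with known targets, so Hungarian matching is skipped for them, and a more heavily noised negative group is trained to predict "no object". The helper name, noise scales, and flip probability are illustrative assumptions rather than the papers' exact hyperparameters.

```python
# Sketch of DN-DETR-style query denoising with a DINO-style negative group.
# Noise scales and label-flip probability are illustrative assumptions.
import torch

def make_denoising_queries(gt_boxes, gt_labels, num_classes,
                           box_noise_scale=0.4, label_flip_prob=0.25):
    """gt_boxes: (G, 4) as normalized (cx, cy, w, h); gt_labels: (G,) class indices."""
    G = gt_boxes.shape[0]
    wh = gt_boxes[:, 2:].repeat(1, 2)  # scale noise by each box's width/height

    # Positive group: small jitter around each ground-truth box; the decoder
    # must reconstruct the original box and label from the noised query.
    jitter = (torch.rand_like(gt_boxes) * 2 - 1) * box_noise_scale
    pos_boxes = (gt_boxes + jitter * wh).clamp(0, 1)

    # Randomly flip some labels so the model also learns to correct class noise.
    flip = torch.rand(G) < label_flip_prob
    pos_labels = torch.where(flip, torch.randint(0, num_classes, (G,)), gt_labels)

    # Negative group (DINO-style contrastive denoising): larger jitter, and the
    # target class is "no object" (index num_classes used as background here).
    big_jitter = (torch.rand_like(gt_boxes) * 2 - 1) * (2 * box_noise_scale)
    neg_boxes = (gt_boxes + big_jitter * wh).clamp(0, 1)
    neg_labels = torch.full((G,), num_classes)

    noised_boxes = torch.cat([pos_boxes, neg_boxes], dim=0)
    noised_labels = torch.cat([pos_labels, neg_labels], dim=0)
    return noised_boxes, noised_labels
```

Because each noised query already knows which ground-truth box it came from, its loss needs no bipartite matching, which is why this auxiliary task stabilizes and speeds up training.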