
Arxiv


Biased Dueling Bandits with Stochastic Delayed Feedback

  • The dueling bandit problem, in which an agent learns from pairwise comparisons rather than numeric rewards, is drawing growing interest because of its applications in online advertising, recommendation systems, and more.
  • Delayed feedback is a challenge for existing dueling bandit methods, because it limits the agent's ability to update its policy quickly and accurately.
  • The paper introduces a new problem, the biased dueling bandit problem with stochastic delayed feedback, in which there is a preference bias between the selections.
  • It presents two algorithms for handling delayed feedback: one requires complete knowledge of the delay distribution, while the other needs only the expected delay.
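
To make the setting described in these bullets concrete, here is a minimal Python sketch of a toy environment in which each duel's outcome reaches the agent only after a random number of rounds. It is not the paper's model or either of its algorithms: the class name `DelayedDuelEnv`, the additive `bias` term favoring the first-listed arm, the exponential delay, and the preference values are all assumptions chosen for illustration.

```python
import heapq
import random


class DelayedDuelEnv:
    """Toy dueling-bandit environment with randomly delayed feedback (illustrative only)."""

    def __init__(self, pref, bias=0.0, mean_delay=3.0, seed=0):
        self.pref = pref              # pref[i][j]: assumed probability that arm i beats arm j
        self.bias = bias              # assumed additive bias toward the first-listed arm
        self.mean_delay = mean_delay  # mean of the exponential delay, before flooring
        self.rng = random.Random(seed)
        self.pending = []             # min-heap of (arrival_round, i, j, i_won)
        self.t = 0                    # current round

    def duel(self, i, j):
        """Play arms i and j, then return every outcome that arrives this round."""
        self.t += 1
        # Biased win probability, clipped to [0, 1].
        p = min(1.0, max(0.0, self.pref[i][j] + self.bias))
        i_won = self.rng.random() < p
        # Exponential delay, floored to a whole number of rounds.
        delay = int(self.rng.expovariate(1.0 / self.mean_delay))
        heapq.heappush(self.pending, (self.t + delay, i, j, i_won))
        arrived = []
        while self.pending and self.pending[0][0] <= self.t:
            arrived.append(heapq.heappop(self.pending)[1:])
        return arrived


# Usage: always duel arm 0 against arm 1 and count only the outcomes seen so far.
pref = [[0.5, 0.7], [0.3, 0.5]]
env = DelayedDuelEnv(pref, bias=0.1, mean_delay=3.0, seed=1)
wins = [[0, 0], [0, 0]]
observed = 0
for _ in range(200):
    for i, j, i_won in env.duel(0, 1):
        if i_won:
            wins[i][j] += 1
        else:
            wins[j][i] += 1
        observed += 1
```

At any round the agent has seen only `observed` outcomes, while the rest still sit in `env.pending`. A learner in this setting has to update its policy from that incomplete record, and its win counts are also skewed by the bias term.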
