<ul data-eligibleForWebStory="true">
<li>The research addresses heavy-tailed noise in stochastic linear bandits.</li>
<li>Existing strategies such as truncation and median-of-means are limited in applicability because they rely on specific noise assumptions or bandit structures.</li>
<li>A recent work introduced a soft-truncation method based on adaptive Huber regression, but it faces computational challenges: the regression is re-solved over all past data at every round.</li>
<li>A new "one-pass" algorithm based on online mirror descent reduces the per-round computational cost significantly while retaining near-optimal regret (see the sketch after this list).</li>
<li>The method updates the estimator at each round using only the current round's data, improving efficiency.</li>
<li>The per-round computational cost drops from O(t log T) to O(1).</li>
<li>The algorithm achieves regret of order d * T^((1-ε)/(2(1+ε))) * sqrt(Σ_{t=1}^T ν_t^2), where d is the dimension and ν_t is the (1+ε)-th moment of the reward at round t; for ε = 1 (finite variance), this reduces to d * sqrt(Σ_{t=1}^T ν_t^2).</li>
</ul>
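To make the one-pass idea concrete, here is a minimal sketch, not the paper's exact algorithm: each round takes a single mirror-descent step on the Huber loss of the current observation only, with a quadratic mirror map built from the accumulated design matrix. The class name `OnePassHuberEstimator`, the step form, and the growing threshold schedule `tau = 2.0 * t**0.25` are illustrative assumptions, not taken from the source.

```python
import numpy as np


def huber_grad(residual, tau):
    """Derivative of the Huber loss w.r.t. the residual: identity on
    [-tau, tau], clipped (soft truncation) outside."""
    return np.clip(residual, -tau, tau)


class OnePassHuberEstimator:
    """One mirror-descent step per round on the Huber loss of the
    current observation only, so the cost per round depends on the
    dimension d but not on the round index t."""

    def __init__(self, dim, reg=1.0):
        self.theta = np.zeros(dim)
        self.H = reg * np.eye(dim)  # quadratic mirror-map matrix

    def update(self, x, reward, tau):
        # Gradient of the Huber loss at the current point, computed
        # from the newly observed (x, reward) pair alone -- the
        # "one-pass" step; past data is never revisited.
        g = -huber_grad(reward - x @ self.theta, tau) * x
        self.H += np.outer(x, x)
        # Mirror-descent step in the norm induced by H.
        self.theta -= np.linalg.solve(self.H, g)


# Toy usage: d = 3, heavy-tailed (Student-t) reward noise with
# infinite variance; theta_star is an arbitrary illustrative target.
rng = np.random.default_rng(0)
theta_star = np.array([0.5, -0.2, 0.3])
est = OnePassHuberEstimator(dim=3)
for t in range(1, 2001):
    x = rng.normal(size=3)
    reward = x @ theta_star + rng.standard_t(df=2)
    est.update(x, reward, tau=2.0 * t**0.25)  # assumed growing threshold
print(np.round(est.theta, 2))
```

The growing threshold lets early rounds truncate aggressively while later rounds, with more accumulated curvature in H, tolerate larger residuals; the per-round cost is dominated by the d-by-d solve, which is constant in t, matching the O(t log T) to O(1) improvement described above.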