A new approach has been proposed to address online distribution shift in deep learning. The method is a meta-algorithm that can wrap any online learner and improve its performance under non-stationarity. It adapts automatically to changes in the data distribution and selects the most appropriate 'attention span' for learning. Experiments show consistent improvements in classification accuracy across various real-world datasets.
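
To make the idea of selecting an 'attention span' concrete, here is a minimal sketch under stated assumptions. It is not the proposed method, whose details are not given here. It assumes that 'attention span' can be read as the length of a sliding window over recent data, it uses a toy scalar predictor in place of a deep classifier, and it swaps in a generic discounted exponential-weights (Hedge-style) rule as the meta-learner. The class names, the window lengths, and the parameters `eta` and `gamma` are all hypothetical.

```python
import math
from collections import deque


class WindowedMeanLearner:
    """Hypothetical base learner: predicts the mean of the last `window` targets."""

    def __init__(self, window):
        self.buffer = deque(maxlen=window)  # old observations fall out automatically

    def predict(self):
        return sum(self.buffer) / len(self.buffer) if self.buffer else 0.0

    def update(self, y):
        self.buffer.append(y)


class AttentionSpanSelector:
    """Illustrative meta-learner (an assumption, not the source's algorithm).

    Runs one base learner per candidate window length and reweights them with
    discounted exponential weights, so that recent losses count more.
    """

    def __init__(self, windows, eta=2.0, gamma=0.9):
        self.windows = list(windows)
        self.learners = [WindowedMeanLearner(w) for w in self.windows]
        self.log_weights = [0.0] * len(self.windows)
        self.eta = eta      # learning rate of the weight update
        self.gamma = gamma  # discount applied to past evidence

    def weights(self):
        # Softmax over log-weights, shifted by the max for numerical stability.
        m = max(self.log_weights)
        exps = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self):
        # Weighted combination of the base learners' predictions.
        return sum(w * l.predict() for w, l in zip(self.weights(), self.learners))

    def update(self, y):
        for i, learner in enumerate(self.learners):
            loss = (learner.predict() - y) ** 2  # score before seeing y
            self.log_weights[i] = self.gamma * self.log_weights[i] - self.eta * loss
            learner.update(y)

    def best_window(self):
        w = self.weights()
        return self.windows[w.index(max(w))]


# Toy stream with an abrupt shift: the target jumps from 0 to 1 halfway through.
selector = AttentionSpanSelector(windows=[1, 10, 50])
for y in [0.0] * 50 + [1.0] * 50:
    selector.update(y)
print("preferred window after the shift:", selector.best_window())
```

In this toy stream, every window performs equally well before the shift, and the shortest window recovers fastest afterwards, so the weights move toward it. On a stationary but noisy stream a longer window would typically be favoured instead, which is the trade-off an adaptive choice of span is meant to resolve.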