Continual learning (CL) is the problem of training neural networks on sequential tasks without catastrophic forgetting.
Traditional CL approaches rely on gradient-based optimization using stochastic gradient descent (SGD) or its variants.
Gradient-based CL breaks down when data from previous tasks is unavailable: updates driven only by the current task change parameters in an uncontrolled way, causing significant forgetting of previously learned tasks.
This work explores gradient-free optimization methods as a robust alternative for mitigating forgetting in CL.
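To make the contrast with SGD concrete, the following is a minimal sketch of one common gradient-free optimizer, a (1+1) evolution strategy, fitting a toy linear model. This is an illustrative example only, not the method proposed in this work: the toy task, the fixed perturbation scale `sigma`, and the accept-if-not-worse rule are all assumptions made for the sketch. The key point is that parameters are updated purely by evaluating the loss, with no gradients computed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task (assumed for illustration): fit y = 2x + 1.
X = rng.normal(size=(64, 1))
y = 2.0 * X[:, 0] + 1.0

def loss(params):
    """Mean squared error of a linear model w*x + b."""
    w, b = params
    pred = w * X[:, 0] + b
    return float(np.mean((pred - y) ** 2))

def one_plus_one_es(params, steps=2000, sigma=0.1):
    """(1+1) evolution strategy: propose a Gaussian perturbation of the
    current parameters and keep it only if the loss does not increase.
    No gradient information is used at any point."""
    best = np.asarray(params, dtype=float)
    best_loss = loss(best)
    for _ in range(steps):
        candidate = best + sigma * rng.normal(size=best.shape)
        candidate_loss = loss(candidate)
        if candidate_loss <= best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

params, final_loss = one_plus_one_es(np.zeros(2))
print(params, final_loss)
```

Because updates are accepted or rejected based on the loss value alone, such methods give direct control over how much the parameters move, which is the property that makes them interesting for mitigating forgetting.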