Researchers studied how to optimize downstream performance in transfer learning.
They introduced a simple linear model built on a pretrained feature transform.
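As a rough illustration of what a linear model on top of a pretrained feature transform can look like, the sketch below freezes a feature matrix and fits only a linear readout on downstream data. The summary does not give the model's exact form, so everything here is an assumption made for illustration: the dimensions, the random matrix `A` standing in for the pretrained transform, the synthetic data, and the use of ridge regression for the readout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input dimension, feature dimension, downstream sample count.
d, k, n = 20, 8, 200

# Stand-in for a pretrained feature transform: a fixed matrix that is
# never updated on the downstream task (an assumption, not the paper's setup).
A = rng.standard_normal((k, d)) / np.sqrt(d)


def featurize(X):
    """Apply the frozen transform to each row of X."""
    return X @ A.T


def fit_ridge(Z, y, lam=1e-2):
    """Closed-form ridge regression of y on the features Z."""
    n_features = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(n_features), Z.T @ y)


# Synthetic downstream task whose target is linear in the features.
w_true = rng.standard_normal(k)
X_train = rng.standard_normal((n, d))
y_train = featurize(X_train) @ w_true + 0.1 * rng.standard_normal(n)

# Only the linear readout is learned downstream.
w_hat = fit_ridge(featurize(X_train), y_train)

# Monte Carlo estimate of the downstream (excess) risk on fresh inputs.
X_test = rng.standard_normal((1000, d))
y_test = featurize(X_test) @ w_true
risk = np.mean((featurize(X_test) @ w_hat - y_test) ** 2)
print(f"estimated downstream risk: {risk:.5f}")
```

In this picture, changing `A` changes which directions of the input the downstream learner can use, which is the sense in which the choice of featurization can affect downstream risk.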
The researchers derived the exact asymptotics of the downstream risk and its fine-grained bias-variance decomposition.
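For orientation, the textbook bias-variance decomposition of the excess squared-error risk is written out below. This is the standard identity, not the fine-grained decomposition derived in the study, which the summary does not reproduce. The notation is assumed here: \hat f_{\mathcal{D}} is the predictor fitted on downstream data \mathcal{D}, f^\star is the target function, and \bar f is the predictor averaged over draws of \mathcal{D}.

```latex
\begin{align*}
R &= \mathbb{E}_{x}\,\mathbb{E}_{\mathcal{D}}
     \big[(\hat f_{\mathcal{D}}(x) - f^\star(x))^2\big] \\
  &= \underbrace{\mathbb{E}_{x}\big[(\bar f(x) - f^\star(x))^2\big]}_{\text{bias}^2}
   + \underbrace{\mathbb{E}_{x}\,\mathbb{E}_{\mathcal{D}}
     \big[(\hat f_{\mathcal{D}}(x) - \bar f(x))^2\big]}_{\text{variance}},
\qquad \bar f(x) = \mathbb{E}_{\mathcal{D}}\big[\hat f_{\mathcal{D}}(x)\big].
\end{align*}
```

The cross term vanishes because \hat f_{\mathcal{D}}(x) - \bar f(x) has zero mean over \mathcal{D}. If test labels are noisy, an irreducible noise term is added to R.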
The study revealed that the optimal featurization is naturally sparse and undergoes a phase transition from hard selection to soft selection of relevant features.
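One generic way to picture the difference between hard and soft selection is sketched below. The summary does not say how the study parameterizes selection, so the relevance scores, the threshold, and both functions are hypothetical. Hard selection is shown as a binary keep-or-drop mask, and soft selection as the shrinkage factor of the standard soft-thresholding operator, which gives graded weights but still sets some features exactly to zero.

```python
import numpy as np


def hard_select(scores, threshold):
    """Binary mask: a feature is either kept (1.0) or dropped (0.0)."""
    return (scores > threshold).astype(float)


def soft_select(scores, threshold):
    """Graded weights in [0, 1): the soft-thresholding shrinkage factor.

    For positive scores s, max(s - t, 0) = s * max(1 - t / s, 0), so the
    returned weight is max(1 - t / s, 0). Assumes all scores are positive.
    """
    return np.maximum(1.0 - threshold / scores, 0.0)


# Hypothetical per-feature relevance scores (positive by assumption).
scores = np.array([0.1, 0.5, 1.0, 4.0])
threshold = 0.5

hard_weights = hard_select(scores, threshold)
soft_weights = soft_select(scores, threshold)
print("hard:", hard_weights)
print("soft:", soft_weights)
```

Both weightings are sparse, since weak features receive weight exactly zero. They differ in whether the surviving features are kept at full strength or down-weighted according to their score.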