Text-to-text regression is proposed as a scalable alternative for predicting metric outcomes of large systems where traditional tabular regression methods struggle.
A 60M-parameter encoder-decoder model trained from random initialization achieves high accuracy in predicting resource efficiency on Google's massive compute cluster scheduling system.
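For intuition, a minimal sketch of this setup (not the paper's actual implementation) might look like the following, using Hugging Face's T5 classes as a stand-in for a small encoder-decoder trained from scratch; the field names and the efficiency target shown are hypothetical.

```python
# Minimal sketch of text-to-text regression: serialize the system state as a
# string and train a small encoder-decoder (randomly initialized, roughly
# t5-small scale, ~60M params) to decode the target metric as text.
# Hugging Face T5 classes are used only as an illustrative stand-in.
from transformers import T5Config, T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")   # reuse an existing vocabulary
model = T5ForConditionalGeneration(T5Config())            # random init, t5-small-sized config

state_text = "cell=alpha cpu_request=12.5 mem_request=64GiB priority=200"  # hypothetical features
target_text = "0.87"                                       # e.g. a resource-efficiency score, as text

inputs = tokenizer(state_text, return_tensors="pt")
labels = tokenizer(target_text, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss                 # cross-entropy over the output tokens
loss.backward()                                            # gradients for one training step
```

The key design choice this illustrates is that both inputs and targets are plain strings, so arbitrary system state can be fed in without hand-crafted tabular featurization.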
The text-to-text regression model outperforms tabular approaches with a near-perfect rank correlation and significantly lower mean squared error across the system fleet.
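As a rough illustration of the evaluation metrics involved, rank correlation and mean squared error between observed and predicted values could be computed as below; the numbers are toy data, not the paper's results.

```python
# Toy illustration of the two evaluation metrics: Spearman rank correlation
# and mean squared error between observed and predicted efficiency values.
import numpy as np
from scipy.stats import spearmanr

observed = np.array([0.62, 0.71, 0.55, 0.90, 0.80])    # hypothetical ground truth
predicted = np.array([0.60, 0.74, 0.50, 0.88, 0.83])   # hypothetical model outputs

rho, _ = spearmanr(observed, predicted)                 # rank correlation
mse = float(np.mean((observed - predicted) ** 2))       # mean squared error
print(f"Spearman rho = {rho:.3f}, MSE = {mse:.4f}")
```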
The model adapts to new tasks from only a few examples and effectively captures complex outcome distributions, with important implications for universal simulators of real-world outcomes.
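Because the metric is decoded as text, one way to express few-shot conditioning and to read off an outcome distribution is to prepend example pairs to the input and sample multiple decodes. The sketch below continues the toy T5 setup above and is an assumption about usage, not the paper's procedure.

```python
# Illustrative only (continues the toy setup above): few-shot examples can be
# prepended to the input text, and sampling several decodes gives an empirical
# distribution over the predicted metric rather than a single point estimate.
import numpy as np

few_shot_prompt = (
    "cell=alpha cpu_request=8.0 priority=120 -> 0.74 "
    "cell=alpha cpu_request=20.0 priority=360 -> 0.91 "
    "cell=alpha cpu_request=12.5 priority=200 -> "
)
inputs = tokenizer(few_shot_prompt, return_tensors="pt")

outputs = model.generate(**inputs, do_sample=True, num_return_sequences=32, max_new_tokens=8)
decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)

samples = []
for text in decoded:
    try:
        samples.append(float(text))            # keep only well-formed numeric strings
    except ValueError:
        pass
if samples:
    print("mean:", np.mean(samples), "std:", np.std(samples))
```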