Recent advances in tabular classification leverage the in-context learning capability of Large Language Models (LLMs); TabFlex is a recent model in this line that improves efficiency and scalability on larger datasets.
TabFlex incorporates a linear attention mechanism, which lets it scale to tabular datasets with thousands of features and hundreds of classes, processing over a million samples in about 5 seconds.
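To make the scaling argument concrete, the sketch below contrasts standard softmax attention, whose score matrix grows quadratically with the number of rows, against the general kernelized linear-attention trick of reassociating the matrix product. This is a minimal NumPy illustration of the idea, not TabFlex's actual architecture; the feature map `phi` is an arbitrary illustrative choice.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the n x n score matrix costs O(n^2) time and memory.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Kernelized (linear) attention: reassociating the product as
    # phi(Q) @ (phi(K).T @ V) avoids the n x n matrix, giving cost
    # linear in the sequence length n -- the property that allows a
    # model treating each table row as a token to reach ~1M samples.
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                      # (d, d_v) summary of keys and values
    normalizer = Qp @ Kp.sum(axis=0)   # per-query normalization term
    return (Qp @ kv) / normalizer[:, None]

n, d = 4096, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_attention(Q, K, V)        # never materializes an n x n matrix
print(out.shape)                       # (4096, 64)
```

Because the n x n weight matrix is never formed, compute and memory grow linearly with the number of rows placed in context.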
In extensive evaluations across diverse datasets, TabFlex achieves over a 2x speedup over TabPFN and a 1.5x speedup over XGBoost, and it outperforms all 25 tested baselines in efficiency.
When combined with data-efficient techniques such as dimensionality reduction and data sampling, TabFlex maintains strong performance on large-scale datasets at reduced computational cost.
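A minimal sketch of what such preprocessing might look like, assuming a scikit-learn-style workflow; the `shrink_table` helper and its thresholds (`max_features`, `max_samples`) are hypothetical choices for illustration, not values taken from the TabFlex paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

def shrink_table(X, y, max_features=100, max_samples=50_000, seed=0):
    """Shrink a large table before in-context classification.

    PCA caps the feature count and stratified subsampling caps the
    context size; both thresholds are illustrative defaults.
    """
    if X.shape[1] > max_features:
        X = PCA(n_components=max_features, random_state=seed).fit_transform(X)
    if X.shape[0] > max_samples:
        X, _, y, _ = train_test_split(
            X, y, train_size=max_samples, stratify=y, random_state=seed
        )
    return X, y

# Demo on synthetic data (shapes only; real use would pass actual features/labels).
rng = np.random.default_rng(0)
X = rng.standard_normal((5_000, 300))
y = rng.integers(0, 5, size=5_000)
X_small, y_small = shrink_table(X, y, max_features=50, max_samples=2_000)
print(X_small.shape, y_small.shape)    # (2000, 50) (2000,)
```

The reduced table can then be handed to any in-context tabular classifier, trading a small amount of information for a much smaller attention context.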