SPIO is a framework that uses Large Language Models (LLMs) to orchestrate multi-agent planning in automated data science.
SPIO consists of four key modules: data preprocessing, feature engineering, modeling, and hyperparameter tuning.
In each module, dedicated planning agents generate multiple candidate strategies, broadening the set of solution paths explored.
SPIO offers two variants: SPIO-S, which selects the single best solution path as determined by an LLM, and SPIO-E, which ensembles the top-k candidate plans for improved predictive performance.
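
The difference between the two variants can be illustrated with a minimal Python sketch. It is not SPIO's implementation: the `CandidatePlan` structure, the numeric `score` field (a stand-in for the LLM's ranking of a plan), and the use of an unweighted mean as the ensembling rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class CandidatePlan:
    """One candidate solution path (hypothetical structure, not taken from SPIO)."""
    name: str
    score: float  # assumed stand-in for the LLM's ranking of this plan
    predict: Callable[[Sequence[float]], float]  # prediction function of the fitted pipeline


def select_best(plans: List[CandidatePlan]) -> CandidatePlan:
    """SPIO-S style selection: keep only the highest-ranked plan."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    return max(plans, key=lambda p: p.score)


def ensemble_top_k(plans: List[CandidatePlan], k: int) -> Callable[[Sequence[float]], float]:
    """SPIO-E style ensembling: combine the k highest-ranked plans."""
    if not plans or k < 1:
        raise ValueError("need at least one plan and k >= 1")
    top = sorted(plans, key=lambda p: p.score, reverse=True)[:k]

    def predict(x: Sequence[float]) -> float:
        # Assumption: an unweighted mean of the individual predictions.
        return sum(p.predict(x) for p in top) / len(top)

    return predict


# Toy plans whose "models" return constants, so the arithmetic is easy to follow.
plans = [
    CandidatePlan("plan_a", 0.9, lambda x: 1.0),
    CandidatePlan("plan_b", 0.7, lambda x: 3.0),
    CandidatePlan("plan_c", 0.5, lambda x: 10.0),
]

best = select_best(plans)              # single-path variant
ensemble = ensemble_top_k(plans, k=2)  # ensemble variant over the top 2 plans

print(best.name)       # plan_a
print(ensemble([0.0])) # 2.0, the mean of plan_a and plan_b
```

In this toy setup the single-path variant returns `plan_a` alone, while the ensemble with k = 2 averages `plan_a` and `plan_b`. For classification tasks, voting or averaging predicted probabilities would be a natural substitute for the mean used here.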