Workflows are fundamental to automation in enterprise platforms, but building them can be complex. A new framework called StarFlow uses vision-language models to automatically generate structured workflows from visual inputs. To support training and evaluation, a diverse dataset of workflow diagrams was curated. The results demonstrate that finetuning enhances structured workflow generation, with finetuned models outperforming large vision-language models on this task.