This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines.
For Discriminative models, the study examines how different architectures and hyperparameter choices affect energy consumption during training and inference, and it identifies energy-efficient practices.
For Generative AI, the study focuses on Large Language Models (LLMs) and assesses energy consumption across different model sizes and under varying service-request loads.
The results indicate that optimizing architectures, hyperparameters, and hardware can significantly reduce energy consumption for Discriminative models without sacrificing performance.
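To make the measurement setting concrete, the sketch below shows one way energy and emissions of a training or inference run could be tracked inside an MLOps pipeline. It assumes the open-source codecarbon package; the abstract does not name the study's actual measurement tooling, and the function and project names here are illustrative.

```python
# Minimal sketch of per-run energy/emissions tracking, assuming the
# codecarbon package (not necessarily the tooling used in the study).
from codecarbon import EmissionsTracker


def run_with_energy_tracking(step_fn, *args, **kwargs):
    """Execute a training or inference step while tracking estimated emissions."""
    tracker = EmissionsTracker(project_name="mlops-energy-profiling")  # illustrative name
    tracker.start()
    try:
        result = step_fn(*args, **kwargs)  # e.g. model.fit(X, y) or model.predict(X)
    finally:
        emissions_kg = tracker.stop()  # estimated kg CO2-eq for this run
    print(f"Estimated emissions: {emissions_kg:.6f} kg CO2-eq")
    return result
```

Wrapping individual pipeline stages this way makes it possible to compare architectures, hyperparameter settings, and hardware configurations on a per-run basis, which is the kind of comparison the study reports.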