The rise of the Internet of Things (IoT) has driven growing demand for machine learning at the edge, with TinyML emerging as a promising solution for resource-constrained devices such as microcontroller units (MCUs).
This work introduces a new benchmarking methodology that evaluates the performance of TinyML workloads by integrating energy and latency measurements across three execution phases: pre-inference, inference, and post-inference.
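One way to make the per-phase accounting explicit (an assumed measurement model, not a formula taken from the source) is to obtain each phase's energy by integrating the measured supply voltage and current over that phase's interval and summing the contributions:

$$E_{\text{phase}} = \int_{t_{\text{start}}}^{t_{\text{end}}} v(t)\, i(t)\, dt, \qquad E_{\text{total}} = E_{\text{pre}} + E_{\text{inf}} + E_{\text{post}}$$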
The methodology ensures that the device under test operates autonomously, enabling automated, repeated test runs that strengthen statistical significance. Tests on an STM32N6 MCU in both high-performance and low-power configurations revealed how each configuration affects energy efficiency.
The results show that reducing core voltage and clock frequency improves the efficiency of pre- and post-processing without significantly degrading neural network execution performance. The methodology also enables cross-platform comparisons to identify the most efficient inference platform.
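To illustrate the phase-segmented measurement described above, the following is a minimal sketch, not the paper's actual test harness, of how latency and energy could be accumulated per execution phase over repeated runs. The timing and power readings (read_time_ms, read_power_mw) are hypothetical stand-ins for a cycle counter and an external power monitor, stubbed here so the example remains self-contained.

```c
/*
 * Illustrative sketch: per-phase latency and energy accumulation for
 * pre-inference, inference, and post-inference, repeated over many runs.
 * Hardware-specific instrumentation is stubbed with placeholder functions.
 */
#include <stdio.h>

typedef enum { PHASE_PRE, PHASE_INFER, PHASE_POST, PHASE_COUNT } phase_t;

typedef struct {
    double latency_ms[PHASE_COUNT];  /* accumulated latency per phase */
    double energy_mj[PHASE_COUNT];   /* accumulated energy per phase, in mJ */
} phase_stats_t;

/* Hypothetical stand-ins for platform-specific instrumentation. */
static double read_time_ms(void)  { static double t = 0.0; return t += 1.5; }
static double read_power_mw(void) { return 120.0; /* placeholder reading */ }

/* Run one phase, measuring its duration and approximating its energy. */
static void run_phase(phase_stats_t *s, phase_t p, void (*work)(void))
{
    double t0   = read_time_ms();
    double p_mw = read_power_mw();   /* assume roughly constant power over the phase */
    work();
    double dt = read_time_ms() - t0;
    s->latency_ms[p] += dt;
    s->energy_mj[p]  += p_mw * dt / 1000.0;  /* mW * ms = uJ; /1000 -> mJ */
}

/* Placeholder workloads for the three execution phases. */
static void pre_work(void)   { /* e.g. image resize / normalisation */ }
static void infer_work(void) { /* e.g. neural network invocation */ }
static void post_work(void)  { /* e.g. argmax / decoding of outputs */ }

int main(void)
{
    phase_stats_t stats = {0};
    const char *names[PHASE_COUNT] = { "pre-inference", "inference", "post-inference" };

    /* Repeat runs automatically to improve statistical significance. */
    for (int run = 0; run < 100; ++run) {
        run_phase(&stats, PHASE_PRE,   pre_work);
        run_phase(&stats, PHASE_INFER, infer_work);
        run_phase(&stats, PHASE_POST,  post_work);
    }

    for (int p = 0; p < PHASE_COUNT; ++p)
        printf("%-15s latency %8.2f ms, energy %8.3f mJ\n",
               names[p], stats.latency_ms[p], stats.energy_mj[p]);
    return 0;
}
```

Keeping the three phases as separate measurement units is what allows effects like voltage and frequency scaling to be attributed to pre- and post-processing rather than to network execution itself.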