Analog in-memory computing (AIMC) is a promising compute paradigm that aims to deliver faster, more power-efficient neural network inference than traditional von Neumann-based architectures.
Challenges like noisy computations and strict input/output quantization constraints hinder the performance of off-the-shelf Large Language Models (LLMs) when deployed on AIMC-based hardware.
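To make these constraints concrete, the sketch below simulates a single analog matrix-vector multiply with additive weight noise and fixed-range 8-bit input/output quantization. The noise model, bit widths, and ranges are illustrative assumptions, not a specification of any particular AIMC device.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, n_bits=8, max_val=1.0):
    """Uniform symmetric quantization to n_bits over the range [-max_val, max_val]."""
    levels = 2 ** (n_bits - 1) - 1
    x = np.clip(x, -max_val, max_val)
    return np.round(x / max_val * levels) / levels * max_val

def analog_matvec(W, x, weight_noise_std=0.02):
    """Simulate one analog matrix-vector multiply with weight noise and 8-bit I/O."""
    x_q = quantize(x)                                         # input converter: 8-bit, fixed range
    W_noisy = W + rng.normal(0.0, weight_noise_std, W.shape)  # programming/read noise on the weights
    y = W_noisy @ x_q                                         # analog accumulation
    out_range = np.abs(W).sum(axis=1).max()                   # crude bound on the output magnitude
    return quantize(y, max_val=out_range)                     # output converter: 8-bit

W = rng.standard_normal((16, 64)) * 0.1
x = rng.standard_normal(64) * 0.5
print("deviation from ideal:", np.linalg.norm(analog_matvec(W, x) - W @ x))
```

An off-the-shelf model sees only the ideal product `W @ x` during training, so the combined effect of the noise and the converters is what degrades its accuracy at deployment time.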
A new method has been introduced to adapt LLMs for execution on noisy, low-precision analog hardware, allowing advanced models to retain performance comparable to that of models quantized to 4-bit weights and 8-bit activations, despite the hardware's noise and quantization constraints.
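The published method is not reproduced here, but a common ingredient of this kind of adaptation is hardware-aware fine-tuning, in which the forward pass injects weight noise and fake-quantizes activations while a straight-through estimator keeps gradients flowing to the full-precision weights. The PyTorch sketch below illustrates that generic recipe under assumed noise levels and bit widths; it is not the specific technique of the work summarized above.

```python
import torch
import torch.nn as nn

def ste_quantize(x, n_bits=8):
    """Fake-quantize x to n_bits; straight-through estimator for the backward pass."""
    levels = 2 ** (n_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / levels
    x_q = torch.round(x / scale) * scale
    return x + (x_q - x).detach()          # forward: quantized values, backward: identity

class NoisyQuantLinear(nn.Linear):
    """Linear layer whose forward pass mimics a noisy analog tile with 8-bit I/O."""
    def __init__(self, in_features, out_features, noise_std=0.02):
        super().__init__(in_features, out_features)
        self.noise_std = noise_std

    def forward(self, x):
        x_q = ste_quantize(x)                               # 8-bit input
        w = self.weight
        if self.training:                                   # inject noise during training only
            w = w + torch.randn_like(w) * self.noise_std * w.abs().max()
        y = nn.functional.linear(x_q, w, self.bias)
        return ste_quantize(y)                              # 8-bit output

# Usage: replace the nn.Linear layers of a pretrained model with this module
# (copying their weights), then fine-tune so the model learns to tolerate the
# noise and quantization it will face on the target hardware.
layer = NoisyQuantLinear(64, 16)
out = layer(torch.randn(4, 64))
print(out.shape)   # torch.Size([4, 16])
```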
Models developed through this approach can also be quantized for inference on low-precision digital hardware, and they show better scaling behavior than models trained with 4-bit weight and 8-bit static input quantization.