Efficient neural networks (NNs) using lookup tables (LUTs) have shown potential for AI applications on FPGAs.
NeuraLUT-Assemble is a framework that combines mixed-precision techniques with the assembly of larger neurons to improve the accuracy and connectivity of LUT-based designs.
It introduces skip connections to improve gradient flow during training and achieves competitive accuracy across several tasks.
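
To make the idea of assembling a larger neuron from LUTs concrete, the following toy sketch enumerates small quantized functions into lookup tables and wires them into a two-level tree. It is an illustration under stated assumptions, not the framework's training or synthesis flow: the helper names (`build_lut`, `quantized_neuron`, `assembled_neuron`), the integer weights, and the 2-bit precision are all hypothetical, and skip connections are not modelled.

```python
from itertools import product


def build_lut(fn, fan_in, bits):
    """Enumerate fn over every combination of `fan_in` inputs, each `bits` wide."""
    levels = range(2 ** bits)
    return {inputs: fn(inputs) for inputs in product(levels, repeat=fan_in)}


def quantized_neuron(weights, bias, out_bits):
    """Hypothetical sub-neuron: weighted sum, ReLU, then saturation to out_bits."""
    max_out = 2 ** out_bits - 1

    def fn(inputs):
        acc = sum(w * x for w, x in zip(weights, inputs)) + bias
        return max(0, min(max_out, acc))

    return fn


# Two leaf LUTs each see 2 inputs of 2 bits; a root LUT combines their 2-bit outputs.
leaf_a = build_lut(quantized_neuron([1, -1], 0, out_bits=2), fan_in=2, bits=2)
leaf_b = build_lut(quantized_neuron([1, 1], -1, out_bits=2), fan_in=2, bits=2)
root = build_lut(quantized_neuron([1, 1], 0, out_bits=2), fan_in=2, bits=2)


def assembled_neuron(x):
    """Evaluate the 4-input tree using table lookups only (x is a 4-tuple)."""
    return root[(leaf_a[x[0:2]], leaf_b[x[2:4]])]
```

In this toy setting each table holds 2^(2·2) = 16 entries, so the tree covers four 2-bit inputs with 48 entries in total, whereas a single table over the same inputs would need 2^(2·4) = 256. The tree can only represent a subset of the functions the single table can, which is the trade-off such an assembly accepts in exchange for wider fan-in.
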
NeuraLUT-Assemble also demonstrates an up to 8.42× reduction in area-delay product compared with the state of the art at the time of publication.
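
For readers unfamiliar with the metric, the area-delay product is simply area multiplied by delay, and a reduction factor is the ratio of two such products. The sketch below assumes area is counted in LUTs and delay in nanoseconds; the function names are hypothetical and the numbers are placeholders chosen for easy arithmetic, not results from the paper.

```python
def area_delay_product(luts, latency_ns):
    """Area-delay product, assuming area in LUTs and delay in nanoseconds."""
    return luts * latency_ns


def adp_reduction(baseline, candidate):
    """Factor by which `candidate` lowers the ADP relative to `baseline`.

    Each argument is a (luts, latency_ns) pair.
    """
    return area_delay_product(*baseline) / area_delay_product(*candidate)


# Placeholder designs (NOT measurements from the paper).
baseline_design = (1000, 4.0)   # 1000 LUTs, 4.0 ns
candidate_design = (250, 2.0)   # 250 LUTs, 2.0 ns

print(adp_reduction(baseline_design, candidate_design))  # 8.0
```

Because the metric is a product, a design can reach a large reduction factor through savings in area, in delay, or in both at once.
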