The AWS Neuron SDK supports developing and running new kernels on AWS Trainium and AWS Inferentia.
The Neuron Kernel Interface (NKI) joins another API that enables NeuronCore-v2 programmability.
By exposing APIs for Neuron kernel customization, the SDK lets developers create and optimize low-level operations, widening the range of ML workloads that can run on Trainium and Inferentia.
The NKI documentation includes a dedicated section on the architecture of NeuronCore-v2 and its implications for custom operator development.
Similar to other dedicated AI chips, NeuronCore-v2 includes several internal acceleration engines, each of which specializes in performing certain types of computations.
The NKI API Reference Manual details the Python API for kernel development.
The second method for creating a custom Neuron kernel involves building a C++ operator for the GpSimd engine.
Through its high-level Python interface, NKI exposes the power of the Neuron acceleration engines to ML developers in an accessible manner.