This article explains the mechanisms underlying a Transformer model. The illustration shows how Query and Key relate to each other in the model. Query, Key, and Value are derived from the embedding matrices in the model. A Transformer model consists of attention heads, layer normalization, residual connections, and an MLP layer.
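
As a rough illustration of how Query, Key, and Value relate, here is a minimal NumPy sketch of a single attention head. It assumes the common scaled dot-product formulation, in which each of the three is obtained by multiplying the token embeddings by its own weight matrix. The names `X`, `W_q`, `W_k`, `W_v`, `single_head_attention`, and the toy dimensions are hypothetical choices for this example, not taken from the article, and random values stand in for learned weights.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def single_head_attention(X, W_q, W_k, W_v):
    """Sketch of one attention head (assumed scaled dot-product form).

    X:   (seq_len, d_model) token embeddings
    W_*: (d_model, d_head) projection matrices (hypothetical names)
    """
    Q = X @ W_q  # queries: what each token is looking for
    K = X @ W_k  # keys: what each token offers for matching
    V = X @ W_v  # values: the content that gets mixed together

    # Query-Key similarity, scaled by sqrt(d_head).
    scores = Q @ K.T / np.sqrt(K.shape[-1])

    # Each row becomes a probability distribution over all tokens.
    weights = softmax(scores, axis=-1)

    # Each output row is a weighted average of the value vectors.
    return weights @ V, weights


# Toy sizes and random values, chosen only for illustration.
rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 3
X = rng.normal(size=(seq_len, d_model))
W_q = rng.normal(size=(d_model, d_head))
W_k = rng.normal(size=(d_model, d_head))
W_v = rng.normal(size=(d_model, d_head))

out, weights = single_head_attention(X, W_q, W_k, W_v)
print(out.shape)      # one d_head-sized vector per token
print(weights.shape)  # one attention weight per (query, key) pair
```

In this sketch, the `weights` matrix is the Query-Key relationship: row `i` says how strongly token `i` attends to every other token. A full Transformer layer would run several such heads in parallel and wrap them with residual connections, layer normalization, and an MLP, all of which are omitted here for brevity.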