Embeddings serve as a basic feature-extraction step in many machine learning models, particularly in natural language processing.
A probability model is used to study the learning capability of embeddings, in which the correlation between random variables is governed by the similarity of the corresponding embeddings.
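As a hedged illustration of such a model (not necessarily the exact formulation studied here), one could consider variables $X_1, \dots, X_n$ whose pairwise correlation is an increasing function of the inner product of their embeddings $u_i \in \mathbb{R}^d$, for instance

$$
\mathbb{E}[X_i X_j] = \rho\!\left(\tfrac{1}{d}\langle u_i, u_j \rangle\right), \qquad \rho \text{ increasing},
$$

so that more similar embeddings induce more strongly correlated variables; the symbols $u_i$, $d$, and $\rho$ are illustrative assumptions rather than notation from the text.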
The low-rank approximate message passing (AMP) method can be used to learn the embeddings.
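To make the AMP step concrete, the following is a minimal sketch of rank-one AMP on a hypothetical symmetric spiked-matrix observation $Y = \tfrac{\lambda}{n} x x^\top + W$; the model, the tanh denoiser, and all parameter values are illustrative assumptions rather than the algorithm or data considered here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, snr, n_iters = 2000, 2.0, 20

# Ground-truth latent vector with +/-1 entries (a stand-in for an embedding direction).
x_true = rng.choice([-1.0, 1.0], size=n)

# Symmetric Gaussian noise with entrywise variance ~1/n, plus the rank-one signal.
W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
W = (W + W.T) / np.sqrt(2.0)
Y = (snr / n) * np.outer(x_true, x_true) + W

# AMP iteration: s_{t+1} = Y f(s_t) - b_t f(s_{t-1}),
# with denoiser f = tanh and Onsager correction b_t = mean(f'(s_t)).
s = 0.1 * rng.normal(size=n)        # weakly informative initialization
f_prev = np.zeros(n)
for t in range(n_iters):
    f_cur = np.tanh(s)
    b = np.mean(1.0 - f_cur ** 2)   # Onsager term
    s = Y @ f_cur - b * f_prev
    f_prev = f_cur
    est = np.tanh(s)
    overlap = abs(est @ x_true) / (np.linalg.norm(est) * np.sqrt(n))
    print(f"iter {t:2d}  overlap with ground truth: {overlap:.3f}")
```

In this toy setting, for signal-to-noise ratios above the spectral threshold the printed overlap is expected to stabilize at a positive value after a few iterations, which is the qualitative behavior the AMP analysis describes.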
The theoretical findings are validated through simulations on both synthetic and real text data.