Marktechpost

4d


Yandex Releases Yambda: The World’s Largest Event Dataset to Accelerate Recommender Systems

  • Yandex has released Yambda, the world’s largest publicly available dataset for recommender system research, containing nearly 5 billion anonymized user interaction events from Yandex Music.
  • The dataset aims to narrow the gap between academia and industry by giving researchers industrial-scale behavioral data for developing recommender systems.
  • Recommender systems rely on massive behavioral data to provide personalized experiences, but access to such large, anonymized datasets has been limited.
  • Yambda addresses this with 4.79 billion anonymized user interactions collected over a 10-month period, along with features such as audio embeddings and organic interaction flags.
  • The dataset is provided in Apache Parquet format, making it accessible for researchers and developers using big data processing frameworks.
  • Yandex introduces a Global Temporal Split evaluation strategy in Yambda, which preserves the temporal order of events between training and test data so that models are tested under realistic conditions.
  • Baseline models like MostPop, DecayPop, ItemKNN, and others are included for benchmarking and assessing the performance of new algorithms.
  • Yambda's applicability extends beyond music streaming: it can serve as a benchmark for recommender systems in other domains such as e-commerce and social networks.
  • My Wave, Yandex Music's personalized recommender system, utilizes deep neural networks to offer tailored music suggestions based on user preferences.
  • Yandex emphasizes privacy in the dataset by anonymizing user data and omitting sensitive attributes, to support ethical use of the data for research purposes.
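
To make the evaluation idea above concrete, here is a minimal sketch of a global temporal split followed by two popularity baselines. It is an illustration under assumptions, not Yambda's actual code: the `Event` fields, the toy data, the `split_ts` cutoff, and the half-life form of the decay are all hypothetical, and the summary does not say how Yandex defines MostPop or DecayPop. The sketch takes MostPop to mean ranking items by raw interaction count and DecayPop to mean the same count with older interactions down-weighted exponentially.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    # Hypothetical schema for illustration; not the real Yambda columns.
    user_id: int
    item_id: int
    timestamp: float  # arbitrary time units (assumption)


def global_temporal_split(events, split_ts):
    """Use one cutoff for all users: earlier events train, later events test."""
    train = [e for e in events if e.timestamp < split_ts]
    test = [e for e in events if e.timestamp >= split_ts]
    return train, test


def most_pop(train, k):
    """Rank items by raw interaction count (ties broken by item id)."""
    counts = defaultdict(int)
    for e in train:
        counts[e.item_id] += 1
    return sorted(counts, key=lambda item: (-counts[item], item))[:k]


def decay_pop(train, now, half_life, k):
    """Rank items by count, halving an event's weight every `half_life` units."""
    scores = defaultdict(float)
    for e in train:
        scores[e.item_id] += 0.5 ** ((now - e.timestamp) / half_life)
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Toy data: item 10 was popular early, item 20 became popular recently.
events = [
    Event(1, 10, 0.0), Event(2, 10, 1.0), Event(3, 10, 2.0),
    Event(1, 20, 8.0), Event(2, 20, 9.0),
    Event(3, 30, 11.0),
]

train, test = global_temporal_split(events, split_ts=10.0)
top_all_time = most_pop(train, k=2)
top_recent = decay_pop(train, now=10.0, half_life=2.0, k=2)
```

On this toy data the two baselines disagree: the raw count favors item 10, while the decayed score favors the more recent item 20. Because the cutoff is global, no training event is later than any test event, which is the property that keeps the test realistic.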
