Think Twice Before Joining Bluesky

Analyticsindiamag

Bluesky, a microblogging alternative to X (formerly Twitter), doubled its user base between September and November 20, reaching 20 million users.
It competes with Elon Musk’s X, which has approximately 611 million monthly active users, and Meta’s Threads, which has 275 million.
Bluesky offers an open API, which allows its data to be used for training AI models, unlike X. Daniel van Strien, a machine learning engineer at Hugging Face, recently released a controversial dataset of one million public posts sourced from Bluesky’s Firehose API without user consent.
Clem Delangue, CEO of Hugging Face, responded on X claiming “there are a lot of toxic users on Bluesky.” Bluesky itself has no intention of using user content to train generative AI.
X updated its terms of service to state that users who upload content permit X to use it for analysis, including to help train machine learning and artificial intelligence models. This change led users to migrate to Bluesky.
Meta’s updated privacy policy also specifies that it trains its models using users’ posts, photos, and captions.
Startups like OpenAI and Anthropic have already exhausted human-generated content to train their models and now rely on synthetic data for their upcoming frontier models.
However, user consent is still essential.
In India, Sarvam AI is using synthetic data created by Meta Llama 3.1 405B to train its model, while OpenAI reportedly uses Strawberry to generate synthetic data for GPT-5.
This sets up a ‘recursive improvement cycle,’ where each GPT version is trained on higher-quality synthetic data created by the previous model.
