Analyticsindiamag

Think Twice Before Joining Bluesky

  • Bluesky, a microblogging alternative to X (formerly Twitter), doubled its user base between September and November 20, reaching 20 million users.
  • The platform competes with Elon Musk’s X, which has approximately 611 million monthly active users, and Meta’s Threads, which has 275 million.
  • Bluesky offers an open API, which allows its data to be used for training AI models, unlike X. Daniel van Strien, a machine learning engineer at Hugging Face, recently released a controversial dataset of one million public posts sourced from Bluesky’s Firehose API without user consent.
  • Clem Delangue, CEO of Hugging Face, responded on X, claiming that “there are a lot of toxic users on Bluesky.” Bluesky itself says it has no intention of using user content to train generative AI.
  • X updated its terms of service to state that users who upload content permit X to analyze it, including to help train machine learning and artificial intelligence models. This change led users to migrate to Bluesky.
  • Meta’s updated privacy policy also specifies that it trains its models using users’ posts, photos, and captions.
  • Startups like OpenAI and Anthropic have already exhausted human-generated content to train their models and now rely on synthetic data for their upcoming frontier models.
  • However, user consent is still essential.
  • In India, Sarvam AI is using synthetic data created by Meta Llama 3.1 405B to train its model, while OpenAI reportedly uses Strawberry to generate synthetic data for GPT-5.
  • This sets up a ‘recursive improvement cycle,’ in which each GPT version is trained on higher-quality synthetic data created by the previous model.
