Source: Medium

How LLMs Learn: The Pre-Training Phase Explained

  • Large language models (LLMs) learn during the pre-training phase by ingesting huge amounts of text, from which they pick up the rules and context of language.
  • Common Crawl provides data from 250 billion web pages for pre-training, but preprocessing to remove noise is crucial.
  • Tokenization breaks text into manageable tokens for numerical processing, with methods like Byte Pair Encoding (BPE) being common (a minimal BPE sketch follows this list).
  • Models like GPT-4o use subword-based tokenization to handle large vocabularies more efficiently.
  • Training involves Next Token Prediction and Masked Language Modeling to learn language structure and the relationships between tokens (see the next-token-prediction sketch after this list).
  • Base models learn to generate text one token at a time and serve as a starting point for further fine-tuning (a generation-loop sketch also follows the list).
  • Base models can memorize text patterns but may struggle with reasoning tasks due to limited structured understanding.
  • In-context memory allows base models to adjust responses based on the provided context, demonstrating versatility without fine-tuning.
  • Base models excel at replicating text based on memorized patterns but may lack originality and deep reasoning abilities.
  • In the pre-training phase, LLMs develop foundational skills by learning from raw data before advanced techniques are applied for post-training.
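
To make the tokenization step concrete, here is a minimal, self-contained sketch of how BPE learns merge rules; the toy corpus, the number of merges, and the "</w>" end-of-word marker are illustrative choices, not the recipe of any particular production tokenizer.

```python
# Minimal Byte Pair Encoding (BPE) training sketch: repeatedly merge the most
# frequent adjacent symbol pair in the corpus into a single new symbol.
from collections import Counter

def get_pair_counts(words):
    """Count how often each adjacent symbol pair occurs across the corpus."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Rewrite every word, replacing occurrences of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus (word -> frequency), each word split into characters plus an end marker.
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
words = {tuple(w) + ("</w>",): f for w, f in corpus.items()}

merges = []
for _ in range(10):                    # learn 10 merge rules for illustration
    pairs = get_pair_counts(words)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)   # most frequent adjacent pair
    words = merge_pair(words, best)
    merges.append(best)

print(merges)        # learned merge rules, most frequent first
print(list(words))   # corpus words after applying the merges
```

Production tokenizers, including the byte-level BPE used by models like GPT-4o, apply the same idea over far larger corpora, which is how subword vocabularies stay compact while still covering arbitrary text.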
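The two training objectives can likewise be sketched in a few lines: for next token prediction the sequence is shifted by one position to form targets, and the loss is the cross-entropy of the predicted distribution at each position; masked language modeling instead hides some positions and asks the model to recover them. The token ids, vocabulary size, and random logits below are made up for illustration.

```python
# Next-token-prediction sketch with made-up token ids and random stand-in logits.
import numpy as np

token_ids = np.array([17, 4, 92, 8, 55, 3])   # one tokenized training sequence (hypothetical ids)

# The model sees each prefix and is trained to predict the token that follows it.
inputs = token_ids[:-1]    # [17, 4, 92, 8, 55]
targets = token_ids[1:]    # [ 4, 92,  8, 55, 3]

# Stand-in for model output: unnormalized scores over a 100-token vocabulary per position.
rng = np.random.default_rng(0)
logits = rng.normal(size=(len(inputs), 100))

# Cross-entropy loss: -log p(target | prefix), averaged over positions.
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))  # log-softmax
loss = -log_probs[np.arange(len(targets)), targets].mean()
print(f"next-token prediction loss: {loss:.3f}")

# Masked language modeling instead hides chosen positions and predicts them:
mask_id = 0                   # hypothetical [MASK] token id
masked = token_ids.copy()
masked[[1, 4]] = mask_id      # the model must recover the original ids 4 and 55
```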
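Finally, a rough sketch of the one-token-at-a-time generation loop a base model runs at inference time: each new token is sampled from the model's predicted distribution and appended to the context, which is also what lets in-context information steer the continuation. `toy_model` here is only a stand-in for a trained transformer, and the prompt ids are hypothetical.

```python
# Autoregressive generation loop with a toy stand-in model.
import numpy as np

VOCAB_SIZE = 100
sampler = np.random.default_rng(42)

def toy_model(context):
    """Stand-in for a trained LLM: returns next-token logits given the context."""
    seed = hash(tuple(context)) % (2 ** 32)        # deterministic in the context
    return np.random.default_rng(seed).normal(size=VOCAB_SIZE)

def generate(prompt_ids, max_new_tokens=10, temperature=1.0):
    context = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = toy_model(context) / temperature
        probs = np.exp(logits - logits.max())      # softmax over the vocabulary
        probs /= probs.sum()
        next_id = int(sampler.choice(VOCAB_SIZE, p=probs))  # sample the next token
        context.append(next_id)                    # feed it back in as new context
    return context

print(generate([17, 4, 92]))   # hypothetical prompt ids; output is a list of token ids
```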
