LLMs Have Hit a Wall 

  • “There is no wall,” claims OpenAI chief Sam Altman. Meanwhile, former OpenAI co-founder and Safe Superintelligence (SSI) founder Ilya Sutskever is reportedly working on an alternative approach to scaling LLMs, with the eventual goal of building safe superintelligence.
  • To tackle the scaling challenge, OpenAI plans to scale test-time compute and utilise the high-quality synthetic data generated by previous models (a minimal best-of-N sketch of test-time compute scaling follows this list).
  • Another former OpenAI co-founder and founder of Eureka Labs, Andrej Karpathy, also highlighted that LLMs lack ‘thought process’ data, noting that most current training data consists of fragmented information.
  • Meta’s chief AI scientist, Yann LeCun, has been working on ‘the next thing’ for a while now at FAIR. The company is developing a ‘world model’ with reasoning capabilities akin to those of humans and animals.
  • Meta plans to launch Llama 4 early next year. The company said that it leverages self-supervised learning (SSL) during Llama’s training to help the model learn broad representations of data across domains (a toy illustration of the self-supervised objective appears after this list).
  • Anthropic’s ‘Mapping the Mind of a Large Language Model’ explains that LLMs can make analogies, recognize patterns, and even exhibit reasoning abilities by showing how features can be activated to manipulate responses. The researchers employed a technique called ‘dictionary learning’, borrowed from classical machine learning, which isolates patterns of neuron activations (called features) that recur across different contexts (the classical version of this technique is sketched after this list).
  • In a recent interview, Google DeepMind chief Demis Hassabis explained that Google is focused on more than just scaling. Citing AlphaGo and AlphaZero as examples, he said these systems use RL agents that learn by interacting with an environment (a toy Q-learning loop after this list illustrates the idea).
  • Google DeepMind recently published a paper titled ‘Scaling LLM Test-Time Compute Optimally Can Be More Effective than Scaling Model Parameters’, which echoes OpenAI’s o1 strategy. It showed that a compute-optimal scaling approach can improve test-time compute efficiency by 2-4x.
  • Each step takes us closer to a future where these models may truly understand the world, and perhaps even surpass human intelligence.
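
Neither OpenAI nor Google DeepMind detail their exact inference recipes in the summary above, but a common form of test-time compute scaling is best-of-N sampling against a verifier: spend extra compute at inference by drawing several candidate answers and keeping the one a scoring model prefers. The sketch below is a minimal illustration of that idea; generate and score are hypothetical stand-ins, simulated here so the code runs, not real model APIs.

```python
import random

# Hypothetical stand-ins: in a real system, `generate` would sample a
# candidate answer from an LLM and `score` would be a learned verifier
# or reward model. Both are simulated here so the sketch runs as-is.
def generate(prompt: str) -> str:
    return f"candidate-{random.randint(0, 9)}"

def score(prompt: str, answer: str) -> float:
    return random.random()  # a verifier would estimate answer quality

def best_of_n(prompt: str, n: int) -> str:
    """Spend extra test-time compute: sample n candidates and keep the
    one the verifier ranks highest. Larger n trades inference compute
    for (ideally) better answers."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

print(best_of_n("What is 17 * 24?", n=8))
```

The compute-optimal variant studied in DeepMind’s paper goes further, adapting how much sampling or search to spend per prompt based on estimated difficulty rather than using a fixed n.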
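The Llama bullet mentions self-supervised learning without detail. The defining trait of SSL for language models is that labels come from the raw text itself: each token’s training target is simply the token that follows it, so no human annotation is needed. This toy bigram model is an illustration of that principle, not Llama’s actual objective.

```python
from collections import Counter, defaultdict

# Self-supervised next-token prediction on a toy corpus: the training
# target for each token is just the token that follows it in the text.
corpus = "the cat sat on the mat and the cat ran".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # supervision comes from the data itself

def predict_next(token: str) -> str:
    return counts[token].most_common(1)[0][0]

print(predict_next("the"))  # -> 'cat', the most frequent continuation
```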
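The dictionary learning technique referenced in the Anthropic bullet has a direct classical analogue in scikit-learn. Below is a sketch on synthetic “activations” (the data, dimensions, and hyperparameters are all invented for illustration): the activation matrix is factored into a small dictionary of recurring patterns plus sparse coefficients, which is what lets recurring features be isolated. Anthropic’s own work scales this idea up using sparse autoencoders rather than this classical solver.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Synthetic stand-in for neuron activations: 200 samples of a 16-unit
# layer, built as sparse mixtures of 5 underlying "features".
rng = np.random.default_rng(0)
true_features = rng.normal(size=(5, 16))
mix = rng.random((200, 5)) * (rng.random((200, 5)) < 0.3)
activations = mix @ true_features

# Dictionary learning factors the activations into a small dictionary
# of recurring patterns (rows of components_) plus sparse codes.
dl = DictionaryLearning(n_components=5, alpha=0.1, random_state=0)
codes = dl.fit_transform(activations)

print(dl.components_.shape)           # (5, 16): one learned feature per row
print((np.abs(codes) > 1e-6).mean())  # fraction of nonzero coefficients
```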
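Hassabis’s point about agents that learn from an environment is the core of reinforcement learning. The toy Q-learning loop below is a minimal sketch of the interaction cycle he describes (act, observe a reward, update, repeat), nothing like AlphaZero’s actual search-plus-self-play machinery; the environment and hyperparameters are invented for illustration.

```python
import random

# Toy environment: states 0..4 on a line. The agent starts at 0 and is
# rewarded only on reaching state 4.
N_STATES, ACTIONS = 5, (-1, +1)  # step left or right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for _ in range(500):  # episodes of interaction
    s = 0
    while s != N_STATES - 1:
        if random.random() < epsilon:          # explore
            a = random.choice(ACTIONS)
        else:                                  # exploit current estimates
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)  # environment transition
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: bootstrap from the best next-state value.
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
        s = s2

# The learned greedy policy should step right (+1) from every state.
print([max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(N_STATES - 1)])
```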
