VentureBeat
Alibaba researchers unveil Marco-o1, an LLM with advanced reasoning capabilities

  • Alibaba researchers have announced Marco-o1, a large reasoning model (LRM) that applies advanced reasoning techniques to open-ended problems.
  • Marco-o1 is a fine-tuned version of Alibaba's Qwen2-7B-Instruct and combines techniques such as Monte Carlo Tree Search (MCTS) and reasoning action strategies.
  • The model uses MCTS to explore various reasoning paths as it generates response tokens.
  • Marco-o1 also uses a flexible reasoning action strategy that adjusts the granularity of MCTS by defining how many tokens are generated at each node.
  • The introduction of a reflection mechanism allows the model to identify potential reasoning errors in its thought process.
  • The researchers conducted experiments on several tasks, including multilingual grade-school math problems and translating colloquial phrases and slang expressions.
  • Marco-o1 significantly outperformed the base Qwen2-7B model, particularly when the MCTS component was adjusted for single-token granularity.
  • The model is designed to help with scenarios that require deep contextual understanding and lack well-defined evaluation metrics.
  • A partial reasoning dataset accompanies the Marco-o1 release on Hugging Face.
  • The open-source community is also catching up with the private model market, releasing models and datasets that take advantage of inference-time scaling laws.
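The MCTS-driven exploration described above can be sketched in miniature. The following Python toy is purely illustrative, not Marco-o1's actual code: `propose_steps` and `score_path` are hypothetical stand-ins for the LLM's candidate continuations and its confidence score, and `step_size` stands in for the per-node token granularity the researchers tune.

```python
import math
import random

# Hypothetical stand-in for the model: given a partial reasoning path,
# propose candidate next reasoning steps. In a real LRM these would be
# LLM-generated spans of step_size tokens; here they are labeled strings.
def propose_steps(path, step_size, branching=3):
    return [f"step{len(path)}.{i}(x{step_size})" for i in range(branching)]

# Hypothetical stand-in for the reward signal: a score for a (partial)
# reasoning path, e.g. the model's confidence in it.
def score_path(path, rng):
    return rng.random() + 0.1 * len(path)

class Node:
    def __init__(self, path, parent=None):
        self.path = path          # the reasoning steps taken so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def ucb(self, c=1.4):
        # Unvisited children are explored first.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def mcts(step_size, iterations=50, max_depth=4, seed=0):
    rng = random.Random(seed)
    root = Node([])
    for _ in range(iterations):
        # Selection: descend by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: add candidate continuations below the leaf.
        if len(node.path) < max_depth:
            for step in propose_steps(node.path, step_size):
                node.children.append(Node(node.path + [step], parent=node))
            node = rng.choice(node.children)
        # Simulation: score the partial reasoning path.
        reward = score_path(node.path, rng)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # The most-visited first step is the preferred reasoning move.
    best = max(root.children, key=lambda n: n.visits)
    return best.path[0]
```

A smaller `step_size` (down to single tokens, as in the reported experiments) means each node commits to less text, so the tree explores more fine-grained alternatives at the cost of a deeper search.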
