
Source: arXiv


RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

  • Rule-based reasoning is a fundamental problem, but it is hard in real-world applications because rule formats and complexity vary widely.
  • Large reasoning models enhanced by reinforcement learning have shown remarkable capabilities.
  • The effectiveness of small reasoning models in learning rule-based reasoning with generalization across tasks and domains remains an open question.
  • The proposed method, RuleReasoner, learns rule-based reasoning by training on a wide range of tasks with domain-aware dynamic sampling.
  • RuleReasoner resamples each training batch by updating per-domain sampling weights based on historical rewards, which facilitates domain augmentation and flexible learning schedules.
  • Empirical evaluations show that RuleReasoner significantly outperforms leading large reasoning models on both in-distribution and out-of-distribution benchmarks.
  • The approach is also more computationally efficient than previous dynamic sampling methods for reinforcement learning.
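
The summary says that sampling weights are updated from historical rewards but does not give the update rule. The Python sketch below is one plausible reading, not the paper's implementation. It assumes rewards lie in [0, 1], tracks an exponential moving average of reward per domain, and turns "1 minus average reward" into sampling weights with a softmax, so that domains the model handles poorly are drawn more often. The class name `DomainSampler`, the `decay` and `temperature` parameters, and the domain names are all hypothetical.

```python
import math
import random


class DomainSampler:
    """Hypothetical sketch: favor domains whose historical reward is low."""

    def __init__(self, domains, decay=0.9, temperature=0.5, seed=0):
        self.domains = list(domains)
        self.decay = decay              # assumed smoothing factor for reward history
        self.temperature = temperature  # assumed softmax temperature
        self.avg_reward = {d: 0.0 for d in self.domains}
        self.rng = random.Random(seed)

    def update(self, domain, reward):
        """Fold a new reward (assumed in [0, 1]) into the domain's running average."""
        old = self.avg_reward[domain]
        self.avg_reward[domain] = self.decay * old + (1 - self.decay) * reward

    def weights(self):
        """Softmax over (1 - average reward); lower reward means higher weight."""
        scores = [(1.0 - self.avg_reward[d]) / self.temperature for d in self.domains]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        return {d: e / total for d, e in zip(self.domains, exps)}

    def sample_batch(self, pools, batch_size):
        """Draw (domain, example) pairs; pools maps each domain to its examples."""
        w = self.weights()
        picked = self.rng.choices(
            self.domains, weights=[w[d] for d in self.domains], k=batch_size
        )
        return [(d, self.rng.choice(pools[d])) for d in picked]


if __name__ == "__main__":
    # Hypothetical domains and example pools, for illustration only.
    pools = {"logic": ["q1", "q2"], "arithmetic": ["q3", "q4"]}
    sampler = DomainSampler(pools.keys())
    sampler.update("logic", 1.0)       # the model did well on a logic example
    sampler.update("arithmetic", 0.0)  # and poorly on an arithmetic one
    print(sampler.weights())
    print(sampler.sample_batch(pools, batch_size=4))
```

In a training loop, `update` would be called with each rollout's reward and `sample_batch` would build the next batch. A lower `temperature` concentrates sampling on the weakest domains, while a higher one moves it toward uniform.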
