techminis
A naukri.com initiative

ML News

Arxiv · 4d

GFRIEND: Generative Few-shot Reward Inference through EfficieNt DPO

  • GFRIEND: Generative Few-shot Reward Inference through EfficieNt DPO is a framework proposed for training reward models efficiently in Reinforcement Learning from Human Feedback (RLHF).
  • The framework introduces data augmentation and expansion techniques to train generative reward models on small datasets effectively.
  • Preference refinement, Chain-of-Thought (CoT) sampling, perplexity-based scoring, and Multi-level Direct Preference Optimization (M-DPO) are key components of this framework.
  • Experimental results show that the proposed method enhances data efficiency and model performance, enabling few-shot trained reward models to perform comparably to those trained on large-scale datasets.
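The perplexity-based scoring step can be sketched in a few lines — a minimal illustration, not the paper's implementation, assuming each sampled CoT chain comes with per-token log-probabilities; `rank_chains` and the field names are hypothetical:

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a sampled chain: exp of the negative mean token log-prob."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def rank_chains(chains):
    """Sort sampled CoT chains from lowest to highest perplexity (most to least confident)."""
    return sorted(chains, key=lambda c: perplexity(c["token_logprobs"]))

chains = [
    {"text": "chain A", "token_logprobs": [-0.2, -0.1, -0.3]},
    {"text": "chain B", "token_logprobs": [-1.5, -2.0, -1.0]},
]
ranked = rank_chains(chains)
best, worst = ranked[0], ranked[-1]  # a candidate preference pair for M-DPO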

Arxiv · 4d

Tailored Architectures for Time Series Forecasting: Evaluating Deep Learning Models on Gaussian Process-Generated Data

  • Developments in Deep Learning have improved time series forecasting by modeling complex temporal dependencies.
  • Research aims to find connections between time series characteristics and model strengths.
  • A new synthetic dataset generated with Gaussian Processes provides controlled data characteristics for model evaluation.
  • The TimeFlex model is introduced, tailored to handle diverse temporal dynamics, and is evaluated against existing models.

Arxiv · 4d

Propositional Logic for Probing Generalization in Neural Networks

  • The study investigates the generalization behavior of neural architectures (Transformers, Graph Convolutional Networks, LSTMs) using propositional logic.
  • Models were tested on generating satisfying assignments for logical formulas, emphasizing structured and interpretable settings.
  • While all models performed well in-distribution, generalization to unseen operator combinations, especially negation, remained challenging.
  • Findings suggest persistent limitations in standard architectures' ability to learn systematic representations of logical operators, indicating a need for stronger inductive biases.
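The task itself can be illustrated with a brute-force reference solver — useful as the ground truth such models are compared against; `satisfying_assignments` is an illustrative helper, not from the paper:

```python
from itertools import product

def satisfying_assignments(variables, formula):
    """Enumerate truth assignments that satisfy `formula`, a predicate over a dict of bools."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            yield assignment

# (p OR q) AND NOT r -- negation is the operator the models generalize to worst
formula = lambda a: (a["p"] or a["q"]) and not a["r"]
solutions = list(satisfying_assignments(["p", "q", "r"], formula))
```

A trained model is then judged on whether its generated assignment satisfies formulas built from operator combinations held out during training.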

Arxiv · 4d

On Finetuning Tabular Foundation Models

  • Foundation models in tabular deep learning are a new area of focus.
  • Recent research highlighted TabPFNv2's superior performance over traditional methods on small datasets.
  • Full finetuning is found to be the most effective approach for adapting TabPFNv2, offering a favorable balance of time efficiency and downstream performance.
  • Finetuning alters TabPFNv2's internal mechanisms, making its sample-similarity estimates more accurate and its prediction logic more reliable.

Arxiv · 4d

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

  • Reinforcement Learning with Verifiable Rewards (RLVR) is effective for training large language models (LLMs) on complex reasoning tasks, such as mathematical problem solving.
  • The scarcity of human-labeled math problems and the limited verifiability of answers in existing datasets restrict the effectiveness of RL training.
  • A Self-aware Weakness-driven problem Synthesis framework (SwS) is introduced to identify model deficiencies and leverage them for problem augmentation.
  • SwS systematically identifies model weaknesses, extracts core concepts from failure cases, and synthesizes new problems to strengthen weak areas in subsequent training, resulting in average performance gains on mainstream reasoning benchmarks.
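The weakness-identification step can be sketched as a failure-rate-weighted sampling distribution over concepts — a toy illustration of the idea, with hypothetical names, rather than the paper's pipeline:

```python
from collections import Counter

def weakness_weights(eval_results):
    """Per-concept synthesis weights proportional to the model's failure rate."""
    totals, failures = Counter(), Counter()
    for concept, correct in eval_results:
        totals[concept] += 1
        if not correct:
            failures[concept] += 1
    rates = {c: failures[c] / totals[c] for c in totals}
    z = sum(rates.values()) or 1.0
    return {c: r / z for c, r in rates.items()}  # normalized sampling distribution

# (concept, solved-correctly) pairs from evaluating the model mid-training
results = [("algebra", True), ("algebra", False),
           ("geometry", False), ("geometry", False),
           ("counting", True)]
weights = weakness_weights(results)
```

Concepts where the model fails most often receive the largest share of newly synthesized problems in the next training round.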

Arxiv · 4d

Effective Data Pruning through Score Extrapolation

  • Data pruning techniques are essential for training advanced machine learning models on massive datasets.
  • Existing pruning techniques often require a full initial training pass, which defeats much of the purpose when only a single training run is planned.
  • A new importance score extrapolation framework has been introduced to predict sample importance with minimal data usage.
  • The framework demonstrates effectiveness across different datasets and training paradigms, offering scalability for expensive score calculation methods.
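The core idea — predict a sample's final importance score from a short prefix of its score trajectory — can be sketched with a simple linear extrapolation (the paper's extrapolation model is likely more sophisticated; all names here are illustrative):

```python
import numpy as np

def extrapolate_scores(partial_scores, horizon):
    """Fit a line to each sample's early score trajectory and predict the final score."""
    n_obs = partial_scores.shape[1]
    epochs = np.arange(n_obs)
    # polyfit over all samples at once: one (slope, intercept) pair per sample
    slope, intercept = np.polyfit(epochs, partial_scores.T, deg=1)
    return slope * horizon + intercept

# toy trajectories: sample importance observed for only the first 3 epochs
partial = np.array([[0.9, 0.8, 0.7],    # declining importance
                    [0.1, 0.3, 0.5]])   # rising importance
final = extrapolate_scores(partial, horizon=10)
keep = np.argsort(final)[::-1]  # prune lowest-predicted-importance samples first
```

Because only a prefix of training is needed to rank samples, expensive score calculations become affordable even for a single training run.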

Arxiv · 4d

e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs

  • Test-time scaling aims to enhance reasoning in Large Language Models (LLMs) by using more compute at inference time, enabling extrapolation for improved performance on challenging problems.
  • Existing reasoning models generally do not extrapolate well, but one way to enable extrapolation is by training the LLM to engage in in-context exploration.
  • In-context exploration trains the LLM to use its test-time compute appropriately, chaining operations and testing multiple hypotheses before providing an answer.
  • The proposed recipe e3 includes chaining skills, leveraging negative gradients, and coupling task difficulty with training token budget to enable in-context exploration, resulting in improved performance and extrapolation for Large Language Models.
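The in-context exploration loop can be caricatured as generate-and-verify under a difficulty-coupled budget — all names below are illustrative, not the paper's API:

```python
def explore(generate, verify, budget):
    """Chain generate-and-check attempts until the verifier passes or the budget runs out."""
    spent, attempt = 0, 0
    while spent < budget:
        hypothesis, cost = generate(attempt)
        spent += cost
        if verify(hypothesis):
            return hypothesis, spent
        attempt += 1
    return None, spent

def budget_for(difficulty, base=256):
    """Couple the token budget to task difficulty, as the e3 recipe advocates."""
    return base * (2 ** difficulty)

# toy task: guess the integer 7; each attempt "costs" 10 tokens
answer, used = explore(lambda k: (k, 10), lambda h: h == 7, budget_for(difficulty=0))
```

Harder tasks get a larger budget, so the model is trained to keep exploring exactly when exploration pays off — the behavior that later extrapolates to even larger test-time budgets.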

Arxiv · 4d

The Decoupled Risk Landscape in Performative Prediction

  • Performative Prediction deals with scenarios where deploying a model causes a shift in input data distribution.
  • Visualization of the loss landscape can complement theoretical advances with practical insights.
  • A new decoupled risk visualization method is introduced for understanding the risk landscape in terms of two separate parameter vectors: model parameters and data parameters.
  • Extended Performative Prediction is proposed to capture scenarios where the distribution reacts to a model different from the decision-making one.
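The decoupled view can be sketched on a toy distribution whose mean shifts with the deployed model, evaluating risk as a function of the two parameter vectors independently (a minimal sketch, not the paper's setup):

```python
import numpy as np

def decoupled_risk(theta_model, theta_data, n=10_000, seed=0):
    """Risk of a model on data induced by a (possibly different) deployed model.

    The data distribution D(theta_data) reacts to the deployed parameters; the
    decoupled view varies theta_model and theta_data independently.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=theta_data, scale=1.0, size=n)  # performative shift in the mean
    return np.mean((x - theta_model) ** 2)             # squared loss

# grid over both axes: rows vary the evaluated model, columns the data-inducing model
grid = np.array([[decoupled_risk(tm, td) for td in (0.0, 1.0)] for tm in (0.0, 1.0)])
```

The diagonal of such a grid recovers the usual performative risk (evaluating on the distribution you induce); off-diagonal entries are exactly the extended setting where the distribution reacts to a model other than the decision-making one.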

Arxiv · 4d

Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation

  • Agentic Neural Network (ANN) is a framework that conceptualizes multi-agent collaboration as a layered neural network architecture.
  • ANN follows a two-phase optimization strategy: the Forward Phase dynamically decomposes tasks into subtasks, while the Backward Phase refines collaboration through iterative feedback.
  • This neuro-symbolic approach enables ANN to create new or specialized agent teams post-training, leading to gains in accuracy and adaptability.
  • Across four benchmark datasets, ANN outperforms leading multi-agent baselines, demonstrating improved performance and scalability.

Arxiv · 4d

Understanding Task Vectors in In-Context Learning: Emergence, Functionality, and Limitations

  • Task vectors are vital for accelerating inference in in-context learning (ICL) by distilling task-specific information into a single, reusable representation.
  • The Linear Combination Conjecture proposes that a task vector acts as a single in-context demonstration formed as a linear combination of the original demonstrations, supported by theoretical and empirical evidence.
  • The emergence of task vectors is shown in linear transformers trained on triplet-formatted prompts through loss landscape analysis.
  • However, task vectors may fail to represent high-rank mappings, as evidenced on practical LLMs, motivating enhancement by injecting multiple task vectors into few-shot prompts.
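The conjecture can be stated concretely in a few lines of numpy — a toy sketch with made-up weights and embedding sizes, not the paper's estimation procedure:

```python
import numpy as np

def task_vector(demo_embeddings, weights):
    """Linear Combination Conjecture: the task vector is a weighted sum of
    the embeddings of the original in-context demonstrations."""
    weights = np.asarray(weights, dtype=float)
    return weights @ demo_embeddings  # shape [d]

rng = np.random.default_rng(0)
demos = rng.standard_normal((4, 16))           # four demonstration embeddings
tv = task_vector(demos, [0.4, 0.3, 0.2, 0.1])  # one reusable representation

# a single vector carries rank-1 information; injecting several (here, one per
# demonstration) is the proposed fix for high-rank task mappings
tvs = np.stack([task_vector(demos, w) for w in np.eye(4)])
```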

Arxiv · 4d

Inverse Design in Distributed Circuits Using Single-Step Reinforcement Learning

  • The goal of inverse design in distributed circuits is to generate near-optimal designs meeting transfer function specifications.
  • Existing methods for design exploration in distributed circuits use artificial grids, differentiable evaluation procedures, and specific template topologies.
  • This paper introduces DCIDA, a design exploration framework that learns a near-optimal design sampling policy for a target transfer function.
  • DCIDA utilizes a Transformer-based policy network to achieve significant reductions in design error compared to existing approaches.
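Single-step policy-gradient design search (one sampled design, one reward, per episode) can be caricatured with a Gaussian policy over a single design parameter — a generic REINFORCE toy, not DCIDA's Transformer policy or circuit evaluator:

```python
import numpy as np

def train_single_step(target, steps=2000, lr=0.05, sigma=0.3, seed=0):
    """One-step REINFORCE: sample a design, score it once, update the policy mean."""
    rng = np.random.default_rng(seed)
    mu, baseline = 0.0, 0.0
    for _ in range(steps):
        design = rng.normal(mu, sigma)            # sample a candidate design
        reward = -(design - target) ** 2          # negative design error
        baseline = 0.9 * baseline + 0.1 * reward  # running baseline cuts variance
        grad = (reward - baseline) * (design - mu) / sigma**2  # score-function gradient
        mu += lr * grad
    return mu

mu = train_single_step(target=1.5)  # policy mean drifts toward the optimal design
```

In DCIDA the single action is a full vector of coupled design parameters and the reward comes from evaluating the realized transfer function against the specification.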

Arxiv · 4d

Feasibility Study of CNNs and MLPs for Radiation Heat Transfer in 2-D Furnaces with Spectrally Participative Gases

  • A CNN and MLP are introduced to build a surrogate model for radiative heat transfer in 2-D furnaces with spectrally participative gases.
  • The CNN architecture is adapted to the problem's inputs, yielding a significant speedup over the classical solver while maintaining accuracy.
  • The performance of the CNN is compared to the MLP in terms of speed, accuracy, and robustness to hyper-parameter changes.
  • Results show the CNN outperforms the MLP in precision and stability, and a dataset-size analysis provides a deeper understanding of model behavior.

Arxiv · 4d

Neural-Augmented Kelvinlet: Real-Time Soft Tissue Deformation with Multiple Graspers

  • A novel physics-informed neural simulator is introduced for fast and accurate simulation of soft tissue deformations, crucial for surgical robotics and medical training.
  • The framework integrates Kelvinlet-based priors into neural simulators, combining Kelvinlets for residual learning and regularization in data-driven soft tissue modeling.
  • Incorporating large-scale Finite Element Method (FEM) simulations of linear and nonlinear soft tissue responses, the method enhances neural network predictions, improving accuracy and physical consistency at the low latency required for real-time performance.
  • The approach is effective on surgical maneuvers that mimic standard laparoscopic tissue-grasping tools, highlighting Kelvinlet-augmented learning as a powerful strategy for physics-aware soft tissue simulation in surgical applications.
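For context, a regularized Kelvinlet is a closed-form displacement field for a point force in an elastic medium; the sketch below follows the commonly cited regularized form, with illustrative parameters and a naive multi-grasper superposition (the paper's trained neural component is not reproduced here):

```python
import numpy as np

def kelvinlet(r, f, eps=0.15, mu=5.0, nu=0.45):
    """Regularized Kelvinlet displacement at offset r from a point force f,
    for shear modulus mu and Poisson ratio nu (assumed closed form)."""
    a = 1.0 / (4.0 * np.pi * mu)
    b = a / (4.0 * (1.0 - nu))
    r = np.asarray(r, dtype=float)
    f = np.asarray(f, dtype=float)
    r_eps = np.sqrt(r @ r + eps**2)  # regularized distance; finite at r = 0
    return (((a - b) / r_eps) * f
            + (b / r_eps**3) * (r @ f) * r
            + (a * eps**2) / (2.0 * r_eps**3) * f)

def tissue_displacement(point, graspers):
    """Superpose the analytic fields of several graspers (point forces)."""
    return sum(kelvinlet(point - p, f) for p, f in graspers)

graspers = [(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])),
            (np.array([0.5, 0.0, 0.0]), np.array([0.0, 0.0, -1.0]))]
u = tissue_displacement(np.array([0.25, 0.0, 0.1]), graspers)
```

Such analytic fields are what the framework uses as priors and residual targets, so the neural network only has to learn corrections toward the FEM ground truth.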

Arxiv · 4d

Physics-Informed Teleconnection-Aware Transformer for Global Subseasonal-to-Seasonal Forecasting

  • TelePiT is a new deep learning architecture designed for improved global Subseasonal-to-Seasonal (S2S) forecasting.
  • It integrates multi-scale physics and teleconnection awareness to tackle the challenges of modeling complex atmospheric systems and interactions across different scales.
  • TelePiT comprises Spherical Harmonic Embedding, Multi-Scale Physics-Informed Neural ODE, and Teleconnection-Aware Transformer as its key components.
  • Extensive experiments show that TelePiT outperforms existing data-driven models and operational weather prediction systems, achieving a significant 57.7% reduction in RMSE for 2-meter temperature forecasts.

Arxiv · 4d

CaliciBoost: Performance-Driven Evaluation of Molecular Representations for Caco-2 Permeability Prediction

  • Caco-2 permeability prediction is crucial for drug absorption in early-stage drug discovery.
  • Study analyzed the impact of different molecular feature representation types on Caco-2 permeability prediction.
  • PaDEL, Mordred, and RDKit descriptors were found to be effective for Caco-2 prediction.
  • CaliciBoost model, based on AutoML, achieved the best Mean Absolute Error (MAE) performance.
