Machine Learning (ML) Latest News and Trending articles from all top sources only on Techminis

A naukri.com initiative

ML News

Arxiv

Image Credit: Arxiv

LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits

LASeR (Learning to Adaptively Select Rewards) addresses the challenge of utilizing multiple reward models efficiently when training large language models (LLMs).
It frames reward model selection as a multi-armed bandit problem to iteratively train LLMs using the most suitable reward models for each instance.
LASeR improved LLM training on commonsense, math reasoning, and open-ended instruction-following tasks, showing enhanced accuracy and speed compared to using an ensemble of reward models.
The reported gains include higher average accuracy, more efficient LLM training, and better performance on long-context generation tasks.
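
As a rough illustration of the bandit framing, the sketch below uses a plain UCB1 rule to pick one of several reward models per training round. UCB1, the `UCB1Selector` class, and the simulated payoffs in `simulate` are assumptions made for illustration; the summary does not say which bandit algorithm or payoff signal LASeR uses.

```python
import math
import random


class UCB1Selector:
    """Pick one arm (here: a reward model index) per round with UCB1."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # times each arm was chosen
        self.means = [0.0] * n_arms  # running mean payoff per arm

    def select(self):
        # Try every arm once before relying on the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, payoff):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.means[arm] += (payoff - self.means[arm]) / self.counts[arm]


def simulate(rounds=2000, seed=0):
    """Hypothetical setup: three reward models with made-up usefulness."""
    rng = random.Random(seed)
    true_quality = [0.2, 0.5, 0.8]  # assumed, not taken from the paper
    selector = UCB1Selector(len(true_quality))
    for _ in range(rounds):
        arm = selector.select()
        # In real training the payoff would come from how much the chosen
        # reward model helped the LLM; here it is a simulated coin flip.
        payoff = 1.0 if rng.random() < true_quality[arm] else 0.0
        selector.update(arm, payoff)
    return selector.counts
```

Calling `simulate()` returns how often each simulated reward model was chosen; the selector should concentrate its choices on the arm with the highest assumed quality while still sampling the others occasionally.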

Arxiv

Quadratic Gating Mixture of Experts: Statistical Insights into Self-Attention

A new research paper explores the connection between Mixture of Experts (MoE) models and the self-attention mechanism, revealing that each row of a self-attention matrix can be expressed as a quadratic gating mixture of linear experts.
The study conducts a thorough convergence analysis of MoE models using different quadratic gating functions, suggesting that the quadratic monomial gate enhances sample efficiency for parameter estimation compared to the quadratic polynomial gate.
The analysis shows that employing non-linear experts instead of linear ones leads to faster parameter and expert estimation rates. The research proposes an 'active-attention' mechanism by applying a non-linear activation function to the value matrix in the self-attention formula.
Through extensive experiments in tasks such as image classification, language modeling, and time series forecasting, the proposed active-attention mechanism is shown to outperform standard self-attention.
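
The idea of applying a non-linear activation to the value matrix can be sketched in a few lines. The version below is a minimal pure-Python illustration, assuming ReLU as the activation and the usual scaled dot-product form; the summary does not name the activation the authors use, and the function names here are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x):
    return x if x > 0.0 else 0.0


def attention(queries, keys, values, value_activation=None):
    """Scaled dot-product attention on lists of vectors.

    If value_activation is given, it is applied elementwise to the values
    before the weighted sum (the assumed 'active-attention' variant).
    """
    d = len(keys[0])
    if value_activation is not None:
        values = [[value_activation(x) for x in row] for row in values]
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)  # one row of the attention matrix
        out = [
            sum(w * row[j] for w, row in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

With `value_activation=None` this reduces to standard self-attention, so the two variants can be compared on the same inputs.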

Arxiv

CHAI for LLMs: Improving Code-Mixed Translation in Large Language Models through Reinforcement Learning with AI Feedback

Large Language Models (LLMs) struggle with code-mixed language understanding despite their success in various NLP tasks.
CHAI is introduced as a framework to enhance multilingual LLMs' capabilities in handling code-mixed languages.
The framework uses LLMs as annotators, generates preference data, and applies reinforcement learning from AI feedback (RLAIF) to improve the models.
Experimental evaluation demonstrates that CHAI-powered LLMs outperform existing models by 25.66% in code-mixed translation tasks, paving the way for more inclusive code-mixed LLMs.

Arxiv

EFKAN: A KAN-Integrated Neural Operator For Efficient Magnetotelluric Forward Modeling

Magnetotelluric (MT) forward modeling, which solves the governing partial differential equations, is important for the accuracy and efficiency of MT inversion.
Neural operators (NOs) enable fast MT forward modeling but often rely on multi-layer perceptrons (MLPs), whose drawbacks, such as poor interpretability and a tendency to overfit, may reduce accuracy.
A new neural operator called EFKAN combines the Fourier neural operator (FNO) with a Kolmogorov-Arnold network (KAN) to improve MT forward modeling accuracy and explore alternatives to MLPs.
Experimental results show that EFKAN achieves higher accuracy in obtaining resistivity and phase compared to NOs with MLPs and is faster than traditional numerical methods.

Arxiv

Multi-Attribute Steering of Language Models via Targeted Intervention

Inference-time intervention (ITI) is a method for steering the behavior of large language models (LLMs) without updating model parameters.
Multi-Attribute Targeted Steering (MAT-Steer) is introduced to handle conflicts in multi-attribute settings by selectively intervening at the token level.
MAT-Steer uses alignment objectives to shift model representations to reduce conflicts between attributes like helpfulness and toxicity.
MAT-Steer outperforms existing ITI and fine-tuning approaches across question answering and generative tasks.

Arxiv

Transfer Learning for Transient Classification: From Simulations to Real Data and ZTF to LSST

Machine learning is crucial for classifying astronomical transients, but existing approaches have limitations when applied to real data and different surveys.
Transfer learning shows promise in overcoming these challenges by using existing models trained on simulations or data from other surveys.
Starting from a model trained on simulated Zwicky Transient Facility (ZTF) data, transfer learning reduces the labeled data needed to classify real ZTF transients by 95% while maintaining performance.
Transfer learning also adapts ZTF models to LSST simulations, reaching 94% of baseline performance with only 30% of the training data, which is promising for reliable automated classification during early LSST operations.

Arxiv

ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment

Researchers have introduced a new goal specification method called cross-view goal alignment to guide agent interactions in 3D environments.
The method allows users to specify target objects using segmentation masks from their camera views, enhancing spatial reasoning abilities of the agent.
ROCKET-2, a state-of-the-art agent trained in Minecraft, demonstrates improved efficiency and zero-shot generalization capabilities to other 3D environments like Doom, DMLab, and Unreal.
The development of ROCKET-2 includes auxiliary objectives like cross-view consistency loss and target visibility loss to align the agent's behavior with human intent when there are significant differences in camera views.

Arxiv

AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model

Researchers have introduced AHCPTQ, a Post-Training Quantization (PTQ) method to address challenges in the Segment Anything Model (SAM) for efficient deployment.
AHCPTQ employs Hybrid Log-Uniform Quantization (HLUQ) for managing post-GELU activations and Channel-Aware Grouping (CAG) to address inter-channel variation in SAM.
Combining HLUQ and CAG improves quantization accuracy while keeping the scheme compatible with efficient hardware execution.
AHCPTQ achieves 36.6% mAP on instance segmentation with the DINO detector, along with speedup and energy-efficiency gains over its floating-point counterpart in an FPGA implementation.

Arxiv

UniF$^2$ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models

UniF$^2$ace is a unified multimodal model tailored for fine-grained face understanding and generation, addressing the limitations of existing research in the face domain.
The model is trained on a specialized dataset, UniF$^2$ace-130K, containing image-text pairs and question-answering pairs to cover a wide range of facial attributes.
UniF$^2$ace incorporates diffusion techniques and a mixture-of-experts architecture to optimize both understanding and generation capabilities, surpassing existing UMMs and generative models.
Extensive experiments on UniF$^2$ace-130K demonstrate the model's superior performance in handling fine-grained facial attributes for both understanding and generation tasks.

Arxiv

Terrier: A Deep Learning Repeat Classifier

Terrier is a deep learning model that classifies repetitive DNA sequences; it is trained on a curated repeat sequence library under the RepeatMasker schema.
The model overcomes challenges in accurate classification of repetitive DNA sequences by leveraging deep learning, providing improved accuracy compared to current methods.
Terrier, trained on Repbase with over 100,000 repeat families, maps 97.1% of Repbase sequences to RepeatMasker categories, offering a comprehensive classification system.
Benchmarked against other models, Terrier demonstrated superior accuracy in model organisms and non-model species, facilitating research on repeat-driven evolution and genomic instability.

Arxiv

Representative Ranking for Deliberation in the Public Sphere

Online platforms are increasingly using algorithmic ranking to improve the quality of public deliberations in comment sections, but this may reduce the visibility of diverse viewpoints.
To address this issue, the authors propose enforcing justified representation (JR), a criterion that builds guarantees of representation into ranking methods.
Enforcing JR in comment ranking leads to greater inclusion of diverse viewpoints without compromising user engagement or conversational quality measures.
The study suggests that by utilizing justified representation, platforms can enhance the representation of a variety of viewpoints in online discussions.
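
For readers unfamiliar with the criterion, the sketch below checks the standard approval-voting definition of justified representation: with n voters and k selected items, no group of at least n/k voters who all approve a common item may be left with none of their approved items selected. Mapping voters to users and items to comments, and the function name `satisfies_jr`, are assumptions for illustration; the summary does not describe how the paper enforces JR in a ranking.

```python
def satisfies_jr(approvals, selected, k):
    """Return True if `selected` satisfies justified representation.

    approvals: list of sets; approvals[i] holds the items voter i approves.
    selected:  set of chosen items (at most k of them).
    k:         target number of chosen items.
    """
    n = len(approvals)
    selected = set(selected)
    # Voters who approve none of the selected items.
    unrepresented = [a for a in approvals if not (a & selected)]
    # For each item, count the unrepresented voters who approve it.
    support = {}
    for approved in unrepresented:
        for item in approved:
            support[item] = support.get(item, 0) + 1
    # JR fails if at least n/k unrepresented voters share an approved item.
    return all(count * k < n for count in support.values())
```

For example, with four users where two approve only comment "a" and two approve only comment "b", showing both comments for k = 2 satisfies JR, while showing only "a" does not.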

Arxiv

Adaptive Elicitation of Latent Information Using Natural Language

The paper discusses the importance of eliciting information to reduce uncertainty about latent entities in various domains using natural language.
Current large language models and fine-tuning algorithms lack mechanisms for strategically gathering information to refine their understanding of latent entities.
A proposed adaptive elicitation framework utilizes a meta-learned language model to actively reduce uncertainty by simulating future observations and quantifying uncertainty in natural language.
Experiments on games and assessments show that the framework outperforms baselines in identifying critical unknowns and improving predictions, showcasing the potential of strategic information gathering in natural language contexts.

Arxiv

Hysteresis-Aware Neural Network Modeling and Whole-Body Reinforcement Learning Control of Soft Robots

Soft robots, designed for surgical applications, pose challenges in modeling and control due to their nonlinear and hysteretic behavior.
Researchers have developed a hysteresis-aware neural network model for accurate prediction of the soft robot's whole-body motion, including its hysteretic behavior.
An on-policy reinforcement learning algorithm is used to train whole-body motion control strategies for the soft robotic system.
The study demonstrates significant MSE reduction and high precision in trajectory tracking for the soft robot, showing promising results for real-world clinical applications.

Arxiv

A Unifying Framework for Robust and Efficient Inference with Unstructured Data

This paper introduces a framework for conducting efficient inference on parameters derived from unstructured data like text, images, audio, and video.
The framework addresses the challenges of bias in predictions made by neural networks and the downstream estimators that rely on structured data extracted from unstructured inputs.
By reframing inference with unstructured data as a missing structured data problem, the framework applies classic results from semiparametric inference to create valid, efficient, and robust estimators.
The framework, known as MAR-S, provides economists with tools to construct unbiased estimators using unstructured data and is demonstrated through the re-analysis of influential studies.

Arxiv

From Gradient Clipping to Normalization for Heavy Tailed SGD

Recent empirical evidence shows that heavy-tailed gradient noise in machine learning challenges standard assumptions of bounded variance in stochastic optimization.
Gradient clipping is commonly used to address heavy-tailed noise, but the current theoretical understanding has limitations, such as reliance on large clipping thresholds and sub-optimal sample complexity.
The paper studies Normalized SGD (NSGD) as an alternative, establishing a parameter-free sample complexity and improved convergence rates when problem parameters are known.
The study on NSGD offers improved sample complexities, matching lower bounds for first-order methods, and ensures high-probability convergence with a mild dependence on failure probability.
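
The difference between the two update rules is easy to see side by side. The sketch below is a minimal illustration, assuming plain single-sample updates without momentum or mini-batching; the function names, the `eps` guard, and the example values are hypothetical and not taken from the paper.

```python
import math


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def clipped_step(params, grad, lr, threshold):
    """SGD step with the gradient rescaled so its norm is at most threshold."""
    g = norm(grad)
    scale = 1.0 if g <= threshold else threshold / g
    return [p - lr * scale * x for p, x in zip(params, grad)]


def normalized_step(params, grad, lr, eps=1e-12):
    """SGD step along grad / ||grad||; eps guards against a zero gradient."""
    g = norm(grad)
    return [p - lr * x / (g + eps) for p, x in zip(params, grad)]
```

Clipping needs a threshold and leaves small gradients untouched, whereas normalization always moves a distance of about `lr`, so an extreme gradient sample cannot produce an extreme step and no threshold has to be tuned.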
