techminis
A naukri.com initiative

ML News

Source: Arxiv

Bandit Social Learning: Exploration under Myopic Behavior

  • The study focuses on social learning dynamics influenced by online reviews.
  • Agents follow a myopic behavior without exploration in a multi-armed bandit protocol.
  • The study derives learning-failure results for myopic behavior and provides matching positive results for agents that do explore.
  • The results emphasize the importance of intentional exploration in bandit algorithms.
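The contrast between myopic and exploring agents can be sketched in a toy simulation (illustrative only: the arm means, horizon, optimistic initialization, and epsilon-greedy rule below are stand-ins, not the paper's protocol):

```python
import random

def best_arm_fraction(means, horizon=2000, epsilon=0.0, seed=0):
    """Fraction of pulls spent on the best arm of a Bernoulli bandit.

    epsilon=0.0 mimics the myopic agents studied in the paper: always
    pull the arm with the best empirical mean, never explore.
    epsilon>0.0 adds intentional exploration.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts, sums = [1] * n_arms, [1.0] * n_arms  # one optimistic pseudo-pull each
    best = max(range(n_arms), key=lambda i: means[i])
    best_pulls = 0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)          # explore uniformly
        else:
            arm = max(range(n_arms), key=lambda i: sums[i] / counts[i])
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        best_pulls += arm == best
    return best_pulls / horizon

# Averaged over many runs: purely greedy agents can lock onto the inferior
# arm after a few unlucky pulls, while light exploration recovers.
means = [0.6, 0.9]
greedy = sum(best_arm_fraction(means, epsilon=0.0, seed=s) for s in range(100)) / 100
explore = sum(best_arm_fraction(means, epsilon=0.1, seed=s) for s in range(100)) / 100
```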


Source: Arxiv

Universal Scaling Laws of Absorbing Phase Transitions in Artificial Deep Neural Networks

  • Conventional artificial deep neural networks operating near the phase boundary of signal propagation dynamics exhibit universal scaling laws in non-equilibrium statistical mechanics.
  • Multilayer perceptrons and convolutional neural networks belong to the mean-field and directed percolation universality classes, respectively.
  • Finite-size scaling suggests a potential connection to the depth-width trade-off in deep learning.
  • Hyperparameter tuning to the phase boundary is necessary but insufficient for achieving optimal generalization in deep networks, indicating the importance of nonuniversal metric factors.


Source: Arxiv

Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge

  • Large language models (LLMs) show potential for automating the extraction of molecular interactions in biological systems.
  • The study evaluates the effectiveness of various LLMs in recognizing protein interactions, identifying genes related to radiation-affected pathways, and delineating gene regulatory relationships.
  • Larger models demonstrate superior performance, particularly in extracting complex interactions among genes and proteins.
  • LLMs face challenges in identifying groups with diverse functions and recognizing highly correlated gene regulatory relationships.


Source: Arxiv

Physics-tailored machine learning reveals unexpected physics in dusty plasmas

  • Dusty plasma is a mixture of ions, electrons, and macroscopic charged particles commonly found in space and planetary environments.
  • Machine learning models are used to learn the complex forces governing the interactions between particles in a dusty plasma.
  • A physics-tailored machine learning approach is demonstrated, accounting for symmetries, non-identical particles, and non-reciprocal forces.
  • The model accurately infers particle masses, reveals deviations from theoretical assumptions, and enables the discovery of new physics.


Source: Arxiv

Toward a Theory of Tokenization in LLMs

  • Tokenization is considered a necessary initial step for designing performant language models.
  • Transformers trained on certain data processes without tokenization fail to learn the right distribution and predict characters according to a unigram model.
  • With tokenization, transformers are able to break through this barrier and model the probabilities of sequences drawn from the source near-optimally.
  • The use of tokenization in language modeling is justified through the study of transformers on Markovian data.
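The unigram barrier has a simple information-theoretic illustration. For a hypothetical two-symbol order-1 Markov source (the switching probabilities below are chosen for convenience, not taken from the paper), a unigram model over characters can only achieve the stationary entropy, while a unigram model over 2-character tokens moves toward the source's true entropy rate:

```python
import math

def entropy(ps):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

p01, p10 = 0.1, 0.1          # switching probabilities of the Markov source
pi = (0.5, 0.5)              # stationary distribution, by symmetry

# Entropy rate: the bits/character an optimal sequence model achieves.
rate = pi[0] * entropy((1 - p01, p01)) + pi[1] * entropy((p10, 1 - p10))

# A unigram model over raw characters ignores the Markov structure and
# can do no better than the stationary entropy.
unigram = entropy(pi)

# A unigram model over 2-character tokens captures within-token structure.
pair = {}
for a in (0, 1):
    trans = (1 - p01, p01) if a == 0 else (p10, 1 - p10)
    for b in (0, 1):
        pair[(a, b)] = pi[a] * trans[b]
pair_unigram = entropy(pair.values()) / 2   # bits per character
```

Here `rate` is about 0.47 bits/character, the character unigram pays the full 1.0, and the 2-character tokenizer lands in between; longer tokens push a unigram model arbitrarily close to the entropy rate.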


Source: Arxiv

Potential Field Based Deep Metric Learning

  • A new approach to Deep Metric Learning (DML) is proposed.
  • The approach represents the influence of each example by a continuous potential field.
  • Attractive/repulsive potential fields are used to model interactions among embeddings.
  • The proposed method outperforms state-of-the-art baselines on standard DML benchmarks.
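A minimal sketch of the attractive/repulsive idea (an illustration only; the paper's actual potential functions and training objective are not reproduced here):

```python
import numpy as np

def potential_field_loss(emb, labels, decay=1.0):
    """Toy potential-field objective over an embedding batch.

    Each example exerts a field on every other embedding: attractive
    toward same-class points, repulsive from different-class points,
    with influence decaying continuously with distance.
    """
    n = len(emb)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = np.linalg.norm(emb[i] - emb[j])
            w = np.exp(-decay * d)      # influence decays with distance
            if labels[i] == labels[j]:
                total += d * w          # attractive: penalize distance
            else:
                total += w              # repulsive: penalize proximity
    return total / (n * (n - 1))

# Well-separated classes should sit in a lower-energy configuration
# than intermingled ones.
labels = [0, 0, 1, 1]
clustered = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
mixed = np.array([[0.0, 0.0], [5.0, 5.0], [0.1, 0.0], [5.1, 5.0]])
```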


Source: Arxiv

ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke

  • Intent inferral on a hand orthosis for stroke patients is challenging due to the difficulty of data collection; traditional approaches require a large labeled dataset from the new condition, session, or subject to train intent classifiers.
  • The authors propose ChatEMG, an autoregressive generative model that generates synthetic EMG signals conditioned on prompts, allowing a small collected dataset to be expanded with synthetic samples.
  • Experimental results show that these synthetic samples improve intent inferral accuracy for different types of classifiers.
  • The authors demonstrate that their complete approach can be integrated into a single patient session, including the use of the classifier for functional orthosis-assisted tasks.
  • This is the first time an intent classifier trained partially on synthetic data has been deployed for functional control of an orthosis by a stroke survivor.
  • Videos, source code, and additional information can be found at https://jxu.ai/chatemg.
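Prompt-conditioned autoregressive generation, the core idea named above, reduces to a simple loop (a generic sketch; ChatEMG's actual architecture and signal model are in the paper, and the stand-in "model" below is invented for illustration):

```python
import random

def generate(prompt, model, length=8, seed=0):
    """Autoregressive generation conditioned on a prompt: each new sample
    is drawn given the running context, which starts as the prompt."""
    rng = random.Random(seed)
    context = list(prompt)
    for _ in range(length):
        context.append(model(context, rng))
    return context[len(prompt):]          # only the synthetic continuation

# Stand-in "model": next sample is the last sample plus bounded noise,
# so synthetic continuations inherit the prompt's signal level.
model = lambda ctx, rng: ctx[-1] + rng.uniform(-0.1, 0.1)
synthetic = generate([0.5, 0.55, 0.6], model)
```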


Source: Arxiv

Compressing Search with Language Models

  • Millions of people turn to Google Search each day for information on various topics.
  • A new approach, SLaM Compression, has been introduced to reduce the dimensionality of search data.
  • SLaM Compression quantifies search terms using pre-trained language models to create a memory-efficient summary.
  • CoSMo, a Constrained Search Model, is presented to estimate real-world events using only search data.
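The summary does not spell out how SLaM Compression works internally. As a rough illustration of turning an embedding into a "memory-efficient summary", here is a generic scalar-quantization sketch over a stand-in vector (the actual method, model, and representation are the paper's):

```python
import numpy as np

def quantize(vec, bits=8):
    """Uniform scalar quantization of a float32 vector to integer codes."""
    lo, hi = float(vec.min()), float(vec.max())
    levels = 2 ** bits - 1
    codes = np.round((vec - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo, hi

def dequantize(codes, lo, hi, bits=8):
    levels = 2 ** bits - 1
    return codes.astype(np.float32) / levels * (hi - lo) + lo

rng = np.random.default_rng(0)
emb = rng.standard_normal(384).astype(np.float32)  # stand-in for an LM embedding
codes, lo, hi = quantize(emb)
recovered = dequantize(codes, lo, hi)
# uint8 codes take 4x less memory than float32, with bounded round-off error.
```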


Source: Arxiv

SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning

  • This paper introduces SigmaRL, an open-source, decentralized framework designed to enhance sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles.
  • SigmaRL aims to address the limited generalization capacity of RL agents by proposing five strategies to design information-dense observations, focusing on general features applicable to most traffic scenarios.
  • The RL agents trained using SigmaRL's observation design strategies achieved training times of under one hour on a single CPU.
  • Evaluation results demonstrate that these RL agents can effectively zero-shot generalize, even in completely unseen traffic scenarios.


Source: Arxiv

RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner

  • The theoretical analysis of reinforcement learning frameworks for self-taught reasoner (STaR) in large language models (LLMs) is presented.
  • STaR framework uses reinforcement learning to generate reasoning steps and reduce the dependence on human-labeled data.
  • The analysis provides a theoretical understanding of the effectiveness of reinforcement learning on chain-of-thought (CoT) reasoning and STaR.
  • The framework explores criteria for pre-trained models, policy improvement, convergence, and the robustness of STaR in improving reasoning in LLMs.


Source: Arxiv

UQ of 2D Slab Burner DNS: Surrogates, Uncertainty Propagation, and Parameter Calibration

  • This paper focuses on performing a complete uncertainty quantification analysis of a 2D slab burner direct numerical simulation (DNS).
  • The study addresses challenges related to developing data-driven surrogate models, propagating parametric uncertainties, and Bayesian calibration of latent heat and chemical reaction parameters.
  • Two surrogate models, Gaussian Process (GP) and Hierarchical Multiscale Surrogate (HMS), were constructed using ensemble simulations generated via Latin Hypercube sampling.
  • The study emphasizes the importance of surrogate model selection and parameter calibration in quantifying uncertainty in predictions of fuel regression rates in complex combustion systems.
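The two building blocks named above, a Latin-Hypercube design and a GP surrogate, can be sketched in a few lines (a minimal 1-D illustration with a stand-in function; the paper's ensemble DNS runs and its GP/HMS surrogates are far richer):

```python
import numpy as np

rng = np.random.default_rng(0)

def latin_hypercube(n, d):
    """n points in [0,1]^d with exactly one point in each of n strata per dimension."""
    out = np.empty((n, d))
    for k in range(d):
        out[:, k] = (rng.permutation(n) + rng.random(n)) / n
    return out

def gp_predict(x_train, y_train, x_test, length=0.2, jitter=1e-6):
    """Posterior-mean prediction of a minimal RBF-kernel Gaussian process."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = k(x_train, x_train) + jitter * np.eye(len(x_train))
    return k(x_test, x_train) @ np.linalg.solve(K, y_train)

# Train a surrogate for a cheap stand-in "simulation" f on an LHS design.
f = lambda x: np.sin(2 * np.pi * x)
x_train = latin_hypercube(20, 1)[:, 0]
x_test = np.linspace(0.0, 1.0, 50)
y_pred = gp_predict(x_train, f(x_train), x_test)
```

Once such a surrogate is cheap to evaluate, parametric uncertainties can be propagated by sampling it instead of re-running the expensive simulation.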


Source: Arxiv

Marconi: Prefix Caching for the Era of Hybrid LLMs

  • Hybrid models that combine the language modeling capabilities of Attention layers with the efficiency of Recurrent layers have gained traction for supporting long contexts in Large Language Model serving.
  • Marconi is a system that supports efficient prefix caching with Hybrid LLMs.
  • Marconi uses novel admission and eviction policies that assess potential cache entries based on recency, reuse likelihood, and compute savings.
  • Marconi achieves significantly higher token hit rates compared to state-of-the-art prefix caching systems.
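Score-based admission and eviction can be sketched with a toy cache (loosely in the spirit of Marconi's recency-plus-savings scoring; the actual policies, FLOP accounting, and handling of recurrent-layer states are the paper's):

```python
class PrefixCache:
    """Toy prefix cache that evicts the entry with the lowest combined
    recency and compute-savings score."""

    def __init__(self, capacity, alpha=0.5):
        self.capacity = capacity   # max cached prefixes
        self.alpha = alpha         # weight on recency vs. savings
        self.entries = {}          # prefix -> (last_used, hits)
        self.clock = 0             # logical time

    def lookup(self, prefix):
        self.clock += 1
        if prefix in self.entries:
            _, hits = self.entries[prefix]
            self.entries[prefix] = (self.clock, hits + 1)
            return True            # cache hit: prefix compute is saved
        if len(self.entries) >= self.capacity:
            self._evict()
        self.entries[prefix] = (self.clock, 0)
        return False               # cache miss

    def _evict(self):
        def score(p):
            last, hits = self.entries[p]
            savings = hits * len(p)        # proxy: tokens saved via reuse
            recency = last / self.clock
            return self.alpha * recency + (1 - self.alpha) * savings
        self.entries.pop(min(self.entries, key=score))
```

With this scoring, a long, frequently reused prefix survives eviction even when a never-reused entry was touched more recently.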


Source: Arxiv

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition

  • Micro-Action Recognition (MAR) has gained attention for its role in non-verbal communication and emotion analysis.
  • A novel approach called Prototypical Calibrating Ambiguous Network (PCAN) is proposed to address the ambiguity in MAR.
  • PCAN employs a hierarchical action-tree to identify and categorize ambiguous samples into distinct sets.
  • Extensive experiments demonstrate the superior performance of PCAN compared to existing approaches.


Source: Arxiv

Tensor Product Attention Is All You Need

  • The paper introduces Tensor Product Attention (TPA), a novel attention mechanism that uses tensor decompositions to represent queries, keys, and values compactly.
  • TPA significantly reduces the memory overhead during inference by shrinking the size of the key-value (KV) cache.
  • Based on TPA, the Tensor ProducT ATTenTion Transformer (T6) is introduced as a new model architecture for sequence modeling.
  • T6 outperforms standard Transformer baselines in language modeling tasks, achieving improved model quality and memory efficiency.
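The memory arithmetic behind the compact KV cache can be sketched directly: represent a per-token key as a low-rank sum of tensor (outer) products instead of a full heads-by-head-dim matrix. (In TPA the factors are produced from the hidden state; the random stand-ins and the sizes below are illustrative.)

```python
import numpy as np

heads, d_head, rank = 8, 64, 2

rng = np.random.default_rng(0)
a = rng.standard_normal((rank, heads))    # head-axis factors for one token
b = rng.standard_normal((rank, d_head))   # feature-axis factors for one token

# Reconstruct the full per-token key as a sum of rank-1 outer products.
key = np.einsum('rh,rd->hd', a, b)

full_floats = heads * d_head              # floats/token for a full key
factored_floats = rank * (heads + d_head) # floats/token for the factors
ratio = full_floats / factored_floats     # ~3.6x smaller KV cache here
```

Caching only `a` and `b` per token (and likewise for values) is what shrinks the KV cache; the attainable ratio depends on the rank and head configuration.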


Source: Arxiv

Real-time Verification and Refinement of Language Model Text Generation

  • Large language models (LLMs) sometimes generate factually incorrect answers, posing a critical challenge.
  • The proposed Streaming-VR approach allows real-time verification and refinement of LLM outputs.
  • Streaming-VR checks and corrects tokens as they are being generated, improving the factual accuracy of the final output.
  • Comprehensive evaluations show that Streaming-VR is an efficient solution compared to prior methods.
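Verifying the stream as it is produced, rather than only the finished output, reduces to a simple loop (a generic sketch; the toy `verify` and `correct` callables below stand in for the paper's verifier and refinement models, and the example "fact" is invented):

```python
def stream_with_verification(tokens, verify, correct):
    """Check each token as it arrives and patch it immediately,
    instead of verifying only after generation completes."""
    out = []
    for tok in tokens:
        if not verify(out, tok):
            tok = correct(out, tok)   # refine the flagged token in place
        out.append(tok)
    return out

# Toy usage: the "verifier" flags a hypothetical wrong fact mid-stream.
facts = {"capital_of_france": "Paris"}
stream = ["The", "capital", "of", "France", "is", "Lyon", "."]
fixed = stream_with_verification(
    stream,
    verify=lambda ctx, t: not (ctx[-1:] == ["is"] and t == "Lyon"),
    correct=lambda ctx, t: facts["capital_of_france"],
)
```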

