ML News

Source: Arxiv

Influential Bandits: Pulling an Arm May Change the Environment

  • A new paper on arXiv proposes the influential bandit problem, a multi-armed bandit formulation in which pulling an arm can affect the losses of other arms, making the environment interdependent and non-stationary.
  • Arm interactions are modeled through an unknown interaction matrix that governs the dynamics of the arm losses.
  • The paper establishes regret lower bounds for standard bandit algorithms and introduces a new algorithm based on a lower confidence bound (LCB) estimator; a generic LCB-style selection rule is sketched below.
  • Empirical evaluations demonstrate the presence of inter-arm influence and confirm the superior performance of the proposed method compared to conventional bandit algorithms.
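
For orientation, here is a minimal sketch of an LCB-style selection rule for loss minimization on a toy stationary environment. It is not the paper's algorithm: the influential-bandit method additionally models the unknown interaction matrix that couples the arm losses, which is omitted here.

```python
import numpy as np

# Minimal, generic LCB-style bandit loop for loss minimization.
# The Bernoulli losses below are a stand-in environment, not the paper's setup.
rng = np.random.default_rng(0)
n_arms, horizon = 5, 2000
true_means = rng.uniform(0.2, 0.8, size=n_arms)   # hypothetical mean losses per arm

counts = np.zeros(n_arms)
loss_sums = np.zeros(n_arms)

for t in range(1, horizon + 1):
    if t <= n_arms:                                # pull each arm once to initialize
        arm = t - 1
    else:
        means = loss_sums / counts
        bonus = np.sqrt(2.0 * np.log(t) / counts)
        lcb = means - bonus                        # optimistic (low) estimate of loss
        arm = int(np.argmin(lcb))                  # pick the arm with the smallest LCB
    loss = rng.binomial(1, true_means[arm])        # observe a Bernoulli loss
    counts[arm] += 1
    loss_sums[arm] += loss

print("pull counts per arm:", counts.astype(int))
print("estimated mean losses:", np.round(loss_sums / counts, 3))
```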

Source: Arxiv

DrivAer Transformer: A high-precision and fast prediction method for vehicle aerodynamic drag coefficient based on the DrivAerNet++ dataset

  • A new study proposes the DrivAer Transformer (DAT), a point cloud learning framework for evaluating vehicle aerodynamic performance.
  • The DAT structure uses the DrivAerNet++ dataset, containing high-fidelity CFD data of 3D vehicle shapes.
  • DAT enables accurate estimation of aerodynamic drag directly from 3D meshes, avoiding the limitations of traditional methods (a schematic point-cloud-to-drag sketch is shown below).
  • The framework is expected to accelerate the vehicle design process and improve development efficiency.
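
As a rough illustration of the task's input and output, the sketch below wires a generic PointNet-style encoder to a scalar drag-coefficient head. It is only a schematic stand-in, not the DrivAer Transformer architecture, and the random tensors are placeholders for sampled vehicle meshes.

```python
import torch
import torch.nn as nn

# Generic point-cloud -> scalar regressor: 3D shape in, drag coefficient out.
class PointCloudDragRegressor(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.point_mlp = nn.Sequential(            # shared per-point MLP
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Sequential(                 # global feature -> scalar Cd
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, points):                     # points: (batch, n_points, 3)
        feats = self.point_mlp(points)             # (batch, n_points, hidden)
        pooled = feats.max(dim=1).values           # permutation-invariant pooling
        return self.head(pooled).squeeze(-1)       # predicted drag coefficient

model = PointCloudDragRegressor()
fake_batch = torch.randn(4, 1024, 3)               # 4 shapes sampled to 1024 points each
print(model(fake_batch).shape)                     # torch.Size([4])
```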

Source: Arxiv

Millions of States: Designing a Scalable MoE Architecture with RWKV-7 Meta-learner

  • State-based sequence models like RWKV-7 offer a compelling alternative to Transformer architectures.
  • However, RWKV-7 lacks mechanisms for token-parameter interactions and native scalability.
  • A novel extension to RWKV-7 called Meta-State is proposed, which replaces attention mechanisms with a fully state-driven approach.
  • Meta-State supports progressive model scaling and offers a flexible framework for efficient and adaptable sequence modeling.

Source: Arxiv

Enabling Automatic Differentiation with Mollified Graph Neural Operators

  • Physics-informed neural operators offer a powerful framework for learning solution operators of partial differential equations (PDEs) by combining data and physics losses.
  • The mollified graph neural operator (mGNO) is introduced as the first method to leverage automatic differentiation and compute exact gradients on arbitrary geometries.
  • mGNO enables efficient training on irregular grids and varying geometries, while allowing physics losses to be evaluated seamlessly at randomly sampled points for improved generalization; the autodiff mechanism is sketched below.
  • mGNOs demonstrate superior performance compared to finite differences and machine learning baselines when solving PDEs on regular and unstructured grids.
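
The snippet below sketches only the autodiff mechanism the summary refers to: a physics (PDE residual) loss evaluated with exact derivatives at randomly sampled points, here for a toy 2D Poisson problem with a plain MLP. The mollified graph neural operator itself, and its handling of irregular geometries, is not reproduced.

```python
import torch
import torch.nn as nn

# Toy 2D Poisson residual u_xx + u_yy = f (with f = 1) evaluated by autodiff
# at random collocation points; the network here is a plain MLP stand-in.
net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))

def physics_loss(model, n_points=256):
    x = torch.rand(n_points, 2, requires_grad=True)              # random sample points
    u = model(x)
    grads = torch.autograd.grad(u.sum(), x, create_graph=True)[0]  # [u_x, u_y] per point
    u_xx = torch.autograd.grad(grads[:, 0].sum(), x, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(grads[:, 1].sum(), x, create_graph=True)[0][:, 1]
    residual = u_xx + u_yy - 1.0                                  # PDE residual per point
    return (residual ** 2).mean()

loss = physics_loss(net)
loss.backward()                       # exact gradients w.r.t. network parameters
print(float(loss))
```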

Source: Arxiv

SortBench: Benchmarking LLMs based on their ability to sort lists

  • Sorting is a challenging task for Large Language Models (LLMs) due to weaknesses in faithfully representing input data, performing logical comparisons, and distinguishing syntax from semantics.
  • A new LLM benchmark called SortBench has been introduced, offering several difficulty levels and easy scalability (a minimal scoring sketch is shown below).
  • Tests conducted on seven state-of-the-art LLMs, including test-time reasoning models, revealed that even highly capable models like o3-mini can struggle with sorting tasks that involve mixing syntax and semantics.
  • The models also struggle to stay faithful to the input for long lists, often dropping or adding items, and test-time reasoning tends to overthink the problem, degrading performance.
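
To make the evaluation idea concrete, here is a minimal scoring sketch that separates ordering correctness from faithfulness to the input (no dropped or added items). SortBench's actual task generators and metrics differ; `model_sort` is a hypothetical stand-in for an LLM call, deliberately made imperfect.

```python
import random
import string

def model_sort(items):
    # Placeholder "model": deliberately imperfect to show the metrics.
    out = sorted(items)
    return out[:-1] if len(out) > 10 else out     # drops an item on long lists

def score(items, answer):
    faithful = sorted(answer) == sorted(items)    # same multiset of items?
    ordered = answer == sorted(answer)            # is the answer in sorted order?
    exact = answer == sorted(items)               # fully correct?
    return {"faithful": faithful, "ordered": ordered, "exact": exact}

random.seed(0)
items = ["".join(random.choices(string.ascii_lowercase, k=5)) for _ in range(12)]
print(score(items, model_sort(items)))   # {'faithful': False, 'ordered': True, 'exact': False}
```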

Source: Arxiv

Academic Network Representation via Prediction-Sampling Incorporated Tensor Factorization

  • Accurately representing an academic network is important for academic relationship mining tasks such as predicting scientific impact.
  • The paper proposes a Prediction-sampling-based Latent Factorization of Tensors (PLFT) model to address the high-dimensional and incomplete nature of academic networks.
  • The PLFT model combines a cascaded LFT architecture, which strengthens representation learning, with a prediction-sampling strategy that learns the network representation more accurately; a plain tensor-factorization baseline is sketched below for context.
  • Experimental results show that the PLFT model outperforms existing models in predicting unexplored relationships in academic networks.
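
For context, the sketch below shows plain latent factorization of an incomplete three-way tensor, fit by SGD on observed entries only. PLFT builds its cascaded architecture and prediction-sampling strategy on top of this basic idea; neither is reproduced here, and the data are synthetic.

```python
import numpy as np

# CP-style factorization of a sparse, incomplete 3-way tensor.
rng = np.random.default_rng(0)
n_i, n_j, n_k, rank = 30, 30, 10, 4

# Ground-truth low-rank tensor and a sparse mask of observed entries.
A = rng.normal(size=(n_i, rank))
B = rng.normal(size=(n_j, rank))
C = rng.normal(size=(n_k, rank))
full = np.einsum("ir,jr,kr->ijk", A, B, C)
obs = rng.random((n_i, n_j, n_k)) < 0.05           # roughly 5% of entries observed
idx = np.argwhere(obs)

# Factor matrices estimated from the observed entries alone.
U = 0.1 * rng.normal(size=(n_i, rank))
V = 0.1 * rng.normal(size=(n_j, rank))
W = 0.1 * rng.normal(size=(n_k, rank))
lr = 0.01
for _ in range(200):                               # epochs of plain SGD
    for i, j, k in idx:
        err = U[i] @ (V[j] * W[k]) - full[i, j, k]
        gU, gV, gW = err * V[j] * W[k], err * U[i] * W[k], err * U[i] * V[j]
        U[i] -= lr * gU
        V[j] -= lr * gV
        W[k] -= lr * gW

recon = np.einsum("ir,jr,kr->ijk", U, V, W)
rmse_obs = np.sqrt(((recon - full)[obs] ** 2).mean())
rmse_new = np.sqrt(((recon - full)[~obs] ** 2).mean())
print(f"RMSE on observed entries: {rmse_obs:.3f}, on held-out entries: {rmse_new:.3f}")
```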

Source: Arxiv

Towards generalizable single-cell perturbation modeling via the Conditional Monge Gap

  • Learning how single cells respond to various treatments offers great potential for enabling targeted therapies.
  • Neural optimal transport (OT) has emerged as a methodological framework for analyzing single-cell data that are unpaired because cells are destroyed during data acquisition.
  • The Conditional Monge Gap is proposed as a method that learns OT maps conditionally on arbitrary covariates, such as time, drug treatment, drug dosage, or cell type.
  • The conditional models show promising generalization performance to unseen treatments, outperforming other models in capturing heterogeneity in the perturbed population.

Source: Arxiv

An Adaptive Clustering Scheme for Client Selections in Communication-Efficient Federated Learning

  • Federated learning is a decentralized learning architecture that can consume substantial network transmission resources.
  • A new adaptive clustering scheme is proposed to reduce communication costs in federated learning; a generic clustering-based client-selection sketch is shown below.
  • The scheme dynamically adjusts the number of clusters to find the most suitable grouping.
  • Experimental results show a reduction of communication and transmission costs by almost 50% without affecting model accuracy.
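
As a generic illustration of clustering-based client selection, the sketch below groups hypothetical client update vectors with k-means, scans candidate cluster counts with a silhouette criterion, and lets one representative per cluster communicate in a round. The paper's actual adaptive scheme for adjusting the number of clusters is not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Hypothetical per-client summaries (e.g., flattened model-update vectors).
client_updates = np.vstack([
    rng.normal(loc=c, scale=0.3, size=(10, 16)) for c in (0.0, 2.0, 4.0)
])

best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 8):                    # scan candidate cluster counts
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(client_updates)
    score = silhouette_score(client_updates, labels)
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

# One representative client per cluster uploads its update this round.
representatives = [int(np.flatnonzero(best_labels == c)[0]) for c in range(best_k)]
print(f"chosen k={best_k}, representative clients={representatives}")
```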

Source: Arxiv

DRIP: DRop unImportant data Points -- Enhancing Machine Learning Efficiency with Grad-CAM-Based Real-Time Data Prioritization for On-Device Training

  • Effective selection methods for model training can reduce labeling effort, optimize on-device training, and enhance model performance.
  • A novel algorithm using Grad-CAM is introduced for deciding online whether to retain or discard each data point.
  • The algorithm computes a DRIP Score for each data point to quantify its importance (a generic Grad-CAM scoring sketch follows below).
  • Experimental evaluations show that the approach achieves storage savings of up to 39% while maintaining or surpassing model accuracy.
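
The sketch below shows how a Grad-CAM map can be computed for a single sample and collapsed into one scalar that could drive a keep/drop decision. The paper's actual DRIP Score and threshold are not specified here; the mean CAM activation and the 0.01 cutoff are hypothetical placeholders, and the tiny CNN is illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(16, n_classes)

    def forward(self, x):
        fmap = self.features(x)                          # (B, 16, H, W)
        logits = self.classifier(fmap.mean(dim=(2, 3)))  # global average pooling
        return logits, fmap

model = TinyCNN()
x = torch.randn(1, 1, 28, 28)                            # placeholder input sample

logits, fmap = model(x)
target = logits[0, logits[0].argmax()]                   # score of the predicted class
grads = torch.autograd.grad(target, fmap)[0]             # d(score)/d(feature map)
weights = grads.mean(dim=(2, 3), keepdim=True)           # channel importance weights
cam = F.relu((weights * fmap).sum(dim=1))                # Grad-CAM heatmap, (1, H, W)

importance = cam.mean().item()                           # placeholder scalar score
print("keep sample" if importance > 0.01 else "drop sample", round(importance, 4))
```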

Source: Arxiv

Proofs as Explanations: Short Certificates for Reliable Predictions

  • This research explores a model for explainable AI that provides proofs as explanations for reliable predictions.
  • The model defines an explanation as a subset of the training data that can serve as a proof of a prediction's correctness.
  • The research introduces the concept of the robust hollow star number, which characterizes the worst-case size of the smallest achievable certificate.
  • The study also analyzes worst-case distributional bounds and distribution-dependent bounds for certificate size.

Source: Arxiv

Scaling Up On-Device LLMs via Active-Weight Swapping Between DRAM and Flash

  • Large language models (LLMs) are being deployed on mobile devices, but limited DRAM capacity constrains the model size.
  • ActiveFlow is introduced as an LLM inference framework that enables adaptive DRAM usage for modern LLMs.
  • ActiveFlow utilizes novel techniques such as cross-layer active-weight preloading and sparsity-aware self-distillation.
  • The framework attains the performance-cost Pareto frontier relative to existing optimization methods.

Source: Arxiv

PCA-RAG: Principal Component Analysis for Efficient Retrieval-Augmented Generation

  • Retrieval-Augmented Generation (RAG) is a powerful paradigm for grounding large language models in external knowledge sources.
  • This paper explores the use of Principal Component Analysis (PCA) to reduce the dimensionality of language model embeddings, addressing scalability challenges in processing large financial text corpora.
  • Reducing embeddings from 3,072 to 110 dimensions yields a significant speedup in retrieval and a smaller index, with moderate declines in correlation metrics (the projection-and-retrieval mechanism is sketched below).
  • The study highlights the practicality of leveraging classical dimensionality reduction techniques to optimize RAG architectures for knowledge-intensive applications in finance and trading.
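
The mechanism boils down to fitting PCA on the corpus embeddings and retrieving in the reduced space, as in the hedged sketch below. Random vectors stand in for real 3,072-dimensional model embeddings, and the 110-dimension target simply mirrors the figure quoted above; the paper's corpus, models, and evaluation are not used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(5000, 3072))      # stand-in document embeddings
query_embedding = doc_embeddings[42] + 0.1 * rng.normal(size=3072)  # query near doc 42

pca = PCA(n_components=110).fit(doc_embeddings)     # 3,072 -> 110 dimensions
docs_reduced = normalize(pca.transform(doc_embeddings))
query_reduced = normalize(pca.transform(query_embedding[None, :]))

scores = docs_reduced @ query_reduced.T             # cosine similarity in reduced space
top_k = np.argsort(-scores.ravel())[:5]
print("top-5 retrieved ids:", top_k)                # doc 42 should rank first
```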

Source: Arxiv

Graph Reduction with Unsupervised Learning in Column Generation: A Routing Application

  • Column Generation (CG) is a popular method for enhancing computational efficiency in large-scale combinatorial optimization problems.
  • A new approach combines CG with a Graph Neural Network (GNN) and unsupervised learning to reduce the size of the Elementary Shortest Path Problem with Resource Constraints (ESPPRC).
  • The reduced problem is then solved using local search techniques, resulting in significant improvements in convergence compared to previous reduction techniques.
  • The method has shown over 9% improvement in objective values for larger instances of Capacitated Vehicle Routing Problems with Time Windows.

Source: Arxiv

BOISHOMMO: Holistic Approach for Bangla Hate Speech

  • A multi-label Bangla hate speech dataset named BOISHOMMO has been compiled and evaluated.
  • BOISHOMMO includes categories of hate speech across various dimensions such as race, gender, religion, and politics.
  • The dataset consists of over two thousand annotated examples, providing a nuanced understanding of hate speech in Bangla.
  • The dataset aims to improve hate speech detection and analysis studies for low-resource languages.

Source: Arxiv

Constrained Machine Learning Through Hyperspherical Representation

  • The problem of ensuring constraints satisfaction on the output of machine learning models is critical for many applications, especially in safety-critical domains.
  • A novel method called Hyperspherical Constrained Representation is proposed to enforce output-space constraints for convex and bounded feasibility regions.
  • The method uses a different representation system: Euclidean coordinates are converted into hyperspherical coordinates relative to the constrained region, so that only feasible points can be represented (a simplified ball-region sketch follows below).
  • Experiments on synthetic and real-world datasets show that the proposed method achieves comparable predictive performance, guarantees 100% constraint satisfaction, and has minimal computational cost at inference time.
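
To make the geometric idea concrete, the sketch below handles the simplest convex bounded region, a ball: the network's raw outputs are mapped to a unit direction and a radius inside [0, R), so every decoded point is feasible by construction. The paper's construction for general convex bounded regions via hyperspherical coordinates is richer than this special case.

```python
import torch

center = torch.tensor([1.0, 2.0, 0.5])
R = 0.75                                          # feasible region: ||y - center|| <= R

def decode(raw):                                  # raw: (batch, 4) unconstrained outputs
    direction = torch.nn.functional.normalize(raw[:, :3], dim=1)  # unit direction
    radius = R * torch.sigmoid(raw[:, 3:4])       # radius squashed into (0, R)
    return center + radius * direction            # guaranteed inside the ball

raw = torch.randn(6, 4) * 10.0                    # arbitrary, even extreme, raw outputs
y = decode(raw)
print(torch.linalg.norm(y - center, dim=1) <= R)  # tensor of True values
```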
