techminis

A naukri.com initiative

ML News

Arxiv · 15h

A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning

  • Stronger mathematical reasoning remains a key capability gap for Large Language Models (LLMs).
  • A new paper introduces a practical training approach that combines Supervised Fine-Tuning (SFT) with Reinforcement Learning (RL) for maximizing accuracy and efficiency.
  • The methodology extends SFT for up to 10 epochs to maximize accuracy, then applies RL from online inference (GRPO) to improve token efficiency without compromising performance (a group-relative advantage sketch follows below).
  • Experiments demonstrate the effectiveness of this approach, resulting in top-tier performance on benchmarks like the AI Mathematical Olympiad and providing a blueprint for developing advanced mathematical reasoning models.
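
The summary does not include the paper's code, so the following is only a minimal sketch of the group-relative advantage computation at the heart of GRPO-style RL: each sampled solution for a problem is scored, and its reward is normalized against the mean and standard deviation of its own group. The reward definition here (correctness minus a small token-length penalty) and the penalty weight are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: normalize each sampled solution's reward
    against the mean/std of its own group (all samples for the same problem)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Toy example: 4 sampled solutions for one math problem.
# Reward = correctness (1/0) minus a small token-length penalty (illustrative).
correct = np.array([1, 1, 0, 1])
tokens  = np.array([180, 420, 300, 240])
rewards = correct - 0.001 * tokens

print(grpo_advantages(rewards))
# Longer correct solutions get smaller advantages than shorter correct ones,
# which is how the RL stage can push for token efficiency.
```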

Arxiv · 15h

Lightweight Safety Guardrails via Synthetic Data and RL-guided Adversarial Training

  • Researchers have developed a lightweight safety guardrail framework for language models that outperforms larger counterparts in content moderation tasks.
  • The framework utilizes synthetic data generation and adversarial training techniques, starting with human-curated seed data that is augmented and paraphrased to create diverse examples.
  • Adversarial training guided by reinforcement learning strengthens the safety classifier by generating challenging synthetic examples for fine-tuning (a simplified hard-example mining loop is sketched below).
  • This approach enhances the performance of smaller language models in content moderation, making them efficient and resilient against adversarial attacks.
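
As a rough illustration of the augment-then-adversarially-refine idea (not the paper's RL-guided generator), the sketch below trains a small text classifier on seed data and then folds misclassified, paraphrase-style variants back into training. The seed examples and the `paraphrase` function are toy stand-ins.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny human-curated seed set (toy stand-in for the paper's curated data).
texts  = ["how do I bake bread", "instructions to build a weapon",
          "tips for healthy sleep", "ways to harm someone"]
labels = [0, 1, 0, 1]  # 0 = safe, 1 = unsafe

def paraphrase(t):
    # Toy augmentation standing in for LLM paraphrasing / RL-guided generation.
    return "please tell me " + t

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
for _ in range(3):  # a few refinement rounds
    clf.fit(texts, labels)
    # Mine variants the current classifier gets wrong and add them back
    # to the training pool (hard-example mining).
    for t, y in list(zip(texts, labels)):
        v = paraphrase(t)
        if clf.predict([v])[0] != y:
            texts.append(v)
            labels.append(y)

print(clf.predict(["please tell me ways to harm someone"]))
```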

Arxiv · 15h

CAS (Condensed and Accelerated Silhouette): An Efficient Method for Determining the Optimal K in K-Means Clustering

  • Clustering is central to data-driven fields, but determining the right number of clusters accurately and efficiently is difficult, especially on large datasets.
  • A new method, CAS (Condensed and Accelerated Silhouette), finds the optimal k for K-Means clustering efficiently (a standard silhouette scan, which CAS accelerates, is sketched below).
  • The approach combines multiple techniques and statistical methods for better performance in text and image data clustering.
  • Experimental results indicate significant improvements in computational efficiency and cluster validity compared to traditional methods.
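
The CAS algorithm itself is not reproduced in the summary; the sketch below shows the standard (uncondensed) silhouette scan it builds on: fit K-Means over a range of k and keep the k with the best silhouette score. CAS's contribution is making this kind of search much cheaper on large datasets.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)   # mean silhouette over all points

best_k = max(scores, key=scores.get)
print(scores, "-> optimal k:", best_k)        # typically 4 for this toy data
```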

Arxiv · 15h

A Comprehensively Adaptive Architectural Optimization-Ingrained Quantum Neural Network Model for Cloud Workloads Prediction

  • Researchers propose a Comprehensively Adaptive Architectural Optimization-based Variable Quantum Neural Network (CA-QNN) model for accurate cloud workload prediction and resource reservation.
  • The CA-QNN model integrates quantum computing principles with structural and parametric learning to address challenges faced by traditional neural networks and deep learning models in handling dynamic cloud workloads.
  • Workload data is converted into qubits and processed through qubit neurons with Controlled-NOT-gated activation functions to enhance pattern recognition (a toy angle-encoding sketch follows below).
  • The CA-QNN model outperforms existing methods, reducing prediction errors by up to 93.40% and 91.27% on heterogeneous cloud workload datasets.
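
The CA-QNN internals (qubit neurons, C-NOT-gated activations, architectural adaptation) are not spelled out in the summary. The sketch below only illustrates a common angle-encoding scheme that maps a normalized workload value x in [0, 1] to single-qubit amplitudes [cos θ, sin θ] with θ = x·π/2; treat it as an assumption about how "workload data is converted into qubits", not the paper's exact encoding.

```python
import numpy as np

def angle_encode(x, x_min, x_max):
    """Map a scalar workload value to single-qubit amplitudes [cos θ, sin θ]."""
    theta = (x - x_min) / (x_max - x_min) * np.pi / 2
    return np.array([np.cos(theta), np.sin(theta)])

# Toy CPU-demand trace (requests per interval).
workload = np.array([120.0, 340.0, 515.0, 280.0, 760.0])
qubits = np.stack([angle_encode(x, workload.min(), workload.max()) for x in workload])

print(qubits)                      # each row is one encoded qubit state
print(np.sum(qubits**2, axis=1))   # amplitudes are unit-norm: all 1.0
```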

Arxiv · 15h

scE²TM: Toward Interpretable Single-Cell Embedding via Topic Modeling

  • Advances in sequencing technologies have allowed exploration of cellular heterogeneity at single-cell resolution.
  • Interpretability has become important alongside the increase in complexity of deep learning models.
  • A new model, scE²TM, couples topic modeling with external biological knowledge for interpretable single-cell embedding learning (a generic topic-modeling sketch follows below).
  • scE²TM provides high-quality cell embeddings, improves clustering performance, and offers strong interpretation grounded in external biological knowledge.
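
scE²TM's embedded topic model and its use of external knowledge cannot be reproduced from this summary; the sketch below only shows the generic idea with scikit-learn's LatentDirichletAllocation on a toy cell-by-gene count matrix, where topics play the role of interpretable "gene programs" and the per-cell topic mixture serves as the embedding.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
# Toy cell-by-gene count matrix: 100 cells x 50 genes.
counts = rng.poisson(lam=2.0, size=(100, 50))

lda = LatentDirichletAllocation(n_components=5, random_state=0)
cell_embedding = lda.fit_transform(counts)   # per-cell topic proportions
gene_programs  = lda.components_             # per-topic gene loadings

print(cell_embedding.shape, gene_programs.shape)   # (100, 5) (5, 50)
# Top-loading genes per topic give the human-readable side of the embedding.
print(np.argsort(gene_programs[0])[::-1][:5])
```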

Arxiv · 15h

Leveraging Machine Learning and Enhanced Parallelism Detection for BPMN Model Generation from Text

  • Efficient planning, resource management, and consistent operations often rely on converting textual process documents into formal Business Process Model and Notation (BPMN) models.
  • Existing approaches, whether rule-based or machine-learning-based, struggle with varied writing styles and with identifying parallel structures in process descriptions (a toy cue-word detector for parallelism is sketched below).
  • A new automated pipeline leveraging machine learning and large language models is introduced for extracting BPMN models from text, along with a newly annotated dataset to enhance training by including parallel gateways.
  • The proposed approach shows promising results in terms of reconstruction accuracy, providing a foundation to speed up BPMN model creation for organizations.
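
The paper's ML/LLM pipeline and its annotated dataset are not available from this summary; the toy sketch below only illustrates the kind of rule-based parallelism cue detection that such pipelines aim to improve on: flagging sentences in a process description that likely describe parallel branches (candidate parallel gateways in BPMN terms). The cue list is an illustrative assumption.

```python
import re

PARALLEL_CUES = ("meanwhile", "in parallel", "at the same time",
                 "simultaneously", "while")

def find_parallel_candidates(text):
    """Return sentences containing a lexical cue for parallel branches."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if any(c in s.lower() for c in PARALLEL_CUES)]

doc = ("The clerk checks the order. Meanwhile, the warehouse reserves stock. "
       "Afterwards, the invoice is sent.")
print(find_parallel_candidates(doc))
# ['Meanwhile, the warehouse reserves stock.']
```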

Arxiv · 15h

Prediction of Lane Change Intentions of Human Drivers using an LSTM, a CNN and a Transformer

  • The study focuses on predicting the lane change intentions of human drivers using LSTM, CNN, and Transformer networks (a minimal LSTM classifier is sketched below).
  • Lane changes of preceding vehicles significantly impact automated vehicle motion planning in complex traffic situations.
  • Transformer networks outperformed LSTM and CNN in predicting lane change intentions and showed less susceptibility to overfitting.
  • The accuracy of the method ranged from 82.79% to 96.73% for different input configurations, demonstrating promising performance in predicting human drivers' lane change intentions.
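
The study's input features and exact architectures are not given here; the sketch below is a minimal PyTorch LSTM classifier over a short trajectory window (e.g., lateral offset, lateral velocity, distance to lane marking per time step) with three output classes: keep lane, change left, change right. The feature choice and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LaneChangeLSTM(nn.Module):
    def __init__(self, n_features=3, hidden=64, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)     # h_n: (1, batch, hidden)
        return self.head(h_n[-1])      # logits: (batch, n_classes)

model = LaneChangeLSTM()
window = torch.randn(8, 30, 3)         # 8 vehicles, 30 time steps, 3 features
logits = model(window)
print(logits.shape, logits.argmax(dim=1))   # (8, 3) and predicted intentions
```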

Arxiv · 15h

Advances in Machine Learning: Where Can Quantum Techniques Help?

  • Quantum Machine Learning (QML) is a field combining quantum computing and artificial intelligence to enhance data-driven tasks by leveraging quantum computational advantages.
  • The review explores how QML can address computational bottlenecks in classical machine learning, particularly for complex datasets.
  • Key areas of focus include quantum data encoding, learning theory, optimization techniques, and applications in quantum chemistry and sensing.
  • Challenges posed by Noisy Intermediate-Scale Quantum (NISQ) devices are discussed, highlighting the need for quantum-native algorithms and improved error correction before practical deployment.

Arxiv · 15h

Two-cluster test

  • Cluster analysis is a key topic in statistics and machine learning, with the challenge of determining if two sample subsets belong to the same cluster.
  • Classic two-sample tests applied in clustering scenarios can yield inflated Type-I error rates, motivating a new approach known as the two-cluster test (the inflation is illustrated in the sketch below).
  • A novel method utilizing boundary points between subsets is introduced to calculate analytical p-values, effectively reducing the Type-I error rate compared to traditional two-sample tests.
  • Experiments on synthetic and real datasets demonstrate the effectiveness of the proposed two-cluster test in various clustering applications, including tree-based interpretable clustering and significance-based hierarchical clustering.
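
The boundary-point construction behind the proposed two-cluster test is not detailed in the summary; the sketch below instead illustrates the problem it targets: if you first split a single homogeneous sample with K-Means and then run an ordinary two-sample t-test between the resulting "clusters", the p-value is misleadingly tiny even though there is only one true cluster.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))            # one Gaussian cluster: the null is true

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(x)
stat, p = ttest_ind(x[labels == 0, 0], x[labels == 1, 0])
print(p)   # essentially 0: the naive test "detects" two clusters that are not there
```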

Arxiv · 15h

Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling

  • Discrete diffusion models have become powerful for language modeling, competing with auto-regressive models in training-time scaling.
  • A new approach using particle Gibbs sampling for inference-time scaling in discrete diffusion models is introduced.
  • The particle Gibbs sampling algorithm iteratively refines diffusion trajectories using Sequential Monte Carlo to improve text generation (a generic SMC resampling step is sketched below).
  • Empirical results show that this new approach outperforms prior inference-time strategies in reward-guided text generation tasks.
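
The full particle Gibbs sampler for diffusion language models is not specified in this summary; the sketch below shows only the generic Sequential Monte Carlo ingredient it relies on: turning per-trajectory reward scores into weights and resampling particle indices in proportion to them, so promising drafts are duplicated and weak ones dropped. The reward values and softmax weighting are illustrative assumptions.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return particle indices resampled in proportion to their weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    n = len(w)
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(w), positions)

rng = np.random.default_rng(0)
rewards = np.array([0.1, 2.5, 0.7, 3.0])     # e.g., reward-model scores per draft
weights = np.exp(rewards - rewards.max())    # softmax-style weighting
print(systematic_resample(weights, rng))     # the top-reward draft is duplicated,
                                             # the lowest-reward one is dropped
```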

Arxiv · 15h

RTNinja: a generalized machine learning framework for analyzing random telegraph noise signals in nanoelectronic devices

  • Random telegraph noise is a prevalent variability phenomenon in nanoelectronic devices, impacting device reliability and performance.
  • RTNinja is a new machine learning framework introduced for the unsupervised analysis of random telegraph noise signals.
  • RTNinja deconvolves complex signals to identify hidden individual sources without prior system knowledge.
  • The framework consists of two components: LevelsExtractor for denoising and SourcesMapper for inferring source configurations (a toy level-extraction example follows below).
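
RTNinja's LevelsExtractor and SourcesMapper are not described in enough detail here to reproduce; the toy sketch below only conveys the flavor of the first step: generating a noisy two-level random telegraph signal and recovering its discrete levels by clustering the samples. The real framework handles multiple overlapping sources without prior knowledge.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic RTN: a hidden two-state switching process plus Gaussian read noise.
state = np.cumsum(rng.random(2000) < 0.02) % 2          # random up/down switching
signal = 1.0 * state + rng.normal(scale=0.15, size=2000)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(signal.reshape(-1, 1))
levels = np.sort(km.cluster_centers_.ravel())
print(levels)   # roughly [0.0, 1.0]: the two RTN levels recovered from the noise
```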

Arxiv · 15h

KGRAG-Ex: Explainable Retrieval-Augmented Generation with Knowledge Graph-based Perturbations

  • Retrieval-Augmented Generation (RAG) aims to enhance language models by incorporating external information.
  • Knowledge graphs (KGs) are introduced in the KGRAG-Ex system to improve factual grounding and explainability.
  • KGRAG-Ex leverages a domain-specific KG to identify relevant entities and semantic paths for natural language generation.
  • The system incorporates perturbation-based explanation methods to assess the influence of KG-derived components on generated answers, improving interpretability (a leave-one-out perturbation sketch follows below).
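
KGRAG-Ex's exact perturbation scheme is not given in the summary; the sketch below illustrates the general leave-one-out idea with a hypothetical `answer_score` function standing in for re-running generation and scoring the answer: drop each retrieved KG triple in turn and record how much the score falls, treating larger drops as higher influence. The triples and scorer are toy placeholders.

```python
def loo_importance(triples, answer_score):
    """Leave-one-out influence of each KG triple on a (hypothetical) answer score."""
    base = answer_score(triples)
    return {t: base - answer_score([u for u in triples if u != t]) for t in triples}

# Toy stand-in scorer: pretend the answer mainly needs the 'treats' fact.
def answer_score(triples):
    return 0.9 if ("aspirin", "treats", "headache") in triples else 0.3

triples = [("aspirin", "treats", "headache"),
           ("aspirin", "type_of", "NSAID"),
           ("headache", "symptom_of", "migraine")]
print(loo_importance(triples, answer_score))
# The 'treats' triple causes the largest score drop, i.e. has the highest influence.
```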

Arxiv · 15h

Ranked Set Sampling-Based Multilayer Perceptron: Improving Generalization via Variance-Based Bounds

  • Researchers have introduced a new method called RSS-MLP that aims to enhance the generalization ability of Multilayer Perceptron (MLP) neural networks by reducing the variance of empirical loss.
  • The method uses Ranked Set Sampling (RSS) to impose an ordered structure on the training data, which reduces variance compared to the Simple Random Sampling (SRS) traditionally used in bagging (RSS itself is sketched below).
  • Theoretical results indicate that the variance of empirical exponential loss and logistic loss estimated by RSS-MLP are smaller than those estimated by SRS.
  • Comparison experiments on twelve benchmark data sets show that the RSS-MLP method is effective in improving performance under two fusion methods for convex loss functions.
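
RSS-MLP's variance-based loss bounds are not reproduced here, but ranked set sampling itself is simple to sketch: for set size k, draw k small sets of k units, rank each set, and keep the i-th order statistic from the i-th set. Comparing the variance of the sample mean under RSS versus simple random sampling (SRS) over many repetitions shows the variance reduction the paper exploits; the standard-normal population is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def rss_sample(draw, k):
    """One ranked-set-sampling cycle of size k: i-th order stat of the i-th set."""
    return np.array([np.sort(draw(k))[i] for i in range(k)])

draw = lambda n: rng.normal(size=n)   # the population we sample from
k, reps = 5, 20000

rss_means = [rss_sample(draw, k).mean() for _ in range(reps)]
srs_means = [draw(k).mean() for _ in range(reps)]

print(np.var(rss_means), "<", np.var(srs_means))   # RSS mean has lower variance
```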

Arxiv · 15h

Evaluating SAE interpretability without explanations

  • Sparse autoencoders (SAEs) and transcoders are important tools for machine learning interpretability.
  • Measuring the interpretability of SAEs remains challenging due to the lack of consensus on benchmarks.
  • Current evaluation procedures involve generating single-sentence explanations for each latent, which complicates the assessment process.
  • A new method has been proposed to assess the interpretability of sparse coders without the need for natural language explanations, aiming for a more direct evaluation approach.

Arxiv · 15h

SynBridge: Bridging Reaction States via Discrete Flow for Bidirectional Reaction Prediction

  • Researchers introduce SynBridge, a bidirectional flow-based generative model for multi-task reaction prediction, focusing on discrete and abrupt changes in chemical reactions like electron transfer and bond formation.
  • SynBridge utilizes a graph-to-graph transformer network architecture with discrete flow bridges to capture bidirectional chemical transformations between reactants and products, emphasizing discrete states of bonds and atoms.
  • The proposed method achieves state-of-the-art performance in forward and retrosynthesis tasks on benchmark datasets (USPTO-50K, USPTO-MIT, Pistachio), showcasing its effectiveness.
  • Experiments, ablation studies, and noise scheduling analysis highlight the advantages of structured diffusion over discrete spaces in reaction prediction, indicating the potential of SynBridge in advancing chemical reaction modeling.
