ML News

Arxiv · 1d

Synergizing Reinforcement Learning and Genetic Algorithms for Neural Combinatorial Optimization

  • Combinatorial optimization problems are difficult due to their discrete structure and large solution space.
  • Deep reinforcement learning (DRL) has shown the ability to learn heuristics from data but can struggle with limited exploration and local optima.
  • Genetic Algorithms (GAs) excel in global exploration but are sample inefficient and computationally intensive.
  • A new framework called Evolutionary Augmentation Mechanism (EAM) combines DRL efficiency with GA's global search power by refining solutions through genetic operations.
  • EAM enhances exploration and speeds up convergence by integrating evolved solutions back into the policy training loop.
  • Theoretical analysis ensures stable policy updates by establishing an upper bound on the KL divergence between evolved and policy distributions.
  • EAM is versatile and can be used with various DRL solvers like Attention Model, POMO, and SymNCO.
  • Extensive testing on benchmark problems like TSP, CVRP, PCTSP, and OP shows EAM improves solution quality and training efficiency compared to baselines.
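
The augmentation loop can be pictured with a small, self-contained sketch. The code below is an illustration only, not the paper's implementation: it assumes a TSP-style tour encoding, uses order crossover and swap mutation as the genetic operators, and the policy_sample / policy_update calls in the trailing comments are hypothetical stand-ins for whatever DRL solver is being augmented.

    import random

    def tour_length(tour, dist):
        # Total length of a closed tour under a distance matrix.
        return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

    def order_crossover(p1, p2):
        # Classic OX: keep a slice of parent 1, fill the rest in parent 2's order.
        n = len(p1)
        a, b = sorted(random.sample(range(n), 2))
        child = [None] * n
        child[a:b] = p1[a:b]
        rest = [c for c in p2 if c not in child[a:b]]
        for i in range(n):
            if child[i] is None:
                child[i] = rest.pop(0)
        return child

    def swap_mutation(tour, p=0.2):
        tour = tour[:]
        if random.random() < p:
            i, j = random.sample(range(len(tour)), 2)
            tour[i], tour[j] = tour[j], tour[i]
        return tour

    def evolutionary_augmentation(tours, dist, generations=5):
        # Refine policy-sampled tours with a short GA run and keep the best survivors.
        for _ in range(generations):
            children = [swap_mutation(order_crossover(*random.sample(tours, 2)))
                        for _ in range(len(tours))]
            tours = sorted(tours + children, key=lambda t: tour_length(t, dist))[:len(tours)]
        return tours

    # Sketch of the training loop: evolved tours are fed back into policy training.
    # tours = policy_sample(instance)                  # hypothetical DRL solver call
    # elite = evolutionary_augmentation(tours, dist)
    # policy_update(instance, tours + elite)           # e.g. REINFORCE on the mixed batch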

Arxiv · 1d

Generalization Error Analysis for Attack-Free and Byzantine-Resilient Decentralized Learning with Data Heterogeneity

  • Decentralized learning, which trains models across geographically dispersed agents, has drawn growing attention in signal and information processing.
  • While the optimization errors of decentralized learning algorithms have been studied closely, their generalization errors remain far less explored.
  • Understanding generalization errors is vital for assessing model performance on new data for real-world applications.
  • The paper conducts a detailed analysis of generalization errors in attack-free and Byzantine-resilient decentralized learning with heterogeneous data.
  • This analysis is carried out under mild assumptions, unlike previous studies focusing on homogeneous data or strict bounded stochastic gradient assumptions.
  • Results emphasize the impact of data heterogeneity, model initialization, and stochastic gradient noise on decentralized learning's generalization error.
  • Byzantine attacks by malicious agents notably increase the generalization error, and this effect is tied primarily to data heterogeneity rather than to sample size.
  • Numerical experiments verify the theoretical results on both convex and non-convex tasks.

Arxiv · 1d

Safe Screening Rules for Group SLOPE

  • Variable selection in high-dimensional sparse learning with group structures is challenging.
  • Group SLOPE is effective for adaptive selection of predictor groups but faces issues with block non-separable group effects.
  • Existing methods are either invalid or inefficient in handling these effects, leading to high computational costs and memory usage.
  • A new safe screening rule tailored for Group SLOPE efficiently identifies inactive groups with zero coefficients by addressing block non-separable group effects.
  • By excluding inactive groups during training, significant gains in computational efficiency and memory usage are achieved.
  • The screening rule can be seamlessly integrated into existing solvers for both batch and stochastic algorithms.
  • Theoretically, the screening rule can be safely employed with existing optimization algorithms, ensuring the same results as the original approaches.
  • Experimental results show that the method detects inactive feature groups effectively, enhancing computational efficiency without compromising accuracy.
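
Screening rules of this general family score each predictor group against a threshold and fit the solver only on the groups that survive. The snippet below is a deliberately simplified, correlation-based filter in the spirit of (non-safe) strong rules, shown only to make the mechanics concrete; it is not the certified safe rule for the block non-separable Group SLOPE penalty developed in the paper.

    import numpy as np

    def screen_groups(X, y, groups, lam, slack=1.0):
        # groups: dict mapping group id -> list of column indices in X.
        # lam:    dict mapping group id -> penalty weight for that group.
        # Returns a keep/discard flag per group based on the correlation of the
        # group's columns with y (the residual at the all-zero solution).
        keep = {}
        for g, cols in groups.items():
            corr = np.linalg.norm(X[:, cols].T @ y)
            keep[g] = corr >= slack * lam[g]
        return keep

    # Train only on the surviving columns:
    # active = [c for g, cols in groups.items() if keep[g] for c in cols]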

Arxiv · 1d

Learning Obfuscations Of LLM Embedding Sequences: Stained Glass Transform

  • The high cost of AI compute infrastructure has led to a rise in managed Model-as-a-service deployments.
  • Enterprises often share compute infrastructure among different teams for efficiency.
  • Deployed large language models (LLMs) typically operate on plaintext data.
  • Data owners are hesitant to use their private data in shared compute environments.
  • A solution, the Stained Glass Transform, is introduced to provide privacy to LLM input while maintaining model utility.
  • The Stained Glass Transform is a learned and stochastic transformation of LLM word embeddings.
  • The transform is designed to provide theoretical privacy guarantees based on the mutual information of Gaussian mixture models.
  • A posteriori privacy estimates based on mutual information are computed for the transformed embeddings.
  • The privacy and utility of transformed embeddings are verified through token level privacy metrics and LLM performance benchmarks.
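
In generic terms, a learned stochastic transformation of embedding sequences can be a small network that predicts an input-dependent shift and noise scale per token and samples a perturbed embedding. The PyTorch sketch below shows that generic pattern with made-up layer sizes; it is not the authors' Stained Glass Transform implementation.

    import torch
    import torch.nn as nn

    class StochasticEmbeddingTransform(nn.Module):
        # Maps each token embedding x to z = x + mu(x) + sigma(x) * eps, with eps ~ N(0, I).
        def __init__(self, dim, hidden=256):
            super().__init__()
            self.mu = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            self.log_sigma = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

        def forward(self, x):                      # x: (batch, seq_len, dim)
            eps = torch.randn_like(x)
            return x + self.mu(x) + torch.exp(self.log_sigma(x)) * eps

    # protected = StochasticEmbeddingTransform(emb.size(-1))(emb)   # emb: input embeddings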

Arxiv · 1d

NDCG-Consistent Softmax Approximation with Accelerated Convergence

  • The research paper introduces novel loss formulations, RG$^2$ and RG$^\times$, to address the computational overhead and scalability issues associated with Softmax (SM) Loss in ranking tasks.
  • The RG$^2$ Loss and RG$^\times$ Loss are derived through Taylor expansions of the SM Loss and reveal connections between different ranking loss paradigms.
  • The proposed losses are integrated with the Alternating Least Squares (ALS) optimization method to provide convergence rate analyses and generalization guarantees.
  • Empirical evaluations on real-world datasets show that the new approach achieves comparable or superior ranking performance to SM Loss while accelerating convergence significantly.
  • The framework contributes theoretical insights and efficient tools for the similarity learning community, suitable for tasks requiring a balance between ranking quality and computational efficiency.
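
To make the Taylor-expansion idea concrete: the softmax ranking loss contains a log-sum-exp term, and expanding it to second order around zero scores yields a cheap quadratic surrogate. The expansion below is a generic illustration under that simplifying choice of expansion point, not necessarily the exact RG$^2$ or RG$^\times$ construction.

    \mathcal{L}_{\mathrm{SM}}(s) = -s_{y} + \log\sum_{j=1}^{n} e^{s_j},
    \qquad
    \log\sum_{j=1}^{n} e^{s_j} \;\approx\; \log n + \bar{s}
        + \tfrac{1}{2}\Big(\tfrac{1}{n}\sum_{j=1}^{n} s_j^{2} - \bar{s}^{2}\Big),
    \qquad
    \bar{s} = \tfrac{1}{n}\sum_{j=1}^{n} s_j .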

Arxiv · 1d

A Unified Theory of Compositionality, Modularity, and Interpretability in Markov Decision Processes

  • Researchers introduce Option Kernel Bellman Equations (OKBEs) for a new reward-free Markov Decision Process.
  • OKBEs directly optimize a predictive map called a state-time option kernel (STOK) to maximize goal completion probability while avoiding constraint violations.
  • STOKs are compositional, modular, and interpretable initiation-to-termination transition kernels for policies in the Options Framework of Reinforcement Learning.
  • STOKs can be composed using Chapman-Kolmogorov equations for spatiotemporal predictions over long horizons and can be efficiently represented in a factorized and reconfigurable form.
  • STOKs record probabilities of goal-success and constraint-violation events, crucial for formal verification.
  • High-dimensional state models can be decomposed using local STOKs and goal-conditioned policies aggregated into a factorized goal kernel for solving complex planning problems.
  • The approach enables forward planning at the goal level in high dimensions, providing flexible agents capable of rapidly synthesizing meta-policies and reusing planning representations.
  • Option Kernel Bellman Equations (OKBEs) support verifiable long-horizon planning and intrinsic motivation in dynamic high-dimensional world-models.
  • Researchers argue that reward-maximization conflicts with compositionality, modularity, and interpretability in reinforcement learning.
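
In its simplest state-only form, the Chapman-Kolmogorov composition of option kernels is just a product of row-stochastic matrices, as in the toy sketch below (a full state-time option kernel would also carry a duration dimension, which is omitted here).

    import numpy as np

    def compose(K1, K2):
        # Chapman-Kolmogorov: P(s'' | s) = sum over s' of K1[s, s'] * K2[s', s''].
        return K1 @ K2

    # Toy option kernels over 3 states; each row is a termination-state distribution.
    K_goto_door = np.array([[0.1, 0.8, 0.1],
                            [0.0, 0.9, 0.1],
                            [0.0, 0.2, 0.8]])
    K_open_door = np.array([[1.0, 0.0, 0.0],
                            [0.1, 0.1, 0.8],
                            [0.0, 0.0, 1.0]])

    K_plan = compose(K_goto_door, K_open_door)
    print(K_plan[0])   # where the composed "go to door, then open it" plan ends up from state 0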

Arxiv · 1d

Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design

  • The study considers reinforcement learning from human feedback in general Markov decision processes, focusing on trajectory-level preference comparisons.
  • The challenge is to design algorithms that issue informative preference queries for reward identification while retaining theoretical guarantees.
  • A meta-algorithm based on randomized exploration is proposed to address this challenge while remaining computationally tractable (a sketch of the underlying preference model follows this list).
  • Established regret and last-iterate guarantees under mild reinforcement learning oracle assumptions.
  • Introduced an improved algorithm that collects batches of trajectory pairs and uses optimal experimental design for informative queries.
  • Batch structure enables parallelization of preference queries, enhancing practical deployment efficiency.
  • Empirical evaluation confirms competitiveness with reward-based reinforcement learning using minimal preference queries.
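
The usual building block for learning rewards from trajectory-level comparisons is the Bradley-Terry model: the probability that trajectory A is preferred over B is a logistic function of their predicted return difference. The sketch below fits a linear reward on trajectory features by maximum likelihood; it illustrates only this building block, not the randomized-exploration meta-algorithm or the experimental-design query selection.

    import numpy as np

    def fit_bradley_terry(feat_a, feat_b, prefs, lr=0.1, steps=500):
        # feat_a, feat_b: (num_pairs, d) trajectory feature sums; prefs[i] = 1 if A was preferred.
        w = np.zeros(feat_a.shape[1])
        for _ in range(steps):
            diff = (feat_a - feat_b) @ w                # predicted return difference r(A) - r(B)
            p = 1.0 / (1.0 + np.exp(-diff))             # P(A preferred | w)
            grad = (feat_a - feat_b).T @ (prefs - p) / len(prefs)
            w += lr * grad                              # gradient ascent on the log-likelihood
        return w

    rng = np.random.default_rng(0)
    true_w = np.array([1.0, -2.0, 0.5])
    A, B = rng.normal(size=(200, 3)), rng.normal(size=(200, 3))
    prefs = (rng.random(200) < 1 / (1 + np.exp(-(A - B) @ true_w))).astype(float)
    print(fit_bradley_terry(A, B, prefs))               # roughly recovers the direction of true_w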

Arxiv · 1d

Neural Functions for Learning Periodic Signal

  • Deep neural networks are used as function approximators to represent various signal types, like periodic signals.
  • Recent approaches involve multi-layer perceptrons (MLPs) to learn nonlinear mappings from coordinates to signals.
  • MLPs face issues like overfitting and poor generalizability in learning continuous neural representations.
  • A new architecture is proposed to extract periodic patterns from measurements and enhance signal representation.
  • The proposed method aims to improve generalization and extrapolation performance for periodic signals.
  • Experiments demonstrate the effectiveness of the new architecture in learning periodic solutions for differential equations.
  • The method is also tested on real-world datasets for time series imputation and forecasting.
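
A common way to build periodicity into a coordinate network is to lift the input through sinusoidal features with learnable frequencies before the MLP layers. The PyTorch sketch below fits a toy periodic signal this way; it shows the general idea only and is not the architecture proposed in the paper.

    import torch
    import torch.nn as nn

    class PeriodicMLP(nn.Module):
        def __init__(self, num_freqs=16, hidden=64):
            super().__init__()
            self.freqs = nn.Parameter(torch.rand(num_freqs) * 10)   # learnable frequencies
            self.net = nn.Sequential(nn.Linear(2 * num_freqs, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 1))

        def forward(self, t):                        # t: (batch, 1) coordinate
            phases = t * self.freqs                  # (batch, num_freqs)
            feats = torch.cat([torch.sin(phases), torch.cos(phases)], dim=-1)
            return self.net(feats)

    t = torch.linspace(0, 2 * torch.pi, 200).unsqueeze(-1)
    y = torch.sin(3 * t)                             # toy periodic target
    model = PeriodicMLP()
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        ((model(t) - y) ** 2).mean().backward()
        opt.step()
    # Extrapolation test: evaluate model(t) well outside [0, 2*pi].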

Arxiv · 1d

Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models

  • Researchers introduce Athena-PRM, a multimodal process reward model for evaluating reward scores in complex reasoning problems efficiently.
  • Conventional methods for creating high-performance PRMs require time-consuming step-level annotations, making them costly to build.
  • Athena-PRM leverages prediction consistency between weak and strong completers to generate high-quality process-labeled data effectively.
  • With just 5,000 samples, Athena-PRM shows remarkable effectiveness across different scenarios and benchmarks.
  • Two strategies, ORM initialization and up-sampling for negative data, are developed to boost PRM performance.
  • The approach is validated in verification, direct evaluation of reasoning step correctness, and reward ranked fine-tuning scenarios.
  • Athena-PRM consistently achieves superior performance across various benchmarks, improving results by 10.2 points on WeMath and 7.1 points on MathVista under test-time scaling.
  • It sets new state-of-the-art results on VisualProcessBench, outperforming the previous SoTA by 3.9 F1 points and demonstrating accurate assessment of reasoning steps.
  • Athena-7B, developed using Athena-PRM as the reward model, surpasses baseline performance significantly on five benchmarks.
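
The consistency-based labeling idea can be sketched as: roll out several completions from a weak and a strong completer at each reasoning prefix, estimate each one's success rate, and keep a step label only when the two agree. In the sketch below, complete and is_correct are hypothetical placeholders for model sampling and answer checking, and the thresholds are illustrative.

    def step_success_rate(complete, is_correct, prefix, n=8):
        # Monte Carlo estimate of reaching a correct final answer from this reasoning prefix.
        return sum(is_correct(complete(prefix)) for _ in range(n)) / n

    def label_step(weak_complete, strong_complete, is_correct, prefix, thr=0.5):
        # Returns 1 / 0 when the weak and strong completers agree, otherwise None (discard).
        weak = step_success_rate(weak_complete, is_correct, prefix)
        strong = step_success_rate(strong_complete, is_correct, prefix)
        if weak >= thr and strong >= thr:
            return 1      # step judged correct by both completers
        if weak < thr and strong < thr:
            return 0      # step judged incorrect by both completers
        return None       # disagreement: drop the sample to keep labels clean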

Arxiv · 1d

STOAT: Spatial-Temporal Probabilistic Causal Inference Network

  • STOAT (Spatial-Temporal Probabilistic Causal Inference Network) is a novel framework for probabilistic forecasting in spatial-temporal causal time series (STC-TS) with region-specific temporal observations driven by causally relevant covariates.
  • The proposed method incorporates a spatial relation matrix to encode interregional dependencies, improving spatially informed causal effect estimation and calibrated uncertainty modeling.
  • STOAT utilizes deep probabilistic models to estimate distribution parameters and explores multiple output distributions to capture region-specific variability.
  • Experiments on COVID-19 data from six countries show that STOAT outperforms existing probabilistic forecasting models like DeepAR, DeepVAR, and Deep State Space Model, especially in regions with strong spatial dependencies.
  • The framework bridges causal inference and geospatial probabilistic forecasting, offering a versatile approach for complex spatial-temporal tasks such as epidemic management.
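
One generic way to combine a spatial relation matrix with a deep probabilistic forecaster is to mix each region's series with its neighbours' through the matrix, encode the result with an RNN, and emit per-region distribution parameters. The PyTorch sketch below follows that pattern with invented layer sizes; it is a rough analogue, not STOAT itself.

    import torch
    import torch.nn as nn

    class SpatialTemporalForecaster(nn.Module):
        def __init__(self, relation_matrix, hidden=32):
            super().__init__()
            self.register_buffer("W", relation_matrix)      # (regions, regions), row-normalised
            self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 2)                # mean and log-scale per region

        def forward(self, x):                               # x: (regions, time)
            mixed = self.W @ x                              # spatially mixed series
            seq = torch.stack([x, mixed], dim=-1)           # (regions, time, 2)
            h, _ = self.rnn(seq)
            mu, log_sigma = self.head(h[:, -1]).unbind(-1)
            return torch.distributions.Normal(mu, log_sigma.exp())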

Arxiv · 1d

MOORL: A Framework for Integrating Offline-Online Reinforcement Learning

  • Offline RL addresses challenges in DRL by learning from pre-collected datasets.
  • MOORL is a hybrid framework combining offline and online RL for efficient learning.
  • Meta Offline-Online RL utilizes a meta-policy to adapt across offline and online trajectories.
  • MOORL improves exploration while leveraging offline data for robust initialization.
  • The hybrid approach enhances exploration by combining strengths of offline and online data.
  • MOORL achieves stable Q-function learning without added complexity.
  • Experiments on 28 tasks validate MOORL's effectiveness over existing baselines.
  • MOORL shows consistent improvements in performance.
  • The framework has potential for practical applications with minimal computational overhead.
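
The central mechanic of hybrid offline-online training is easy to sketch: each gradient step draws part of its minibatch from the fixed offline dataset and part from the online replay buffer. The snippet shows just this mixing step; agent.update in the comments is a hypothetical stand-in, and MOORL's meta-policy machinery is not reproduced here.

    import random

    def mixed_batch(offline_data, online_buffer, batch_size=256, offline_ratio=0.5):
        # Mix pre-collected and freshly collected transitions into one minibatch.
        n_off = int(batch_size * offline_ratio)
        batch = random.sample(offline_data, n_off)
        batch += random.sample(online_buffer, min(batch_size - n_off, len(online_buffer)))
        return batch

    # for step in range(num_steps):
    #     online_buffer.append(env_step(policy))                   # collect online experience
    #     agent.update(mixed_batch(offline_data, online_buffer))   # hypothetical update call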

Arxiv · 1d

Beyond Overconfidence: Foundation Models Redefine Calibration in Deep Neural Networks

  • Deep neural networks need reliable uncertainty calibration for safe deployment in critical applications.
  • Foundation models like ConvNeXt, EVA, and BEiT have improved predictive performance but their calibration properties are not well understood.
  • A study investigated the calibration behavior of foundation models, revealing insights that question existing beliefs.
  • Empirical analysis found that foundation models are often underconfident in in-distribution predictions, leading to higher calibration errors.
  • However, these models show improved calibration under distribution shifts.
  • Foundation models respond well to post-hoc calibration techniques in in-distribution scenarios, helping to mitigate the underconfidence bias (a minimal temperature-scaling example follows this list).
  • But the effectiveness of these techniques diminishes under severe distribution shifts and can sometimes yield counterproductive results.
  • The study highlights the intricate effects of architectural and training advancements on calibration, challenging the notion of continuous improvement.
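
Temperature scaling is the most common post-hoc calibration technique: a single temperature is fitted on held-out logits by minimising negative log-likelihood, and for an underconfident model the fitted T typically ends up below 1, sharpening its probabilities. A minimal PyTorch version:

    import torch

    def fit_temperature(logits, labels, steps=200):
        # logits: (N, num_classes) on a held-out set; labels: (N,) integer class ids.
        log_t = torch.zeros(1, requires_grad=True)
        opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=steps)

        def closure():
            opt.zero_grad()
            loss = torch.nn.functional.cross_entropy(logits / log_t.exp(), labels)
            loss.backward()
            return loss

        opt.step(closure)
        return log_t.exp().item()

    # T = fit_temperature(val_logits, val_labels)
    # calibrated_probs = (test_logits / T).softmax(dim=-1)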

Arxiv · 1d

Accelerating Large-Scale Regularized High-Order Tensor Recovery

  • Existing tensor recovery methods do not consider the impact of tensor scale variations on structural characteristics.
  • Current studies face computational challenges when dealing with large-scale high-order tensor data.
  • New algorithms leveraging Krylov subspace iteration, block Lanczos bidiagonalization process, and random projection strategies are introduced for low-rank tensor approximation.
  • The algorithms establish theoretical bounds on the accuracy of the approximation error estimate.
  • A novel nonconvex modeling framework is created for large-scale tensor recovery, utilizing a new regularization paradigm to encode structural priors.
  • Unified nonconvex models and optimization algorithms are developed for various high-order tensor recovery tasks in unquantized and quantized scenarios.
  • Randomized LRTA schemes are integrated into computations to make the proposed algorithms practical and efficient for large-scale tensor data.
  • Extensive experiments on large-scale tensors show the effectiveness and superiority of the proposed method over state-of-the-art approaches.
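
Random projection for low-rank approximation works by sketching the range of a matrix (or a matricised tensor) with a Gaussian test matrix and projecting onto that subspace. Below is a textbook randomized SVD in NumPy, shown for a single unfolding rather than the paper's full high-order tensor schemes.

    import numpy as np

    def randomized_svd(A, rank, oversample=10, power_iters=2):
        # Halko-Martinsson-Tropp style randomized low-rank approximation of A.
        rng = np.random.default_rng(0)
        Y = A @ rng.normal(size=(A.shape[1], rank + oversample))
        for _ in range(power_iters):            # power iterations sharpen the spectrum
            Y = A @ (A.T @ Y)
        Q, _ = np.linalg.qr(Y)                  # orthonormal basis for the sketched range
        Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
        return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

    # Mode-1 unfolding of a 3-way tensor T, then a rank-20 approximation:
    # U, s, Vt = randomized_svd(T.reshape(T.shape[0], -1), rank=20)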

Arxiv · 1d

SparseSSM: Efficient Selective Structured State Space Models Can Be Pruned in One-Shot

  • State-space language models such as Mamba have billions of parameters, which hinders deployment.
  • SparseSSM is introduced as a training-free pruning framework for state space architectures.
  • SparseSSM extends the optimal brain surgeon framework to state space models.
  • The algorithm calculates saliency scores to identify redundant parameters and guide pruning.
  • Component sensitivity analysis is used to identify where redundancy exists in the architecture.
  • SparseSSM can be extended to semi-structured and structured sparsity.
  • Empirical results show that 50% of SSM weights can be pruned without fine-tuning, maintaining accuracy.
  • No zero-shot accuracy loss is observed with SparseSSM, setting a new benchmark for pruning Mamba-based LLMs.
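
One-shot pruning in its simplest form scores every weight and zeroes the lowest-scoring fraction in a single pass; optimal-brain-surgeon-style rules refine the score with curvature information. The toy PyTorch snippet below uses a magnitude (optionally Fisher-weighted) saliency and is far simpler than SparseSSM's actual saliency computation for SSM parameters; the layer.A_log tensor named in the comment is hypothetical.

    import torch

    def one_shot_prune(weight, sparsity=0.5, fisher_diag=None):
        # Saliency: w^2, optionally scaled by a diagonal Fisher estimate as a cheap
        # second-order proxy; the lowest-saliency `sparsity` fraction is zeroed.
        saliency = weight.pow(2)
        if fisher_diag is not None:
            saliency = saliency * fisher_diag
        k = int(weight.numel() * sparsity)
        threshold = saliency.flatten().kthvalue(k).values
        mask = (saliency > threshold).to(weight.dtype)
        return weight * mask, mask

    # pruned, mask = one_shot_prune(layer.A_log.data)   # hypothetical SSM parameter tensor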

Arxiv · 1d

In-Context Bias Propagation in LLM-Based Tabular Data Generation

  • The study focuses on Large Language Models (LLMs) used for generating tabular data with in-context learning.
  • LLMs are crucial for data augmentation in scenarios with limited data availability.
  • Previous research showed that augmenting underrepresented groups with LLM-generated samples can improve task performance.
  • However, this enhancement often assumes access to unbiased in-context examples.
  • Real-world data is typically noisy and skewed, differing from ideal scenarios.
  • The research delves into how biases within in-context examples impact the distribution of synthetic tabular data.
  • Even subtle biases in in-context examples can cause significant global statistical distortions.
  • An adversarial situation is introduced where a malicious contributor injects bias via in-context examples, jeopardizing classifier fairness for a specific subgroup.
  • The study uncovers a vulnerability in LLM-based data generation pipelines when using in-context prompts in sensitive domains.
