ML News

Source: Arxiv

Trustworthiness of Stochastic Gradient Descent in Distributed Learning

  • Distributed learning (DL) uses multiple nodes to accelerate training, enabling efficient optimization of large-scale models.
  • Stochastic Gradient Descent (SGD) is a key optimization algorithm in DL but communication bottlenecks limit scalability and efficiency.
  • Compressed SGD techniques are used to reduce communication overhead, but they raise trustworthiness concerns, including gradient inversion and membership inference attacks.
  • Empirical studies show that compressed SGD is more resistant to privacy leakage than uncompressed SGD, and they call into question the reliability of membership inference attacks as a metric for assessing privacy risks in distributed learning.
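
For readers unfamiliar with compressed SGD, here is a minimal sketch of one widely used scheme, top-k gradient sparsification, in which each worker transmits only its largest-magnitude gradient entries before the server averages them and takes an SGD step. It is an illustrative assumption, not necessarily the compression method the paper analyzes.

    import numpy as np

    def topk_compress(grad, k):
        # Keep only the k largest-magnitude gradient entries (one common
        # compressed-SGD scheme); the rest are dropped before communication.
        flat = grad.ravel()
        idx = np.argpartition(np.abs(flat), -k)[-k:]
        return idx, flat[idx], grad.shape

    def decompress(idx, values, shape):
        out = np.zeros(int(np.prod(shape)))
        out[idx] = values
        return out.reshape(shape)

    def server_sgd_step(params, worker_grads, lr=0.01, k=100):
        # Average the decompressed worker gradients and take one SGD step.
        restored = [decompress(*topk_compress(g, k)) for g in worker_grads]
        return params - lr * np.mean(restored, axis=0)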

Source: Arxiv

Efficient Active Imitation Learning with Random Network Distillation

  • Developing agents for complex and underspecified tasks, where no clear objective exists, remains challenging but offers many opportunities.
  • The article introduces Random Network Distillation DAgger (RND-DAgger) as an active imitation learning method that uses a learned state-based out-of-distribution measure to trigger interventions.
  • RND-DAgger reduces the need for constant expert input during training and outperforms traditional imitation learning and other active approaches in 3D video games and a robotic locomotion task.
  • The method effectively limits expert querying by intervening only when necessary, improving the efficiency of active imitation learning.
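
As a rough illustration of the trigger mechanism, the sketch below pairs a frozen, randomly initialized target network with a predictor network trained on visited states; states where the prediction error is high are treated as out-of-distribution, and only then is the expert queried. Network sizes, the threshold, and the policies are assumed placeholders, not the paper's settings.

    import torch
    import torch.nn as nn

    def mlp(in_dim, out_dim):
        return nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

    state_dim, embed_dim = 16, 32
    target = mlp(state_dim, embed_dim)       # fixed random network
    predictor = mlp(state_dim, embed_dim)    # trained to imitate the target
    for p in target.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

    def novelty(state):
        # High prediction error => the state looks out-of-distribution.
        return (predictor(state) - target(state)).pow(2).mean()

    def act(state, learner_policy, expert_policy, threshold=0.1):
        # Query the expert only when the state is judged out-of-distribution.
        if novelty(state).item() > threshold:
            return expert_policy(state)
        return learner_policy(state)

    def update_rnd(states):
        # Fit the predictor on states the agent has already seen.
        loss = (predictor(states) - target(states)).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()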

Source: Arxiv

Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis

  • The paper combines gradient compression methods with adaptive optimizers in federated learning (FL).
  • It introduces specific sketched adaptive federated learning (SAFL) algorithms.
  • Theoretical convergence analyses show reduced communication cost in FL settings.
  • Empirical studies support the effectiveness of SAFL methods on vision and language tasks.
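
The sketch below illustrates the general idea under simple assumptions: each client sends a low-dimensional random projection ("sketch") of its gradient, and the server approximately unsketches the average before applying an Adam-style adaptive update. The projection, unsketching, and update rule are simplified stand-ins, not the SAFL algorithms analyzed in the paper.

    import numpy as np

    d, k = 10_000, 500                          # model dimension, sketch dimension
    rng = np.random.default_rng(0)
    S = rng.normal(size=(k, d)) / np.sqrt(k)    # sketching matrix shared by all clients

    def client_sketch(grad):
        return S @ grad                         # send k numbers instead of d

    def server_adaptive_step(params, sketches, m, v, t,
                             lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        g = S.T @ np.mean(sketches, axis=0)     # approximate unsketched average gradient
        m = b1 * m + (1 - b1) * g               # Adam-style first and second moments
        v = b2 * v + (1 - b2) * g ** 2
        m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
        return params - lr * m_hat / (np.sqrt(v_hat) + eps), m, v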

Source: Arxiv

Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training

  • Dramatic increases in the capabilities of neural network models in recent years are driven by scaling model size, training data, and corresponding computational resources.
  • To effectively scale model size, training data, and total computation in large-scale distributed training, careful consideration of hardware configuration and parallelization strategy is critical.
  • An extensive empirical study of large-scale language model training workloads reveals that certain distributed communication strategies, previously considered sub-optimal, can become preferable at certain scales.
  • Scaling the total number of hardware accelerators for large model training yields diminishing returns, even with optimized hardware and parallelization strategies, resulting in poor marginal performance per additional unit of power or GPU-hour.

Source: Arxiv

Efficient Spatio-Temporal Signal Recognition on Edge Devices Using PointLCA-Net

  • This paper presents a novel approach that combines PointNet's feature extraction with neuromorphic systems for spatio-temporal signal recognition.
  • The proposed method is a two-stage process: feature extraction with PointNet, followed by a spiking neural encoder-decoder that employs the Locally Competitive Algorithm (LCA).
  • PointLCA-Net achieves high recognition accuracy on spatio-temporal data with a lower energy burden, enhancing computational efficiency on edge devices.
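
For intuition about the second stage, here is a minimal sketch of the Locally Competitive Algorithm: extracted features (e.g. from PointNet) are sparsely encoded by iterating leaky neuron dynamics with lateral inhibition. The dictionary, threshold, and step size are illustrative assumptions, not PointLCA-Net's actual configuration.

    import numpy as np

    def soft_threshold(u, lam):
        return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

    def lca_encode(x, Phi, lam=0.1, tau=10.0, steps=200):
        # x: feature vector (d,); Phi: dictionary (d, n) with unit-norm columns.
        b = Phi.T @ x                             # feed-forward drive
        G = Phi.T @ Phi - np.eye(Phi.shape[1])    # lateral inhibition weights
        u = np.zeros(Phi.shape[1])                # membrane potentials
        for _ in range(steps):
            a = soft_threshold(u, lam)            # sparse activations
            u += (b - u - G @ a) / tau            # LCA dynamics
        return soft_threshold(u, lam)

    rng = np.random.default_rng(0)
    Phi = rng.normal(size=(64, 256))
    Phi /= np.linalg.norm(Phi, axis=0)
    sparse_code = lca_encode(rng.normal(size=64), Phi)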

Source: Arxiv

Broad Critic Deep Actor Reinforcement Learning for Continuous Control

  • A novel hybrid actor-critic reinforcement learning framework is introduced for continuous control.
  • The framework integrates the broad learning system (BLS) with deep neural networks (DNNs).
  • The critic network employs BLS for rapid value estimation via ridge regression.
  • Experimental results show improved training efficiency and accuracy with the proposed framework.
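
To make the critic's closed-form update concrete, the sketch below maps states through random feature and enhancement nodes in the style of a broad learning system and fits the output weights by ridge regression against value targets, which is what makes the value estimate fast to compute. The feature construction, dimensions, and targets are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def broad_features(x, W_feat, W_enh):
        z = np.tanh(x @ W_feat)                   # feature nodes
        h = np.tanh(z @ W_enh)                    # enhancement nodes
        return np.concatenate([z, h], axis=1)

    def ridge_fit(H, y, lam=1e-2):
        # Closed-form ridge regression: W = (H^T H + lam*I)^-1 H^T y
        return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)

    states = rng.normal(size=(1024, 8))           # batch of states
    targets = rng.normal(size=(1024, 1))          # e.g. bootstrapped TD value targets
    W_feat, W_enh = rng.normal(size=(8, 64)), rng.normal(size=(64, 32))
    H = broad_features(states, W_feat, W_enh)
    W_out = ridge_fit(H, targets)                 # rapid value estimation
    values = H @ W_out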

Source: Arxiv

Improving Decoupled Posterior Sampling for Inverse Problems using Data Consistency Constraint

  • Diffusion models have shown strong performance in solving inverse problems through posterior sampling.
  • Decoupled Posterior Sampling methods have been proposed to address errors introduced in earlier sampling steps.
  • The proposed Guided Decoupled Posterior Sampling (GDPS) method integrates a data consistency constraint in the reverse process.
  • Experimental results demonstrate that GDPS achieves state-of-the-art performance, improving accuracy over existing methods.
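
The sketch below shows the generic pattern of enforcing data consistency inside a reverse-diffusion step for an inverse problem y = A(x) + noise: after a standard denoising update, the estimate is nudged by a gradient step on the measurement residual. It is a simplified stand-in for the idea, not the exact GDPS update; denoise_step and A are assumed callables.

    import torch

    def reverse_step_with_data_consistency(x_t, t, y, A, denoise_step, step_size=0.5):
        x_prev = denoise_step(x_t, t)                 # standard reverse-diffusion update
        x_prev = x_prev.detach().requires_grad_(True)
        residual = ((y - A(x_prev)) ** 2).sum()       # data-consistency loss ||y - A(x)||^2
        grad, = torch.autograd.grad(residual, x_prev)
        return (x_prev - step_size * grad).detach()   # pull the estimate toward the measurements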

Source: Arxiv

Expressivity of Representation Learning on Continuous-Time Dynamic Graphs: An Information-Flow Centric Review

  • Continuous-Time Dynamic Graphs (CTDGs) are important in many real-world applications, motivating the need for Graph Neural Networks (GNNs) tailored to CTDGs.
  • This paper provides a comprehensive review of Graph Representation Learning (GRL) on CTDGs, with a focus on Self-Supervised Representation Learning (SSRL).
  • The authors introduce a theoretical framework that analyzes the expressivity of CTDG models through an Information-Flow (IF) lens, quantifying their ability to propagate and encode temporal and structural information.
  • The paper also categorizes existing CTDG methods and explores SSRL methods tailored to CTDGs, such as predictive and contrastive approaches, which can reduce the reliance on labeled data.

Source: Arxiv

Choose Your Explanation: A Comparison of SHAP and GradCAM in Human Activity Recognition

  • Explaining machine learning (ML) models using eXplainable AI (XAI) techniques has become essential to make them more transparent and trustworthy.
  • A comparative analysis of Shapley Additive Explanations (SHAP) and Gradient-weighted Class Activation Mapping (Grad-CAM) methods in human activity recognition (HAR) is presented.
  • The study evaluates these methods on real-world datasets, providing insights into their strengths, limitations, and differences.
  • SHAP and Grad-CAM can complement each other to provide more interpretable and actionable model explanations, enhancing trust and transparency in ML models.
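
For reference, the sketch below implements the Grad-CAM side of the comparison for a typical convolutional model using forward and backward hooks (the SHAP side, which would come from the shap library, is omitted); model and target_layer are assumed to exist, and the layer's activations are assumed to have shape (N, C, H, W).

    import torch
    import torch.nn.functional as F

    def grad_cam(model, target_layer, x, class_idx):
        acts, grads = {}, {}
        h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
        h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
        score = model(x)[:, class_idx].sum()          # class score to explain
        model.zero_grad()
        score.backward()
        h1.remove(); h2.remove()
        weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # pooled gradients per channel
        cam = F.relu((weights * acts["a"]).sum(dim=1))        # weighted activation map
        return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)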

Source: Arxiv

Generative Regression Based Watch Time Prediction for Short-Video Recommendation

  • Watch time prediction (WTP) is a crucial task for short video recommendation systems.
  • A novel approach called Generative Regression (GR) framework is proposed for WTP.
  • GR reformulates WTP as a sequence generation task using structural discretization.
  • GR outperforms existing techniques significantly according to evaluation and A/B testing.
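
As a toy illustration of recasting regression as sequence generation, the sketch below discretizes a watch time into a coarse-then-fine token pair that a decoder could predict autoregressively; this two-level bucketing is an assumed stand-in for the paper's structural discretization.

    def encode_watch_time(seconds, coarse=60, fine=10):
        # e.g. 137 s -> [2, 1]: coarse bucket (minute) then fine 10-second offset.
        return [int(seconds // coarse), int((seconds % coarse) // fine)]

    def decode_tokens(tokens, coarse=60, fine=10):
        # Reconstruct a watch-time estimate (midpoint of the fine bucket).
        return tokens[0] * coarse + tokens[1] * fine + fine / 2

    assert encode_watch_time(137) == [2, 1]
    print(decode_tokens([2, 1]))   # 135.0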

Source: Arxiv

DFF: Decision-Focused Fine-tuning for Smarter Predict-then-Optimize with Limited Data

  • Decision-Focused Fine-tuning (DFF) is a novel framework that combines decision-focused learning (DFL) with the predict-then-optimize (PO) approach.
  • DFF addresses challenges such as deviation from physical significance and non-differentiable or black-box models.
  • It maintains the proximity of the fine-tuned model to the original predictive model within a defined trust region.
  • DFF demonstrates improved decision performance and adaptability to a broad range of PO tasks in diverse scenarios.
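
The sketch below shows one way the trust-region idea above can be expressed as a training objective: a decision-focused loss plus a proximity penalty that keeps predictions close to those of the frozen original predictive model. The penalty form and weight are assumptions, not the paper's exact formulation.

    import torch

    def dff_loss(model, frozen_model, x, decision_loss_fn, rho=1.0):
        pred = model(x)
        with torch.no_grad():
            anchor = frozen_model(x)                  # original predictive model, frozen
        proximity = ((pred - anchor) ** 2).mean()     # soft trust-region penalty
        return decision_loss_fn(pred) + rho * proximity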

Source: Arxiv

Enhancing Offline Reinforcement Learning with Curriculum Learning-Based Trajectory Valuation

  • The success of deep reinforcement learning (DRL) relies on the availability and quality of training data, often requiring extensive interactions with specific environments.
  • Offline reinforcement learning (RL) offers a solution for real-world scenarios where data collection is costly or risky, using data collected by domain experts to search for a batch-constrained optimal policy.
  • Transition Scoring (TS) is introduced as a method to assign scores to transitions based on their similarity to the target domain in mixed datasets, addressing the problem of source-target domain mismatch in offline RL.
  • Curriculum Learning-Based Trajectory Valuation (CLTV) effectively leverages transition scores to identify and prioritize high-quality trajectories, enhancing the performance and transferability of policies learned by offline RL algorithms.
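
A rough sketch of the scoring-and-ordering idea under simple assumptions: each source-domain transition is scored by a negative nearest-neighbour distance to target-domain transitions in feature space, trajectories inherit the mean score of their transitions, and the curriculum feeds the best-matching trajectories to the offline RL learner first. The similarity measure is an assumption, not the paper's exact Transition Scoring.

    import numpy as np

    def transition_scores(source_feats, target_feats):
        # source_feats: (n, d); target_feats: (m, d). Closer to target => higher score.
        d2 = ((source_feats[:, None, :] - target_feats[None, :, :]) ** 2).sum(-1)
        return -np.sqrt(d2.min(axis=1))

    def curriculum_order(trajectories, feats_per_traj, target_feats):
        scores = [transition_scores(f, target_feats).mean() for f in feats_per_traj]
        order = np.argsort(scores)[::-1]      # highest-scoring trajectories first
        return [trajectories[i] for i in order]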

Source: Arxiv

FedRIR: Rethinking Information Representation in Federated Learning

  • Mobile and Web-of-Things (WoT) devices generate vast amounts of data for machine learning applications.
  • Federated Learning (FL) allows clients to collaboratively train a shared model without transferring private data.
  • Existing FL methods prioritize either global generalization or local personalization, limiting the potential of diverse client data.
  • The proposed FedRIR framework enhances global generalization and local personalization by rethinking information representation.

Source: Arxiv

Seismic Facies Analysis: A Deep Domain Adaptation Approach

  • Deep neural networks (DNNs) often fail to generalize on test data sampled from different input distributions when labeled data is scarce.
  • Unsupervised Deep Domain Adaptation (DDA) techniques have been proven useful in addressing this challenge, particularly when labeled data is unavailable and distribution shifts are observed in the target domain.
  • In a recent study, seismic images of the F3 block 3D dataset from offshore Netherlands (source domain) and Penobscot 3D survey data from Canada (target domain) were used to evaluate a deep neural network architecture named EarthAdaptNet (EAN).
  • The EAN achieved high accuracy in semantically segmenting the seismic images and demonstrated the potential for accurate seismic facies classification.

Source: Arxiv

Deep Learning-Based Automatic Diagnosis System for Developmental Dysplasia of the Hip

  • Researchers have developed a deep learning-based automatic diagnosis system for Developmental Dysplasia of the Hip (DDH).
  • The system accurately identifies anatomical keypoints from pelvic radiographs and calculates key radiological angles.
  • It demonstrated superior consistency in angle measurements compared to a cohort of experienced orthopedists.
  • This AI-powered solution reduces variability and potential errors, providing clinicians with a more reliable tool for DDH diagnosis.
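
To illustrate the keypoints-to-angle step, the sketch below computes a radiological angle (e.g. the acetabular index, the angle between Hilgenreiner's line and the acetabular roof line) from detected landmark coordinates; the landmark choices and coordinates are assumptions for illustration, not the system's measurement protocol.

    import numpy as np

    def angle_between(p1, p2, q1, q2):
        # Angle in degrees between line p1->p2 and line q1->q2.
        v1 = np.asarray(p2, dtype=float) - np.asarray(p1, dtype=float)
        v2 = np.asarray(q2, dtype=float) - np.asarray(q1, dtype=float)
        cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
        return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

    hilgenreiner = ((0.0, 0.0), (10.0, 0.0))                 # hypothetical detected keypoints
    acetabular_roof = ((10.0, 0.0), (13.0, -2.0))
    print(angle_between(*hilgenreiner, *acetabular_roof))    # acetabular index in degrees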
