techminis

A naukri.com initiative

ML News

Arxiv

22h

Global explainability of a deep abstaining classifier

  • Researchers have developed a global explainability method for a deep abstaining classifier (DAC) used in the histology prediction task of cancer pathology reports.
  • The DAC framework allows the model to abstain on ambiguous or confusing cases, achieving high accuracy on retained samples but with decreased coverage.
  • By utilizing a local explainability technique, researchers were able to identify sources of errors and gain contextual reasoning for individual predictions.
  • The study suggests strategies such as exclusion criteria and focused annotation to improve the DAC's performance in complex real-world implementations.
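
The accuracy-versus-coverage trade-off described above can be illustrated with a minimal sketch. It assumes a plain confidence threshold as a stand-in for abstention; this is not the DAC training procedure itself, and the function name, threshold, and toy data are all hypothetical.

```python
def selective_metrics(confidences, predictions, labels, threshold):
    """Return (coverage, accuracy_on_retained) for a confidence-threshold abstainer."""
    retained = [
        (pred, label)
        for conf, pred, label in zip(confidences, predictions, labels)
        if conf >= threshold  # below the threshold the model abstains
    ]
    coverage = len(retained) / len(labels)
    if not retained:
        return coverage, None  # accuracy is undefined when every sample is abstained
    correct = sum(pred == label for pred, label in retained)
    return coverage, correct / len(retained)


# Hypothetical toy data: five reports, two of them low-confidence and wrong.
confidences = [0.95, 0.55, 0.90, 0.40, 0.85]
predictions = ["a", "b", "a", "c", "b"]
labels = ["a", "a", "a", "b", "b"]

full = selective_metrics(confidences, predictions, labels, threshold=0.0)
strict = selective_metrics(confidences, predictions, labels, threshold=0.8)
print(full)    # coverage and accuracy with no abstention
print(strict)  # coverage and accuracy when low-confidence cases are dropped
```

In this toy case, raising the threshold removes the two misclassified reports, so accuracy on the retained samples rises while coverage falls.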

Arxiv

22h

Prompting Forgetting: Unlearning in GANs via Textual Guidance

  • State-of-the-art generative models pose ethical and legal challenges to service providers.
  • Text-to-Unlearn is a framework that selectively unlearns concepts from pre-trained GANs using text prompts.
  • It enables feature unlearning, identity unlearning, and tasks like expression and multi-attribute removal.
  • Text-to-Unlearn offers a scalable and efficient solution without requiring additional datasets or fine-tuning.

Arxiv

22h

Gradient-free Continual Learning

  • Continual learning (CL) is the challenge of training neural networks on sequential tasks without catastrophic forgetting.
  • Traditional CL approaches rely on gradient-based optimization using stochastic gradient descent (SGD) or its variants.
  • The limitation of gradient-based CL arises when previous data is not available, resulting in uncontrolled parameter changes and significant forgetting of previously learned tasks.
  • This work explores the use of gradient-free optimization methods as a robust alternative to address forgetting in CL.
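
To make "gradient-free optimization" concrete, here is a minimal sketch of one generic member of that family: a simple random-search hill climber that only evaluates the loss and never computes a gradient. It is not the method from the paper; the function names, the quadratic toy loss, and all hyperparameters are assumptions for illustration.

```python
import random


def random_search(loss, x0, steps=2000, sigma=0.1, seed=0):
    """Minimize `loss` using only function evaluations (no gradients)."""
    rng = random.Random(seed)
    best, best_loss = list(x0), loss(x0)
    for _ in range(steps):
        # Propose a Gaussian perturbation of the current best parameters.
        candidate = [xi + rng.gauss(0.0, sigma) for xi in best]
        cand_loss = loss(candidate)
        # Keep the candidate only if it strictly lowers the loss.
        if cand_loss < best_loss:
            best, best_loss = candidate, cand_loss
    return best, best_loss


def quadratic(x):
    """Hypothetical toy loss with its minimum at (1, 1, 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


start = [0.0, 0.0, 0.0]
solution, final_loss = random_search(quadratic, start)
print(solution, final_loss)
```

Because a candidate is accepted only when it improves the loss, the tracked loss never increases from one step to the next.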

Arxiv

22h

AutoML Benchmark with shorter time constraints and early stopping

  • Automated Machine Learning (AutoML) frameworks are evaluated using the AutoML Benchmark (AMLB).
  • AMLB proposes evaluating frameworks under 1- and 4-hour time budgets.
  • This work argues for considering shorter time constraints in the benchmark for practical value.
  • Evaluations on 104 tasks show consistent rankings across time constraints and greater variety in model performance with early stopping.

Arxiv

22h

Dynamic Graph Structure Estimation for Learning Multivariate Point Process using Spiking Neural Networks

  • Researchers have introduced the Spiking Dynamic Graph Network (SDGN) for modeling and predicting temporal point processes (TPPs).
  • SDGN leverages the temporal processing capabilities of spiking neural networks (SNNs) and spike-timing-dependent plasticity (STDP) to dynamically estimate underlying spatio-temporal functional graphs.
  • Unlike existing methods, SDGN adapts to any dataset by learning dynamic spatio-temporal dependencies directly from the event data, enhancing generalizability and robustness.
  • Evaluations on synthetic and real-world datasets show that SDGN achieves superior predictive accuracy while maintaining computational efficiency.

Arxiv

22h

R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks

  • This paper presents the Robust Recurrent Deep Network (R2DN), a scalable parameterization of robust recurrent neural networks.
  • R2DNs are constructed as a feedback interconnection of a linear time-invariant system and a 1-Lipschitz deep feedforward network, making the models stable and robust to small input perturbations by design.
  • The parameterization of R2DNs is similar to recurrent equilibrium networks (RENs) but does not require iterative solution of an equilibrium layer at each time-step, resulting in faster model evaluation and backpropagation on GPUs.
  • Comparisons of R2DNs to RENs on different problems show that R2DNs train and run inference faster with similar test set performance, and that they scale better than RENs with respect to model expressivity.

Arxiv

22h

Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding

  • A comprehensive framework that enhances Retrieval-Augmented Generation (RAG) systems is presented.
  • The framework integrates Policy-Optimized Retrieval-Augmented Generation (PORAG) and Adaptive Token-Layer Attention Scoring (ATLAS).
  • The techniques improve the utilization and relevance of retrieved content, enhancing factual accuracy and response quality.
  • The framework demonstrates efficiency, scalability, and reduced hallucinations in RAG systems.

Arxiv

22h

Flexible and Explainable Graph Analysis for EEG-based Alzheimer's Disease Classification

  • Alzheimer's Disease is a common form of dementia that causes a decline in memory, reasoning, and behavior.
  • Researchers have utilized EEG data to identify biomarkers for diagnosing Alzheimer's Disease.
  • A Flexible and Explainable Gated Graph Convolutional Network (GGCN) was proposed for classification.
  • The research achieved high efficacy in distinguishing Alzheimer's patients from healthy individuals.

Arxiv

22h

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design

  • Mixture-of-Experts (MoE) has successfully scaled up models while maintaining nearly constant computing costs.
  • Efficiency of MoE is challenging to achieve due to imbalanced expert activation and communication overhead.
  • The paper proposes a collaboration-constrained routing (C2R) strategy to improve expert utilization and reduce communication costs.
  • Experiments show an average performance improvement of 0.51% and 0.33% on two MoE models across ten NLP benchmarks.
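
The imbalanced expert activation mentioned above can be seen in a minimal sketch of plain top-k routing. This shows only the baseline behavior that routing strategies try to improve; it does not implement C2R, and the router scores and function name are hypothetical.

```python
from collections import Counter


def top_k_route(scores, k=2):
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return ranked[:k]


# Hypothetical router scores: rows are tokens, columns are 4 experts.
router_scores = [
    [0.70, 0.20, 0.06, 0.04],
    [0.60, 0.25, 0.10, 0.05],
    [0.50, 0.10, 0.30, 0.10],
    [0.40, 0.30, 0.20, 0.10],
]

assignments = [top_k_route(row) for row in router_scores]
# Count how many tokens each expert receives.
load = Counter(expert for experts in assignments for expert in experts)
print(assignments)
print(load)
```

With these scores, expert 0 is selected for every token while expert 3 is never selected, which is the kind of skew that leaves some devices idle under expert parallelism.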

Arxiv

22h

UniFault: A Fault Diagnosis Foundation Model from Bearing Data

  • Machine fault diagnosis (FD) is a critical task for predictive maintenance, enabling early fault detection and preventing unexpected failures.
  • Existing FD models have limited generalization across diverse datasets, making them operation-specific.
  • UniFault is a foundation model for fault diagnosis that addresses the challenges of diverse and heterogeneous FD datasets.
  • UniFault achieves state-of-the-art performance, setting a new benchmark for fault diagnosis models in predictive maintenance.

Arxiv

22h

De Novo Molecular Design Enabled by Direct Preference Optimization and Curriculum Learning

  • De novo molecular design has extensive applications in drug discovery and materials science.
  • Efficient molecular generation and screening methods are essential for accelerating drug discovery and reducing costs.
  • Direct Preference Optimization (DPO) is adopted from NLP to optimize molecular properties by maximizing the likelihood difference between high- and low-quality molecules.
  • The proposed method achieved excellent scores on the GuacaMol Benchmark and demonstrated practical efficacy in target protein binding experiments.
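
For readers unfamiliar with DPO, the standard per-pair loss can be sketched as follows. The sketch assumes sequence-level log-likelihoods (for example, of molecule strings) are already available from the trained policy and from a frozen reference model. The function name, argument names, and the default `beta` are hypothetical, and the paper's exact formulation may differ.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (preferred, dispreferred) pair.

    Each argument is a sequence log-likelihood; `ref_*` values come from
    the frozen reference model.
    """
    # How much more the policy favors the preferred sample than the reference does.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)) rewritten as log(1 + exp(-beta * margin)).
    return math.log1p(math.exp(-beta * margin))


# Policy identical to the reference: zero margin, so the loss equals log(2).
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Policy raises the preferred sample and lowers the dispreferred one: lower loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(neutral, improved)
```

Minimizing this loss widens the likelihood gap between high- and low-quality samples relative to the reference model, which matches the description above.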

Arxiv

22h

Cause or Trigger? From Philosophy to Causal Modeling

  • The literature on causal reasoning, causal modeling, and philosophy says little about the role of triggers.
  • The paper focuses on describing triggers and causes in the metaphysical sense and differentiating them.
  • A mathematical model and the Cause-Trigger algorithm are proposed to determine whether a process is a cause or a trigger of an effect.
  • The algorithm is demonstrated on the climatological data of two recent cyclones, Freddy and Zazu, successfully detecting triggers for high wind speed.

Arxiv

22h

CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models

  • Language models often struggle with cross-mode knowledge retrieval, i.e., accessing knowledge learned in one format when queried in another.
  • Models trained on multiple data sources show reduced accuracy when retrieving knowledge in a different format from their original training mode.
  • A controlled study of random token sequence memorization across different modes quantitatively investigates this limitation.
  • CASCADE, a novel pretraining algorithm using cascading datasets with varying sequence lengths, outperforms dataset rewriting approaches and enhances language models' cross-mode knowledge retrieval.

Arxiv

22h

Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning

  • Researchers propose a new approach called Probabilistic Curriculum Learning (PCL) for goal-based reinforcement learning (RL).
  • The PCL algorithm suggests goals for RL agents in continuous control and navigation tasks.
  • The method enables agents to learn skills progressively, similar to how humans learn.
  • Automating goal creation in RL remains a challenge that PCL aims to address.

Arxiv

22h

A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics

  • This paper introduces a model-based framework for continuous-time policy evaluation in reinforcement learning, using both Brownian and Lévy noise to model stochastic dynamics affected by rare and extreme events.
  • The approach formulates the problem as solving a partial integro-differential equation (PIDE) to compute the value function, with unknown coefficients.
  • The challenge is accurately recovering the coefficients, especially for Lévy processes with heavy tail effects, which is addressed using a robust numerical approach combining maximum likelihood estimation and iterative tail correction.
  • The method is demonstrated to be effective in recovering heavy-tailed Lévy dynamics and is verified through theoretical error analysis in policy evaluation.
