techminis
A naukri.com initiative

ML News

Arxiv · 1w · 214 reads · Image Credit: Arxiv

Neural Approximate Mirror Maps for Constrained Diffusion Models

  • Diffusion models often fail to satisfy constraints that are inherent in their training data.
  • A new approach called neural approximate mirror maps (NAMMs) is proposed for handling general, possibly non-convex constraints in diffusion models.
  • NAMMs learn an approximate mirror map that transforms data into an unconstrained space, enabling reliable generation of valid synthetic data and the solution of constrained inverse problems.
  • Experimental results demonstrate improved constraint satisfaction for NAMM-based mirror diffusion models (MDMs) compared to unconstrained diffusion models (a toy sketch of the mirror-map idea follows this list).
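
A toy sketch of the mirror-map mechanism (not the authors' code): for the simple convex constraint x > 0 there is a classical analytic mirror pair, log and exp, and a Gaussian fit stands in for the diffusion model trained in the mirrored space. NAMMs replace both maps with learned networks so that general, possibly non-convex constraint sets can be handled.

    import numpy as np

    rng = np.random.default_rng(0)

    def g(x):      # mirror map: constrained space -> unconstrained space
        return np.log(x)

    def g_inv(y):  # inverse mirror map: back onto the constraint set
        return np.exp(y)

    # "Training data" satisfying the constraint x > 0.
    data = rng.lognormal(mean=0.0, sigma=0.5, size=(1024, 2))

    # Stand-in for a diffusion model trained in the mirrored space:
    # fit a Gaussian to g(data) and sample from it.
    y = g(data)
    mu, cov = y.mean(axis=0), np.cov(y, rowvar=False)
    samples_mirror = rng.multivariate_normal(mu, cov, size=1024)

    # Mapping back through g_inv guarantees every sample satisfies x > 0.
    samples = g_inv(samples_mirror)
    assert (samples > 0).all()
    print("min coordinate of generated samples:", samples.min())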

Read Full Article

12 Likes

Arxiv · 1w · 348 reads · Image Credit: Arxiv

DeciMamba: Exploring the Length Extrapolation Potential of Mamba

  • Long-range sequence processing poses a significant challenge for Transformers due to their quadratic complexity in input length.
  • Mamba, an alternative to Transformers, demonstrates high performance and achieves Transformer-level capabilities with fewer computational resources.
  • The length-generalization capabilities of Mamba are found to be relatively limited.
  • DeciMamba, a context-extension method designed for Mamba, enables a trained model to extrapolate well to longer context lengths without additional training (a hedged token-pruning sketch follows this list).
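
A hedged sketch of the underlying idea, context compression by token pruning; in the real method the importance scores come from Mamba's learned selective gating, while here they are random stand-ins.

    import numpy as np

    rng = np.random.default_rng(0)

    def prune_tokens(tokens, importance, keep_ratio=0.5):
        """Keep the most important tokens, preserving their original order."""
        k = max(1, int(len(tokens) * keep_ratio))
        keep = np.sort(np.argsort(importance)[-k:])  # top-k positions, in order
        return tokens[keep]

    seq = rng.normal(size=(8192, 64))  # a context far longer than training length

    # Pruning between layers shrinks the effective sequence the SSM state must
    # track, which is how a model trained on short contexts can digest long ones.
    for layer in range(3):
        scores = rng.random(len(seq))  # stand-in for learned per-token importance
        seq = prune_tokens(seq, scores)
        print(f"after layer {layer}: {len(seq)} tokens")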

Read Full Article

20 Likes

Arxiv · 1w · 88 reads · Image Credit: Arxiv

Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs

  • Graph Neural Networks (GNNs) need reliable tools for explaining their predictions.
  • Existing faithfulness metrics are not interchangeable.
  • Optimizing for faithfulness may not always be a sensible design goal for regular GNN architectures.
  • Faithfulness is tightly linked to out-of-distribution generalization (a minimal sufficiency-style check is sketched after this list).
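
One way to make faithfulness concrete is a sufficiency-style check: how much the prediction moves when the input graph is reduced to the explanation subgraph. The tiny mean-aggregation model and random explanation mask below are stand-ins, not the architectures or explainers studied in the paper.

    import numpy as np

    def gnn_predict(adj, feats, w):
        """One round of mean neighbor aggregation plus a linear readout."""
        deg = adj.sum(axis=1, keepdims=True).clip(min=1)
        h = adj @ feats / deg
        return float((h @ w).mean())  # a single graph-level score

    rng = np.random.default_rng(0)
    n = 6
    adj = (rng.random((n, n)) < 0.5).astype(float)
    adj = np.triu(adj, 1)
    adj = adj + adj.T                           # undirected, no self-loops
    feats = rng.normal(size=(n, 4))
    w = rng.normal(size=4)

    mask = rng.random((n, n)) < 0.5             # stand-in for an explainer's mask
    sub_adj = adj * (mask & mask.T)             # keep only "explanation" edges

    full = gnn_predict(adj, feats, w)
    sub = gnn_predict(sub_adj, feats, w)
    print("sufficiency gap:", abs(full - sub))  # small gap = faithful explanation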

Read Full Article

5 Likes

Arxiv · 1w · 189 reads · Image Credit: Arxiv

Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs

  • MNO (Mamba Neural Operator) is a novel framework that enhances neural operator-based techniques for solving PDEs.
  • MNO establishes a theoretical connection between structured state-space models (SSMs) and neural operators, offering a unified structure that can adapt to diverse architectures.
  • MNO captures long-range dependencies and continuous dynamics more effectively than traditional Transformers, making it a superior framework for PDE-related tasks.
  • Through extensive analysis, MNO is shown to significantly boost the expressive power and accuracy of neural operators (a toy SSM-scan sketch follows this list).
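
The SSM-operator connection is easiest to see for a single discrete SSM block: the linear recurrence h_t = A h_{t-1} + B u_t, y_t = C h_t maps one sampled function to another, i.e., it acts as a discretized operator. The sketch below uses arbitrary stable parameters, not MNO's trained blocks.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 16                                  # state dimension
    A = np.diag(rng.uniform(0.8, 0.99, N))  # stable diagonal state matrix
    Bv = rng.normal(size=N)
    Cv = rng.normal(size=N)

    def ssm_scan(u):
        """y_t = C h_t with h_t = A h_{t-1} + B u_t: a sequence-to-sequence map."""
        h = np.zeros(N)
        ys = []
        for u_t in u:
            h = A @ h + Bv * u_t
            ys.append(Cv @ h)
        return np.array(ys)

    # A discretized function (e.g., a PDE solution sampled on a grid) is the
    # input sequence; the SSM block acts on it as a discretized operator.
    x = np.linspace(0, 1, 128)
    print(ssm_scan(np.sin(2 * np.pi * x)).shape)  # (128,)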

Read Full Article

11 Likes

Arxiv · 1w · 0 reads · Image Credit: Arxiv

Radial Basis Operator Networks

  • Operator networks are designed to approximate nonlinear operators, which provide mappings between infinite-dimensional spaces.
  • The radial basis operator network (RBON) is introduced as the first operator network capable of learning an operator in both the time domain and frequency domain when adjusted to accept complex-valued inputs.
  • The RBON exhibits a small $L^2$ relative test error for in- and out-of-distribution data, below $1 \times 10^{-7}$ in some benchmark cases.
  • The RBON maintains small error on out-of-distribution data drawn from function classes entirely different from the training data (a toy radial-basis construction follows this list).
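
A toy sketch of the radial-basis construction (our assumptions, not the paper's exact model): the input function is sampled on a grid, radial-basis features are computed against a set of centers, and a linear readout maps features to the sampled output function. Here the centers are random training inputs and the readout is fit by least squares on a toy integration operator; the RBON learns these quantities, and a complex-valued variant covers the frequency domain.

    import numpy as np

    rng = np.random.default_rng(0)
    grid = np.linspace(0, 1, 64)

    def sample_functions(n):
        """Random smooth inputs u(x) = a*sin(2*pi*x) + b*cos(2*pi*x) + c."""
        a, b, c = rng.normal(size=(3, n, 1))
        return a * np.sin(2 * np.pi * grid) + b * np.cos(2 * np.pi * grid) + c

    def operator(u):
        """Toy target operator: the cumulative integral of u."""
        return np.cumsum(u, axis=1) * (grid[1] - grid[0])

    def rbf_features(u, centers, gamma=0.5):
        """Radial-basis features of sampled functions against fixed centers."""
        d2 = ((u[:, None, :] - centers[None, :, :]) ** 2).mean(axis=2)
        return np.exp(-gamma * d2)

    u_train, u_test = sample_functions(256), sample_functions(32)
    centers = u_train[:64]                 # stand-in for learned centers
    W, *_ = np.linalg.lstsq(rbf_features(u_train, centers),
                            operator(u_train), rcond=None)

    pred = rbf_features(u_test, centers) @ W
    target = operator(u_test)
    print("relative L2 test error:",
          np.linalg.norm(pred - target) / np.linalg.norm(target))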

Read Full Article

Arxiv · 1w · 109 reads · Image Credit: Arxiv

Think While You Generate: Discrete Diffusion with Planned Denoising

  • Discrete Diffusion with Planned Denoising (DDPD) is a novel framework that achieves state-of-the-art performance in image and language modeling tasks.
  • DDPD separates the generation process into a planner and a denoiser model, enabling more efficient reconstruction by identifying and denoising corruptions in the optimal order.
  • DDPD outperforms traditional denoiser-only mask diffusion methods on language modeling benchmarks like text8 and OpenWebText, as well as token-based image generation on ImageNet.
  • DDPD narrows the generative-perplexity gap between diffusion-based and autoregressive methods in language modeling (a stand-in planner/denoiser loop is sketched after this list).
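
A stand-in sketch of the planner/denoiser split: the planner scores how likely each position is still corrupted, and only the position it picks is resampled, so reconstruction proceeds in an adaptive order rather than a fixed one. Both components below are oracles for illustration; in DDPD they are trained models.

    import numpy as np

    rng = np.random.default_rng(0)
    clean = np.array(list("thecatsatonthemat"))
    noisy = clean.copy()
    noisy[rng.choice(len(clean), size=6, replace=False)] = "#"  # '#' = noise

    def planner(seq):
        """Stand-in planner: probability that each position is corrupted."""
        return (seq == "#").astype(float)

    def denoiser(seq, i):
        """Stand-in denoiser: an oracle here, a generative model in the paper."""
        return clean[i]

    while True:
        scores = planner(noisy)
        if scores.max() == 0:       # planner says nothing is left to fix
            break
        i = int(scores.argmax())    # denoise the most corrupt position first
        noisy[i] = denoiser(noisy, i)

    print("".join(noisy))           # "thecatsatonthemat"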

Read Full Article

6 Likes

Arxiv · 1w · 134 reads · Image Credit: Arxiv

Distillation of Discrete Diffusion through Dimensional Correlations

  • Diffusion models have shown excellent performance in generative modeling but suffer from slow sampling speed.
  • Discrete diffusion models face challenges in capturing dependencies between elements due to the computational cost of processing high-dimensional joint distributions.
  • The proposed method introduces 'mixture' models for discrete diffusion that can capture dimensional correlations while being scalable.
  • Experimental results demonstrate the method's effectiveness in distilling pretrained discrete diffusion models in both the image and language domains (a toy illustration of dimensional correlations follows this list).
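
A toy illustration (the numbers are ours, not the paper's) of why mixtures help: a single product of per-dimension marginals, the factorization a one-step denoiser implies, cannot represent correlated dimensions, while a small mixture of products can match them exactly.

    # Target over two binary dimensions: perfectly correlated, x1 == x2.
    target = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}

    # Best product of independent marginals: p(x1=1) = p(x2=1) = 0.5.
    product = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}

    # Two-component mixture of products: each component is deterministic.
    mixture = {(a, b): 0.5 * (a == 0) * (b == 0) + 0.5 * (a == 1) * (b == 1)
               for a in (0, 1) for b in (0, 1)}

    for name, q in [("product", product), ("mixture", mixture)]:
        tv = 0.5 * sum(abs(target[k] - q[k]) for k in target)
        print(name, "total variation from target:", tv)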

Read Full Article

8 Likes

Arxiv · 1w · 277 reads · Image Credit: Arxiv

Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient

  • Model-based reinforcement learning (RL) offers a solution to the data inefficiency of model-free RL algorithms.
  • Drama, a new state-space model (SSM)-based world model built on Mamba, achieves efficient training with longer sequences.
  • Drama addresses the challenges of vanishing gradients and capturing long-term dependencies in recurrent neural network (RNN) and transformer-based world models.
  • Drama achieves competitive performance on the Atari100k benchmark using a 7 million-parameter world model, making it accessible for training on standard hardware.

Read Full Article

16 Likes

Arxiv · 1w · 58 reads · Image Credit: Arxiv

Online Detecting LLM-Generated Texts via Sequential Hypothesis Testing by Betting

  • Developing algorithms to differentiate between machine-generated texts and human-written texts has garnered substantial attention in recent years.
  • In the online scenario, the ability to quickly and accurately determine if a source is an LLM (large language model) is crucial to prevent the spread of misinformation and misuse of LLMs.
  • To address the problem of online detection, an algorithm based on sequential hypothesis testing by betting has been developed.
  • Experiments demonstrate the effectiveness of the proposed method (a minimal testing-by-betting sketch follows this list).
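
A minimal sketch of testing by betting, under an assumption that is ours rather than the paper's: a detector score s in [0, 1] that is uniform for human-written text, so E[s] = 0.5 and the wealth process below is a nonnegative martingale under the null. By Ville's inequality, declaring "LLM" once the wealth reaches 1/alpha keeps the false-alarm rate at most alpha.

    import numpy as np

    rng = np.random.default_rng(0)
    alpha, lam = 0.01, 1.0   # significance level and (fixed) bet size
    wealth = 1.0

    # Stream of detector scores from a source that is actually an LLM:
    # scores skew above 0.5 (Beta(4, 2)); a human source would give Uniform(0, 1).
    for t, s in enumerate(rng.beta(4, 2, size=1000), start=1):
        wealth *= 1.0 + lam * (s - 0.5)   # bet that scores exceed the null mean
        if wealth >= 1.0 / alpha:
            print(f"declared LLM after {t} texts (wealth {wealth:.1f})")
            break
    else:
        print("no decision within the stream")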

Read Full Article

3 Likes

Arxiv · 1w · 348 reads · Image Credit: Arxiv

Interplay between Federated Learning and Explainable Artificial Intelligence: a Scoping Review

  • The joint implementation of federated learning (FL) and explainable artificial intelligence (XAI) could allow training models from distributed data and explaining their inner workings while preserving privacy.
  • This scoping review examines publications that explore the interplay between FL and XAI, particularly focusing on model interpretability or post-hoc explanations.
  • Out of the 37 studies analyzed, only one quantitatively examined the impact of FL on model explanations, highlighting a significant research gap.
  • There is a need for more quantitative research and transparent practices to understand the mutual impact and conditions of FL and XAI integration.

Read Full Article

20 Likes

Arxiv · 1w · 54 reads · Image Credit: Arxiv

Towards Scalable and Deep Graph Neural Networks via Noise Masking

  • Graph Neural Networks (GNNs) have achieved success in graph mining tasks.
  • Scaling GNNs to large graphs is challenging due to high computational and storage costs.
  • The random walk with noise masking (RMask) module addresses the limitations of existing model-simplification approaches for GNNs.
  • RMask allows for exploring deeper GNNs while preserving scalability and achieving a good trade-off between accuracy and efficiency.

Read Full Article

3 Likes

Arxiv · 1w · 8 reads · Image Credit: Arxiv

Hierarchical Subspaces of Policies for Continual Offline Reinforcement Learning

  • Researchers have introduced HiSPO, a hierarchical framework for Continual Reinforcement Learning.
  • The framework addresses the challenge of avoiding forgetting previously acquired knowledge while adapting to new tasks in navigation settings.
  • HiSPO leverages distinct policy subspaces of neural networks for efficient adaptation and preservation of existing knowledge.
  • Experimental results show competitive performance and adaptability in both maze environments and video-game-like navigation simulations (a toy subspace-search sketch follows this list).
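
A toy sketch of the policy-subspace idea with hypothetical stand-ins (not HiSPO's actual procedure): a few anchor policies stay frozen, each task's policy lives in their convex hull, and adapting to a new task only searches the mixture coefficients, so earlier anchors, and hence earlier tasks, are untouched.

    import numpy as np

    rng = np.random.default_rng(0)
    anchors = rng.normal(size=(3, 16))   # frozen anchor policy weights

    def policy_weights(alpha):
        """A point in the anchors' subspace (alpha lies on the simplex)."""
        return alpha @ anchors

    def task_return(w, task_vec):
        """Stand-in for a rollout: higher when weights align with the task."""
        return float(w @ task_vec)

    task = rng.normal(size=16)
    # Random search over the simplex stands in for the adaptation procedure.
    best_alpha, best_ret = None, -np.inf
    for _ in range(500):
        a = rng.dirichlet(np.ones(3))
        r = task_return(policy_weights(a), task)
        if r > best_ret:
            best_alpha, best_ret = a, r
    print("best mixture:", np.round(best_alpha, 2), "return:", round(best_ret, 2))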

Read Full Article

Arxiv · 1w · 168 reads · Image Credit: Arxiv

Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions

  • Researchers introduce Centaurus, a class of networks composed of generalized state-space model (SSM) blocks.
  • The SSM operations can be treated as tensor contractions during training.
  • The optimal order of tensor contractions is determined for every SSM block to maximize training efficiency.
  • The Centaurus network outperforms its counterparts in raw-audio processing tasks and achieves competitive performance in automatic speech recognition (ASR); a contraction-order demo follows this list.
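
The efficiency lever is classical: the cost of a chained tensor contraction depends heavily on evaluation order. np.einsum_path makes this concrete for a generic SSM-like contraction; the shapes below are illustrative, not Centaurus's actual blocks.

    import numpy as np

    rng = np.random.default_rng(0)
    B, L, D, N, M = 8, 512, 64, 32, 64
    u = rng.normal(size=(B, L, D))   # input sequences
    Bm = rng.normal(size=(D, N))     # input projection
    Cm = rng.normal(size=(N, M))     # output projection

    # Let NumPy search for the cheapest order; contracting Bm and Cm first
    # costs D*N*M flops, far less than touching the (B, L, D) tensor twice.
    path, report = np.einsum_path("bld,dn,nm->blm", u, Bm, Cm,
                                  optimize="optimal")
    print(report)                    # naive vs optimized FLOP counts

    y = np.einsum("bld,dn,nm->blm", u, Bm, Cm, optimize=path)
    print(y.shape)                   # (8, 512, 64)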

Read Full Article

10 Likes

Arxiv · 1w · 289 reads · Image Credit: Arxiv

Deep Learning in Early Alzheimer's disease's Detection: A Comprehensive Survey of Classification, Segmentation, and Feature Extraction Methods

  • Alzheimer's disease is a progressive neurological condition that impairs memory and other brain functions.
  • Early identification and therapy are crucial for slowing the disease's progression.
  • Deep learning techniques such as CNNs and RNNs have achieved high accuracy in Alzheimer's disease classification and mild cognitive impairment prediction.
  • This study evaluates Deep Learning algorithms for early Alzheimer's disease detection, identifies research gaps, and informs future research.

Read Full Article

17 Likes

Arxiv · 1w · 315 reads · Image Credit: Arxiv

Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis

  • This paper presents a deep-learning model-design framework for non-linear operators in stochastic analysis that leverage temporal structures.
  • The framework, called Causal Neural Operators, takes infinite-dimensional linear metric spaces, such as Banach spaces, as inputs and produces deep learning models for approximating operators with a temporal structure.
  • The models generated by this framework can uniformly approximate Hölder or smooth trace-class operators that causally map sequences between linear metric spaces.
  • The analysis also reveals new quantitative relationships on the latent state-space dimension of Causal Neural Operators, with implications for finite-dimensional Recurrent Neural Networks.

Read Full Article

18 Likes
