techminis

A naukri.com initiative


ML News

Source: Arxiv

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks

  • Deep learning theory seeks to understand how neural networks learn hierarchical features.
  • This study focuses on three-layer neural networks and their richer feature learning capabilities.
  • The authors present a theorem bounding the sample complexity and network width needed to achieve low test error when the target function has hierarchical structure.


Data-Driven Knowledge Transfer in Batch $Q^*$ Learning

  • In data-driven decision-making, knowledge transfer can help address data scarcity in new ventures.
  • The authors propose Transferred Fitted $Q$-Iteration, a framework for transferring knowledge in batch $Q^*$ learning.
  • The framework directly estimates the optimal action-value function $Q^*$ using both target and source data.
  • The approach shows improved learning error rates compared to single task learning, both theoretically and empirically.
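
The transferred variant builds on standard fitted $Q$-iteration, which repeatedly regresses a $Q$-function onto Bellman targets computed from a fixed batch of transitions. Below is a minimal single-task tabular sketch (the transfer setting would additionally pool source-task transitions into each regression):

```python
import numpy as np

def fitted_q_iteration(batch, n_states, n_actions, gamma=0.9, n_iters=50):
    """Tabular fitted Q-iteration on a fixed batch of (s, a, r, s') tuples.

    Each sweep regresses Q onto the Bellman targets r + gamma * max_a' Q(s', a');
    with a tabular function class the 'regression' is just a per-cell average.
    """
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        targets = {}
        for s, a, r, s2 in batch:
            targets.setdefault((s, a), []).append(r + gamma * Q[s2].max())
        Q_new = np.zeros_like(Q)
        for (s, a), ys in targets.items():
            Q_new[s, a] = np.mean(ys)
        Q = Q_new
    return Q

# Two-state toy MDP: in state 0, action 1 yields reward 1 and moves to state 1.
batch = [(0, 0, 0.0, 0), (0, 1, 1.0, 1), (1, 0, 0.0, 1), (1, 1, 0.0, 0)]
Q = fitted_q_iteration(batch, n_states=2, n_actions=2)
```

With a richer function approximator (e.g. a regression forest or a neural network) the per-cell average is replaced by a supervised fit to the same targets.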


Learning Actionable Counterfactual Explanations in Large State Spaces

  • Recourse generators provide actionable insights, often through feature-based counterfactual explanations (CFEs).
  • The paper introduces three novel recourse types grounded in real-world actions: high-level continuous (hl-continuous), high-level discrete (hl-discrete), and high-level ID (hl-id) CFEs.
  • It proposes data-driven CFE generation approaches that quickly provide optimal CFEs for new agents.
  • Empirical evaluation shows the effectiveness of the proposed forms of recourse over low-level CFEs.
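
To make the high-level-action idea concrete, here is a toy sketch (not the authors' method): recourse as a greedy search over a few hypothetical high-level actions, each of which moves several low-level features at once, until a linear classifier's decision flips.

```python
import numpy as np

# Toy linear classifier: approve when w.x + b > 0.
w = np.array([0.6, 0.4, 0.8])
b = -2.0
predict = lambda x: w @ x + b > 0

# Hypothetical high-level actions; each changes several features together
# (e.g. "take a course" raises both a skill feature and a certification feature).
actions = {
    "take_course":   np.array([1.0, 0.0, 1.0]),
    "pay_down_debt": np.array([0.0, 1.5, 0.0]),
    "wait_6_months": np.array([0.5, 0.2, 0.0]),
}
costs = {"take_course": 2.0, "pay_down_debt": 1.0, "wait_6_months": 0.5}

def cheapest_recourse(x, max_steps=4):
    """Greedily apply the action with the best score-gain-per-cost
    until the classifier's decision flips (or give up)."""
    plan, x = [], x.copy()
    for _ in range(max_steps):
        if predict(x):
            return plan, x
        name = max(actions, key=lambda a: (w @ actions[a]) / costs[a])
        x = x + actions[name]
        plan.append(name)
    return (plan, x) if predict(x) else (None, x)

plan, x_cf = cheapest_recourse(np.array([1.0, 1.0, 0.0]))
```

A low-level CFE would instead perturb individual features directly, which is often not something the agent can actually do; that gap is the motivation for the high-level recourse types above.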


Stochastic Reservoir Computers

  • Reservoir computing is a form of machine learning that utilizes nonlinear dynamical systems to perform complex tasks in a cost-effective manner.
  • Recent advancements in reservoir computing include the use of inherently stochastic reservoirs, such as those found in quantum reservoir computing.
  • This paper investigates the universality of stochastic reservoir computers, in which the reservoir itself is a stochastic dynamical system.
  • The study proves that stochastic reservoir computers are universal approximating classes and demonstrates improved performance compared to deterministic reservoir computers in certain cases.
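
For context, a deterministic reservoir computer such as an echo state network keeps a fixed random recurrent network and trains only a linear readout; a stochastic reservoir would replace the state update below with a noisy or quantum dynamical system. A minimal sketch, learning to recall the input delayed by one step:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random reservoir; only the linear readout is trained.
n_res, T = 100, 500
W_in = rng.uniform(-0.5, 0.5, size=n_res)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius 0.9

u = rng.uniform(-1.0, 1.0, size=T)   # input signal
y = np.roll(u, 1)                    # target: input delayed by one step

X = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])   # reservoir state update
    X[t] = x

# Ridge-regression readout, discarding an initial washout period.
washout, lam = 50, 1e-6
A, z = X[washout:], y[washout:]
w_out = np.linalg.solve(A.T @ A + lam * np.eye(n_res), A.T @ z)
mse = np.mean((A @ w_out - z) ** 2)
```

Because only `w_out` is learned, training reduces to a single linear solve, which is what makes reservoir computing cost-effective.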


Fundamental computational limits of weak learnability in high-dimensional multi-index models

  • This paper examines the theoretical boundaries of efficient learnability in multi-index models.
  • The focus is on the minimum sample complexity required for weakly recovering their low-dimensional structure.
  • The findings uncover conditions for learning trivial subspaces, easy subspaces, and interactions between different directions.
  • The theory builds on the optimality of approximate message-passing among first-order iterative methods.
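
For context, a multi-index model depends on the input only through a small number of linear directions:

```latex
f(x) = g\left(\langle w_1, x \rangle, \dots, \langle w_k, x \rangle\right), \qquad k \ll d,
```

where weak recovery means producing an estimate with non-vanishing correlation with the span of $w_1, \dots, w_k$ as the dimension $d$ grows.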


DeformTime: Capturing Variable Dependencies with Deformable Attention for Time Series Forecasting

  • DeformTime is a neural network architecture for multivariable time series (MTS) forecasting.
  • It uses deformable attention blocks (DABs) to capture correlated temporal patterns and improve prediction accuracy.
  • DeformTime performs well on various MTS data sets, outperforming previous methods and reducing mean absolute error by 7.2% on average.
  • The architecture shows consistent performance gains across longer forecasting horizons.


Recurrent Stochastic Configuration Networks for Temporal Data Analytics

  • This paper introduces the concept of recurrent stochastic configuration networks (RSCNs) for temporal data analytics.
  • RSCNs are developed to solve problems in domains like time-series forecasting and control engineering.
  • The RSCN model differs from the well-known echo state networks (ESNs) and has distinct structural properties.
  • Numerical results show that the proposed RSCN performs favorably compared to other methods.


Amelia: A Large Dataset and Model for Airport Surface Movement Forecasting

  • The growing demand for air travel necessitates advancements in air traffic management technologies to ensure safe and efficient operations.
  • The absence of large-scale curated surface movement datasets in the public domain has hindered the development of scalable and generalizable approaches.
  • The authors propose the Amelia framework, which includes a large dataset of airport surface movement, a transformer-based baseline for trajectory forecasting, and a training and evaluation benchmark.
  • The framework and tools have been released to encourage further aviation research in the forecasting domain and beyond.


Hyper-Compression: Model Compression via Hyperfunction

  • The researchers propose a novel approach called hyper-compression for model compression.
  • Hyper-compression represents the parameters of the target network using dynamical systems as hyperfunctions.
  • This approach offers a preferable compression ratio, no post-hoc retraining, affordable inference time, and short compression time.
  • The hyper-compression method achieves close-to-int4-quantization performance with less than 1% performance drop.


DEPT: Decoupled Embeddings for Pre-training Language Models

  • Language model pre-training uses broad data mixtures to enhance performance across domains and languages.
  • DEPT proposes a communication-efficient pre-training framework that decouples embeddings from the transformer body.
  • DEPT can handle significant data heterogeneity and minimize token embedding parameters.
  • DEPT improves transformer body plasticity, generalization, and overall performance.


What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias

  • Knowledge Distillation is a commonly used Deep Neural Network (DNN) compression method, which often maintains overall generalization performance.
  • Even for balanced image classification datasets, as many as 41% of the classes are statistically significantly affected by distillation when comparing class-wise accuracy.
  • Increasing the distillation temperature improves the distilled student model's fairness, potentially surpassing the fairness of the teacher model at high temperatures.
  • Distillation can have uneven effects on certain classes and play a significant role in fairness, requiring caution when using distilled models for sensitive applications.
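
The temperature in question enters the standard distillation loss: teacher and student logits are both softened before the KL term is computed, and higher temperatures spread the teacher's probability mass onto non-target classes. A minimal numpy sketch:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields a flatter distribution."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=1.0):
    """Distillation term: KL(teacher_T || student_T), scaled by T^2 so the
    gradient magnitude stays comparable across temperatures."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T ** 2 * np.sum(p * (np.log(p) - np.log(q)))

teacher = [6.0, 2.0, 1.0]
low = softmax(teacher, T=1.0)    # sharp: mass concentrated on the top class
high = softmax(teacher, T=8.0)   # soft: more weight on non-target classes
```

The softened targets carry the teacher's inter-class similarity structure, which is one plausible mechanism by which higher temperatures can even out class-wise effects.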


Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models

  • Fine-tuning text-to-image diffusion models with human feedback is an effective method for aligning model behavior with human intentions.
  • A novel automated data filtering algorithm called FiFA is proposed to enhance the fine-tuning of diffusion models using human feedback datasets with preference optimization.
  • FiFA selects data based on preference margin, text quality, and text diversity, ensuring informative samples while filtering out harmful content.
  • Experimental results show that FiFA significantly improves training stability and achieves better performance with reduced data usage.
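
As a rough illustration of the margin component only (the actual FiFA score also incorporates text quality and diversity, and the field names below are hypothetical), filtering can be sketched as keeping the preference pairs with the largest gap between chosen and rejected generations:

```python
def filter_by_margin(pairs, keep_frac=0.5):
    """Keep the most informative preference pairs, scored here solely by the
    reward margin between the chosen and rejected generation (a simplification)."""
    ranked = sorted(pairs,
                    key=lambda p: p["chosen_reward"] - p["rejected_reward"],
                    reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_frac))]

# Hypothetical records: each prompt has a preferred and a rejected generation.
data = [
    {"prompt": "a red fox",       "chosen_reward": 0.90, "rejected_reward": 0.10},
    {"prompt": "a city at night", "chosen_reward": 0.60, "rejected_reward": 0.50},
    {"prompt": "a sailboat",      "chosen_reward": 0.80, "rejected_reward": 0.20},
    {"prompt": "two cats",        "chosen_reward": 0.55, "rejected_reward": 0.50},
]
kept = filter_by_margin(data, keep_frac=0.5)
```

Pairs with near-zero margin contribute little signal to preference optimization, so discarding them can stabilize training while reducing data usage.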


A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations

  • A unified framework called Generalized Forward-Inverse (GFI) is proposed for subsurface imaging.
  • The framework aims to solve forward and inverse problems using deep learning techniques.
  • GFI encompasses previous works and introduces two new model architectures, Latent U-Net and Invertible X-Net.
  • The proposed models achieve state-of-the-art performance on synthetic datasets and show promise on real-world data.


AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design

  • AgentForge is a flexible low-code platform designed to optimize any parameter set in a reinforcement learning (RL) system.
  • Existing optimization-as-a-service platforms are impractical for RL systems because they require users to manually map parameters.
  • AgentForge simplifies the optimization process by allowing users to define an optimization problem in a few lines of code.
  • The platform has been evaluated for performance in a vision-based RL problem.


CGKN: A Deep Learning Framework for Modeling Complex Dynamical Systems and Efficient Data Assimilation

  • Using deep learning models for complex dynamical systems makes simultaneous data assimilation (DA) challenging.
  • The Conditional Gaussian Koopman Network (CGKN) framework addresses these challenges by integrating ensemble-based DA methods with deep learning.
  • CGKN transforms nonlinear systems into neural differential equations with conditional Gaussian structures.
  • CGKN proves effective for prediction and DA in strongly nonlinear and non-Gaussian systems.
