techminis

A naukri.com initiative

ML News

Arxiv

15h

Time-Series Forecasting via Topological Information Supervised Framework with Efficient Topological Feature Learning

  • Topological Data Analysis (TDA) is used for extracting features from complex data structures.
  • Integration of TDA with time-series prediction faces challenges related to temporal dependencies and computational bottlenecks.
  • The Topological Information Supervised (TIS) Prediction framework uses neural networks and conditional generative adversarial networks (CGANs) to generate synthetic topological features.
  • TIS models, TIS-BiGRU and TIS-Informer, outperform conventional predictors in capturing short-term and long-term temporal dependencies.

Arxiv

15h

Accelerating High-Efficiency Organic Photovoltaic Discovery via Pretrained Graph Neural Networks and Generative Reinforcement Learning

  • A new framework combining graph neural networks and reinforcement learning has been proposed to design high-efficiency organic photovoltaic (OPV) molecules.
  • The integrated approach includes large-scale pretraining of graph neural networks and a Generative Pretrained Transformer 2 (GPT-2) based reinforcement learning strategy.
  • The proposed approach yields predicted efficiencies approaching 21% and provides design guidelines for enhancing power conversion efficiency (PCE).
  • To support further discovery, the largest open-source OPV dataset is being built, and collaboration with experimental teams is planned for synthesizing and characterizing AI-designed molecules.

Arxiv

15h

An extension of linear self-attention for in-context learning

  • In-context learning is a key characteristic of transformers.
  • The self-attention mechanism in transformers lacks flexibility in certain tasks.
  • Linear self-attention is extended by introducing a bias matrix.
  • The extended linear self-attention enables flexible matrix manipulations.

Arxiv

15h

Conformal uncertainty quantification to evaluate predictive fairness of foundation AI model for skin lesion classes across patient demographics

  • Deep-learning diagnostic AI systems for medical images are starting to match the performance of human experts.
  • Lack of transparency in complex AI systems hinders their adoption in high-risk applications like healthcare.
  • Conformal analysis is applied to address this lack of transparency in foundation models for skin lesion classification.
  • Conformal analysis provides a coverage guarantee at the population level and an uncertainty score for each individual, making it a helpful fairness metric for evaluating AI models.
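The summary does not spell out the exact conformal procedure, so the following is only a minimal sketch of standard split conformal prediction for classification. It assumes a held-out calibration set of predicted class probabilities and true labels; the function names `conformal_threshold` and `prediction_set` are hypothetical. The threshold gives the population-level coverage guarantee, and the size of each prediction set can serve as a per-individual uncertainty score.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from a calibration set for target coverage 1 - alpha."""
    # Nonconformity score: one minus the probability given to the true class.
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: every class must be kept.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """Classes whose nonconformity score does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


# Toy calibration data (made up for illustration, not from the paper).
cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
cal_labels = [0, 1, 0, 1]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
confident_set = prediction_set([0.7, 0.3], threshold)
```

Comparing average set sizes across demographic groups is one way such a method could be used as a fairness check: larger sets for one group would indicate higher uncertainty for that group.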

Arxiv

15h

When Counterfactual Reasoning Fails: Chaos and Real-World Complexity

  • Counterfactual reasoning is often seen as the 'holy grail' of causal learning, but its reliability in real-world complex settings is largely unexplored.
  • This work investigates the limitations of counterfactual reasoning within the framework of Structural Causal Models.
  • Realistic assumptions, such as model uncertainty and chaotic dynamics, can lead to counterintuitive outcomes.
  • This study encourages caution when applying counterfactual reasoning in chaotic and uncertain situations.

Arxiv

15h

An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function

  • Nonlinear matrix decomposition with the ReLU function finds application in various fields.
  • The standard ReLU-NMD model minimizes the least-squares error, while the Latent-ReLU-NMD model introduces a latent variable to achieve a different low-rank solution.
  • The 3B-ReLU-NMD model allows elimination of the rank constraint in Latent-ReLU-NMD.
  • A novel extrapolated variant, eBCD-NMD, of block coordinate descent (BCD) for 3B-ReLU-NMD is proven to be convergent and offers significant acceleration.
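As a rough illustration of the least-squares objective mentioned above, the sketch below evaluates the squared error between a data matrix X and max(0, WH). Writing the low-rank matrix as a product WH is an assumption made here for illustration, and the helper names `matmul` and `relu_nmd_loss` are hypothetical. This only computes the loss; it is not the eBCD-NMD algorithm.

```python
def matmul(W, H):
    """Plain-Python matrix product of W (m x r) and H (r x n)."""
    return [
        [sum(W[i][k] * H[k][j] for k in range(len(H))) for j in range(len(H[0]))]
        for i in range(len(W))
    ]


def relu_nmd_loss(X, W, H):
    """Squared Frobenius error between X and the element-wise ReLU of W @ H."""
    theta = matmul(W, H)
    return sum(
        (x - max(0.0, t)) ** 2
        for row_x, row_t in zip(X, theta)
        for x, t in zip(row_x, row_t)
    )


# A rank-1 product with negative entries reproduces a sparse, nonnegative X exactly.
W = [[1.0], [-1.0]]
H = [[2.0, -3.0]]
X = [[2.0, 0.0], [0.0, 3.0]]
exact_fit = relu_nmd_loss(X, W, H)
```

The example shows why the nonlinearity matters: X itself has rank 2, yet a rank-1 matrix passed through ReLU fits it with zero error.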

Arxiv

15h

Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation

  • A new method called CE-LoRA (communication-efficient federated LoRA adaptation) has been introduced to address challenges in fine-tuning pre-trained foundation models in federated learning.
  • CE-LoRA utilizes a tri-factorization low-rank adaptation approach with personalized model parameter aggregation.
  • By introducing a small-size dense matrix and considering client similarity, CE-LoRA reduces communication cost and achieves comparable empirical performance.
  • Experiments show that CE-LoRA significantly reduces communication overhead, improves performance under non-iid data conditions, and enhances data privacy protection.
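To make the communication saving concrete, here is a back-of-the-envelope sketch. It assumes, and the summary does not confirm this detail, that a standard LoRA update is the product of two factors that are both uploaded, while in the tri-matrix variant only the small rank-by-rank dense matrix between them is exchanged with the server. The function names and the layer sizes are hypothetical.

```python
def lora_upload_params(d_out, d_in, rank):
    """Parameters sent per layer if both LoRA factors are uploaded.

    Standard LoRA: delta_W = B @ A with B of shape (d_out, rank)
    and A of shape (rank, d_in).
    """
    return rank * (d_out + d_in)


def tri_matrix_upload_params(rank):
    """Parameters sent per layer if only the middle matrix is uploaded.

    ASSUMPTION: delta_W = B @ C @ A and only C, of shape (rank, rank),
    leaves the client.
    """
    return rank * rank


# Illustrative sizes only: a 4096 x 4096 weight matrix adapted at rank 8.
full = lora_upload_params(4096, 4096, 8)
small = tri_matrix_upload_params(8)
reduction = full // small
```

Under these assumptions the per-layer upload shrinks by a factor that grows with the layer width, since the middle matrix does not depend on the layer dimensions at all.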

Arxiv

15h

An End-to-End Comprehensive Gear Fault Diagnosis Method Based on Multi-Scale Feature-Level Fusion Strategy

  • An integrated intelligent method of fault diagnosis for gears using acceleration signals is proposed.
  • The method is based on Gabor-based Adaptive Short-Time Fourier Transform (Gabor-ASTFT) and Dual-Tree Complex Wavelet Transform (DTCWT) algorithms.
  • The proposed method incorporates a dilated residual structure and a feature fusion layer for multi-scale analysis of fault features.
  • Comparative experiments demonstrate the effectiveness of the proposed method for end-to-end fault diagnosis of gears.

Arxiv

15h

DiffScale: Continuous Downscaling and Bias Correction of Subseasonal Wind Speed Forecasts using Diffusion Models

  • Renewable resources can benefit from skillful subseasonal to seasonal (S2S) wind speed forecasts.
  • DiffScale is a diffusion model that enhances S2S wind speed predictions by downscaling and correcting forecast errors.
  • DiffScale can super-resolve spatial information for continuous downscaling factors and lead times, without auto-regression or sequence prediction.
  • Synthetic experiments showed that DiffScale significantly improves wind speed prediction quality, outperforming baselines up to week 3.

Arxiv

15h

Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations

  • This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines.
  • For Discriminative models, the study examines various architectures and hyperparameters during training and inference and identifies energy-efficient practices.
  • For Generative AI, the study focuses on Large Language Models (LLMs) and assesses energy consumption across different model sizes and varying service requests.
  • The results indicate that optimizing architectures, hyperparameters, and hardware can significantly reduce energy consumption for Discriminative models without sacrificing performance.

Arxiv

15h

Noise-based reward-modulated learning

  • Recent advances in reinforcement learning (RL) have led to significant improvements in task performance.
  • Noise-based alternatives to gradient-based training, such as reward-modulated Hebbian learning (RMHL), have been proposed, but their performance has been limited in scenarios with delayed rewards.
  • A novel noise-based learning rule has been derived, which combines directional derivative theory and Hebbian-like updates, enabling efficient, gradient-free learning in RL.
  • The proposed method significantly outperforms RMHL and is competitive with backpropagation-based baselines, making it suitable for low-power and real-time applications.
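The paper's actual learning rule is not given in this summary. As a generic illustration of how noise can stand in for a gradient, the sketch below uses a textbook perturbation estimate: perturb the weights with random noise, measure the change in reward, and move along the noise direction in proportion to that change. The update is a product of a reward difference and the injected noise, which is what makes it Hebbian-like. All names, the toy reward, and the hyperparameters are assumptions made for this example.

```python
import random


def perturbation_step(w, reward_fn, sigma, lr, rng):
    """One gradient-free update from a single noisy reward evaluation."""
    xi = [rng.gauss(0.0, 1.0) for _ in w]
    r_clean = reward_fn(w)
    r_noisy = reward_fn([wi + sigma * x for wi, x in zip(w, xi)])
    # Finite-difference estimate of the directional derivative along xi.
    slope = (r_noisy - r_clean) / sigma
    # Move along the noise direction, scaled by the estimated slope.
    return [wi + lr * slope * x for wi, x in zip(w, xi)]


def toy_reward(w):
    """Toy reward with its maximum at w = (3, -1)."""
    return -((w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2)


def train(steps=1000, sigma=0.01, lr=0.02, seed=0):
    """Run the perturbation rule from the origin with a fixed random seed."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(steps):
        w = perturbation_step(w, toy_reward, sigma, lr, rng)
    return w


w_final = train()
```

No derivative of the reward is ever computed; the rule only needs two scalar reward evaluations per step, which is the property that makes this family of methods attractive for low-power hardware.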

Arxiv

15h

Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving

  • Key-Value cache (KV cache) compression has emerged as a promising technique to optimize Large Language Model (LLM) serving.
  • The paper comprehensively reviews existing algorithmic designs and benchmark studies, identifying missing performance measurement aspects that hinder practical adoption.
  • Representative KV cache compression methods are evaluated, uncovering issues that affect computational efficiency and end-to-end latency.
  • Tools are provided to aid future KV cache compression studies and facilitate practical deployment in production.

Arxiv

15h

CITRAS: Covariate-Informed Transformer for Time Series Forecasting

  • CITRAS is a patch-based Transformer that addresses challenges in covariate-informed time series forecasting.
  • It leverages multiple targets and covariates, considering both past and future forecasting horizons.
  • CITRAS introduces two novel mechanisms: Key-Value (KV) Shift and Attention Score Smoothing.
  • Experimental results show that CITRAS achieves state-of-the-art performance in both covariate-informed and multivariate forecasting.

Arxiv

15h

Bayesian Predictive Coding

  • Bayesian Predictive Coding (BPC) is a Bayesian extension to the influential theory of Predictive Coding (PC) in information processing in the brain.
  • BPC estimates a posterior distribution over network parameters, allowing for better quantification of epistemic uncertainty.
  • Compared to PC, BPC converges in fewer epochs in the full-batch setting and remains competitive in the mini-batch setting.
  • BPC provides a biologically plausible method for Bayesian learning in the brain and offers attractive uncertainty quantification in deep learning.

Arxiv

15h

Accelerated Airfoil Design Using Neural Network Approaches

  • This paper demonstrates the use of Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) to predict airfoil shapes from targeted pressure distribution and vice versa.
  • The dataset used in this study consists of 1600 airfoil shapes simulated at various Reynolds numbers and angles of attack.
  • The refined models show improved efficiency and reduced training time compared to the CNN model for complex datasets.
  • The proposed CNN and DNN models show promising results and have the potential to accelerate aerodynamic optimization and design of high-performance airfoils.
