techminis

A naukri.com initiative

ML News

Arxiv

A Novel Frequency-Spatial Domain Aware Network for Fast Thermal Prediction in 2.5D ICs

  • A novel frequency-spatial dual domain aware prediction network (FSA-Heat) is proposed for fast and accurate thermal prediction in 2.5D ICs.
  • The network integrates a frequency-domain and spatial-domain encoder, so that thermal dissipation features are extracted from high to low frequency and from global to local scale.
  • A frequency-spatial hybrid loss is designed to attenuate high-frequency thermal gradient noise and spatial misalignments.
  • In the reported experiments, FSA-Heat outperforms the GCN+PNA method, with over 99% RMSE reduction and a 4.23x inference speedup.
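
To make the idea of a frequency-spatial hybrid loss concrete, here is a minimal sketch on a 1D temperature profile. It is not the loss from the paper: the function names, the naive DFT, the use of spectrum magnitudes, and the `freq_weight` mixing parameter are all assumptions chosen for illustration.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real 1D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def hybrid_loss(pred, target, freq_weight=0.5):
    """Weighted sum of a spatial MSE and an MSE between DFT magnitudes.

    Assumption: comparing magnitudes (not phases) makes the spectral term
    tolerant of small spatial shifts; the paper's actual term may differ.
    """
    n = len(pred)
    spatial = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    spec_pred, spec_target = dft(pred), dft(target)
    spectral = sum(
        (abs(a) - abs(b)) ** 2 for a, b in zip(spec_pred, spec_target)
    ) / n
    return (1 - freq_weight) * spatial + freq_weight * spectral
```

Identical inputs give a loss of zero, and `freq_weight` controls how much the spectral term contributes relative to the point-wise error.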

Arxiv

A Pre-Training and Adaptive Fine-Tuning Framework for Graph Anomaly Detection

  • Graph anomaly detection (GAD) is challenging due to scarcity of abnormal nodes and high cost of label annotations.
  • Graph pre-training has emerged as an effective approach for label-efficient learning in GAD, but the mix of homophily and heterophily in anomalies requires selective filters for individual nodes.
  • PAF (Pre-Training and Adaptive Fine-tuning) addresses these challenges by jointly training low- and high-pass filters during pre-training and by using a gated fusion network during fine-tuning.
  • Experiments on ten benchmark datasets consistently demonstrate the effectiveness of PAF in graph anomaly detection.
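
The combination of low- and high-pass graph filters with a gate can be sketched as follows. This is an illustration under assumptions, not PAF itself: the filters are the simple choices (I + D⁻¹A)/2 and (I − D⁻¹A)/2 on scalar node features, and `gate_logits` stands in for whatever a learned gating network would output per node.

```python
import math


def neighbor_mean(adj, x):
    """Mean feature of each node's neighbours (own value if isolated)."""
    return [
        sum(x[j] for j in adj[i]) / len(adj[i]) if adj[i] else x[i]
        for i in range(len(x))
    ]


def low_pass(adj, x):
    """Smooths features: average of a node and its neighbourhood mean."""
    return [(xi + mi) / 2 for xi, mi in zip(x, neighbor_mean(adj, x))]


def high_pass(adj, x):
    """Highlights differences: half the gap to the neighbourhood mean."""
    return [(xi - mi) / 2 for xi, mi in zip(x, neighbor_mean(adj, x))]


def gated_fusion(low, high, gate_logits):
    """Per-node convex combination; the logits are hypothetical gate outputs."""
    fused = []
    for lo, hi, z in zip(low, high, gate_logits):
        g = 1.0 / (1.0 + math.exp(-z))  # sigmoid gate in (0, 1)
        fused.append(g * lo + (1.0 - g) * hi)
    return fused
```

A gate near 1 favours the smoothed (homophilic) view of a node, while a gate near 0 favours the contrast with its neighbours (heterophilic view), which is the kind of per-node selectivity the summary describes.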

Arxiv

Generative emulation of chaotic dynamics with coherent prior

  • Data-driven emulation of nonlinear dynamics is challenging due to skill decay and unrealistic outputs.
  • Generative modeling with coherent priors aims to improve the quality of generated simulations.
  • The method presented in this work, Cohesion, unifies turbulence principles with diffusion-based modeling.
  • Cohesion demonstrates superior long-range forecasting skill and can generate physically-consistent simulations.

Arxiv

Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning

  • This paper introduces a reinforcement learning (RL) framework for optimizing numerical precision in the conjugate gradient (CG) method.
  • The framework models precision selection as a Markov Decision Process (MDP), using Q-learning to assign precision levels to operations and optimize the trade-off between computational efficiency and numerical accuracy.
  • The algorithm is trained on a set of data and can subsequently perform precision selection on out-of-sample data without retraining.
  • Results demonstrate the effectiveness of RL in improving solver performance and the potential for AI-driven advancements in scientific computing.
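
As a rough illustration of precision selection with Q-learning, the sketch below shows a tabular update and an epsilon-greedy choice. The state labels, the `ACTIONS` list of precision levels, and the reward are hypothetical; the paper's actual state, action, and reward design is not given in this summary.

```python
import random

# Hypothetical precision levels an agent could assign to a CG operation.
ACTIONS = ["float16", "float32", "float64"]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict keyed by (state, action)."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def choose_precision(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else pick the best Q."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

In a setting like the one described, the reward would presumably trade off solver cost against accuracy, and a table learned on training systems could then be queried on unseen ones with `epsilon=0.0`.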

Arxiv

SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM

  • SRPO is presented as a cross-domain implementation of large-scale reinforcement learning on Large Language Models (LLMs).
  • Recent advances in reasoning models, such as OpenAI's o1 and DeepSeek's R1, demonstrate the potential of RL in enhancing the reasoning capabilities of LLMs.
  • SRPO surpasses the performance of DeepSeek-R1-Zero-32B on the AIME24 and LiveCodeBench benchmarks using the same base model (Qwen2.5-32B) without prior Supervised Fine-Tuning (SFT).
  • SRPO introduces a two-stage cross-domain training paradigm, aimed at developing both mathematical reasoning and coding proficiency, and a History Resampling (HR) technique to deal with ineffective samples.

Arxiv

Learning and Generating Diverse Residential Load Patterns Using GAN with Weakly-Supervised Training and Weight Selection

  • The paper proposes a Generative Adversarial Network-based Synthetic Residential Load Pattern (RLP-GAN) generation model.
  • RLP-GAN leverages an over-complete autoencoder to capture dependencies within complex and diverse load patterns.
  • A model weight selection method is incorporated to address the mode collapse problem and generate load patterns with high diversity.
  • The results demonstrate that RLP-GAN outperforms state-of-the-art models in capturing temporal dependencies and generating load patterns with higher similarity to real data.

Arxiv

Learning to Score

  • This paper discusses a scenario where target labels are not available, but related side information is present.
  • The authors propose a scoring model that combines representation learning, side information, and metric learning.
  • The model can be useful in various domains, such as healthcare, to create severity scores for diseases with undefined progression criteria.
  • The scoring system is tested on benchmark datasets and biomedical patient records.

Arxiv

Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation

  • Advances in self-distillation have shown that when knowledge is distilled from a teacher to a student with the same deep learning (DL) architecture, the student can surpass the teacher, particularly when the network is overparameterized and the teacher is trained with early stopping.
  • This paper proposes to train only one model and generate multiple diverse teacher representations using distillation-time dropout.
  • To overcome noisy representations, a novel stochastic self-distillation (SSD) training strategy is introduced, which uses student-guided knowledge distillation (SGKD) to filter and weight teacher representations.
  • Experimental results on various datasets show that the proposed SSD method outperforms state-of-the-art methods without increasing the model size and with negligible added computational complexity.

Arxiv

Local distribution-based adaptive oversampling for imbalanced regression

  • Imbalanced regression occurs when continuous target variables have skewed distributions, creating sparse regions that are difficult for machine learning models to predict accurately.
  • Existing approaches often rely on arbitrary thresholds to categorize samples as rare or frequent, ignoring the continuous nature of target distributions.
  • To address these limitations, LDAO (Local Distribution-based Adaptive Oversampling) learns the global distribution structure by decomposing the dataset into a mixture of local distributions, modeling each distribution independently, and then merging them into a balanced training set.
  • In extensive evaluations, LDAO outperforms state-of-the-art oversampling methods on both frequent and rare target values, demonstrating its effectiveness for addressing the challenge of imbalanced regression.
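
The decompose, model locally, then merge idea can be sketched as below. This is an assumption-heavy illustration rather than LDAO: the clusters are taken as given, each local distribution is modeled with a Gaussian kernel density estimate (sampled via a smoothed bootstrap), and `bandwidth` is a hypothetical fixed parameter.

```python
import random


def oversample_cluster(points, target_size, bandwidth, rng):
    """Grow one cluster to target_size by sampling from a Gaussian KDE.

    Sampling a stored point and adding Gaussian noise is equivalent to
    drawing from a KDE with that kernel (a smoothed bootstrap).
    """
    extra = []
    for _ in range(target_size - len(points)):
        base = rng.choice(points)
        extra.append(tuple(v + rng.gauss(0.0, bandwidth) for v in base))
    return list(points) + extra


def balance(clusters, bandwidth=0.1, seed=0):
    """Oversample every cluster to the size of the largest, then merge."""
    rng = random.Random(seed)
    target = max(len(c) for c in clusters)
    merged = []
    for cluster in clusters:
        merged.extend(oversample_cluster(cluster, target, bandwidth, rng))
    return merged
```

Each point here is a tuple such as `(feature, target)`, so synthetic samples are drawn jointly in feature and target space and no rare/frequent threshold on the target is needed.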

Arxiv

Improving RL Exploration for LLM Reasoning through Retrospective Replay

  • A new algorithm named Retrospective Replay-based Reinforcement Learning (RRL) has been proposed to improve RL exploration for large language models (LLMs).
  • During the early stages of training, LLMs exhibit strong exploratory capabilities, but are limited in their ability to solve complex problems.
  • RRL introduces a dynamic replay mechanism throughout the training process, allowing the model to revisit and re-explore promising states identified in the early stages.
  • Experimental results show that RRL significantly enhances the effectiveness of RL in optimizing LLMs for complicated reasoning tasks.

Arxiv

Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator

  • Large language model (LLM) pruning with fixed N:M structured sparsity significantly limits the expressivity of the sparse model, yielding sub-optimal performance.
  • A flexible layer-wise outlier-density-aware N:M sparsity (FLOW) selection method allows for higher representational freedom in LLMs by simultaneously accounting for the presence and distribution of outliers, resulting in an accuracy improvement of up to 36% compared to existing alternatives.
  • The introduction of a flexible, low-overhead digital compute-in-memory architecture (FlexCiM) enables diverse sparsity patterns in sparse models by adaptively aggregating and disaggregating smaller sub-macros, achieving up to 1.75x lower inference latency and 1.5x lower energy consumption compared to existing sparse accelerators.
  • The code for the project is available at: https://github.com/FLOW-open-project/FLOW
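
For readers unfamiliar with N:M structured sparsity, the sketch below shows the basic magnitude-based pattern: in every group of M consecutive weights, only the N largest in magnitude are kept. It does not reproduce FLOW's outlier-aware selection; `prune_n_m` is a hypothetical helper name.

```python
def prune_n_m(weights, n, m):
    """Keep the n largest-magnitude weights in each group of m; zero the rest."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n entries with the largest absolute value.
        keep = set(
            sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n]
        )
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned
```

A fixed scheme would call this with the same `(n, m)` for every layer, whereas a flexible layer-wise scheme of the kind described would choose a different pair per layer.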

Arxiv

Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data

  • Differentially private (DP) machine learning often relies on the availability of public data for tasks like privacy-utility trade-off estimation, hyperparameter tuning, and pretraining.
  • For tabular data, the assumption of public data may not hold due to heterogeneity across domains.
  • To address this, the authors propose generating surrogate public data from schema-level specifications, without accessing sensitive records.
  • Experiments demonstrate that surrogate public tabular data can effectively replace traditional public data for tasks such as pretraining differentially private tabular classifiers.
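
One very simple way to produce data from a schema alone is to sample each column independently from its declared domain. The sketch below does only that; the schema layout (`type`, `min`, `max`, `values`) and the uniform sampling are assumptions for illustration, and the paper's generation methods may be considerably richer.

```python
import random


def sample_surrogate(schema, n_rows, seed=0):
    """Draw rows uniformly from per-column domains; no real records are read."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        row = {}
        for column, spec in schema.items():
            if spec["type"] == "categorical":
                row[column] = rng.choice(spec["values"])
            elif spec["type"] == "int":
                row[column] = rng.randint(spec["min"], spec["max"])
            elif spec["type"] == "float":
                row[column] = rng.uniform(spec["min"], spec["max"])
            else:
                raise ValueError(f"unknown column type: {spec['type']}")
        rows.append(row)
    return rows
```

Because the generator only sees column names, types, and ranges, the output can be used for steps such as hyperparameter tuning without spending privacy budget on the sensitive table.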

Arxiv

Learning Enhanced Structural Representations with Block-Based Uncertainties for Ocean Floor Mapping

  • Accurate ocean modeling and coastal hazard prediction require high-resolution bathymetric data.
  • Existing deep learning methods face difficulties in producing detailed ocean floor maps with consistent structure and quantifiable uncertainties.
  • A novel uncertainty-aware mechanism using spatial blocks and block-based conformal prediction is proposed in this work.
  • Experimental results show increased reconstruction quality and improved reliability of uncertainty estimation, benefiting climate modeling and coastal hazard assessment.
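
Block-based conformal prediction can be pictured as running split conformal calibration separately in each spatial block. The sketch below assumes absolute residuals from a held-out calibration set grouped by a hypothetical block identifier; it is not the paper's pipeline.

```python
import math


def conformal_quantile(residuals, alpha=0.1):
    """Split-conformal radius: the ceil((n+1)(1-alpha))-th smallest residual.

    Returns infinity when the calibration set is too small for the level.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def calibrate_blocks(block_residuals, alpha=0.1):
    """One interval radius per block, from that block's own residuals."""
    return {b: conformal_quantile(r, alpha) for b, r in block_residuals.items()}


def interval(prediction, block, radii):
    """Prediction interval for a point that falls in the given block."""
    radius = radii[block]
    return (prediction - radius, prediction + radius)
```

Blocks where the model errs more receive wider intervals, so the uncertainty estimate varies across the map rather than being a single global band.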

Arxiv

Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts

  • Training conversational question-answering systems with in-domain data is challenging due to its scarcity.
  • Traditional top-down methods use a large language model to generate multi-turn dialogues, but lack content control and are susceptible to hallucinations.
  • A bottom-up approach is introduced, generating QA pairs first and then combining them into coherent dialogues, offering greater control and precision.
  • Human and automated evaluations show that the bottom-up approach produces more realistic and higher-quality dialogues compared to top-down methods.

Arxiv

Balancing Fairness and Performance in Healthcare AI: A Gradient Reconciliation Approach

  • The rapid growth of healthcare data and advances in computational power have accelerated the adoption of artificial intelligence (AI) in medicine.
  • To address potential disparities in healthcare AI, a novel gradient reconciliation framework called FairGrad has been proposed.
  • FairGrad balances predictive performance and multi-attribute fairness optimization in healthcare AI models by projecting each gradient vector onto the orthogonal plane of the others.
  • FairGrad achieved statistically significant improvements in multi-attribute fairness metrics while maintaining competitive predictive accuracy in real-world healthcare datasets.
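
Projecting one gradient onto the plane orthogonal to another can be sketched as follows. The conflict test (only project when the dot product is negative) is borrowed from PCGrad-style gradient surgery and is an assumption here; the summary does not say whether FairGrad projects unconditionally.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(g, other):
    """Remove from g its component along `other` when the two conflict."""
    d = dot(g, other)
    if d >= 0:  # no conflict (or zero vector): leave g unchanged
        return list(g)
    scale = d / dot(other, other)
    return [gi - scale * oi for gi, oi in zip(g, other)]


def reconcile(grads):
    """Project each objective's gradient against all others, then sum."""
    adjusted_all = []
    for i, g in enumerate(grads):
        adjusted = list(g)
        for j, other in enumerate(grads):
            if i != j:
                adjusted = project_out(adjusted, other)
        adjusted_all.append(adjusted)
    return [sum(column) for column in zip(*adjusted_all)]
```

With one gradient for predictive loss and one per fairness attribute, the summed result is an update direction in which no objective's step directly opposes another's.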
