techminis

A naukri.com initiative

ML News

Arxiv

Few-shot Learning on AMS Circuits and Its Application to Parasitic Capacitance Prediction

  • Graph representation learning is utilized to extract features from graph-structured data like analog/mixed-signal (AMS) circuits.
  • CircuitGPS, a few-shot learning method, is introduced for predicting parasitic effects in AMS circuits.
  • The method involves pre-training on link prediction and fine-tuning on edge regression, utilizing a hybrid graph Transformer and positional encoding.
  • CircuitGPS enhances coupling existence accuracy by at least 20% and reduces capacitance estimation MAE by at least 0.067, showcasing scalability and applicability to diverse AMS circuit designs.

Arxiv

A Single Merging Suffices: Recovering Server-based Learning Performance in Decentralized Learning

  • Decentralized learning is seen as a scalable alternative to traditional parameter-server-based training, but faces challenges due to limited peer-to-peer communication.
  • Researchers studied how communication is scheduled in decentralized learning and found that concentrating communication in later stages improves global generalization.
  • The study revealed that fully connected communication with a single global merging at the final step can match the performance of server-based training.
  • Theoretical contributions of the research show that globally merged decentralized SGD can converge faster than centralized mini-batch SGD, challenging common beliefs about decentralized learning.
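
The single-merge idea can be illustrated with a minimal sketch. It assumes a toy 1-D least-squares problem, four workers that never communicate during training, and one parameter average at the very end. The function names, learning rate, and data are hypothetical and are not taken from the paper.

```python
import random

def local_sgd(data, w, lr=0.1, epochs=100):
    """Plain SGD on squared error for the 1-D model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x
            w -= lr * grad
    return w

def single_global_merge(local_weights):
    """One all-to-all merge: every worker adopts the parameter average."""
    return sum(local_weights) / len(local_weights)

random.seed(0)
TRUE_W = 3.0  # hypothetical ground-truth slope
# Each worker holds its own noisy shard of data.
shards = [
    [(x, TRUE_W * x + random.gauss(0.0, 0.1))
     for x in (random.uniform(-1.0, 1.0) for _ in range(50))]
    for _ in range(4)
]
# Workers train independently, with no communication along the way.
local_weights = [local_sgd(shard, w=0.0) for shard in shards]
# Communication happens exactly once, at the final step.
merged_w = single_global_merge(local_weights)
print(local_weights, merged_w)
```

In this convex toy case averaging is trivially safe; the summarized result concerns when a late merge still works for deep networks, which this sketch does not attempt to show.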

Arxiv

Deep-Learning-Based Pre-Layout Parasitic Capacitance Prediction on SRAM Designs

  • Researchers propose a deep-learning-based model for predicting parasitic capacitance in pre-layout stages of SRAM designs to enhance system energy efficiency.
  • The model utilizes a Graph Neural Network (GNN) classifier and Multi-Layer Perceptron (MLP) regressors to accurately predict parasitics in SRAM circuits.
  • Experiments on 4 real SRAM designs demonstrate that the proposed approach outperforms the state-of-the-art model, reducing prediction error by up to 19 times and speeding up the simulation process by up to 598 times.

Arxiv

The Primacy of Magnitude in Low-Rank Adaptation

  • Low-Rank Adaptation (LoRA) is a parameter-efficient method for fine-tuning large models, and the paper examines shortcomings of existing initialization methods such as 'Noise & Zeros'.
  • Update magnitude plays a crucial role in determining LoRA performance, leading to the proposal of a new 'Basis & Basis' initialization scheme called LoRAM, which matches spectral methods' effectiveness without their computational overhead.
  • The research highlights the significance of update magnitudes in low-rank structures and suggests optimization mechanisms like learning rate tuning, scaling factor adjustments, and initialization techniques to regulate magnitudes for better convergence.
  • Extensive experiments support the efficacy of LoRAM as a competitive alternative to spectral initialization, showcasing its efficiency and performance across various benchmarks.
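
For readers unfamiliar with the terms, here is a minimal sketch of the standard LoRA update, delta_W = (alpha / r) * B @ A, under the 'Noise & Zeros' initialization (A random, B zero). It only illustrates how the scaling factor controls update magnitude; it is not the LoRAM scheme, and the matrix sizes and the faked "trained" B are assumptions for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def frobenius_norm(m):
    return math.sqrt(sum(v * v for row in m for v in row))

def lora_update(b, a, alpha):
    """Low-rank update delta_W = (alpha / r) * B @ A."""
    r = len(a)  # rank = number of rows of A
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(b, a)]

random.seed(0)
d_out, d_in, r = 6, 5, 2  # hypothetical layer sizes and rank
# "Noise & Zeros": A is small random noise, B is all zeros.
A = [[random.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]
init_norm = frobenius_norm(lora_update(B, A, alpha=16))  # update starts at zero

# After training B is nonzero; here it is faked with random values.
B_trained = [[random.gauss(0.0, 0.02) for _ in range(r)] for _ in range(d_out)]
norm_small = frobenius_norm(lora_update(B_trained, A, alpha=8))
norm_large = frobenius_norm(lora_update(B_trained, A, alpha=16))
print(init_norm, norm_small, norm_large)
```

Doubling alpha doubles the norm of the update, which is one of the magnitude knobs the summary mentions alongside learning rate and initialization.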

Arxiv

SlimCaching: Edge Caching of Mixture-of-Experts for Distributed Inference

  • Mixture-of-Experts (MoE) models enhance the scalability of large language models by activating relevant experts per input.
  • The high number of expert networks in an MoE model poses storage challenges for edge devices.
  • A study addresses expert caching on edge servers under storage constraints for efficient distributed inference using a Top-K selection strategy.
  • Proposed algorithms aim to minimize latency for expert co-activation within MoE layers, showing improved inference speed in simulations.
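
A minimal sketch of Top-K expert selection and a cache lookup follows. It assumes one common gating variant (softmax renormalized over the selected experts) and a hypothetical `missing_experts` helper for the edge cache; neither is confirmed as the paper's exact formulation.

```python
import math

def top_k_gate(router_logits, k):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def missing_experts(gate, cached):
    """Experts activated for this token that are not in the local cache."""
    return sorted(i for i in gate if i not in cached)

logits = [0.1, 2.0, -1.0, 1.5]   # hypothetical router scores for 4 experts
gate = top_k_gate(logits, k=2)   # selects experts 1 and 3
to_fetch = missing_experts(gate, cached={0, 1})
print(gate, to_fetch)
```

When co-activated experts are not all cached on the same server, the missing ones must be served elsewhere, which is the latency cost the caching algorithms try to reduce.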

Arxiv

From Data-Centric to Sample-Centric: Enhancing LLM Reasoning via Progressive Optimization

  • Researchers introduce the LPPO framework to enhance the reasoning capabilities of Large Language Models through progressive optimization.
  • The LPPO framework leverages a small set of high-quality demonstrations using prefix-guided sampling and learning-progress weighting.
  • Prefix-guided sampling augments data with partial solution prefixes from expert demonstrations to improve policy guidance.
  • Learning-progress weighting adjusts sample influence based on model progression, leading to faster convergence and improved performance on mathematical-reasoning benchmarks.
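
Prefix-guided sampling can be pictured with a minimal sketch: a problem is paired with progressively longer prefixes of an expert solution, and the model would continue from each prefix. The function name, prompt template, and prefix fractions below are hypothetical, and the learning-progress weighting is not modeled here.

```python
def build_prefix_prompts(problem, expert_steps, fractions=(0.0, 0.25, 0.5)):
    """Pair a problem with progressively longer expert-solution prefixes."""
    prompts = []
    for frac in fractions:
        n_steps = int(len(expert_steps) * frac)  # number of expert steps revealed
        prefix = "\n".join(expert_steps[:n_steps])
        prompts.append((n_steps, f"Problem: {problem}\nSolution so far:\n{prefix}"))
    return prompts

steps = [
    "Multiply first: 2 * 3 = 6.",
    "Add: 2 + 6 = 8.",
    "Check the order of operations.",
    "Answer: 8.",
]
prompts = build_prefix_prompts("Compute 2 + 2 * 3.", steps)
for n_steps, prompt in prompts:
    print(n_steps, repr(prompt))
```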

Arxiv

Generalization in Reinforcement Learning for Radio Access Networks

  • Researchers propose a generalization-centered RL framework for RAN control to address the dynamic and heterogeneous environments of radio access networks.
  • The framework encodes cell topology and node attributes, applies domain randomization, and uses distributed data generation to improve generalization.
  • Applied to downlink link adaptation in 5G benchmarks, the proposed policy enhances throughput and spectral efficiency by over 10% in various scenarios.
  • The results indicate promising performance gains, offering a scalable architecture for potential future adoption in AI-driven 6G RAN development.

Arxiv

Denoising Multi-Beta VAE: Representation Learning for Disentanglement and Generation

  • The article discusses a new framework called Denoising Multi-Beta VAE that aims to balance disentanglement and generation quality in generative models.
  • Traditionally, achieving interpretable latent representations in generative models comes at the expense of generation quality. The $\beta$-VAE method introduces a hyperparameter $\beta$ to manage the trade-off between disentanglement and reconstruction quality.
  • The Denoising Multi-Beta VAE framework addresses this trade-off by using a range of $\beta$ values to learn multiple corresponding latent representations. It leverages a non-linear diffusion model to transition smoothly between these latent representations.
  • The proposed framework is evaluated for its disentanglement and generation quality, showing promising results in achieving both sharp reconstructions and consistent manipulation of generated outputs with respect to changes in $\beta$.
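
As background for the trade-off described above, here is a minimal sketch of the standard beta-VAE objective: a reconstruction term plus beta times the KL divergence between a diagonal Gaussian posterior and a standard normal prior. The squared-error reconstruction term and the toy inputs are assumptions; the multi-beta and diffusion components of the framework are not modeled.

```python
import math

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def beta_vae_loss(x, x_recon, mu, log_var, beta):
    """Squared-error reconstruction plus a beta-weighted KL penalty."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, log_var)

x, x_recon = [1.0, 0.0], [0.5, 0.0]     # toy input and reconstruction
mu, log_var = [1.0, 0.0], [0.0, 0.0]    # toy posterior parameters
loss_beta_1 = beta_vae_loss(x, x_recon, mu, log_var, beta=1.0)
loss_beta_4 = beta_vae_loss(x, x_recon, mu, log_var, beta=4.0)
print(loss_beta_1, loss_beta_4)
```

A larger beta penalizes the same posterior more heavily, pushing the encoder toward the prior; that pressure is associated with better disentanglement and blurrier reconstructions.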

Arxiv

Efficient Multi-Task Reinforcement Learning with Cross-Task Policy Guidance

  • Multi-task reinforcement learning aims to efficiently learn multiple tasks simultaneously by leveraging shared information.
  • A new framework called Cross-Task Policy Guidance (CTPG) is introduced to provide guidance for unmastered tasks by utilizing control policies of proficient tasks.
  • CTPG uses guide policies to select behavior policies from various tasks, enhancing training trajectories.
  • Empirical evaluations show that integrating CTPG with existing approaches improves performance in manipulation and locomotion benchmarks.

Arxiv

UniOD: A Universal Model for Outlier Detection across Diverse Domains

  • Outlier detection (OD) is essential for distinguishing inliers from outliers in unlabeled datasets across various domains, but it often requires dataset-specific tuning and model training.
  • UniOD is introduced as a universal OD framework that uses labeled datasets to create a single model capable of detecting outliers in diverse domains.
  • UniOD transforms datasets into graphs, maintains consistent node features, and treats outlier detection as a node-classification task, enabling generalization to new domains.
  • Evaluation of UniOD on 15 benchmark OD datasets against 15 state-of-the-art approaches showcases its effectiveness in avoiding model tuning, reducing computational costs, and improving accuracy in real-world applications.

Arxiv

Goal-Oriented Skill Abstraction for Offline Multi-Task Reinforcement Learning

  • Offline multi-task reinforcement learning faces challenges in sharing knowledge across tasks.
  • Goal-Oriented Skill Abstraction (GO-Skill) is proposed to enhance knowledge transfer and task performance.
  • GO-Skill extracts reusable skills through a goal-oriented process and constructs a discrete skill library using vector quantization.
  • Experiments on robotic manipulation tasks show the effectiveness and versatility of GO-Skill on the MetaWorld benchmark.
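
The vector-quantization step behind a discrete skill library can be sketched as a nearest-neighbor lookup into a codebook. This is the generic technique only, assuming squared Euclidean distance; the codebook values and names are hypothetical and GO-Skill's training procedure is not shown.

```python
def quantize(vector, codebook):
    """Return (index, code) of the nearest codebook entry by squared L2 distance."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    index = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))
    return index, codebook[index]

# Hypothetical library of three 2-D skill embeddings.
skill_codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
skill_id, skill_code = quantize([0.9, 1.2], skill_codebook)
print(skill_id, skill_code)
```

Mapping each continuous embedding to a codebook index is what makes the library discrete: downstream policies can select a skill by its integer id.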

Arxiv

Deep Disentangled Representation Network for Treatment Effect Estimation

  • Estimating individual-level treatment effect from observational data is a crucial task in causal inference, with applications in various domains.
  • A new algorithm is proposed in this work that uses disentangled representation methods to decompose observed covariates into instrumental, confounding, and adjustment factors.
  • The algorithm incorporates a mixture of experts with multi-head attention and a linear orthogonal regularizer to softly decompose pre-treatment variables and eliminate selection bias through importance sampling re-weighting techniques.
  • Extensive experiments on both public semi-synthetic and real-world datasets demonstrate that the proposed algorithm surpasses existing methods in estimating individual treatment effects.
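
To show what re-weighting against selection bias means in its simplest form, here is a minimal sketch of inverse-propensity weighting. It estimates an average effect from known propensity scores, whereas the summarized algorithm targets individual-level effects and learns its weights; the data and function names are hypothetical.

```python
def ipw_weights(treatments, propensities):
    """Weight 1/p for treated units and 1/(1-p) for control units."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treatments, propensities)]

def weighted_effect(outcomes, treatments, weights):
    """Difference of weighted mean outcomes between treated and control."""
    def weighted_mean(flag):
        num = sum(w * y for y, t, w in zip(outcomes, treatments, weights) if t == flag)
        den = sum(w for t, w in zip(treatments, weights) if t == flag)
        return num / den
    return weighted_mean(1) - weighted_mean(0)

treatments = [1, 1, 0, 0]
propensities = [0.8, 0.5, 0.2, 0.5]   # assumed P(treatment = 1 | covariates)
outcomes = [3.0, 5.0, 1.0, 3.0]
weights = ipw_weights(treatments, propensities)
effect = weighted_effect(outcomes, treatments, weights)
print(weights, effect)
```

Units that were unlikely to receive the treatment they actually got are up-weighted, so the re-weighted groups resemble a randomized sample.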

Arxiv

Federated Learning Inspired Fuzzy Systems: Decentralized Rule Updating for Privacy and Scalable Decision Making

  • Fuzzy systems, which manage uncertainty, are being enhanced through machine learning and federated learning techniques.
  • Federated learning offers advantages like improved privacy, reduced networking burden, and decreased latency for model updates.
  • The paper proposes updating fuzzy rules based on federated learning principles to enhance fuzzy systems over time.
  • The improvements discussed require further exploration to assess their full potential in enhancing fuzzy systems.

Arxiv

Heterogeneous Graph Neural Networks for Short-term State Forecasting in Power Systems across Domains and Time Scales: A Hydroelectric Power Plant Case Study

  • Accurate short-term state forecasting is crucial for efficient and stable operation of modern power systems impacted by renewable energy sources.
  • Graph Neural Networks (GNNs) are effective for system state forecasting by leveraging sensor network structures.
  • Heterogeneous Graph Attention Networks are proposed to model both homogeneous and heterogeneous sensor data relationships in multi-domain power systems.
  • Experimental results show that the proposed approach outperforms conventional methods by 35.5% in power system state forecasting accuracy.

Arxiv

Value from Observations: Towards Large-Scale Imitation Learning via Self-Improvement

  • Imitation Learning from Observation (IfO) enables large-scale behavior learning by using action-free demonstrations.
  • Current IfO research typically focuses on idealized scenarios with limited data distributions.
  • This paper introduces a method to learn from more nuanced data distributions, aiming for iterative self-improvement in imitation learning.
  • The study adapts RL-based imitation learning to action-free demonstrations with a value function and highlights the importance of more practical IfO techniques for scalable behavior learning.
