Machine Learning (ML) Latest News and Trending articles from all top sources only on Techminis

A naukri.com initiative


ML News

Arxiv


Few-shot Learning on AMS Circuits and Its Application to Parasitic Capacitance Prediction

Graph representation learning is utilized to extract features from graph-structured data like analog/mixed-signal (AMS) circuits.
CircuitGPS, a few-shot learning method, is introduced for predicting parasitic effects in AMS circuits.
The method involves pre-training on link prediction and fine-tuning on edge regression, utilizing a hybrid graph Transformer and positional encoding.
CircuitGPS enhances coupling existence accuracy by at least 20% and reduces capacitance estimation MAE by at least 0.067, showcasing scalability and applicability to diverse AMS circuit designs.

Arxiv

A Single Merging Suffices: Recovering Server-based Learning Performance in Decentralized Learning

Decentralized learning is seen as a scalable alternative to traditional parameter-server-based training, but faces challenges due to limited peer-to-peer communication.
Researchers studied how communication is scheduled in decentralized learning and found that concentrating communication in later stages improves global generalization.
The study revealed that fully connected communication with a single global merging at the final step can match the performance of server-based training.
Theoretical contributions of the research show that globally merged decentralized SGD can converge faster than centralized mini-batch SGD, challenging common beliefs about decentralized learning.
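The communication schedule described above (independent local training followed by one global merge at the final step) can be sketched on a toy problem. This is only an illustration under stated assumptions, not the paper's experimental setup: the 1-D least-squares task, the helper names `local_sgd`, `make_shards`, and `train_with_single_merge`, and all hyperparameters are hypothetical.

```python
import random


def local_sgd(w, data, lr, steps, rng):
    """Run plain SGD on a 1-D least-squares problem y ~ w * x."""
    for _ in range(steps):
        x, y = rng.choice(data)
        grad = 2.0 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
        w -= lr * grad
    return w


def make_shards(num_workers=4, per_worker=50, true_w=3.0, seed=1):
    """Create one private, noisy data shard per worker (toy data, assumed)."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_workers):
        shard = []
        for _ in range(per_worker):
            x = rng.uniform(-1.0, 1.0)
            shard.append((x, true_w * x + rng.gauss(0.0, 0.1)))
        shards.append(shard)
    return shards


def train_with_single_merge(shards, lr=0.05, steps=200, seed=0):
    """Train every worker without communication, then merge exactly once."""
    rng = random.Random(seed)
    # All workers start from the same initial weight and never talk to each other.
    local_weights = [local_sgd(0.0, shard, lr, steps, rng) for shard in shards]
    # Single global merging at the final step: average the local models.
    merged = sum(local_weights) / len(local_weights)
    return local_weights, merged


if __name__ == "__main__":
    weights, merged = train_with_single_merge(make_shards())
    print("local weights:", weights)
    print("merged weight:", merged)
```

In this toy setting the merged weight is simply the average of the local weights; whether such a merge recovers server-based performance on real models is the paper's claim, not something this sketch demonstrates.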

Arxiv

Deep-Learning-Based Pre-Layout Parasitic Capacitance Prediction on SRAM Designs

Researchers propose a deep-learning-based model for predicting parasitic capacitance in pre-layout stages of SRAM designs to enhance system energy efficiency.
The model utilizes a Graph Neural Network (GNN) classifier and Multi-Layer Perceptron (MLP) regressors to accurately predict parasitics in SRAM circuits.
Experiments on four real SRAM designs demonstrate that the proposed approach outperforms the state-of-the-art model, reducing prediction error by up to 19 times and speeding up the simulation process by up to 598 times.

Arxiv

The Primacy of Magnitude in Low-Rank Adaptation

Low-Rank Adaptation (LoRA) is a parameter-efficient method for fine-tuning large models. The paper addresses shortcomings of existing initialization schemes such as 'Noise & Zeros'.
Update magnitude plays a crucial role in determining LoRA performance, leading to the proposal of a new 'Basis & Basis' initialization scheme called LoRAM, which matches spectral methods' effectiveness without their computational overhead.
The research highlights the significance of update magnitudes in low-rank structures and suggests optimization mechanisms like learning rate tuning, scaling factor adjustments, and initialization techniques to regulate magnitudes for better convergence.
Extensive experiments support the efficacy of LoRAM as a competitive alternative to spectral initialization, showcasing its efficiency and performance across various benchmarks.
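To make "update magnitude" concrete, here is a minimal sketch of a LoRA update and its Frobenius norm. It assumes the standard LoRA convention, where the weight update is `(alpha / rank) * B @ A`, and it assumes that 'Noise & Zeros' refers to initializing `A` with random noise and `B` with zeros. It does not implement LoRAM's 'Basis & Basis' scheme, which the summary does not specify. All function names are hypothetical.

```python
import math
import random


def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    inner = len(right)
    cols = len(right[0])
    return [[sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
            for row in left]


def frobenius_norm(matrix):
    """Magnitude of a matrix: square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def init_noise_and_zeros(d_out, d_in, rank, seed=0):
    """Assumed 'Noise & Zeros' init: A is Gaussian noise, B is all zeros."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 1.0 / math.sqrt(d_in)) for _ in range(d_in)]
         for _ in range(rank)]              # shape: rank x d_in
    b = [[0.0] * rank for _ in range(d_out)]  # shape: d_out x rank
    return a, b


def lora_update(a, b, alpha):
    """Low-rank weight update delta_W = (alpha / rank) * B @ A."""
    scale = alpha / len(a)
    return [[scale * v for v in row] for row in matmul(b, a)]


if __name__ == "__main__":
    a, b = init_noise_and_zeros(d_out=4, d_in=6, rank=2)
    # With B = 0 the update has zero magnitude at initialization.
    print(frobenius_norm(lora_update(a, b, alpha=8.0)))
```

Because the update is linear in the scaling factor, doubling `alpha` doubles the update's norm for fixed `A` and `B`, which is one of the magnitude-regulating knobs the summary mentions alongside learning rate and initialization.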

Arxiv

SlimCaching: Edge Caching of Mixture-of-Experts for Distributed Inference

Mixture-of-Experts (MoE) models enhance the scalability of large language models by activating relevant experts per input.
The high number of expert networks in an MoE model poses storage challenges for edge devices.
A study addresses expert caching on edge servers under storage constraints for efficient distributed inference using a Top-K selection strategy.
Proposed algorithms aim to minimize latency for expert co-activation within MoE layers, showing improved inference speed in simulations.
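The Top-K selection mentioned above can be illustrated with a small sketch: a router scores each expert, the K highest-scoring experts are activated, and any selected expert that is not cached locally has to be served elsewhere. This assumes softmax gating with renormalized weights, a common MoE convention; the function names and the cache set are hypothetical, and the paper's caching algorithms are not reproduced here.

```python
import math


def top_k_experts(router_logits, k):
    """Return {expert_index: gate_weight} for the k highest-scoring experts."""
    # Numerically stable softmax over the router scores.
    peak = max(router_logits)
    exps = [math.exp(v - peak) for v in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k most probable experts and renormalize their weights.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def split_by_cache(selected, cached_experts):
    """Separate selected experts into locally cached ones and cache misses."""
    local = [i for i in selected if i in cached_experts]
    remote = [i for i in selected if i not in cached_experts]
    return local, remote


if __name__ == "__main__":
    gates = top_k_experts([0.1, 2.0, -1.0, 1.5], k=2)
    local, remote = split_by_cache(gates, cached_experts={1, 2})
    print(gates, local, remote)
```

Every expert in the `remote` list adds a transfer step for that token, which is why experts that tend to be co-activated are attractive to cache together.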

Arxiv

From Data-Centric to Sample-Centric: Enhancing LLM Reasoning via Progressive Optimization

Researchers introduce the LPPO framework to enhance the reasoning capabilities of Large Language Models through progressive optimization.
The LPPO framework leverages a small set of high-quality demonstrations using prefix-guided sampling and learning-progress weighting.
Prefix-guided sampling augments data with partial solution prefixes from expert demonstrations to improve policy guidance.
Learning-progress weighting adjusts sample influence based on model progression, leading to faster convergence and improved performance on mathematical-reasoning benchmarks.

Arxiv

Generalization in Reinforcement Learning for Radio Access Networks

Researchers propose a generalization-centered RL framework for RAN control to address the challenges posed by dynamic and heterogeneous environments in radio access networks.
The framework encodes cell topology and node attributes, applies domain randomization, and uses distributed data generation to improve generalization.
Applied to downlink link adaptation in 5G benchmarks, the proposed policy enhances throughput and spectral efficiency by over 10% in various scenarios.
The results indicate promising performance gains, offering a scalable architecture for potential future adoption in AI-driven 6G RAN development.

Arxiv

Denoising Multi-Beta VAE: Representation Learning for Disentanglement and Generation

The article discusses a new framework called Denoising Multi-Beta VAE that aims to balance disentanglement and generation quality in generative models.
Traditionally, achieving interpretable latent representations in generative models comes at the expense of generation quality. The $\beta$-VAE method introduces a hyperparameter $\beta$ to manage the trade-off between disentanglement and reconstruction quality.
The Denoising Multi-Beta VAE framework aims to address the disentanglement-reconstruction quality trade-off by utilizing a range of $\beta$ values to learn multiple corresponding latent representations. It leverages a non-linear diffusion model to transition between latent representations smoothly.
The proposed framework is evaluated for its disentanglement and generation quality, showing promising results in achieving both sharp reconstructions and consistent manipulation of generated outputs with respect to changes in $\beta$.
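For reference, the trade-off that $\beta$ controls can be written out using the standard $\beta$-VAE objective. The notation below (encoder $q_\phi$, decoder $p_\theta$, prior $p(z)$) is the usual textbook convention and is an assumption here; the summary does not give the paper's exact loss.

```latex
\mathcal{L}_{\beta}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - \beta \, D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

Setting $\beta = 1$ gives the ordinary VAE objective. Larger $\beta$ weights the KL term more heavily, which tends to encourage disentangled latents at the cost of reconstruction quality; this is the tension a range of $\beta$ values is meant to span.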

Arxiv

Efficient Multi-Task Reinforcement Learning with Cross-Task Policy Guidance

Multi-task reinforcement learning aims to efficiently learn multiple tasks simultaneously by leveraging shared information.
A new framework called Cross-Task Policy Guidance (CTPG) is introduced to provide guidance for unmastered tasks by utilizing control policies of proficient tasks.
CTPG uses guide policies to select behavior policies from various tasks, enhancing training trajectories.
Empirical evaluations show that integrating CTPG with existing approaches improves performance in manipulation and locomotion benchmarks.

Arxiv

UniOD: A Universal Model for Outlier Detection across Diverse Domains

Outlier detection (OD) is essential in distinguishing inliers and outliers in unlabeled datasets across various domains but often requires dataset-specific tuning and model training.
UniOD is introduced as a universal OD framework that uses labeled datasets to create a single model capable of detecting outliers in diverse domains.
UniOD transforms datasets into graphs, maintains consistent node features, and treats outlier detection as a node-classification task, enabling generalization to new domains.
Evaluation of UniOD on 15 benchmark OD datasets against 15 state-of-the-art approaches showcases its effectiveness in avoiding model tuning, reducing computational costs, and improving accuracy in real-world applications.

Arxiv

Goal-Oriented Skill Abstraction for Offline Multi-Task Reinforcement Learning

Offline multi-task reinforcement learning faces challenges in sharing knowledge across tasks.
Goal-Oriented Skill Abstraction (GO-Skill) is proposed to enhance knowledge transfer and task performance.
GO-Skill extracts reusable skills through a goal-oriented process and constructs a discrete skill library using vector quantization.
Experiments on robotic manipulation tasks in the MetaWorld benchmark show the effectiveness and versatility of GO-Skill.
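As a rough illustration of how vector quantization yields a discrete library, the sketch below maps a continuous embedding to its nearest entry in a codebook. It shows only the generic nearest-neighbor lookup, assuming squared Euclidean distance; how GO-Skill learns its embeddings and codebook is not described in the summary, and the names used here are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return (index, entry) of the codebook entry closest to `vector`."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


if __name__ == "__main__":
    # Hypothetical 2-D codebook with three discrete "skills".
    codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
    print(quantize([0.9, 1.2], codebook))
```

The returned index acts as a discrete identifier, so nearby continuous embeddings collapse onto the same reusable entry.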

Arxiv

Deep Disentangled Representation Network for Treatment Effect Estimation

Estimating individual-level treatment effect from observational data is a crucial task in causal inference, with applications in various domains.
A new algorithm is proposed in this work that uses disentangled representation methods to decompose observed covariates into instrumental, confounding, and adjustment factors.
The algorithm incorporates a mixture of experts with multi-head attention and a linear orthogonal regularizer to softly decompose pre-treatment variables and eliminate selection bias through importance sampling re-weighting techniques.
Extensive experiments on both public semi-synthetic and real-world datasets demonstrate that the proposed algorithm surpasses existing methods in estimating individual treatment effects.

Arxiv

Federated Learning Inspired Fuzzy Systems: Decentralized Rule Updating for Privacy and Scalable Decision Making

Fuzzy systems, which manage uncertainty, are being enhanced through machine learning and federated learning techniques.
Federated learning offers advantages like improved privacy, reduced networking burden, and decreased latency for model updates.
The paper proposes updating fuzzy rules based on federated learning principles to enhance fuzzy systems over time.
The improvements discussed require further exploration to assess their full potential in enhancing fuzzy systems.

Arxiv

Heterogeneous Graph Neural Networks for Short-term State Forecasting in Power Systems across Domains and Time Scales: A Hydroelectric Power Plant Case Study

Accurate short-term state forecasting is crucial for efficient and stable operation of modern power systems impacted by renewable energy sources.
Graph Neural Networks (GNNs) are effective for system state forecasting by leveraging sensor network structures.
Heterogeneous Graph Attention Networks are proposed to model both homogeneous and heterogeneous sensor data relationships in multi-domain power systems.
Experimental results show that the proposed approach outperforms conventional methods by 35.5% in power system state forecasting accuracy.

Arxiv

Value from Observations: Towards Large-Scale Imitation Learning via Self-Improvement

Imitation Learning from Observation (IfO) enables large-scale behavior learning by using action-free demonstrations.
Current IfO research typically focuses on idealized scenarios with limited data distributions.
This paper introduces a method to learn from more nuanced data distributions, aiming for iterative self-improvement in imitation learning.
The study adapts RL-based imitation learning to action-free demonstrations with a value function and highlights the importance of more practical IfO techniques for scalable behavior learning.
