techminis

A naukri.com initiative

ML News

Arxiv

CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging

  • Model merging based on task vectors, i.e. parameter differences between fine-tuned models and a shared base model, is an efficient way to integrate multiple task-specific models into a multitask model without retraining.
  • Recent works have tried to address conflicts between task vectors through sparsification, but they are limited by high parameter overlap and unbalanced weight distribution.
  • To overcome these limitations, the authors propose a framework called CABS (Conflict-Aware and Balanced Sparsification) consisting of Conflict-Aware Sparsification (CA) and Balanced Sparsification (BS).
  • The experiments demonstrate that CABS outperforms state-of-the-art methods in various tasks and model sizes.
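
The task-vector idea from the first bullet can be sketched as follows. This is a minimal illustration that assumes flat parameter lists and plain top-k magnitude sparsification; it is not the CABS procedure itself, and the helper names `task_vector`, `sparsify_topk`, and `merge` are hypothetical.

```python
def task_vector(finetuned, base):
    """Parameter difference between a fine-tuned model and the shared base."""
    return [f - b for f, b in zip(finetuned, base)]


def sparsify_topk(vec, keep):
    """Keep the `keep` largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    kept = set(order[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(vec)]


def merge(base, task_vectors, scale=1.0):
    """Add the scaled sum of the task vectors back onto the base parameters."""
    return [b + scale * sum(tv[i] for tv in task_vectors)
            for i, b in enumerate(base)]


# Toy values, chosen only for illustration.
base = [0.0, 1.0, 2.0, 3.0]
model_a = [0.5, 1.0, 2.1, 3.0]   # fine-tuned on a hypothetical task A
model_b = [0.0, 0.2, 2.0, 3.4]   # fine-tuned on a hypothetical task B

tv_a = sparsify_topk(task_vector(model_a, base), keep=1)
tv_b = sparsify_topk(task_vector(model_b, base), keep=1)
merged = merge(base, [tv_a, tv_b])
```

In this toy case the two sparsified vectors touch different parameters, so they do not conflict. When the kept entries overlap, the summed updates can interfere, which is the parameter-overlap problem the summary describes.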

Arxiv

Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models

  • Human-in-the-loop (HitL) robot deployment has gained significant attention as a semi-autonomous paradigm.
  • A method is proposed to allow diffusion policies to actively seek human assistance only when necessary, reducing reliance on constant human oversight.
  • Experimental results demonstrate that this approach enhances policy performance during deployment.
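
One way to picture "ask for help only when necessary" is a simple uncertainty gate. The sketch below is an assumption-laden stand-in: it measures disagreement among sampled scalar actions, whereas the summary does not say how the paper estimates uncertainty. The names `should_ask_for_help`, `confident_policy`, and `uncertain_policy`, and the threshold value, are hypothetical.

```python
import random
import statistics


def should_ask_for_help(sample_action, n_samples=32, threshold=0.05):
    """Return True when sampled actions disagree more than `threshold`."""
    actions = [sample_action() for _ in range(n_samples)]
    return statistics.pvariance(actions) > threshold


rng = random.Random(0)


def confident_policy():
    """Toy policy whose samples cluster tightly around one action."""
    return rng.gauss(0.3, 0.01)


def uncertain_policy():
    """Toy policy torn between two very different actions."""
    return rng.choice([-1.0, 1.0]) + rng.gauss(0.0, 0.01)


ask_when_confident = should_ask_for_help(confident_policy)
ask_when_uncertain = should_ask_for_help(uncertain_policy)
```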

Arxiv

Starjob: Dataset for LLM-Driven Job Shop Scheduling

  • Large Language Models (LLMs) have shown remarkable capabilities across various domains, but their potential for solving combinatorial optimization problems remains largely unexplored.
  • Researchers have introduced Starjob, the first supervised dataset for the Job Shop Scheduling Problem (JSSP), consisting of 130k instances designed for training LLMs.
  • By leveraging the Starjob dataset, researchers fine-tuned the LLaMA 8B 4-bit quantized model with the LoRA method to develop an end-to-end scheduling approach.
  • Evaluation on standard benchmarks showed that the LLM-based method outperformed traditional Priority Dispatching Rules (PDRs) and achieved notable improvements over state-of-the-art neural approaches, highlighting the potential of LLMs in tackling combinatorial optimization problems.
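
For readers unfamiliar with LoRA, the sketch below shows the standard low-rank update, W' = W + (alpha / r) * B A, on a tiny matrix. It uses plain Python lists and toy values; the shapes, the `alpha` setting, and the helper names are illustrative assumptions, not the configuration used for Starjob.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W stays frozen during training."""
    delta = matmul(b, a)
    scale = alpha / rank
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]        # frozen 2x3 weight
b = [[1.0], [2.0]]           # trainable 2x1 factor
a = [[0.5, 0.0, -0.5]]       # trainable 1x3 factor

adapted = lora_weight(w, a, b, alpha=2.0, rank=1)
trainable = len(b) * len(b[0]) + len(a) * len(a[0])   # 5 values instead of 6
```

The saving is negligible at this size, but for a d-by-d layer the trainable count drops from d*d to 2*d*r.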

Arxiv

District Vitality Index Using Machine Learning Methods for Urban Planners

  • City leaders can identify city districts that require revitalization using a Current Vitality Index and a Long-Term Vitality Index.
  • The indexes are based on a carefully curated set of indicators and employ machine learning methods such as K-Nearest Neighbors imputation, Random Forest, and k-means clustering.
  • Current vitality is visualized through an interactive map, while Long-Term Vitality is tracked over 15 years with predictions made using Multilayer Perceptron or Linear Regression.
  • The results show promise in optimizing urban planning and improving citizens' quality of life, with potential for further improvement as more data becomes available.
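
Of the methods listed above, K-Nearest Neighbors imputation is the easiest to show in a few lines. The sketch below is a simplified version that fills each missing indicator with the mean over the k most similar districts; the function name `knn_impute`, the distance definition, and the toy data are assumptions, not details from the paper.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows."""

    def distance(r, s):
        # Compare only the features that both rows have observed.
        shared = [(x, y) for x, y in zip(r, s)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled


# Toy indicator table: one row per district, None marks a missing value.
districts = [
    [1.0, 10.0],
    [1.1, 12.0],
    [5.0, 50.0],
    [1.05, None],
]
completed = knn_impute(districts, k=2)
```

The last district borrows its second indicator from the two districts that resemble it on the first indicator, rather than from the outlier.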

Arxiv

Mapping representations in Reinforcement Learning via Semantic Alignment for Zero-Shot Stitching

  • Deep Reinforcement Learning (RL) models often fail to generalize when changes occur in the environment's observations or task requirements.
  • This paper proposes a zero-shot method for mapping between latent spaces across different agents trained on different visual and task variations.
  • The approach learns a transformation that maps embeddings from one agent's encoder to another agent's encoder without further fine-tuning.
  • The framework preserves high performance under visual and task domain shifts, allowing for more robust reinforcement learning in dynamically changing environments.

Arxiv

Constructing balanced datasets for predicting failure modes in structural systems under seismic hazards

  • Accurate prediction of structural failure modes under seismic excitations is crucial for assessing seismic risk and resilience.
  • A study proposes a framework to construct balanced datasets that include distinct failure modes.
  • The framework consists of three steps: identifying critical ground motion features, estimating probability densities of failure domains, and generating samples that are transformed into ground motion time histories.
  • Numerical investigations using different structural models show that the framework effectively addresses dataset imbalance and improves machine learning performance in seismic failure mode prediction.

Arxiv

Learning Surrogates for Offline Black-Box Optimization via Gradient Matching

  • The offline design optimization problem arises in numerous science and engineering applications.
  • Surrogate functions are used to predict and maximize the target objective over candidate designs.
  • A theoretical framework is presented to understand offline black-box optimization.
  • A black-box gradient matching algorithm is proposed to improve surrogate models for offline optimization.

Arxiv

Contextual Quantum Neural Networks for Stock Price Prediction

  • Researchers have developed a contextual quantum neural network for stock price prediction using quantum machine learning (QML).
  • The approach incorporates recent trends to predict future stock price distributions, surpassing traditional models that rely solely on historical data.
  • The quantum batch gradient update (QBGU) is introduced as a training technique to improve convergence and accelerate stochastic gradient descent (SGD) in quantum applications.
  • The quantum multi-task learning (QMTL) architecture, specifically the share-and-specify ansatz, enables efficient training and portfolio representation for multiple assets on the same quantum circuit.

Arxiv

Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks

  • Many dynamic decision problems, such as robotic control, involve diverse and unknown tasks at training time.
  • Existing approaches such as multi-task and meta reinforcement learning struggle to generalize when tasks are highly diverse.
  • This paper proposes a policy committee approach to address the challenges of task diversity.
  • Experiments demonstrate that the proposed approach outperforms existing baselines in training, generalization, and few-shot learning.

Arxiv

When Continue Learning Meets Multimodal Large Language Model: A Survey

  • Recent advancements in Artificial Intelligence have led to the development of Multimodal Large Language Models (MLLMs).
  • Fine-tuning MLLMs for specific tasks often causes performance degradation in the model's prior knowledge domain, known as 'Catastrophic Forgetting'.
  • This review paper provides an overview and analysis of 440 research papers in the field of MLLM continual learning.
  • The paper discusses the challenges and future directions of continual learning in MLLMs, aiming to inspire future research and development in the field.

Arxiv

Enhancing Transformer with GNN Structural Knowledge via Distillation: A Novel Approach

  • This paper introduces a knowledge distillation framework to integrate structural knowledge from Graph Neural Networks (GNNs) into Transformer models.
  • GNNs excel in capturing localized topological patterns, while Transformers are better at modeling long-range dependencies and global contextual information.
  • The proposed framework enables the transfer of multiscale structural knowledge, bridging the gap between GNNs and Transformers.
  • The approach establishes a new way of inheriting graph structural biases in Transformer architectures with wide-ranging applications.
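
As background, the classic soft-label form of knowledge distillation can be sketched as below. This is the generic temperature-scaled KL objective, shown only to illustrate the teacher-to-student transfer; the multiscale structural distillation described in the summary is likely more involved, and the function names and logits here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


teacher = [2.0, 0.5, -1.0]    # stand-in for GNN teacher outputs
student = [1.0, 1.0, 0.0]     # stand-in for Transformer student outputs

loss_mismatch = distillation_loss(student, teacher)
loss_match = distillation_loss(teacher, teacher)
```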

Arxiv

Recognition of Dysarthria in Amyotrophic Lateral Sclerosis patients using Hypernetworks

  • Amyotrophic Lateral Sclerosis (ALS) patients often suffer from dysarthria, a decline in speech intelligibility.
  • Existing studies rely on feature extraction and customized convolutional neural networks to recognize dysarthria in ALS patients.
  • This research introduces the use of hypernetworks to recognize dysarthria in ALS patients by generating weights for a target network.
  • Experimental results on the VOC-ALS dataset show that the proposed approach outperforms strong baselines, achieving up to 82.66% accuracy.
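
The phrase "generating weights for a target network" can be made concrete with a toy example. The sketch below assumes a linear hypernetwork and a single linear target layer; the sizes, values, and names (`hypernetwork`, `target_network`, `context`) are illustrative and do not reflect the architecture evaluated on VOC-ALS.

```python
def hypernetwork(context, hyper_weights):
    """Map a context vector to the flat parameters of the target layer."""
    return [sum(w * c for w, c in zip(row, context)) for row in hyper_weights]


def target_network(x, flat_params):
    """Linear layer whose weights and bias were produced by the hypernetwork."""
    *weights, bias = flat_params
    return sum(w * xi for w, xi in zip(weights, x)) + bias


# The hypernetwork emits 3 numbers: 2 weights and 1 bias for the target layer.
hyper_weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
context = [2.0, -1.0]                       # hypothetical conditioning input

generated = hypernetwork(context, hyper_weights)
score = target_network([3.0, 4.0], generated)
```

Only `hyper_weights` would be trained in such a setup; the target layer has no parameters of its own.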

Arxiv

Evaluating System 1 vs. 2 Reasoning Approaches for Zero-Shot Time-Series Forecasting: A Benchmark and Insights

  • A benchmark study evaluates the effectiveness of different reasoning strategies for zero-shot time-series forecasting.
  • The study focuses on understanding the applicability and impact of reasoning strategies in zero-shot time-series forecasting, specifically in the context of challenging tasks.
  • The benchmark, called ReC4TS, conducts comprehensive evaluations across datasets in eight domains and covers both unimodal and multimodal forecasting tasks.
  • Insights from the study suggest that self-consistency is the most effective test-time reasoning strategy, and multimodal time-series forecasting benefits more from reasoning strategies compared to unimodal forecasting.
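
Self-consistency generally means sampling several answers and keeping the one they agree on. For numeric forecasts, one plausible reading is to aggregate the sampled forecasts step by step; the sketch below uses the median for that, which is an assumption rather than the aggregation rule specified by ReC4TS. The function name and sample values are hypothetical.

```python
import statistics


def self_consistent_forecast(candidates):
    """Aggregate several sampled forecasts step by step with the median."""
    return [statistics.median(step) for step in zip(*candidates)]


# Three sampled forecasts over a 3-step horizon; the second contains an outlier.
samples = [
    [10.0, 11.0, 12.0],
    [10.5, 11.5, 30.0],
    [9.5, 11.2, 12.4],
]
forecast = self_consistent_forecast(samples)
```

The median discards the outlying final value of the second sample, which is the kind of robustness that motivates sampling more than once.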

Arxiv

Continual Learning-Aided Super-Resolution Scheme for Channel Reconstruction and Generalization in OFDM Systems

  • Researchers propose a novel deep learning-based scheme for efficient OFDM channel estimation in wireless communication systems.
  • The scheme includes a dual-attention-aided super-resolution neural network (DA-SRNN) for channel reconstruction and continual learning (CL)-aided training strategies for generalization.
  • The DA-SRNN utilizes a channel-spatial attention mechanism and a lightweight super-resolution module.
  • The CL-aided training strategies help the neural network adapt to different channel distributions and improve performance.

Arxiv

VAEs and GANs: Implicitly Approximating Complex Distributions with Simple Base Distributions and Deep Neural Networks -- Principles, Necessity, and Limitations

  • This tutorial focuses on the architectures of Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs).
  • Both VAEs and GANs start from simple base distributions, such as Gaussians.
  • They leverage the nonlinear transformation capabilities of neural networks to approximate complex data distributions.
  • The choice of a simple latent prior introduces limitations.
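
The core mechanism, pushing samples from a simple base distribution through a nonlinear map, can be shown without any training. In the sketch below the hand-written `generator` function stands in for a decoder or generator network; its form and constants are arbitrary assumptions chosen to make the effect visible.

```python
import math
import random


def generator(z):
    """Toy nonlinear map standing in for a decoder/generator network."""
    return z + 2.0 * math.tanh(4.0 * z)


rng = random.Random(0)
latents = [rng.gauss(0.0, 1.0) for _ in range(10_000)]   # simple Gaussian base
samples = [generator(z) for z in latents]

# Fraction of mass near zero before and after the transformation.
latent_near_zero = sum(abs(z) < 0.5 for z in latents) / len(latents)
sample_near_zero = sum(abs(x) < 0.5 for x in samples) / len(samples)
```

The Gaussian latents are densest around zero, while the transformed samples are pushed away from zero into two clusters, so a single-mode base distribution yields a two-mode output.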
