ML News

Arxiv

Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers

  • This study investigates the relationship between memorization and generalization in large language models (LLMs).
  • Pre-training capacity-limited Transformer models from scratch on synthetic character-level tasks reveals a trade-off between memorization and generalization (a toy illustration of such a task follows this list).
  • Small models excel at extrapolating to unseen arithmetic cases but fail at memorization, whereas larger models memorize well but struggle with extrapolation.
  • Models of intermediate capacity likewise shift toward memorization rather than generalization.
  • When trained on both tasks together, no model size succeeds at extrapolation.
  • The findings suggest that pre-training may inherently prioritize one learning mode over the other.
  • By examining these dynamics in a controlled setting, the study shows how model capacity shapes learning behavior, with implications for the design and deployment of small language models.
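The paper's synthetic setup is only summarized above; the snippet below is a minimal sketch of what a character-level arithmetic task with separate memorization and extrapolation splits could look like. The task format, operand ranges, and split sizes are assumptions for illustration, not the authors' configuration.

```python
import random

# Hypothetical character-level addition task (assumed format, not the paper's).
def make_addition_example(lo, hi):
    a, b = random.randint(lo, hi), random.randint(lo, hi)
    return f"{a}+{b}=", str(a + b)  # prompt / target, modeled one character at a time

# "Memorization" split: a fixed pool of small-operand sums seen repeatedly in pre-training.
train_pool = [make_addition_example(0, 99) for _ in range(5_000)]

# "Extrapolation" split: operand ranges never seen during pre-training; a small model
# that has learned the carrying rule can still solve these, a memorizing model cannot.
extrapolation_set = [make_addition_example(100, 999) for _ in range(1_000)]

print(train_pool[0], extrapolation_set[0])
```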


Arxiv

Feature Shift Localization Network

  • Feature shifts between data sources are common in many applications and introduce erroneous features into the data.
  • Localizing the shifted features is crucial for correcting or filtering the data and preserving the integrity of downstream analyses.
  • Detecting that a distribution shift has occurred is feasible, but localizing the features it originates from remains challenging (a classical per-feature baseline is sketched after this list for contrast).
  • Existing localization methods are either inaccurate or do not scale to large datasets.
  • This work introduces a new approach, the Feature Shift Localization Network (FSL-Net).
  • FSL-Net is a neural network designed to localize feature shifts quickly and accurately in large, high-dimensional datasets.
  • The network is trained on a diverse collection of datasets to learn their statistical properties and can localize shifts in new datasets without re-training.
  • The FSL-Net model and code are publicly available at https://github.com/AI-sandbox/FSL-Net.
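FSL-Net's own interface is not described in the summary, so the sketch below shows a classical per-feature baseline instead: run a two-sample test on every feature and flag those whose marginal distributions differ. This is a generic point of comparison, not the paper's network, which replaces such tests with a learned, training-free localizer.

```python
import numpy as np
from scipy.stats import ks_2samp

def localize_shifted_features(reference, query, alpha=0.01):
    """Flag features whose marginal distribution differs between two datasets
    using per-feature Kolmogorov-Smirnov tests (a classical baseline, not FSL-Net)."""
    n_features = reference.shape[1]
    pvals = np.array([ks_2samp(reference[:, j], query[:, j]).pvalue
                      for j in range(n_features)])
    return np.where(pvals < alpha / n_features)[0]  # Bonferroni-corrected threshold

rng = np.random.default_rng(0)
ref = rng.normal(size=(2_000, 50))
qry = rng.normal(size=(2_000, 50))
qry[:, 7] += 0.5                                    # inject a shift into feature 7
print(localize_shifted_features(ref, qry))          # expected to flag feature 7
```

Per-feature tests like this become slow and unreliable in high dimensions, which is the gap a learned localizer such as FSL-Net targets.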


Arxiv

Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs

  • The rapid scaling of large language models (LLMs) makes them difficult to deploy on resource-constrained devices.
  • This has driven growing interest in extremely low-bit quantization, such as 2-bit quantization.
  • Prior work has shown that 2-bit LLMs are Pareto-optimal over 4-bit models in accuracy and latency, particularly for pre-trained LLMs.
  • However, these advances in 2-bit quantization have not been extended to instruction-tuned models.
  • To bridge this gap, Unified Progressive Quantization (UPQ) is proposed: a framework that combines block-wise post-training quantization (PTQ) with distillation-based quantization-aware training (Distill-QAT) for 2-bit instruction-tuned LLM quantization.
  • UPQ first quantizes FP16 instruction-tuned models to INT4 using block-wise PTQ to reduce quantization error before quantizing further to INT2 (a simplified sketch of both stages follows this list).
  • Distill-QAT is then applied so that the INT2 instruction-tuned LLM produces responses consistent with its original FP16 counterpart.
  • UPQ can quantize open-source instruction-tuned LLMs to 2 bits without relying on proprietary post-training data.
  • UPQ achieves state-of-the-art results on MMLU and IFEval, benchmarks commonly used to evaluate instruction-tuned LLMs.
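A minimal sketch of the two ideas, assuming simple symmetric per-channel quantization and a KL-based distillation loss; UPQ's actual block-wise PTQ procedure and Distill-QAT objective are more involved.

```python
import torch
import torch.nn.functional as F

def quantize_weights(w, bits):
    """Symmetric per-channel weight quantization (simplified stand-in for PTQ)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

w_fp16 = torch.randn(4096, 4096)
w_int4 = quantize_weights(w_fp16, bits=4)   # stage 1: quantize FP16 -> INT4 first
w_int2 = quantize_weights(w_int4, bits=2)   # stage 2: then push INT4 -> INT2

# Distill-QAT idea (sketch): train the INT2 student so its token distribution
# matches the FP16 teacher's, preserving instruction-following behaviour.
teacher_logits = torch.randn(8, 32_000)
student_logits = torch.randn(8, 32_000, requires_grad=True)
loss = F.kl_div(F.log_softmax(student_logits, dim=-1),
                F.log_softmax(teacher_logits, dim=-1),
                log_target=True, reduction="batchmean")
loss.backward()
```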


Arxiv

MetaTT: A Global Tensor-Train Adapter for Parameter-Efficient Fine-Tuning

  • MetaTT is a Tensor-Train (TT) adapter framework for global low-rank fine-tuning of pre-trained transformers.
  • Unlike LoRA, MetaTT uses a single shared TT that factorizes all transformer sub-modules, such as the query, key, value, projection, and feed-forward layers, by indexing structural axes.
  • For a given rank, MetaTT adds parameters proportional to the sum across modes, yielding a final adapter that is much more compressed than LoRA's (a back-of-the-envelope comparison follows this list).
  • Benchmarks on standard language-modeling tasks show that MetaTT substantially reduces the number of adapter parameters while matching LoRA's accuracy and outperforming other tensor-based methods.
  • The TT ansatz benefits from mature optimization routines, such as DMRG-style rank-adaptive minimization and Adam, making training simpler than with other tensor-factorization methods.
  • MetaTT also allows new modes to be appended cheaply, so adapters can be shared across multiple tasks without redesigning the core tensor.
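A rough parameter count illustrates why sharing one tensor train across layers and modules compresses the adapter. The hidden size, rank, and layer/module counts below are assumed values, and the TT count folds the boundary ranks in for simplicity; it is not the paper's reported configuration.

```python
# Illustrative parameter comparison (assumed sizes, not the paper's reported setups).
d_model, rank = 4096, 16          # hidden size and adapter rank
n_layers, n_modules = 32, 6       # e.g. q, k, v, o, up, down per layer (assumed)

# LoRA: an independent pair of (d x r) factors for every adapted weight matrix.
lora_params = n_layers * n_modules * 2 * d_model * rank

# A single shared tensor train over the axes (layer, module, d_in, d_out):
# each axis contributes one core, so parameters grow with the SUM of the mode
# sizes times the TT ranks rather than with the number of adapted matrices.
modes = [n_layers, n_modules, d_model, d_model]
tt_params = sum(rank * m * rank for m in modes)

print(f"LoRA adapter parameters:      {lora_params:,}")   # ~25.2M
print(f"Shared TT adapter parameters: {tt_params:,}")     # ~2.1M
```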


Arxiv

SensorLM: Learning the Language of Wearable Sensors

  • SensorLM is introduced as a family of sensor-language foundation models for understanding wearable sensor data with natural language.
  • Aligning and interpreting sensor data with language is difficult because real-world wearable data lacks paired, richly annotated sensor-text descriptions.
  • SensorLM uses a hierarchical caption-generation pipeline to extract statistical, structural, and semantic information from sensor data, creating the largest sensor-language dataset to date, with over 59.7 million hours of data from 103,000 individuals.
  • It extends multimodal pretraining architectures such as CLIP and CoCa (a generic sketch of CLIP-style alignment follows this list) and outperforms state-of-the-art methods in zero-shot recognition, few-shot learning, and cross-modal retrieval on human activity analysis and healthcare tasks.
  • SensorLM also demonstrates favorable scaling behavior, label efficiency, sensor captioning, and zero-shot generalization to new tasks.
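Since the summary says SensorLM builds on CLIP-style pretraining, a generic symmetric contrastive loss between paired sensor and text embeddings is sketched below. The embedding size and temperature are placeholders, and SensorLM's actual objectives (including its CoCa-style captioning head) go beyond this.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(sensor_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning paired sensor and text embeddings
    (generic CLIP-style sketch, not SensorLM's exact objective)."""
    sensor_emb = F.normalize(sensor_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = sensor_emb @ text_emb.t() / temperature   # batch x batch similarity matrix
    targets = torch.arange(len(logits))                # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = clip_style_loss(torch.randn(32, 512), torch.randn(32, 512))
print(loss.item())
```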


Arxiv

CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model

  • Researchers introduce CodeBrain, an efficient EEG foundation model for capturing multi-scale brain dependencies.
  • CodeBrain addresses challenges that traditional EEG models face with heterogeneous channel configurations and task objectives.
  • CodeBrain is trained in two stages: a TFDual-Tokenizer for heterogeneous temporal and frequency tokenization, and EEGSSM for modeling dependencies.
  • The TFDual-Tokenizer quadratically expands the discrete representation space and offers interpretability through cross-domain token analysis.
  • EEGSSM combines a global convolution architecture with sliding-window attention to capture long-range and local dependencies efficiently.
  • EEGSSM reflects the brain's small-world topology better than fully connected Transformer models.
  • Training uses a masked self-supervised objective in which the model predicts token indices (a generic sketch of this objective follows this list).
  • Experiments on 10 public EEG datasets demonstrate CodeBrain's generalizability via linear probing.
  • CodeBrain offers biologically informed and interpretable EEG modeling, laying a foundation for future neuroscience research.
  • The code and pretrained weights for CodeBrain will be released in a future version.
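Masked prediction of discrete token indices is a standard self-supervised recipe; the sketch below shows its general shape with a placeholder vocabulary size and masking ratio, and a random tensor standing in for the EEGSSM encoder's output. None of these values reflect CodeBrain's actual settings.

```python
import torch
import torch.nn.functional as F

vocab_size, mask_ratio = 1024, 0.4                     # placeholder values
tokens = torch.randint(0, vocab_size, (8, 256))        # discrete EEG token indices
mask = torch.rand(tokens.shape) < mask_ratio           # positions to reconstruct

# Masked input that the encoder would consume; index `vocab_size` acts as [MASK].
inputs = tokens.masked_fill(mask, vocab_size)

# Stand-in for the encoder's per-position logits over the token vocabulary.
logits = torch.randn(8, 256, vocab_size, requires_grad=True)

# Cross-entropy only on the masked positions: the model must recover the original
# token indices from the surrounding temporal and frequency context.
loss = F.cross_entropy(logits[mask], tokens[mask])
loss.backward()
```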


Arxiv

TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval

  • TRACE is a new multimodal retriever that grounds time-series data in textual context.
  • Dynamic data in domains such as weather, healthcare, and energy require effective interpretation and retrieval, yet existing time-series retrieval methods lack semantic grounding.
  • TRACE aligns time-series embeddings with textual context, linking linguistic descriptions with complex temporal patterns, and supports both Text-to-Timeseries and Timeseries-to-Text retrieval (a minimal retrieval sketch follows this list).
  • It enables fine-grained channel-level alignment and handles multi-channel signals effectively.
  • Hard negative mining is used to make retrieval semantically meaningful.
  • Retrieved context enriches downstream models, improving their predictive accuracy and interpretability.
  • TRACE also functions as a standalone encoder that can be task-specifically tuned for context-aware representations, achieving state-of-the-art performance on forecasting and classification tasks.
  • Extensive experiments across domains validate this dual utility: an encoder for downstream applications and a general-purpose retriever that enhances time-series models.
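Once both modalities live in a shared embedding space, cross-modal retrieval reduces to nearest-neighbour search. The sketch below assumes pre-computed embeddings and uses cosine similarity, with random vectors standing in for TRACE's aligned encoders.

```python
import numpy as np

def retrieve(query_emb, candidate_embs, k=5):
    """Return indices of the k most similar candidates by cosine similarity.
    Generic cross-modal retrieval step; TRACE supplies the aligned encoders."""
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:k]

rng = np.random.default_rng(0)
series_embs = rng.normal(size=(10_000, 256))   # pre-encoded time-series corpus
text_query = rng.normal(size=256)              # encoded textual query
print(retrieve(text_query, series_embs))       # Text-to-Timeseries retrieval
```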


Arxiv

Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes

  • Neural Processes (NPs) are models that predict the posterior predictive distribution of stochastic processes.
  • Modern NPs handle complex applications in areas such as geology, epidemiology, climate, and robotics.
  • Scalability has become crucial as NPs are applied to increasingly data-hungry problems.
  • A new architecture, Biased Scan Attention Transformer Neural Process (BSA-TNP), is proposed.
  • BSA-TNP introduces Kernel Regression Blocks (KRBlocks) and group-invariant attention biases.
  • BSA-TNP uses memory-efficient Biased Scan Attention (BSA) for scalability.
  • BSA-TNP matches or surpasses the accuracy of top models while training faster.
  • It exhibits translation invariance and can learn at multiple resolutions simultaneously.
  • BSA-TNP can model processes that evolve in space and time and supports high-dimensional fixed effects.
  • The model can perform inference with over 1M test points and 100K context points in under a minute on a single 24GB GPU.


Arxiv

Improving LLM Agent Planning with In-Context Learning via Atomic Fact Augmentation and Lookahead Search

  • Large Language Models (LLMs) often require guidance to perform well in complex environments.
  • A new framework enhances LLM agent planning through in-context learning.
  • It combines atomic fact augmentation with lookahead search to improve planning capability.
  • The agent extracts task-critical 'atomic facts' from its interaction trajectories.
  • These facts augment the prompts of the LLM-based components, leading to better decisions.
  • Planning uses a depth-limited lookahead search guided by the accumulated facts and interaction history (a schematic of this loop follows this list).
  • The approach improves the agent's understanding and decision-making without any weight updates.
  • A theoretical motivation links performance to the quality of the fact-based abstraction and the accuracy of the LLM's simulations.
  • Empirically, the agent shows improved performance and adaptability on interactive tasks such as TextFrozenLake and ALFWorld.
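The planning loop can be pictured as plain depth-limited search in which every proposal, simulation, and evaluation call is an LLM prompt conditioned on the accumulated atomic facts. The llm_propose, llm_simulate, and llm_value functions below are hypothetical stand-ins, and the toy state and actions are invented so the example runs.

```python
# Hypothetical stand-ins for the framework's LLM-based components; in the paper
# these are prompts over a language model conditioned on the extracted atomic facts.
def llm_propose(state, facts, history):  return ["left", "right", "wait"]
def llm_simulate(state, action, facts):  return state + len(action) % 3
def llm_value(state, facts, history):    return -abs(10 - state)

def lookahead_plan(state, facts, history, depth):
    """Depth-limited lookahead search guided by accumulated atomic facts."""
    if depth == 0:
        return llm_value(state, facts, history), None
    best_value, best_action = float("-inf"), None
    for action in llm_propose(state, facts, history):
        next_state = llm_simulate(state, action, facts)     # LLM used as a world model
        value, _ = lookahead_plan(next_state, facts, history + [action], depth - 1)
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action

facts = ["tile (2,3) of the frozen lake is cracked"]         # extracted atomic facts
print(lookahead_plan(0, facts, [], depth=3))
```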


Arxiv

The Curious Language Model: Strategic Test-Time Information Acquisition

  • Decision-makers often lack enough information to decide confidently and can take actions to acquire what they need.
  • Different ways of acquiring information carry different costs, making it challenging to select actions that are both informative and cost-effective.
  • A heuristic-based policy called CuriosiTree is proposed for zero-shot information acquisition with large language models (LLMs).
  • CuriosiTree uses greedy tree search to estimate the expected information gain of each action and strategically selects actions that balance information gain against cost (a one-step simplification is sketched after this list).
  • Empirical validation in a clinical diagnosis simulation shows that CuriosiTree enables cost-effective integration of heterogeneous information sources.
  • CuriosiTree outperforms baseline strategies at selecting action sequences that lead to an accurate diagnosis.
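Stripped of the tree search, the core trade-off is choosing the next acquisition whose expected information gain best justifies its cost. The sketch below is a one-step, greedy simplification with made-up gain and cost numbers, not CuriosiTree's actual scoring.

```python
# Candidate information-gathering actions with assumed expected-gain and cost values.
actions = {
    "ask about symptoms": {"gain": 0.9, "cost": 1.0},
    "order blood panel":  {"gain": 1.6, "cost": 4.0},
    "order MRI":          {"gain": 2.0, "cost": 9.0},
}

def next_action(candidates, budget):
    """Pick the affordable action with the best expected gain per unit cost."""
    affordable = {a: v for a, v in candidates.items() if v["cost"] <= budget}
    if not affordable:
        return None
    return max(affordable, key=lambda a: affordable[a]["gain"] / affordable[a]["cost"])

print(next_action(actions, budget=5.0))   # -> "ask about symptoms"
```

CuriosiTree extends this idea by searching over sequences of acquisitions rather than scoring one step at a time.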


Arxiv

Multivariate Long-term Time Series Forecasting with Fourier Neural Filter

  • Multivariate long-term time series forecasting must capture temporal dependencies and spatial correlations simultaneously, which remains challenging.
  • Current approaches such as Transformers do not handle time-series properties like periodicity effectively.
  • The work introduces the Fourier Neural Filter (FNF) as a dedicated backbone and DBD as the architecture for spatio-temporal modeling.
  • FNF unifies local time-domain and global frequency-domain information processing within a single backbone and extends to spatial modeling (an illustrative frequency-domain filtering layer is sketched after this list).
  • DBD offers superior gradient flow and representation capacity.
  • Empirical evaluation across 11 public benchmark datasets spanning multiple domains demonstrates state-of-the-art performance.
  • These results are achieved without auxiliary techniques, indicating the potential for improved time-series modeling in scientific and industrial applications.
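To make "global frequency-domain processing" concrete, the layer below applies a learnable complex-valued filter in the Fourier domain of each series. This is a generic illustration of the idea, not FNF's actual block, and the shapes are placeholders.

```python
import torch

class FrequencyFilter(torch.nn.Module):
    """Learnable global filtering in the frequency domain (illustrative only;
    FNF combines this kind of global view with local time-domain processing)."""

    def __init__(self, seq_len, channels):
        super().__init__()
        n_freq = seq_len // 2 + 1
        # Real and imaginary parts of a per-channel, per-frequency filter.
        self.weight = torch.nn.Parameter(torch.randn(channels, n_freq, 2))

    def forward(self, x):                          # x: (batch, channels, seq_len)
        spec = torch.fft.rfft(x, dim=-1)           # global view of each series
        filt = torch.view_as_complex(self.weight)
        return torch.fft.irfft(spec * filt, n=x.shape[-1], dim=-1)

x = torch.randn(8, 7, 96)                          # 7 variables, 96 time steps
print(FrequencyFilter(seq_len=96, channels=7)(x).shape)   # torch.Size([8, 7, 96])
```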


Arxiv

Multi-Task Reward Learning from Human Ratings

  • Reinforcement learning from human feedback (RLHF) is crucial for aligning model behavior with user goals.
  • Current RLHF methods oversimplify human decision-making by framing it as isolated tasks such as classification or regression.
  • The paper presents a new reinforcement learning method that considers multiple tasks jointly to better mimic human decision-making.
  • The method learns a reward function from human ratings in reward-free settings, striking a balance between classification and regression models.
  • This approach accounts for the uncertainty in human decision-making and allows the emphasis between strategies to adapt.
  • Experiments with synthetic human ratings show that the method outperforms existing rating-based RL techniques.
  • It even outperforms traditional RL approaches in certain scenarios.


Arxiv

LaDCast: A Latent Diffusion Model for Medium-Range Ensemble Weather Forecasting

  • LaDCast is a new global latent-diffusion framework introduced for medium-range ensemble weather forecasting.
  • It generates hourly ensemble forecasts in a learned latent space by compressing high-dimensional ERA5 reanalysis fields into a compact representation using an autoencoder.
  • A transformer-based diffusion model is employed to produce sequential latent updates with arbitrary hour initialization.
  • LaDCast incorporates Geometric Rotary Position Embedding (GeoRoPE) to consider the Earth's spherical geometry, a dual-stream attention mechanism for efficient conditioning, and sinusoidal temporal embeddings for capturing seasonal patterns.
  • The model achieves deterministic and probabilistic forecasting skill comparable to the European Centre for Medium-Range Weather Forecasts' IFS-ENS, without explicit perturbations.
  • LaDCast excels in tracking rare extreme events like cyclones, providing more accurate trajectory predictions compared to established models.
  • By operating in latent space, LaDCast significantly reduces storage and computational requirements, offering a practical approach to real-time kilometer-scale resolution forecasting.
  • The code and models for LaDCast are open-source, and training and evaluation pipelines are available at https://github.com/tonyzyl/ladcast.


Arxiv

FLoRIST: Singular Value Thresholding for Efficient and Accurate Federated Fine-Tuning of Large Language Models

  • Integrating Low-Rank Adaptation (LoRA) into federated learning enables efficient fine-tuning of Large Language Models (LLMs) without sharing local data.
  • Federated LoRA methods struggle to balance communication efficiency, model accuracy, and computational cost, especially across heterogeneous clients.
  • Existing approaches either rely on simplistic averaging of local adapters, which introduces noise; transmit large local adapters, which hurts communication efficiency; or require computationally expensive decompositions to build client-specific low-rank adapters.
  • The proposed FLoRIST framework achieves accurate aggregation without high communication or computational overhead by performing singular value decomposition on the stacked local adapters separately (a simplified single-matrix sketch follows this list).
  • FLoRIST operates in a compact intermediate space to represent the information from the local LoRAs and uses tunable singular value thresholding to select an optimal rank on the server, constructing global low-rank adapters shared by all clients.
  • Empirical evaluations across datasets and LLMs show that FLoRIST delivers superior communication efficiency with competitive performance in both homogeneous and heterogeneous setups.
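The aggregation idea, decompose the combined client updates and keep only the ranks that carry most of the spectrum, can be shown on a single matrix. The sketch below forms the averaged update explicitly and thresholds by retained spectral energy; FLoRIST itself avoids materializing the full matrix and decomposes the stacked B and A factors separately, so treat this as a simplification with assumed sizes.

```python
import numpy as np

def aggregate_lora(B_list, A_list, energy=0.95):
    """Aggregate clients' LoRA factors via SVD plus singular value thresholding
    (single-matrix simplification of a FLoRIST-style aggregation step)."""
    # Averaged low-rank update; formed explicitly here only for clarity.
    delta = sum(B @ A for B, A in zip(B_list, A_list)) / len(B_list)
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the smallest rank whose singular values capture `energy` of the spectrum.
    r = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    return U[:, :r] * s[:r], Vt[:r]                     # global B and A at rank r

clients = [(np.random.randn(768, 8), np.random.randn(8, 768)) for _ in range(5)]
B_glob, A_glob = aggregate_lora([b for b, _ in clients], [a for _, a in clients])
print(B_glob.shape, A_glob.shape)
```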


Arxiv

Policy-Based Trajectory Clustering in Offline Reinforcement Learning

  • Researchers introduce the novel task of clustering trajectories from offline reinforcement learning datasets, where each cluster center represents the policy that generated its trajectories.
  • The clustering objective is formulated as the KL-divergence between the offline trajectory distribution and a mixture of policy-induced distributions.
  • Two methods are proposed for the task: Policy-Guided K-means (PG-Kmeans) and the Centroid-Attracted Autoencoder (CAAE).
  • PG-Kmeans trains behavior-cloning policies and assigns trajectories to clusters based on the probability that each policy generated them (a schematic of the assignment step follows this list), while CAAE guides the latent representations of trajectories toward specific codebook entries for clustering.
  • The finite-step convergence of PG-Kmeans is proven theoretically, and policy-induced conflicts are identified as a key challenge in offline trajectory clustering.
  • Experimental validation on the D4RL dataset and custom GridWorld environments shows that PG-Kmeans and CAAE partition trajectories into meaningful clusters.
  • These methods offer a promising framework for policy-based trajectory clustering, applicable in offline RL and beyond.
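In a k-means-style loop, the assignment step gives each trajectory to the cluster whose behavior-cloning policy explains it best, after which each cluster's policy is refit on its assigned trajectories. The sketch below shows only the assignment step, with tabular toy policies invented for illustration.

```python
import numpy as np

def assign_trajectories(trajectories, policies):
    """Assign each trajectory to the cluster whose policy gives it the highest
    log-likelihood (schematic PG-Kmeans-style assignment step).
    `policies[k][state]` is a toy per-state action distribution."""
    labels = []
    for traj in trajectories:                 # traj = [(state, action), ...]
        loglik = [sum(np.log(pi[s][a]) for s, a in traj) for pi in policies]
        labels.append(int(np.argmax(loglik)))
    return labels

# Two toy policies over 2 states x 2 actions (invented numbers).
policies = [
    {0: [0.9, 0.1], 1: [0.8, 0.2]},           # cluster 0 prefers action 0
    {0: [0.1, 0.9], 1: [0.2, 0.8]},           # cluster 1 prefers action 1
]
trajs = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
print(assign_trajectories(trajs, policies))    # -> [0, 1]
```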

