techminis

A naukri.com initiative

ML News

Arxiv

Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification, An Interpretable Multi-Omics Approach

  • This study introduces the Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN), a deep learning model that integrates messenger RNA, micro RNA sequences, and DNA methylation data with Protein-Protein Interaction (PPI) networks for accurate and interpretable cancer classification across 31 cancer types.
  • MOGKAN achieves a classification accuracy of 96.28 percent and shows low experimental variability, with a standard deviation 1.58 to 7.30 percent lower than that of Convolutional Neural Networks (CNNs) and Graph Neural Networks (GNNs).
  • The biomarkers identified by MOGKAN have been validated as cancer-related markers through Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis.
  • The proposed model shows an ability to uncover molecular oncogenesis mechanisms by detecting phosphoinositide-binding substances and regulating sphingolipid cellular processes.

Arxiv

MNT-TNN: Spatiotemporal Traffic Data Imputation via Compact Multimode Nonlinear Transform-based Tensor Nuclear Norm

  • Imputation of random or non-random missing data is a long-standing research topic and a crucial application for Intelligent Transportation Systems (ITS).
  • A novel spatiotemporal traffic imputation method, Multimode Nonlinear Transformed Tensor Nuclear Norm (MNT-TNN), is proposed to address the challenges in random missing value imputation and spatiotemporal dependency modeling.
  • MNT-TNN utilizes the Transform-based Tensor Nuclear Norm (TTNN) optimization framework, extending it to a multimode transform with nonlinear activation to capture spatiotemporal correlations and low-rankness of the traffic tensor.
  • Experimental results show that MNT-TNN and its enhancement framework, ATTNNs, outperform existing imputation methods for random missing traffic value imputation.

Arxiv

Multimodal machine learning with large language embedding model for polymer property prediction

  • Contemporary large language models (LLMs) like GPT-4 and Llama, combined with molecular structure embeddings, enable accurate prediction of polymer properties.
  • PolyLLMem, a multimodal architecture, integrates text embeddings from Llama 3 with molecular structure embeddings derived from Uni-Mol.
  • Low-rank adaptation (LoRA) layers are incorporated to refine the embeddings on a limited polymer dataset, enhancing their chemical relevance.
  • PolyLLMem's performance is comparable to graph-based and transformer-based models, even with limited training data, accelerating the discovery of advanced polymeric materials.
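
To make the LoRA idea concrete, here is a minimal sketch of the generic low-rank update, W + (alpha / r) · B · A, written in plain Python. It is not PolyLLMem's code: the function names, matrix shapes, and the `alpha` scaling convention are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the effective weight after a LoRA update.

    w: frozen weight of shape d_out x d_in
    b: trainable factor of shape d_out x r
    a: trainable factor of shape r x d_in
    """
    r = len(a)  # rank of the adapter (hypothetical naming)
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_counts(d_out, d_in, r):
    """Trainable parameters: full fine-tuning vs. a rank-r adapter."""
    return d_out * d_in, r * (d_out + d_in)
```

For a 4096 x 4096 layer and rank 8, `lora_param_counts` gives 16,777,216 trainable parameters for full fine-tuning against 65,536 for the adapter, which is why the approach suits small datasets.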

Arxiv

Enhancing Federated Learning Through Secure Cluster-Weighted Client Aggregation

  • Federated learning (FL) is a promising paradigm in machine learning that enables collaborative model training across decentralized devices without sharing raw data.
  • The heterogeneous nature of local datasets in FL can cause model performance discrepancies, convergence challenges, and privacy concerns.
  • A novel FL framework called ClusterGuardFL is introduced, which uses dissimilarity scores, k-means clustering, and reconciliation confidence scores to assign weights to client updates.
  • Experimental results show that ClusterGuardFL improves model performance in diverse datasets.
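
The general idea of weighting client updates by how much they differ from the rest can be sketched as follows. This is an illustration under stated assumptions, not ClusterGuardFL itself: the distance-to-mean score, the exponential weighting, and the `temperature` parameter are hypothetical stand-ins, and the k-means and reconciliation confidence steps are omitted.

```python
import math


def dissimilarity_weights(updates, temperature=1.0):
    """Give each client a weight that shrinks as its update moves away from the mean.

    Assumption: dissimilarity is the Euclidean distance to the mean update.
    """
    dim = len(updates[0])
    mean = [sum(u[i] for u in updates) / len(updates) for i in range(dim)]
    dists = [math.dist(u, mean) for u in updates]
    scores = [math.exp(-d / temperature) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


def weighted_aggregate(updates, weights):
    """Weighted average of client update vectors."""
    total = sum(weights)
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates)) / total for i in range(dim)]
```

With equal weights, `weighted_aggregate` reduces to a plain average of the client updates; with the dissimilarity weights, an outlying client pulls the aggregate less.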

Arxiv

DC-SGD: Differentially Private SGD with Dynamic Clipping through Gradient Norm Distribution Estimation

  • Differentially Private Stochastic Gradient Descent (DP-SGD) is widely used for privacy-preserving deep learning.
  • Selecting the optimal clipping threshold C in DP-SGD is challenging and incurs both privacy and computational overhead.
  • A new framework called Dynamic Clipping DP-SGD (DC-SGD) is proposed, leveraging differentially private histograms to estimate gradient norm distributions and adjust the clipping threshold C dynamically.
  • Experimental results show that DC-SGD achieves up to 9 times acceleration in hyperparameter tuning compared to DP-SGD, with improved accuracy and privacy guarantees.
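
The role of the clipping threshold C is easier to see in code. The sketch below shows one standard DP-SGD gradient step (clip per-sample gradients to norm C, sum, add Gaussian noise, average) plus a simple way to read a threshold off a histogram of gradient norms. It is not the DC-SGD algorithm: the histogram here is not differentially private (a private version would add noise to the counts), and all function names and the quantile rule are assumptions.

```python
import math
import random


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def clip(vec, c):
    """Scale vec so that its L2 norm is at most c."""
    norm = l2_norm(vec)
    factor = min(1.0, c / norm) if norm > 0 else 1.0
    return [x * factor for x in vec]


def dp_sgd_gradient(per_sample_grads, c, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    dim = len(per_sample_grads[0])
    clipped = [clip(g, c) for g in per_sample_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * c  # noise scale is tied to the clipping threshold
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(per_sample_grads) for x in noisy]


def threshold_from_histogram(norms, bin_edges, quantile):
    """Return the upper edge of the first bin where the cumulative count reaches the quantile.

    Assumption: plain, non-private counts; norms outside the bin range are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for n in norms:
        for i in range(len(counts)):
            if bin_edges[i] <= n < bin_edges[i + 1]:
                counts[i] += 1
                break
    target = quantile * sum(counts)
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            return bin_edges[i + 1]
    return bin_edges[-1]
```

Because the noise scale is `noise_multiplier * c`, a threshold that is too large adds unnecessary noise, while one that is too small discards gradient signal; this is the trade-off that makes tuning C costly.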

Arxiv

AuditVotes: A Framework Towards More Deployable Certified Robustness for Graph Neural Networks

  • Despite advancements in Graph Neural Networks (GNNs), adaptive attacks continue to challenge their robustness.
  • Certified robustness based on randomized smoothing has emerged as a promising solution, offering provable guarantees that a model's predictions remain stable under adversarial perturbations.
  • The proposed framework, AuditVotes, integrates randomized smoothing with augmentation and conditional smoothing to improve data quality and prediction consistency.
  • Experimental results demonstrate that AuditVotes significantly enhances clean accuracy, certified robustness, and empirical robustness for GNNs.
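
Randomized smoothing, the building block named above, amounts to a majority vote over randomly perturbed copies of the input. The sketch below illustrates only that voting step on a binary vector (standing in for, say, a row of an adjacency matrix). It is not AuditVotes: the bit-flip noise model and function name are assumptions, and the certification step that turns vote counts into a provable guarantee is omitted.

```python
import random
from collections import Counter


def smoothed_predict(classifier, x, flip_prob, num_samples, rng):
    """Majority vote of `classifier` over randomly perturbed copies of binary input x.

    Each bit is flipped independently with probability flip_prob.
    Returns the winning label and its share of the votes.
    """
    votes = Counter()
    for _ in range(num_samples):
        noisy = [1 - bit if rng.random() < flip_prob else bit for bit in x]
        votes[classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / num_samples
```

A large vote share for the winning label is what a certification procedure would then bound statistically; a share near one half signals a prediction that small perturbations can change.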

Arxiv

Buyer-Initiated Auction Mechanism for Data Redemption in Machine Unlearning

  • The rapid growth of artificial intelligence (AI) has raised privacy concerns over user data.
  • Machine unlearning allows AI service providers to remove user data from trained models and training datasets to comply with privacy regulations like GDPR and CCPA.
  • To balance the cost of unlearning and privacy protection, a buyer-initiated auction mechanism for data redemption is proposed.
  • The mechanism enables service providers to purchase data from willing users with appropriate compensation, maximizing social welfare.

Arxiv

Learning Structure-enhanced Temporal Point Processes with Gromov-Wasserstein Regularization

  • Real-world event sequences often have clustering structures, but most existing temporal point processes (TPPs) ignore them.
  • A new study proposes learning structure-enhanced TPPs with Gromov-Wasserstein (GW) regularization.
  • The proposed method imposes clustering structures on TPPs for improved interpretability in modeling and prediction.
  • The learned TPPs demonstrate clustered sequence embeddings and competitive predictive and clustering performance.

Arxiv

MSNGO: multi-species protein function annotation based on 3D protein structure and network propagation

  • Protein function prediction has improved using high-precision protein structures predicted by AlphaFold2.
  • A new model called MSNGO integrates structural features and network propagation methods for multi-species protein function prediction.
  • Using structural features significantly enhances the accuracy of multi-species protein function prediction.
  • MSNGO outperforms previous methods relying on sequence features and protein-protein interaction networks.

Arxiv

Function Fitting Based on Kolmogorov-Arnold Theorem and Kernel Functions

  • This paper proposes a unified theoretical framework based on the Kolmogorov-Arnold representation theorem and kernel methods.
  • The framework establishes a kernel-based feature fitting approach that unifies Kolmogorov-Arnold Networks (KANs) and self-attention mechanisms.
  • A low-rank Pseudo-Multi-Head Self-Attention module (Pseudo-MHSA) is introduced, which reduces parameter count by nearly 50% compared to traditional MHSA.
  • Experiments on the CIFAR-10 dataset demonstrate the proposed model's performance and its similarity to the ViT model under the MAE framework.

Arxiv

Prediction of 30-day hospital readmission with clinical notes and EHR information

  • High hospital readmission rates are associated with significant costs and health risks for patients.
  • Predictive models can help clinicians identify which patients are likely to be readmitted within a short period.
  • Combining clinical notes and electronic health records (EHRs) helps in predicting 30-day hospital readmissions.
  • A graph neural network (GNN) is used to integrate both information sources, achieving an AUROC of 0.72 and a balanced accuracy of 66.7%.
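
For readers unfamiliar with the two reported metrics, the sketch below computes them from scratch using their standard definitions. The function names and the toy data are illustrative and have no connection to the study's data or code.

```python
def auroc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels, preds):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

AUROC depends only on how the model ranks patients, whereas balanced accuracy depends on the chosen decision threshold; both are robust to the class imbalance typical of readmission data, where plain accuracy can be misleading.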

Arxiv

Unsupervised Anomaly Detection in Multivariate Time Series across Heterogeneous Domains

  • The widespread adoption of digital services has increased the need for anomaly detection in IT operations.
  • A unifying framework for benchmarking unsupervised anomaly detection methods is introduced.
  • The problem of shifts in normal behaviors in AIOps scenarios is highlighted.
  • The proposed approach, Domain-Invariant VAE for Anomaly Detection (DIVAD), outperforms existing methods.

Arxiv

TRACE: Intra-visit Clinical Event Nowcasting via Effective Patient Trajectory Encoding

  • Researchers propose a new model called TRACE for intra-visit clinical event nowcasting in electronic health records (EHR).
  • The model effectively encodes patient trajectories and captures temporal dependencies.
  • It outperforms previous methods in laboratory measurement prediction, improving patient care.
  • The code for the model is available on GitHub for further exploration and use.

Arxiv

Beyond Standard MoE: Mixture of Latent Experts for Resource-Efficient Language Models

  • Mixture of Experts (MoE) has emerged as a pivotal architectural paradigm for efficient scaling of Large Language Models (LLMs), operating through selective activation of parameter subsets for each input token.
  • In this paper, the authors introduce Mixture of Latent Experts (MoLE), a novel parameterization methodology that facilitates the mapping of specific experts into a shared latent space.
  • The MoLE architecture significantly reduces parameter count and computational requirements, addressing challenges such as excessive memory utilization and communication overhead during training and inference.
  • Empirical evaluations demonstrate that MoLE achieves performance comparable to standard MoE implementations while substantially reducing resource requirements.
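
The "selective activation" of a standard MoE layer can be sketched as top-k routing: a gate scores every expert, only the k best run, and their outputs are mixed by renormalized gate weights. The code below illustrates that baseline only. It does not implement MoLE's shared latent space, and the function names and the choice to apply softmax over the selected experts are assumptions.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_logits, k):
    """Run only the selected experts on x and mix their outputs by gate weight."""
    routed = top_k_route(gate_logits, k)
    outputs = [(w, experts[i](x)) for i, w in routed]
    dim = len(outputs[0][1])
    return [sum(w * out[j] for w, out in outputs) for j in range(dim)]
```

Only k experts execute per token, but every expert's parameters must still be stored and communicated, which is the memory and communication cost the summary refers to.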

Arxiv

RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations

  • Reinforcement learning (RL) can transform power grid operations by providing adaptive and scalable controllers essential for grid decarbonization.
  • RL2Grid is a benchmark designed in collaboration with power system operators to accelerate progress in grid control and foster RL maturity.
  • RL2Grid standardizes tasks, state and action spaces, and reward structures within a unified interface for systematic evaluation and comparison of RL approaches.
  • The benchmark results highlight the challenges power grids pose for RL methods, emphasizing the need for novel algorithms capable of handling real-world physical systems.
