techminis

A naukri.com initiative

ML News

Arxiv

4d

Image Credit: Arxiv

Adversarial Resilience against Clean-Label Attacks in Realizable and Noisy Settings

  • This paper investigates the challenge of establishing stochastic-like guarantees for learning from a stream of data that includes both unknown clean-label adversarial samples and noise.
  • The approach allows the learner to abstain from making predictions when uncertain, and measures regret in terms of misclassification and abstention error.
  • The study corrects inaccuracies in the work of Goel, Hanneke, Moran, and Shetty and explores methods for the agnostic setting with random labels.
  • The paper introduces the concept of a clean-label adversary in the agnostic context and provides a theoretical analysis of a disagreement-based learner subject to a clean-label adversary with noise.
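
To illustrate the abstention idea, here is a minimal sketch of a disagreement-based learner for one-dimensional threshold classifiers in the realizable (noise-free) case. The threshold hypothesis class, the `ThresholdLearner` name, and the use of `None` to signal abstention are illustrative assumptions, not the paper's algorithm, and the sketch does not handle noisy labels.

```python
class ThresholdLearner:
    """Toy disagreement-based learner for h_t(x) = 1 if x >= t else 0.

    Hypothetical example: it predicts only where every threshold that is
    consistent with the data so far agrees, and abstains otherwise.
    """

    def __init__(self):
        self.lo = float("-inf")  # largest point seen with label 0
        self.hi = float("inf")   # smallest point seen with label 1

    def predict(self, x):
        if x >= self.hi:
            return 1      # every consistent threshold labels x as 1
        if x <= self.lo:
            return 0      # every consistent threshold labels x as 0
        return None       # disagreement region: abstain

    def update(self, x, y):
        # Shrink the set of consistent thresholds, which is (lo, hi].
        if y == 1:
            self.hi = min(self.hi, x)
        else:
            self.lo = max(self.lo, x)


learner = ThresholdLearner()
learner.update(0.2, 0)
learner.update(0.8, 1)
print(learner.predict(0.1), learner.predict(0.9), learner.predict(0.5))
```

In this toy setting the learner never misclassifies a point it chooses to label, so all of its regret comes from abstentions inside the disagreement region.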

Multiscale Tensor Summation Factorization as a New Neural Network Layer (MTS Layer) for Multidimensional Data Processing

  • Multiscale Tensor Summation Factorization (MTS) is introduced as a new neural network layer for multidimensional data processing.
  • MTS performs tensor summation at multiple scales using Tucker-decomposition-like mode products.
  • It reduces the number of parameters required and enhances the efficiency of weight optimization compared to traditional dense layers.
  • MTS demonstrates advantages over convolutional layers and shows effectiveness in various tasks, such as classification, compression, and signal restoration.
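
The parameter saving from mode products can be sketched for a two-dimensional input. The code below is only an assumption-laden illustration of a single Tucker-style pair of mode products, Y = U1 X U2^T; it does not reproduce the multiscale summation of the MTS layer, and all function names are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def mode_products(x, u1, u2):
    """Apply u1 along mode 1 and u2 along mode 2: Y = U1 X U2^T."""
    return matmul(matmul(u1, x), transpose(u2))


def dense_params(d1, d2, m1, m2):
    # A dense layer on the flattened input needs one weight per (input, output) pair.
    return d1 * d2 * m1 * m2


def factor_params(d1, d2, m1, m2):
    # Two factor matrices: U1 is m1 x d1 and U2 is m2 x d2.
    return m1 * d1 + m2 * d2


print(mode_products([[1, 2], [3, 4]], [[1, 1]], [[1, 0], [0, 1]]))
print(dense_params(32, 32, 32, 32), factor_params(32, 32, 32, 32))
```

For a 32 x 32 input mapped to a 32 x 32 output, the dense layer in this sketch needs 1,048,576 weights, while the two factor matrices need 2,048.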

CacheFormer: High Attention-Based Segment Caching

  • Efficiently handling long contexts in transformer-based language models with low perplexity is an active area of research.
  • A new approach called CacheFormer is proposed to tackle this problem by dividing long contexts into small segments.
  • The design of CacheFormer includes retrieving nearby segments in an uncompressed form when high segment-level attention occurs at the compressed level.
  • CacheFormer outperforms existing state-of-the-art architectures with an average perplexity improvement of 8.5% over similar model sizes.
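
The retrieval step described above can be sketched as a simple selection rule: pick the segments with the highest segment-level attention, then also fetch their neighbours in uncompressed form. The function name, the `top_k` and `radius` parameters, and the score values are hypothetical; the paper's actual selection rule may differ.

```python
def select_segments(segment_scores, top_k, radius=1):
    """Return sorted indices of segments to retrieve uncompressed.

    segment_scores: one attention score per compressed segment (assumed given).
    top_k: number of highest-scoring segments to expand.
    radius: how many neighbouring segments to fetch on each side.
    """
    n = len(segment_scores)
    ranked = sorted(range(n), key=lambda i: segment_scores[i], reverse=True)
    chosen = set()
    for i in ranked[:top_k]:
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            chosen.add(j)
    return sorted(chosen)


print(select_segments([0.1, 0.05, 0.6, 0.05, 0.1, 0.1], top_k=1))
```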

Gradual Binary Search and Dimension Expansion: A general method for activation quantization in LLMs

  • Large language models (LLMs) have become pivotal in artificial intelligence, but their deployment on edge devices is hindered by their substantial size.
  • Quantization is a widely used method to reduce memory usage and inference time, but LLMs present unique challenges due to the prevalence of outliers in their activations.
  • In this work, the authors propose a method based on gradual binary search and the use of Hadamard matrices to address the challenges of activation quantization in LLMs.
  • The proposed method enables 3-bit quantization for weights, activations, and key-value (KV) caches, resulting in improved model performance compared to state-of-the-art methods.
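
A small sketch can show why Hadamard matrices help with activation outliers: an orthonormal Hadamard rotation spreads one large entry across all coordinates, so a low-bit uniform quantizer wastes less of its range. The symmetric quantizer and the example vector below are assumptions chosen for illustration, and the gradual binary search step of the paper is not shown.

```python
import math


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def rotate(x):
    """Multiply x by the orthonormal Hadamard matrix H / sqrt(n)."""
    n = len(x)
    h = hadamard(n)
    s = math.sqrt(n)
    return [sum(h[i][j] * x[j] for j in range(n)) / s for i in range(n)]


def quantize(x, bits):
    """Symmetric uniform quantize-dequantize; assumes x is not all zeros."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in x) / levels
    return [round(v / scale) * scale for v in x]


def sq_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


x = [8.0, 0.5, -0.5, 0.5]           # one outlier dominates the range
rx = rotate(x)                      # outlier energy is spread out
print(sq_error(x, quantize(x, 3)), sq_error(rx, quantize(rx, 3)))
```

Because the rotation is orthonormal, the squared error measured in the rotated domain equals the error after rotating back, so the two printed numbers are directly comparable.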

PC-DeepNet: A GNSS Positioning Error Minimization Framework Using Permutation-Invariant Deep Neural Network

  • PC-DeepNet is a new framework that uses a permutation-invariant deep neural network to minimize positioning errors in global navigation satellite systems (GNSS) in urban and suburban areas.
  • PC-DeepNet addresses challenges posed by non-line-of-sight propagation, multipath effects, and low received power levels that result in non-linear and non-Gaussian measurement error distributions.
  • The framework leverages NLOS and multipath indicators as features to enhance positioning accuracy in challenging environments.
  • PC-DeepNet achieves superior accuracy and lower computational complexity compared to existing model-based and learning-based methods.
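
Permutation invariance matters here because the set of visible satellites has no natural order. One common way to obtain it, shown below as an assumption rather than as the PC-DeepNet architecture, is to embed each satellite independently and then pool with a symmetric function such as the mean. The feature names and the fixed weights are hypothetical.

```python
import math


def embed(sat):
    """Per-satellite embedding with made-up fixed weights.

    sat = (pseudorange_residual, signal_strength, elevation), all hypothetical.
    """
    residual, strength, elevation = sat
    return [math.tanh(0.5 * residual + 0.1 * strength),
            math.tanh(0.2 * elevation - 0.3 * residual)]


def pooled(sats):
    """Mean-pool the embeddings; fsum keeps the result independent of order."""
    embs = [embed(s) for s in sats]
    return [math.fsum(e[i] for e in embs) / len(embs) for i in range(2)]


def predict_correction(sats):
    """Map the pooled embedding to a scalar correction (toy linear readout)."""
    p = pooled(sats)
    return 1.5 * p[0] - 0.7 * p[1]


sats = [(1.2, 40.0, 0.3), (-0.4, 35.0, 1.1), (2.5, 28.0, 0.6)]
print(predict_correction(sats) == predict_correction(list(reversed(sats))))
```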

Deep Learning on Graphs for Mobile Network Topology Generation

  • Mobile networks consist of interconnected radio nodes strategically positioned across various geographical regions to provide connectivity services.
  • In this work, graph-based deep learning methods are used to determine mobility relations in mobile networks, trained on radio node configuration data and Automatic Neighbor Relations (ANR).
  • An evaluation of two deep learning models, a graph neural network (GNN) and a multilayer perceptron, showed that taking the graph structure into account improves results.
  • The use of heuristics based on the distance between radio nodes was also investigated, which significantly improved precision and accuracy.
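
A distance-based heuristic of the kind mentioned above could, for example, discard candidate relations between radio nodes that are too far apart. The sketch below assumes node positions are given as latitude and longitude and uses the haversine formula; the node names, the 5 km cut-off, and the function names are hypothetical and not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def filter_by_distance(nodes, candidate_pairs, max_km):
    """Keep only candidate relations whose endpoints are within max_km."""
    return [(u, v) for u, v in candidate_pairs
            if haversine_km(nodes[u], nodes[v]) <= max_km]


nodes = {"a": (0.0, 0.0), "b": (0.0, 0.01), "c": (0.0, 1.0)}
print(filter_by_distance(nodes, [("a", "b"), ("a", "c")], max_km=5.0))
```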

CAOTE: KV Caching through Attention Output Error based Token Eviction

  • CAOTE (KV Caching through Attention Output Error based Token Eviction) is a method proposed to optimize token eviction in large language models.
  • Token eviction is a post-training technique that eases memory and compute constraints on resource-constrained devices.
  • CAOTE integrates attention scores and value vectors to improve the accuracy on downstream tasks.
  • It is the first method to use value vector information in combination with attention-based eviction scores.
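
One way to see why value vectors matter is to measure how much the attention output changes when a single token is evicted. If the output is o = sum_i a_i v_i and the remaining weights are renormalized after removing token j, the change has norm a_j / (1 - a_j) * ||v_j - o||. The sketch below checks this identity numerically and picks the token with the smallest error; it is an illustration under these assumptions, not necessarily the exact score used by CAOTE.

```python
import math


def attention_output(weights, values):
    """Weighted sum of value vectors: o = sum_i a_i * v_i."""
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def eviction_error(weights, values, j):
    """Closed form a_j / (1 - a_j) * ||v_j - o||; assumes a_j < 1."""
    o = attention_output(weights, values)
    return weights[j] / (1.0 - weights[j]) * math.dist(values[j], o)


def eviction_error_brute(weights, values, j):
    """Recompute the output without token j and compare directly."""
    o = attention_output(weights, values)
    kept_w = [w / (1.0 - weights[j]) for i, w in enumerate(weights) if i != j]
    kept_v = [v for i, v in enumerate(values) if i != j]
    return math.dist(o, attention_output(kept_w, kept_v))


def token_to_evict(weights, values):
    return min(range(len(weights)),
               key=lambda j: eviction_error(weights, values, j))


weights = [0.5, 0.3, 0.2]
values = [[1.0, 0.0], [1.0, 0.0], [0.0, 5.0]]
print(token_to_evict(weights, values))
```

In this example the token with the lowest attention weight (index 2) has a value vector far from the output, so evicting index 1 changes the output less. A purely attention-based score would not capture that.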

Contextual Embedding-based Clustering to Identify Topics for Healthcare Service Improvement

  • Understanding patient feedback is crucial for improving healthcare services, yet analyzing unlabeled short-text feedback presents significant challenges due to limited data and domain-specific nuances.
  • This study explores unsupervised methods to extract meaningful topics from patient feedback collected from a healthcare system in Wisconsin, USA.
  • The study employed a keyword-based filtering approach and explored various topic modeling methods, including LDA, GSDMM, and BERTopic.
  • The integration of BERT embeddings with k-means clustering, called kBERT, outperformed other models, achieving high coherence and distinct topic separation in short-text healthcare feedback analysis.

Personalizing Exposure Therapy via Reinforcement Learning

  • Personalized therapy can lead to improved health outcomes.
  • Approaches to automatically adapt therapeutic content exist but may not generalize to all individuals.
  • A new approach using physiological measures to adapt therapeutic content has been proposed.
  • The approach incorporates reinforcement learning and outperforms rules-based methods in a human subject study.

Predicting Stress and Damage in Carbon Fiber-Reinforced Composites Deformation Process using Composite U-Net Surrogate Model

  • A new surrogate model, Composite U-Net, is proposed to predict stress and damage in carbon fiber-reinforced composites (CFRC) during deformation.
  • Traditional FEM simulations are computationally expensive, while existing data-driven surrogate models struggle to capture the entire deformation history.
  • The Composite U-Net model accurately predicts stress and damage fields in CFRC while offering a significant speed-up compared to advanced FEM techniques.
  • The proposed model leverages the U-Net architecture to capture spatial features and integrate macro- and micro-scale phenomena.

A Physics-guided Multimodal Transformer Path to Weather and Climate Sciences

  • The paper discusses the use of AI models in meteorology to improve accuracy compared to traditional methods.
  • Meteorological data is transformed into 2D images or 3D videos and fed into AI models for learning.
  • Physical signals such as temperature, pressure, and wind speed are incorporated into the models to enhance accuracy and interpretability.
  • A new paradigm is proposed where multimodal data is integrated via transformers, with the ability to incorporate weather and climate knowledge.

FedC4: Graph Condensation Meets Client-Client Collaboration for Efficient and Private Federated Graph Learning

  • FedC4 is a novel framework that combines graph condensation with client-client collaboration for efficient and private federated graph learning.
  • Existing methods in federated graph learning (FGL) can be categorized into the server-client (S-C) paradigm and the client-client (C-C) paradigm.
  • FedC4 distills each client's private graph into a compact set of synthetic node embeddings, reducing communication overhead and enhancing privacy.
  • Extensive experiments show that FedC4 outperforms state-of-the-art baselines in both performance and communication efficiency.

DConAD: A Differencing-based Contrastive Representation Learning Framework for Time Series Anomaly Detection

  • DConAD is a differencing-based contrastive representation learning framework for time series anomaly detection.
  • It aims to capture robust and representative dependencies within time series for identifying anomalies.
  • DConAD generates differential data and utilizes transformer-based architecture to enhance the robustness of representation learning.
  • Experimental results demonstrate the superiority and effectiveness of DConAD compared to nine baselines.
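
The "differential data" mentioned above can be illustrated with plain series differencing, where each value is replaced by its change from the previous step. This is a generic sketch; the `difference` helper and its `order` parameter are hypothetical, and how DConAD builds and uses its differenced views is not specified in the summary.

```python
def difference(series, order=1):
    """Return the order-th difference of a numeric sequence."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


print(difference([1, 2, 4, 7]))
print(difference([1, 2, 4, 7], order=2))
```

Differencing removes slowly varying trends, so abrupt changes stand out more clearly in the differenced series than in the raw one.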

Dual-channel Heterophilic Message Passing for Graph Fraud Detection

  • A new framework, Dual-channel Heterophilic Message Passing (DHMP), is proposed for fraud detection.
  • DHMP leverages a heterophily separation module to divide the graph into homophilic and heterophilic subgraphs.
  • It applies shared weights to capture signals at different frequencies independently and incorporates a customized sampling strategy for training.
  • Extensive experiments demonstrate that DHMP outperforms existing methods for fraud detection.
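
The split into homophilic and heterophilic subgraphs can be sketched as an edge partition: an edge is homophilic when both endpoints share a label and heterophilic otherwise. The sketch assumes node labels are known, which is a simplification; in practice a separation module would have to estimate them, and the names used here are hypothetical.

```python
def split_by_homophily(edges, labels):
    """Partition edges by whether their endpoints share a label."""
    homophilic = [(u, v) for u, v in edges if labels[u] == labels[v]]
    heterophilic = [(u, v) for u, v in edges if labels[u] != labels[v]]
    return homophilic, heterophilic


labels = {"a": "fraud", "b": "benign", "c": "benign"}
edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(split_by_homophily(edges, labels))
```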

Decomposition-based multi-scale transformer framework for time series anomaly detection

  • Researchers propose a transformer-based framework called TransDe for multi-scale time series anomaly detection.
  • TransDe combines time series decomposition and transformers to effectively model complex patterns in normal time series data.
  • A multi-scale patch-based transformer architecture is used to capture dependencies of each decomposed component of the time series.
  • TransDe outperforms twelve baselines in terms of F1 score in extensive experiments on five public datasets.
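
The two preprocessing ideas above, decomposition and multi-scale patching, can be sketched without any model. The code below assumes a moving-average trend with edge padding and non-overlapping patches; the window size, the patch sizes, and the function names are illustrative choices, not the configuration used by TransDe.

```python
def moving_average(series, window):
    """Centered moving average with edge padding; window should be odd."""
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def decompose(series, window=3):
    """Split a series into a smooth trend and the remaining residual."""
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual


def patches(series, size):
    """Cut a series into non-overlapping patches, dropping any short tail."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


def multi_scale_patches(series, sizes=(2, 4)):
    """Patch the same component at several scales."""
    return {size: patches(series, size) for size in sizes}


trend, residual = decompose([1.0, 2.0, 3.0, 10.0, 5.0, 6.0, 7.0, 8.0])
print(multi_scale_patches(residual))
```

Each decomposed component would then be patched at every scale and passed to its own transformer branch, which is the part this sketch leaves out.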
