ML News

Arxiv · 2d

Instance-Prototype Affinity Learning for Non-Exemplar Continual Graph Learning

  • Graph Neural Networks (GNNs) suffer from catastrophic forgetting, which hinders knowledge preservation across tasks.
  • Non-exemplar methods such as Prototype Replay (PR) address the memory issues of storing raw exemplars in GNNs.
  • Prototype Contrastive Learning (PCL) shows reduced prototype drift compared to conventional PR (a minimal sketch of the idea follows this list).
  • Instance-Prototype Affinity Learning (IPAL) is proposed for Non-Exemplar Continual Graph Learning (NECGL) and outperforms existing methods.
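
A minimal sketch of the prototype-contrastive idea referenced above, assuming the usual setup (class-mean prototypes as anchors, an InfoNCE-style loss); this is illustrative, not the paper's exact IPAL objective:

```python
# Minimal prototype-contrastive replay sketch: class prototypes act as anchors,
# and each instance embedding is pulled toward its own class prototype and
# pushed away from the others. Not the paper's exact IPAL formulation.
import torch
import torch.nn.functional as F

def class_prototypes(embeddings, labels, num_classes):
    """Mean embedding per class; stored prototypes replace raw exemplars."""
    protos = torch.zeros(num_classes, embeddings.size(1))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = embeddings[mask].mean(dim=0)
    return protos

def prototype_contrastive_loss(embeddings, labels, prototypes, temperature=0.1):
    """Cross-entropy over instance-prototype similarities (InfoNCE-style)."""
    emb = F.normalize(embeddings, dim=1)
    protos = F.normalize(prototypes, dim=1)
    logits = emb @ protos.t() / temperature   # (batch, num_classes)
    return F.cross_entropy(logits, labels)

# Toy usage: 8 instances, 3 classes, 16-dim embeddings.
emb = torch.randn(8, 16, requires_grad=True)
lab = torch.randint(0, 3, (8,))
loss = prototype_contrastive_loss(emb, lab, class_prototypes(emb.detach(), lab, 3))
loss.backward()
```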


Arxiv · 2d

Financial Fraud Detection Using Explainable AI and Stacking Ensemble Methods

  • Researchers propose a fraud detection framework built on a stacking ensemble of XGBoost, LightGBM, and CatBoost models (a hedged sketch of such a stack follows this list).
  • Explainable artificial intelligence (XAI) techniques, including SHAP, LIME, PDP, and PFI, are employed to enhance model transparency and interpretability.
  • The model reaches 99% accuracy and an AUC-ROC of 0.99 on the IEEE-CIS Fraud Detection dataset.
  • Combining high prediction accuracy with transparent interpretability could make financial fraud detection more ethical and trustworthy.
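
A hedged sketch of such a stack using scikit-learn's StackingClassifier, with a synthetic stand-in for the IEEE-CIS data; all hyperparameters are placeholders:

```python
# XGBoost, LightGBM and CatBoost as base learners, logistic regression as the
# meta-learner, plus a SHAP explanation of one fitted base model.
import shap
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# Imbalanced synthetic data standing in for the IEEE-CIS fraud dataset.
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.97])

stack = StackingClassifier(
    estimators=[
        ("xgb", XGBClassifier(n_estimators=200)),
        ("lgbm", LGBMClassifier(n_estimators=200)),
        ("cat", CatBoostClassifier(iterations=200, verbose=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X, y)

# SHAP values for the fitted XGBoost base learner (TreeExplainer supports it natively).
explainer = shap.TreeExplainer(stack.named_estimators_["xgb"])
shap_values = explainer.shap_values(X[:100])
```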


Arxiv · 2d

JointDistill: Adaptive Multi-Task Distillation for Joint Depth Estimation and Scene Segmentation

  • Depth estimation and scene segmentation are crucial tasks in intelligent transportation systems, and modeling them jointly reduces storage and training requirements.
  • The work introduces an adaptive multi-task distillation method that dynamically adjusts knowledge transfer from multiple teachers according to the student's current learning ability (an illustrative weighting scheme is sketched after this list).
  • To prevent forgetting when distilling from multiple teachers, a knowledge trajectory preserves essential information the model has already learned; a trajectory-based distillation loss guides the student.
  • Evaluation on benchmark datasets such as Cityscapes and NYU-v2 shows the method outperforms existing solutions; the code is available in the supplementary materials.
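
An illustrative weighting scheme for the idea above (weights adapt to how well the student currently matches each teacher); this is an assumed form, not the paper's exact losses:

```python
# Distill a student from multiple teachers, weighting each teacher by the
# student's current per-teacher KL divergence: harder-to-match teachers get
# more weight, approximating "adjust transfer to the student's ability".
import torch
import torch.nn.functional as F

def adaptive_multi_teacher_loss(student_logits, teacher_logits_list, T=2.0):
    """KL to each teacher, weighted by softmax over the per-teacher KLs."""
    kls = []
    for t_logits in teacher_logits_list:
        kl = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(t_logits / T, dim=-1),
            reduction="batchmean",
        ) * T * T
        kls.append(kl)
    kls = torch.stack(kls)
    weights = torch.softmax(kls.detach(), dim=0)  # adapt to student's ability
    return (weights * kls).sum()

# Toy usage: one depth teacher and one segmentation teacher over shared logits.
student = torch.randn(4, 10, requires_grad=True)
teachers = [torch.randn(4, 10), torch.randn(4, 10)]
adaptive_multi_teacher_loss(student, teachers).backward()
```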


Arxiv · 2d

ChronoSteer: Bridging Large Language Model and Time Series Foundation Model via Synthetic Data

  • Conventional forecasting methods are restricted to unimodal time series data and therefore cannot exploit textual information.
  • Integrating large language models (LLMs) with time series foundation models (TSFMs) has become a crucial research challenge for improved future inference.
  • ChronoSteer, the proposed multimodal model, uses an LLM to transform textual events into revision instructions that steer a TSFM's temporal modeling.
  • With a two-stage training strategy on synthetic data, ChronoSteer improves prediction accuracy by 25.7% over its unimodal backbone and by 22.5% over the prior state-of-the-art multimodal method.


Arxiv · 2d

Learning Virtual Machine Scheduling in Cloud Computing through Language Agents

  • Virtual Machine (VM) scheduling in cloud services is an Online Dynamic Multidimensional Bin Packing (ODMBP) problem, made challenging by its scale and fluctuating demand (a simple first-fit baseline is sketched after this list).
  • A new hierarchical language-agent framework called MiCo is proposed to address the limitations of traditional optimization methods and domain-expert-designed heuristics.
  • MiCo uses an LLM-driven heuristic design paradigm, formulating ODMBP as a Semi-Markov Decision Process with Options (SMDP-Option) for dynamic scheduling.
  • Experiments show MiCo achieves a 96.9% competitive ratio in large-scale scenarios with over 10,000 virtual machines, demonstrating its effectiveness in complex cloud environments.
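
For grounding, here is a minimal first-fit baseline for the ODMBP setting (not MiCo itself): each arriving VM request is placed on the first host whose residual CPU and memory still cover the demand:

```python
# Greedy first-fit heuristic for online multidimensional bin packing of VMs.
from dataclasses import dataclass

@dataclass
class Host:
    cpu: float
    mem: float

def first_fit(hosts: list[Host], vm_cpu: float, vm_mem: float) -> int | None:
    """Return the index of the chosen host, or None if the request must wait."""
    for i, h in enumerate(hosts):
        if h.cpu >= vm_cpu and h.mem >= vm_mem:
            h.cpu -= vm_cpu   # reserve the resources on the chosen host
            h.mem -= vm_mem
            return i
    return None

hosts = [Host(cpu=32, mem=128), Host(cpu=64, mem=256)]
for cpu, mem in [(8, 32), (40, 100), (16, 64)]:
    print(first_fit(hosts, cpu, mem))   # 0, 1, 0
```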


Arxiv · 2d

All You Need Is Synthetic Task Augmentation

  • Injecting rule-based models such as Random Forests into differentiable neural network frameworks remains a challenge in machine learning.
  • A novel strategy jointly trains a Graph Transformer neural network on both experimental and synthetic molecular property targets.
  • The synthetic tasks, derived from XGBoost models trained on Osmordred molecular descriptors, significantly improve molecular property prediction (the general recipe is sketched after this list).
  • The results indicate that synthetic task augmentation enhances neural model performance without requiring feature injection or pretraining.
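
A hedged sketch of the general recipe, with plain features and an MLP standing in for the Osmordred descriptors and the Graph Transformer:

```python
# Auxiliary "synthetic" targets come from a tree model's predictions, and a
# neural network is trained jointly on the real and synthetic targets.
import numpy as np
import torch
import torch.nn as nn
from xgboost import XGBRegressor

X = np.random.rand(512, 64).astype("float32")   # stand-in molecular descriptors
y_real = np.random.rand(512).astype("float32")  # experimental property

# Synthetic task: predictions of an XGBoost model trained on the same features.
y_syn = XGBRegressor(n_estimators=100).fit(X, y_real).predict(X).astype("float32")

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))  # 2 heads
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
targets = torch.from_numpy(np.stack([y_real, y_syn], axis=1))
inputs = torch.from_numpy(X)

for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), targets)  # joint real+synthetic loss
    loss.backward()
    opt.step()
```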


Arxiv · 2d

Enhancing the Performance of Global Model by Improving the Adaptability of Local Models in Federated Learning

  • Federated learning trains a global model from local models, but heterogeneous data distributions and data privacy pose challenges (the underlying FedAvg-style aggregation is sketched after this list).
  • This study focuses on improving the adaptability of local models as a way to strengthen the global model.
  • The method formalizes the adaptability of local models and optimizes it without direct knowledge of other clients' data distributions.
  • Experimental results show the approach boosts local-model adaptability and yields better overall performance than baseline methods.
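
The summary does not spell out the adaptability objective itself, so as grounding here is the standard FedAvg aggregation that such methods build on: the global model is the data-size-weighted average of the local models:

```python
# FedAvg: weight each client's parameters by its share of the total samples.
import torch

def fedavg(local_states: list[dict], num_samples: list[int]) -> dict:
    total = sum(num_samples)
    global_state = {}
    for key in local_states[0]:
        global_state[key] = sum(
            state[key] * (n / total) for state, n in zip(local_states, num_samples)
        )
    return global_state

# Toy usage: two clients with a single 2x2 weight tensor each.
a = {"w": torch.ones(2, 2)}
b = {"w": torch.zeros(2, 2)}
print(fedavg([a, b], num_samples=[300, 100])["w"])  # 0.75 everywhere
```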


Arxiv · 2d

Robust Federated Learning on Edge Devices with Domain Heterogeneity

  • Federated Learning (FL) enables collaborative training across distributed edge devices while preserving data privacy, making it popular for privacy-sensitive applications.
  • FL suffers from statistical heterogeneity, particularly domain heterogeneity, which hinders convergence of the global model.
  • FedAPC (Federated Augmented Prototype Contrastive Learning) improves the global model's generalization under domain heterogeneity through prototype augmentation (a sketch of the augmentation idea follows this list).
  • Experimental results on the Office-10 and Digits datasets show FedAPC outperforms state-of-the-art baselines in feature diversity and model robustness.
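
A sketch of the prototype-augmentation idea in an assumed form (Gaussian jitter around class prototypes, scaled by feature variance); FedAPC's actual augmentation may differ:

```python
# Generate noisy copies of each class prototype so local training sees a
# richer, more domain-diverse set of prototype anchors.
import torch

def augment_prototypes(protos: torch.Tensor, feat_std: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Return k noisy copies per prototype; noise scale follows feature std."""
    noise = torch.randn(k, *protos.shape) * feat_std
    return (protos.unsqueeze(0) + noise).reshape(-1, protos.size(1))

protos = torch.randn(10, 64)           # one prototype per class
feat_std = torch.full((64,), 0.1)      # per-dimension feature std (assumed)
aug = augment_prototypes(protos, feat_std)
print(aug.shape)                       # torch.Size([40, 64])
```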


Arxiv · 2d

QuXAI: Explainers for Hybrid Quantum Machine Learning Models

  • Hybrid quantum-classical machine learning (HQML) models bring new possibilities in computational intelligence, but their complexity often results in black-box behavior.
  • Explainability approaches for HQML architectures that combine quantum feature encoding with classical learning are scarce, motivating the QuXAI framework.
  • QuXAI, built on the Q-MEDLEY explainer, surfaces feature importance in hybrid systems, highlighting the classical components and reducing noise (a plain permutation-importance analogue is sketched after this list).
  • The work aims to make HQML models more interpretable and reliable, supporting safer and more responsible use of quantum-enhanced AI.
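
Q-MEDLEY itself is specific to the paper; as a plain analogue, here is permutation feature importance applied through a stand-in nonlinear feature map playing the role of the quantum encoding:

```python
# Importance of *input* features measured through the encoding: permute one raw
# feature, re-encode, and record the drop in accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def feature_map(X):
    """Stand-in for a quantum feature encoding: simple angle-style features."""
    return np.hstack([np.sin(X), np.cos(X)])

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(feature_map(X), y)
base = clf.score(feature_map(X), y)

rng = np.random.default_rng(0)
for j in range(X.shape[1]):
    Xp = X.copy()
    rng.shuffle(Xp[:, j])              # permute one raw input feature
    drop = base - clf.score(feature_map(Xp), y)
    print(f"feature {j}: importance {drop:.3f}")
```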


Arxiv · 2d

Does Scaling Law Apply in Time Series Forecasting?

  • Model sizes in time series forecasting have expanded rapidly, from early Transformers to recent architectures like TimesNet, raising the question of whether scaling actually pays off.
  • Alinear is introduced as an ultra-lightweight forecaster that competes with far larger models, using a horizon-aware adaptive decomposition mechanism and a progressive frequency attenuation strategy (a decomposition-based analogue is sketched after this list).
  • Extensive experiments on seven benchmark datasets show Alinear outperforms large-scale models across forecasting horizons while using less than 1% of their parameters.
  • The study challenges the belief that larger models are inherently superior in time series forecasting and advocates a shift toward more efficient modeling.
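
Alinear's exact architecture is not reproduced here; the sketch below is a minimal decomposition-plus-linear forecaster in the same lightweight spirit (a DLinear-style trend/seasonal split):

```python
# Moving-average trend plus residual seasonal part, each mapped to the forecast
# horizon by a single Linear layer. Parameter count stays tiny.
import torch
import torch.nn as nn

class DecompLinear(nn.Module):
    def __init__(self, lookback: int, horizon: int, kernel: int = 25):
        super().__init__()
        self.avg = nn.AvgPool1d(kernel, stride=1, padding=kernel // 2, count_include_pad=False)
        self.trend = nn.Linear(lookback, horizon)
        self.season = nn.Linear(lookback, horizon)

    def forward(self, x):                # x: (batch, lookback)
        trend = self.avg(x.unsqueeze(1)).squeeze(1)[:, : x.size(1)]
        return self.trend(trend) + self.season(x - trend)

model = DecompLinear(lookback=96, horizon=24)
print(model(torch.randn(8, 96)).shape)   # torch.Size([8, 24])
```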


Arxiv · 2d

Defect Detection in Photolithographic Patterns Using Deep Learning Models Trained on Synthetic Data

  • Defect detection in photolithographic patterns is crucial for semiconductor manufacturing during EUV patterning.
  • Because the defects are small relative to the pattern, inspection suffers from false and missed detections.
  • The study trains deep learning models on synthetic data: SEM images with known defects are artificially generated and annotated.
  • The YOLOv8 object detector achieves the best mean average precision, 96%, on the smaller defects, outperforming EfficientNet and SSD (a minimal training sketch follows this list).
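
A minimal training sketch with the ultralytics package; the dataset YAML is a placeholder, since the paper's synthetic SEM images are not public:

```python
# Fine-tune a pretrained YOLOv8 detector on a custom (here hypothetical)
# defect dataset and report validation mAP.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                                    # pretrained nano checkpoint
model.train(data="sem_defects.yaml", epochs=100, imgsz=1024)  # large imgsz helps small defects
metrics = model.val()                                         # evaluates on the val split
print(metrics.box.map50)                                      # mAP@0.5, cf. the 96% figure above
```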


Arxiv · 2d

A multi-head deep fusion model for recognition of cattle foraging events using sound and movement signals

  • Monitoring cattle feeding behavior is crucial for efficient herd management and optimal resource utilization in grazing cattle.
  • Automatically recognizing feeding events through jaw-movement identification can improve diet formulation and enable early detection of health issues.
  • The work introduces a deep neural network that combines acoustic and inertial signals through feature-level fusion (an assumed-shape sketch follows this list).
  • The proposed model outperforms traditional and deep learning approaches with an F1-score of 0.802, a 14% improvement over previous methods.
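
An assumed shape of the feature-level fusion: one encoder per modality, concatenated embeddings, and a classification head; encoder sizes and the class count are illustrative, not the paper's:

```python
# Two-branch fusion network: acoustic and inertial features are encoded
# separately, concatenated, and classified into jaw-movement event types.
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, audio_dim=40, imu_dim=6, n_classes=4):
        super().__init__()
        self.audio = nn.Sequential(nn.Linear(audio_dim, 64), nn.ReLU())
        self.imu = nn.Sequential(nn.Linear(imu_dim, 32), nn.ReLU())
        self.head = nn.Linear(64 + 32, n_classes)

    def forward(self, audio_feat, imu_feat):
        fused = torch.cat([self.audio(audio_feat), self.imu(imu_feat)], dim=-1)
        return self.head(fused)

net = FusionNet()
logits = net(torch.randn(8, 40), torch.randn(8, 6))
print(logits.shape)   # torch.Size([8, 4])
```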


Arxiv · 2d

Informed Forecasting: Leveraging Auxiliary Knowledge to Boost LLM Performance on Time Series Forecasting

  • A novel cross-domain knowledge transfer framework is proposed to enhance the performance of Large Language Models (LLMs) on time series forecasting.
  • The approach systematically infuses LLMs with structured temporal information to improve forecasting accuracy in fields such as energy systems, finance, and healthcare.
  • On a real-world time series dataset, knowledge-informed forecasting significantly outperforms a naive baseline with no auxiliary information.
  • The findings demonstrate the potential of knowledge transfer strategies to improve the predictive accuracy and generalization of LLMs in domain-specific forecasting.


Arxiv · 2d

ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

  • Transformer models struggle to integrate positional information while preserving the flexibility of multi-head attention.
  • ComplexFormer introduces Complex Multi-Head Attention (CMHA), which models semantic and positional differences jointly in the complex plane, enhancing representational capacity (a loose sketch follows this list).
  • Its key components are a per-head Euler transformation and an adaptive differential rotation mechanism, giving each head its own complex-subspace operation.
  • Extensive experiments show ComplexFormer outperforms strong baselines such as RoPE-Transformers across tasks, with lower generation perplexity and improved long-context coherence.
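
A loose sketch of attention scores in the complex plane under one reading of CMHA (a per-head positional rotation, with the score taken as the real part of q times conj(k)); the paper's exact formulation may differ:

```python
# Queries/keys become complex numbers, each head applies its own learned
# rotation rate per position, and the score is Re(q * conj(k)).
import torch

def complex_head_scores(q, k, theta):
    """q, k: (batch, seq, dim); theta: this head's rotation rate (scalar)."""
    pos = torch.arange(q.size(1), dtype=torch.float32)
    rot = torch.exp(1j * theta * pos)[None, :, None]      # e^{i*theta*n}
    qc = torch.complex(q, torch.zeros_like(q)) * rot      # rotate by position
    kc = torch.complex(k, torch.zeros_like(k)) * rot
    return torch.einsum("bnd,bmd->bnm", qc, kc.conj()).real / q.size(-1) ** 0.5

scores = complex_head_scores(torch.randn(2, 5, 16), torch.randn(2, 5, 16), theta=0.1)
print(scores.shape)   # torch.Size([2, 5, 5])
```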


Arxiv · 2d

SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices

  • Efficient LLM inference on resource-constrained devices is limited by scarce GPU memory, which depresses both compute and memory utilization.
  • Existing systems offload model weights to CPU memory, leaving GPU cores underutilized and making extra GPU memory capacity barely improve performance.
  • SpecOffload is a high-throughput inference engine that folds speculative decoding into offloading, unlocking latent GPU resources at near-zero additional cost (the core decoding loop is sketched after this list).
  • By orchestrating interleaved execution of the target and draft models in speculative decoding, SpecOffload improves GPU core utilization by 4.49x and inference throughput by 2.54x over the best baseline.
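
A simplified greedy speculative-decoding step, the core mechanism SpecOffload builds on (its offloading and scheduling are not shown):

```python
# A small draft model proposes k tokens autoregressively; the large target
# model verifies them in one batched forward pass.
import torch

def speculative_step(draft, target, tokens, k=4):
    """One decode step: draft proposes k tokens, target verifies them at once."""
    proposal = tokens
    for _ in range(k):  # cheap autoregressive drafting
        nxt = draft(proposal)[:, -1].argmax(-1, keepdim=True)
        proposal = torch.cat([proposal, nxt], dim=1)
    # One expensive but batched target pass scores every drafted position.
    verified = target(proposal)[:, -k - 1 : -1].argmax(-1)
    drafted = proposal[:, -k:]
    # Accept the longest agreeing prefix, plus one corrected target token.
    agree = (drafted == verified).long().cumprod(dim=1).sum().item()
    return proposal if agree == k else torch.cat([tokens, verified[:, : agree + 1]], dim=1)

# Toy usage: stand-in "models" mapping (batch, seq) ids to (batch, seq, vocab) logits.
fake = lambda t: torch.randn(t.size(0), t.size(1), 100)
print(speculative_step(fake, fake, torch.randint(0, 100, (1, 8))).shape)
```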

