techminis

A naukri.com initiative

ML News

Arxiv

GIMS: Image Matching System Based on Adaptive Graph Construction and Graph Neural Network

  • Feature-based image matching has extensive applications in computer vision.
  • The paper introduces an adaptive graph construction method for image matching.
  • The method dynamically adjusts the criteria for adding new vertices based on the characteristics of existing vertices, which yields more precise and robust graph structures.
  • The vertex-processing capabilities of Graph Neural Networks (GNNs) are combined with Transformers to improve the model's representation of spatial and feature information.
  • The authors report significant improvements in overall matching performance.

Arxiv

Leveraging Convolutional Neural Network-Transformer Synergy for Predictive Modeling in Risk-Based Applications

  • This paper proposes a deep learning model that combines a convolutional neural network (CNN) with a Transformer to predict credit user defaults.
  • The model combines the advantages of CNN in local feature extraction and Transformer in global dependency modeling.
  • Experimental results show that the CNN+Transformer model outperforms traditional machine learning models in accuracy, AUC, and KS value.
  • The study provides a new idea for credit default prediction and supports risk assessment and intelligent decision-making in the financial field.
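
The AUC and KS metrics named above can be sketched in a few lines. This is a minimal illustration, not the paper's evaluation code: the function names `auc` and `ks_statistic` and the toy labels and scores are assumptions made for the example, and the KS loop assumes scores without ties.

```python
def auc(y_true, y_score):
    """Fraction of (defaulter, non-defaulter) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true, y_score):
    """Largest gap between the cumulative true-positive and false-positive rates."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    tp = fp = 0
    best = 0.0
    for _, label in ranked:  # lower the threshold one sample at a time
        if label == 1:
            tp += 1
        else:
            fp += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


# Hypothetical toy data: 1 = default, 0 = no default.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(round(auc(labels, scores), 3))           # 0.889
print(round(ks_statistic(labels, scores), 3))  # 0.667
```

A higher KS value means the score distributions of defaulters and non-defaulters are further apart, which is why credit-risk work often reports it alongside AUC.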

Arxiv

OMG-HD: A High-Resolution AI Weather Model for End-to-End Forecasts from Observations

  • OMG-HD is a high-resolution AI weather model designed to make forecasts directly from observational data.
  • The model outperforms traditional numerical weather prediction (NWP) models at lead times of up to 12 hours over the contiguous United States (CONUS).
  • It improves forecasts of temperature, wind speed, humidity, and surface pressure compared to existing models.
  • Forecasting from observational data without relying on NWP-derived fields shows promise for operational use of AI weather prediction (AIWP) models.

Arxiv

Towards Modality Generalization: A Benchmark and Prospective Analysis

  • Multi-modal learning has achieved remarkable success in integrating information from various modalities and outperforming uni-modal approaches in tasks like recognition and retrieval.
  • However, current methods struggle to address the challenge of generalizing to novel modalities that are unseen during training.
  • This paper introduces Modality Generalization (MG) to enable models to generalize to unseen modalities.
  • The authors propose a benchmark and identify key directions for future research to advance robust and adaptable multi-modal models.

Arxiv

GDM4MMIMO: Generative Diffusion Models for Massive MIMO Communications

  • Massive multiple-input multiple-output (MIMO) is a key technology for 5G and 6G wireless communication systems.
  • Generative diffusion models (GDM) have emerged as powerful generative AI models in various fields.
  • This paper explores the potential applications of GDM in massive MIMO communications.
  • A case study on near-field channel estimation demonstrates the promising potential of GDM for efficient channel information acquisition.

Arxiv

Dissipation alters modes of information encoding in small quantum reservoirs near criticality

  • Quantum reservoir computing (QRC) is a promising paradigm for utilizing near-term quantum devices in temporal machine learning tasks.
  • A study investigates a minimal model of a driven-dissipative quantum reservoir consisting of two coupled Kerr-nonlinear oscillators.
  • Using Partial Information Decomposition (PID), the researchers analyze how different dynamical regimes encode input drive signals.
  • The results reveal a transition from redundant to synergistic encoding near a critical point, with synergy enhancing short-term responsiveness and dissipation supporting long-term memory retention.

Arxiv

DeepCRCEval: Revisiting the Evaluation of Code Review Comment Generation

  • The paper revisits how code review comment generation is evaluated.
  • Traditional evaluation methods based on text similarity face challenges.
  • The DeepCRCEval framework integrates human evaluators and Large Language Models (LLMs).
  • The LLM-Reviewer baseline shows potential for efficient comment generation.

Arxiv

Learning to Play Against Unknown Opponents

  • The paper considers the problem of a learning agent playing a general-sum game against an unknown strategic opponent.
  • The agent knows their own payoff function but is uncertain about the opponent's payoff distribution.
  • A polynomial-time algorithm is presented to maximize the agent's utility within a small epsilon of optimal utility.
  • When the algorithm is constrained to be a no-regret algorithm, an optimal learning algorithm can be constructed in polynomial time.
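
For readers unfamiliar with the term, a no-regret algorithm is one whose average loss approaches that of the best fixed action in hindsight. The sketch below shows Hedge (multiplicative weights), a textbook no-regret method. It is not the algorithm from the paper; the function names and the toy loss sequence are assumptions made for illustration.

```python
import math


def hedge(loss_rounds, eta):
    """Run Hedge on per-action losses in [0, 1]; return the distribution played each round."""
    n_actions = len(loss_rounds[0])
    weights = [1.0] * n_actions
    history = []
    for losses in loss_rounds:
        total = sum(weights)
        history.append([w / total for w in weights])  # play before seeing the losses
        # Shrink each action's weight exponentially in the loss it just incurred.
        weights = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    return history


def regret(loss_rounds, history):
    """Expected cumulative loss minus the loss of the best fixed action in hindsight."""
    expected = sum(
        sum(p * loss for p, loss in zip(dist, losses))
        for dist, losses in zip(history, loss_rounds)
    )
    best_fixed = min(sum(column) for column in zip(*loss_rounds))
    return expected - best_fixed


# Hypothetical setting: two actions, and action 0 is always the better one.
rounds = [[0.0, 1.0] for _ in range(50)]
eta = math.sqrt(8 * math.log(2) / len(rounds))  # standard step size for N=2 actions, T=50 rounds
played = hedge(rounds, eta)
print(played[0], regret(rounds, played))
```

With this step size, the regret of Hedge grows only on the order of the square root of the number of rounds, so the per-round regret vanishes as play continues.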

Arxiv

Quo Vadis, Anomaly Detection? LLMs and VLMs in the Spotlight

  • This paper discusses how large language models (LLMs) and vision-language models (VLMs) were used for video anomaly detection (VAD) in 2024.
  • The integration of LLMs and VLMs in VAD helps enhance interpretability, capture temporal relationships, enable few-shot and zero-shot detection, and address open-world and class-agnostic anomalies.
  • LLMs and VLMs offer semantic insights, textual explanations, and motion features for spatiotemporal coherence, making visual anomalies more understandable.
  • The paper explores the potential of LLMs and VLMs in redefining the landscape of VAD and proposes future directions for leveraging the synergy between visual and textual modalities.

Arxiv

FameBias: Embedding Manipulation Bias Attack in Text-to-Image Models

  • Researchers have introduced FameBias, a Text-to-Image (T2I) biasing attack.
  • This attack manipulates input prompts to generate images of specific public figures.
  • FameBias achieves a high attack success rate while maintaining the semantic context of the original prompts.
  • The study highlights the potential for T2I models to be used for propaganda and malicious purposes.

Arxiv

Mitigating Label Noise using Prompt-Based Hyperbolic Meta-Learning in Open-Set Domain Generalization

  • Open-Set Domain Generalization (OSDG) is a challenging task requiring models to accurately predict familiar categories while minimizing confidence for unknown categories.
  • Label noise can mislead model optimization, thereby exacerbating the challenges of open-set recognition in novel domains.
  • HyProMeta, a novel framework, integrates hyperbolic category prototypes for label noise-aware meta-learning alongside a learnable new-category agnostic prompt designed to enhance generalization to unseen classes.

Arxiv

Addressing Spatial-Temporal Data Heterogeneity in Federated Continual Learning via Tail Anchor

  • Federated continual learning (FCL) allows each client to continually update its knowledge from task streams.
  • FCL needs to address spatial data heterogeneity between clients and temporal data heterogeneity between tasks.
  • The proposed Federated Tail Anchor (FedTA) overcomes parameter forgetting and output forgetting by using a trainable Tail Anchor and frozen output features.
  • FedTA also includes Input Enhancement, Selective Input Knowledge Fusion, and Best Global Prototype Selection for improved performance in downstream tasks.

Arxiv

ChaI-TeA: A Benchmark for Evaluating Autocompletion of Interactions with LLM-based Chatbots

  • The rise of LLMs has led to more interactions with LLM-based chatbots.
  • Phrasing messages for these chatbots is time-consuming, so an autocomplete solution is needed.
  • ChaI-TeA is an evaluation framework for autocompleting LLM-based chatbot interactions.
  • While off-the-shelf models perform decently, there is still room for improvement in suggestion ranking.

Arxiv

Weak Scaling Capability in Token Space: An Observation from Large Vision Language Model

  • The study investigates the scaling capability of vision-language models with respect to the number of vision tokens.
  • The model scales only weakly with the number of vision tokens, with performance approximately following a power-law relationship.
  • The scaling behavior remains unaffected by the inclusion or exclusion of the user's question in the input.
  • Fusing the user's question with the vision token can enhance model performance when the question is relevant.
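
A power-law relationship of the form y = a * x^b becomes a straight line in log-log space, so it can be checked with an ordinary least-squares fit. The sketch below is an illustration only: `fit_power_law`, the token counts, and the synthetic scores are assumptions, not data or code from the study.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b on log-transformed data; returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the log-log line is the exponent b.
    b = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y)) / sum(
        (u - mean_x) ** 2 for u in log_x
    )
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Hypothetical measurements: a score that grows slowly with the number of vision tokens.
token_counts = [16, 64, 256, 1024]
scores = [2.0 * n ** 0.1 for n in token_counts]
a, b = fit_power_law(token_counts, scores)
print(a, b)
```

A small fitted exponent b is one way to read "weak scaling": multiplying the number of vision tokens by a large factor changes the metric only slightly.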

Arxiv

RDPM: Solve Diffusion Probabilistic Models via Recurrent Token Prediction

  • Diffusion Probabilistic Models (DPMs) are widely used for high-fidelity image synthesis.
  • A new generative framework called Recurrent Diffusion Probabilistic Model (RDPM) is introduced.
  • RDPM enhances the diffusion process through recurrent token prediction.
  • RDPM demonstrates superior performance and can be used for multimodal generation.
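
As background, the forward (noising) process of a standard diffusion probabilistic model can be sampled in closed form: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * noise, where abar_t is the running product of (1 - beta_s). The sketch below shows only this standard step, not RDPM's recurrent token prediction; the linear schedule values and the helper names are assumptions chosen for illustration.

```python
import math
import random


def linear_betas(num_steps, start=1e-4, end=0.02):
    """Linearly spaced noise variances, a common default schedule."""
    step = (end - start) / (num_steps - 1)
    return [start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Running product of (1 - beta), i.e. how much of the original signal survives."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Draw x_t directly from x_0 by mixing the signal with Gaussian noise."""
    scale = math.sqrt(abar[t])
    sigma = math.sqrt(1.0 - abar[t])
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


abar = alpha_bars(linear_betas(1000))
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]  # hypothetical "pixel" values
print(q_sample(x0, 10, abar, rng))   # still close to x0
print(q_sample(x0, 999, abar, rng))  # almost pure noise
```

A generative model of this family is trained to reverse the process, recovering a clean sample step by step from noise.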
