ML News

Arxiv

Unsupervised Waste Classification By Dual-Encoder Contrastive Learning and Multi-Clustering Voting (DECMCV)

  • The authors construct DECMCV, a bias-free and cost-effective unsupervised waste classification method.
  • Utilizes a pre-trained ConvNeXt model for image encoding and VisionTransformer for generating positive samples.
  • DECMCV achieves high accuracies on TrashNet and Huawei Cloud datasets, improving classification accuracy by 29.85% compared to supervised models.

AxBERT: An Interpretable Chinese Spelling Correction Method Driven by Associative Knowledge Network

  • AxBERT is an interpretable deep learning model proposed for Chinese spelling correction.
  • It aligns with an associative knowledge network (AKN) constructed based on co-occurrence relations among Chinese characters.
  • A translator matrix between BERT and AKN is introduced for alignment and regulation of the attention component in BERT.
  • Experimental results on SIGHAN datasets show that AxBERT achieves extraordinary performance and interpretability.
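
The co-occurrence relations mentioned above can be pictured with a small counting sketch. It assumes that "co-occurrence" means two characters appearing in the same sentence; the function names and the toy corpus are hypothetical, and the paper's actual AKN construction and its translator matrix are not reproduced here.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(sentences):
    """Count how often each unordered character pair appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        # Use each distinct character once per sentence; sort for a stable pair key.
        chars = sorted(set(sentence))
        for a, b in combinations(chars, 2):
            counts[(a, b)] += 1
    return counts

def association(counts, a, b):
    """Look up the co-occurrence count for a pair, in either order."""
    return counts[tuple(sorted((a, b)))]

# Toy corpus for illustration only (not from the paper).
corpus = ["我爱北京", "我爱你"]
counts = build_cooccurrence(corpus)
```

A real network would likely normalize these raw counts into edge weights; that step is left out because the summary does not describe it.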

Nonlinear energy-preserving model reduction with lifting transformations that quadratize the energy

  • Existing model reduction techniques for high-dimensional models of conservative partial differential equations (PDEs) encounter computational bottlenecks when dealing with systems featuring non-polynomial nonlinearities.
  • This work presents a nonlinear model reduction method that employs lifting variable transformations to derive structure-preserving quadratic reduced-order models for conservative PDEs with general nonlinearities.
  • The proposed strategy combined with proper orthogonal decomposition model reduction yields quadratic reduced-order models that conserve the quadratized lifted energy exactly in high dimensions.
  • The numerical results show that the proposed lifting approach is competitive with the state-of-the-art structure-preserving hyper-reduction method in terms of both accuracy and computational efficiency in the online stage while providing significant computational gains in the offline stage.

On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process

  • Research attention is focused on the double descent phenomenon, which deviates from traditional bias-variance trade-off theory.
  • This study explores the relationship between shape/texture bias in the learning process of CNNs and epoch-wise double descent.
  • Quantitative evaluations reveal a correlation between test errors and bias values during the learning process.
  • The experimental results contribute to understanding the mechanisms behind the double descent phenomenon and the learning process of CNNs in image recognition.

PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models

  • Researchers have introduced PromptCoT, a method for automatically generating high-quality Olympiad-level math problems.
  • PromptCoT synthesizes complex problems based on mathematical concepts and the rationale behind problem construction.
  • The method outperforms existing problem generation methods on standard benchmarks and exhibits superior data scalability.
  • The implementation of PromptCoT is available at https://github.com/zhaoxlpku/PromptCoT.

GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning

  • Recent advances in video generation models have highlighted the need for effective evaluation.
  • Automated evaluation metrics lack high-level semantic understanding and reasoning capabilities for video.
  • To address this, GRADEO introduces a video evaluation model that uses multi-step reasoning.
  • Experiments show that GRADEO aligns better with human evaluations, revealing the limitations of current video generation models.

DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability

  • Large Language Models (LLMs) generate content that frequently deviates from factual correctness or lacks logical reasoning.
  • A new decoding strategy called DeLTa aims to improve factual accuracy and inferential reasoning in LLMs without changing their architecture or pre-trained parameters.
  • DeLTa adjusts next-token probabilities by analyzing the trajectory of logits from lower to higher layers in Transformers and applying linear regression.
  • Experiments show that DeLTa improves over the baseline by up to 4.9% on TruthfulQA, and by up to 8.1% on StrategyQA and 7.3% on GSM8K, two benchmarks that test reasoning abilities.
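
As a rough illustration of the logit-trajectory idea in the bullets above, the sketch below fits an ordinary least-squares line to each token's logits across layer indices and extrapolates to a layer index beyond the last one. The function names, the input layout, and the choice of extrapolation target are assumptions made for this example, not the paper's implementation.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

def extrapolate_logits(layer_logits, target_layer):
    """layer_logits[l][t] is the logit of token t read out at layer l (assumed layout)."""
    layers = list(range(len(layer_logits)))
    vocab_size = len(layer_logits[0])
    predicted = []
    for t in range(vocab_size):
        trajectory = [layer_logits[l][t] for l in layers]
        slope, intercept = fit_line(layers, trajectory)
        predicted.append(slope * target_layer + intercept)
    return predicted

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input: 3 layers, 2 tokens. Token 0 rises steadily, token 1 stays flat.
layer_logits = [[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]]
predicted = extrapolate_logits(layer_logits, target_layer=3)
probs = softmax(predicted)
```

In this toy case the rising token ends up with the higher extrapolated logit, so the adjusted next-token distribution favors it even though both tokens were tied at the middle layer.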

CQ CNN: A Hybrid Classical Quantum Convolutional Neural Network for Alzheimer's Disease Detection Using Diffusion Generated and U Net Segmented 3D MRI

  • Researchers propose a hybrid classical quantum convolutional neural network (CQ CNN) for Alzheimer's Disease (AD) detection using 3D MRI.
  • The CQ CNN incorporates quantum machine learning techniques, specifically parameterized quantum circuits (PQCs), with classical machine learning architectures.
  • The proposed CQ CNN achieves higher accuracy in fewer epochs than classical models, reaching 97.50% accuracy with only 13K parameters.
  • The use of diffusion-generated data, along with real samples, represents a notable first in the field of quantum machine learning for AD detection.

Controllable Motion Generation via Diffusion Modal Coupling

  • Diffusion models have gained attention in robotics for generating multi-modal distributions of system states and behaviors.
  • Ensuring precise control over generated outcomes without compromising realism remains a challenge in diffusion models.
  • A novel framework is proposed to enhance controllability in diffusion models by leveraging multi-modal prior distributions and enforcing strong modal coupling.
  • The framework achieves superior fidelity, diversity, and controllability in motion prediction and multi-task control experiments, providing a reliable and scalable solution for controllable motion generation in robotics.

BdSLW401: Transformer-Based Word-Level Bangla Sign Language Recognition Using Relative Quantization Encoding (RQE)

  • Researchers have developed a large-scale, multi-view, word-level Bangla Sign Language (BdSL) dataset called BdSLW401, which consists of 401 signs and 102,176 video samples.
  • To improve transformer-based Sign Language Recognition (SLR), they have introduced a method called Relative Quantization Encoding (RQE) that quantizes motion trajectories and anchors landmarks to physiological reference points.
  • Applying RQE reduced Word Error Rate (WER) by 44.3% on WLASL100 and by 21.0% on SignBD-200, with significant gains on BdSLW60 and SignBD-90 as well.
  • The researchers also introduced an extended variant of RQE called RQE-SF, which improves pose consistency in lateral view recognition.

Iterative Value Function Optimization for Guided Decoding

  • Reinforcement Learning from Human Feedback (RLHF) is a popular method for controlling language model outputs but has high computational costs and training instability.
  • Value-guided decoding offers a cost-effective alternative for controlling outputs without re-training models.
  • However, accurate estimation of the optimal value function is crucial for effective value-guided decoding.
  • The proposed Iterative Value Function Optimization framework addresses these limitations through Monte Carlo Value Estimation and Iterative On-Policy Optimization, leading to efficient and effective control of language models.
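
To make the value-guided decoding idea above concrete, here is a minimal sketch. It assumes the common formulation in which base next-token probabilities are reweighted by exp(value / beta), and it estimates each value as the mean reward of sampled completions. The rollout and reward functions are toy stand-ins, and the paper's iterative on-policy step is not shown.

```python
import math
import random

def monte_carlo_value(prefix, rollout, reward, num_samples, rng):
    """Estimate the value of a prefix as the mean reward of sampled completions."""
    total = 0.0
    for _ in range(num_samples):
        total += reward(rollout(prefix, rng))
    return total / num_samples

def guided_distribution(base_probs, values, beta):
    """Reweight base next-token probabilities by exp(value / beta), then renormalize."""
    weights = {tok: p * math.exp(values[tok] / beta) for tok, p in base_probs.items()}
    norm = sum(weights.values())
    return {tok: w / norm for tok, w in weights.items()}

# Hypothetical stand-ins for a language model rollout and a reward model.
def toy_rollout(prefix, rng):
    return prefix + [rng.choice(["a", "b"])]

def toy_reward(sequence):
    return 1.0 if sequence[0] == "good" else 0.0

rng = random.Random(0)
base_probs = {"good": 0.5, "bad": 0.5}
values = {
    tok: monte_carlo_value([tok], toy_rollout, toy_reward, 8, rng)
    for tok in base_probs
}
guided = guided_distribution(base_probs, values, beta=1.0)
```

A smaller beta pushes the guided distribution harder toward high-value tokens, while a larger beta keeps it closer to the base model.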

Robust detection of overlapping bioacoustic sound events

  • A method called Voxaboxen has been proposed for accurately detecting bioacoustic sound events, while being robust to overlapping events.
  • Voxaboxen takes inspiration from object detection methods in computer vision and incorporates advances in self-supervised audio encoders.
  • For each time window, it predicts whether a vocalization starts there and how long it lasts, and it also predicts whether a vocalization ends there and how long ago it started.
  • Voxaboxen demonstrates state-of-the-art results in detecting overlapping vocalizations on multiple datasets, including a new dataset of annotated zebra finch recordings.

Wyckoff Transformer: Generation of Symmetric Crystals

  • The symmetry rules that atoms obey when they bond into an ordered crystal play a fundamental role in determining the crystal's properties.
  • Generating stable crystal structures is still a challenge due to a lack of accounting for symmetry rules.
  • WyFormer is a generative model for materials that accounts for space group symmetry by using Wyckoff positions.
  • WyFormer demonstrates best-in-class symmetry-conditioned generation, physics-motivated bias, stability, material property prediction, and fast inference.

InfoGNN: End-to-end deep learning on mesh via graph neural networks

  • InfoGNN is an end-to-end framework for deep learning on mesh data using graph neural networks (GNN).
  • It treats mesh models as graphs, enabling efficient handling of irregular mesh data.
  • InfoGNN introduces modules for utilizing position information, face normals, and dihedral angles to leverage different types of data.
  • Experimental results demonstrate that InfoGNN achieves outstanding performance in mesh classification and segmentation tasks.
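
The geometric inputs named above (face normals and dihedral angles) and the mesh-as-graph view can be sketched as follows. This is an illustrative computation only: it assumes vertices are graph nodes, and it measures the dihedral feature as the angle between adjacent face normals, which is one common convention. How InfoGNN itself encodes these quantities is not stated in the summary.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def face_normal(v0, v1, v2):
    """Unit normal of a triangle, oriented by the vertex order."""
    n = cross(sub(v1, v0), sub(v2, v0))
    length = math.sqrt(dot(n, n))
    return tuple(c / length for c in n)

def normal_angle(n1, n2):
    """Angle in radians between two unit normals, clamped for numerical safety."""
    return math.acos(max(-1.0, min(1.0, dot(n1, n2))))

def mesh_edges(faces):
    """Undirected vertex-to-vertex edges of a triangle mesh (vertices as graph nodes)."""
    edges = set()
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            edges.add((min(a, b), max(a, b)))
    return edges

# Two triangles sharing the edge between vertices 0 and 1, folded at a right angle.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 1, 2), (1, 0, 3)]
normals = [face_normal(*(vertices[i] for i in face)) for face in faces]
angle = normal_angle(normals[0], normals[1])
edges = mesh_edges(faces)
```

The shared edge appears only once in `edges`, which is what lets a graph neural network pass messages across irregular connectivity without resampling the mesh.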

Aggregation Strategies for Efficient Annotation of Bioacoustic Sound Events Using Active Learning

  • The study explores efficient annotation strategies for Sound Event Detection (SED) applications using Active Learning (AL).
  • A novel uncertainty aggregation strategy called Top K Entropy is introduced, which prioritizes the most uncertain segments in an audio recording for annotation.
  • Compared to random sampling and Mean Entropy, Top K Entropy leads to improved annotation efficiency in sparse data scenarios.
  • Using Top K Entropy, the study demonstrates comparable model performance with only 8% of the labels compared to training on the fully labeled dataset.
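
The difference between the two aggregation strategies above can be seen in a small sketch. It assumes Top K Entropy means averaging the k highest per-segment entropies of a recording; the function names and the toy probabilities are hypothetical and not taken from the study.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one segment's class probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_entropy(segment_probs):
    """Average uncertainty over all segments of a recording."""
    scores = [entropy(p) for p in segment_probs]
    return sum(scores) / len(scores)

def top_k_entropy(segment_probs, k):
    """Average uncertainty over the k most uncertain segments only."""
    scores = sorted((entropy(p) for p in segment_probs), reverse=True)
    top = scores[:k]
    return sum(top) / len(top)

def rank_recordings(recordings, score):
    """Order recording names from most to least uncertain under the given score."""
    return sorted(recordings, key=lambda name: score(recordings[name]), reverse=True)

# Toy data: one recording with a single highly uncertain segment among confident
# ones, and one recording that is moderately uncertain everywhere.
recordings = {
    "sparse": [[0.5, 0.5]] + [[0.99, 0.01]] * 9,
    "uniform": [[0.8, 0.2]] * 10,
}
by_top_k = rank_recordings(recordings, lambda segs: top_k_entropy(segs, k=1))
by_mean = rank_recordings(recordings, mean_entropy)
```

With k=1, the recording containing one very uncertain segment is ranked first, whereas mean entropy dilutes that segment among the confident ones and ranks the other recording first. This matches the intuition for why the top-k variant helps when events are sparse.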
