techminis

A naukri.com initiative

ML News

Source: Arxiv

Evaluating machine learning models for predicting pesticides toxicity to honey bees

  • Small molecules play a critical role in the biomedical, environmental, and agrochemical domains.
  • This work focuses on ApisTox, the most comprehensive dataset of experimentally validated chemical toxicity to the honey bee (Apis mellifera).
  • Evaluation of ApisTox with a range of machine learning approaches reveals that it occupies a chemical space distinct from common biomedical datasets.
  • The limited generalizability of current state-of-the-art algorithms trained solely on biomedical data highlights the need for targeted model development in the agrochemical domain.
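The claim that ApisTox occupies a distinct chemical space can be illustrated with a standard cheminformatics measure, Tanimoto (Jaccard) similarity over binary fingerprints. The bit vectors below are random stand-ins, not real molecules:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary fingerprints (stand-ins for e.g. 2048-bit ECFP vectors).
agro = rng.random((50, 128)) < 0.1     # sparse "agrochemical-like" bit patterns
biomed = rng.random((50, 128)) < 0.3   # denser "biomedical-like" bit patterns

def tanimoto(a, b):
    # Jaccard similarity between two binary fingerprints.
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

# Mean nearest-neighbor similarity of each agro molecule to the biomed set --
# low values suggest the two sets occupy distinct regions of chemical space.
nn = [max(tanimoto(a, b) for b in biomed) for a in agro]
print(round(float(np.mean(nn)), 2))
```

A model trained only on the denser "biomedical" patterns would rarely have seen neighbors of the sparse "agrochemical" ones, which is one intuition for the limited generalizability reported above.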


NoProp: Training Neural Networks without Back-propagation or Forward-propagation

  • The paper introduces a new learning method named NoProp, which does not rely on either forward or backward propagation in deep learning.
  • NoProp takes inspiration from diffusion and flow matching methods to independently learn to denoise a noisy target at each layer.
  • The method demonstrates superior accuracy, ease of use, and computational efficiency compared to other back-propagation-free methods on image classification benchmarks such as MNIST, CIFAR-10, and CIFAR-100.
  • NoProp departs from the traditional gradient-based learning paradigm; because each layer is trained independently, it may enable more efficient distributed learning.
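A toy sketch of the per-layer idea, under heavy simplification (linear layers fit by least squares, numpy only; this is not the paper's architecture or training rule): each layer independently learns to denoise a noisy copy of the target, and no gradient flows between layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2-class problem with one-hot targets.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[y]                       # clean targets

T = 3                                  # number of "layers" (denoising steps)
noise = [1.0, 0.5, 0.25]               # noise level per layer
layers = []

# Each layer is trained INDEPENDENTLY: it sees the input plus a noisy
# version of the target and learns to predict the clean target.
# No error signal is propagated between layers.
for t in range(T):
    Z = Y + noise[t] * rng.normal(size=Y.shape)   # noisy target fed to layer t
    F = np.hstack([X, Z, np.ones((len(X), 1))])   # features: input, noisy target, bias
    W, *_ = np.linalg.lstsq(F, Y, rcond=None)     # local least-squares fit
    layers.append(W)

# Inference: start from pure noise and denoise layer by layer.
Z = rng.normal(size=Y.shape)
for W in layers:
    F = np.hstack([X, Z, np.ones((len(X), 1))])
    Z = F @ W                                      # this layer's denoised estimate

acc = (Z.argmax(1) == y).mean()
print(round(acc, 2))
```

Because each fit is local, the T layers could in principle be trained on different machines at once, which is the distributed-learning angle mentioned above.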


ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion

  • Parameter generation has emerged as a novel paradigm for neural network development, offering an alternative to traditional neural network training by synthesizing high-quality model weights directly.
  • In this paper, a novel conditional recurrent diffusion framework called ORAL is introduced, which addresses the limitations of existing methods in achieving scalability and controllability.
  • ORAL incorporates a novel conditioning mechanism to generate task-specific Low-Rank Adaptation (LoRA) parameters that can seamlessly transfer across evolving language models.
  • Extensive experiments show that ORAL generates high-quality LoRA parameters, achieving comparable or superior performance to vanilla trained counterparts across various language, vision, and multimodal tasks.


SQuat: Subspace-orthogonal KV Cache Quantization

  • Researchers propose SQuat (Subspace-orthogonal KV cache quantization) to reduce the memory usage of the key-value (KV) cache used during LLM decoding.
  • SQuat constructs a subspace spanned by query tensors to capture critical task-related information.
  • SQuat enforces orthogonality between (de)quantized and original keys in the subspace, minimizing the impact of quantization errors.
  • The method achieves reduced memory usage, improved throughput, and better benchmark scores compared to existing KV cache quantization algorithms.
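The orthogonality idea can be demonstrated in a few lines of numpy. The uniform quantizer and the explicit correction step below are illustrative, not SQuat's actual algorithm: the point is that pushing quantization error out of the query subspace preserves the attention scores q·k for queries in that subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
Q = rng.normal(size=(8, d))    # recent query tensors
K = rng.normal(size=(32, d))   # cached keys to quantize

# Orthonormal basis of the subspace spanned by the queries (via QR).
B, _ = np.linalg.qr(Q.T)       # d x 8; columns span the query subspace

def fake_quant(x, bits=2):
    # Crude uniform quantizer, a stand-in for a real KV-cache quantizer.
    lo, hi = x.min(), x.max()
    levels = 2 ** bits - 1
    return np.round((x - lo) / (hi - lo) * levels) / levels * (hi - lo) + lo

K_naive = fake_quant(K)

# SQuat-style principle: make the residual key error orthogonal to the
# query subspace by restoring the in-subspace component of the error.
err = K - K_naive
K_squat = K_naive + err @ B @ B.T

attn_err_naive = np.abs(Q @ K.T - Q @ K_naive.T).mean()
attn_err_squat = np.abs(Q @ K.T - Q @ K_squat.T).mean()
print(attn_err_squat < attn_err_naive)
```

Since every row of Q lies in the span of B, the corrected keys reproduce Q @ K.T almost exactly even though the out-of-subspace error remains coarsely quantized.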


Which LIME should I trust? Concepts, Challenges, and Solutions

  • Explainable Artificial Intelligence (XAI) is crucial for fostering trust and detecting potential misbehavior of opaque models.
  • LIME (Local Interpretable Model-agnostic Explanations) is a popular model-agnostic approach for generating explanations of black-box models.
  • LIME faces challenges related to fidelity, stability, and applicability to domain-specific problems.
  • A survey has been conducted to comprehensively explore and collect LIME's foundational concepts and known limitations, categorize and compare its enhancements, and offer a structured taxonomy for future research and practical application.
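LIME's core recipe — perturb around an instance, weight samples by proximity, fit an interpretable surrogate — can be sketched without the `lime` package. All choices below (black box, kernel width, sample count) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Opaque "black box": a nonlinear function we want to explain locally.
def black_box(X):
    return np.sin(X[:, 0]) + X[:, 1] ** 2

x0 = np.array([1.0, 2.0])          # the instance to explain

# 1. Perturb around the instance.
Z = x0 + 0.5 * rng.normal(size=(500, 2))
y = black_box(Z)

# 2. Weight samples by proximity to x0 (exponential kernel).
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.5)

# 3. Fit a weighted linear surrogate (the interpretable model).
A = np.hstack([Z, np.ones((len(Z), 1))])
sw = np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(A * sw, y * sw.ravel(), rcond=None)

# The surrogate's coefficients approximate local partial derivatives:
# d/dx0 sin(x) = cos(1) ~ 0.54, and d/dx1 x^2 = 2*2 = 4 at x0.
print(np.round(coef[:2], 1))
```

The fidelity and stability challenges mentioned above show up directly in this sketch: change the kernel width or the sampling seed and the attributions shift, which is exactly what many of the surveyed LIME enhancements try to control.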


Effectively Controlling Reasoning Models through Thinking Intervention

  • Reasoning-enhanced large language models (LLMs) generate intermediate reasoning steps prior to generating final answers, excelling in complex problem-solving.
  • Thinking Intervention is a novel paradigm designed to guide the internal reasoning processes of LLMs by strategically inserting or revising specific thinking tokens.
  • Comprehensive evaluations show that Thinking Intervention outperforms baseline prompting approaches, achieving significant improvements in instruction following, instruction hierarchy, and safety alignment tasks.
  • The research on Thinking Intervention offers a promising new avenue for controlling reasoning LLMs.
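A minimal illustration of the insertion idea, assuming hypothetical `<think>` delimiters (the paper's actual token format and intervention policy may differ):

```python
# Hypothetical helper: inject an intervention string at the start of the
# model's thinking section, assuming a <think> ... </think> convention.
def intervene(model_output_prefix: str, intervention: str) -> str:
    tag = "<think>"
    if tag not in model_output_prefix:
        return model_output_prefix
    head, _, tail = model_output_prefix.partition(tag)
    return f"{head}{tag}\n{intervention}\n{tail}"

prompt = "<think>Let me work through the user's request."
guided = intervene(prompt, "I must follow the system instruction before anything else.")
print(guided)
```

The intervention text steers the subsequent chain of thought without retraining the model, which is what distinguishes this from ordinary prompt engineering applied outside the reasoning trace.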


Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing

  • Researchers have developed a framework to bypass safety filters of large language models (LLMs) and generate malicious code.
  • The framework employs distributed prompt processing and iterative refinements to achieve a 73.2% success rate (SR) in generating malicious code.
  • Comparative analysis shows that a traditional single-LLM-judge evaluation overestimates success rates relative to a multi-LLM jury system.
  • The distributed architecture improves SRs by 12% compared to the non-distributed approach.


Truth in Text: A Meta-Analysis of ML-Based Cyber Information Influence Detection Approaches

  • Cyber information influence, or disinformation, is a significant threat to social progress and government stability.
  • ML techniques, including traditional ML algorithms and deep learning models, are being used to detect disinformation in online media.
  • A two-stage meta-analysis was conducted to assess the effectiveness of ML models in detecting disinformation.
  • The majority of the ML detection techniques sampled achieved over 80% accuracy, with a mean sample effectiveness of 79.18% accuracy.


Enhancing Aviation Communication Transcription: Fine-Tuning Distil-Whisper with LoRA

  • The paper applies Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method, to Distil-Whisper, a computationally lighter variant of the Whisper automatic speech recognition model, for aviation communication transcription.
  • The authors used the Air Traffic Control Corpus dataset and ran a grid search with 5-fold cross-validation to optimize Distil-Whisper's hyperparameters.
  • The fine-tuned model achieved an average word error rate of 3.86% across five folds, indicating its potential for accurate transcription of aviation communication.
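The LoRA update itself is simple; a numpy sketch with illustrative dimensions (not the paper's configuration) shows why it is parameter-efficient:

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 512, 512, 8, 16   # illustrative sizes, not the paper's

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable low-rank factor
B = np.zeros((d_out, r))                  # B starts at zero: update is a no-op at init

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A, never materialized.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)   # identical to the base model before training

full = W.size
lora = A.size + B.size
print(f"trainable params: {lora} vs full fine-tune {full} ({lora / full:.1%})")
```

Only A and B are updated during fine-tuning, so the trainable footprint is a small fraction of the frozen weight matrix, which is what makes the approach practical for adapting Distil-Whisper on modest hardware.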


Modeling speech emotion with label variance and analyzing performance across speakers and unseen acoustic conditions

  • Spontaneous speech emotion data often have uncertainty in labels due to grader opinion variation.
  • Using the probability density function of emotion grades as targets instead of consensus grades improves performance on benchmark evaluation sets.
  • Saliency-driven foundation model representation selection helps train a state-of-the-art speech emotion model for both dimensional and categorical emotion recognition.
  • Performance evaluation across multiple test-sets, along with analysis across gender and speakers, is necessary to assess the usefulness of emotion models.
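The difference between a consensus-grade target and a distribution target can be shown concretely; the grades and the predicted distribution below are invented for illustration.

```python
import numpy as np

# Five graders rate one utterance's emotion on a 1-5 scale.
grades = np.array([2, 3, 3, 4, 3])

# Consensus-grade target: a single value that discards grader disagreement.
consensus = grades.mean()                       # 3.0

# Distribution target: empirical probability over the grade bins,
# preserving label uncertainty for the model to learn from.
bins = np.arange(1, 6)
target = np.array([(grades == b).mean() for b in bins])  # 1/5, 0, 3/5, 1/5, 0

# Training against the soft distribution uses cross-entropy rather than
# MSE against the consensus value.
pred = np.array([0.1, 0.1, 0.5, 0.2, 0.1])      # example model output
xent = -(target * np.log(pred)).sum()
print(round(xent, 3))
```

The probability-density targets described above generalize this idea from discrete bins to continuous emotion grades, but the principle is the same: the model is rewarded for matching the full grader distribution, not just its mean.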


Risk-Calibrated Affective Speech Recognition via Conformal Coverage Guarantees: A Stochastic Calibrative Framework for Emergent Uncertainty Quantification

  • Traffic safety challenges arising from extreme driver emotions highlight the urgent need for reliable emotion recognition systems.
  • Traditional deep learning approaches in speech emotion recognition suffer from overfitting and poorly calibrated confidence estimates.
  • A framework integrating Conformal Prediction (CP) and Risk Control is proposed, using Mel-spectrogram features processed through a pre-trained convolutional neural network.
  • The Risk Control framework enables task-specific adaptation through customizable loss functions, dynamically adjusting prediction set sizes while maintaining coverage guarantees.
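Split conformal prediction, the core of the CP component, can be sketched end to end. The stand-in classifier, class count, and coverage level below are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_classes, alpha = 500, 4, 0.1    # 4 emotion classes, 90% target coverage

# Stand-in classifier: softmax scores where the true class is usually ranked high.
def scores(n):
    y = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes))
    logits[np.arange(n), y] += 2.0
    p = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
    return p, y

# Calibration: nonconformity score = 1 - probability assigned to the true class.
p_cal, y_cal = scores(n_cal)
s = 1 - p_cal[np.arange(n_cal), y_cal]
q = np.quantile(s, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal)

# Prediction sets on fresh data: include every class with probability >= 1 - q.
p_test, y_test = scores(2000)
sets = p_test >= 1 - q
coverage = sets[np.arange(2000), y_test].mean()
print(round(coverage, 3))
```

The coverage guarantee is marginal: roughly 90% of test prediction sets contain the true emotion, and the Risk Control layer described above then tunes the set sizes through task-specific loss functions.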


Chirp Localization via Fine-Tuned Transformer Model: A Proof-of-Concept Study

  • Researchers have developed a fine-tuned Transformer model to detect and localize chirp-like patterns in EEG spectrograms, which are important biomarkers for seizure dynamics.
  • The study utilized synthetic spectrograms with chirp parameters to create a benchmark for chirp localization.
  • The Vision Transformer (ViT) model was adapted for regression to predict chirp parameters, and attention layers were fine-tuned using Low-Rank Adaptation (LoRA).
  • The model achieved a strong alignment between predicted and actual labels, with a correlation of 0.9841 for chirp start time.
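The synthetic-benchmark idea can be sketched as a linear chirp with known (start time, start frequency, rate) labels plus a crude spectrogram; all parameters are invented, and the paper's generation pipeline may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, dur = 256, 4.0                        # sample rate (Hz) and duration (s)
t = np.arange(int(fs * dur)) / fs

# Synthetic linear chirp starting at t0, with known parameters -- the kind
# of labeled example a regression model can be trained on for localization.
t0, f0, rate = 1.5, 10.0, 20.0            # start time (s), start freq (Hz), Hz/s
active = t >= t0
tau = np.where(active, t - t0, 0.0)
signal = np.where(active, np.sin(2 * np.pi * (f0 * tau + 0.5 * rate * tau**2)), 0.0)
signal += 0.1 * rng.normal(size=t.shape)  # background noise

# Crude spectrogram: magnitudes of windowed FFTs (the image a ViT would see).
win, hop = 64, 32
frames = [signal[i:i + win] * np.hanning(win)
          for i in range(0, len(signal) - win, hop)]
spec = np.abs(np.fft.rfft(np.array(frames), axis=1)).T   # (freq, time)

# The regression label for this sample is the chirp's parameters.
label = np.array([t0, f0, rate])
print(spec.shape, label)
```

Feeding such spectrograms and labels to a ViT adapted for regression, with LoRA applied to the attention layers, mirrors the setup summarized above.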


A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI

  • A large-scale vision-language dataset derived from open scientific literature, Biomedica, has been introduced to advance biomedical generalist AI.
  • The dataset contains over 6 million scientific articles, 24 million image-text pairs, and 27 metadata fields, including expert human annotations.
  • Scalable streaming and search APIs are provided for easy access to the dataset, facilitating seamless integration with AI systems.
  • The utility of the Biomedica dataset has been demonstrated through the development of embedding models, chat-style models, and retrieval-augmented chat agents, outperforming previous open systems.


Symmetry-Informed Graph Neural Networks for Carbon Dioxide Isotherm and Adsorption Prediction in Aluminum-Substituted Zeolites

  • Accurately predicting adsorption properties in nanoporous materials using Deep Learning models remains a challenging task.
  • SymGNN is a graph neural network architecture that leverages material symmetries to improve adsorption property prediction.
  • The model successfully captures key adsorption trends, including the influence of both the framework and aluminium distribution on CO2 adsorption.
  • The study suggests promising directions for fine-tuning with experimental data and generative approaches for the inverse design of multifunctional nanomaterials.
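One simple way to leverage symmetry, shown here on a toy ring rather than a real zeolite graph, is to average a model's predictions over the symmetry group so that symmetry-equivalent aluminium placements receive identical predictions. SymGNN's actual mechanism may differ; this only illustrates the invariance principle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "framework": 6 sites on a ring; a structure is a 0/1 vector marking
# which sites carry an aluminium substitution.
structure = np.array([1, 0, 0, 1, 0, 0])

# An arbitrary, non-invariant scoring function (stand-in for a GNN readout).
w = rng.normal(size=6)
def score(x):
    return float(w @ x)

# Symmetry group of the ring: all rotations. Averaging over the group orbit
# makes the prediction invariant to symmetry-equivalent structures.
def sym_score(x):
    return float(np.mean([score(np.roll(x, k)) for k in range(6)]))

rotated = np.roll(structure, 2)            # physically the same structure
print(score(structure) == score(rotated))  # plain model: generally not equal
print(np.isclose(sym_score(structure), sym_score(rotated)))
```

Baking such invariances into the architecture, rather than hoping the model learns them from data, is the motivation behind symmetry-informed designs like SymGNN.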


Malicious and Unintentional Disclosure Risks in Large Language Models for Code Generation

  • This paper explores the risks of unintentional and malicious disclosure in large language models trained for code generation.
  • Unintentional disclosure refers to the language model presenting secrets to users without user intent, while malicious disclosure refers to presenting secrets to an attacker.
  • The study assesses the risks of unintentional and malicious disclosure in the Open Language Model (OLMo) family of models and the Dolma training datasets.
  • The results show that changes in data source and processing greatly affect the risk of unintended memorization, and the risk of disclosing sensitive information varies based on prompt strategies and types of sensitive information.
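A crude illustration of how verbatim memorization can be probed (this is not the paper's methodology): flag generated text that reproduces long n-grams from a training document.

```python
# Illustrative memorization probe: measure how much of a model's output
# consists of word n-grams copied verbatim from a training document.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def verbatim_overlap(generated: str, training_doc: str, n: int = 8) -> float:
    gen, train = generated.split(), training_doc.split()
    gen_grams = ngrams(gen, n)
    if not gen_grams:
        return 0.0
    return len(gen_grams & ngrams(train, n)) / len(gen_grams)

# Invented example strings (the key fragment is the classic AWS docs example).
train = ("aws_secret_access_key = wJalrXUtnFEMI example do not commit "
         "keys like this to public repos")
leak = ("here is a config aws_secret_access_key = wJalrXUtnFEMI example "
        "do not commit keys like this")
safe = "always rotate credentials and use a secrets manager instead of hardcoding values"

print(verbatim_overlap(leak, train))   # high: verbatim training content reproduced
print(verbatim_overlap(safe, train))   # 0.0
```

Studies like the one above run far more sophisticated versions of this check across prompt strategies and categories of sensitive information, but the underlying signal is the same: long verbatim overlap with training data indicates memorization.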
