techminis
A naukri.com initiative

ML News

Source: Arxiv

Rethinking Optimization and Architecture for Tiny Language Models

  • Deploying language models on mobile devices is challenging because of their computation and memory costs.
  • The study analyzes how individual design choices affect tiny language models of around 1B parameters.
  • Optimization strategies such as tokenizer compression, architecture tweaking, and parameter inheritance prove effective; a rough sketch of parameter inheritance follows below.
  • Experimental results show that the improved optimization and architecture yield notable performance gains for tiny language models.
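
The paper's exact recipes are not spelled out in this summary; as a rough illustration, the sketch below shows one common form of parameter inheritance: initializing a small Transformer by copying a subset of layers from a larger pretrained model. The `model.layers` attribute and the layer-selection rule are assumptions for illustration, not the paper's method.

```python
import torch.nn as nn

def inherit_parameters(small_model: nn.Module, large_model: nn.Module,
                       layer_map: dict[int, int]) -> None:
    """Copy selected transformer blocks from a large model into a small one.

    `layer_map` maps small-model layer index -> large-model layer index
    (e.g. {0: 0, 1: 7, 2: 15, 3: 23} keeps evenly spaced layers).
    Both models are assumed to expose their blocks as `model.layers`,
    with matching hidden sizes so the state dicts line up.
    """
    for small_idx, large_idx in layer_map.items():
        small_layer = small_model.layers[small_idx]
        large_layer = large_model.layers[large_idx]
        small_layer.load_state_dict(large_layer.state_dict())
```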

Source: Arxiv

Decentralized Bilevel Optimization: A Perspective from Transient Iteration Complexity

  • Stochastic bilevel optimization (SBO) is increasingly important in machine learning because of its nested problem structure.
  • Decentralized approaches such as D-SOBA improve communication efficiency and algorithmic robustness.
  • The D-SOBA framework has two variants: D-SOBA-SO, which uses second-order matrices, and D-SOBA-FO, which relies only on first-order gradients; a simplified first-order sketch follows below.
  • A comprehensive non-asymptotic convergence analysis of D-SOBA reveals the impact of network topology and data heterogeneity.
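
D-SOBA's actual update rules are not given here; the sketch below is a minimal, first-order-flavoured illustration of decentralized bilevel optimization: each agent takes an inner step on its lower-level loss, an outer step using only first-order gradients, and then gossip-averages with its neighbors through a mixing matrix W. The quadratic toy objectives and the mixing matrix are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def decentralized_bilevel_step(x, y, grads_outer, grads_inner, W, lr_x=0.05, lr_y=0.1):
    """One first-order decentralized bilevel step for n agents.

    x, y : (n, d) outer and inner variables, one row per agent
    grads_outer(i, x_i, y_i) -> gradient of agent i's upper-level loss w.r.t. x
    grads_inner(i, x_i, y_i) -> gradient of agent i's lower-level loss w.r.t. y
    W    : (n, n) doubly stochastic mixing matrix (gossip weights)
    """
    n, _ = x.shape
    y_new = np.stack([y[i] - lr_y * grads_inner(i, x[i], y[i]) for i in range(n)])
    x_new = np.stack([x[i] - lr_x * grads_outer(i, x[i], y_new[i]) for i in range(n)])
    # Gossip averaging: each agent mixes its variables with its neighbors' according to W.
    return W @ x_new, W @ y_new

# Toy example: 3 agents, symmetric doubly stochastic mixing, simple quadratic objectives.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 3, 2
    W = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
    targets = rng.normal(size=(n, d))
    g_out = lambda i, xi, yi: xi - yi         # upper level pulls x toward y
    g_in = lambda i, xi, yi: yi - targets[i]  # lower level pulls y toward a local target
    x, y = np.zeros((n, d)), np.zeros((n, d))
    for _ in range(200):
        x, y = decentralized_bilevel_step(x, y, g_out, g_in, W)
    print(x)  # all agents end up near the average of the local targets
```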

Source: Arxiv

Distributed Fractional Bayesian Learning for Adaptive Optimization

  • The paper studies a distributed adaptive optimization problem in which agents collaboratively estimate an unknown parameter while seeking the optimal solution.
  • The proposed Prediction while Optimization scheme combines distributed fractional Bayesian learning with distributed gradient descent; a simplified sketch of both ingredients follows below.
  • Under suitable assumptions, the paper proves convergence of the agents' beliefs and decision variables to the true parameter and the optimal solution.
  • Numerical experiments validate the theoretical analysis.
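
The summary does not give the update equations; the sketch below illustrates the two ingredients in their simplest form: a fractional (tempered) Bayesian belief update over a discrete grid of candidate parameters, and a consensus gradient-descent step on the decision variable. The likelihood model, the fraction alpha, and the mixing weights are assumptions for illustration.

```python
import numpy as np

def fractional_bayes_update(log_likelihood, neighbors_log_beliefs, alpha=0.5):
    """Tempered Bayesian update with geometric averaging of neighbors' beliefs.

    log_likelihood        : (K,) log-likelihood of the agent's new observation
                            under each of K candidate parameter values
    neighbors_log_beliefs : list of (K,) log-beliefs received from neighbors (including self)
    alpha                 : fraction applied to the likelihood (alpha=1 recovers standard Bayes)
    """
    mixed = np.mean(neighbors_log_beliefs, axis=0)   # geometric average of neighbor beliefs
    new = mixed + alpha * log_likelihood
    return new - np.logaddexp.reduce(new)            # renormalize to a proper log-belief

def consensus_gradient_step(x, grads, W, lr=0.1):
    """Distributed gradient descent: mix decision variables with neighbors, then step locally.

    x     : (n, d) decision variables, one row per agent
    grads : (n, d) local gradients evaluated at the agents' current estimates
    W     : (n, n) doubly stochastic mixing matrix
    """
    return W @ x - lr * grads
```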

Source: Arxiv

Modeling Caption Diversity in Contrastive Vision-Language Pretraining

  • Llip (Latent Language Image Pretraining) is introduced to model the diversity of captions that could match an image.
  • Llip's vision encoder outputs a set of visual features that are mixed into a final representation by conditioning on information derived from the text; a minimal sketch of such text-conditioned mixing follows below.
  • Llip outperforms non-contextualized baselines such as CLIP and SigLIP on various tasks, including zero-shot classification and retrieval.
  • Llip achieves 83.5% zero-shot top-1 accuracy on ImageNet, outperforming a similarly sized CLIP by 1.4%.
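
The summary describes mixing a set of visual features into one representation conditioned on the text; below is a minimal PyTorch sketch of that idea using a single cross-attention layer in which the text embedding acts as the query. The dimensions and the use of `nn.MultiheadAttention` are assumptions, not Llip's exact architecture.

```python
import torch
import torch.nn as nn

class TextConditionedPooling(nn.Module):
    """Pool a set of visual tokens into one vector, conditioned on a text embedding."""

    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, visual_tokens: torch.Tensor, text_embed: torch.Tensor) -> torch.Tensor:
        # visual_tokens: (batch, num_tokens, dim), text_embed: (batch, dim)
        query = text_embed.unsqueeze(1)                  # the caption embedding is the query
        mixed, _ = self.attn(query, visual_tokens, visual_tokens)
        return mixed.squeeze(1)                          # (batch, dim) final image representation

# Example: 4 images with 16 visual tokens each, mixed against their captions' embeddings.
pool = TextConditionedPooling()
img_repr = pool(torch.randn(4, 16, 512), torch.randn(4, 512))
```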

Source: Arxiv

Configurable Holography: Towards Display and Scene Adaptation

  • Researchers have developed a highly configurable learned model structure for synthesizing 3D holograms.
  • The models can be conditioned on varying display-scene parameters, including input images, propagation distances, and volume depths; a generic conditioning sketch follows below.
  • Exploiting the correlation between depth estimation and hologram synthesis yields an accurate model for generating 3D holograms from 2D images.
  • The models are validated through simulations and two different holographic display prototypes.
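
The exact conditioning mechanism is not described in this summary; as a generic illustration, the sketch below conditions a network's intermediate features on scalar display-scene parameters (for example, propagation distance and volume depth) via learned scale and shift, one common way to make a single model configurable. This is an assumption for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ParameterConditioning(nn.Module):
    """Scale-and-shift (FiLM-style) conditioning on scalar display/scene parameters."""

    def __init__(self, num_params: int, channels: int):
        super().__init__()
        self.to_scale_shift = nn.Linear(num_params, 2 * channels)

    def forward(self, features: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        # features: (batch, channels, H, W); params: (batch, num_params), e.g. [distance, depth]
        scale, shift = self.to_scale_shift(params).chunk(2, dim=-1)
        return features * (1 + scale[..., None, None]) + shift[..., None, None]

# Hypothetical usage: condition a 32-channel feature map on two scene parameters.
cond = ParameterConditioning(num_params=2, channels=32)
out = cond(torch.randn(1, 32, 64, 64), torch.tensor([[0.15, 0.06]]))
```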

Source: Arxiv

A distance for mixed-variable and hierarchical domains with meta variables

  • A modeling framework is introduced that generalizes hierarchical and mixed-variable domains with meta variables.
  • The framework allows comparison of mixed-variable points that do not share the same variables; a simplified distance of this kind is sketched below.
  • The methodology is applied to regression and classification experiments using distance-based models.
  • The experiments use datasets of hyperparameters and their performance scores.
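
The framework's actual distance is not reproduced here; the sketch below conveys the basic idea in a simplified form: points are dictionaries of variables, shared variables contribute their difference, and variables present in only one point (because a meta variable switched them off) contribute a fixed penalty. The variable names and the penalty value are illustrative assumptions.

```python
def mixed_variable_distance(p1: dict, p2: dict, missing_penalty: float = 1.0) -> float:
    """Distance between two points that may not share the same variables.

    Numerical variables contribute absolute differences, categorical variables
    contribute 0/1 mismatches, and variables missing from one point (excluded
    by a meta variable) contribute a fixed penalty.
    """
    dist = 0.0
    for var in p1.keys() | p2.keys():
        if var not in p1 or var not in p2:
            dist += missing_penalty
        elif isinstance(p1[var], (int, float)) and isinstance(p2[var], (int, float)):
            dist += abs(p1[var] - p2[var])
        else:
            dist += float(p1[var] != p2[var])
    return dist

# Two hyperparameter configurations: the second has no momentum because its
# optimizer (a meta variable) does not use one.
a = {"optimizer": "sgd", "lr": 0.1, "momentum": 0.9}
b = {"optimizer": "adam", "lr": 0.01}
print(mixed_variable_distance(a, b))  # 1 (optimizer) + 0.09 (lr) + 1 (missing momentum)
```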

Source: Arxiv

Lusifer: LLM-based User SImulated Feedback Environment for online Recommender systems

  • Lusifer is an LLM-based simulation environment designed to generate dynamic, realistic user feedback for training RL-based recommenders.
  • Lusifer updates user profiles at each interaction step using large language models (LLMs) and provides transparent explanations of how and why preferences evolve; the general shape of such a loop is sketched below.
  • By processing textual metadata, Lusifer creates context-aware user states and simulates feedback on new items, reducing reliance on extensive historical data and easing adaptation to out-of-distribution cases.
  • Lusifer captures dynamic user responses and yields explainable results, making it a scalable and ethically sound alternative to live user experiments for RL-based recommender systems.
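
Lusifer's prompts and interfaces are not given in this summary; the sketch below shows the general shape of an LLM-driven simulation loop: the user profile is a text state, the LLM rates a recommended item and explains the preference shift, and the profile is updated each step. `call_llm` is a placeholder for whatever LLM client is used, and the prompt and JSON format are assumptions.

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call (e.g. a hosted chat-completion API)."""
    raise NotImplementedError

def simulate_step(user_profile: str, item_metadata: dict) -> tuple[str, dict]:
    """Ask the LLM to rate one recommended item and update the textual user profile."""
    prompt = (
        "You are simulating a user of a recommender system.\n"
        f"Current user profile:\n{user_profile}\n\n"
        f"Recommended item (metadata): {json.dumps(item_metadata)}\n\n"
        'Reply as JSON: {"rating": 1-5, "explanation": "...", "updated_profile": "..."}'
    )
    feedback = json.loads(call_llm(prompt))
    # The rating can serve as the reward signal for the RL-based recommender.
    return feedback["updated_profile"], feedback
```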

Source: Arxiv

Is Algorithmic Stability Testable? A Unified Framework under Computational Constraints

  • Algorithmic stability is a central notion in learning theory that quantifies how sensitive an algorithm is to small changes in the training data; a toy leave-one-out illustration follows below.
  • Recent results establish that testing the stability of a black-box algorithm is impossible given only limited data from an unknown distribution.
  • This work examines the hardness of testing algorithmic stability in a broad range of settings, including categorical data.
  • The study finds that, when the available data is limited, exhaustive search is essentially the only universally valid mechanism for certifying algorithmic stability, implying fundamental limits on stability testing.
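
To make the object under discussion concrete, here is a toy sketch of what certifying stability by exhaustive search over perturbed datasets looks like: retrain a black-box learner on every leave-one-out version of a small dataset and measure how much its predictions move. This illustrates the notion of stability only; it is not the paper's testing procedure.

```python
import numpy as np

def leave_one_out_instability(fit, X, y, X_test):
    """Max prediction change on X_test when any single training point is removed.

    `fit(X, y)` must return a predictor `model(X_test) -> predictions`;
    the learner is treated as a black box.
    """
    base = fit(X, y)(X_test)
    worst = 0.0
    for i in range(len(X)):                    # exhaustive search over perturbed datasets
        mask = np.arange(len(X)) != i
        perturbed = fit(X[mask], y[mask])(X_test)
        worst = max(worst, np.max(np.abs(perturbed - base)))
    return worst

# Example with a simple (and fairly stable) black box: the mean predictor.
fit_mean = lambda X, y: (lambda X_test: np.full(len(X_test), y.mean()))
rng = np.random.default_rng(0)
X, y = rng.normal(size=(50, 3)), rng.normal(size=50)
print(leave_one_out_instability(fit_mean, X, y, X[:5]))
```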

Source: Arxiv

Boost Your Human Image Generation Model via Direct Preference Optimization

  • Human image generation is a key focus in image synthesis with broad applications.
  • Direct Preference Optimization (DPO) is explored to improve the realism of generated images; the standard DPO loss is sketched below.
  • An enhanced DPO approach, HG-DPO, is proposed that uses high-quality real images as the winning images in preference pairs.
  • HG-DPO employs a curriculum learning framework for gradual improvement and adapts to personalized text-to-image tasks.
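
The summary does not spell out the training objective; the sketch below is the standard DPO loss written over per-image log-likelihoods, with high-quality real images playing the role of the winning sample as the bullets describe. How those log-likelihoods are obtained from an image generator (for instance, a diffusion model's variational bound) is left abstract and is an assumption here.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta: float = 0.1):
    """Direct Preference Optimization loss.

    logp_win / logp_lose         : policy log-likelihoods of winning (real) and losing (generated) images
    ref_logp_win / ref_logp_lose : the same quantities under the frozen reference model
    """
    win_margin = logp_win - ref_logp_win
    lose_margin = logp_lose - ref_logp_lose
    return -F.logsigmoid(beta * (win_margin - lose_margin)).mean()

# Toy usage with made-up log-likelihoods for a batch of 4 preference pairs.
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
```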

Source: Arxiv

LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs

  • Large language models (LLMs) show impressive capabilities in mathematical reasoning.
  • A new benchmark, the Mathematical Topics Tree (MaTT), is introduced to evaluate LLMs across a comprehensive range of mathematical subjects.
  • Even GPT-4 achieved only 54% accuracy in the multiple-choice setting of the MaTT benchmark.
  • LLM performance varied significantly across mathematical topics, and the models' explanations were incomplete or inaccurate in many instances.

Source: Arxiv

VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment

  • VELOCITI is a benchmark created to study Video-LLMs and assess compositional reasoning in short videos.
  • It disentangles and evaluates the comprehension of agents, actions, and their associations across multiple events.
  • Current video models such as LLaVA-OneVision and Gemini-1.5-Pro fall far short of human accuracy at classifying positive and negative captions.
  • The benchmark highlights problems with ClassicVLE and multiple-choice evaluation, arguing for StrictVLE instead.

Source: Arxiv

Coupled Input-Output Dimension Reduction: Application to Goal-oriented Bayesian Experimental Design and Global Sensitivity Analysis

  • A new method is introduced for joint dimension reduction of the input and output spaces of a function.
  • Conventional methods reduce either the input or the output space; this coupled approach supports simultaneous reduction of both, as in the simplified sketch below.
  • The method is suited to goal-oriented dimension reduction, where input or output quantities of interest are prescribed.
  • Applications include goal-oriented sensor placement and goal-oriented sensitivity analysis, solving combinatorial optimization problems by optimizing gradient-based bounds.
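
The paper's derivation is not reproduced in this summary; the sketch below illustrates one gradient-based way to obtain coupled reductions: from samples of the Jacobian of f, build the averaged matrices E[J^T J] and E[J J^T] and keep their leading eigenvectors as reduced input and output bases. Treat this as an illustrative, active-subspace-style simplification rather than the paper's method.

```python
import numpy as np

def coupled_reduction(jacobians, k_in: int, k_out: int):
    """Reduced input/output bases from sampled Jacobians of f: R^m -> R^n.

    jacobians : (num_samples, n, m) array of Jacobians evaluated at input samples
    Returns (U_in, U_out): orthonormal bases of the reduced input (m, k_in)
    and output (n, k_out) subspaces.
    """
    H_in = np.mean([J.T @ J for J in jacobians], axis=0)    # how inputs drive the output
    H_out = np.mean([J @ J.T for J in jacobians], axis=0)   # where the output actually varies
    U_in = np.linalg.eigh(H_in)[1][:, -k_in:]                # leading eigenvectors
    U_out = np.linalg.eigh(H_out)[1][:, -k_out:]
    return U_in, U_out

# Toy case: a rank-1 linear map R^3 -> R^5, so one direction suffices on each side.
rng = np.random.default_rng(0)
A = np.outer(rng.normal(size=5), rng.normal(size=3))
jacs = np.array([A for _ in range(100)])                     # constant Jacobian for a linear map
U_in, U_out = coupled_reduction(jacs, k_in=1, k_out=1)
```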

Source: Arxiv

Cascade Reward Sampling for Efficient Decoding-Time Alignment

  • Cascade Reward Sampling (CARDS) is introduced to address efficiency bottlenecks in decoding-time alignment of large language models (LLMs).
  • CARDS uses a segment-level rejection sampling algorithm to minimize redundant computation in LLMs and reward models (RMs); its overall control flow is sketched below.
  • An uncertainty-based segmentation mechanism ensures accurate evaluation of RMs on incomplete segments.
  • Experiments show that CARDS substantially improves decoding efficiency, alignment quality, and general utility.
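
CARDS' exact segmentation and acceptance rules are not given in this summary; the sketch below shows the overall control flow of segment-level rejection sampling: generate a short segment, score the partial response with a reward model, keep it if it clears a threshold, otherwise resample only that segment. `generate_segment`, `reward_model`, and the fixed threshold are placeholders and assumptions.

```python
def generate_segment(prefix: str) -> str:
    """Placeholder: sample the next segment (e.g. up to an uncertainty-chosen boundary)."""
    raise NotImplementedError

def reward_model(text: str) -> float:
    """Placeholder: score a (possibly incomplete) response with the reward model."""
    raise NotImplementedError

def cascade_reward_sampling(prompt: str, num_segments: int, threshold: float,
                            max_tries: int = 8) -> str:
    """Build a response segment by segment, rejecting low-reward segments early."""
    response = prompt
    for _ in range(num_segments):
        for attempt in range(max_tries):
            candidate = response + generate_segment(response)
            if reward_model(candidate) >= threshold or attempt == max_tries - 1:
                response = candidate   # accept; avoids regenerating the whole response
                break
    return response
```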

Source: Arxiv

ShapG: new feature importance method based on the Shapley value

  • A new Explainable Artificial Intelligence (XAI) method called ShapG (Explanations based on Shapley value for Graphs) is developed for measuring feature importance.
  • ShapG is a model-agnostic global explanation method: it defines an undirected graph over the dataset's features and computes feature importance with an approximated Shapley value; a simplified sketch follows below.
  • Comparisons with existing XAI methods show that ShapG provides more accurate explanations while being computationally more efficient.
  • ShapG is widely applicable and can improve the explainability and transparency of AI systems across many fields.
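
The summary does not reproduce ShapG's algorithm; the sketch below conveys the core idea in a simplified form: build an undirected graph over features (here from pairwise correlation), then approximate each feature's Shapley value by averaging its marginal contribution over coalitions sampled from its graph neighborhood. `value_fn` (for example, model accuracy when trained on a feature subset) and the correlation threshold are assumptions.

```python
import random
import numpy as np

def build_feature_graph(X: np.ndarray, threshold: float = 0.3) -> dict[int, set[int]]:
    """Undirected graph: connect features whose absolute correlation exceeds a threshold."""
    corr = np.corrcoef(X, rowvar=False)
    d = X.shape[1]
    return {i: {j for j in range(d) if j != i and abs(corr[i, j]) > threshold}
            for i in range(d)}

def shapg_importance(value_fn, graph, feature: int, num_samples: int = 50) -> float:
    """Approximate Shapley value of `feature`, restricted to coalitions of its graph neighbors.

    value_fn(coalition: set[int]) -> float scores a feature subset (e.g. model accuracy).
    """
    neighbors = list(graph[feature])
    contributions = []
    for _ in range(num_samples):
        k = random.randint(0, len(neighbors))
        coalition = set(random.sample(neighbors, k))
        contributions.append(value_fn(coalition | {feature}) - value_fn(coalition))
    return float(np.mean(contributions))
```

Restricting coalitions to a feature's graph neighborhood is what keeps the approximation cheap compared with sampling over all feature subsets.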

Source: Arxiv

PQCache: Product Quantization-based KVCache for Long Context LLM Inference

  • A new method, PQCache, is proposed to address the memory bottleneck in Large Language Model (LLM) inference.
  • PQCache applies Product Quantization (PQ) to the Key-Value Cache (KVCache) of LLMs, preserving model quality while keeping serving latency low.
  • PQCache quantizes tokens' keys during the prefilling phase and uses the PQ codes and centroids to fetch relevant key-value pairs during autoregressive decoding; the PQ building block is sketched below.
  • Extensive experiments show that PQCache improves both model effectiveness and efficiency, with a 4.60% score improvement over existing methods.
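
PQCache's full pipeline is not described here; the sketch below illustrates the product-quantization building block it relies on: split each key vector into sub-vectors, learn per-subspace centroids, store only the centroid codes, and approximate query-key scores from query-to-centroid lookup tables to decide which cached tokens to fetch. The sub-vector and centroid counts are illustrative choices, not PQCache's settings.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def pq_encode(keys: np.ndarray, num_subspaces: int = 4, num_centroids: int = 16):
    """Compress token keys (num_tokens, dim) into small integer codes per subspace."""
    subkeys = np.split(keys, num_subspaces, axis=1)
    codebooks, codes = [], []
    for sub in subkeys:
        centroids, labels = kmeans2(sub, num_centroids, minit="points", seed=0)
        codebooks.append(centroids)
        codes.append(labels)
    return codebooks, np.stack(codes, axis=1)       # (num_tokens, num_subspaces) integer codes

def approx_scores(query: np.ndarray, codebooks, codes) -> np.ndarray:
    """Approximate query . key for every cached token via centroid lookup tables."""
    subqueries = np.split(query, len(codebooks))
    tables = [cb @ q for cb, q in zip(codebooks, subqueries)]   # (num_centroids,) per subspace
    return sum(tables[s][codes[:, s]] for s in range(len(codebooks)))

# Select the most relevant cached tokens without touching the full key vectors.
keys = np.random.default_rng(0).normal(size=(1024, 64)).astype(np.float32)
codebooks, codes = pq_encode(keys)
query = np.random.default_rng(1).normal(size=64)
topk = np.argsort(approx_scores(query, codebooks, codes))[-32:]
```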
