techminis
A naukri.com initiative

ML News

Source: Towards Data Science (1w)
The Basis of Cognitive Complexity: Teaching CNNs to See Connections

  • The article discusses the capabilities of artificial intelligence models, particularly convolutional neural networks (CNNs), in capturing aspects of human learning.
  • It explores the similarities between CNNs and the human visual cortex, highlighting features like hierarchical processing, receptive fields, feature sharing, and spatial invariance.
  • While CNNs excel in visual tasks, they face challenges in understanding causal relations and learning abstract concepts compared to humans.
  • Studies show instances where AI models fail to generalize in image classification or to recognize objects in unusual poses.
  • The article outlines the difficulty CNNs face in learning simple causal relationships, emphasizing the lack of inductive bias necessary for such learning.
  • Meta-learning approaches like Model-Agnostic Meta-Learning (MAML) are proposed to enhance CNNs' abilities in abstraction and generalization.
  • Experiments demonstrate that even shallow CNNs can learn relations such as same-different when trained with meta-learning, improving performance significantly (a minimal sketch of this training loop follows the list).
  • Meta-learning encourages abstraction and the discovery of an initialization that transfers across tasks, enhancing CNNs' reasoning and generalization capabilities.
  • Overall, the study suggests that utilizing meta-learning can empower CNNs to develop higher cognitive functions, addressing the limitations in learning abstract relations.
  • Efforts in creating new architectures and training paradigms hold promise in enhancing CNNs' relational reasoning abilities for improved AI generalization.
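
A minimal first-order MAML-style sketch of the meta-learning setup described above, in PyTorch. The synthetic same/different episode generator, the small CNN, and all hyperparameters are illustrative assumptions, not the article's actual experiments.

```python
# First-order MAML sketch: meta-train a small CNN on synthetic "same/different"
# episodes. Episode generator, architecture, and hyperparameters are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # classes: same vs. different

    def forward(self, x):
        return self.head(self.conv(x).flatten(1))

def make_episode(n=16, size=16):
    """Each image holds two patches side by side; label 1 if they are identical."""
    xs, ys = [], []
    for _ in range(n):
        left = torch.rand(1, size, size // 2)
        same = bool(torch.rand(()) < 0.5)
        right = left.clone() if same else torch.rand(1, size, size // 2)
        xs.append(torch.cat([left, right], dim=-1))
        ys.append(int(same))
    return torch.stack(xs), torch.tensor(ys)

meta_model = SmallCNN()
meta_opt = torch.optim.Adam(meta_model.parameters(), lr=1e-3)

for step in range(200):                          # outer (meta) loop
    learner = copy.deepcopy(meta_model)          # task-specific copy of the meta-model
    inner_opt = torch.optim.SGD(learner.parameters(), lr=0.05)
    sx, sy = make_episode()                      # support set
    for _ in range(5):                           # inner adaptation steps
        inner_opt.zero_grad()
        F.cross_entropy(learner(sx), sy).backward()
        inner_opt.step()
    qx, qy = make_episode()                      # query set
    inner_opt.zero_grad()
    F.cross_entropy(learner(qx), qy).backward()  # gradients at the adapted weights
    meta_opt.zero_grad()
    for meta_p, p in zip(meta_model.parameters(), learner.parameters()):
        meta_p.grad = p.grad.clone()             # first-order MAML update
    meta_opt.step()
```

The first-order approximation skips differentiating through the inner loop, which is usually enough to see whether an initialization that supports same/different generalization can be found.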

Source: MIT (1w)

New method efficiently safeguards sensitive AI training data

  • MIT researchers have developed a framework based on PAC Privacy to protect sensitive data in AI models.
  • The new PAC Privacy framework is more computationally efficient and minimizes the tradeoff between accuracy and privacy.
  • Researchers have created a four-step template to privatize various algorithms without needing to access their inner workings.
  • The team demonstrated that stable algorithms are easier to privatize using their method, as stable algorithms produce consistent predictions.
  • PAC Privacy estimates the minimal noise required to protect an AI model's training data, enhancing privacy with minimal utility loss.
  • The new variant estimates anisotropic noise tailored to specific data characteristics, reducing overall noise while maintaining privacy (see the sketch after this list).
  • More stable algorithms exhibit less variance in their outputs, requiring less noise for privatization, according to the research.
  • The researchers aim to explore co-designing algorithms with PAC Privacy for enhanced stability, security, and robustness from the outset.
  • The study showed that the new PAC Privacy requires fewer trials to estimate noise and successfully withstands state-of-the-art attacks in simulations.
  • The research marks a step towards automated and efficient private data analytics without requiring individual query analysis, as highlighted by Xiangyao Yu.
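
A rough numpy sketch of the instability-to-noise idea summarized above: rerun the algorithm on random subsamples, measure how much each output coordinate varies, and add anisotropic Gaussian noise scaled to that variation. The trial count, scaling constant, and stand-in "algorithm" are assumptions, not MIT's implementation.

```python
# Sketch of anisotropic noise calibration: the less stable the algorithm's
# output, the less noise each coordinate receives. All constants are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def train(data):
    """Stand-in 'learning algorithm': here, simply the feature-wise mean."""
    return data.mean(axis=0)

def privatize(data, n_trials=50, subsample=0.5, noise_scale=1.0):
    outputs = []
    for _ in range(n_trials):                       # rerun on random subsamples
        idx = rng.choice(len(data), int(subsample * len(data)), replace=False)
        outputs.append(train(data[idx]))
    per_dim_std = np.stack(outputs).std(axis=0)     # anisotropic instability estimate
    noisy_output = train(data) + rng.normal(0.0, noise_scale * per_dim_std)
    return noisy_output, per_dim_std

data = rng.normal(size=(1000, 8))
noisy_output, noise_profile = privatize(data)
print("per-dimension noise std:", np.round(noise_profile, 4))
```

This also shows why stable algorithms are cheaper to privatize: their outputs vary less across subsamples, so the estimated per-dimension noise is smaller.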

Source: arXiv (1w)

Deep Sturm-Liouville: From Sample-Based to 1D Regularization with Learnable Orthogonal Basis Functions

  • Artificial Neural Networks (ANNs) have achieved remarkable success, but suffer from limited generalization.
  • To overcome this limitation, a novel function approximator called Deep Sturm-Liouville (DSL) is introduced.
  • DSL enables continuous 1D regularization along field lines in the input space and integrates the Sturm-Liouville Theorem (SLT) into the deep learning framework.
  • DSL achieves competitive performance and improved sample efficiency on diverse multivariate datasets.

Source: arXiv (1w)

Compound Fault Diagnosis for Train Transmission Systems Using Deep Learning with Fourier-enhanced Representation

  • Fault diagnosis is critical for ensuring the stability and reliability of train transmission systems.
  • Data-driven fault diagnosis models offer advantages over traditional methods, but existing models are limited in their ability to handle compound faults.
  • A new approach that pairs a frequency-domain (Fourier) representation with a one-dimensional CNN is proposed for compound fault diagnosis in train transmission systems (a minimal sketch of this pipeline follows the list).
  • The proposed model achieved accuracies of 97.67% and 93.93% on test sets for single and compound faults, respectively.
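
A minimal sketch of the general pipeline named in the bullets: take a raw vibration segment, move to the frequency domain with an FFT, and classify the magnitude spectrum with a 1D CNN. Signal length, channel widths, and the number of fault classes are assumptions.

```python
# Fourier-enhanced 1D CNN sketch: FFT magnitude spectrum -> Conv1d classifier.
import torch
import torch.nn as nn

class FourierCNN(nn.Module):
    def __init__(self, n_classes=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )

    def forward(self, signal):                      # signal: (batch, samples)
        spectrum = torch.fft.rfft(signal).abs()     # magnitude spectrum
        spectrum = torch.log1p(spectrum)            # compress dynamic range
        return self.net(spectrum.unsqueeze(1))      # add channel dimension

model = FourierCNN()
batch = torch.randn(4, 2048)                        # 4 raw vibration segments
logits = model(batch)                               # (4, n_classes)
print(logits.shape)
```

For compound faults, the output layer is typically trained as a multi-label head (one sigmoid per component fault) rather than a single softmax, so that co-occurring faults can be flagged simultaneously.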

Source: arXiv (1w)

Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models

  • This technical report introduces Ring-Lite-Distill, a lightweight reasoning model derived from Ling-Lite, a Mixture-of-Experts (MoE) large language model (LLM).
  • The model demonstrates exceptional reasoning capabilities through high-quality data curation and training paradigms, maintaining a compact parameter-efficient architecture with 2.75 billion activated parameters.
  • The goal of the model is to achieve comprehensive competency coverage and preserve general capabilities, such as instruction following, tool use, and knowledge retention.
  • Ring-Lite-Distill's reasoning ability is comparable to DeepSeek-R1-Distill-Qwen-7B, with superior general capabilities.

Source: arXiv (1w)

Trustworthy AI Must Account for Intersectionality

  • Trustworthy AI encompasses aspects such as fairness, privacy, robustness, explainability, and uncertainty quantification.
  • Efforts to enhance one aspect often introduce unintended trade-offs that negatively impact others.
  • Addressing trustworthiness along each axis in isolation is insufficient.
  • Research on Trustworthy AI must account for intersectionality between aspects and adopt a holistic view.

Source: arXiv (1w)

Prototype-Based Continual Learning with Label-free Replay Buffer and Cluster Preservation Loss

  • Existing continual learning techniques typically rely on simple replay-sample selection and reuse those samples during subsequent tasks.
  • This paper proposes a label-free replay buffer and introduces cluster preservation loss in order to maintain essential information from previously encountered tasks while adapting to new tasks.
  • The method includes 'push-away' and 'pull-toward' mechanisms to retain previously learned information while adapting to new classes or domain shifts (a sketch of such a prototype loss follows the list).
  • Experimental results on various benchmarks show that the label-free replay-based technique outperforms state-of-the-art continual learning methods and even surpasses offline learning in some cases.
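
A minimal sketch of a prototype-based "pull-toward / push-away" objective of the kind described above: embeddings are pulled toward their own class prototype and pushed away from the nearest other prototype. The Euclidean distance, margin, and prototype bookkeeping are assumptions, not the paper's exact loss.

```python
# Prototype pull/push loss sketch; distance, margin, and setup are assumptions.
import torch
import torch.nn.functional as F

def cluster_preservation_loss(embeddings, labels, prototypes, margin=1.0):
    """embeddings: (B, D), labels: (B,), prototypes: (C, D)."""
    dists = torch.cdist(embeddings, prototypes)              # (B, C) distances
    pull = dists.gather(1, labels.view(-1, 1)).squeeze(1)    # distance to own prototype
    mask = F.one_hot(labels, prototypes.size(0)).bool()
    push = dists.masked_fill(mask, float("inf")).min(dim=1).values  # nearest other prototype
    return (pull + F.relu(margin - push)).mean()

# toy usage: 3 classes, 16-dimensional embedding space
protos = torch.randn(3, 16)
z = torch.randn(8, 16, requires_grad=True)
y = torch.randint(0, 3, (8,))
loss = cluster_preservation_loss(z, y, protos)
loss.backward()
```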

Source: arXiv (1w)

Resource-efficient Inference with Foundation Model Programs

  • The inference-time resource costs of large language and vision models pose challenges in production deployments.
  • One proposed solution is foundation model programs, which can invoke foundation models with varying resource costs and performance.
  • A method is presented that translates a task into a program and learns a policy for resource allocation, selecting a foundation model 'backend' for each program module (a simplified backend-selection sketch follows the list).
  • Compared to monolithic multi-modal models, the implementation achieves up to 98% resource savings with minimal accuracy loss.
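
As a simplified illustration of backend selection only (the paper learns a policy; the greedy utility rule, backend names, and numbers below are hypothetical), each program module can be routed to the backend that maximizes an accuracy-minus-cost utility:

```python
# Hypothetical cost-aware backend selection for one program module.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    est_accuracy: float   # expected accuracy of this backend on the module
    cost: float           # relative inference cost (e.g., normalized FLOPs or $)

def select_backend(backends, cost_weight=0.5):
    # utility = accuracy - cost_weight * cost; a learned policy would replace this
    return max(backends, key=lambda b: b.est_accuracy - cost_weight * b.cost)

backends = [
    Backend("small-vision-model", est_accuracy=0.86, cost=0.05),
    Backend("medium-vlm", est_accuracy=0.91, cost=0.30),
    Backend("large-vlm", est_accuracy=0.93, cost=1.00),
]
print(select_backend(backends).name)   # the cheap backend wins unless the accuracy gap is large
```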

Source: arXiv (1w)

Adapting to Online Distribution Shifts in Deep Learning: A Black-Box Approach

  • A new approach has been proposed to address the problem of online distribution shift in deep learning.
  • The proposed method is a meta-algorithm that can enhance the performance of any online learner under non-stationarity.
  • It automatically adapts to changes in the data distribution and selects the most appropriate 'attention span' for learning.
  • Experiments show consistent improvement in classification accuracy across various real-world datasets.

Source: arXiv (1w)

A Multi-Phase Analysis of Blood Culture Stewardship: Machine Learning Prediction, Expert Recommendation Assessment, and LLM Automation

  • Blood cultures are often over-ordered without clear justification, placing strain on healthcare resources and contributing to inappropriate antibiotic use.
  • A study analyzed 135,483 emergency department (ED) blood culture orders, developing machine learning (ML) models to predict the risk of bacteremia.
  • The ML models, which integrated structured electronic health record (EHR) data and provider notes via a large language model (LLM), demonstrated improved performance.
  • The ML models achieved higher specificity without compromising sensitivity, offering enhanced diagnostic stewardship beyond existing standards of care.

Source: arXiv (1w)

Data Fusion of Deep Learned Molecular Embeddings for Property Prediction

  • Data-driven approaches such as deep learning can result in predictive models for material properties with exceptional accuracy and efficiency.
  • To address the limitations of sparse datasets and weak correlations between properties, the authors propose a data fusion technique.
  • By combining the molecular embeddings learned by single-task models, the fused multi-task models outperform standard multi-task models (a minimal fusion sketch follows the list).
  • The experimental results demonstrate the enhanced prediction capabilities of the fused models for data-limited properties.
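
A minimal sketch of the fusion step as summarized above: embeddings from pretrained single-task property models are concatenated, and a small head is trained on the fused representation for a data-limited target property. The property names, embedding sizes, and two-layer head are hypothetical.

```python
# Embedding-fusion sketch: concatenate frozen single-task embeddings, train a head.
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, embed_dims, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(sum(embed_dims), hidden), nn.ReLU(), nn.Linear(hidden, 1),
        )

    def forward(self, embeddings):               # list of (B, d_i) tensors
        return self.mlp(torch.cat(embeddings, dim=-1))

# toy usage: embeddings from three hypothetical pretrained single-task models (kept frozen)
emb_solubility = torch.randn(32, 128)
emb_toxicity   = torch.randn(32, 64)
emb_bandgap    = torch.randn(32, 256)
head = FusionHead([128, 64, 256])
pred = head([emb_solubility, emb_toxicity, emb_bandgap])   # (32, 1) property prediction
```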

Source: arXiv (1w)

Bregman-Hausdorff divergence: strengthening the connections between computational geometry and machine learning

  • This paper proposes an extension of the Hausdorff distance to spaces equipped with asymmetric distance measures, specifically focusing on the family of Bregman divergences.
  • The Bregman-Hausdorff divergence is used to compare probabilistic predictions produced by different machine learning models trained with the relative-entropy (KL) loss (a brute-force sketch follows the list).
  • The proposed algorithms are efficient even for large inputs with high dimensions.
  • The paper also provides a survey on Bregman geometry and computational geometry algorithms relevant to machine learning.
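
One natural directed construction consistent with the summary, sketched in numpy: the Bregman divergence generated by negative entropy is the KL divergence, and a Hausdorff-style comparison takes, for every prediction of one model, the closest prediction of the other model, then the worst case over the first set. The brute-force pairwise loop is for clarity only; the paper's algorithms are designed to stay efficient for large, high-dimensional inputs.

```python
# Directed Bregman-Hausdorff sketch over two sets of probability vectors (KL-based).
import numpy as np

def kl(p, q, eps=1e-12):
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def directed_bregman_hausdorff(A, B):
    """max over a in A of the min over b in B of KL(a || b)."""
    return max(min(kl(a, b) for b in B) for a in A)

rng = np.random.default_rng(0)
softmax = lambda z: np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
preds_model_1 = softmax(rng.normal(size=(100, 10)))   # predictions from model 1
preds_model_2 = softmax(rng.normal(size=(120, 10)))   # predictions from model 2
print(directed_bregman_hausdorff(preds_model_1, preds_model_2))
print(directed_bregman_hausdorff(preds_model_2, preds_model_1))  # generally different
```

Because Bregman divergences are generally asymmetric, the two directed values differ; a symmetric comparison can take their maximum.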

Source: arXiv (1w)

PROPEL: Supervised and Reinforcement Learning for Large-Scale Supply Chain Planning

  • This paper introduces PROPEL, a framework that combines optimization with supervised and Deep Reinforcement Learning (DRL) for large-scale Supply Chain Planning (SCP) optimization problems.
  • PROPEL uses supervised learning to identify variables fixed to zero in the optimal solution, and DRL to select which fixed variables must be relaxed to improve solution quality.
  • The framework has been applied to industrial SCP optimizations with millions of variables, leading to significant improvements in solution times and quality.
  • The computational results show a 60% reduction in the primal integral and an 88% reduction in the primal gap, with improvement factors of up to 13.57 and 15.92 on those metrics, respectively.

Source: arXiv (1w)

Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression

  • Post-training quantization (PTQ) is a method to reduce a model's memory footprint without retraining.
  • A new mixed-precision PTQ approach called Task-Circuit Quantization (TaCQ) conditions the quantization process on specific weight circuits associated with downstream task performance.
  • TaCQ preserves task-critical weights by contrasting unquantized model weights with uniformly quantized model weights (a simplified sketch of this contrast-based selection follows the list).
  • Experimental results show that TaCQ outperforms existing mixed-precision quantization methods, achieving major improvements in the low 2- to 3-bit regime.
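
A simplified sketch of the contrast idea in the bullets: uniformly quantize the weights, score each weight by a first-order estimate of how much that quantization change would perturb a task loss, and keep the highest-scoring weights in full precision. The 2-bit grid, saliency score, and 1% high-precision budget are assumptions, not TaCQ's exact procedure.

```python
# Mixed-precision selection sketch: contrast original vs. uniformly quantized weights.
import torch

def uniform_quantize(w, n_bits=2):
    levels = 2 ** n_bits - 1
    scale = (w.max() - w.min()) / levels
    return torch.round((w - w.min()) / scale) * scale + w.min()

def mixed_precision_mask(w, task_grad, keep_frac=0.01, n_bits=2):
    w_q = uniform_quantize(w, n_bits)
    saliency = (task_grad * (w_q - w)).abs()           # first-order loss-change estimate
    k = max(1, int(keep_frac * w.numel()))
    thresh = saliency.flatten().topk(k).values.min()
    keep_high_precision = saliency >= thresh            # True -> leave weight unquantized
    return torch.where(keep_high_precision, w, w_q), keep_high_precision

w = torch.randn(256, 256)
task_grad = torch.randn_like(w)       # stand-in gradient of a downstream-task loss w.r.t. w
w_mixed, mask = mixed_precision_mask(w, task_grad)
print(f"{mask.float().mean().item():.2%} of weights kept in high precision")
```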

Source: arXiv (1w)

State Estimation Using Particle Filtering in Adaptive Machine Learning Methods: Integrating Q-Learning and NEAT Algorithms with Noisy Radar Measurements

  • Reliable state estimation is essential for autonomous systems operating in complex, noisy environments.
  • Classical filtering approaches, such as the Kalman filter, struggle with nonlinear dynamics and non-Gaussian noise.
  • An integrated framework is proposed that combines particle filtering with Q-learning and NEAT algorithms to handle noisy radar measurements (a minimal particle-filter sketch follows the list).
  • Experiments show that the approach results in improved training stability, final performance, and success rates over baselines lacking advanced filtering.
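
A minimal bootstrap particle filter sketch for the kind of problem the bullets describe: tracking a moving target from noisy radar range measurements. The 1D constant-velocity model, noise levels, and particle count are assumptions, not the paper's setup.

```python
# Bootstrap particle filter for 1D constant-velocity motion observed through a
# noisy range (radar) sensor at the origin. All constants are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_particles, n_steps, dt = 1000, 50, 0.1
meas_std, proc_std = 2.0, 0.5

true_state = np.array([0.0, 5.0])                   # [position, velocity]
particles = np.column_stack([rng.normal(0, 5, n_particles),
                             rng.normal(5, 2, n_particles)])

for _ in range(n_steps):
    # propagate the true target and the particles with the motion model
    true_state[0] += true_state[1] * dt
    particles[:, 0] += particles[:, 1] * dt + rng.normal(0, proc_std * dt, n_particles)
    z = abs(true_state[0]) + rng.normal(0, meas_std)          # noisy radar range

    # weight particles by the measurement likelihood and normalize
    weights = np.exp(-0.5 * ((np.abs(particles[:, 0]) - z) / meas_std) ** 2) + 1e-300
    weights /= weights.sum()

    # multinomial resampling concentrates particles where the likelihood is high
    particles = particles[rng.choice(n_particles, n_particles, p=weights)]

    estimate = particles[:, 0].mean()               # posterior-mean position estimate

print(f"true position {true_state[0]:.2f}, estimated {estimate:.2f}")
```

In a setup like the one summarized above, an estimate of this kind, rather than the raw noisy measurement, would be what the Q-learning or NEAT agent observes.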
