techminis

A naukri.com initiative

ML News

Arxiv

4d


Nearness of Neighbors Attention for Regression in Supervised Finetuning

  • Combining the feature extraction capabilities of neural networks with traditional algorithms like k-nearest neighbors (k-NN) in supervised machine learning is common.
  • Supervised fine-tuning (SFT) on a domain-appropriate feature extractor, followed by training a traditional predictor on the resulting SFT embeddings, often leads to improved performance.
  • Directly incorporating traditional algorithms into SFT as prediction layers can enhance performance, but challenges arise due to their non-differentiable nature.
  • The Nearness of Neighbors Attention (NONA) regression layer addresses this by using neural network attention mechanics and a novel attention-masking scheme to create a differentiable proxy of the k-NN regression algorithm, which improves regression performance on various datasets.
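
The idea of a differentiable stand-in for k-NN regression can be illustrated with a minimal sketch: a softmax over negative squared distances gives every training point a smooth weight, and an optional mask stops a training point from attending to itself. This is a generic soft-neighbors illustration, not the NONA layer; the function name, the `temperature` parameter, and the single-index mask are assumptions made for the example.

```python
import math

def soft_knn_predict(query, train_x, train_y, temperature=1.0, mask_index=None):
    """Softmax-weighted average of training targets; nearer points weigh more.

    mask_index (hypothetical) excludes one training point, e.g. the query itself
    when predicting for a point that is also in the training set.
    """
    scores = []
    for i, x in enumerate(train_x):
        if i == mask_index:
            scores.append(-math.inf)  # masked point receives zero weight
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, x))
        scores.append(-sq_dist / temperature)
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Toy data: targets equal the single feature value.
train_x = [[0.0], [1.0], [10.0]]
train_y = [0.0, 1.0, 10.0]
smooth = soft_knn_predict([0.0], train_x, train_y)
sharp = soft_knn_predict([0.0], train_x, train_y, temperature=0.01)
masked = soft_knn_predict([0.0], train_x, train_y, mask_index=0)
```

As the temperature shrinks, the weights concentrate on the single nearest neighbor, so the prediction approaches 1-NN regression while remaining differentiable in the features.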


Arxiv

4d


AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists

  • AutoSDT is an automatic pipeline designed to address the challenge of data scarcity in building AI co-scientists for scientific discovery tasks.
  • It collects high-quality coding tasks from real-world data-driven workflows using LLMs to search for sources, select tasks, and synthesize instructions and code solutions.
  • The AutoSDT-5K dataset, created using this pipeline, comprises 5,404 coding tasks spanning four scientific disciplines and 756 Python packages, making it the largest open dataset for data-driven scientific discovery generated automatically.
  • Expert feedback indicates that 93% of tasks collected are ecologically valid, and 92.2% of synthesized programs are functionally correct. AutoSDT-Coder models trained on this dataset show significant improvements on data-driven discovery benchmarks, matching the performance of GPT-4o on ScienceAgentBench and enhancing scores on DiscoveryBench.


Arxiv

4d


Accelerating Spectral Clustering under Fairness Constraints

  • Fairness of decision-making algorithms is an increasingly important issue.
  • The work presents a new, efficient method for fair spectral clustering (Fair SC) by casting the Fair SC problem within the difference of convex functions (DC) framework.
  • It introduces a novel variable augmentation strategy and employs an alternating direction method of multipliers (ADMM) type of algorithm adapted to DC problems.
  • Numerical experiments demonstrate the effectiveness of the approach on synthetic and real-world benchmarks, showing significant speedups in computation time over prior art.


Arxiv

4d


Fully data-driven inverse hyperelasticity with hyper-network neural ODE fields

  • A new framework has been proposed for identifying mechanical properties of heterogeneous materials without a closed-form constitutive equation.
  • The framework involves training a neural network with Fourier features to capture sharp gradients in displacement field data obtained from digital image correlation.
  • A physics-based, data-driven method using neural ordinary differential equations (NODEs) is employed to discover constitutive equations, allowing arbitrary materials to be represented while satisfying constraints.
  • The framework includes a hyper-network that optimizes parameters to minimize a multi-objective loss function considering constraints in the theory of constitutive equations, showcasing robustness in identifying mechanical properties of heterogeneous materials with minimal assumptions.
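
The Fourier-feature input mentioned above can be sketched as follows: a scalar coordinate is mapped to sines and cosines at several frequencies before it reaches the network, which makes sharp gradients easier to represent. This is a generic illustration under stated assumptions; the function name and the frequency list are hypothetical and are not taken from the paper.

```python
import math

def fourier_features(x, frequencies):
    """Map a scalar coordinate x to [sin(2*pi*f*x), cos(2*pi*f*x)] for each f."""
    feats = []
    for f in frequencies:
        angle = 2 * math.pi * f * x
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats

# Hypothetical frequency set; higher frequencies resolve finer spatial detail.
at_origin = fourier_features(0.0, [1, 2, 4])
at_quarter = fourier_features(0.25, [1])
```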


Arxiv

4d


BLUR: A Bi-Level Optimization Approach for LLM Unlearning

  • Enabling large language models (LLMs) to unlearn knowledge and capabilities acquired during training has become crucial for compliance with data regulations and promoting ethical practices in generative AI.
  • Existing unlearning algorithms face challenges in formulating the unlearning problem effectively, with the most common approach using a combination of forget and retain loss, leading to performance degradation.
  • A new approach, called Bi-Level UnleaRning (BLUR), is proposed in this work, focusing on a hierarchical structure of unlearning where forgetting certain knowledge and capabilities takes precedence over retaining model utility.
  • BLUR, based on a bi-level optimization formulation, outperforms existing algorithms in various unlearning tasks, models, and metrics, offering strong theoretical guarantees along with superior performance.


Arxiv

4d


UniVarFL: Uniformity and Variance Regularized Federated Learning for Heterogeneous Data

  • Federated Learning (FL) faces performance issues with non-IID data due to local classifier bias.
  • UniVarFL is a novel FL framework that directly addresses these issues without global model dependency.
  • It leverages two regularization strategies during local training: Classifier Variance Regularization and Hyperspherical Uniformity Regularization.
  • Extensive experiments show UniVarFL outperforms existing methods in accuracy, making it a promising solution for real-world FL deployments.


Arxiv

4d


Federated Learning on Stochastic Neural Networks

  • Federated learning leverages edge computing on client devices to optimize models while maintaining user privacy.
  • Latent noise in local datasets poses a challenge in federated learning due to factors like limited measurement capabilities or human errors.
  • To address this challenge, the proposal involves using stochastic neural networks as local models within the federated learning framework.
  • The approach, known as Federated Stochastic Neural Networks, aims to estimate underlying states of data and quantify latent noise, with numerical experiments demonstrating its effectiveness.


Arxiv

4d


FedGA-Tree: Federated Decision Tree using Genetic Algorithm

  • Federated Learning is gaining prominence due to rising data privacy concerns as it allows collaborative training without raw data aggregation.
  • Federated Learning research has so far focused on parametric gradient-based models, with relatively little attention paid to nonparametric models such as decision trees.
  • A recent study explores a Genetic Algorithm approach for building personalized decision trees that handle categorical and numerical data for classification and regression tasks in Federated Learning.
  • Experiments show that this new approach outperforms traditional decision trees trained on local data as well as a benchmark algorithm.


Arxiv

4d


A Machine Learning Approach to Generate Residual Stress Distributions using Sparse Characterization Data in Friction-Stir Processed Parts

  • Residual stresses within components can impact performance, and accurately determining their distributions is crucial for structural integrity.
  • A machine learning-based Residual Stress Generator (RSG) was developed to infer full-field stresses from limited measurements.
  • The RSG used an extensive dataset from process simulations and an ML model based on the U-Net architecture for prediction.
  • The model showed excellent predictive accuracy on simulated stresses and effectively predicted experimentally characterized data, reducing the need for extensive experimental efforts.


Arxiv

4d


What makes an Ensemble (Un) Interpretable?

  • Ensemble models are known for their limited interpretability compared to single models like decision trees.
  • Factors like the number, size, and type of base models influence the interpretability of ensembles.
  • Applying concepts from computational complexity theory can help study the challenges of generating explanations for ensemble configurations.
  • Interpreting ensembles is shown to be intractable under certain complexity assumptions, with complexity patterns influenced by factors like the number and type of base models.


Arxiv

4d


Mondrian: Transformer Operators via Domain Decomposition

  • Operator learning enables data-driven modeling of partial differential equations by learning mappings between function spaces.
  • Mondrian introduces transformer operators that decompose a domain into non-overlapping subdomains and apply attention over sequences of subdomain-restricted functions.
  • This approach decouples attention from discretization and supports local and global interactions through hierarchical windowed and neighborhood attention.
  • Mondrian achieves strong performance on Allen-Cahn and Navier-Stokes PDEs, showcasing resolution scaling without retraining.
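
The decomposition step described above can be pictured on a discretized 2D field: the grid is cut into non-overlapping square blocks, and each block then plays the role of one element in the sequence that attention operates over. This is only a sketch of the partitioning on a sampled grid, not of Mondrian's operator; the function name and the square-patch assumption are hypothetical.

```python
def split_into_subdomains(grid, patch):
    """Split a 2D grid (list of rows) into non-overlapping patch x patch blocks.

    Blocks are returned in row-major order over the block layout.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % patch or cols % patch:
        raise ValueError("grid dimensions must be divisible by the patch size")
    blocks = []
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            blocks.append([row[c:c + patch] for row in grid[r:r + patch]])
    return blocks

# A 4x4 field holding the values 0..15, split into four 2x2 subdomains.
field = [[4 * r + c for c in range(4)] for r in range(4)]
blocks = split_into_subdomains(field, 2)
```

Because the number of blocks depends on the domain layout rather than on how finely each block is sampled, a finer grid changes the content of each block but not the sequence length.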


Arxiv

4d


Scaling Laws of Motion Forecasting and Planning -- A Technical Report

  • The study examines scaling laws of encoder-decoder autoregressive transformer models for motion forecasting and planning in the autonomous driving domain.
  • Model performance improves with total compute budget following a power-law function, similar to language modeling, and training loss correlates with evaluation metrics.
  • Closed-loop metrics also improve with scaling, which has implications for whether open-loop metrics are suitable for model development and hill climbing.
  • Optimal scaling of transformer parameters and training data size shows the need to increase model size faster than dataset size as the training compute budget grows.
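
A power-law relation between compute and loss, as described above, is a straight line in log-log space, so its exponent can be estimated with ordinary least squares on the logarithms. The sketch below shows that fit on synthetic numbers; the data, the coefficient 5.0, and the exponent -0.1 are invented for illustration and are not results from the report.

```python
import math

def fit_power_law(compute, loss):
    """Least-squares fit of loss ~ a * compute**b in log-log space; returns (a, b)."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope

# Synthetic, noise-free data generated from loss = 5.0 * compute**-0.1.
compute = [1e15, 1e16, 1e17, 1e18]
loss = [5.0 * c ** -0.1 for c in compute]
a, b = fit_power_law(compute, loss)
```

With real training runs the points would scatter around the line, and the fitted exponent `b` summarizes how quickly loss falls as compute grows.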


Arxiv

4d


Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework

  • Large language models (LLMs) are being utilized to extract clinical data from electronic health records (EHRs) in oncology, improving scalability and efficiency.
  • A new framework, VALID, addresses challenges in ensuring reliability, accuracy, and fairness of LLM-extracted data for research, regulatory, and clinical applications in oncology.
  • The VALID framework includes performance benchmarking against human experts, verification checks for consistency, and bias assessment across demographic subgroups.
  • This framework aims to enhance industry standards and support the trustworthy use of AI-powered evidence generation in oncology research and practice.


Arxiv

4d


Temporalizing Confidence: Evaluation of Chain-of-Thought Reasoning with Signal Temporal Logic

  • Large Language Models have shown impressive performance in mathematical reasoning tasks when guided by Chain-of-Thought prompting.
  • A structured framework that models stepwise confidence as a temporal signal and evaluates it using Signal Temporal Logic (STL) has been proposed.
  • Formal STL-based constraints are defined to capture desirable temporal properties and compute robustness scores for structured, interpretable confidence estimates.
  • Experiments show that this approach consistently improves calibration metrics and provides more reliable uncertainty estimates than conventional methods.
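
Treating stepwise confidence as a temporal signal can be made concrete with the standard quantitative semantics of Signal Temporal Logic: the robustness of "always confidence >= c" is the smallest margin over all steps, and the robustness of "eventually confidence >= c" is the largest. The sketch below applies those two operators to a made-up confidence trace; the trace, the threshold, and the function names are assumptions, and the paper's actual constraints may differ.

```python
def always_at_least(signal, threshold):
    """Robustness of G(signal >= threshold): the worst-case margin over all steps."""
    return min(v - threshold for v in signal)

def eventually_at_least(signal, threshold):
    """Robustness of F(signal >= threshold): the best-case margin over all steps."""
    return max(v - threshold for v in signal)

# Hypothetical per-step confidences for a four-step reasoning chain.
confidence = [0.9, 0.8, 0.6, 0.85]
always_score = always_at_least(confidence, 0.7)
eventually_score = eventually_at_least(confidence, 0.7)
```

A positive robustness score means the property holds with that much slack; a negative score means it is violated, here because the third step dips below the threshold.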


Arxiv

4d


Parameter-free approximate equivariance for tasks with finite group symmetry

  • Equivariant neural networks aim to improve performance by incorporating symmetries through group actions.
  • A new zero-parameter approach is proposed to impose approximate equivariance for a finite group in the latent representation.
  • Experiments show that the network learns a group representation on the latent space and prefers to learn the regular representation.
  • The proposed approach is benchmarked on three datasets and shows similar or better performance compared to existing equivariant methods with fewer parameters.
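
Approximate equivariance for a finite group can be quantified by comparing "transform, then apply the map" with "apply the map, then transform" for every group element. The sketch below does this for the cyclic shift group acting on vectors. It is a generic measurement, not the paper's method: the paper lets the network learn the group action on the latent space, whereas this example assumes the same fixed shift action on input and output, and all names are hypothetical.

```python
def shift(vec, k):
    """Cyclically shift a list to the right by k positions."""
    k %= len(vec)
    return vec[-k:] + vec[:-k] if k else list(vec)

def equivariance_error(f, vec):
    """Mean squared gap between f(g.x) and g.f(x) over all cyclic shifts g."""
    n = len(vec)
    total = 0.0
    for k in range(n):
        lhs = f(shift(vec, k))
        rhs = shift(f(vec), k)
        total += sum((a - b) ** 2 for a, b in zip(lhs, rhs))
    return total / n

def circular_smooth(vec):
    """Three-point circular moving average; commutes with cyclic shifts."""
    n = len(vec)
    return [(vec[i - 1] + vec[i] + vec[(i + 1) % n]) / 3 for i in range(n)]

def position_scale(vec):
    """Scale each entry by its index; does not commute with cyclic shifts."""
    return [i * v for i, v in enumerate(vec)]

x = [1.0, 2.0, 3.0, 4.0]
smooth_error = equivariance_error(circular_smooth, x)
scale_error = equivariance_error(position_scale, x)
```

An error of this kind can be added to a training loss as a penalty without introducing any extra parameters, which is one way to read "zero-parameter" approximate equivariance.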
