techminis

A naukri.com initiative

ML News

Source: Arxiv

An Embedding is Worth a Thousand Noisy Labels

  • The performance of deep neural networks scales with dataset size and label quality.
  • In this work, a Weighted Adaptive Nearest Neighbor (WANN) approach is proposed to mitigate low-quality data annotations.
  • WANN outperforms reference methods and exhibits superior generalization on imbalanced data.
  • The proposed weighting scheme enhances supervised dimensionality reduction and minimizes latency and storage requirements.
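A reliability-weighted nearest-neighbor vote over embeddings can be sketched as follows. This is a minimal illustration, not the paper's WANN algorithm: the reliability score used here (the share of a sample's neighbors that carry the same label), the fixed `k`, and all function names are assumptions made for the example.

```python
import numpy as np

def knn_indices(queries, refs, k, exclude_self=False):
    # Pairwise squared Euclidean distances between queries and refs.
    d = ((queries[:, None, :] - refs[None, :, :]) ** 2).sum(axis=2)
    if exclude_self:
        # Only valid when queries and refs are the same set of points.
        np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def reliability_scores(emb, labels, k=5):
    # Hypothetical reliability: fraction of a sample's k nearest
    # neighbors that share its (possibly noisy) label.
    nn = knn_indices(emb, emb, k, exclude_self=True)
    return (labels[nn] == labels[:, None]).mean(axis=1)

def weighted_knn_predict(train_emb, train_labels, weights, test_emb, k=5, n_classes=None):
    # Each neighbor votes for its label with its reliability as the weight.
    n_classes = n_classes or int(train_labels.max()) + 1
    nn = knn_indices(test_emb, train_emb, k)
    votes = np.zeros((len(test_emb), n_classes))
    for c in range(n_classes):
        votes[:, c] = (weights[nn] * (train_labels[nn] == c)).sum(axis=1)
    return votes.argmax(axis=1)
```

In this sketch a mislabeled sample surrounded by correctly labeled neighbors receives a low weight, so it contributes little to the vote.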

Source: Arxiv

A Primer on Variational Inference for Physics-Informed Deep Generative Modelling

  • Variational inference (VI) is a computationally efficient and scalable methodology for approximate Bayesian inference, excelling at generative modelling and inversion tasks.
  • This paper provides an accessible and thorough technical introduction to VI for physics-related problems, explaining the standard derivations of the VI framework and its realization through deep learning.
  • It highlights the importance of the underlying physical model in capturing the dynamics of interest and offers flexibility in uncertainty quantification.
  • The target audience of this paper is the scientific community focusing on physics-based problems and uncertainty quantification.
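For reference, the standard objective behind VI is the evidence lower bound (ELBO). The notation below is an assumption chosen for illustration and may differ from the paper's: x denotes observations, z latent variables, q_φ the variational approximation, and p_θ the likelihood.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
\;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```

In a physics-informed setting, the likelihood term is one natural place for a physical forward model to enter, though the exact construction depends on the problem.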

Source: Arxiv

Evaluating probabilistic and data-driven inference models for fiber-coupled NV-diamond temperature sensors

  • Researchers evaluate how the choice of inference model affects the uncertainty of temperatures inferred from continuous-wave Optically Detected Magnetic Resonance (ODMR) measurements.
  • A probabilistic feedforward inference model is developed to maximize the likelihood of observed ODMR spectra by leveraging the temperature dependence of spin Hamiltonian parameters.
  • The probabilistic model achieves a prediction uncertainty of ±1 K across a temperature range of 243 K to 323 K.
  • When extrapolating beyond the training data range, the probabilistic model outperforms data-driven techniques such as Principal Component Regression (PCR) and a 1D Convolutional Neural Network (CNN), demonstrating robustness and generalizability.

Source: Arxiv

Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids

  • This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to solve finite-horizon optimal control problems in mixed-logical dynamical systems efficiently.
  • The approach aims to mitigate the curse of dimensionality by decoupling the decision on the discrete variables from the decision on the continuous variables.
  • Reinforcement learning determines the discrete decision variables, simplifying the online optimization problem of the MPC controller and reducing computational time.
  • Simulation experiments on a microgrid system demonstrate that the proposed method substantially reduces the online computation time of MPC while maintaining high feasibility and low suboptimality.

Source: Arxiv

Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification

  • Ctrl-GenAug is a generative augmentation framework designed to improve medical sequence classification.
  • It addresses the limitations of existing generative augmentation methods in the medical field.
  • Ctrl-GenAug provides highly customizable and sequential synthesis of medical sequences.
  • It includes a noise filter to ensure the quality and reliability of synthetic data.

Source: Arxiv

Automatic debiasing of neural networks via moment-constrained learning

  • Causal and nonparametric estimands in economics and biostatistics can often be viewed as the mean of a linear functional applied to an unknown outcome regression function.
  • Learning the Riesz representer (RR) of the target estimand through automatic debiasing (AD) can be challenging.
  • Moment-constrained learning is proposed as a new approach for RR learning, improving the robustness of RR estimates to optimization hyperparameters.
  • Numerical experiments on average treatment/derivative effect estimation using semi-synthetic data show improved performance compared to state-of-the-art benchmarks.
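As background, the Riesz representer of a linear functional can be written as below. This is the standard textbook form with notation assumed for illustration (W the observed data, X = (A, Z) treatment and covariates, g_0 the outcome regression, π the propensity score); it is not taken from the paper itself.

```latex
\theta_0 = \mathbb{E}\big[m(W; g_0)\big] = \mathbb{E}\big[\alpha_0(X)\, g_0(X)\big],
\qquad
\text{ATE: } m(W; g) = g(1, Z) - g(0, Z), \quad
\alpha_0(X) = \frac{A}{\pi(Z)} - \frac{1 - A}{1 - \pi(Z)},
\quad \pi(Z) = \Pr(A = 1 \mid Z)
```

For the average treatment effect, the representer reduces to the familiar inverse-propensity weights, which is why estimating it directly can be unstable when π(Z) is close to 0 or 1.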

Source: Arxiv

Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport

  • Aggregating data from multiple sources can be formalized as an Optimal Transport (OT) barycenter problem.
  • A novel, scalable approach is proposed for estimating a robust continuous barycenter, leveraging the dual formulation of the (semi-)unbalanced OT problem.
  • The method is adaptable to general cost functions and demonstrates robustness to outliers and class imbalance.
  • Source code for the method is publicly available on GitHub.

Source: Arxiv

MM-Ego: Towards Building Egocentric Multimodal LLMs for Video QA

  • This research focuses on building a multimodal foundation model for egocentric video understanding.
  • The research includes generating a large dataset of high-quality QA samples for egocentric videos.
  • A challenging egocentric QA benchmark with videos and questions is introduced to evaluate the models' performance.
  • A specialized multimodal architecture with a novel memory pointer prompting mechanism is proposed to enhance video comprehension.

Source: Arxiv

Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles

  • Tokenization can impact model performance by introducing a phenomenon called tokenization bias.
  • The Byte-Token Representation Lemma establishes a mapping between token and byte-level distributions.
  • A next-byte sampling algorithm eliminates tokenization bias, converting tokenized LMs into token-free ones.
  • The method shows improved performance in fill-in-the-middle tasks and model ensembles across different benchmarks.

Source: Arxiv

Improving Colorectal Cancer Screening and Risk Assessment through Predictive Modeling on Medical Images and Records

  • Advances in digital pathology and deep learning enable the integration of pathology slides and medical records for more accurate CRC risk prediction.
  • A transformer-based model for histopathology image analysis was adapted to predict 5-year CRC risk using data from the New Hampshire Colonoscopy Registry.
  • Training the model to predict intermediate clinical variables improved 5-year CRC risk prediction compared to direct prediction.
  • Incorporating both imaging and non-imaging data further improved performance compared to traditional features from colonoscopy and microscopy reports.

Source: Arxiv

Large Language Model-Enhanced Reinforcement Learning for Generic Bus Holding Control Strategies

  • Bus holding control is a widely adopted strategy for maintaining stability and improving the operational efficiency of bus systems.
  • Traditional model-based methods face challenges with low accuracy of bus state prediction and passenger demand estimation.
  • Reinforcement Learning (RL) has demonstrated potential in formulating bus holding strategies.
  • This study introduces an automatic reward generation paradigm, LLM-enhanced RL, which improves reward functions using Large Language Models (LLMs).

Source: Arxiv

Improving Instruction-Following in Language Models through Activation Steering

  • Researchers have developed a method to improve instruction-following in language models through activation steering.
  • The method involves deriving instruction-specific vector representations from language models and using them to steer the models accordingly.
  • Activation vectors, computed as the difference in activations between inputs with and without instructions, enable a modular approach to activation steering.
  • The approach enhances model adherence to constraints such as output format, length, and word inclusion, providing control over instruction following.
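The difference-of-activations idea can be sketched in a few lines. This is a minimal illustration on plain arrays, not the authors' implementation: averaging over examples, the scaling factor `alpha`, and the function names are assumptions, and a real setup would read and write activations at a chosen layer of an actual model.

```python
import numpy as np

def steering_vector(acts_with, acts_without):
    # Each array has shape (n_examples, hidden_dim): activations at one layer
    # for the same inputs with and without the instruction.
    return acts_with.mean(axis=0) - acts_without.mean(axis=0)

def steer(hidden, vector, alpha=1.0):
    # Add the scaled steering vector to every hidden state
    # (broadcasts over any leading batch/position dimensions).
    return hidden + alpha * vector
```

Because each instruction gets its own vector, several vectors could in principle be added together, which is one way to read the "modular" claim above.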

Source: Arxiv

ResiDual Transformer Alignment with Spectral Decomposition

  • A recent study analyzes the phenomenon of residual specialization in transformer networks, particularly in vision transformers.
  • The study links the specialization of residual contributions to the low-dimensional structure of visual head representations.
  • The authors examine the effect of head specialization on multimodal models and its impact on zero-shot classification performance.
  • The study introduces ResiDual, a technique for spectral alignment of the residual stream, which achieves fine-tuning-level performance on different data distributions.

Source: Arxiv

To Shuffle or not to Shuffle: Auditing DP-SGD with Shuffling

  • The Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm allows the training of machine learning (ML) models with formal Differential Privacy (DP) guarantees.
  • A novel DP auditing procedure is introduced to analyze DP-SGD with shuffling; it shows that the privacy guarantees of models trained this way are considerably overestimated.
  • The study assesses the impact on privacy leakage of several parameters, including batch size, privacy budget, and threat model.
  • Using shuffling instead of Poisson sub-sampling in DP-SGD can lead to significant privacy leakage.
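The two batch-sampling schemes being compared differ as sketched below. This is a simplified illustration of the sampling step only (no gradient clipping or noise addition), and the function names and parameters are hypothetical.

```python
import numpy as np

def shuffled_batches(n, batch_size, rng):
    # Shuffling: permute all indices once per epoch, then cut the
    # permutation into consecutive fixed-size batches.
    perm = rng.permutation(n)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]

def poisson_batches(n, batch_size, n_steps, rng):
    # Poisson sub-sampling: at every step, each example is included
    # independently with probability q = batch_size / n, so batch
    # sizes vary and an example may appear in several batches or none.
    q = batch_size / n
    return [np.flatnonzero(rng.random(n) < q) for _ in range(n_steps)]
```

With shuffling, every example appears exactly once per epoch; with Poisson sub-sampling, inclusion is random at every step, which is the assumption that standard DP-SGD privacy accounting typically relies on.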

Source: Arxiv

Implicit High-Order Moment Tensor Estimation and Learning Latent Variable Models

  • The paper studies the task of learning latent-variable models.
  • It develops a general, efficient algorithm for implicit moment tensor computation.
  • The algorithm enables poly-time learning algorithms for mixtures of linear regressions, mixtures of spherical Gaussians, and positive linear combinations of non-linear activations.
  • The complexity of the algorithm depends on the desired error and the target class of functions.
