techminis

A naukri.com initiative

ML News

Arxiv

A Formal Framework for Understanding Length Generalization in Transformers

  • A formal framework is introduced to analyze length generalization in transformers with learnable absolute positional encodings.
  • The framework characterizes which functions are identifiable from sufficiently long inputs and proves that length generalization is possible for a wide range of problems.
  • Experiments support the theory as a predictor of both success and failure of length generalization across a variety of tasks.
  • The theory offers explanations for empirical observations and allows for provably predicting length generalization capabilities in transformers.

Arxiv

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

  • MTL-LoRA is a new approach for multi-task learning (MTL) scenarios.
  • It enhances the performance of LoRA, a popular method for domain adaptation.
  • MTL-LoRA incorporates additional task-adaptive parameters to capture task-specific information while maintaining low-dimensional spaces.
  • Experimental results show that MTL-LoRA outperforms LoRA and its variants in multi-task learning settings.

Arxiv

Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion

  • A novel continuous-space latent diffusion framework called VQ-LCMD is introduced, allowing generative modeling of discrete data.
  • VQ-LCMD combines joint embedding-diffusion training with a consistency-matching (CM) loss to stabilize training and enhance performance.
  • Experiments demonstrate that VQ-LCMD outperforms discrete-state latent diffusion models on FFHQ, LSUN Churches, and LSUN Bedrooms benchmarks.
  • VQ-LCMD achieves an FID of 6.81 for class-conditional image generation on ImageNet with 50 steps.

Arxiv

UniFlow: A Foundation Model for Unified Urban Spatio-Temporal Flow Prediction

  • UniFlow is a foundation model for unified urban spatio-temporal flow prediction.
  • It combines grid-based and graph-based data to capture complex correlations and dynamics.
  • UniFlow utilizes a spatio-temporal transformer architecture and SpatioTemporal Memory Retrieval Augmentation (ST-MRA) for enhanced predictions.
  • Experiments show that UniFlow outperforms existing models, particularly in scenarios with limited data availability.

Arxiv

Robust Bayesian Optimization via Localized Online Conformal Prediction

  • Bayesian optimization (BO) is a sequential approach for optimizing black-box objective functions using zeroth-order noisy observations.
  • To address the issue of model misspecification in BO, a localized online conformal prediction-based Bayesian optimization (LOCBO) algorithm is introduced.
  • LOCBO corrects the likelihood of the Gaussian process (GP) model through localized online conformal prediction, resulting in a calibrated posterior distribution on the objective function.
  • Experiments on synthetic and real-world optimization tasks confirm that LOCBO outperforms state-of-the-art BO algorithms in the presence of model misspecification.
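
To make "online conformal prediction" more concrete, here is a minimal sketch of a generic online calibration loop: a prediction interval's half-width grows after a miss and shrinks slowly after a hit, so that the long-run miss rate tracks a target. This is a standard quantile-tracking update, not the localized variant or the GP likelihood correction used by LOCBO; the function names, the step size, and the toy data are all assumptions for illustration.

```python
def update_radius(radius, miscovered, target_miscoverage=0.1, step=0.5):
    """One online update of the interval half-width.

    A miss (err = 1) widens the interval by step * (1 - target);
    a hit (err = 0) narrows it by step * target.
    """
    err = 1.0 if miscovered else 0.0
    return max(0.0, radius + step * (err - target_miscoverage))


def run_online_conformal(predictions, observations, radius=1.0,
                         target_miscoverage=0.1, step=0.5):
    """Return the half-width used at each round, before seeing the outcome."""
    radii = []
    for pred, obs in zip(predictions, observations):
        radii.append(radius)
        miscovered = abs(obs - pred) > radius
        radius = update_radius(radius, miscovered, target_miscoverage, step)
    return radii


# Toy stream: the model always predicts 0; the first observation is a surprise.
radii = run_online_conformal([0.0, 0.0, 0.0], [2.0, 0.0, 0.0])
```

In this toy run the interval widens after the first, uncovered observation and then contracts slightly once the observations fall back inside it.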

Arxiv

TOBUGraph: Knowledge Graph-Based Retrieval for Enhanced LLM Performance Beyond RAG

  • Retrieval-Augmented Generation (RAG) is widely used for enhancing LLM retrieval capabilities but has limitations in commercial use cases.
  • TOBUGraph is a graph-based retrieval framework that overcomes the limitations of RAG.
  • TOBUGraph extracts structured knowledge and relationships from unstructured data.
  • TOBUGraph outperforms multiple RAG implementations, improving retrieval accuracy for personal memory organization and retrieval.

Arxiv

Identifying Predictions That Influence the Future: Detecting Performative Concept Drift in Data Streams

  • Concept drift is extensively studied in stream learning, but the impact of model predictions on concept drift is often overlooked.
  • Performative drift refers to situations where a model's predictions induce concept drift in a self-fulfilling or self-negating manner.
  • A novel performative drift detection approach called CheckerBoard Performative Drift Detection (CB-PDD) is proposed.
  • CB-PDD shows high efficacy, low false detection rates, and the ability to effectively detect performative drift in datasets.

Arxiv

Provably-Safe Neural Network Training Using Hybrid Zonotope Reachability Analysis

  • This work addresses the challenge of enforcing constraints on the output of neural networks, particularly for safety-critical control applications.
  • The proposed method uses reachability analysis with scaled hybrid zonotopes, which makes it possible to train a neural network with rectified linear unit (ReLU) nonlinearities so that the exact image of a non-convex input set is encouraged to avoid the unsafe set.
  • The method has been shown to be effective and fast for networks with up to 240 neurons, with the computational complexity dominated by inverse operations on matrices that scale linearly in size with the number of neurons and the complexity of the input and unsafe sets.
  • The practicality of the method has been demonstrated by training a forward-invariant neural network controller for a non-convex input set and generating safe reach-avoid plans for a black-box dynamical system.

Arxiv

Sparse identification of nonlinear dynamics and Koopman operators with Shallow Recurrent Decoder Networks

  • Researchers have developed a new method called SINDy-SHRED for modeling real-world spatio-temporal data.
  • SINDy-SHRED utilizes Gated Recurrent Units to model sparse sensor measurements and a shallow decoder network to reconstruct the full spatio-temporal field.
  • The algorithm introduces a SINDy-based regularization for converging to a linear Koopman-SHRED model.
  • SINDy-SHRED outperforms current baseline deep learning models in accuracy, training time, and data requirements for video predictions.

Arxiv

Predicting human decisions with behavioral theories and machine learning

  • Predicting human decisions under risk and uncertainty remains a fundamental challenge.
  • BEAST Gradient Boosting (BEAST-GB) is a hybrid model integrating behavioral theory and machine learning.
  • BEAST-GB predicts risky choice more accurately than neural networks and existing behavioral models.
  • Integrating machine learning with behavioral theory improves the ability to predict and understand human behavior.

Arxiv

Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment

  • Algorithmic pre-trial risk assessments in the US criminal justice system provide deterministic classification scores and recommendations to help judges in release decisions.
  • A research study analyzes data from a field experiment on algorithmic pre-trial risk assessments to investigate the possibility of improving the scores and recommendations.
  • Using a maximin robust optimization approach, the study aims to find a policy that maximizes the worst-case expected utility, ensuring the statistical safety of policy improvement.
  • The analysis of the field experiment data shows certain components of the risk assessment instrument can be safely improved by classifying arrestees as lower risk under various utility specifications.
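
The maximin idea mentioned above can be sketched in a few lines: for each candidate policy, take its worst expected utility over a set of plausible scenarios, then pick the policy whose worst case is best. The policy names, scenarios, and utility values below are invented for illustration and are not taken from the study.

```python
def maximin_policy(utilities):
    """Return the policy whose worst-case expected utility is largest.

    utilities maps a policy name to a list of expected utilities,
    one entry per plausible scenario (all values here are hypothetical).
    """
    return max(utilities, key=lambda policy: min(utilities[policy]))


candidates = {
    "status_quo": [0.50, 0.50, 0.50],
    "aggressive": [0.90, 0.20, 0.80],  # best on average, poor worst case
    "cautious":   [0.60, 0.55, 0.70],  # never worse than the status quo
}

best = maximin_policy(candidates)
```

Here "cautious" is selected because its worst case (0.55) still beats the status quo, which is the sense in which an improvement can be called safe.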

Arxiv

Causal Dynamic Variational Autoencoder for Counterfactual Regression in Longitudinal Data

  • A new method called Causal DVAE (CDVAE) has been developed for estimating treatment effects over time in longitudinal data.
  • CDVAE assumes the presence of unobserved risk factors that only affect the sequence of outcomes, targeting Individual Treatment Effect (ITE) estimation with unobserved heterogeneity.
  • The model combines a Dynamic Variational Autoencoder (DVAE) framework with a weighting strategy using propensity scores to estimate counterfactual responses.
  • Evaluations show that CDVAE outperforms existing state-of-the-art models in accurately estimating ITE and capturing heterogeneity in longitudinal data.
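
For readers unfamiliar with propensity-score weighting, the sketch below shows the textbook inverse-propensity-weighted estimate of an average treatment effect in a single time step. It is a deliberately simplified, static illustration: it does not model longitudinal data, latent risk factors, or the DVAE component, and the data are made up.

```python
def ipw_ate(treated, outcomes, propensities):
    """Inverse-propensity-weighted average treatment effect.

    treated: 0/1 treatment indicators
    outcomes: observed outcomes
    propensities: estimated P(treated = 1 | covariates), strictly in (0, 1)
    """
    n = len(outcomes)
    rows = list(zip(treated, outcomes, propensities))
    # Reweight treated units to stand in for the whole population.
    mean_treated = sum(t * y / e for t, y, e in rows) / n
    # Reweight control units the same way with the complementary probability.
    mean_control = sum((1 - t) * y / (1 - e) for t, y, e in rows) / n
    return mean_treated - mean_control


# Hypothetical data: a randomized setting with propensity 0.5 for everyone.
effect = ipw_ate([1, 0, 1, 0], [3.0, 1.0, 3.0, 1.0], [0.5, 0.5, 0.5, 0.5])
```

With equal propensities the estimate reduces to the difference in group means, which is 2.0 for this toy data.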

Arxiv

Individualized Policy Evaluation and Learning under Clustered Network Interference

  • Policy evaluation and learning often assume no interference among units, but this can lead to biased evaluation and learning outcomes.
  • The paper focuses on individualized treatment rules (ITR) under clustered network interference.
  • A semiparametric structural model is used to evaluate the performance of ITR.
  • The proposed methodology improves the performance of learned policies through more efficient evaluation.

Arxiv

DG-TTA: Out-of-domain Medical Image Segmentation through Augmentation and Descriptor-driven Domain Generalization and Test-Time Adaptation

  • Researchers propose using a powerful generalizing descriptor and augmentation to enable domain-generalized pre-training and test-time adaptation for high-quality segmentation in unseen domains.
  • The method was evaluated on five different publicly available datasets, including 3D CT and MRI images, in abdominal, spine, and cardiac imaging scenarios.
  • Results show significant improvements in cross-domain prediction for abdominal, spine, and cardiac scenarios, with increased Dice similarity scores ranging from 14.2% to 72.9%.
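
The Dice similarity score cited above measures overlap between a predicted and a reference segmentation mask. A minimal sketch on flattened binary masks follows; returning 1.0 when both masks are empty is a convention assumed here, not something stated in the summary.

```python
def dice_score(pred, target):
    """Dice similarity: 2 * |pred AND target| / (|pred| + |target|).

    pred and target are equal-length sequences of 0/1 labels
    (a flattened binary segmentation mask).
    """
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


partial = dice_score([1, 1, 0, 0], [1, 0, 0, 0])   # one shared voxel out of three marked
perfect = dice_score([0, 1, 1], [0, 1, 1])
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all.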

Arxiv

DT-DDNN: A Physical Layer Security Attack Detector in 5G RF Domain for CAVs

  • A new deep learning-based technique for detecting jammers in 5G Connected and Automated Vehicle (CAV) networks has been developed.
  • The technique focuses on the Synchronization Signal Block (SSB) and leverages RF domain features to improve network robustness.
  • By extracting PSS correlation and energy per null resource elements (EPNRE) characteristics, the method distinguishes between normal and jammed signals with high precision.
  • The proposed technique achieves a 96.4% detection rate at extra-low jamming power, specifically with an SJNR between 15 and 30 dB.
