techminis

A naukri.com initiative

ML News

Arxiv

1d

mLaSDI: Multi-stage latent space dynamics identification

  • Accurate numerical solutions of partial differential equations (PDEs) are important but costly to compute, which has led researchers to develop reduced-order models (ROMs) such as Latent Space Dynamics Identification (LaSDI).
  • LaSDI is a data-driven, non-intrusive ROM framework that compresses training data with an autoencoder and learns a system of user-chosen ordinary differential equations (ODEs) governing the latent space dynamics.
  • LaSDI allows for rapid predictions by evolving the low-dimensional ODEs in the latent space.
  • The autoencoder in LaSDI can struggle to accurately reconstruct training data and satisfy imposed dynamics in complex or high-frequency scenarios.
  • To overcome this challenge, researchers propose multi-stage Latent Space Dynamics Identification (mLaSDI), where several autoencoders are trained sequentially to correct errors from previous stages.
  • Applying mLaSDI with small autoencoders leads to lower prediction and reconstruction errors, along with reduced training time compared to LaSDI.

Arxiv

1d

Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs

  • The study discusses improving pooling methods in transformer models for reinforcement learning and vision tasks.
  • Pooling methods like AvgPool, MaxPool, and ClsToken are found to struggle with fluctuating signal-to-noise ratios.
  • An attention-based adaptive pooling technique is proposed to minimize signal loss in scenarios with varying SNRs.
  • The adaptive pooling method proves to be more robust and effective compared to traditional pooling approaches in various tasks.
  • The research emphasizes vector quantization to optimize information retention in transformer model outputs.
  • Experiments on synthetic datasets and real-world tasks demonstrate the superiority of adaptive pooling in maintaining performance.
  • The study provides theoretical insights and practical validations for the effectiveness of the proposed adaptive pooling strategy.
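
The contrast between fixed pooling and attention-based pooling in the bullets above can be sketched in a few lines. This is a minimal illustration in plain Python, not the paper's implementation: the function names, the toy tokens, and the `query` vector (standing in for a learned parameter) are all assumptions made for the example.

```python
import math

def avg_pool(tokens):
    """Mean over tokens: every token gets the same weight."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def max_pool(tokens):
    """Element-wise maximum over tokens."""
    return [max(col) for col in zip(*tokens)]

def attention_pool(tokens, query):
    """Softmax-weighted mean; weights come from scaled dot products with a query."""
    d = len(query)
    scores = [sum(q * x for q, x in zip(query, tok)) / math.sqrt(d) for tok in tokens]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(d)]
    return pooled, weights

# Hypothetical input: one signal token followed by three noise tokens.
tokens = [[4.0, 0.0], [0.0, 1.0], [0.0, -1.0], [0.0, 0.5]]
query = [1.0, 0.0]  # assumed query that responds to the signal direction

pooled, weights = attention_pool(tokens, query)
print("avg:", avg_pool(tokens))
print("max:", max_pool(tokens))
print("attention:", pooled, "weights:", weights)
```

In this toy case the average dilutes the signal token across four positions, while the attention weights concentrate on it. How the weights should be learned under varying signal-to-noise ratios is the subject of the paper itself.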

Arxiv

1d

SoK: Machine Unlearning for Large Language Models

  • Large language model (LLM) unlearning aims to remove the influence of specific training data without retraining the model entirely.
  • Techniques like Gradient Ascent, model editing, and re-steering hidden representations have been proposed for LLM unlearning.
  • An intention-oriented taxonomy is proposed in the paper to classify unlearning methods based on whether they aim to truly remove internal knowledge or just suppress its effects.
  • The paper revisits findings suggesting that many removal methods may functionally behave like suppression and explores the necessity and achievability of true removal.
  • Existing evaluation strategies for unlearning are surveyed, current metrics and benchmarks are critiqued, and suggestions for more reliable evaluations are provided.
  • Practical challenges like scalability and support for sequential unlearning in the broader deployment of unlearning methods are highlighted.
  • This work aims to provide a comprehensive framework for understanding and advancing unlearning in generative AI, supporting future research and guiding policy decisions on data removal and privacy.

Arxiv

1d

Agent-based Condition Monitoring Assistance with Multimodal Industrial Database Retrieval Augmented Generation

  • Condition monitoring (CM) is essential in ensuring reliability and efficiency in the process industry.
  • Computerized maintenance systems can detect and classify faults, but fault severity estimation and maintenance decisions often rely on human expert analysis.
  • Automated systems currently exhibit uncertainty and high false alarm rates, leading to increased workload and reduced efficiency.
  • A framework called MindRAG integrates large language model (LLM)-based reasoning agents with CM workflows.
  • The goal is to reduce false alarms, enhance fault severity estimation, improve decision support, and provide explainable interfaces.
  • MindRAG combines multimodal retrieval-augmented generation (RAG) with vector store structures designed for CM data.
  • Annotations and maintenance work orders are used as surrogates for labels in training predictive models on noisy real-world datasets.
  • Key contributions include structuring CM data into a multimodal vector store, developing tailored RAG techniques, creating practical reasoning agents, and presenting an experimental framework for evaluation.
  • Preliminary results suggest that MindRAG offers meaningful decision support for better alarm management and enhanced interpretability of CM systems.

Arxiv

1d

CFMI: Flow Matching for Missing Data Imputation

  • CFMI (Conditional Flow Matching for Imputation) is a new method introduced for imputing missing data.
  • The methodology combines continuous normalizing flows, flow-matching, and shared conditional modeling to address traditional multiple imputation challenges.
  • Comparison with nine classical and state-of-the-art imputation methods on 24 small to moderate-dimensional datasets shows that CFMI matches or surpasses them across various metrics.
  • When applied to zero-shot imputation of time-series data, CFMI matches the accuracy of a diffusion-based method while being more computationally efficient.
  • CFMI performs as well as traditional methods on lower-dimensional data and scales effectively to high-dimensional settings, often outperforming other deep learning-based approaches.
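
For readers unfamiliar with flow matching, one common form of the training objective is given below. It assumes a straight-line path between a noise sample and a data sample; the symbols ($x_0$, $x_1$, $v_\theta$) are notation chosen for this note, and CFMI's conditioning on observed entries is not shown.

```latex
x_t = (1 - t)\,x_0 + t\,x_1, \qquad
\mathcal{L}_{\mathrm{FM}}(\theta) =
\mathbb{E}_{t \sim \mathcal{U}[0,1],\; x_0 \sim \mathcal{N}(0, I),\; x_1 \sim p_{\mathrm{data}}}
\left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2
```

Here $v_\theta$ is a learned velocity field. Sampling integrates $\dot{x} = v_\theta(x, t)$ from $t = 0$ to $t = 1$, which typically needs fewer function evaluations than a long diffusion chain.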

Arxiv

1d

Uncertainty Prioritized Experience Replay

  • Prioritized experience replay is a crucial component of value-based deep reinforcement learning models.
  • Transitions are typically prioritized based on temporal difference error, but this can favor noisy transitions.
  • Using epistemic uncertainty estimation is proposed to guide transition prioritization from the replay buffer.
  • Epistemic uncertainty quantifies the uncertainty that can be reduced by learning, so prioritizing by it reduces how often inherently unpredictable transitions are sampled.
  • Benefits of epistemic uncertainty prioritized replay are illustrated in tabular toy models and evaluated on the Atari suite.
  • The approach outperformed quantile regression deep Q-learning benchmarks.
  • This method paves the way for uncertainty prioritized replay in reinforcement learning agents.
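
The idea of prioritizing by epistemic uncertainty instead of temporal difference error can be sketched as follows. This is a simplified stand-in, not the paper's estimator: it assumes an ensemble of value predictions per stored transition and uses their disagreement (variance) as the priority. All names and numbers are hypothetical.

```python
import random
from statistics import pvariance

def epistemic_priority(ensemble_predictions, eps=1e-3):
    """Disagreement between ensemble members, plus eps so nothing has zero priority."""
    return pvariance(ensemble_predictions) + eps

def sampling_probabilities(priorities, alpha=1.0):
    """Turn priorities into a distribution; alpha=0 would recover uniform replay."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_indices(priorities, batch_size, rng, alpha=1.0):
    """Draw transition indices from the buffer in proportion to priority."""
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)

# Hypothetical ensemble Q-value predictions for three stored transitions.
buffer_predictions = [
    [1.0, 1.0, 1.0, 1.0],  # members agree: little left to learn here
    [0.0, 2.0, 0.5, 1.5],  # members disagree: high epistemic uncertainty
    [0.9, 1.1, 1.0, 1.0],  # mild disagreement
]
priorities = [epistemic_priority(p) for p in buffer_predictions]
probs = sampling_probabilities(priorities)
batch = sample_indices(priorities, batch_size=4, rng=random.Random(0))
print(probs, batch)
```

A transition whose target is noisy but whose value is already well estimated would get a low priority under this scheme, whereas a scheme based on temporal difference error would keep resampling it.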

Arxiv

1d

G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration

  • Constructing robust simulators is crucial for guiding policy in fields like healthcare and logistics.
  • Current methods often struggle with generalization and accuracy, especially when using Large Language Models (LLMs).
  • G-Sim is introduced as a hybrid framework for automating simulator construction.
  • G-Sim integrates LLM-driven structural design with empirical calibration.
  • It utilizes an LLM to propose and refine simulator components guided by domain knowledge.
  • G-Sim grounds the simulator in reality by estimating parameters using calibration techniques.
  • It can leverage likelihood-free and gradient-free methods for parameter estimation and simulation-based inference.
  • G-Sim is capable of handling non-differentiable and stochastic simulators.
  • By combining domain priors with empirical evidence, G-Sim generates reliable and causally-informed simulators.
  • This mitigates data-inefficiency and allows for robust system-level interventions in complex decision-making.
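
To make "gradient-free calibration" of a stochastic, non-differentiable simulator concrete, here is a toy sketch. It is not G-Sim's procedure: the simulator, the grid of candidate values, and the observed count are invented for illustration, and plain grid search stands in for whatever calibration technique a real system would use.

```python
import random

def simulate_arrivals(arrival_prob, steps, seed):
    """Toy stochastic simulator: count how many of `steps` ticks see an arrival."""
    rng = random.Random(seed)
    return sum(1 for _ in range(steps) if rng.random() < arrival_prob)

def calibrate(observed_count, steps, candidates, n_runs=20):
    """Pick the candidate whose mean simulated count is closest to the observation.

    No gradients are needed: the simulator is treated as a black box.
    """
    best_param, best_loss = None, float("inf")
    for p in candidates:
        mean_count = sum(simulate_arrivals(p, steps, seed) for seed in range(n_runs)) / n_runs
        loss = abs(mean_count - observed_count)
        if loss < best_loss:
            best_param, best_loss = p, loss
    return best_param

# Hypothetical observation: 300 arrivals in 1000 ticks.
candidates = [i / 10 for i in range(1, 10)]
estimate = calibrate(observed_count=300, steps=1000, candidates=candidates)
print("calibrated arrival probability:", estimate)
```

Reusing the same seeds for every candidate (common random numbers) keeps the comparison between parameter values from being dominated by sampling noise.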

Arxiv

1d

Learning The Minimum Action Distance

  • This paper introduces a framework for learning a state representation for Markov decision processes solely from state trajectories.
  • The framework does not require reward signals or actions executed by the agent.
  • The proposed framework focuses on learning the minimum action distance (MAD), which is the minimum number of actions needed to move between states.
  • MAD serves as a fundamental metric capturing the environment's structure and assists in goal-conditioned reinforcement learning and reward shaping.
  • The self-supervised learning approach constructs an embedding space where the distances between states correspond to their MAD.
  • The approach is evaluated on various environments with known MAD values, including deterministic and stochastic dynamics, discrete and continuous state spaces, and noisy observations.
  • Empirical results show that the proposed method efficiently learns accurate MAD representations and outperforms existing state representation methods in terms of quality.
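
The minimum action distance defined above can be computed exactly when the transition graph is small and known, which is what makes evaluation against ground truth possible. The sketch below does this with breadth-first search on a made-up four-state deterministic environment. It computes the target quantity only; it is not the paper's self-supervised learning method.

```python
from collections import deque

def minimum_action_distance(transitions, start):
    """Breadth-first search: fewest actions needed to reach each state from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist

# Hypothetical environment: state -> states reachable with a single action.
transitions = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}

mad_from_a = minimum_action_distance(transitions, "A")
mad_from_d = minimum_action_distance(transitions, "D")
print(mad_from_a)
print(mad_from_d)
```

In this example two actions are needed to go from A to D but only one to return, so the distance is asymmetric. An embedding meant to reproduce it has to account for that.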

Arxiv

1d

A Topic Modeling Analysis of Stigma Dimensions, Social, and Related Behavioral Circumstances in Clinical Notes Among Patients with HIV

  • Research objective: Characterize stigma dimensions and social and related behavioral circumstances in the clinical notes of people living with HIV (PLWHs) using natural language processing.
  • Methodology: Utilized a cohort of 9,140 PLWHs, applied Latent Dirichlet Allocation for topic modeling analysis on EHR notes.
  • Methodology (contd.): Domain experts created a stigma keyword list, iteratively reviewed notes, and conducted word frequency analysis.
  • Findings: Uncovered various themes like 'Mental Health Concern and Stigma', 'Social Support', 'Limited Healthcare Access', 'Treatment Refusal', etc.
  • Conclusion: Topic modeling identified stigma and social themes, aiding in scalable assessment and enhancing patient outcomes.

Arxiv

1d

Causal Graph Recovery in Neuroimaging through Answer Set Programming

  • Learning causal structures from time series data is challenging when the measurement frequency doesn't match the causal timescale.
  • Sub-sampling time series data can result in multiple equally possible causal graphs due to information loss.
  • Researchers are using answer set programming (ASP) to address the challenges of sub-sampling in deriving causal graphs.
  • ASP helps find the most probable underlying graph and provides an equivalence class of possible graphs for expert selection.
  • Using ASP and graph theory allows for faster and more accurate solutions compared to traditional approaches.
  • The approach was validated on simulated data and empirical brain connectivity, showing superiority over established methods.
  • The method achieved a 12% improvement in the F1 score compared to existing approaches.
  • State-of-the-art results were obtained in terms of precision and recall for reconstructing causal graphs from sub-sampled time series data.
  • The method displayed robustness to varying degrees of sub-sampling in realistic simulations.
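
The information loss caused by sub-sampling can be seen in a small example. When only every u-th time step is measured, a directed edge i -> j appears in the measured graph whenever the true graph has a directed path of exactly u steps from i to j. The sketch below computes this with boolean matrix products. It is an illustration of the problem, not the paper's ASP solver, and it leaves out the bidirected edges that common causes introduce in general.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is True if some k links i -> k -> j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def undersample(adj, rate):
    """Directed edges visible when only every `rate`-th time step is measured."""
    result = adj
    for _ in range(rate - 1):
        result = bool_matmul(result, adj)
    return result

T, F = True, False
# Hypothetical true graph at the causal timescale: A -> B -> C -> A.
cycle = [[F, T, F],
         [F, F, T],
         [T, F, F]]
# A different graph: A -> C -> B -> A.
reverse_cycle = [[F, F, T],
                 [T, F, F],
                 [F, T, F]]

measured = undersample(cycle, 2)
print(measured == reverse_cycle)
```

Measuring the first graph at half the causal rate yields exactly the second graph, so without knowing the sub-sampling rate the two cannot be told apart from the measured structure alone. This is the kind of ambiguity that motivates returning an equivalence class of candidate graphs.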

Arxiv

1d

ErrorEraser: Unlearning Data Bias for Improved Continual Learning

  • Continual Learning (CL) aims to prevent forgetting of earlier tasks while transferring knowledge to help learn new ones.
  • A novel perspective suggests intentional forgetting is necessary in CL due to biases in real-world data.
  • Existing CL methods overlook biases, causing models to learn spurious correlations that hinder knowledge retention and transfer.
  • The proposed solution, ErrorEraser, removes biased memories in CL, improving performance on both old and new tasks.
  • ErrorEraser includes two modules: Error Identification and Error Erasure to address biases.
  • Error Identification learns the data distribution in the feature space for accurate identification of biased samples.
  • Error Erasure ensures only erroneous knowledge is removed by adjusting the decision space for outliers.
  • An incremental feature distribution learning strategy reduces resource overhead in downstream tasks during error identification.
  • Experimental results demonstrate ErrorEraser reduces data bias impact, improving accuracy and reducing forgetting rates in CL.
  • The code for ErrorEraser is accessible at https://github.com/diadai/ErrorEraser.

Arxiv

1d

Anomaly Detection and Generation with Diffusion Models: A Survey

  • Anomaly detection (AD) plays a crucial role in domains such as cybersecurity, finance, healthcare, and manufacturing, where it identifies unexpected patterns in real-world data.
  • Diffusion models (DMs) in deep learning have gained interest for their ability to learn complex data distributions and generate high-fidelity samples, serving as a robust framework for unsupervised AD.
  • A survey on anomaly detection and generation with diffusion models (ADGDM) analyzes theoretical foundations and practical implementations across different data types.
  • Unlike previous surveys that treat AD and generation as separate, this survey emphasizes their synergistic relationship, showcasing how generation and detection methods can enhance each other.
  • The survey categorizes ADGDM methods based on anomaly scoring mechanisms, conditioning strategies, and architectural designs, discussing their strengths and limitations.
  • Key challenges like scalability and computational efficiency are highlighted along with future directions such as efficient architectures and integration with foundation models.
  • The survey aims to assist researchers and practitioners in utilizing DMs for innovative AD solutions by synthesizing recent advances and identifying open research questions.

Arxiv

1d

LPO: Towards Accurate GUI Agent Interaction via Location Preference Optimization

  • Autonomous agents are changing GUI interactions using natural language as an intermediary.
  • Supervised Fine-Tuning methods in GUI agents struggle with accurate positional data perception.
  • Reinforcement learning methods often fall short in assessing positional accuracy effectively.
  • Location Preference Optimization (LPO) is introduced to optimize interaction preferences using locational data.
  • LPO uses information entropy to predict interaction positions and introduces a dynamic location reward function based on physical distance.
  • LPO, supported by Group Relative Preference Optimization (GRPO), enhances interaction precision in GUI environments.
  • Experiments demonstrate LPO's superior performance, achieving state-of-the-art results in offline benchmarks and real-world online evaluations.
  • The code for LPO will be publicly available soon on GitHub at https://github.com/AIDC-AI/LPO.

Arxiv

1d

Revisiting Diffusion Models: From Generative Pre-training to One-Step Generation

  • Diffusion distillation is a technique used to reduce sampling cost but can lead to degraded student performance.
  • Incorporating a GAN objective in diffusion distillation can improve results, though the mechanism is not fully understood.
  • Mismatched step sizes and parameter counts between teacher and student models can hinder convergence in distillation.
  • A standalone GAN objective can convert diffusion models into efficient one-step generators without the need for distillation loss.
  • Diffusion training is proposed as a form of generative pre-training, enhancing models for lightweight GAN fine-tuning.
  • A one-step generation model was created by fine-tuning a pre-trained model with 85% frozen parameters.
  • The one-step generation model achieved strong performance with only 0.2M images and near-SOTA results with 5M images.
  • Frequency-domain analysis was presented to explain the one-step generative capability acquired in diffusion training.
  • Overall, the study provides a new perspective on diffusion training, emphasizing its role as a powerful generative pre-training process.

Arxiv

1d

Efficient Prediction of SO(3)-Equivariant Hamiltonian Matrices via SO(2) Local Frames

  • Researchers have introduced an efficient network, QHNetV2, for predicting Hamiltonian matrices to speed up electronic structure calculations.
  • The network achieves global SO(3) equivariance without using costly SO(3) Clebsch-Gordan tensor products.
  • The approach is based on the relationship between off-diagonal blocks of the Hamiltonian matrix and the SO(2) local frame.
  • New efficient and powerful SO(2)-equivariant operations are introduced to perform all off-diagonal feature updates and message passing within SO(2) local frames.
  • A continuous SO(2) tensor product is performed within the SO(2) local frame at each node to fuse node features.
  • Experiments on QH9 and MD17 datasets exhibit the model's high performance across various molecular structures and trajectories.
  • The proposed SO(2) operations offer a scalable and symmetry-aware approach for learning electronic structures.
  • The code will be accessible as part of the AIRS library on GitHub at https://github.com/divelab/AIRS.
