techminis

A naukri.com initiative

ML News

Arxiv

15h

Fast Training of Recurrent Neural Networks with Stationary State Feedbacks

  • Recurrent neural networks (RNNs) have shown strong performance and faster inference compared to Transformers.
  • A novel method is proposed to replace the computationally expensive backpropagation through time (BPTT) algorithm with a fixed gradient feedback mechanism.
  • The method leverages state-space model (SSM) principles to directly propagate gradients from future time steps, reducing training overhead.
  • Experiments on language modeling benchmarks demonstrate competitive perplexity scores while significantly reducing training costs.

Arxiv

How to safely discard features based on aggregate SHAP values

  • A study investigates the practice of discarding unimportant features based on small aggregate SHAP values.
  • The study finds that small aggregate SHAP values do not necessarily imply that the corresponding feature has no effect on the function.
  • To address this issue, the study suggests aggregating SHAP values over the extended support, which is the product of the marginals of the underlying distribution.
  • The study also extends the findings to KernelSHAP, showing that a small aggregate value computed over the extended support justifies feature removal, regardless of how accurately KernelSHAP approximates the true SHAP values.
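
To make the idea of an extended support more concrete, here is a minimal sketch that builds the product of the empirical marginals for a tiny dataset. The function name `extended_support`, the toy model `f`, and the data are assumptions made for illustration and are not taken from the paper. The sketch does not compute SHAP values; it only shows that a function can look constant on the observed data while still varying on the extended support.

```python
from itertools import product

def extended_support(rows):
    """All combinations of per-feature values seen in `rows`
    (the support of the product of the empirical marginals)."""
    marginals = [sorted(set(col)) for col in zip(*rows)]
    return [tuple(combo) for combo in product(*marginals)]

def f(x):
    # Toy model (an assumption): depends on both features.
    return x[0] - x[1]

data = [(0, 0), (1, 1)]          # two perfectly correlated features
ext = extended_support(data)     # adds the unseen points (0, 1) and (1, 0)

on_data = {f(x) for x in data}   # outputs on the observed points only
on_ext = {f(x) for x in ext}     # outputs on the extended support
```

On the observed points the toy model always returns 0, so nothing computed from those points alone can reveal that it depends on either feature. On the extended support the outputs differ, which is the kind of gap the summary above describes.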

Arxiv

Agent-Based Modeling and Deep Neural Networks for Establishing Digital Twins of Secure Facilities under Sensing Restrictions

  • Digital twin technologies help practitioners simulate, monitor, and predict undesirable outcomes in silico, while avoiding the cost and risks of conducting live simulation exercises.
  • Virtual reality (VR) based digital twin technologies are especially useful when monitoring human Patterns of Life (POL) in secure nuclear facilities, where live simulation exercises are too dangerous and costly to ever perform.
  • The challenge of collecting data in high-security facilities led to the use of an agent-based model driven by human activity patterns to generate synthetic movement trajectories in a digital twin system called MetaPOL.
  • The study evaluates the efficacy of using deep neural networks to predict the simulated trajectories and distinguish NPC (non-player character) movement during normal operations from that during a simulated emergency response scenario.

Arxiv

Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

  • Text-to-SQL is a challenging task involving multiple reasoning-intensive subtasks, and existing approaches often rely on handcrafted reasoning paths.
  • A novel set of partial rewards tailored for the Text-to-SQL task is proposed, which addresses the reward sparsity issue in reinforcement learning (RL).
  • The proposed rewards include schema-linking, AI feedback, n-gram similarity, and syntax check to enhance reasoning capabilities and generalization.
  • RL-only training with the proposed rewards achieves higher accuracy and superior generalization compared to supervised fine-tuning (SFT) approaches.
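
Two of the partial rewards named above, n-gram similarity and a syntax check, can be sketched in a few lines. This is an illustrative approximation, not the paper's implementation: the Jaccard overlap of token bigrams, the use of SQLite's `EXPLAIN` as the syntax check, the toy schema, and the equal weights are all assumptions.

```python
import sqlite3

def ngrams(tokens, n=2):
    """Set of consecutive token n-grams."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_reward(pred, gold, n=2):
    """Jaccard overlap of whitespace-token n-grams (assumed similarity measure)."""
    a = ngrams(pred.lower().split(), n)
    b = ngrams(gold.lower().split(), n)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def syntax_reward(sql, schema):
    """1.0 if SQLite can compile the query against the schema, else 0.0.
    Note: this also fails for unknown tables or columns, not only bad syntax."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        conn.execute("EXPLAIN " + sql)  # compiles the query without returning its rows
        return 1.0
    except sqlite3.Error:
        return 0.0
    finally:
        conn.close()

schema = "CREATE TABLE users (id INTEGER, name TEXT);"  # hypothetical schema
gold = "SELECT name FROM users WHERE id = 1"
pred = "SELECT name FROM users WHERE id = 2"

# Hypothetical equal weighting of the two partial rewards.
total = 0.5 * ngram_reward(pred, gold) + 0.5 * syntax_reward(pred, schema)
```

Even though `pred` returns the wrong row, it still earns partial credit for being valid SQL that closely resembles the reference, which is how partial rewards ease the sparsity of an all-or-nothing execution reward.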

Arxiv

Graph ODEs and Beyond: A Comprehensive Survey on Integrating Differential Equations with Graph Neural Networks

  • Graph Neural Networks (GNNs) and differential equations (DEs) are two rapidly advancing areas of research that have shown remarkable synergy in recent years.
  • This survey provides a comprehensive overview of the research at the intersection of GNNs and DEs.
  • The survey categorizes existing methods, discusses their underlying principles, and highlights their applications across different domains.
  • Open challenges and future research directions in this interdisciplinary field are also identified.

Arxiv

TRA: Better Length Generalisation with Threshold Relative Attention

  • Transformers struggle with length generalisation, displaying poor performance even on basic tasks.
  • Two key failures of the self-attention mechanism in Transformers are identified: inability to fully remove irrelevant information and unintentional up-weighting of irrelevant information due to learned positional biases.
  • Selective sparsity and contextualised relative distance are proposed as two mitigations to improve the generalisation capabilities of decoder-only Transformers.
  • Refactoring the attention mechanism with these two mitigations in place can substantially enhance the performance of transformers in handling length generalisation.

Arxiv

A QUBO Framework for Team Formation

  • A QUBO framework for team formation has been introduced.
  • The objective is to find a set of experts that maximizes skill coverage while minimizing costs.
  • Three TeamFormation variants with different cost functions are formulated using quadratic unconstrained binary optimization (QUBO).
  • QUBO-based solutions leveraging graph neural networks enable transfer learning.
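
As a rough illustration of how a coverage-versus-cost objective can be written as a QUBO, the sketch below encodes a toy instance and solves it by brute force. The experts, costs, skills, and the weight `w` are hypothetical, and the encoding relies on the simplifying assumption that each skill is held by at most two experts, so that "skill covered" equals `x_i + x_j - x_i * x_j` and stays quadratic. It is not one of the paper's three formulations.

```python
from itertools import product

costs = [0.5, 1.0, 1.5]                                   # hypothetical cost per expert
skills = {"python": [0, 1], "sql": [1, 2], "ml": [2]}     # skill -> experts holding it
w = 2.0                                                   # reward per covered skill (assumed)

n = len(costs)
Q = [[0.0] * n for _ in range(n)]                         # upper-triangular QUBO matrix

for i, c in enumerate(costs):
    Q[i][i] += c                                          # pay the cost of each chosen expert

for holders in skills.values():
    for i in holders:
        Q[i][i] -= w                                      # linear part of the coverage term
    if len(holders) == 2:
        i, j = sorted(holders)
        Q[i][j] += w                                      # avoid double-counting a shared skill

def energy(x):
    """x^T Q x for a binary vector x (upper triangle only)."""
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))

# Brute force is fine for 3 variables; real instances need a QUBO solver.
best = min(product([0, 1], repeat=n), key=energy)
```

Here `best` selects experts 0 and 2, who together cover all three skills at a total cost of 2.0. Minimizing the energy is the same as maximizing weighted coverage minus cost.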

Arxiv

UP-ROM : Uncertainty-Aware and Parametrised dynamic Reduced-Order Model, application to unsteady flows

  • Reduced order models (ROMs) are important in fluid mechanics for low-cost predictions in engineering applications.
  • A new nonlinear reduction strategy is presented for transient flows, incorporating parametrization and uncertainty quantification.
  • The strategy uses a variational auto-encoder (VAE) with variational inference for confidence measurement.
  • The incorporation of attention mechanisms enhances generalization across different dynamics, improving model performance.

Arxiv

Two Heads Are Better than One: Model-Weight and Latent-Space Analysis for Federated Learning on Non-iid Data against Poisoning Attacks

  • Federated Learning is vulnerable to model poisoning attacks due to its distributed nature.
  • Existing defenses against model poisoning attacks assume that the data at remote clients are independent and identically distributed (iid), whereas in practice they are non-iid.
  • GeminiGuard is a novel defense approach that addresses the gap in non-iid scenarios.
  • GeminiGuard incorporates model-weight analysis and latent-space analysis to enhance defense performance.

Arxiv

Enhancing Physics-Informed Neural Networks with a Hybrid Parallel Kolmogorov-Arnold and MLP Architecture

  • Neural networks have emerged as powerful tools for modeling complex physical systems.
  • A novel architecture, called Hybrid Parallel Kolmogorov-Arnold Network (KAN) and Multi-Layer Perceptron (MLP) Physics-Informed Neural Network (HPKM-PINN), has been proposed.
  • HPKM-PINN combines the strengths of KAN's interpretable function approximation and MLP's nonlinear feature learning for enhanced predictive performance.
  • Benchmark experiments show that HPKM-PINN significantly reduces loss values compared to standalone KAN or MLP models in solving partial differential equations (PDEs).

Arxiv

SalesRLAgent: A Reinforcement Learning Approach for Real-Time Sales Conversion Prediction and Optimization

  • SalesRLAgent is a novel framework utilizing reinforcement learning to predict conversion probability in real-time sales conversations.
  • It treats conversion prediction as a sequential decision problem, training on synthetic data generated using GPT-4o.
  • SalesRLAgent achieves 96.7% accuracy in conversion prediction, outperforming traditional LLM-only approaches by 34.7%.
  • Integration with existing sales platforms shows a 43.2% increase in conversion rates when representatives utilize SalesRLAgent's real-time guidance.

Arxiv

Solve sparse PCA problem by employing Hamiltonian system and leapfrog method

  • Principal Component Analysis (PCA) is widely used for dimensionality reduction but lacks interpretability due to dense linear combinations of features.
  • A novel sparse PCA algorithm is proposed that imposes sparsity through a smooth L1 penalty and utilizes a Hamiltonian formulation.
  • Two distinct numerical methods, Proximal Gradient (ISTA) and leapfrog (fourth-order Runge-Kutta), are employed to minimize the energy function.
  • Experimental evaluations on a face recognition dataset show that the proposed sparse PCA methods achieve higher classification accuracy than conventional PCA.
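
To show how a sparsity penalty zeroes out loadings, here is a minimal sketch of a soft-thresholded gradient step followed by renormalization, in the spirit of ISTA. It uses a plain (non-smooth) L1 penalty and is not the paper's smooth-penalty Hamiltonian formulation or its leapfrog scheme. The covariance matrix `S`, step size `eta`, penalty `lam`, and iteration count are assumptions chosen for illustration.

```python
import math

def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def sparse_pc(S, lam=0.5, eta=0.1, iters=200):
    """Approximate a sparse leading component of the covariance matrix S."""
    n = len(S)
    v = [1.0 / math.sqrt(n)] * n                       # dense unit-norm start
    for _ in range(iters):
        Sv = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Gradient ascent step on v^T S v, then L1 shrinkage.
        u = [soft_threshold(v[i] + 2 * eta * Sv[i], eta * lam) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0.0:                                # penalty too strong: everything shrunk away
            return u
        v = [x / norm for x in u]                      # project back to unit length
    return v

# Hypothetical covariance: features 0 and 1 are correlated, feature 2 is low-variance noise.
S = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 0.1]]
v = sparse_pc(S)
```

In this toy case the loading on the low-variance third feature is driven exactly to zero, while the two correlated features keep nonzero weights. That is the interpretability benefit the summary attributes to sparse PCA.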

Arxiv

Pareto Continual Learning: Preference-Conditioned Learning and Adaption for Dynamic Stability-Plasticity Trade-off

  • Continual learning aims to learn multiple tasks sequentially.
  • Pareto Continual Learning (ParetoCL) is a novel framework for balancing the stability and plasticity trade-off in continual learning.
  • ParetoCL formulates the trade-off as a multi-objective optimization problem and introduces a preference-conditioned model to dynamically adapt during inference.
  • Extensive experiments show that ParetoCL outperforms state-of-the-art methods in diverse continual learning scenarios.

Arxiv

What Makes an Evaluation Useful? Common Pitfalls and Best Practices

  • Following the rapid increase in Artificial Intelligence (AI) capabilities in recent years, the AI community has voiced concerns regarding possible safety risks.
  • To support decision-making on the safe use and development of AI systems, there is a growing need for high-quality evaluations of dangerous model capabilities.
  • In this practitioners' perspective paper, a set of best practices for safety evaluations is presented, drawing on prior work in model evaluation and illustrated through cybersecurity examples.
  • The paper discusses the steps of the initial thought process, characteristics of a useful evaluation, and additional considerations for building a comprehensive evaluation suite.

Arxiv

Towards Trustworthy GUI Agents: A Survey

  • GUI agents, powered by large foundation models, can interact with digital interfaces, enabling various applications in web automation, mobile navigation, and software testing.
  • This survey examines the trustworthiness of GUI agents in five critical dimensions: security vulnerabilities, reliability in dynamic environments, transparency and explainability, ethical considerations, and evaluation methodologies.
  • Major challenges include vulnerability to adversarial attacks, cascading failure modes in sequential decision-making, and a lack of realistic evaluation benchmarks.
  • Establishing robust safety standards and responsible development practices is essential to advance trustworthy GUI agents.
