techminis
A naukri.com initiative

ML News

Arxiv · 1h read

Reinforcement Learning from Human Feedback with High-Confidence Safety Constraints

  • HC-RLHF is a proposed method for language model alignment that maximizes helpfulness while providing high-confidence safety guarantees.
  • The method decouples human preferences into helpfulness and harmlessness and uses a two-step process to find safe solutions.
  • It first optimizes the reward function under a pessimistic cost constraint, then applies a safety test that checks an upper-confidence bound on the true cost against the constraint (a minimal stand-in for such a test is sketched below).
  • Empirical analysis shows that HC-RLHF aligns language models with human preferences, producing safe models with high probability and improving both harmlessness and helpfulness.
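
The safety test in the third bullet can be illustrated with a generic high-confidence check. The sketch below is a minimal stand-in, assuming i.i.d. per-prompt harmlessness costs and a one-sided Student-t upper confidence bound; the function and variable names are illustrative, not the paper's API.

```python
import numpy as np
from scipy import stats

def passes_safety_test(costs, threshold, delta=0.05):
    """Accept the candidate policy only if a one-sided (1 - delta)
    upper confidence bound on its mean cost stays below the threshold."""
    costs = np.asarray(costs, dtype=float)
    n = len(costs)
    mean = costs.mean()
    se = costs.std(ddof=1) / np.sqrt(n)
    ucb = mean + stats.t.ppf(1 - delta, df=n - 1) * se
    return ucb <= threshold

# Example: harmlessness costs scored on 200 held-out prompts.
rng = np.random.default_rng(0)
print(passes_safety_test(rng.uniform(0, 0.2, size=200), threshold=0.25))
```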

Arxiv · 1h read

Sparse Interpretable Deep Learning with LIES Networks for Symbolic Regression

  • Symbolic regression (SR) aims to discover closed-form mathematical expressions that accurately describe data, offering interpretability and analytical insight beyond black-box models.
  • LIES (Logarithm, Identity, Exponential, Sine) is a fixed neural network architecture whose interpretable primitive activations are optimized to model symbolic expressions (see the layer sketch below).
  • The framework extracts compact formulae from trained LIES networks, using an oversampling strategy and a tailored loss function to promote sparsity and prevent gradient instability.
  • Experiments on SR benchmarks show that LIES consistently produces sparse, accurate symbolic formulae that outperform all baselines, with ablation studies confirming the importance of each design component.
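
A minimal sketch of what a layer built from the four LIES primitives could look like, assuming a linear map feeding parallel log/identity/exp/sine branches. The safeguards on log and exp reflect the summary's note about gradient instability; the exact architecture is the paper's.

```python
import torch
import torch.nn as nn

class LIESLayer(nn.Module):
    """One layer applying the four fixed LIES primitives in parallel."""
    def __init__(self, in_dim, width):
        super().__init__()
        self.linear = nn.Linear(in_dim, width)

    def forward(self, x):
        z = self.linear(x)
        branches = [
            torch.log(torch.abs(z) + 1e-6),     # Logarithm (safeguarded)
            z,                                   # Identity
            torch.exp(torch.clamp(z, max=10)),   # Exponential (clamped)
            torch.sin(z),                        # Sine
        ]
        return torch.cat(branches, dim=-1)       # (batch, 4 * width)

layer = LIESLayer(in_dim=1, width=8)
features = layer(torch.linspace(-1, 1, 32).unsqueeze(1))  # (32, 32)
```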

Arxiv · 1h read

SWAT-NN: Simultaneous Weights and Architecture Training for Neural Networks in a Latent Space

  • Designing neural networks typically involves manual trial and error, or neural architecture search (NAS) followed by weight training.
  • A new approach, SWAT-NN, optimizes a network's architecture and its weights simultaneously (a toy version of this joint optimization is sketched below).
  • The method uses a universal multi-scale autoencoder to embed architectural and parametric information into a continuous latent space.
  • Experiments show that SWAT-NN effectively discovers sparse, compact neural networks with strong performance on synthetic regression tasks.
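
A toy rendering of the joint idea, under heavy assumptions: a decoder maps one latent code to both an MLP's weights and a soft per-unit gate standing in for architecture, so a single gradient loop trains structure and parameters together. This is illustrative only, not the paper's multi-scale autoencoder.

```python
import torch
import torch.nn as nn

IN, HIDDEN, OUT, LATENT = 1, 16, 1, 32
sizes = {"w1": HIDDEN * IN, "b1": HIDDEN, "w2": OUT * HIDDEN,
         "b2": OUT, "gate": HIDDEN}

decoder = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(),
                        nn.Linear(128, sum(sizes.values())))
z = torch.zeros(LATENT, requires_grad=True)
opt = torch.optim.Adam([z, *decoder.parameters()], lr=1e-2)

x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = torch.sin(3 * x)                         # synthetic regression task

for _ in range(2000):
    params = decoder(z)
    w1, b1, w2, b2, g = torch.split(params, list(sizes.values()))
    gate = torch.sigmoid(g)                  # soft "does this unit exist" mask
    h = torch.tanh(x @ w1.view(HIDDEN, IN).T + b1) * gate
    pred = h @ w2.view(OUT, HIDDEN).T + b2
    loss = ((pred - y) ** 2).mean() + 1e-3 * gate.sum()  # sparsity pressure
    opt.zero_grad()
    loss.backward()
    opt.step()
```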

Arxiv · 1h read

Universal Differential Equations for Scientific Machine Learning of Node-Wise Battery Dynamics in Smart Grids

  • Universal Differential Equations (UDEs) combine neural networks with physical differential equations, enabling data-efficient and interpretable scientific machine learning.
  • In the smart grid domain, modeling node-wise battery dynamics is challenging because solar input and household load profiles vary from node to node, motivating a UDE-based approach (a minimal form is sketched below).
  • The approach uses synthetic data to simulate battery dynamics, with a neural residual capturing unmodeled dynamics arising from diverse node demand and environmental conditions.
  • Experiments show that the UDE model closely matches actual battery trajectories, converges smoothly, and remains stable in long-term forecasts, indicating its effectiveness for battery modeling in renewable-integrated smart grids.
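
A minimal UDE sketch under assumed dynamics: a known linear energy-balance term plus a small neural residual, integrated with explicit Euler. The state variables and the efficiency constant are illustrative, not the paper's model.

```python
import torch
import torch.nn as nn

# Neural residual for effects the physics term does not capture.
residual = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1))

def dsoc_dt(soc, solar, load, eta=0.95):
    physics = eta * solar - load                 # known charge/discharge term
    features = torch.stack([soc, solar, load], dim=-1)
    return physics + residual(features).squeeze(-1)

def rollout(soc0, solar_seq, load_seq, dt=0.1):
    """Explicit-Euler integration of the hybrid ODE over one horizon."""
    soc, trajectory = soc0, [soc0]
    for solar, load in zip(solar_seq, load_seq):
        soc = soc + dt * dsoc_dt(soc, solar, load)
        trajectory.append(soc)
    return torch.stack(trajectory)

# Example: 24 synthetic hourly solar/load samples for one node.
trajectory = rollout(torch.tensor(0.5), torch.rand(24), torch.rand(24))
```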

Arxiv · 1h read

The Impact of Feature Scaling In Machine Learning: Effects on Regression and Classification Tasks

  • A study analyzing the impact of feature scaling in machine learning found distinct effects across algorithms and datasets in both classification and regression tasks.
  • The study evaluated 12 scaling techniques across 14 ML algorithms and 16 datasets, comparing predictive performance metrics and computational costs.
  • Ensemble methods like Random Forest and XGBoost performed consistently regardless of scaling technique, while Logistic Regression, SVMs, TabNet, and MLPs varied significantly depending on the scaler used (the snippet below illustrates the pattern).
  • The research emphasizes the importance of choosing a feature scaling technique appropriate to the specific machine learning model.
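
A quick way to observe the headline pattern with scikit-learn. This setup is illustrative and far smaller than the study's 12-scaler, 14-algorithm, 16-dataset protocol: one dataset, two models, two scalers versus raw features.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
for model in (SVC(), RandomForestClassifier(random_state=0)):
    for scaler in (None, StandardScaler(), MinMaxScaler()):
        pipe = make_pipeline(scaler, model) if scaler else make_pipeline(model)
        score = cross_val_score(pipe, X, y, cv=5).mean()
        name = type(scaler).__name__ if scaler else "unscaled"
        print(f"{type(model).__name__:>22} | {name:<14} | {score:.3f}")
```

On runs like this the forest's scores barely move across scalers, while the SVM's accuracy shifts noticeably, matching the third bullet.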

Arxiv · 1h read

From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium

  • Researchers have developed ECON, a method that improves reasoning in large language models by recasting multi-LLM coordination as an incomplete-information game and seeking a Bayesian Nash equilibrium (the toy game below illustrates equilibrium-seeking).
  • ECON is a hierarchical reinforcement-learning paradigm in which each LLM independently selects responses based on its beliefs about co-agents, avoiding costly inter-agent exchanges.
  • Mathematical proofs show that ECON achieves a significantly tighter regret bound than non-equilibrium multi-agent schemes, and empirical results show an average performance improvement of 11.2% across six benchmarks.
  • Experiments also confirm ECON's scalability and its ability to incorporate additional models, indicating potential for larger and more powerful multi-LLM ensembles.
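
Equilibrium-seeking itself is easy to illustrate on a toy matrix game with iterated best response. The snippet is a stand-in for the fixed point ECON targets, not its belief-driven reinforcement-learning algorithm; the payoff matrices are invented.

```python
import numpy as np

# Two players repeatedly best-respond on a prisoners'-dilemma payoff
# matrix until neither wants to deviate: a pure Nash equilibrium.
A = np.array([[3, 0], [5, 1]])   # row player's payoffs
B = np.array([[3, 5], [0, 1]])   # column player's payoffs

row, col = 0, 0
for _ in range(20):
    best_row = int(np.argmax(A[:, col]))    # best response to current column
    best_col = int(np.argmax(B[best_row]))  # best response to that row
    if (best_row, best_col) == (row, col):
        break                                # mutual best responses reached
    row, col = best_row, best_col
print("equilibrium actions:", row, col)      # -> 1 1 (defect, defect)
```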

Arxiv · 1h read

From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information?

  • Existing benchmarks primarily assess the passive reasoning abilities of large language models (LLMs), supplying all necessary information up front.
  • A new benchmark, AR-Bench, evaluates LLMs' active reasoning skills by requiring interaction with external systems to acquire missing evidence (a toy episode of this kind appears below).
  • AR-Bench comprises task families such as detective cases, situation puzzles, and number guessing, measuring performance across varied reasoning challenges.
  • Empirical evaluation on AR-Bench shows that current LLMs struggle with active reasoning, indicating the need for new methods to advance these capabilities.
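
A toy episode in the spirit of the number-guessing family; the harness is invented for illustration. The agent must choose informative questions, and each answer reveals one bit of the missing evidence.

```python
import random

# Binary search is the "right question" policy for this task.
def run_episode(lo=0, hi=100, budget=8):
    secret = random.randint(lo, hi)
    questions_asked = 0
    while lo < hi and questions_asked < budget:
        mid = (lo + hi) // 2
        if secret > mid:              # the environment answers the question
            lo = mid + 1
        else:
            hi = mid
        questions_asked += 1
    return lo == secret, questions_asked

solved, n_questions = run_episode()
print(f"solved={solved} after {n_questions} questions")
```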

Arxiv · 1h read

H$^2$GFM: Towards unifying Homogeneity and Heterogeneity on Text-Attributed Graphs

  • The Graph Foundation Model (GFM) aims to provide a unified model for graph learning across different graphs and tasks.
  • A novel framework, H$^2$GFM, generalizes across both homogeneous text-attributed graphs (HoTAGs) and heterogeneous ones (HeTAGs).
  • H$^2$GFM uses a context-adaptive graph transformer (CGT) to capture information from context neighbors and their relationships, yielding robust node representations (one possible reading is sketched below).
  • Experiments on various types of text-attributed graphs show that H$^2$GFM effectively captures structural patterns across graph types.
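
One assumed reading of the unification idea: treat every edge as typed, with homogeneous graphs simply using a single relation type, and let attention over neighbors mix in a learned relation embedding. This is a sketch of the pattern, not the paper's CGT.

```python
import torch
import torch.nn as nn

class TypedNeighborAttention(nn.Module):
    """Attend over a node's neighbors, offset by relation-type embeddings."""
    def __init__(self, dim, num_relations):
        super().__init__()
        self.rel = nn.Embedding(num_relations, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, node, neighbors, rel_types):
        # node: (1, dim); neighbors: (k, dim); rel_types: (k,)
        keys = (neighbors + self.rel(rel_types)).unsqueeze(0)
        out, _ = self.attn(node.unsqueeze(0), keys, keys)
        return out.squeeze(0)

layer = TypedNeighborAttention(dim=16, num_relations=3)
node_repr = layer(torch.randn(1, 16), torch.randn(5, 16),
                  torch.randint(0, 3, (5,)))
```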

Arxiv · 1h read

Learnable Spatial-Temporal Positional Encoding for Link Prediction

  • Graph deep learning models such as graph neural networks and graph transformers rely on positional encoding mechanisms for accurate predictions.
  • Current positional encodings are limited: they use predefined functions, adapt poorly to complex graphs, and focus on structural information rather than real-world temporal evolution.
  • Researchers propose a Learnable Spatial-Temporal Positional Encoding and a temporal link prediction model built on it, both named L-STEP, to address these limitations (an assumed form of the encoding is sketched below).
  • L-STEP shows superior performance on various datasets and benchmarks, demonstrating its effectiveness for temporal link prediction.
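
A sketch of what a learnable spatial-temporal positional encoding could look like: a trainable per-node spatial code plus a learnable sinusoidal time code. The parametrization is assumed for illustration and may differ from the paper's.

```python
import torch
import torch.nn as nn

class LearnableSTPE(nn.Module):
    """Per-node spatial embedding plus learnable-frequency time encoding."""
    def __init__(self, num_nodes, dim):
        super().__init__()
        self.spatial = nn.Embedding(num_nodes, dim)
        self.freq = nn.Parameter(torch.randn(dim))
        self.phase = nn.Parameter(torch.zeros(dim))

    def forward(self, node_ids, t):
        # node_ids: (batch,) long; t: (batch,) float timestamps
        time_code = torch.sin(self.freq * t.unsqueeze(-1) + self.phase)
        return self.spatial(node_ids) + time_code

pe = LearnableSTPE(num_nodes=100, dim=32)
codes = pe(torch.arange(4), torch.tensor([0.0, 1.5, 2.0, 7.25]))
```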

Arxiv · 1h read

Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion

  • Discrete diffusion models gradually undo noise via a Markov process.
  • The masking diffusion variant performs best despite not denoising gradually.
  • Masking diffusion exploits a fundamental difference of discrete Markov processes: the schedule of jump times at which states change (a toy forward process is sketched below).
  • Schedule-conditioned discrete diffusion (SCUD), which conditions on this jump schedule directly, outperforms masking diffusion.
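
A toy masking forward process, written to make the jump schedule concrete: each token is assigned an independent jump time at which it flips to a mask symbol, so the state at time t is fully determined by which jumps have fired. Illustrative only; the paper's construction is more general.

```python
import torch

MASK = -1  # stand-in mask token id

def forward_mask(tokens, t, total=1.0):
    """Mask every token whose jump time has already occurred by time t."""
    jump_times = torch.rand(tokens.shape) * total  # one jump per token
    x_t = tokens.clone()
    x_t[jump_times <= t] = MASK
    return x_t, jump_times

x_t, jumps = forward_mask(torch.randint(0, 50, (8,)), t=0.5)
```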

Arxiv · 1h read

Graph Prompting for Graph Learning Models: Recent Advances and Future Directions

  • Graph learning models effectively learn representations from graph data across many scenarios.
  • The 'pre-training, adaptation' scheme is commonly used to train them.
  • Graph prompting has emerged as a promising adaptation approach: trainable prompts are learned while the pre-trained model stays unchanged (the generic pattern is sketched below).
  • The paper reviews recent advances in graph prompting, covering pre-training methods, mainstream prompting techniques, real-world applications, and future directions.
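
The mechanism in the third bullet follows a simple generic pattern, sketched here assuming a PyTorch Geometric-style encoder with an (x, edge_index) call signature: freeze the pre-trained model and learn only a prompt vector added to the node features.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Frozen pre-trained graph encoder plus a trainable feature prompt."""
    def __init__(self, encoder, feat_dim):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False          # pre-trained model unchanged
        self.prompt = nn.Parameter(torch.zeros(feat_dim))

    def forward(self, x, edge_index):
        # Only self.prompt receives gradients during adaptation.
        return self.encoder(x + self.prompt, edge_index)
```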

Arxiv · 1h read

A Simple Analysis of Discretization Error in Diffusion Models

  • Diffusion models, built on discretizations of stochastic differential equations, are known for their generative performance.
  • A simplified theoretical framework analyzes the Euler-Maruyama discretization of variance-preserving SDEs in Denoising Diffusion Probabilistic Models (DDPMs).
  • The study uses Grönwall's inequality to establish a convergence rate of O(1/√T) under Lipschitz assumptions, simplifying previous proofs (the quantities involved are written out below).
  • Experiments validate the theory, confirming the error scaling, the effectiveness of discrete noise over Gaussian noise, and the impact of incorrect noise scaling on performance.
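
For reference, here is the standard setup these claims concern, reconstructed from the summary; the precise assumptions and constants are the paper's.

```latex
% Variance-preserving SDE and its Euler--Maruyama discretization with
% T steps of size h (noise \xi_k \sim \mathcal{N}(0, I)):
dx_t = -\tfrac{1}{2}\beta(t)\,x_t\,dt + \sqrt{\beta(t)}\,dW_t,
\qquad
\hat{x}_{k+1} = \hat{x}_k - \tfrac{1}{2}\beta(t_k)\,\hat{x}_k\,h
              + \sqrt{\beta(t_k)\,h}\,\xi_k .

% Gronwall's inequality propagates the local truncation error into a
% global bound, giving the claimed rate in the number of steps T:
\mathbb{E}\bigl[\lVert x_{t_T} - \hat{x}_T \rVert\bigr] = O\!\bigl(T^{-1/2}\bigr).
```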

Arxiv · 1h read

Dynamical System Optimization

  • An optimization framework is developed that transfers control authority to a parametric policy, creating an autonomous dynamical system.
  • The framework optimizes policy parameters directly, without reference to controls or actions and without relying on approximate Dynamic Programming or Reinforcement Learning (a minimal version appears below).
  • Simpler algorithms derived at the level of the autonomous system perform computations equivalent to policy gradients, Hessians, and other optimization methods.
  • The framework applies to tasks such as behavioral cloning, mechanism design, system identification, and tuning generative AI models.
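
A minimal version of the autonomous-system view, with invented toy dynamics: substitute the policy into the dynamics, then differentiate the rollout cost directly with respect to the policy parameters. No value function or RL machinery appears.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))

def rollout_cost(x0, steps=50, dt=0.05):
    """Cost of the closed-loop (autonomous) system from state x0."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = policy(x)                                   # policy in the loop
        # Toy dynamics: double integrator, state = [position, velocity].
        x = x + dt * torch.cat([x[..., 1:], u], dim=-1)
        cost = cost + (x ** 2).sum() + 0.1 * (u ** 2).sum()
    return cost

opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = rollout_cost(torch.tensor([[1.0, 0.0]]))
    loss.backward()                                      # gradient w.r.t. policy
    opt.step()
```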

Arxiv · 1h read

Differentially Private Relational Learning with Entity-level Privacy Guarantees

  • Implementing differential privacy in relational learning is important for protecting individual entities in sensitive domains.
  • Differential Privacy (DP) provides a structured way to quantify privacy risk, and DP-SGD is the standard algorithm for private model training (its core step is sketched below).
  • Applying DP-SGD to relational learning is challenging: entities participating in multiple relations inflate sensitivity, and the sampling procedures are complex.
  • This work introduces a framework for relational learning with formal entity-level DP guarantees, including sensitivity analysis, adaptive gradient clipping, and privacy amplification for coupled sampling procedures.
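
The DP-SGD step mentioned above follows a standard recipe, simplified here: clip each per-example gradient to a fixed norm, average, and add calibrated Gaussian noise. The paper's entity-level sensitivity analysis and coupled-sampling amplification are not shown.

```python
import torch

def dp_sgd_step(params, per_example_grads, clip_norm=1.0, sigma=1.0, lr=0.1):
    """per_example_grads: one list of per-parameter gradients per example."""
    clipped = []
    for grads in per_example_grads:
        norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-12), max=1.0)
        clipped.append([g * scale for g in grads])   # bound each example's impact
    n = len(clipped)
    for j, p in enumerate(params):
        avg = sum(example[j] for example in clipped) / n
        noise = torch.randn_like(avg) * sigma * clip_norm / n
        p.data -= lr * (avg + noise)                 # noisy clipped update
```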

Arxiv · 1h read

AlphaFold Database Debiasing for Robust Inverse Folding

  • The AlphaFold Protein Structure Database (AFDB) offers high structural coverage at near-experimental accuracy, making it valuable for protein design.
  • However, training deep models directly on AFDB for tasks like inverse folding reveals a systematic geometric bias in the database's structural features.
  • To address this bias, the authors introduce a Debiasing Structure AutoEncoder (DeSAE) that learns to reconstruct native-like conformations from corrupted backbone geometries (an assumed form is sketched below).
  • Applying DeSAE to AFDB structures significantly improves inverse folding performance, underscoring the importance of debiasing in structure-based learning tasks.
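
An assumed form of the debiasing autoencoder: train it to map perturbed backbone coordinates back to native-like ones, then pass AFDB structures through it before downstream training. Only the DeSAE name comes from the summary; the body is illustrative.

```python
import torch
import torch.nn as nn

class DeSAE(nn.Module):
    """Denoising autoencoder over per-atom backbone coordinates (sketch)."""
    def __init__(self, dim=3, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.dec = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))

    def forward(self, coords):
        return self.dec(self.enc(coords))

def train_step(model, native_coords, opt, noise=0.3):
    """Corrupt native geometry, then learn to reconstruct it."""
    corrupted = native_coords + noise * torch.randn_like(native_coords)
    loss = ((model(corrupted) - native_coords) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

model = DeSAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = train_step(model, torch.randn(128, 3), opt)  # 128 backbone atoms
```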
