ML News

Arxiv

Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations

  • This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines.
  • For Discriminative models, the study examines various architectures and hyperparameters during training and inference and identifies energy-efficient practices.
  • For Generative AI, the study focuses on Large Language Models (LLMs) and assesses energy consumption across different model sizes and varying service requests.
  • The results indicate that optimizing architectures, hyperparameters, and hardware can significantly reduce energy consumption for Discriminative models without sacrificing performance.

Noise-based reward-modulated learning

  • Recent advances in reinforcement learning (RL) have led to significant improvements in task performance.
  • Noise-based alternatives to backpropagation, such as reward-modulated Hebbian learning (RMHL), have been proposed, but their performance has been limited in scenarios with delayed rewards.
  • A novel noise-based learning rule has been derived, which combines directional derivative theory and Hebbian-like updates, enabling efficient, gradient-free learning in RL.
  • The proposed method significantly outperforms RMHL and is competitive with backpropagation-based baselines, making it suitable for low-power and real-time applications.
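
The idea of gradient-free learning from noise can be sketched with a generic weight-perturbation update. This is not the rule derived in the paper: the quadratic loss, the `target` vector, and the `sigma` and `lr` values are assumptions chosen only to show how a noisy probe estimates a directional derivative that then scales the update.

```python
import random


def loss(w, target):
    # Hypothetical quadratic objective standing in for negative reward.
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def perturbation_step(w, target, rng, sigma=1e-3, lr=0.05):
    # Draw a random noise direction xi and probe the loss along it.
    xi = [rng.gauss(0.0, 1.0) for _ in w]
    w_noisy = [wi + sigma * x for wi, x in zip(w, xi)]
    # Finite-difference estimate of the directional derivative along xi.
    d = (loss(w_noisy, target) - loss(w, target)) / sigma
    # Move against the noise direction, scaled by that scalar estimate.
    # No backpropagated gradient is ever computed.
    return [wi - lr * d * x for wi, x in zip(w, xi)]


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    target = [1.0, -2.0, 0.5]
    w = [0.0, 0.0, 0.0]
    initial = loss(w, target)
    for _ in range(steps):
        w = perturbation_step(w, target, rng)
    return initial, loss(w, target)
```

Calling `train()` returns the loss before and after training, so the two values can be compared directly. Each update needs only a scalar feedback signal and the local noise sample, which is the general property that makes noise-based rules attractive for low-power hardware.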

Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving

  • Key-Value cache (KV cache) compression has emerged as a promising technique to optimize Large Language Model (LLM) serving.
  • The paper comprehensively reviews existing algorithmic designs and benchmark studies, identifying missing performance measurement aspects that hinder practical adoption.
  • Representative KV cache compression methods are evaluated, uncovering issues that affect computational efficiency and end-to-end latency.
  • Tools are provided to aid future KV cache compression studies and facilitate practical deployment in production.
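
To make the term concrete, here is a minimal sketch of a KV cache with one simple form of compression, sliding-window eviction. It is an illustration only and not one of the methods the paper evaluates; the class name `WindowedKVCache`, the `max_tokens` parameter, and the single-head `attend` helper are all hypothetical.

```python
import math


def attend(query, keys, values):
    # Scaled dot-product attention of one query over the cached keys.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class WindowedKVCache:
    """Keeps keys/values for only the most recent max_tokens tokens."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)
        if len(self.keys) > self.max_tokens:
            # Evict the oldest entry so memory stays bounded.
            self.keys.pop(0)
            self.values.pop(0)

    def memory_floats(self):
        # Number of stored scalars, a rough proxy for cache memory.
        return sum(len(k) for k in self.keys) + sum(len(v) for v in self.values)
```

A smaller cache does not automatically mean faster serving, because eviction and bookkeeping add their own cost. That gap between memory savings and end-to-end latency is the kind of measurement issue the summary refers to.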

CITRAS: Covariate-Informed Transformer for Time Series Forecasting

  • CITRAS is a patch-based Transformer that addresses challenges in covariate-informed time series forecasting.
  • It leverages multiple targets and covariates, considering both past and future forecasting horizons.
  • CITRAS introduces two novel mechanisms: Key-Value (KV) Shift and Attention Score Smoothing.
  • Experimental results show that CITRAS achieves state-of-the-art performance in both covariate-informed and multivariate forecasting.
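
"Patch-based" generally means the series is cut into short windows that the Transformer treats as tokens. The sketch below shows only that generic step; the function name and the `patch_len` and `stride` parameters are assumptions, and CITRAS's own KV Shift and Attention Score Smoothing mechanisms are not reproduced here.

```python
def make_patches(series, patch_len, stride):
    # Slide a window of patch_len over the series, advancing by stride.
    # Trailing points that do not fill a whole patch are dropped.
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]
```

For example, a series of 10 points with `patch_len=4` and `stride=2` yields four overlapping patches starting at positions 0, 2, 4, and 6.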

Bayesian Predictive Coding

  • Bayesian Predictive Coding (BPC) is a Bayesian extension of Predictive Coding (PC), an influential theory of information processing in the brain.
  • BPC estimates a posterior distribution over network parameters, allowing for better quantification of epistemic uncertainty.
  • Compared to PC, BPC converges in fewer epochs in the full-batch setting and remains competitive in the mini-batch setting.
  • BPC provides a biologically plausible method for Bayesian learning in the brain and offers attractive uncertainty quantification in deep learning.

Accelerated Airfoil Design Using Neural Network Approaches

  • This paper demonstrates the use of Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) to predict airfoil shapes from targeted pressure distribution and vice versa.
  • The dataset used in this study consists of 1600 airfoil shapes simulated at various Reynolds numbers and angles of attack.
  • The refined models show improved efficiency and reduced training time compared to the CNN model for complex datasets.
  • The proposed CNN and DNN models show promising results and have the potential to accelerate aerodynamic optimization and design of high-performance airfoils.

TransMamba: Flexibly Switching between Transformer and Mamba

  • TransMamba is a framework that combines Transformer and Mamba models for efficient long-sequence processing.
  • TransMamba uses shared parameter matrices to switch between attention and state space model (SSM) mechanisms.
  • The framework includes a Memory converter to bridge Transformer and Mamba models for seamless information flow.
  • Experimental results demonstrate that TransMamba achieves superior training efficiency and performance compared to baselines.

Level the Level: Balancing Game Levels for Asymmetric Player Archetypes With Reinforcement Learning

  • This work focuses on generating balanced levels tailored to asymmetric player archetypes in games.
  • The goal is to balance the disparity in abilities through the level design.
  • A reinforcement learning method is used to balance tile-based game levels.
  • The evaluation shows that the method can balance a larger proportion of levels compared to two baseline approaches.

CTSketch: Compositional Tensor Sketching for Scalable Neurosymbolic Learning

  • CTSketch is a novel, scalable neurosymbolic learning algorithm for training neural networks using end-to-end input-output labels.
  • CTSketch decomposes the symbolic program into sub-programs and summarizes each sub-program with a sketched tensor to improve scalability.
  • The algorithm approximates the output distribution of the program using simple tensor operations over input distributions and summaries.
  • CTSketch achieves high accuracy on tasks involving over one thousand inputs, pushing neurosymbolic learning to new scales.

Learning a Canonical Basis of Human Preferences from Binary Ratings

  • Recent advances in generative AI have been driven by alignment techniques such as reinforcement learning from human feedback (RLHF).
  • This paper focuses on understanding the preferences encoded in datasets used for RLHF and identifying common human preferences.
  • A small subset of 21 preference categories captures over 89% of preference variation across individuals, serving as a canonical basis of human preferences.
  • The identified preference basis proves useful for model evaluation and training, offering insights into model alignment and successful fine-tuning.

Predicting Targeted Therapy Resistance in Non-Small Cell Lung Cancer Using Multimodal Machine Learning

  • Lung cancer is the leading cause of cancer death globally, with non-small cell lung cancer (NSCLC) being the most common subtype.
  • A new study has developed a multimodal machine learning model to predict patient resistance to osimertinib, a third-generation EGFR-tyrosine kinase inhibitor, in late-stage NSCLC patients with activating EGFR mutations.
  • The model achieved a c-index of 0.82 on a multi-institutional dataset by integrating various data types such as histology images, next-generation sequencing (NGS) data, demographics data, and clinical records.
  • The multimodal model demonstrated superior performance over single modality models, highlighting the importance of combining multiple data types for accurate patient outcome prediction.
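
The c-index quoted above is a concordance index: the fraction of comparable patient pairs that a model ranks in the right order. The sketch below uses the common Harrell-style definition on toy inputs. Pairs with tied times are skipped for simplicity, and the study's exact evaluation procedure is not given in this summary, so treat this as an assumption.

```python
def concordance_index(times, events, risks):
    """Harrell-style c-index.

    times:  observed time for each patient
    events: 1 if the event (e.g. resistance) was observed, 0 if censored
    risks:  model risk scores, higher meaning an earlier expected event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event got the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.82 means about 82% of comparable pairs are ordered correctly under this definition.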

Ride-Sourcing Vehicle Rebalancing with Service Accessibility Guarantees via Constrained Mean-Field Reinforcement Learning

  • The rapid expansion of ride-sourcing services presents operational challenges, such as vehicle rebalancing.
  • A scalable mean-field control and reinforcement learning model is proposed for precise vehicle repositioning.
  • An accessibility constraint is integrated to ensure equitable service distribution.
  • Empirical evaluation using real-world data-driven simulation demonstrates the efficiency and robustness of the approach.

Many-to-Many Matching via Sparsity Controlled Optimal Transport

  • Many-to-many matching seeks to match multiple points in one set with multiple points in another set.
  • This paper proposes a novel many-to-many matching method that explicitly encodes many-to-many constraints while preventing one-to-one matching.
  • The method includes matching budget constraints and a deformed q-entropy regularization to maximize the matching budget.
  • Experimental results show that the proposed method achieves good performance in generating meaningful many-to-many matchings.

Spatio-temporal Prediction of Fine-Grained Origin-Destination Matrices with Applications in Ridesharing

  • Accurate spatial-temporal prediction of network-based travelers' requests is crucial for the effective policy design of ridesharing platforms.
  • This paper introduces a novel prediction model, OD-CED, for fine-grained Origin-Destination (OD) demand prediction in ridesharing platforms.
  • OD-CED combines an unsupervised space coarsening technique and an encoder-decoder architecture to capture both semantic and geographic dependencies.
  • Experimental results show that OD-CED outperforms traditional statistical methods, achieving significant reductions in root-mean-square error and weighted mean absolute percentage error.
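
The two error metrics named above can be computed as follows. The weighted MAPE here uses one common definition, total absolute error divided by total absolute demand; the summary does not state the paper's exact weighting, so that choice is an assumption.

```python
import math


def rmse(actual, predicted):
    # Root-mean-square error: penalizes large misses more heavily.
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def wmape(actual, predicted):
    # Weighted MAPE: total absolute error over total absolute demand.
    # Unlike plain MAPE, it stays defined when some individual demands are zero,
    # as long as total demand is nonzero.
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    total_demand = sum(abs(a) for a in actual)
    return total_error / total_demand
```

The zero-demand property matters for fine-grained origin-destination matrices, where many cells can have no requests in a given time slot.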

Advances in Continual Graph Learning for Anti-Money Laundering Systems: A Comprehensive Review

  • Financial institutions are required to monitor vast amounts of transactions for money laundering.
  • Traditional machine learning models have limitations in adapting to dynamic environments for AML detection.
  • Continual graph learning approaches can enhance AML practices by incorporating new information while retaining prior knowledge.
  • Experimental evaluations show that continual learning improves model adaptability and robustness in detecting money laundering.
