menu
techminis

A naukri.com initiative

google-web-stories
Home

>

ML News

ML News

source image

Arxiv

1d

read

232

img
dot

Image Credit: Arxiv

A Cram\'er-von Mises Approach to Incentivizing Truthful Data Sharing

  • Incentivizing truthful data sharing is important in modern data marketplaces and sharing consortia to discourage manipulation.
  • Schemes that reward agents based on the quantity of data submitted can lead to fabricated or low-quality data being submitted.
  • A new approach based on a Cramér-von Mises statistic-inspired two-sample test has been developed to incentivize agents to submit genuine data and discourage fabrication.
  • The method establishes truthful reporting as a Nash equilibrium and relaxes key assumptions made by prior work in data sharing problems.

Read Full Article

like

13 Likes

source image

Arxiv

1d

read

221

img
dot

Image Credit: Arxiv

Investigating the Relationship Between Physical Activity and Tailored Behavior Change Messaging: Connecting Contextual Bandit with Large Language Models

  • Machine learning approaches, such as contextual multi-armed bandit algorithms, are being employed to reduce sedentary behavior by delivering personalized interventions to encourage physical activity.
  • A hybrid approach combining contextual multi-armed bandit for selecting intervention types with large language models for personalizing message content is proposed in the study.
  • Four intervention types, such as behavioral self-monitoring, gain-framed, loss-framed, and social comparison, are evaluated through motivational messages to increase motivation for physical activity and daily step count.
  • The study assesses the effectiveness of different models in delivering daily messages, including contextual multi-armed bandit alone, large language models alone, combined contextual multi-armed bandit with large language model personalization, and equal randomization.

Read Full Article

like

13 Likes

source image

Arxiv

1d

read

217

img
dot

Image Credit: Arxiv

Tokenized Bandit for LLM Decoding and Alignment

  • Introduction of tokenized linear bandit (TLB) and multi-armed bandit (TMAB) problems inspired by LLM decoding and alignment.
  • Learning without structure on the sequence function is shown to be impossible in both TLB and TMAB problems.
  • Algorithms proposed with regret bounds for TLB and TMAB under the assumption of diminishing distance with more commons (DDMC).
  • Validation of algorithm's performance and assumptions using synthetic and real-world datasets.

Read Full Article

like

13 Likes

source image

Arxiv

1d

read

93

img
dot

Image Credit: Arxiv

EviNet: Evidential Reasoning Network for Resilient Graph Learning in the Open and Noisy Environments

  • EviNet is a new framework introduced for graph learning in open and noisy environments.
  • It addresses challenges of misclassification detection and out-of-distribution detection by integrating Beta embedding within a subjective logic framework.
  • EviNet outperforms state-of-the-art methods in in-distribution classification, misclassification detection, and out-of-distribution detection tasks.
  • The framework highlights the importance of uncertainty estimation and logical reasoning for effective graph learning in open-world scenarios.

Read Full Article

like

5 Likes

source image

Arxiv

1d

read

352

img
dot

Image Credit: Arxiv

Pre-trained Large Language Models Learn Hidden Markov Models In-context

  • Pre-trained large language models (LLMs) can effectively model data generated by Hidden Markov Models (HMMs) via in-context learning.
  • LLMs achieve predictive accuracy approaching the theoretical optimum on a diverse set of synthetic HMMs.
  • Novel scaling trends influenced by HMM properties were uncovered, along with practical guidelines for using in-context learning as a diagnostic tool for complex data.
  • In real-world animal decision-making tasks, in-context learning achieves competitive performance with models designed by human experts, showcasing its potential as a powerful tool for uncovering hidden structure in complex scientific data.

Read Full Article

like

21 Likes

source image

Arxiv

1d

read

44

img
dot

Image Credit: Arxiv

PASS: Private Attributes Protection with Stochastic Data Substitution

  • Various studies have been proposed to protect private attributes in Machine Learning (ML) data.
  • Current methods have vulnerabilities due to common weaknesses in adversarial training strategies.
  • To address this, a novel approach called PASS has been introduced for stochastic substitution of data with certain probabilities.
  • PASS has been evaluated on different datasets like facial images, human activity signals, and voice recordings, showing effectiveness and generalizability.

Read Full Article

like

2 Likes

source image

Arxiv

1d

read

329

img
dot

Image Credit: Arxiv

Paged Attention Meets FlexAttention: Unlocking Long-Context Efficiency in Deployed Inference

  • Large Language Models (LLMs) face memory inefficiencies during long-context inference.
  • A new integration of PagedAttention with PyTorch's FlexAttention is introduced to improve efficiency.
  • The fusion of attention kernel in IBM's Foundation Model Stack (FMS) reduces inference latency significantly.
  • Benchmarks on an NVIDIA L4 GPU show reduced latency with global KV cache, maintaining linear growth with sequence length.

Read Full Article

like

19 Likes

source image

Arxiv

1d

read

202

img
dot

Image Credit: Arxiv

DEF: Diffusion-augmented Ensemble Forecasting

  • DEF (Diffusion-augmented Ensemble Forecasting) is a new approach for generating initial condition perturbations.
  • It aims to address limitations in existing methods primarily designed for numerical weather prediction solvers, making them less applicable to machine learning for weather prediction.
  • DEF utilizes a simple conditional diffusion model to generate structured perturbations iteratively, with a guidance term for controlling the perturbation level.
  • Validation on the ERA5 reanalysis dataset shows that DEF improves predictive performance and provides reasonable spread estimates for long-term forecasts.

Read Full Article

like

12 Likes

source image

Arxiv

1d

read

187

img
dot

Image Credit: Arxiv

Mobility-Aware Asynchronous Federated Learning with Dynamic Sparsification

  • Asynchronous Federated Learning (AFL) allows model training on multiple mobile devices independently.
  • Device mobility leads to intermittent connectivity, requiring gradient sparsification and causing model staleness.
  • A theoretical model is developed to analyze the impact of sparsification, model staleness, and mobility on AFL convergence.
  • A mobility-aware dynamic sparsification (MADS) algorithm is proposed to optimize sparsification based on contact time and model staleness, improving convergence and achieving better results in experiments.

Read Full Article

like

11 Likes

source image

Arxiv

1d

read

101

img
dot

Image Credit: Arxiv

JavelinGuard: Low-Cost Transformer Architectures for LLM Security

  • JavelinGuard is a suite of low-cost, high-performance model architectures designed for detecting malicious intent in Large Language Model (LLM) interactions.
  • The suite includes five progressively sophisticated transformer-based architectures named Sharanga, Mahendra, Vaishnava, Ashwina, and Raudra, each offering unique trade-offs in speed, interpretability, and resource requirements.
  • The models are rigorously benchmarked across diverse adversarial datasets, demonstrating superiority over leading open-source guardrail models and large decoder-only LLMs like gpt-4o in terms of accuracy and latency.
  • Raudra's multi-task design is highlighted for offering the most robust performance overall, providing guidance to practitioners for selecting the optimal balance of complexity and efficiency in real-world LLM security applications.

Read Full Article

like

6 Likes

source image

Arxiv

1d

read

86

img
dot

Image Credit: Arxiv

Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

  • A new study titled 'Graph-KV' introduces a method to inject structural biases into large language models.
  • The Graph-KV approach leverages the KV-cache of text segments to allow for interaction governed by structural inductive biases, improving tasks like retrieval-augmented generation.
  • By selectively attending only to designated source segments, Graph-KV induces a graph-structured block mask, sparsifying attention and enabling a message-passing-like step within the language model.
  • Evaluated across various benchmarks and tasks, Graph-KV outperforms baseline methods by effectively reducing positional bias and utilizing structural inductive biases.

Read Full Article

like

5 Likes

source image

Arxiv

1d

read

106

img
dot

Image Credit: Arxiv

MoE-GPS: Guidlines for Prediction Strategy for Dynamic Expert Duplication in MoE Load Balancing

  • Recent works improve MoE inference load balance by dynamically duplicating popular experts to more GPUs to process excessive tokens.
  • MoE-GPS is a framework proposed to guide the selection of the optimal predictor design for multi-GPU Mixture-of-Experts network.
  • It advocates for Distribution-Only Prediction, a strategy that predicts overall token distribution to reduce overhead compared to Token-to-Expert Prediction.
  • On Mixtral 8x7B MMLU dataset, MoE-GPS suggests Distribution-Only Prediction, improving end-to-end inference performance by over 23% compared to Token-to-Expert Prediction.

Read Full Article

like

6 Likes

source image

Arxiv

1d

read

356

img
dot

Image Credit: Arxiv

Moment Alignment: Unifying Gradient and Hessian Matching for Domain Generalization

  • Domain generalization (DG) aims to create models that perform well on new, unseen domains by addressing distribution shifts.
  • Existing methods focusing on aligning domain-level gradients and Hessians for DG are computationally inefficient and lack clear underlying principles.
  • This paper introduces the theory of moment alignment for DG, which unifies Invariant Risk Minimization, gradient matching, and Hessian matching approaches.
  • The proposed Closed-Form Moment Alignment (CMA) algorithm aligns domain-level gradients and Hessians efficiently, demonstrating superior performance in experiments compared to existing algorithms.

Read Full Article

like

21 Likes

source image

Arxiv

1d

read

198

img
dot

Image Credit: Arxiv

InverseScope: Scalable Activation Inversion for Interpreting Large Language Models

  • Understanding internal representations of large language models is crucial for interpretability research.
  • A new framework called InverseScope is introduced for interpreting neural activations through input inversion.
  • InverseScope defines a distribution over inputs to generate similar activations and analyze to infer encoded features.
  • It scales inversion-based interpretability methods for larger models and enables quantitative analysis of internal representations in real-world LLMs.

Read Full Article

like

11 Likes

source image

Arxiv

1d

read

194

img
dot

Image Credit: Arxiv

Anomaly Detection and Early Warning Mechanism for Intelligent Monitoring Systems in Multi-Cloud Environments Based on LLM

  • Proposal of an anomaly detection and early warning mechanism for intelligent monitoring systems in multi-cloud environments based on Large-Scale Language Model (LLM).
  • Introduction of a multi-level feature extraction method combining LLM's natural language processing with traditional machine learning for enhanced anomaly detection accuracy and real-time response efficiency.
  • Dynamic adaptation to various cloud service providers and environments by utilizing LLM's contextual understanding capabilities for improved abnormal pattern detection and failure prediction.
  • Experimental results demonstrate the model's superiority over traditional systems in terms of detection accuracy, latency, resilience, and active management ability in cloud infrastructure.

Read Full Article

like

11 Likes

For uninterrupted reading, download the app