techminis

A naukri.com initiative

ML News

Arxiv

Diffusion-Free Graph Generation with Next-Scale Prediction

  • Autoregressive models in the transformer ecosystem offer efficiency, scalability, and seamless workflows, but they require an explicit sequence order, which conflicts with the unordered nature of graphs.
  • Diffusion models maintain permutation invariance and allow one-shot generation but necessitate numerous denoising steps, additional features, and high computational costs.
  • MAG is a novel diffusion-free graph generation framework inspired by visual autoregressive methods, based on next-scale prediction.
  • MAG utilizes a hierarchy of latent representations to generate graph scales progressively without explicit node ordering.
  • Experiments on generic and molecular graph datasets showed MAG's potential, achieving speedups up to three orders of magnitude over existing methods while maintaining high-quality generation.

Arxiv

On the Geometry of Receiver Operating Characteristic and Precision-Recall Curves

  • The study focuses on the geometry of Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves in binary classification problems.
  • Many commonly used binary classification metrics are found to be functions of a composition function G := F_p ∘ F_n⁻¹.
  • G is defined by the class-conditional cumulative distribution functions of classifier scores in positive (F_p(·)) and negative (F_n(·)) classes.
  • The geometric perspective aids in selecting operating points, understanding decision thresholds, and comparing classifiers.
  • It explains how the shapes and geometry of ROC/PR curves reflect classifier behavior, aiding in building optimized classifiers for specific applications with constraints.
  • The study explores conditions for classifier dominance and provides examples showing the impact of class separability and variance on ROC and PR curves.
  • A link is derived between the positive-to-negative class leakage function G(·) and the Kullback-Leibler divergence.
  • Practical considerations like model calibration, cost-sensitive optimization, and operating point selection under real-world constraints are emphasized.
  • Analytical and numerical examples demonstrate the effects of class separability and variance on ROC and PR geometries.
  • The resulting framework supports more informed classifier deployment and decision-making under application-specific constraints.
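
To make the role of the class-conditional CDFs concrete, here is a minimal sketch that builds an ROC curve directly from empirical versions of F_p and F_n. It assumes a convention the summary does not state: a sample is predicted positive when its score exceeds the threshold t, so FPR = 1 − F_n(t) and TPR = 1 − F_p(t), which gives TPR = 1 − G(1 − FPR) for G := F_p ∘ F_n⁻¹. The function names and the scores are made up for illustration.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return the empirical CDF of `samples` as a function t -> P(score <= t)."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda t: bisect_right(ordered, t) / n


def roc_points(pos_scores, neg_scores):
    """ROC points (FPR, TPR), sweeping the threshold over every observed score."""
    F_p, F_n = ecdf(pos_scores), ecdf(neg_scores)
    thresholds = sorted(set(pos_scores) | set(neg_scores))
    # A threshold below every score predicts everything positive.
    points = [(1.0, 1.0)]
    for t in thresholds:
        # Predict positive when score > t (assumed convention).
        points.append((1.0 - F_n(t), 1.0 - F_p(t)))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical classifier scores for each class.
positives = [0.9, 0.8, 0.7, 0.4]
negatives = [0.6, 0.3, 0.2, 0.1]
curve = roc_points(positives, negatives)
area = auc(curve)
```

With these scores, 15 of the 16 positive-negative pairs are ranked correctly, and the trapezoidal area matches that fraction.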

Arxiv

Distortion-Aware Brushing for Reliable Cluster Analysis in Multidimensional Projections

  • Distortion-aware brushing is introduced to address the issue of unreliable cluster analysis in multidimensional projections caused by distortions in the data representation.
  • Conventional brushing in 2D scatterplots may lead to inaccuracies in cluster analysis when applied to multidimensional data projections.
  • The new technique, Distortion-aware brushing, corrects distortions around brushed points by adjusting the points in the projection dynamically.
  • The adjustment pulls together, in the projection, points that are close to the brushed points in the multidimensional space and pushes distant ones apart, making cluster brushing more accurate.
  • User studies involving 24 participants demonstrate that Distortion-aware brushing outperforms previous techniques in separating clusters accurately and remains robust against distortions.
  • The effectiveness of the technique is showcased through two use cases: cluster analysis of geospatial data and interactive labeling of multidimensional clusters.

Arxiv

Glimpse: Generalized Locality for Scalable and Robust CT

  • Deep learning has become the preferred approach for medical tomographic imaging, with common methods using a convolutional neural network (CNN) after simple inversion steps like backprojection.
  • Current CNN approaches tend to overfit large-scale structures and struggle with generalization on out-of-distribution (OOD) samples.
  • Multiscale CNNs are computationally complex and memory-intensive at high resolutions, limiting practical applications in realistic clinical settings.
  • A new approach called Glimpse, a local coordinate-based neural network for computed tomography, processes only neighborhood measurements for pixel reconstruction.
  • Glimpse outperforms CNNs on OOD samples, maintains performance on in-distribution test data, and has a memory footprint independent of image resolution.
  • Training Glimpse on 1024×1024 images requires only 5 GB of memory, significantly less than CNNs need.
  • Glimpse is fully differentiable and can be integrated into various deep learning architectures, allowing tasks like correcting miscalibrated projection orientations.
  • The implementation and demo of Glimpse are available on GitHub at https://github.com/swing-research/Glimpse.

Arxiv

IoTGeM: Generalizable Models for Behaviour-Based IoT Attack Detection

  • IoTGeM is a new approach for behavior-based attack detection focusing on generalizability and improved performance in IoT networks.
  • It introduces an enhanced rolling window method for feature extraction and utilizes a multi-step feature selection process with a Genetic Algorithm guided by external feedback.
  • To avoid overfitting, models are trained and tested using separate datasets and rigorously evaluated with various machine learning algorithms and datasets.
  • The IoTGeM models outperform traditional flow-based models in generalization, achieving high F1 scores for various attack types on unseen data.
  • The approach also utilizes the SHAP explainable AI technique to identify the key features contributing to accurate attack detection.
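
As a rough illustration of rolling-window feature extraction, the sketch below computes the mean and standard deviation of the most recent packet sizes for each packet. The summary does not describe IoTGeM's enhanced method, so this is only the plain technique; `rolling_features`, the window length, and the packet sizes are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_features(packet_sizes, window=3):
    """Return (mean, std) of the last `window` packet sizes, one pair per packet."""
    recent = deque(maxlen=window)  # old packets fall out automatically
    features = []
    for size in packet_sizes:
        recent.append(size)
        features.append((mean(recent), pstdev(recent)))
    return features


# Hypothetical packet sizes in bytes.
feats = rolling_features([100, 200, 300, 400], window=2)
```

Each feature depends only on the behaviour inside the window rather than on a whole flow, which is the property a behaviour-based detector relies on.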

Arxiv

Learning a Gaussian Mixture for Sparsity Regularization in Inverse Problems

  • In inverse problems, sparsity regularization is used to regularize the solution by assuming that the unknown can be represented with only a few significant components.
  • The study proposes a new probabilistic sparsity prior, based on a mixture of degenerate Gaussians, that models sparsity in a flexible way.
  • A neural network is designed as the Bayes estimator for linear inverse problems under the probabilistic sparsity prior framework.
  • Supervised and unsupervised training strategies are suggested to estimate the parameters of this neural network.
  • The effectiveness of the proposed approach is evaluated against common sparsity-promoting techniques like LASSO, group LASSO, iterative hard thresholding, and sparse coding/dictionary learning.
  • Comparison results show that the reconstructions using the new approach consistently have lower mean square error values on 1D datasets compared to traditional techniques.
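
The simplest instance of such a prior is a scalar spike-and-slab: a point mass at zero (a degenerate Gaussian) mixed with a zero-mean Gaussian. The sketch below computes the Bayes estimator (posterior mean) for the denoising problem y = x + noise under that prior and sets it beside soft thresholding, the scalar form of the LASSO solution. This is an illustrative special case with made-up parameters, not the network or the general prior from the paper.

```python
import math


def bayes_estimate(y, w_slab=0.2, slab_var=4.0, noise_var=0.25):
    """Posterior mean of x given y = x + noise,
    for x ~ (1 - w_slab) * delta_0 + w_slab * N(0, slab_var)."""
    total_var = slab_var + noise_var
    # Log-odds that y came from the slab rather than the spike,
    # computed in log space to avoid underflow for large |y|.
    log_odds = (
        math.log(w_slab / (1.0 - w_slab))
        + 0.5 * math.log(noise_var / total_var)
        + 0.5 * y * y * (1.0 / noise_var - 1.0 / total_var)
    )
    p_slab = 1.0 / (1.0 + math.exp(-log_odds))
    # Under the slab, the posterior mean is a linear shrinkage of y;
    # under the spike it is exactly zero.
    return p_slab * (slab_var / total_var) * y


def soft_threshold(y, lam=0.5):
    """Scalar LASSO solution: shrink y toward zero by lam."""
    return math.copysign(max(abs(y) - lam, 0.0), y)
```

Small observations are shrunk almost to zero because the spike explains them best, while large ones are shrunk only by the linear factor; soft thresholding instead subtracts the same amount from every surviving value.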

Arxiv

CompilerDream: Learning a Compiler World Model for General Code Optimization

  • CompilerDream is a model-based reinforcement learning approach designed for general code optimization in compilers.
  • Optimization in compilers is crucial for computer and software engineering success.
  • The effectiveness of optimizations relies on the selection and ordering of optimization passes applied to the code.
  • Current methods for finding the optimal sequence of optimization passes are either slow or struggle to generalize to unseen code.
  • CompilerDream introduces a compiler world model that simulates optimization passes and an agent trained on this model for generating effective optimization strategies.
  • By training on a large-scale program dataset, CompilerDream can serve as a general code optimizer for various application scenarios and source-code languages.
  • CompilerDream showcases strong optimization capabilities for autotuning and outperforms LLVM's built-in optimizations, leading the CompilerGym leaderboard.
  • The model's ability to generalize across diverse datasets without prior training surpasses state-of-the-art methods in both value prediction and end-to-end code optimization.

Arxiv

Incentivizing Quality Text Generation via Statistical Contracts

  • Large language models (LLMs) have increased the demand for machine-generated text.
  • Current pay-per-token pricing schemes lead to a misalignment of incentives known as moral hazard.
  • There is a strong incentive for text-generating agents to prefer a cheaper model over a cutting-edge one.
  • Because inference runs internally, the agent can make this substitution without the principal observing it, which lowers the quality of the generated text.
  • To address this issue, a pay-for-performance, contract-based framework is proposed to incentivize text quality.
  • The framework involves a principal-agent game where the agent generates text using costly inference.
  • Contracts determine the principal's payment based on an automated quality evaluation of the text.
  • Standard contract theory is insufficient when internal inference costs are unknown.
  • Cost-robust contracts are introduced to deal with this uncertainty.
  • Optimal cost-robust contracts are characterized through a connection to optimal composite hypothesis tests from statistics.
  • Empirical evaluation of the framework involves deriving contracts for various objectives and LLM evaluation benchmarks.
  • Cost-robust contracts show only a slight decrease in objective value compared to their cost-aware counterparts.
  • The study offers insights into incentivizing quality text generation through contracts.
  • Cost-robust contracts are found to be effective in maintaining text quality while addressing cost considerations.
  • The work bridges economic principles and statistical approaches to improve text generation incentives.
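
The moral hazard and its remedy can be seen in a toy principal-agent example. In the sketch below, the agent picks the model that maximizes expected payment minus inference cost, and a contract maps the outcome of an automated quality check to a payment. The model names, costs, pass probabilities, and payments are all hypothetical, and the binary pass/fail evaluator is a simplifying assumption.

```python
def best_response(models, contract):
    """Return the model name that maximizes expected payment minus inference cost.

    models:   {name: (cost, probability the evaluator passes the text)}
    contract: {"pass": payment, "fail": payment}
    """
    def utility(name):
        cost, p_pass = models[name]
        expected_pay = p_pass * contract["pass"] + (1.0 - p_pass) * contract["fail"]
        return expected_pay - cost

    return max(models, key=utility)


# Hypothetical models: the frontier model costs more but passes more often.
models = {"cheap": (1.0, 0.5), "frontier": (3.0, 0.9)}

# A flat payment ignores quality, much like pay-per-token pricing.
flat_contract = {"pass": 4.0, "fail": 4.0}
# A pay-for-performance contract pays only when the evaluation passes.
performance_contract = {"pass": 6.0, "fail": 0.0}

flat_choice = best_response(models, flat_contract)
performance_choice = best_response(models, performance_contract)
```

Under the flat payment the agent keeps 3.0 with the cheap model versus 1.0 with the frontier one, so it picks the cheap model; under the performance contract the expected utilities are 2.0 and 2.4, so the frontier model becomes the better choice. Both contracts here assume the costs are known, which is the assumption that cost-robust contracts relax.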

Arxiv

Learning Geometric Invariant Features for Classification of Vector Polygons with Graph Message-passing Neural Network

  • Geometric shape classification of vector polygons is a challenging task in spatial analysis.
  • This study introduces a graph-based representation of vector polygons and proposes a graph message-passing framework, PolyMP, and its variant, PolyMP-DSC.
  • PolyMP aims to learn more expressive and robust latent representations of polygons by capturing self-looped graph information hierarchically.
  • The framework focuses on learning geometric-invariant features for polygon shape classification.
  • Extensive experiments demonstrate that combining a permutation-invariant graph message-passing neural network with a densely self-connected mechanism results in robust performance on benchmark datasets.
  • The approach outperforms several baseline methods and shows effectiveness on synthetic glyphs and real-world building footprints.
  • PolyMP and PolyMP-DSC effectively capture expressive geometric features that remain invariant under common transformations like translation, rotation, scaling, and shearing, while also being robust to trivial vertex removals.
  • The proposed approach exhibits strong generalization ability, allowing the transfer of learned geometric features from synthetic glyphs to real-world building footprints.

Arxiv

Multi-group Uncertainty Quantification for Long-form Text Generation

  • Uncertainty quantification in long-form text generation is explored in a new study.
  • The study focuses on uncertainty within sub-groupings of data for large language model outputs.
  • Different methods are used to measure uncertainty at the level of individual claims and across the entire output.
  • Biography generation is used as a test case in this study.
  • Demographic attributes are considered to create subgroups of data.
  • Canonical methods for uncertainty quantification perform well for the entire dataset but struggle with subgroup analysis.
  • Group-conditional methods like multicalibration and multivalid conformal prediction are introduced to address subgroup uncertainties.
  • Additional subgroup information consistently enhances calibration and conformal prediction.
  • The study establishes benchmarks for calibration and conformal prediction in the context of long-form text generation.
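
A small sketch shows why a single pooled threshold can fail on subgroups. It uses split conformal prediction with a separate threshold per group, which is a simplification: it assumes disjoint groups, whereas multivalid conformal prediction also handles overlapping ones. The nonconformity scores and group labels are invented for illustration.

```python
import math
from collections import defaultdict


def conformal_threshold(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def groupwise_thresholds(scores, groups, alpha):
    """One conformal threshold per group, assuming every sample is in exactly one group."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    return {g: conformal_threshold(s, alpha) for g, s in by_group.items()}


# Hypothetical calibration scores: group "b" is systematically harder than group "a".
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 1, 2, 3, 4, 5, 6, 7]
groups = ["a"] * 7 + ["b"] * 7

pooled = conformal_threshold(scores, alpha=0.25)
per_group = groupwise_thresholds(scores, groups, alpha=0.25)
```

The pooled threshold here is 5, which is far looser than group "a" needs (0.6) and tighter than group "b" needs (6), so coverage holds on average while missing the target within each group.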

Arxiv

General targeted machine learning for modern causal mediation analysis

  • Causal mediation analyses focus on understanding how causes exert their effects, crucial for scientific progress.
  • Recent years have seen significant development in defining and identifying mediational effects in rigorous causal models.
  • Challenges in interpreting and identifying such effects have been addressed with important progress.
  • Despite advancements in causal inference, statistical methodology for non-parametric estimation has been lacking.
  • There are limited methods available for non-parametric estimation with multiple, continuous, or high-dimensional mediators.
  • The study shows that the identification formulas for six popular non-parametric approaches to mediation analysis can be recovered from just two statistical estimands.
  • An all-purpose one-step estimation algorithm is proposed for machine learning integration in mediation studies using these six definitions.
  • The estimators exhibit desirable properties such as √n-convergence and asymptotic normality.
  • Estimating first-order correction for the one-step estimator involves handling complex density ratios on high-dimensional mediators, addressed using Riesz learning.
  • The methods' properties are illustrated in a simulation study and applied to real data to determine how pain management practices mediate the total effect of chronic pain disorder on opioid use disorder.

Arxiv

Deploying Open-Source Large Language Models: A performance Analysis

  • Since the release of ChatGPT in November 2022, large language models (LLMs) have seen considerable success.
  • Many open-source models are available, but deploying them comes with unknown requirements.
  • Tests conducted at the Centre Inria de l'Université de Bordeaux compare Mistral and LLaMa models' performance.
  • vLLM, a Python library optimized for LLM inference, was used in the study.
  • Results from the tests help evaluate LLM performance based on available GPUs.
  • The study aims to assist private and public groups in deploying LLMs by providing valuable information.

Arxiv

Mimicking Human Intuition: Cognitive Belief-Driven Reinforcement Learning

  • Traditional reinforcement learning methods rely on trial-and-error exploration, which leads to low sample efficiency and makes poor use of past experience.
  • A new framework called Cognitive Belief-Driven Reinforcement Learning (CBD-RL) is proposed to address the inefficiencies of conventional RL.
  • CBD-RL is inspired by cognitive principles and aims to guide agents towards more informative decision-making, simulating the human reasoning process.
  • The core of CBD-RL is a belief system that integrates feedback with prior experience to optimize action probabilities, enhancing decision-making in uncertain environments.
  • CBD-RL organizes state-action pairs into meaningful categories, facilitating generalization and improving sample efficiency.
  • Concrete implementations of CBD-RL, such as CBDQ, CBDPPO, and CBDSAC, have shown superior performance in both discrete and continuous action spaces across various environments like Atari and MuJoCo.
  • By combining cognitive science with reinforcement learning, CBD-RL introduces a new approach to developing more interpretable, efficient, and cognitively-inspired RL systems.

Arxiv

Function-Guided Conditional Generation Using Protein Language Models with Adapters

  • ProCALM (Protein Conditionally Adapted Language Model) is introduced as an approach for generating proteins with desired functions using adapters with protein language models.
  • Existing methods in protein language models for conditional generation are limited and lack generalization to unseen functions.
  • ProCALM utilizes adapters to facilitate the conditional generation of proteins based on versatile representations of protein function like enzyme family, taxonomy, or natural language descriptions.
  • The approach involves finetuning ProGen2 to enable generation conditioned on specific functions, showcasing improved performance compared to current methods.
  • ProCALM demonstrates the ability to generalize to rare and unseen functions, surpassing existing approaches in conditional sequence generation.
  • The method is highlighted for its flexibility, computational efficiency, and potential applicability to a broad array of generative language models.

Arxiv

PointNet with KAN versus PointNet with MLP for 3D Classification and Segmentation of Point Sets

  • Kolmogorov-Arnold Networks (KANs) are being explored as an alternative to Multilayer Perceptrons (MLPs) in deep learning.
  • KANs have been integrated into various deep learning architectures, but their use in point-cloud-based neural networks has not been studied.
  • A new model, PointNet-KAN, uses KANs in place of MLPs within the PointNet framework for 3D point cloud classification and segmentation.
  • PointNet-KAN utilizes shared KAN layers and symmetric functions to maintain permutation invariance for global feature extraction.
  • Unlike MLPs that train weights and biases with fixed activation functions, KANs aim to train the activation functions themselves.
  • Jacobi polynomials are used to construct the KAN layers in this new model.
  • Extensive evaluations show that PointNet-KAN performs competitively with PointNet using MLPs on benchmark datasets for 3D object classification and segmentation.
  • Despite being a simpler architecture, PointNet-KAN achieves good results, even with shallower networks.
  • A hybrid model incorporating both KAN and MLP layers was also studied in this work.
  • The study aims to lay a foundation for integrating KANs into more advanced point cloud processing architectures.
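
The combination of a shared KAN layer and a symmetric function can be sketched in a few lines. The example below uses Legendre polynomials, which are the Jacobi polynomials with α = β = 0, as the learnable activation basis, squashes inputs into (−1, 1) with tanh, and max-pools over points. The layer sizes, random coefficients, tanh normalization, and function names are assumptions for illustration, not the paper's configuration.

```python
import math
import random


def legendre_basis(x, degree):
    """[P_0(x), ..., P_degree(x)] via Bonnet's recurrence (Jacobi with alpha = beta = 0)."""
    values = [1.0, x]
    for n in range(1, degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


def kan_layer(point, coeffs):
    """out[j] = sum over inputs i and degrees k of coeffs[j][i][k] * P_k(tanh(point[i]))."""
    degree = len(coeffs[0][0]) - 1
    bases = [legendre_basis(math.tanh(v), degree) for v in point]
    return [
        sum(c * b for row, basis in zip(out_row, bases) for c, b in zip(row, basis))
        for out_row in coeffs
    ]


def global_feature(points, coeffs):
    """Apply the same KAN layer to every point, then max-pool across points."""
    per_point = [kan_layer(p, coeffs) for p in points]
    return [max(column) for column in zip(*per_point)]


# Hypothetical sizes: 3 input coordinates, 5 output features, polynomial degree 3.
rng = random.Random(0)
coeffs = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)] for _ in range(5)]
cloud = [[rng.uniform(-2, 2) for _ in range(3)] for _ in range(8)]

feature = global_feature(cloud, coeffs)
shuffled = cloud[:]
rng.shuffle(shuffled)
shuffled_feature = global_feature(shuffled, coeffs)
```

Because every point passes through the same coefficients and max is symmetric in its arguments, reordering the points leaves the global feature unchanged. In a trained model the coefficients would be learned, which is what training the activation functions amounts to.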
