Machine Learning (ML) Latest News and Trending articles from all top sources only on Techminis

A naukri.com initiative

Arxiv

Weighted Loss Methods for Robust Federated Learning under Data Heterogeneity

Federated learning (FL) enables multiple data holders to train an ML model without sharing data externally.
FL involves workers updating a model locally and sharing their gradients with a central server.
Byzantine-resilient FL prevents malicious participants from harming model convergence.
Common strategies in FL ignore outlier gradients to thwart attacks.
In heterogeneous data settings, distinguishing outliers is challenging.
A new approach, Worker Label Alignment Loss (WoLA), aligns honest worker gradients in heterogeneous data settings.
WoLA helps in identifying malicious gradients and outperforms existing methods in such settings.
The paper includes theoretical insights and empirical evidence supporting WoLA's effectiveness.
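
The outlier-ignoring strategy mentioned above can be illustrated with a coordinate-wise trimmed mean, one common robust aggregation rule. This is a minimal sketch and not WoLA itself; the function name `trimmed_mean`, the parameter `trim`, and the toy gradients are assumptions made for illustration.

```python
from typing import List


def trimmed_mean(gradients: List[List[float]], trim: int) -> List[float]:
    """Coordinate-wise trimmed mean: per coordinate, drop the `trim`
    smallest and `trim` largest values, then average the rest."""
    n = len(gradients)
    if n == 0 or 2 * trim >= n:
        raise ValueError("need more than 2 * trim gradients")
    dim = len(gradients[0])
    aggregated = []
    for j in range(dim):
        column = sorted(g[j] for g in gradients)
        kept = column[trim:n - trim]
        aggregated.append(sum(kept) / len(kept))
    return aggregated


# Three honest workers and one worker sending an extreme gradient
# (hypothetical values).
honest = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
byzantine = [[100.0, -100.0]]
result = trimmed_mean(honest + byzantine, trim=1)
```

When honest gradients are close together, as in this toy case, the extreme values are trimmed away. Under heterogeneous data the honest gradients themselves spread out, which is the difficulty the summary describes.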

Guided Graph Compression for Quantum Graph Neural Networks

Graph Neural Networks (GNNs) face challenges with large graphs due to high memory requirements and inefficient sparse matrix operations on GPUs.
Quantum Computing (QC) is seen as a solution to address GNN challenges and has inspired new algorithmic approaches like Quantum Graph Neural Networks (QGNNs).
Current quantum hardware limitations restrict the effective encoding of data dimensions in QGNNs, leading to the manual simplification of datasets or the use of artificial graphs.
The Guided Graph Compression (GGC) framework is introduced to tackle these limitations by employing a graph autoencoder to reduce the number of nodes and the dimensionality of node features.
GGC compresses graphs in a way that improves downstream classification, and it is compatible with both quantum and classical classifiers.
This framework is evaluated on the Jet Tagging task, a crucial classification problem in high energy physics for distinguishing particle jets initiated by quarks and gluons.
Numerical results show GGC comparing favorably against both the autoencoder used as a standalone step and a baseline classical GNN classifier.
The performance of GGC surpasses the alternatives and enables the testing of novel QGNN approaches on practical datasets.

Machine Learning-Based Classification of Oils Using Dielectric Properties and Microwave Resonant Sensing

The paper presents a machine learning-based approach for classifying oils using dielectric properties and a microwave resonant sensor.
Oils exhibit unique dielectric behavior influenced by their molecular composition, resulting in specific changes in the sensor's resonant frequency and amplitude response.
Variations in sensor responses are analyzed to extract relevant features used as inputs for various machine learning classifiers.
The microwave resonant sensor functions in a non-destructive, low-power mode, ideal for real-time industrial applications.
A dataset is created by altering oil samples' permittivity and recording sensor responses for training and evaluation.
Multiple classifiers are developed and tested using the extracted resonant features to differentiate between types of oils.
Experimental outcomes reveal a classification accuracy of 99.41% with the random forest classifier, indicating the method's efficacy in automated oil identification.
The system's small size, energy efficiency, and high accuracy emphasize its suitability for rapid and dependable oil characterization in industrial settings.

Private Aggregation for Byzantine-Resilient Heterogeneous Federated Learning

Ensuring resilience to Byzantine clients while protecting the privacy of data in federated learning is a challenge.
Existing secure aggregation techniques are effective when clients' data is homogeneous but fail for heterogeneous data.
Pre-processing techniques like nearest neighbor mixing can enhance countermeasures in heterogeneous settings.
The proposed multi-stage method combines secret sharing, secure aggregation, and private information retrieval to achieve both privacy and resilience.
The method is designed to provide information-theoretic privacy guarantees and Byzantine resilience under data heterogeneity.
The scheme outperforms previous techniques against a variety of attacks in federated learning.
The paper also investigates reducing the communication overhead of secure aggregation through zero-order estimation methods.
The goal is to make private aggregation scalable to state-of-the-art federated learning tasks.
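
Secret sharing, one of the building blocks named above, can be sketched with additive shares over a prime field. This is an illustration under stated assumptions rather than the paper's scheme: the helper names `share` and `reconstruct`, the modulus `P`, and the integer-encoded updates are all hypothetical.

```python
import secrets
from typing import List

P = 2**61 - 1  # prime modulus, chosen here only for illustration


def share(value: int, n: int) -> List[int]:
    """Split `value` into n additive shares that sum to value mod P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % P)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recover the shared value by summing all shares mod P."""
    return sum(shares) % P


# Each client shares its (integer-encoded) update. Shares are added
# position by position, so only the sum of the updates is reconstructed.
updates = [5, 11, 7]
per_client = [share(u, n=3) for u in updates]
summed_shares = [sum(col) % P for col in zip(*per_client)]
total = reconstruct(summed_shares)
```

Any n - 1 shares of a value are uniformly random on their own, which is the sense in which additive sharing hides individual inputs while still allowing the aggregate to be computed.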

Learning single-index models via harmonic decomposition

The paper studies learning single-index models, where the label depends on the input only through a one-dimensional projection.
Prior work uses Hermite polynomials for recovering the projection under Gaussian inputs.
A new perspective proposes using spherical harmonics due to the problem's rotational symmetry.
The complexity of learning single-index models under spherically symmetric input distributions is characterized.
Estimators based on tensor unfolding and online SGD are introduced, achieving either optimal sample complexity or optimal runtime.
No single estimator may achieve both optimal sample complexity and runtime in general.
Specializing to Gaussian inputs, the theory clarifies existing results and uncovers new phenomena.

A look at adversarial attacks on radio waveforms from discrete latent space

Researchers analyzed the effectiveness of VQVAE in suppressing adversarial attacks on high-SNR radio-frequency datapoints by targeting amplitude modulations from specific digitally modulated waveform classes.
Attacks were constructed so that the adversarially changed in-phase and quadrature components preserve the phase between them, and were compared with attacks where the phase was not preserved.
The classification accuracy of adversarial examples was tested on a classifier trained to achieve 100% accuracy on the original data.
The study evaluated the ability of VQVAE to mitigate the strength of the attack by assessing the classifier accuracy on VQVAE reconstructions of the adversarial datapoints.
It was found that VQVAE significantly reduces the effectiveness of the attack.
The I/Q plane diagrams of the attacked data, their reconstructions, and the original data were compared.
Different methods and metrics were utilized to compare the probability distribution of the VQVAE latent space with and without attack.
By varying the attack strength, interesting properties of the discrete space were observed which could aid in detecting attacks.

Apollo: A Posteriori Label-Only Membership Inference Attack Towards Machine Unlearning

Machine Unlearning (MU) is used to update machine learning models efficiently by removing training samples without retraining from scratch.
MU is employed to provide privacy protection and regulatory compliance but can also increase the model's vulnerability to attacks.
Existing privacy attacks on MU require access to both the unlearned model and the original model, limiting their practicality in real-life scenarios.
A novel privacy attack named Apollo is proposed, focusing on label-only membership inference towards MU.
Apollo operates under a strict threat model where the adversary only has access to the label outputs of the unlearned model.
The attack aims to determine if a data sample has been unlearned and shows high precision in identifying unlearned samples.

Canonical Latent Representations in Conditional Diffusion Models

Conditional diffusion models (CDMs) have shown impressive performance in generative tasks by modeling the full data distribution.
CDMs can entangle class-defining features with irrelevant context, making it challenging to extract robust and interpretable representations.
A new concept, Canonical Latent Representations (CLAReps), has been introduced to address this issue.
CLAReps are latent codes in CDMs that preserve essential categorical information while discarding non-discriminative signals.
By utilizing CLAReps, a novel diffusion-based feature-distillation paradigm called CaDistill has been developed.
CaDistill ensures the transfer of core class knowledge from teacher to student CDMs via CLAReps.
CLAReps enable representative sample generation for each class, providing an interpretable and compact summary of core class semantics.
The student model trained with CaDistill achieves strong adversarial robustness and generalization ability.
By focusing on class signals and ignoring spurious background cues, the student model becomes more robust.
The study indicates that CDMs can serve not only as image generators but also as compact, interpretable teachers for driving robust representation learning.

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Researchers introduce Multiverse, a generative model that enables natively parallel generation by internalizing a MapReduce paradigm.
Multiverse operates through three stages: adaptive task decomposition, parallel subtask execution, and lossless result synthesis.
A real-world Multiverse reasoning model is created with co-design of data, algorithm, and system, facilitating rapid transfer from autoregressive LLMs (AR-LLMs).
Multiverse 1K is developed by converting sequential reasoning chains into structured training data using an automated pipeline.
Multiverse Attention is designed to separate parallel reasoning steps while maintaining compatibility with causal attention during training.
Multiverse Engine enables parallel inference with a scheduler that dynamically switches between sequential and parallel generation.
After fine-tuning with 1K examples, Multiverse-32B, an open-source non-AR model, achieves performance on par with leading AR-LLMs of the same scale.
Budget control experiments demonstrate Multiverse-32B's superior scaling, outperforming AR-LLMs by 1.87% on average using the same context length.
Multiverse-32B also achieves up to 2x speedup across varying batch sizes, leading to practical efficiency gains.
The entire Multiverse ecosystem, including data, model weights, engine, and tools, has been open-sourced for accessibility.
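
The three stages listed above follow the general map/reduce pattern, which can be sketched on a toy numeric task. This is not the Multiverse model or engine; the function names `decompose`, `solve_subtask`, `merge`, and `run` are hypothetical, and plain functions stand in for model generation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def decompose(task: List[int], parts: int) -> List[List[int]]:
    """Stage 1: split the task into independent subtasks."""
    return [task[i::parts] for i in range(parts)]


def solve_subtask(chunk: List[int]) -> int:
    """Stage 2: each subtask is solved independently of the others."""
    return sum(x * x for x in chunk)


def merge(partials: List[int]) -> int:
    """Stage 3: combine the partial results into one answer."""
    return sum(partials)


def run(task: List[int], parts: int = 3) -> int:
    chunks = decompose(task, parts)
    # Subtasks run concurrently; pool.map preserves their order.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(solve_subtask, chunks))
    return merge(partials)


answer = run(list(range(10)))  # sum of squares of 0..9
```

The sketch fixes the number of parts in advance, whereas the summary describes decomposition as adaptive, decided by the model during generation.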

Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling

Large language models (LLMs) face challenges in generating faithful samples from probability distributions despite accurately describing them in natural language.
A study investigates the discrepancy between knowledge representation and sample generation, particularly focusing on Bernoulli distributions.
The study introduces Verbalized Rejection Sampling (VRS), a natural-language adaptation of classical rejection sampling, to enhance sample generation by guiding the LLM to reason and accept or reject proposed samples.
VRS, although utilizing the same Bernoulli mechanism internally, demonstrates a significant reduction in sampling bias across models.
Theoretical analysis indicates that VRS, under mild assumptions, outperforms direct sampling by improving sample quality, with benefits attributed to the algorithm and prompt design.
The research highlights how integrating classical probabilistic tools into LLM workflows through natural language adaptations like VRS can enhance reliability without needing access to model internals.
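
For reference, classical rejection sampling, which VRS adapts into natural language, can be sketched for a Bernoulli target as follows. This is the textbook procedure rather than the prompt-based method from the paper; the names `rejection_sample`, `p_target`, and `q_proposal` are assumptions, with `q_proposal` standing in for a biased proposal source.

```python
import random


def bernoulli_pmf(x: int, p: float) -> float:
    """Probability of outcome x (0 or 1) under Bernoulli(p)."""
    return p if x == 1 else 1.0 - p


def rejection_sample(p_target: float, q_proposal: float, rng: random.Random) -> int:
    """Draw one sample from Bernoulli(p_target) using proposals
    from Bernoulli(q_proposal), where 0 < q_proposal < 1."""
    # m bounds the ratio target/proposal over both outcomes.
    m = max(p_target / q_proposal, (1 - p_target) / (1 - q_proposal))
    while True:
        x = 1 if rng.random() < q_proposal else 0
        accept_prob = bernoulli_pmf(x, p_target) / (m * bernoulli_pmf(x, q_proposal))
        if rng.random() < accept_prob:
            return x


rng = random.Random(0)
samples = [rejection_sample(0.3, 0.5, rng) for _ in range(20000)]
frequency = sum(samples) / len(samples)  # should be close to 0.3
```

Accepted samples follow the target distribution even though proposals come from a different one, which is the property that makes the accept-or-reject step useful for correcting a biased sampler.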

Unify Graph Learning with Text: Unleashing LLM Potentials for Session Search

Session search typically focuses on sequential modeling for deep semantic understanding, neglecting graph structures in interactions.
The proposed Symbolic Graph Ranker (SGR) integrates text-based and graph-based approaches using Large Language Models (LLMs).
SGR converts session graphs into text using symbolic grammar rules, allowing seamless integration of session history, interactions, and task instructions for LLMs.
The objective is to enhance LLMs' ability to capture graph structures within a textual format.
Self-supervised symbolic learning tasks like link prediction and node content generation aid LLMs in capturing topological information.
Experiment results on AOL and Tiangong-ST datasets show the effectiveness of SGR.
SGR offers a methodology that enhances LLMs in capturing graph structures, bridging traditional search strategies with modern LLMs.

Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering

Large Multimodal Models (LMMs) often struggle with in-context learning (ICL) when performing new tasks with limited supervision.
In smaller LMMs, the ICL performance is inconsistent and does not always improve with more examples.
The inconsistency in ICL performance is attributed to LMMs being overwhelmed by unnecessary information in image embeddings.
A meta-learning approach is proposed to enable few-shot capabilities in LMMs by using fixed soft prompts distilled from task-relevant image features.
These prompts can be adapted at test time with just a few examples, addressing the issue of overwhelming information in image embeddings.
An attention-mapper module is introduced to aid in the prompt distillation, which can be integrated with the LLaVA v1.5 architecture.
The attention-mapper module is jointly learned with soft prompts, allowing for task adaptation in LMMs with minimal data using gradient steps.
Evaluation on the VL-ICL Bench demonstrates that the proposed method consistently outperforms ICL and related prompt-tuning approaches.
Even under image perturbations, the proposed method improves task induction and reasoning for visual question answering tasks.

RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

Rule-based reasoning is a fundamental problem, but variations in rule formats and complexity in real-world applications are challenging.
Large reasoning models enhanced by reinforcement learning have shown remarkable capabilities.
The effectiveness of small reasoning models in learning rule-based reasoning with generalization across tasks and domains remains an open question.
A method called RuleReasoner is introduced to conduct rule-based reasoning with a wide range of tasks and domain-aware dynamic sampling.
RuleReasoner resamples training batches by updating sampling weights based on historical rewards to facilitate domain augmentation and flexible learning schedules.
Empirical evaluations show that RuleReasoner outperforms leading large reasoning models on in-distribution and out-of-distribution benchmarks.
RuleReasoner achieves a significant performance improvement over existing methods on both in-distribution and out-of-distribution tasks.
The approach also demonstrates higher computational efficiency compared to previous dynamic sampling methods for reinforcement learning.
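
Resampling batches from reward history, as described above, might look like the following sketch. The summary does not give the actual update rule, so the one used here (upweighting domains with lower average reward) is an assumption, as are the names `update_weights` and `sample_batch` and the example domains and rewards.

```python
import random
from typing import Dict, List


def update_weights(avg_reward: Dict[str, float], eps: float = 0.05) -> Dict[str, float]:
    """Assumed rule: weight is proportional to (1 - average reward) + eps,
    so domains the model handles poorly are sampled more often."""
    raw = {d: (1.0 - r) + eps for d, r in avg_reward.items()}
    total = sum(raw.values())
    return {d: w / total for d, w in raw.items()}


def sample_batch(weights: Dict[str, float], size: int, rng: random.Random) -> List[str]:
    """Draw the domain of each training example according to the weights."""
    domains = list(weights)
    return rng.choices(domains, weights=[weights[d] for d in domains], k=size)


# Hypothetical running average rewards in [0, 1] for three domains.
history = {"logic": 0.9, "arithmetic": 0.5, "legal": 0.2}
weights = update_weights(history)
batch = sample_batch(weights, size=8, rng=random.Random(0))
```

The `eps` term keeps every domain's weight above zero, so a domain with perfect historical reward is still sampled occasionally.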

Reconstructing Heterogeneous Biomolecules via Hierarchical Gaussian Mixtures and Part Discovery

Cryo-electron microscopy (cryo-EM) is used in molecular biology to visualize 3D molecular structures from noisy 2D electron microscope images.
A novel 3D reconstruction framework named CryoSPIRE, inspired by Gaussian Splatting for 4D scene reconstruction, has been introduced for handling non-rigid conformational flexibility and compositional variations in imaged particles.
CryoSPIRE utilizes a hierarchical Gaussian mixture model to infer a part-based segmentation of particles, which helps in dealing with conformational and compositional variability.
The framework has shown the capability to reveal biologically significant structures in complex experimental datasets and has set a new benchmark on CryoBench, a cryo-EM heterogeneity methods benchmark.

Exploring Image Transforms derived from Eye Gaze Variables for Progressive Autism Diagnosis

The prevalence of Autism Spectrum Disorder (ASD) has rapidly increased, impacting communication, behavior, and focus.
Current diagnostic techniques for ASD are time-intensive and costly.
An AI-powered assistive technology is introduced to streamline ASD diagnosis and management.
The system integrates transfer learning with eye gaze variables to diagnose ASD.
This technology allows for periodic in-home diagnosis, reducing stress for individuals and caregivers.
User privacy is maintained through the use of image transforms.
The proposed method enhances communication between guardians and therapists for progress updates and support needs.
The approach ensures timely, accessible diagnosis while protecting privacy and improving outcomes for individuals with ASD.
