techminis
A naukri.com initiative

ML News

Source: Arxiv

SVInvNet: A Densely Connected Encoder-Decoder Architecture for Seismic Velocity Inversion

  • Researchers have developed SVInvNet, a deep learning-based approach to seismic velocity inversion.
  • SVInvNet employs a novel multi-connection encoder-decoder architecture enhanced with dense blocks (see the sketch after this list).
  • The model effectively processes time-series data and addresses the non-linear challenges of seismic velocity inversion.
  • Despite having fewer parameters, SVInvNet outperforms the baseline model.
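
The summary doesn't give SVInvNet's exact configuration, so the following PyTorch sketch only illustrates the general pattern: an encoder-decoder whose stages are densely connected convolution blocks, mapping seismic panels to a velocity map. Channel counts, depth, and names are assumptions, and the paper's multi-connection skips are omitted for brevity.

# Minimal sketch of a dense-block encoder-decoder for velocity inversion.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each conv sees the concatenation of all previous feature maps."""
    def __init__(self, in_ch, growth=16, layers=3):
        super().__init__()
        self.convs, ch = nn.ModuleList(), in_ch
        for _ in range(layers):
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=1),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))
            ch += growth
        self.out_ch = ch

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

class SVInvNetSketch(nn.Module):
    def __init__(self, in_ch=5, out_ch=1):   # e.g. 5 shot gathers in, 1 velocity map out
        super().__init__()
        self.enc1 = DenseBlock(in_ch)
        self.down = nn.MaxPool2d(2)
        self.enc2 = DenseBlock(self.enc1.out_ch)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec = DenseBlock(self.enc2.out_ch)
        self.head = nn.Conv2d(self.dec.out_ch, out_ch, 1)

    def forward(self, x):
        x = self.enc1(x)
        x = self.enc2(self.down(x))
        return self.head(self.dec(self.up(x)))

velocity = SVInvNetSketch()(torch.randn(2, 5, 64, 64))   # (batch, shots, time, receivers)
print(velocity.shape)                                    # torch.Size([2, 1, 64, 64])

Dense connectivity lets every convolution reuse all earlier feature maps in its block, which is one way a network can stay competitive with far fewer parameters.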

Source: Arxiv

Holistic analysis on the sustainability of Federated Learning across AI product lifecycle

  • The study evaluates the sustainability of Cross-Silo Federated Learning (FL) across the entire AI product lifecycle.
  • Cross-Silo FL is a decentralized approach in which clients share model updates rather than raw data, enhancing privacy (see the sketch after this list).
  • The energy consumption and costs of model training are comparable between Cross-Silo FL and Centralized Learning.
  • Centralized Learning can produce significant additional CO2 emissions due to its extra data-transfer and storage requirements.
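
To make "sharing model updates rather than raw data" concrete, here is a minimal sketch of one cross-silo round in the FedAvg style. The paper's actual training setup isn't described in this summary, so the model, client count, and aggregation rule are illustrative.

# One federated round: clients train locally; only weights travel.
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, target, lr=0.1, steps=5):
    model = copy.deepcopy(global_model)      # raw data never leaves the client
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.mse_loss(model(data), target).backward()
        opt.step()
    return model.state_dict()                # only the update is communicated

def federated_average(states):
    return {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}

global_model = nn.Linear(10, 1)
clients = [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(3)]
states = [local_update(global_model, x, y) for x, y in clients]
global_model.load_state_dict(federated_average(states))

The sustainability question is then where the energy goes: many small local trainings plus weight transfers (FL) versus one central training plus raw-data transfer and storage (Centralized Learning).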

Source: Arxiv

Explainable Bayesian Optimization

  • Researchers have proposed a novel algorithm, TNTRules, to address the explainability problem in Bayesian Optimization (BO).
  • TNTRules provides both global and local explanations for BO recommendations in cyber-physical systems.
  • By generating actionable rules and visual graphs, TNTRules helps identify optimal solution bounds, ranges, and potential alternative solutions (a toy illustration of rule-style bounds follows this list).
  • The algorithm outperforms three baseline methods on explanation quality, as evaluated with established XAI metrics.
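
The summary doesn't describe TNTRules' internals, so the following is only an illustrative stand-in for what a rule-style explanation of a BO outcome can look like: per-parameter interval bounds derived from the best samples in an optimization history.

# Toy "rule" extraction: bound the region where the best samples live.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))                  # evaluated configurations
y = -((X[:, 0] - 0.3) ** 2 + (X[:, 1] - 0.7) ** 2)    # toy objective, higher is better

top = X[np.argsort(y)[-20:]]                          # the 20 best samples
low, high = top.min(axis=0), top.max(axis=0)

for d in range(X.shape[1]):
    print(f"IF {low[d]:.2f} <= x{d} <= {high[d]:.2f}")
print("THEN the configuration is near-optimal")       # a human-readable bound rule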

Source: Arxiv

Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss

  • This work studies statistical learning with dependent data under the square loss for a suitably restricted hypothesis class.
  • The objective is to find a sharp noise interaction term, or variance proxy, for learning with dependent data.
  • The empirical risk minimizer achieves a rate that depends only on the complexity of the class and second-order statistics, termed a 'near mixing-free rate' (a schematic statement follows this list).
  • The study combines the concept of a weakly sub-Gaussian class with mixed-tail generic chaining to compute optimal rates for various problems.
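
As a schematic of the objects involved (the precise conditions and constants are in the paper, not this summary): the empirical risk minimizer under the square loss is

\hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \big( f(x_i) - y_i \big)^2 ,

and a "near mixing-free" rate is one whose leading term scales like (complexity of the class) x (second-order statistics) / n, with the dependence structure of the data (its mixing behavior) entering only through lower-order terms, so the effective sample size is not deflated by the mixing time.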

Source: Arxiv

Towards Adversarially Robust Dataset Distillation by Curvature Regularization

  • Dataset distillation (DD) shrinks a dataset to a fraction of its original size while preserving its rich distributional information, so models trained on the distilled data reach comparable accuracy at a much lower computational cost.
  • This paper adds a new perspective to dataset distillation by embedding adversarial robustness, so that models trained on distilled datasets maintain high accuracy as well as better adversarial robustness.
  • The proposed method incorporates curvature regularization into the distillation process (see the sketch after this list), yielding better accuracy and robustness than standard adversarial training at lower computational overhead.
  • Empirical experiments demonstrate that the method generates robust distilled datasets capable of withstanding various adversarial attacks.
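
The exact regularizer isn't specified in this summary; the sketch below shows one standard way to penalize loss-surface curvature at the input, via a finite-difference change in the input gradient (in the spirit of CURE-style regularizers). Where and how it enters the distillation objective is an assumption here.

# Finite-difference curvature penalty: small value => locally flat loss
# around the input => typically better adversarial robustness.
import torch
import torch.nn as nn

def curvature_penalty(model, x, y, h=1e-2):
    loss_fn = nn.CrossEntropyLoss()
    x = x.clone().requires_grad_(True)
    g = torch.autograd.grad(loss_fn(model(x), y), x, create_graph=True)[0]
    z = torch.sign(g).detach()                         # probe direction
    x2 = (x + h * z).detach().requires_grad_(True)
    g2 = torch.autograd.grad(loss_fn(model(x2), y), x2, create_graph=True)[0]
    return ((g2 - g) ** 2).sum()

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(model(x), y) + 0.1 * curvature_penalty(model, x, y)
loss.backward()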

Source: Arxiv

A Clustering Method with Graph Maximum Decoding Information

  • A novel clustering method called CMDI is proposed, which incorporates two-dimensional structural information into the graph-based clustering process (see the sketch after this list).
  • CMDI reformulates graph partitioning as an abstract clustering problem, leveraging maximum decoding information to minimize the uncertainty associated with random visits to vertices.
  • Empirical evaluations on three real-world datasets show that CMDI outperforms classical baseline methods, exhibiting a superior decoding-information ratio (DI-R).
  • CMDI also demonstrates greater efficiency, especially when prior knowledge (PK) is available, making it a valuable tool for graph-based clustering analyses.
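
CMDI's own objective isn't spelled out in this summary; as background, the sketch below computes the classical one- and two-dimensional structural entropies of a partitioned graph, whose difference is the decoding information such methods seek to maximize. Treat the bookkeeping as an approximation drawn from the structural-information literature, not as CMDI itself.

# Decoding information of a partition = 1D structural entropy - 2D entropy.
import math

def structural_entropies(edges, partition):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    vol = sum(deg.values())                            # twice the edge count
    h1 = -sum(d / vol * math.log2(d / vol) for d in deg.values())
    h2 = 0.0
    for block in partition:
        vol_b = sum(deg[v] for v in block)
        cut = sum(1 for u, v in edges if (u in block) != (v in block))
        h2 -= sum(deg[v] / vol * math.log2(deg[v] / vol_b) for v in block)
        h2 -= cut / vol * math.log2(vol_b / vol)
    return h1, h2

edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
h1, h2 = structural_entropies(edges, [{0, 1, 2}, {3, 4, 5}])
print(f"decoding information: {h1 - h2:.3f} bits")     # higher = less random-visit uncertainty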

Source: Arxiv

CodingTeachLLM: Empowering LLM's Coding Ability via AST Prior Knowledge

  • CodingTeachLLM is a large language model (LLM) designed for coding instruction.
  • It aims to enhance LLMs' coding ability in an educational context.
  • The model uses a prior-based, three-phase supervised fine-tuning approach for better teaching, with abstract syntax trees (ASTs) supplying the prior knowledge (see the sketch after this list).
  • It achieves state-of-the-art code abilities while maintaining strong conversational capabilities.
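
The summary doesn't detail how the AST prior enters the three fine-tuning phases; this sketch only shows the kind of structural signal an abstract syntax tree exposes, using Python's standard ast module. How such facts are serialized into training examples is an assumption.

# Extract structural facts from code with the standard-library ast module.
import ast

source = """
def add(a, b):
    return a + b
"""

tree = ast.parse(source)
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print("function:", node.name, "args:", [a.arg for a in node.args.args])
# Structural facts like these (signatures, scopes, node types) can serve as
# prior knowledge alongside raw code during supervised fine-tuning.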

Source: Arxiv

Rehearsal-free Federated Domain-incremental Learning

  • Researchers have introduced RefFiL, a rehearsal-free federated domain-incremental learning framework.
  • RefFiL is designed to address catastrophic forgetting in federated domain-incremental learning.
  • The framework learns domain-invariant knowledge and incorporates domain-specific prompts from different federated learning participants (see the sketch after this list).
  • RefFiL effectively mitigates forgetting without requiring extra memory, making it suitable for privacy-sensitive and resource-constrained devices.
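
RefFiL's precise prompt mechanism isn't given in this summary; below is a generic prompt-tuning sketch of the idea described: a frozen shared backbone plus small learnable domain-specific prompt vectors, so a new domain costs a few parameters rather than a replay buffer. Names and sizes are illustrative.

# Frozen backbone + per-domain learnable prompts (no rehearsal memory).
import torch
import torch.nn as nn

class PromptedBackbone(nn.Module):
    def __init__(self, backbone, dim, prompt_len=4):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)                  # shared knowledge stays fixed
        self.prompts = nn.ParameterDict()            # one small prompt per domain
        self.prompt_len, self.dim = prompt_len, dim

    def add_domain(self, name):
        self.prompts[name] = nn.Parameter(0.02 * torch.randn(self.prompt_len, self.dim))

    def forward(self, tokens, domain):               # tokens: (batch, seq, dim)
        p = self.prompts[domain].expand(tokens.size(0), -1, -1)
        return self.backbone(torch.cat([p, tokens], dim=1))

layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
model = PromptedBackbone(nn.TransformerEncoder(layer, num_layers=1), dim=32)
model.add_domain("hospital_A")                       # hypothetical participant
print(model(torch.randn(2, 10, 32), "hospital_A").shape)   # torch.Size([2, 14, 32])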

Source: Arxiv

Enhancing Domain Adaptation through Prompt Gradient Alignment

  • A new method, Prompt Gradient Alignment (PGA), is proposed to improve Unsupervised Domain Adaptation (UDA) in vision-language models.
  • PGA leverages large-scale pre-trained vision-language models to learn both domain-invariant and domain-specific features.
  • The method aligns per-objective gradients to foster consensus between them, and prevents overfitting by penalizing gradient norms (see the sketch after this list).
  • Experimental results show that PGA outperforms other vision-language model adaptation methods for UDA.
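
The summary names two ingredients, aligning per-objective gradients and penalizing their norms; the sketch below implements that description literally, with a cosine-agreement term. The paper's actual update rule may differ.

# Reward gradient agreement between two objectives; keep gradients small.
import torch

def aligned_loss(loss_a, loss_b, params, align_w=0.1, norm_w=0.01):
    ga = torch.autograd.grad(loss_a, params, create_graph=True)
    gb = torch.autograd.grad(loss_b, params, create_graph=True)
    va = torch.cat([g.flatten() for g in ga])
    vb = torch.cat([g.flatten() for g in gb])
    cos = torch.dot(va, vb) / (va.norm() * vb.norm() + 1e-12)
    return loss_a + loss_b - align_w * cos + norm_w * (va.norm() + vb.norm())

w = torch.randn(5, requires_grad=True)
x = torch.randn(8, 5)
loss_src = (x @ w).pow(2).mean()          # stand-in "source domain" objective
loss_tgt = ((x @ w) - 1).pow(2).mean()    # stand-in "target domain" objective
aligned_loss(loss_src, loss_tgt, [w]).backward()
print(w.grad.shape)                       # torch.Size([5])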

Source: Arxiv

Generative Data Assimilation of Sparse Weather Station Observations at Kilometer Scales

  • Methods for deep generative data assimilation are proposed for initializing weather forecast models.
  • A diffusion model is trained to generate weather snapshots while incorporating sparse weather-station data (see the sketch after this list).
  • The generated fields show physically plausible structures and outperform a baseline system.
  • Further work is needed to combine regional state generators with diverse data streams.
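
One common way to fold sparse observations into a diffusion sampler is inpainting-style masking, shown below with a toy denoiser; whether the paper conditions exactly this way isn't stated in the summary, so treat it as an illustration of the general mechanism.

# At each denoising step, pin observed grid cells to (noised) station values.
import torch

def assimilate(denoise_step, obs, mask, shape, steps=50):
    x = torch.randn(shape)                            # start from pure noise
    for t in reversed(range(steps)):
        x = denoise_step(x, t)                        # model proposes a cleaner field
        noisy_obs = obs + (t / steps) * torch.randn_like(obs)
        x = mask * noisy_obs + (1 - mask) * x         # clamp where stations report
    return x

denoise_step = lambda x, t: 0.9 * x                   # toy stand-in denoiser
mask = torch.zeros(8, 8); mask[2, 3] = mask[5, 5] = mask[6, 1] = 1.0
obs = torch.zeros(8, 8); obs[2, 3], obs[5, 5], obs[6, 1] = 280.0, 285.0, 279.0
field = assimilate(denoise_step, obs, mask, (8, 8))
print(field[2, 3])                                    # pinned to its observed value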

Source: Arxiv

Machine Unlearning Fails to Remove Data Poisoning Attacks

  • Existing machine unlearning methods fail to remove the effects of data poisoning attacks.
  • The study showed that a range of unlearning methods were ineffective against several types of poisoning attacks across different models.
  • New evaluation metrics were introduced to measure precisely how effective unlearning is at combating data poisoning (see the sketch after this list).
  • The study suggests that unlearning methods for deep learning need further improvement and broader evaluation.
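
A minimal version of the evaluation logic, with a deliberately weak stand-in "unlearning" method (a brief fine-tune on the clean data): poison a training set, unlearn, and check whether the model's response to the attacker's trigger has returned to what clean retraining gives. The paper's models, attacks, and metrics are far richer than this toy.

# If "unlearned" stays far from "clean", the poisoning survived unlearning.
import numpy as np

def train(X, y):                                      # toy linear model
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def finetune_on_retain(w, X, y, lr=0.01, steps=10):   # stand-in unlearning method
    w = w.copy()
    for _ in range(steps):
        w -= lr * (2 / len(X)) * X.T @ (X @ w - y)
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 0, 0, 0, 0])
X_p = np.tile([0, 0, 0, 0, 10.0], (20, 1))            # poisoned points with a trigger
y_p = np.full(20, 99.0)                               # attacker's desired output

w_poisoned = train(np.vstack([X, X_p]), np.concatenate([y, y_p]))
w_unlearned = finetune_on_retain(w_poisoned, X, y)
w_clean = train(X, y)                                 # gold standard: full retraining

trigger = np.array([0, 0, 0, 0, 10.0])
for name, w in [("poisoned", w_poisoned), ("unlearned", w_unlearned), ("clean", w_clean)]:
    print(f"{name:9s} trigger response: {trigger @ w:8.2f}")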

Source: Arxiv

NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals

  • NNsight and NDIF are introduced to enable the scientific study of the representations and computations learned by large neural networks.
  • NNsight is an open-source system that extends PyTorch with deferred remote execution; NDIF is a scalable inference service that executes NNsight requests (see the usage sketch after this list).
  • The Intervention Graph architecture decouples experimental design from model runtime, enabling transparent and efficient access to the internals of deep neural networks.
  • The framework supports a range of research methods on very large models and has been benchmarked against previous approaches.
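
A basic usage sketch following NNsight's public documentation; attribute paths and return conventions have shifted across versions, so verify against the current docs (pip install nnsight).

from nnsight import LanguageModel

model = LanguageModel("openai-community/gpt2", device_map="auto")

# The trace context builds an intervention graph; module outputs can be read
# (or edited) at any layer without modifying the model's source.
with model.trace("The Eiffel Tower is in the city of"):
    hidden = model.transformer.h[5].output[0].save()   # layer-5 hidden states

print(hidden.shape)   # older versions expose the tensor as hidden.value
# Passing remote=True to trace() dispatches the same request to NDIF.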

Source: Arxiv

Prior Learning in Introspective VAEs

  • Variational Autoencoders (VAEs) are a popular framework for unsupervised learning and data generation.
  • A study of Soft-IntroVAE (S-IntroVAE) investigated the implications of incorporating a multimodal, learnable prior into the framework (see the sketch after this list).
  • Formulating the prior as a third player is shown to be an effective approach to prior learning that shares a Nash equilibrium with the vanilla S-IntroVAE.
  • Experiments demonstrated the benefit of prior learning in S-IntroVAE for generation and representation learning on benchmark datasets.
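
As an illustration of a "multimodal and learnable prior", the sketch below implements a mixture-of-Gaussians prior whose parameters receive gradients during training; it shows only the prior component, not S-IntroVAE's adversarial objective or the three-player formulation.

# Learnable mixture prior: -log_prob(z) replaces the fixed N(0, I) ELBO term.
import torch
import torch.nn as nn

class MixturePrior(nn.Module):
    def __init__(self, n_modes=10, dim=16):
        super().__init__()
        self.means = nn.Parameter(torch.randn(n_modes, dim))
        self.log_scales = nn.Parameter(torch.zeros(n_modes, dim))
        self.logits = nn.Parameter(torch.zeros(n_modes))

    def log_prob(self, z):                            # z: (batch, dim)
        comp = torch.distributions.Normal(self.means, self.log_scales.exp())
        lp = comp.log_prob(z.unsqueeze(1)).sum(-1)    # (batch, n_modes)
        return torch.logsumexp(lp + torch.log_softmax(self.logits, 0), dim=1)

prior = MixturePrior()
z = torch.randn(4, 16)                                # stand-in encoder samples
print(prior.log_prob(z).shape)                        # torch.Size([4])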

Source: Arxiv

Large-Scale Multi-omic Biosequence Transformers for Modeling Protein-Nucleic Acid Interactions

  • Researchers have developed large-scale multi-omic biosequence transformers for modeling protein-nucleic acid interactions.
  • Previous research has focused mainly on single-omic models, which are less effective on multi-omic tasks.
  • The multi-omic models (MOMs) can learn joint representations and achieve state-of-the-art results in predicting protein-nucleic acid interactions.
  • MOMs can infer useful structural information without specific structural training, outperforming single-omic models in many cases.

Source: Arxiv

Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization

  • In real-world applications, users often favor a set of structurally diverse design choices over a single high-quality solution.
  • This paper aims to identify a fixed number of solutions that are at least a specified distance apart while maximizing their average quality (see the sketch after this list).
  • An empirical study suggests that uniform random sampling already performs well at producing diverse high-quality solutions.
  • The study emphasizes the need for algorithms tailored to producing diverse solutions of high average quality.
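
The problem statement translates directly into code: draw uniform random samples (the strong baseline the study identifies), then greedily keep the best solutions subject to a minimum pairwise distance. Everything below is a toy instance.

# Pick k solutions, pairwise >= d apart, with high average quality.
import numpy as np

def diverse_top_k(sample, score, k=5, d=0.3, n=2000, seed=0):
    X = sample(np.random.default_rng(seed), n)
    chosen = []
    for i in np.argsort(score(X))[::-1]:              # best candidates first
        if all(np.linalg.norm(X[i] - X[j]) >= d for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return X[chosen]

sample = lambda rng, n: rng.uniform(-1, 1, size=(n, 2))
score = lambda X: -np.linalg.norm(X - 0.2, axis=1)    # toy fitness, peak at (0.2, 0.2)
print(np.round(diverse_top_k(sample, score), 2))      # 5 good, mutually distant points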
