Machine Learning (ML) Latest News and Trending articles from all top sources only on Techminis

A naukri.com initiative

Arxiv

Plug-and-Play image restoration with Stochastic deNOising REgularization

Plug-and-Play (PnP) algorithms combine a physical model and deep neural network for image restoration.
Earlier PnP algorithms apply the denoiser to relatively clean images, whereas Diffusion Models denoise noisy ones.
SNORE is a new PnP framework that applies denoising on images with suitable noise levels.
SNORE uses stochastic regularization in a gradient descent algorithm for solving inverse problems.
The algorithm's convergence analysis and annealing extension are provided.
SNORE is competitive with state-of-the-art methods for deblurring and inpainting tasks.
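
As a rough illustration of the idea summarized above, the sketch below runs gradient descent on a toy 1D inpainting problem in which the regularization gradient is evaluated on a noised copy of the current iterate. The closed-form Gaussian denoiser, the function names, and every parameter value are assumptions made for this example; the paper uses a deep denoiser, and its exact update and step-size rules may differ.

```python
import random

def gaussian_denoiser(z, sigma, mu=0.5, prior_var=0.04):
    # Toy stand-in for a learned denoiser (assumption): the MMSE denoiser
    # for an i.i.d. Gaussian prior N(mu, prior_var) at noise level sigma.
    shrink = prior_var / (prior_var + sigma ** 2)
    return [mu + shrink * (zi - mu) for zi in z]

def snore_like_restore(y, mask, noise_std=0.05, sigma=0.1, lam=1.0,
                       step=0.001, iters=2000, seed=0):
    # y: observed pixels; mask[i] = 1 where pixel i was observed, else 0.
    rng = random.Random(seed)
    x = [0.0] * len(y)
    for _ in range(iters):
        # Noise the iterate so the denoiser sees an input at noise level sigma.
        x_noisy = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        denoised = gaussian_denoiser(x_noisy, sigma)
        for i in range(len(x)):
            # Gradient of the data-fidelity term (observed pixels only).
            grad_data = mask[i] * (x[i] - y[i]) / noise_std ** 2
            # Stochastic regularization gradient from the denoiser residual.
            grad_reg = (x_noisy[i] - denoised[i]) / sigma ** 2
            x[i] -= step * (grad_data + lam * grad_reg)
    return x
```

With this toy prior, observed pixels stay close to their measurements and missing pixels drift toward the prior mean, which is all a pixel-wise Gaussian prior can offer; a learned denoiser would fill in structure instead.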

Arxiv

Forecasting high-impact research topics via machine learning on evolving knowledge graphs

The exponential growth in scientific publications makes it hard for researchers to discover impactful research ideas and collaborations outside their field.
Predicting a scientific paper's future citation counts usually occurs after the research is completed, limiting the ability to anticipate impact at the idea stage.
Researchers have developed a large evolving knowledge graph utilizing over 21 million scientific papers to predict the impact of new research ideas that have not yet been published.
The knowledge graph combines a semantic network from paper content and an impact network from historic paper citations.
Machine learning techniques predict the evolving network's future dynamics with high accuracy, with AUC values exceeding 0.9 in most cases.
The goal is to forecast the impact of new research directions, providing insights into potential new and impactful scientific ideas before they are published.
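
For readers less familiar with the metric quoted above, AUC can be read as the probability that a randomly chosen positive example (here, a link that does appear in the future graph) receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name is hypothetical and nothing here is taken from the paper's code.

```python
def roc_auc(scores_pos, scores_neg):
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so values above 0.9 mean that more than nine in ten such pairs are ordered correctly.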

Arxiv

FastLloyd: Federated, Accurate, Secure, and Tunable $k$-Means Clustering with Differential Privacy

Researchers propose a new method for privacy-preserving $k$-means clustering in the horizontally federated setting.
Existing federated approaches for $k$-means clustering have issues with overheads and output privacy.
Differentially private $k$-means algorithms face challenges like a trusted curator or degraded utility due to added noise.
A new method is introduced that enhances both differential privacy and secure computation components.
The proposed design is faster, more private, and more accurate than previous approaches.
Working in the computational differential privacy model, a secure aggregation-based approach achieves significant speed improvements.
The new method matches and further improves the utility of existing approaches in the central model of differential privacy.
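
To make the "degraded utility due to added noise" point concrete, here is a minimal sketch of one differentially private Lloyd iteration in the central model, using the Laplace mechanism on per-cluster sums and counts. It is a generic textbook-style construction and not FastLloyd's protocol: there is no federation or secure aggregation, the data are assumed to be one-dimensional and bounded in [0, 1], neighboring datasets are assumed to differ by adding or removing one point, and all names are hypothetical.

```python
import random

def laplace(rng, scale):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_lloyd_step(points, centroids, epsilon, rng):
    k = len(centroids)
    sums = [0.0] * k
    counts = [0.0] * k
    for p in points:
        # Assign each point to its nearest centroid.
        j = min(range(k), key=lambda c: abs(p - centroids[c]))
        sums[j] += p
        counts[j] += 1.0
    # Points lie in [0, 1], so one point changes the sums by at most 1 and
    # the counts by exactly 1. Splitting epsilon evenly between the two
    # releases gives a Laplace scale of 1 / (epsilon / 2) for each.
    scale = 2.0 / epsilon
    updated = []
    for j in range(k):
        noisy_sum = sums[j] + laplace(rng, scale)
        noisy_count = counts[j] + laplace(rng, scale)
        if noisy_count < 1.0:
            # Too little signal: keep the previous centroid.
            updated.append(centroids[j])
        else:
            # Clamping is post-processing and does not affect privacy.
            updated.append(min(1.0, max(0.0, noisy_sum / noisy_count)))
    return updated
```

Smaller values of `epsilon` increase the noise scale, which is the privacy and utility trade-off the summary refers to; running several iterations also requires dividing the budget across them.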

Arxiv

LieRE: Lie Rotational Positional Encodings

LieRE is introduced as an enhancement to the popular Rotary Position Encoding (RoPE) used in Transformer architectures.
RoPE is limited to one-dimensional sequence data and has restricted representational capacity, prompting the development of LieRE.
LieRE generalizes RoPE to high-dimensional rotation matrices by leveraging their Lie group structure.
Extensive evaluation on image datasets shows LieRE achieving improvement over state-of-the-art baselines in both 2D and 3D classification tasks.
LieRE generalizes better to higher resolutions and is computationally efficient: the CIFAR100 results can be reproduced on 4 A100 GPUs in 30 minutes.
LieRE code is available at https://github.com/StanfordMIMI/LieRE.
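
The Lie group construction mentioned above can be sketched in a few lines: a position vector is mapped to a linear combination of skew-symmetric generator matrices, and the matrix exponential of that combination is a rotation. This is only an illustration under stated assumptions; in the paper the generators are learned, whereas here they are fixed by hand, and the helper names (`skew`, `expm`, `liere_rotation`) are hypothetical and unrelated to the linked repository.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def expm(a, terms=20, squarings=6):
    # Matrix exponential via scaling and squaring with a Taylor series;
    # adequate for the small matrices used in this example.
    n = len(a)
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def skew(params, n):
    # Build an n x n skew-symmetric matrix from its upper-triangular entries.
    a = [[0.0] * n for _ in range(n)]
    values = iter(params)
    for i in range(n):
        for j in range(i + 1, n):
            v = next(values)
            a[i][j] = v
            a[j][i] = -v
    return a

def liere_rotation(position, generators):
    # R(p) = exp(sum_d p_d * A_d), with one generator A_d per position axis.
    n = len(generators[0])
    total = [[sum(p * g[i][j] for p, g in zip(position, generators))
              for j in range(n)] for i in range(n)]
    return expm(total)
```

With a single 2x2 generator this reduces to the planar rotation used by RoPE, and the product of `R(p)` transposed with `R(q)` depends only on `q - p`. With several non-commuting generators that relative-position identity no longer holds exactly, although each `R(p)` is still an orthogonal matrix.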

Arxiv

LLM2TEA: Agentic AI Designer Finds Innovative Objects with Generative Evolutionary Multitasking

Researchers introduce LLM2TEA, an agentic AI designer in a generative evolutionary multitasking framework.
LLM2TEA aims to create innovative solutions crossing multiple domains while adhering to real-world physical specifications.
The AI system utilizes a large language model, a text-to-3D generative model, a classifier, and a physics simulation model.
Novel LLM-based multitask evolutionary operators guide the search for high-performing practical objects.
Experimental results show significant improvements in diversity and physical performance of designs compared to a baseline model.
LLM2TEA designs are both aesthetically creative and functional in real-world applications.
Some designs have been successfully 3D-printed, demonstrating the AI system's ability to create tangible objects.
The AI-generated designs meet practical requirements and exhibit innovation and creativity.
LLM2TEA has potential applications in complex design optimization and discovery.

Arxiv

CaLMQA: Exploring culturally specific long-form question answering across 23 languages

Researchers introduce CaLMQA, a dataset of 51.7K culturally specific questions across 23 languages.
Culturally specific questions are defined as those referring to unique cultural concepts or context-dependent answers.
Questions were collected from web forums and native speakers in both high-resource and under-resourced languages, including Fijian and Kirundi.
Data collection for CaLMQA was translation-free so that culturally unique questions could be included, such as 'Why was the first king of Burundi called Ntare (Lion)?' in Kirundi.
Evaluation of LLM-generated answers showed critical surface-level errors for many languages.
Even the best models struggled with low-resource languages, making mistakes such as answering in the wrong language or repeating text.
Answers to culturally specific questions had more factual errors than answers to culturally agnostic questions.
CaLMQA addresses the lack of work on culturally specific questions for LLMs and aims to support future research in cultural and multilingual long-form question answering.

Arxiv

Leveraging data-driven weather models for improving numerical weather prediction skill through large-scale spectral nudging

Operational meteorological forecasting traditionally relies on physics-based numerical weather prediction models.
Data-driven artificial intelligence models are disrupting this landscape, offering improved computational performance and competitive forecasting accuracy.
However, data-driven models for medium-range forecasting have limitations including low effective resolution and a narrow range of predicted variables.
A study compares the physics-based GEM model with the AI-based GraphCast model, showcasing their strengths and weaknesses in global predictions.
GraphCast excels in predicting large scales over longer lead times but suffers from excessive smoothing at fine scales.
A hybrid NWP-AI system is proposed where GEM's temperature and wind predictions are nudged towards GraphCast predictions for large scales while GEM generates fine-scale details independently.
This hybrid approach enhances prediction skill by leveraging GraphCast's strengths while maintaining physically consistent forecast fields.
The system shows improved accuracy in predicting tropical cyclone trajectories without significant intensity changes.
Efforts are underway to operationalize this hybrid system at the Canadian Meteorological Centre.
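
The large-scale nudging described above can be illustrated on a one-dimensional periodic field: transform both fields to spectral space, pull only the low wavenumbers of the physics-based field toward the data-driven one, and leave the fine scales untouched. This is a single-shot toy under stated assumptions; an operational system would apply a relaxation term repeatedly during model integration on the sphere, and the function names, cutoff, and strength below are hypothetical.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(coeffs):
    n = len(coeffs)
    return [(sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def spectral_nudge(model_field, guide_field, max_wavenumber, strength):
    # Relax the large scales of model_field toward guide_field.
    n = len(model_field)
    m_hat, g_hat = dft(model_field), dft(guide_field)
    nudged = []
    for k in range(n):
        # Indices k and n - k carry the same physical wavenumber.
        wavenumber = min(k, n - k)
        if wavenumber <= max_wavenumber:
            nudged.append(m_hat[k] + strength * (g_hat[k] - m_hat[k]))
        else:
            nudged.append(m_hat[k])  # fine scales stay with the model
    return idft(nudged)

# Toy fields: the "model" has a large-scale wave plus fine-scale detail,
# while the "guide" has only a (different) large-scale wave.
n = 16
model = [math.sin(2 * math.pi * t / n) + 0.3 * math.sin(2 * math.pi * 5 * t / n)
         for t in range(n)]
guide = [2.0 * math.sin(2 * math.pi * t / n) for t in range(n)]
nudged = spectral_nudge(model, guide, max_wavenumber=2, strength=1.0)
```

With `strength=1.0` the result takes its wavenumber-1 component entirely from `guide` while keeping the wavenumber-5 detail from `model`; values between 0 and 1 give a partial relaxation.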

Arxiv

Root Cause Attribution of Delivery Risks via Causal Discovery with Reinforcement Learning

This paper introduces a new method for identifying the root causes of delivery risks in supply chains by combining causal discovery with reinforcement learning.
Traditional approaches to root cause analysis struggle to handle the complexity of supply chains, often resulting in misleading correlations and suboptimal decisions.
The proposed approach utilizes causal discovery to reveal true causal relationships among operational variables and reinforcement learning to refine the causal graph.
This method accurately identifies key factors contributing to late deliveries, including shipping methods and delivery statuses, offering insights to enhance supply chain performance.
The technique is tested on a real-world supply chain dataset, showcasing its effectiveness in pinpointing reasons for delivery delays and suggesting ways to mitigate risks.
The study's outcomes carry substantial implications for enhancing operational efficiency, customer satisfaction, and financial gains in supply chain operations.

Arxiv

Holistic Uncertainty Estimation For Open-Set Recognition

Accurate uncertainty estimation is crucial for open-set recognition, where a probe sample may belong to an unknown identity.
The proposed HolUE method estimates uncertainty holistically through a Bayesian probabilistic model.
HolUE considers two sources of ambiguity: gallery uncertainty from overlapping classes and embedding uncertainty.
Probabilistic embeddings capture sample quality, but a low embedding variance does not always imply a low probability of identification error.
HolUE handles cases where an embedding lies close to multiple gallery classes, which leads to high uncertainty despite high sample quality.
On challenging datasets such as IJB-C and VoxBlink, HolUE identifies recognition errors better than uncertainty estimation methods based solely on sample quality.
A new open-set recognition protocol for the identification of whales and dolphins was introduced alongside HolUE.

Arxiv

LogProber: Disentangling confidence from contamination in LLM responses

Contamination in machine learning refers to testing data leaking into the training set, affecting the evaluation of Large Language Models (LLMs) trained on large, opaque text corpora.
Tools to detect contamination are crucial for fairly tracking LLM performance evolution, especially given their training on web-scraped text.
Previous studies have quantified contamination in short text sequences, but their limitations make them impractical.
LogProber is introduced as an efficient algorithm to detect contamination in a black box setting, focusing on question familiarity over the answer.
LogProber aims to address drawbacks in existing methods and highlights the importance of detection algorithms' design in identifying different forms of contamination.

Arxiv

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models

Safety-aligned large language models (LLMs) sometimes falsely refuse pseudo-harmful prompts, leading to user frustration and public backlash.
A new method is proposed to automatically generate diverse, content-controlled pseudo-harmful prompts for evaluating false refusals in LLMs.
An evaluation dataset called PHTest is created, which is larger than existing datasets and covers more false refusal patterns, and it is used to evaluate 20 LLMs.
The study reveals a trade-off between minimizing false refusals and enhancing safety against jailbreak attacks.
Defense mechanisms against jailbreak attacks can increase false refusal rates, impacting usability.
The proposed method and dataset aim to assist developers in evaluating and improving the safety and usability of LLMs.
Code and dataset are available at https://github.com/umd-huang-lab/FalseRefusal

Arxiv

Traceable LLM-based validation of statements in knowledge graphs

A method is presented for validating RDF triples using LLMs with traceable arguments.
The approach avoids relying on the LLM's internal factual knowledge and instead compares the RDF statements being verified against external documents.
1,719 positive statements from the BioRED dataset were evaluated alongside the same number of newly generated negative statements, resulting in 88% precision and 44% recall, indicating the need for human oversight.
The method was also evaluated on the SNLI dataset and compared with models tuned for the natural language inference task.
The method was demonstrated on Wikidata using a SPARQL query to automatically retrieve statements for verification.
Results suggest that LLMs could be applied for large-scale validation of statements in knowledge graphs, reducing human annotation costs.
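
To put the reported 88% precision and 44% recall in perspective, the short calculation below derives approximate confusion counts and the F1 score. It assumes the 1,719 positive statements are the positive class for both figures; because the published percentages are rounded, the counts are estimates rather than values from the paper, and the function names are hypothetical.

```python
def confusion_from_rates(n_positive, precision, recall):
    # recall = TP / (TP + FN) and TP + FN = n_positive
    tp = recall * n_positive
    fn = n_positive - tp
    # precision = TP / (TP + FP)
    fp = tp / precision - tp
    return tp, fp, fn

def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    return 2.0 * precision * recall / (precision + recall)

tp, fp, fn = confusion_from_rates(1719, 0.88, 0.44)
f1 = f1_score(0.88, 0.44)
```

Under these assumptions roughly 756 of the true statements are confirmed, about 963 are missed, and about 103 false statements are wrongly accepted, giving an F1 of about 0.59. The low recall is what motivates the summary's call for human oversight.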

Arxiv

Multimodal Pragmatic Jailbreak on Text-to-image Models

Diffusion models now generate high-quality images aligned with textual prompts, which raises safety concerns.
The paper describes a new type of jailbreak in which text-to-image (T2I) models produce unsafe content when an image is combined with text that is safe on its own.
A dataset was created to test diffusion-based T2I models under this jailbreak.
Nine T2I models, including commercial ones, were evaluated, showing a tendency to produce unsafe content.
Results indicate rates of unsafe generation varying from 10% to 70%, with DALLE 3 being notably unsafe.
Common filters like keyword blocklists and NSFW image filters were ineffective against this jailbreak.
Filters designed for single modality detection failed to prevent unsafe content generation.
The study delves into the text rendering capability and training data as reasons for such jailbreaks.
The research sets a basis for enhancing security and reliability of T2I models.
Project page available at https://multimodalpragmatic.github.io/

Arxiv

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Modeling human preferences is essential for aligning foundation models with human values.
Traditional reward modeling methods like the Bradley-Terry model have limitations in expressing complex preferences, especially in handling intransitive preferences.
This study introduces preference embedding, which involves embedding responses into a latent space to efficiently capture intricate preference structures with linear query complexity.
The General Preference Optimization (GPO), based on preference scores, is proposed to generalize reward-based reinforcement learning from human feedback (RLHF).
Experimental results demonstrate that the General Preference embedding Model (GPM) consistently outperforms the BT reward model on the RewardBench benchmark and effectively models cyclic preferences.
Evaluation on tasks like AlpacaEval2.0 after language model post-training with GPO and the general preference model shows performance enhancements over BT models.
The method seems promising in enhancing the alignment of foundation models with diverse human values, indicating potential for improvement over existing models.
The code for this model is available at https://github.com/general-preference/general-preference-model.
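
The limitation with intransitive preferences can be seen in a small example. A Bradley-Terry model assigns each response a scalar reward, so if A beats B and B beats C then A must beat C. An antisymmetric score over embeddings has no such constraint. The 2D embeddings and the cross-product score below are assumptions chosen for illustration; the paper's actual parameterization may differ.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bt_prob(reward_i, reward_j):
    # Bradley-Terry: the preference depends only on a scalar reward gap.
    return sigmoid(reward_i - reward_j)

def embedding_prob(v_i, v_j):
    # Antisymmetric score on 2D embeddings: score(i, j) = -score(j, i).
    score = v_i[0] * v_j[1] - v_i[1] * v_j[0]
    return sigmoid(score)

def on_circle(angle_deg, radius=2.0):
    a = math.radians(angle_deg)
    return (radius * math.cos(a), radius * math.sin(a))

# Three responses placed 120 degrees apart form a rock-paper-scissors cycle.
a, b, c = on_circle(0), on_circle(120), on_circle(240)
cycle = [embedding_prob(a, b), embedding_prob(b, c), embedding_prob(c, a)]
```

All three probabilities in `cycle` exceed 0.5, so A is preferred to B, B to C, and C to A. Each response is embedded once and any pair can then be scored from the stored embeddings, which is one way to read the linear query complexity mentioned above.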

Arxiv

Temperature Optimization for Bayesian Deep Learning

The Cold Posterior Effect (CPE) in Bayesian Deep Learning (BDL) involves tempering the posterior to a cold temperature to enhance the predictive performance of the posterior predictive distribution (PPD).
Although colder temperatures are often assumed to be better, the BDL community acknowledges that this does not always hold, and it lacks a systematic method for determining the optimal temperature.
A data-driven approach is suggested in this study to select the temperature maximizing test log-predictive density by treating temperature as a model parameter and estimating it directly from the data.
In empirical demonstrations on regression and classification tasks, the proposed method performs comparably to grid search at a reduced cost.
There is a contrast in the perspectives on CPE between the BDL and Generalized Bayes communities, with the former emphasizing PPD predictive performance and the latter stressing the posterior utility under model misspecification, leading to differing temperature preferences.
