techminis
A naukri.com initiative

ML News

Arxiv · 8h

Persistent Topological Features in Large Language Models

  • Large language models like GPT-3 are widely used, making it important to understand how they make decisions.
  • Researchers apply the mathematical framework of zigzag persistence to analyze the decision-making processes of these models.
  • Zigzag persistence is effective for dynamically characterizing data across model layers.
  • They introduce topological descriptors to measure the persistence and evolution of topological features throughout the layers.
  • Unlike other methods, their approach directly tracks the full evolutionary path of these features.
  • This framework provides insights into how prompts are rearranged and positions changed in the representation space.
  • The researchers demonstrate the framework's versatility by showing how it behaves across different models and datasets.
  • They showcase using zigzag persistence for layer pruning in a downstream task, achieving results similar to state-of-the-art methods.
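
The full zigzag-persistence machinery is beyond a summary, but its core bookkeeping, features that are born, persist, and die as you sweep across layers, can be sketched with plain connected components. The sketch below is a toy illustration, not the paper's descriptors: the distance threshold `eps`, the threshold graph, and the overlap-based matching of components between consecutive layers are all simplifying assumptions.

```python
import itertools

def components(points, eps):
    """Union-find connected components of a threshold (epsilon) graph."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in itertools.combinations(range(len(points)), 2):
        if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps ** 2:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())

def track_features(layers, eps):
    """Record (birth_layer, death_layer) intervals for components across
    layers, matching a component to a successor that shares points.
    A feature alive at the end gets death = len(layers)."""
    intervals, live = [], []   # live: (birth_layer, member_set)
    for t, pts in enumerate(layers):
        comps = components(pts, eps)
        matched = []
        for birth, members in live:
            succ = next((c for c in comps if c & members), None)
            if succ is not None and succ not in [m for _, m in matched]:
                matched.append((birth, succ))   # feature persists
            else:
                intervals.append((birth, t))    # feature dies (or merges)
        for c in comps:
            if all(c is not m for _, m in matched):
                matched.append((t, c))          # feature born
        live = matched
    intervals.extend((b, len(layers)) for b, _ in live)
    return sorted(intervals)
```

Two prompts that start in separate clusters and merge in a later layer yield one interval that dies at the merge and one that survives, which is the kind of evolutionary path the paper's descriptors summarize.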

Arxiv · 8h

Learning in Budgeted Auctions with Spacing Objectives

  • Researchers have introduced a model for budgeted auctions where the spacing of wins over time is crucial, especially in settings like online retail, compute services, and advertising campaigns.
  • The model considers how the value of a win diminishes with time, leading to the importance of evenly spaced wins for a given number of total wins.
  • The research extends to cases where not all wins result in actual gains, and the conversion probability depends on context.
  • The objective is to optimize and evenly distribute conversions over time rather than just wins.
  • The study focuses on optimal strategies in second-price auctions and provides learning algorithms for bidders to minimize regret in a Bayesian online setting.
  • An online learning algorithm is introduced whose regret grows roughly as the square root of the time horizon.
  • The algorithm operates by learning a bidding policy based on the context and system state, such as the time elapsed since the last win or conversion.
  • State-independent strategies are found to incur linear regret even without uncertainty in conversions.
  • Certain state-independent strategies can achieve a near-optimal reward approximation despite still having linear regret.
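
As a rough illustration of why spacing-aware, state-dependent bidding can pay off, here is a toy simulation of repeated second-price auctions in which the value of a win decays the sooner it follows the previous one. The decay schedule, the rival-bid distribution, and both policies are invented for illustration and are not taken from the paper.

```python
import random

def simulate(policy, rounds=10_000, seed=0):
    """Repeated second-price auctions: the value of a win shrinks the
    sooner it follows the previous win (the spacing objective)."""
    rng = random.Random(seed)
    since, utility = 10, 0.0              # rounds since last win
    for _ in range(rounds):
        competing = rng.uniform(0, 1)     # highest rival bid
        if policy(since) >= competing:
            value = 1 - 0.9 ** since      # recency-discounted value
            utility += value - competing  # pay the second price
            since = 0
        since += 1
    return utility

flat = lambda s: 0.5                   # state-independent policy
spaced = lambda s: min(1.0, 0.1 * s)   # bid more the longer the drought
```

Under the same rival-bid sequence, the state-dependent policy avoids paying for bunched, low-value wins and comes out ahead of the constant bidder.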

Arxiv · 8h

SoK: Watermarking for AI-Generated Content

  • Watermarking schemes are being considered as a way to differentiate between AI-generated content and human-created content.
  • These schemes involve embedding hidden signals within AI-generated content for reliable detection.
  • Although not a complete solution, watermarking can contribute significantly to AI safety and trustworthiness by combating misinformation and deception.
  • The paper provides an extensive overview of watermarking techniques for generative AI, starting with the necessity of watermarking from historical and regulatory standpoints.
  • The definitions and desired properties of watermarking schemes are formalized in the paper, along with an analysis of key objectives and threat models.
  • The study also delves into practical evaluation strategies to develop robust watermarking techniques that can withstand various attacks.
  • Recent works in this area are reviewed, open challenges are outlined, and potential future directions for watermarking in generative AI are discussed.
  • The aim of the paper is to guide researchers in improving watermarking methods and applications and aid policymakers in addressing the broader implications of generative AI.
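
One well-known family of schemes covered by the watermarking literature partitions the vocabulary into "green" and "red" lists seeded by the previous token, then biases generation toward green tokens. Below is a minimal sketch of that idea, a hard watermark over a hypothetical 100-word vocabulary, not any specific scheme from the paper.

```python
import hashlib
import random

VOCAB = [f"w{i}" for i in range(100)]

def green_list(prev, frac=0.5):
    """Pseudo-randomly partition the vocabulary, keyed on the previous token."""
    seed = int(hashlib.sha256(prev.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * frac)])

def generate(n, seed=0):
    """Toy 'model': always emit a green-listed token (a hard watermark)."""
    rng = random.Random(seed)
    out = ["w0"]
    for _ in range(n):
        out.append(rng.choice(sorted(green_list(out[-1]))))
    return out

def green_fraction(tokens):
    """Fraction of tokens that fall in their predecessor's green list."""
    hits = sum(t in green_list(p) for p, t in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

rng = random.Random(1)
natural = [rng.choice(VOCAB) for _ in range(201)]  # unwatermarked baseline
```

A detector that knows the hash key checks the green fraction: near 1.0 flags watermarked text, while unwatermarked text hovers around the list fraction (here 0.5). Real schemes soften the bias so quality is preserved, which is part of the robustness trade-off the paper analyzes.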

Arxiv · 8h

Balans: Multi-Armed Bandits-based Adaptive Large Neighborhood Search for Mixed-Integer Programming Problem

  • Mixed-integer programming (MIP) is important for solving combinatorial optimization problems.
  • Learning-based approaches have potential to speed up MIP solving, but rely heavily on offline training.
  • The paper proposes Balans, an adaptive meta-solver for MIPs with online learning capability that requires no supervision or prior training.
  • Balans employs adaptive large-neighborhood search with destroy and repair operators on top of an MIP solver.
  • Multi-armed bandit algorithms guide the selection of neighborhood definitions during the search.
  • Experiments show that Balans outperforms the default MIP solver, does better than a single best neighborhood choice, and surpasses the state-of-the-art large-neighborhood search for MIPs.
  • Balans is released as open-source software that is highly configurable and independent of any specific MIP solver.
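
The bandit layer of such a meta-solver can be sketched independently of any MIP machinery. Below, UCB1 chooses among stub "neighborhoods" whose improvement probabilities are unknown to the learner; those probabilities and the 0/1 reward model are placeholders for illustration, not Balans's actual destroy/repair operators.

```python
import math
import random

def ucb1_select(counts, rewards, t):
    """Pick the arm maximizing average reward plus an exploration bonus."""
    for a in range(len(counts)):        # play every arm once first
        if counts[a] == 0:
            return a
    return max(range(len(counts)),
               key=lambda a: rewards[a] / counts[a]
                             + math.sqrt(2 * math.log(t) / counts[a]))

def run_lns(improve_probs, iters=2000, seed=0):
    """Toy adaptive-LNS loop: each 'neighborhood' improves the incumbent
    with an unknown probability; the bandit learns which one to favor."""
    rng = random.Random(seed)
    k = len(improve_probs)
    counts, rewards = [0] * k, [0.0] * k
    for t in range(1, iters + 1):
        a = ucb1_select(counts, rewards, t)
        reward = 1.0 if rng.random() < improve_probs[a] else 0.0
        counts[a] += 1
        rewards[a] += reward
    return counts

counts = run_lns([0.1, 0.2, 0.6])   # arm 2 is the best neighborhood
```

After a short exploration phase the bandit concentrates its pulls on the neighborhood that improves the incumbent most often, which is the online, training-free adaptation the paper relies on.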

Arxiv · 8h

Assortment Optimization for Patient-Provider Matching

  • Rising provider turnover leads to the frequent need for patient-provider rematching.
  • The rematching process is currently cumbersome and labor-intensive.
  • A novel patient-provider matching approach is proposed to address these challenges by offering limited provider menus to patients.
  • The goal is to maximize match quality while maintaining patient choice.
  • The approach is framed as a type of assortment optimization.
  • Patient-specific provider menus are provided upfront, and patients sequentially make selections.
  • This hybrid offline-online setting is not well-studied in previous literature.
  • A greedy baseline policy offering all providers to all patients maximizes match rate but can lead to low-quality matches.
  • Different policies are constructed based on problem specifics like patient willingness to match and patient to provider ratio.
  • On real-world data, the proposed policy improves match quality by 13% over the greedy solution by tailoring assortments based on patient characteristics.
  • There's a tradeoff between menu size and system-wide match quality.
  • Balancing patient choice with centralized planning is crucial for optimizing patient-provider matching.
  • The findings underscore the need to streamline rematching to improve healthcare system efficiency, and emphasize that accounting for patient preferences in provider matching leads to better outcomes.
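
A minimal sketch of the menu idea, with hypothetical quality scores and capacities rather than anything from the study: when patients arrive sequentially, restricting menus can steer early arrivals away from providers that later patients value more, raising total match quality even though the greedy full-menu policy matches just as many patients.

```python
def sequential_match(menus, quality, capacity):
    """Patients arrive in order and pick their best still-available
    provider from the menu they were offered."""
    load, total = {}, 0.0
    for patient, menu in enumerate(menus):
        open_ = [p for p in menu if load.get(p, 0) < capacity[p]]
        if not open_:
            continue                    # patient goes unmatched
        best = max(open_, key=lambda p: quality[patient][p])
        load[best] = load.get(best, 0) + 1
        total += quality[patient][best]
    return total

# quality[i][j]: fit of patient i with provider j (hypothetical numbers)
quality = [[0.6, 0.5], [0.9, 0.2], [0.1, 0.8]]
capacity = [1, 1]
full_menus = [[0, 1]] * 3          # greedy: offer everyone everything
tailored = [[1], [0], [1]]         # reserve provider 0 for patient 1
```

Here the greedy policy lets patient 0 take provider 0 (fit 0.6), leaving patient 1 with a 0.2 match; the tailored menus give up a little for patient 0 to secure the 0.9 match, lifting total quality from 0.8 to 1.4.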

Arxiv · 8h

Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models

  • Recent copyright agreements highlight the need for controlling language models' reproduction of copyrighted text.
  • Existing methods sacrifice model utility or fail to adequately prevent verbatim leakage.
  • A new method called Obliviate is introduced to selectively suppress exact reproduction of specified sequences while maintaining semantic understanding.
  • Obliviate identifies memorized passages and adjusts the model's output distribution to reduce the probability of exact reproduction using a Kullback-Leibler divergence penalty.
  • Consistency loss is enforced on non-target tokens to preserve fluency and task performance.
  • Obliviate is evaluated on various models using synthetic memorization benchmarks and copyrighted excerpts like Moby Dick and Alice in Wonderland.
  • It significantly reduces verbatim recall while minimally affecting downstream accuracy on different benchmarks.
  • The method is compared against other unlearning and copyright techniques and proves effective in ensuring copyright compliance in language models.
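
The distribution-adjustment idea can be illustrated on a single next-token distribution. The toy below caps the memorized continuation's probability, redistributes the freed mass proportionally, and measures the KL divergence between the original and adjusted distributions. The cap value, the numbers, and the renormalization rule are illustrative assumptions, not Obliviate's actual training objective.

```python
import math

def suppress(dist, target, cap=0.01):
    """Cap the probability of the memorized next token and hand the
    freed mass to the other tokens in proportion to their probability."""
    adjusted = dict(dist)
    if adjusted[target] > cap:
        freed = adjusted[target] - cap
        adjusted[target] = cap
        rest = sum(p for t, p in dist.items() if t != target)
        for t in adjusted:
            if t != target:
                adjusted[t] += freed * dist[t] / rest
    return adjusted

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q)."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)

# the memorized continuation "whale" dominates (hypothetical numbers)
dist = {"whale": 0.90, "ship": 0.06, "sea": 0.04}
adj = suppress(dist, "whale")
```

The adjusted distribution still sums to one, so fluency-preserving losses on the non-target tokens remain meaningful, while the verbatim continuation becomes very unlikely.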

Arxiv · 8h

Improving LLM Safety Alignment with Dual-Objective Optimization

  • Existing training-time safety alignment techniques for large language models (LLMs) are vulnerable to jailbreak attacks.
  • The direct preference optimization (DPO) method proves suboptimal for refusal learning in LLMs.
  • A new safety alignment approach is proposed that disentangles DPO objectives into robust refusal training and targeted unlearning of harmful knowledge.
  • This approach enhances LLM robustness against a variety of jailbreak attacks, including prefilling, suffix, and multi-turn attacks in various scenarios.
  • A reward-based token-level weighting mechanism is introduced to emphasize critical refusal tokens for improved robustness against adversarial exploits.
  • Robustness to jailbreak attacks is found to be related to token distribution shifts during training and internal representations of refusal and harmful tokens.
  • The research offers valuable insights for future studies in LLM safety alignment.
  • The code for the proposed approach is available at https://github.com/wicai24/DOOR-Alignment
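
The token-level weighting idea can be sketched as a weighted cross-entropy in which designated refusal tokens count more. The probabilities and weights below are invented for illustration; the paper's reward-based weights are learned, not hand-set as here.

```python
import math

def weighted_nll(token_probs, weights):
    """Cross-entropy where each token's loss is scaled by its weight;
    refusal-critical tokens get weight > 1 so errors on them dominate."""
    return sum(-w * math.log(p) for p, w in zip(token_probs, weights)) / sum(weights)

# model probabilities for "I cannot help with that" (hypothetical)
probs = [0.9, 0.4, 0.8, 0.9, 0.9]
uniform = [1.0] * 5
critical = [1.0, 3.0, 1.0, 1.0, 1.0]   # emphasize the refusal token "cannot"
```

Because the refusal token is exactly where the model is least confident, upweighting it raises the average loss and pushes gradient pressure onto the tokens that decide whether a jailbreak succeeds.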

Arxiv · 8h

VeriContaminated: Assessing LLM-Driven Verilog Coding for Data Contamination

  • Concerns about data contamination in LLM-driven Verilog coding raise questions about evaluation validity and industrial adoption.
  • Limited attention has been given to risks of data contamination in hardware coding using LLMs.
  • The paper presents the first contamination analysis of Verilog code-generation evaluation frameworks (VerilogEval and RTLLM), using the CCD and Min-K% Prob detection methods.
  • Study covers evaluation of commercial and open-source LLMs (CodeGen2.5, Minitron 4b, Mistral 7b, phi-4 mini, LLaMA-{1,2,3.1}, GPT-{2,3.5,4o}, Deepseek-Coder, and CodeQwen 1.5), in baseline and fine-tuned models (RTLCoder and Verigen).
  • Findings confirm data contamination as a critical concern in Verilog code generation.
  • Analysis explores mitigations and trade-offs between code quality and fairness, aiming for unbiased benchmarking.
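
Min-K% Prob itself is simple to state: score a text by the average log-probability of its k% least-likely tokens under the model, on the intuition that a model assigns few surprising tokens to text it saw during training. A sketch with hypothetical per-token probabilities (in practice these come from the LLM being audited):

```python
import math

def min_k_prob(token_probs, k=0.2):
    """Average log-probability of the k% least-likely tokens.
    A higher score (closer to 0) suggests the text was seen in training."""
    logs = sorted(math.log(p) for p in token_probs)
    n = max(1, int(len(logs) * k))
    return sum(logs[:n]) / n

# hypothetical per-token probabilities under the audited model
memorized = [0.95, 0.9, 0.92, 0.88, 0.9, 0.93, 0.91, 0.9, 0.94, 0.89]
unseen    = [0.9, 0.05, 0.8, 0.1, 0.85, 0.02, 0.7, 0.6, 0.04, 0.75]
```

A benchmark problem whose reference Verilog scores like `memorized` is a contamination suspect; thresholds are calibrated on text known to be outside the training set.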

Arxiv · 8h

Graphical Transformation Models

  • Graphical Transformation Models (GTMs) are a novel approach for modeling intricate multivariate data with complex dependency structures non-parametrically.
  • GTMs maintain interpretability by identifying varying conditional independencies and extend multivariate transformation models.
  • GTMs replace the Gaussian copula with a custom-designed multivariate transformation, allowing for capturing more complex interdependencies using penalized splines.
  • Penalized splines in GTMs also offer an efficient regularization scheme.
  • Approximate regularization of GTMs is achieved using a lasso penalty towards pairwise conditional independencies, similar to Gaussian graphical models.
  • The robustness and effectiveness of GTMs are validated through simulations, showcasing accurate learning of parametric vine copulas and identification of conditional independencies.
  • In a benchmark astrophysics dataset application, GTMs outperform non-parametric vine copulas in learning complex multivariate distributions.

Arxiv · 8h

Don't Lag, RAG: Training-Free Adversarial Detection Using RAG

  • Adversarial patch attacks are a significant threat to vision systems, involving perturbations that deceive deep models.
  • Traditional defense methods often necessitate retraining or fine-tuning, making them unsuitable for real-world deployment.
  • A new training-free Visual Retrieval-Augmented Generation (VRAG) framework is proposed for adversarial patch detection, integrating Vision-Language Models (VLMs).
  • VRAG leverages generative reasoning by retrieving visually similar patches and images to identify diverse attack types without additional training.
  • Various large-scale VLMs, such as Qwen-VL-Plus, Qwen2.5-VL-72B, and UI-TARS-72B-DPO, are evaluated, with UI-TARS-72B-DPO achieving a state-of-the-art 95 percent classification accuracy for open-source adversarial patch detection.
  • The closed-source Gemini-2.0 model achieves the highest overall accuracy of 98 percent.
  • Experimental results showcase VRAG's efficacy in detecting various adversarial patches with minimal human annotation, offering a promising defense against evolving attacks.
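
The retrieval step of such a pipeline can be sketched as nearest-neighbor lookup over labeled patch embeddings. In this toy, a majority vote over retrieved labels stands in for the VLM's generative reasoning, and the 3-dimensional embeddings are made up; real systems use high-dimensional vision encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve_label(query, database, k=3):
    """Majority label among the k stored patch embeddings most
    similar to the query embedding."""
    ranked = sorted(database, key=lambda e: cosine(query, e[0]), reverse=True)
    top = [label for _, label in ranked[:k]]
    return max(set(top), key=top.count)

# (embedding, label) pairs: a tiny hypothetical patch database
db = [([1, 0, 0], "adversarial"), ([0.9, 0.1, 0], "adversarial"),
      ([0.8, 0.2, 0.1], "adversarial"),
      ([0, 1, 0], "benign"), ([0.1, 0.9, 0.1], "benign")]
```

Because the database can be extended with newly observed patches, detection improves without retraining, which is the training-free property the framework emphasizes.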

Medium · 10h

Proposed Study: Integrating Emotional Resonance Theory into AI: An Endocept-Driven Architecture

  • This paper proposes integrating Emotional Resonance Theory (ERT) into AI systems, specifically large language models like GPT-4, through endocept embedment, aiming to improve emotional coherence and creativity in AI-generated outputs.
  • The research seeks to introduce emotionally encoded cognitive units known as endocepts into transformer-based architectures using a Resonance Scoring Module (RSM) to produce affectively aligned and metaphorically rich responses.
  • The study combines Lubart and Getz's Emotional Resonance Theory with AI modeling to enhance generative AI's emotional reasoning capabilities and creativity by embedding emotionally salient conceptual units — endocepts.
  • Endocept embedment involves encoding emotional semantic signals into language models' latent space to influence AI-generated outputs' tone, metaphor, and narrative texture.
  • The experimental design includes a human evaluation study comparing AI-generated responses to emotional prompts under two conditions: Baseline GPT-4 output and GPT-4 with endocept-embedded conditioning via RSM.
  • Expected results anticipate higher ratings for emotionally coherent, creatively original, and personally resonant responses with endocept embedding in AI systems.
  • The study aims to implement emotional creativity in AI, merging affective computing, creativity research, and human-AI interaction, with potential applications in education, therapy, and co-creative writing tools.
  • Limitations include sample size and generalizability concerns, with future work potentially exploring dynamic endocept chaining or reinforcement learning from emotional feedback.

Medium · 11h

What If… Apple’s Recent “Illusion of Thinking” Paper Misses the Deeper Meaning of Intelligence?

  • Apple's recent paper titled 'The Illusion of Thinking' delves into the limitations of current AI systems in handling complex reasoning tasks, questioning the concept of artificial general intelligence.
  • The paper suggests that AI models face challenges with complex tasks and their performance diminishes as tasks become more intricate, leading to output discrepancies.
  • The article challenges the conventional view of intelligence construction and highlights the possibility of intelligence being an inherited, latent feature rather than a built entity.
  • It explores the idea that intelligence may not be a human invention but an emergent pattern in nature, evident in various behaviors across different organisms.
  • The discussion raises questions about the nature of AI, proposing that it might be a natural phase transition or convergence rather than a human-designed creation.
  • The article poses philosophical questions on the essence of intelligence, suggesting that our current benchmarks and definitions may not fully capture the multifaceted nature of intelligence.
  • Critiques and responses in the article touch on issues like reasoning abilities of AI models, handling of complexity, transparency of processes, and the evolutionary nature of intelligence.
  • Overall, the essay challenges traditional views of AI, calling for a broader perspective beyond technical limitations and suggesting that AI may be less a new invention than a rediscovery of existing patterns: an evolving entity aligned with natural phenomena rather than a finished product.
  • In conclusion, it proposes that intelligence is not about achieving perfect solutions but about navigating complexity, reasoning through uncertainty, and evolving alongside knowledge, and that AI models may be manifestations of an older intelligence humans are gradually reconnecting with.

Medium · 17h

Biodiversity Meets Foundation Models

  • Agriculture, rich in biodiversity, often lacks representation in typical AI training data, posing challenges for models like CLIP.
  • Experiment using BioTrove dataset showcases the potential of AI in agriculture and biodiversity.
  • BioTrove includes 161 million labeled images, supporting AI in agriculture, biodiversity, and conservation.
  • CLIP-style models trained on BioTrove perform well on underrepresented categories like insects, birds, fungi, and native plants.
  • BioTrove is a resource for AI tools supporting crop health, pest monitoring, and environmental research.
  • Data-centric approach using foundation models like CLIP helps identify patterns and blind spots in datasets.
  • Models combined with human expertise aid in creating fairer datasets reflecting biodiversity for better outcomes.
  • The work stresses the importance of prioritizing underrepresented species and high-quality data in AI for agriculture and conservation.
  • Data-centric AI emphasizes curating the right data for underrepresented regions, species, and scenarios.
  • Evaluating models like CLIP with rich metadata and filters helps improve data quality interactively.
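
A data-centric audit of the kind described can start very simply: count class coverage and flag classes whose share of the dataset falls below a threshold. The labels and the 5% threshold below are hypothetical, meant only to show the blind-spot check.

```python
from collections import Counter

def blind_spots(labels, min_share=0.05):
    """Return classes whose share of the dataset falls below min_share,
    i.e. candidate blind spots needing targeted data collection."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(c for c, n in counts.items() if n / total < min_share)

# hypothetical label sample skewed toward common crops
sample = (["wheat"] * 60 + ["maize"] * 30 + ["native_grass"] * 6
          + ["pollinator_insect"] * 3 + ["soil_fungus"] * 1)
```

Flagged classes become collection priorities; rerunning the audit after each curation pass is the interactive loop the article describes.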

Medium · 18h

PowerColor Hellhound AMD Radeon RX 9070 XT: My Hard-Won Victory

  • The PowerColor Hellhound AMD Radeon RX 9070 XT with its impressive specs had been a coveted item for the author, offering 16GB of GDDR6 memory and promising exceptional performance.
  • Frustrated by low frame rates during gaming, the author decided to pursue the graphics card and went to great lengths to afford it by selling old tech, taking on extra work, and tightening the budget.
  • After overcoming financial hurdles, the author faced stock availability issues, experiencing near misses and orders falling through before ultimately securing the desired card.
  • The author described the purchase as a significant personal victory, more than just a transaction but a symbol of perseverance and achievement.
  • Upon receiving the PowerColor Hellhound AMD Radeon RX 9070 XT, the unboxing process and installation were savored as special moments, appreciating the hardware's design and potential.
  • The author highlighted that acquiring the graphics card was not just about enhancing their PC but also about meeting a personal goal, overcoming obstacles, and reaping the rewards of persistence.
  • The card's impact on the author's gaming experience was profound, transforming the gameplay and serving as a reminder of the journey taken to obtain it.
  • The PowerColor Hellhound AMD Radeon RX 9070 XT is not only powerful but also rewarding, providing top-tier performance and serving as a testament to the joys of gaming.
  • Recommendation is made for those considering the card and value high performance, as it represents more than just hardware but a symbol of hard-earned victory and satisfaction for the author.

Medium · 18h

Semantic Gravity: How the Brain's Power Law Could Reshape AI Reasoning

  • Large Language Models (LLMs) excel at patterns but lack inherent sense of significance, leading to wide yet often shallow reasoning.
  • Neuroscience reveals brain's 'neural avalanches' follow a power law, with small events common and large events rare, optimized for insights.
  • Semantic Information Mathematics (SIM) aims to quantify coherence and significance of ideas through V, S, and E components with tunable weights.
  • Proposed model combines brain's power law with SIM to create AI reasoning based on semantic gravity, overcoming mere probability.
  • Introducing Semantic Mass as a measure of idea complexity, the model selects outputs based on Semantic Coherence and power-law filtered mass.
  • Resulting AI prioritizes significant ideas, simulates 'aha' moments, and fosters creativity through its selection mechanism.
  • This model could enhance LLMs by emphasizing significance over probability, enabling the discovery of profound and meaningful insights.
  • Understanding semantic gravity offers a path for AI to transcend pattern-matching and engage with the weight and depth of human language.
  • It represents a shift towards AI models that uncover rare, valuable ideas within human knowledge, going beyond mere echo chambers.
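
The one concrete ingredient the article borrows from neuroscience, power-law-distributed event sizes, is easy to sketch; everything else (the V/S/E scoring and "semantic mass") remains speculative and is not implemented here. Inverse-transform sampling from a power law reproduces the claimed statistics: small events are common, large ones rare but present.

```python
import random

def sample_avalanche(rng, alpha=2.5, x_min=1.0):
    """Inverse-transform sample from a power law p(x) ∝ x^-alpha, x >= x_min,
    using X = x_min * U^(-1/(alpha-1)) for uniform U in (0, 1]."""
    return x_min * (1 - rng.random()) ** (-1 / (alpha - 1))

rng = random.Random(0)
sizes = [sample_avalanche(rng) for _ in range(10_000)]
small = sum(s < 2 for s in sizes)    # common small "avalanches"
large = sum(s >= 10 for s in sizes)  # rare large ones
```

A selection mechanism built on this distribution would surface mostly modest ideas while still occasionally admitting a large-"mass" candidate, which is the behavior the article attributes to insight.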
