A new method called SharpZO has been proposed for fine-tuning vision-language models without backpropagation, making such fine-tuning feasible on memory-constrained edge devices.
SharpZO uses a sharpness-aware two-stage optimization process: a global exploration stage based on evolutionary strategies, followed by a fine-grained local search stage using zeroth-order optimization.
The approach relies solely on forward passes during optimization and shows significant gains in accuracy and convergence speed over existing forward-only methods, with up to a 7% average accuracy improvement in experiments on CLIP models.
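To make the two-stage, forward-only idea concrete, here is a minimal sketch in Python. It is not the SharpZO implementation: the toy `loss` function stands in for the forward-pass loss of a frozen CLIP model, the first stage uses a plain (mu, lambda) evolution strategy rather than the paper's sharpness-aware variant, and the second stage uses a generic two-point (SPSA-style) zeroth-order gradient estimate; all hyperparameters are illustrative.

```python
import numpy as np

# Toy objective standing in for the forward-pass loss of a frozen CLIP model.
def loss(theta):
    return np.sum(theta ** 2) + 0.5 * np.sum(np.sin(3 * theta))

rng = np.random.default_rng(0)
dim = 16

# --- Stage 1: global exploration with a simple evolution strategy ---
# (SharpZO uses a sharpness-aware ES; this is a plain (mu, lambda)-ES sketch.)
pop_size, elite, sigma = 32, 8, 0.5
mean = rng.normal(size=dim)
for gen in range(50):
    candidates = mean + sigma * rng.normal(size=(pop_size, dim))
    fitness = np.array([loss(c) for c in candidates])   # forward passes only
    elites = candidates[np.argsort(fitness)[:elite]]
    mean = elites.mean(axis=0)                           # recombine the elites
    sigma *= 0.98                                        # anneal the step size

# --- Stage 2: local refinement with zeroth-order gradient estimates ---
theta, lr, mu = mean.copy(), 0.05, 1e-3
for step in range(200):
    u = rng.normal(size=dim)
    # Two-point finite-difference estimate of the gradient along direction u,
    # again using only forward evaluations of the loss.
    g = (loss(theta + mu * u) - loss(theta - mu * u)) / (2 * mu) * u
    theta -= lr * g

print("after stage 1:", loss(mean), "after stage 2:", loss(theta))
```

The design point the sketch illustrates is that both stages query the model only through loss evaluations, so no activations or gradients need to be stored, which is what makes the approach attractive for memory-constrained devices.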