Recent studies have introduced various approaches for prompt-tuning black-box vision-language models, referred to as black-box prompt-tuning (BBPT). To address the excessive query cost of existing BBPT methods, a new approach called Zeroth-order Intrinsic-dimensional Prompt-tuning (ZIP) is proposed. ZIP reduces both the problem dimensionality and the variance of zeroth-order gradient estimates, enabling efficient and robust prompt optimization. ZIP achieves state-of-the-art performance on multiple vision-language tasks, with improved few-shot accuracy and query efficiency.
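The summary does not spell out ZIP's exact update rule, but the general idea of pairing dimensionality reduction with variance-reduced zeroth-order estimation can be sketched as follows. This is a minimal illustration, not ZIP's actual design: the toy loss, the fixed random projection, and all dimensions and hyperparameters below are assumptions chosen only to make the sketch runnable.

```python
import numpy as np

# Hypothetical black-box loss: in a real BBPT setting this would be a query
# to the vision-language model with the decoded prompt; here a toy quadratic
# stands in so the sketch runs end to end.
def blackbox_loss(prompt_embedding: np.ndarray) -> float:
    target = np.linspace(-1.0, 1.0, prompt_embedding.size)
    return float(np.mean((prompt_embedding - target) ** 2))

rng = np.random.default_rng(0)

D = 512                                        # full prompt-embedding dimensionality (assumed)
d = 16                                         # reduced "intrinsic" dimensionality (assumed)
A = rng.standard_normal((D, d)) / np.sqrt(d)   # fixed random projection to the full space
z = np.zeros(d)                                # low-dimensional variable that is actually tuned

mu, lr, n_dirs, steps = 1e-2, 5e-1, 4, 200     # smoothing radius, step size, directions, iterations

for _ in range(steps):
    # Two-point zeroth-order gradient estimate taken in the low-dimensional space;
    # averaging over several random directions reduces the estimator's variance.
    grad = np.zeros(d)
    for _ in range(n_dirs):
        u = rng.standard_normal(d)
        loss_plus = blackbox_loss(A @ (z + mu * u))
        loss_minus = blackbox_loss(A @ (z - mu * u))
        grad += (loss_plus - loss_minus) / (2.0 * mu) * u
    grad /= n_dirs

    z -= lr * grad   # gradient step on the intrinsic variable only

print("final loss:", blackbox_loss(A @ z))
```

Because each query only perturbs the d-dimensional variable z rather than the full D-dimensional prompt embedding, far fewer function evaluations are needed per useful gradient direction, which is the intuition behind the query-efficiency claim above.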