Researchers propose a discrete cosine transform (DCT)-based approach for initializing and compressing the attention mechanism in Vision Transformers. The DCT-based initialization improves accuracy on classification tasks, while the compression technique exploits the DCT's ability to decorrelate image information in the frequency domain: because most of the signal energy concentrates in the low-frequency components, the higher-frequency components can be truncated with little loss. Applying this compression to the weight matrices for queries, keys, and values reduces their size and thereby decreases the computational overhead of attention.
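The truncation idea can be illustrated with a minimal sketch: transform a matrix with a 2-D DCT, keep only a low-frequency block of coefficients, and reconstruct an approximation from that smaller block. This is a hypothetical illustration of DCT-based compression in general, not the paper's exact method; the function names and the `keep` parameter are assumptions for demonstration.

```python
import numpy as np
from scipy.fft import dct, idct
from scipy.ndimage import gaussian_filter

def dct_compress(W, keep):
    """Keep only the lowest-frequency keep x keep block of the 2-D DCT
    of W (illustrative stand-in for compressing a weight matrix)."""
    # Type-II orthonormal DCT applied along both axes
    C = dct(dct(W, axis=0, norm="ortho"), axis=1, norm="ortho")
    return C[:keep, :keep]  # retain the low-frequency block

def dct_decompress(C_trunc, shape):
    """Reconstruct an approximation of the original matrix by
    zero-padding the truncated coefficients and inverting the DCT."""
    C = np.zeros(shape)
    k0, k1 = C_trunc.shape
    C[:k0, :k1] = C_trunc
    return idct(idct(C, axis=1, norm="ortho"), axis=0, norm="ortho")

rng = np.random.default_rng(0)
# A smoothed random matrix, so its energy concentrates in low frequencies
W = gaussian_filter(rng.standard_normal((64, 64)), sigma=4)

C = dct_compress(W, keep=16)        # store 16x16 instead of 64x64
W_hat = dct_decompress(C, W.shape)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.3f}")
```

With 16x smaller storage (a 16x16 coefficient block instead of the full 64x64 matrix), the reconstruction error stays small because the smoothed matrix has little high-frequency content, which is the decorrelation property the paragraph above relies on.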