Researchers propose a novel pruning method, structurally-aware adaptive pruning (SAAP), for large language models (LLMs). SAAP aims to reduce computational and memory costs while maintaining model performance, enabling deployment on resource-constrained edge devices. The method defines an adaptive importance-fusion metric to evaluate the importance of all coupled structures in an LLM. Experimental results show that SAAP outperforms several state-of-the-art baselines in both accuracy and token-generation speed.
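The summary above does not specify how SAAP's importance-fusion metric is computed, so the following is only a minimal sketch of the general idea behind fused structured-pruning scores: combine several normalized importance signals (here, hypothetically, weight magnitude and activation variance) per coupled structure, then prune the lowest-scoring fraction. The fusion rule, signal choices, and function names are illustrative assumptions, not SAAP's actual algorithm.

```python
def fused_importance(weight_norms, act_vars, alpha=0.5):
    """Fuse two importance signals into one score per coupled structure.

    weight_norms: L2 norm of each structure's weights (e.g., one value per
                  attention head or FFN channel group).
    act_vars:     variance of each structure's activations on calibration data.
    alpha:        mixing coefficient between the two normalized signals.

    NOTE: this fusion rule is an illustrative assumption, not SAAP's metric.
    """
    def normalize(xs):
        lo, hi = min(xs), max(xs)
        span = (hi - lo) or 1.0  # avoid division by zero for constant signals
        return [(x - lo) / span for x in xs]

    w = normalize(weight_norms)
    a = normalize(act_vars)
    return [alpha * wi + (1 - alpha) * ai for wi, ai in zip(w, a)]


def prune_mask(scores, sparsity):
    """Keep the top (1 - sparsity) fraction of structures by fused score."""
    k = max(1, round(len(scores) * (1 - sparsity)))
    keep = set(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])
    return [i in keep for i in range(len(scores))]


# Example: four coupled structures, prune half of them.
scores = fused_importance([4.0, 1.0, 3.0, 0.5], [2.0, 0.1, 1.5, 0.2])
mask = prune_mask(scores, sparsity=0.5)  # → [True, False, True, False]
```

Pruning "coupled structures" (e.g., an attention head together with its query, key, value, and output projections) rather than individual weights keeps the remaining network dense, which is what makes structured pruning attractive on edge hardware.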