Apple has developed a technique that can nearly triple the rate of token generation on Nvidia GPUs. The technique can help accelerate large language models (LLMs) for Apple Intelligence. Building and running machine learning models is usually slow and resource-intensive. Earlier in 2024, Apple introduced a method called Recurrent Drafter (ReDrafter) to speed up token generation.