Researchers have developed a novel text-to-speech (TTS) system called SupertonicTTS, which aims to improve scalability and efficiency in speech synthesis. The system keeps its architecture lightweight by combining a low-dimensional latent space, temporal compression, and ConvNeXt blocks. Because it aligns text and speech through cross-attention, SupertonicTTS needs neither a grapheme-to-phoneme module nor an external aligner.
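
To make the alignment idea concrete, here is a minimal sketch of scaled dot-product cross-attention in plain Python, in which each speech frame attends over the input characters. It is not the SupertonicTTS implementation: the function names (`softmax`, `cross_attention`), the 2-dimensional vectors, and the toy values are all assumptions chosen for illustration, and a real model would learn these projections rather than use hand-picked numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(speech_queries, text_keys, text_values):
    """Scaled dot-product cross-attention (illustrative, single head).

    speech_queries: one query vector per speech latent frame
    text_keys, text_values: one key and one value vector per input character
    Returns (contexts, weights), where weights[i][j] is how strongly
    speech frame i attends to character j.
    """
    scale = math.sqrt(len(text_keys[0]))
    value_dim = len(text_values[0])
    contexts, weights = [], []
    for query in speech_queries:
        # Similarity between this frame and every character.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in text_keys
        ]
        frame_weights = softmax(scores)
        # Weighted sum of character values gives the text context for this frame.
        context = [
            sum(w * value[d] for w, value in zip(frame_weights, text_values))
            for d in range(value_dim)
        ]
        weights.append(frame_weights)
        contexts.append(context)
    return contexts, weights


# Hypothetical toy inputs: 3 characters and 4 speech frames, 2-dim vectors.
text_keys = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
text_values = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
speech_queries = [[3.0, 0.0], [0.0, 3.0], [0.0, 3.0], [-3.0, 0.0]]

contexts, weights = cross_attention(speech_queries, text_keys, text_values)

# For each speech frame, the character it attends to most.
alignment = [row.index(max(row)) for row in weights]
print(alignment)
```

The attention weights act as a soft alignment between speech frames and characters, which is why no separate aligner has to supply one. In this toy setup the middle character spans two frames, the kind of duration information an external aligner would otherwise provide.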