Researchers have developed low-complexity networks for sound event detection.
The goal is to achieve performance competitive with state-of-the-art models while reducing computational requirements.
Adjustments to the convolutional models, such as changing strides and adding sequence models, improved the performance of the low-complexity models.
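To illustrate why stride choices matter for event detection, the sketch below computes how many time frames survive a stack of strided convolutions. It is a minimal illustration, not the authors' configuration: the source does not say which strides were changed or in which direction, so the input length, kernel size, padding, and stride lists are all hypothetical. The assumption is that lowering the time-axis stride in later layers preserves a finer frame resolution for a downstream sequence model.

```python
def conv_out_len(n_in: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length of a convolution along one axis (no dilation)."""
    return (n_in + 2 * padding - kernel) // stride + 1


def frames_after_stack(n_frames: int, time_strides, kernel: int = 3, padding: int = 1) -> int:
    """Apply one convolution per entry in time_strides and return the
    number of time frames left at the output of the stack."""
    for stride in time_strides:
        n_frames = conv_out_len(n_frames, kernel, stride, padding)
    return n_frames


# Hypothetical input of 1000 spectrogram frames (not a figure from the source).
baseline = frames_after_stack(1000, [2, 2, 2, 2])  # stride 2 in every layer
adjusted = frames_after_stack(1000, [2, 2, 1, 1])  # stride 1 in the last two layers

print(baseline, adjusted)
```

With these assumed settings, the adjusted stack keeps roughly four times as many output frames as the baseline, so event onsets and offsets can be localized on a finer time grid at the cost of more computation in the later layers.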
With optimized training strategies, these models achieved event detection performance comparable to state-of-the-art transformers with only 5% of the parameters.
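A parameter budget like the 5% figure can be checked with simple counting. The sketch below tallies the parameters of a small convolutional front end followed by a bidirectional GRU and compares the total with a reference model size. Every number in it is a hypothetical placeholder, including the layer widths and the reference size; none comes from the source. The GRU count assumes two bias vectors per gate, which is the convention used by PyTorch's `nn.GRU`.

```python
def conv2d_params(c_in: int, c_out: int, kh: int, kw: int, bias: bool = True) -> int:
    """Weights plus optional bias of a standard 2-D convolution."""
    return c_in * c_out * kh * kw + (c_out if bias else 0)


def gru_params(input_size: int, hidden_size: int, bidirectional: bool = False) -> int:
    """Single-layer GRU with three gates, each having input weights,
    recurrent weights, and two bias vectors."""
    per_direction = 3 * (
        input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    )
    return per_direction * (2 if bidirectional else 1)


def within_budget(n_params: int, reference_params: int, fraction: float) -> bool:
    """True if n_params is at most the given fraction of reference_params."""
    return n_params <= fraction * reference_params


# Hypothetical small model: three 3x3 conv layers, then a bidirectional GRU.
conv_channels = [(1, 32), (32, 64), (64, 128)]
total = sum(conv2d_params(c_in, c_out, 3, 3) for c_in, c_out in conv_channels)
total += gru_params(128, 128, bidirectional=True)

# Hypothetical reference size for a large transformer (an assumption).
reference = 80_000_000

print(total, total / reference, within_budget(total, reference, 0.05))
```

Counting parameters this way before training makes it easy to see which component dominates the budget; in this assumed configuration the recurrent layer holds about two thirds of the total.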