Spoken Named Entity Recognition (NER) aims to extract named entities from speech and categorize them into types such as person, location, and organization.
VietMed-NER is the first spoken NER dataset in the medical domain and the largest spoken NER dataset in the world in terms of the number of entity types.
Baseline results using various state-of-the-art pre-trained models show that pre-trained multilingual models generally outperform monolingual models on reference text and ASR output.
The dataset can also be used for text NER in the medical domain in other languages by translating the transcripts.
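
To make the NER task described above concrete, the following minimal sketch groups token-level BIO tags over a transcript into typed entity spans. The example sentence, the tag names (`LOCATION`, `DISEASE`), and the helper `bio_to_spans` are hypothetical illustrations. They are not taken from VietMed-NER, its annotation scheme, or its baseline models.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" (or an I- tag that does not match) closes the open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush an entity that runs to the end of the sequence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical transcript (lowercase, unpunctuated, as ASR output often is).
tokens = ["the", "patient", "in", "hanoi", "has", "type", "2", "diabetes"]
tags = ["O", "O", "O", "B-LOCATION", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE"]

print(bio_to_spans(tokens, tags))
# [('hanoi', 'LOCATION'), ('type 2 diabetes', 'DISEASE')]
```

In a spoken setting the same grouping step would be applied to tags predicted on ASR output rather than on reference text, so recognition errors in the transcript carry over into the extracted spans.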