<ul data-eligibleForWebStory="true">
<li>Researchers introduce CaLMQA, a dataset of 51.7K culturally specific questions across 23 languages.</li>
<li>Culturally specific questions are defined as those that refer to unique cultural concepts or have context-dependent answers.</li>
<li>Questions were collected from web forums and native speakers in both high-resource and under-resourced languages.</li>
<li>Data collection for CaLMQA was translation-free so that culturally unique questions could be included.</li>
<li>Evaluation of LLM-generated answers showed critical surface-level errors for many languages.</li>
<li>Even the best models struggled with low-resource languages, making mistakes such as answering in the wrong language or repeating text.</li>
<li>Answers to culturally specific questions had more factual errors than answers to culturally agnostic questions.</li>
<li>CaLMQA aims to support future research in cultural and multilingual long-form question answering.</li>
<li>The dataset enables exploration of culturally specific long-form question answering.</li>
<li>Examples of culturally unique questions include 'Why was the first king of Burundi called Ntare (Lion)?' in Kirundi.</li>
<li>CaLMQA addresses the lack of prior exploration of how LLMs handle culturally specific questions.</li>
<li>The study highlights challenges in generating accurate long-form answers across diverse languages and cultures.</li>
<li>Surface-level errors were prominent in LLM-generated answers to culturally specific questions.</li>
<li>The dataset was built with input in multiple languages, including under-resourced ones such as Fijian and Kirundi.</li>