Scammers in Tanzania send SMS messages that impersonate trusted individuals to trick recipients into sending money.
A Tanzanian developer created the BongoScam dataset, with over 1,500 Swahili SMS scam examples, and trained a machine learning model to detect such messages.
Each message in the dataset is labeled as either scam or trust, so the data can be used to train detection models.
The model, which pairs scikit-learn's CountVectorizer for word-count features with a Multinomial Naive Bayes classifier, achieved 98.7% accuracy.
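The CountVectorizer and Multinomial Naive Bayes combination can be sketched with scikit-learn as shown here. This is a minimal illustration, not the project's actual training script: the six messages are invented for the example and are not taken from the BongoScam dataset, and the helper name `predict_label` is hypothetical. The label names `scam` and `trust` follow the dataset's two categories.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy messages for illustration only (NOT from the BongoScam dataset).
messages = [
    "Tuma pesa kwenye namba hii haraka",       # "Send money to this number quickly"
    "Umeshinda zawadi, tuma pesa ya usajili",  # "You won a prize, send a registration fee"
    "Ile pesa nitumie kwenye namba hii",       # "Send me that money on this number"
    "Habari za asubuhi, tutaonana kesho",      # "Good morning, see you tomorrow"
    "Mkutano utaanza saa nne asubuhi",         # "The meeting starts at 10 am"
    "Nimefika nyumbani salama",                # "I got home safely"
]
labels = ["scam", "scam", "scam", "trust", "trust", "trust"]

# CountVectorizer turns each message into a vector of word counts;
# MultinomialNB learns per-class word frequencies from those counts.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)


def predict_label(text: str) -> str:
    """Return the predicted label ("scam" or "trust") for one SMS text."""
    return str(model.predict([text])[0])


print(predict_label("Tuma pesa kwenye namba hii"))
```

With only six training messages this says nothing about real-world accuracy; reproducing a figure like the reported 98.7% would require training on the full dataset and evaluating on held-out messages.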
The model is wrapped in a Flask API and accessible through a public website.
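Wrapping a text classifier in a Flask API might look roughly like the sketch below. The route name `/predict`, the JSON field names `message` and `prediction`, and the helper `build_toy_model` are all assumptions made for illustration; the project's real backend may use different names. The tiny inline model is a stand-in for the trained one and uses invented messages.

```python
from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def build_toy_model():
    """Hypothetical stand-in for loading the real trained model."""
    # Invented example messages, not from the BongoScam dataset.
    texts = [
        "Tuma pesa kwenye namba hii haraka",
        "Ile pesa nitumie kwenye namba hii",
        "Habari za asubuhi, tutaonana kesho",
        "Nimefika nyumbani salama",
    ]
    labels = ["scam", "scam", "trust", "trust"]
    pipeline = make_pipeline(CountVectorizer(), MultinomialNB())
    pipeline.fit(texts, labels)
    return pipeline


app = Flask(__name__)
model = build_toy_model()


# Assumed route and payload shape: POST /predict with {"message": "<sms text>"}.
@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        payload = {}
    message = payload.get("message", "")
    # Reject missing, non-string, or blank messages with a 400 response.
    if not isinstance(message, str) or not message.strip():
        return jsonify({"error": "field 'message' is required"}), 400
    label = str(model.predict([message])[0])
    return jsonify({"message": message, "prediction": label})
```

Saved as a module, an app like this can be served locally with Flask's development server. Loading the model once at import time, rather than inside the request handler, keeps each prediction request cheap.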
The project is split across GitHub repositories for the frontend, the backend, and an API example, along with setup instructions.
An API endpoint accepts the text of an SMS message and returns a scam prediction.
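A client call to such an endpoint could be sketched with Python's standard library as below. The base URL `http://localhost:5000`, the `/predict` path, and the `message` field are placeholders chosen for illustration, not the project's documented API; substitute the values given in the project's own API example. The function names `build_request` and `check_sms` are likewise hypothetical.

```python
import json
import urllib.request

# Placeholder values; the real host, path, and field names may differ.
BASE_URL = "http://localhost:5000"
PREDICT_PATH = "/predict"


def build_request(sms_text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying one SMS message."""
    body = json.dumps({"message": sms_text}).encode("utf-8")
    return urllib.request.Request(
        base_url + PREDICT_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_sms(sms_text: str, base_url: str = BASE_URL) -> dict:
    """Send the message to the API and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(sms_text, base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a running server at BASE_URL):
# result = check_sms("Tuma pesa kwenye namba hii")
```

Separating request construction from sending makes the payload easy to inspect or test without a live server.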
The initiative aims to enhance digital safety by open-sourcing data, making the model public, and supporting the Swahili language.
The BongoScam dataset serves as a foundation for localized machine learning solutions in Swahili, aiding in the fight against digital fraud.
The project encourages developers, linguists, security researchers, and students to contribute and collaborate in combating fraudulent activities in Tanzania.
The tool and dataset are available for testing, exploration, and contribution on various platforms.
The goal is to build Swahili-language AI solutions that protect individuals from cyber threats.