Maarallee
from 17/11/2025
With maarallee.be, we collect voice recordings to build a dataset of spoken Flemish. Participants can make short audio recordings through our Maarallee app, using their own way of speaking, with their own accent, dialect, or intonation—simply by answering an audible prompt.
These speech recordings are then converted into written text, which is subsequently corrected by volunteers. And all of this is done in a privacy‑friendly way, of course! This combination of corrected text and voice recordings can be used as training data to improve automatic speech recognition (also known as ASR). ASR systems are AI models that recognise sounds as words, and that’s not an easy task for a computer! By collaboratively building a large database of spoken Flemish, we create the conditions for developing ASR models that are specifically tailored to Flemish and that truly understand our Flemish sounds and way of speaking.
In this way, we gradually help AI understand Flemish better—from Acid and Willem Vermandere to Natalia and Belle Perez. That’s the plan! Speech technology that understands the Flemish way of speaking—whether you’re from Pelt, Ostend, Borgerhout, or anywhere else.
Aim
The project aims to build a publicly available dataset of at least 2000 minutes of spoken Flemish.
This dataset will be distributed by the Dutch Language Institute (Instituut voor de Nederlandse Taal).
How to participate
Download the app at:
Maarallee - Apps on Google Play
Give the app the necessary permissions to record audio on your device and start answering the questions provided in the app. You are free to choose which ones to answer, but the more you speak, the better.
Needed equipment
A smartphone
About funding
Funding bodies: VAIOP