To create a training corpus through crowdsourcing for the automatic identification of spoken references to Valencian municipalities in news broadcasts of the Valencian public television channel “À punt”.
To achieve this goal, we will train a model to analyse the audio of news broadcasts and identify any municipalities that are named. The corpus therefore needs to cover the different accents in which each municipality name is pronounced.
– To develop a Telegram chatbot for collaborative text-to-audio transcription of the names of Valencian municipalities.
– To implement an API to access the official names of Valencian municipalities.
– To design a user-centred chat interface.
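As a rough illustration of the second objective, the sketch below shows one way a lookup over municipality names could behave. It is not the project's actual API: the function names (`fold`, `search_names`, `json_response`), the response shape, and the four-entry sample list are all assumptions made for the example, and a real service would load the full official list instead.

```python
import json
import unicodedata

# Illustrative sample only (assumption); the real service would use the official list.
SAMPLE_NAMES = ["Alzira", "Gandia", "València", "Xàtiva"]


def fold(text):
    """Lowercase and strip diacritics so that 'valencia' matches 'València'."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_names(query, names=SAMPLE_NAMES):
    """Return the names whose folded form starts with the folded query."""
    key = fold(query)
    return [name for name in names if fold(name).startswith(key)]


def json_response(query):
    """Serialise the matches the way a JSON endpoint might return them (hypothetical shape)."""
    payload = {"query": query, "results": search_names(query)}
    return json.dumps(payload, ensure_ascii=False)
```

Matching on accent-folded text lets a client find a name without typing its diacritics, while the response always carries the name with its original spelling.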
The Telegram chatbot can be accessed at http://t.me/pronunciaelpoblebot or by searching on Telegram for “pronunciaelpoble”.
It is very simple to use:
- Download and open Telegram.
- Search for “pronunciaelpoble”.
- Press the “Start” button.
- Press or send the command /municipi.
- Record yourself reading out loud the municipality name that appears on the screen. Then send the recorded audio.
You can record as many municipalities as you wish.
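The interaction described above can be sketched as a small state machine. This is a hypothetical outline rather than the bot's real code: `handle_update`, `pending`, `corpus`, and the reply texts are invented for the example. The update fields (`message`, `chat.id`, `text`, `voice.file_id`) and the `sendMessage` method follow the Telegram Bot API, but no network call is made here; the function only returns the call a bot would send.

```python
import random


def handle_update(update, pending, corpus, names, rng=random):
    """Map one Telegram update to a (method, params) Bot API call.

    pending: chat id -> municipality name the user was asked to read.
    corpus:  list collecting one labelled record per received recording.
    """
    message = update.get("message", {})
    chat_id = message.get("chat", {}).get("id")
    text = message.get("text", "")

    if text.startswith("/municipi"):
        # Pick a name and remember which one this chat was asked to read.
        name = rng.choice(names)
        pending[chat_id] = name
        return "sendMessage", {"chat_id": chat_id, "text": f"Please read aloud: {name}"}

    if "voice" in message and chat_id in pending:
        # Label the recording with the name that was shown to the user.
        name = pending.pop(chat_id)
        corpus.append({"municipality": name, "file_id": message["voice"]["file_id"]})
        return "sendMessage", {"chat_id": chat_id, "text": f"Thanks! Saved a recording of {name}."}

    # Anything else: remind the user how to get a name.
    return "sendMessage", {"chat_id": chat_id, "text": "Send /municipi to get a municipality name."}
```

Keeping the handler free of network code makes the labelling logic easy to check on its own; a real bot would additionally download the audio behind each `file_id` and store it next to its label.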