A basic User Guide for Woefzela is available here.
Please refer to the Download page for an example corpus.
Data recording operational tips
- Do a complete and thorough TRIAL RUN of recording AND ALL DATA ANALYSIS for each language before starting a large data collection campaign! In particular, check UTF-8 compliance when data has been handled through multiple machines, channels or operating systems, and carry out the other checks described in "The design, collection and annotation of speech databases in South Africa".
- It is suggested to record only about four (4) sessions on a single 2GB SD card. This means physically removing the data from the SD card after about four sessions to free up space on the card; simply moving files to Trash or renaming the folders/files does not free the space. Be aware of how different operating systems 'delete' files when cleaning SD cards.
- A further suggestion would be to prepare additional SD cards with all the necessary corpora to facilitate easily swapping cards on recording devices between sessions, maximizing the use of the recording devices.
- Ensure that all the phones/devices are set to the correct date and time, since generated data and folders are named after the date-time stamp and a wrong setting could affect traceability.
- Confirm that all phones/devices used in a campaign have unique IMEI numbers, as the IMEI is used to uniquely identify data sources. (PS: In principle IMEIs should be unique world-wide, but confirm in advance.)
- Manage collected data carefully to avoid any duplication, which risks time-consuming checks later.
- Manage recording corpora/prompts very carefully to ensure that different versions of the corpora are not used at different times, or at minimum that the merging of their results is pre-planned.
- It is strongly recommended that the field worker carefully verifies the information required by Woefzela's "Respondent Information" screen to avoid a mismatch between the information manually collected about the respondent and that entered on this screen. However, the respondent himself/herself must read and accept/reject the actual terms and conditions in person prior to commencing. This will also avoid the situation in which a respondent browses into other respondents' profiles while appearing to be entering their own information.
- Supervise each respondent's recording session regularly by looking at the UI on the recording device to ascertain the number and types of errors made up to that point. Some respondents may record hundreds of prompts while whispering, causing 'volume too low' errors and thus poor quality data.
- Please note that volume and other QC thresholds may depend on the actual handset/device used due to the variability of recording hardware on different devices. Please confirm that the levels are sufficient for quality control.
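Two of the checks above (UTF-8 compliance and duplicated data) can be partly automated during a trial run. The following is a minimal sketch only, not part of Woefzela: the function names are hypothetical, the "*.txt" pattern for text files is an assumption, and duplicates are detected simply by comparing SHA-256 hashes of file contents.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_invalid_utf8(root, pattern="*.txt"):
    """Return files under root matching pattern that do not decode as UTF-8.

    The "*.txt" default is an assumption; adjust it to the text files
    actually used in the campaign.
    """
    bad = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad


def find_duplicate_files(root):
    """Return groups of files under root whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    # Keep only hashes shared by more than one file.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Running both functions on the folder copied from an SD card, after every transfer step, shows where in the chain of machines an encoding problem or an accidental copy was introduced.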
Additional tips
- To change the recording target (only for recording sessions) without code recompilation, create a plain text file called recordingTarget.txt and place it on the device in the folder /sdcard/Woefzela/Tracking. The number inside this file will be used as the recording target.
- Please note that the formats of the metadata output files generated during enrolment and recording differ between versions of Woefzela. Please confirm these differences if you intend to change to a different version during a recording campaign.
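When preparing several SD cards, the recordingTarget.txt file can be generated by a small script. This is a sketch under assumptions: the helper name is hypothetical, and it assumes that the file should contain only the number, written as plain ASCII digits without a trailing newline, and that the target is a positive whole number. The resulting file still has to be copied to /sdcard/Woefzela/Tracking on the device.

```python
from pathlib import Path


def write_recording_target(target, out_dir="."):
    """Write recordingTarget.txt containing only the target number.

    Assumption: the target is a positive integer and the file holds
    nothing but its digits.
    """
    if isinstance(target, bool) or not isinstance(target, int) or target <= 0:
        raise ValueError("recording target must be a positive integer")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "recordingTarget.txt"
    path.write_text(str(target), encoding="ascii")
    return path
```

For example, write_recording_target(500, "card1/Woefzela/Tracking") creates the file in a local staging folder that mirrors the layout on the card.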
Known problem with UTF-8
- Although all files associated with prompts are in UTF-8, the file names of the Fieldworker and Respondent profiles may not contain non-ASCII characters. As a consequence, please avoid using non-ASCII characters during the registration of both Fieldworker and Respondent profiles, since the file names are automatically derived from the person's first name and surname.
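A list of names can be checked against this restriction before registration. The sketch below is illustrative only and is not part of Woefzela: the function names are hypothetical and the names in the usage example are made up. It treats any character with a code point above 127 as non-ASCII.

```python
def non_ascii_characters(text):
    """Return the characters in text that fall outside the ASCII range."""
    return [ch for ch in text if ord(ch) > 127]


def check_profile_name(first_name, surname):
    """Return a list of problems; an empty list means both names are ASCII-only."""
    problems = []
    for label, value in (("first name", first_name), ("surname", surname)):
        bad = non_ascii_characters(value)
        if bad:
            problems.append(
                f"{label} {value!r} contains non-ASCII characters: {''.join(bad)}"
            )
    return problems
```

For example, check_profile_name("Zoë", "Mokoena") reports the "ë" in the first name, so the field worker can agree on an ASCII spelling with the respondent before creating the profile.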