Implementation

Data Collection

We collected the audio data with Praat. We designed 8 sentences covering negation, imperative, question, declarative, and exclamation forms, and asked each participant to speak all 8 in both a polite and an impolite tone, plus one sentence stating their name and major in a neutral tone. The neutral recordings serve as baselines for later analysis, to reduce the effect of tone differences between speakers. We collected samples from 16 participants, for a total of 272 recordings: 16 neutral, 128 polite, and 128 impolite.

Model

The model analyzes the user's input and returns a politeness score, together with the pitch and volume information, to the server. We extracted pitch and volume features from the audio using Parselmouth [1], a Python library that provides easy access to Praat [2] functionality. We then used logistic regression from Scikit-learn [3] to predict the probability of politeness from the resulting feature vectors.
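
As a rough illustration of the extraction and classification steps described here, the following is a minimal sketch rather than the project's actual code. The function names, the use of default Praat analysis settings, the sampling of intensity at each pitch frame time, and the StandardScaler step are all assumptions made for the example; labels are assumed to be 1 for polite and 0 for impolite.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def extract_pitch_and_volume(wav_path):
    """Return per-frame (pitch_hz, volume_db) pairs for one recording."""
    # Imported lazily so the classifier part below can be tried
    # without the Praat bindings installed.
    import parselmouth

    sound = parselmouth.Sound(wav_path)
    pitch = sound.to_pitch()          # default Praat pitch analysis (assumption)
    intensity = sound.to_intensity()  # default Praat intensity analysis (assumption)

    frames = []
    # Unvoiced frames come back with a pitch of 0 Hz.
    for t, f0 in zip(pitch.xs(), pitch.selected_array["frequency"]):
        # Sample the intensity contour at the time of each pitch frame.
        frames.append((float(f0), float(intensity.get_value(t))))
    return frames


def train_politeness_model(feature_vectors, labels):
    """Fit logistic regression on fixed-length feature vectors (1 = polite, 0 = impolite)."""
    # Scaling is an assumption of this sketch; the source only names logistic regression.
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(feature_vectors, labels)
    return model


def politeness_score(model, feature_vector):
    """Return the predicted probability (0 to 1) that a recording is polite."""
    return float(model.predict_proba([feature_vector])[0, 1])
```

Multiplying the returned probability by 100 would give a politeness percentage for display.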

[1] Y. Jadoul, B. Thompson, and B. de Boer, “Introducing Parselmouth: A Python interface to Praat,” Journal of Phonetics, vol. 71, pp. 1–15, 2018.

[2] P. Boersma and D. Weenink, Praat: doing phonetics by computer [Computer program]. 2018.

[3] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.

Data Processing

For data mining in the back end, we trained a logistic regression model that returns a politeness percentage. I worked on processing the data before it is fed to the model. Each audio yields more than 1000 data tuples, so I first remove every tuple with a zero pitch (frames where no pitch was detected). I then divide the remaining data into 10 chunks and calculate the mean of each chunk. After processing, each audio is represented by 10 pitch values and 10 volume values in time order.
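
The processing steps above can be sketched as follows. This is an illustrative version under stated assumptions, not the original script: the name summarize_recording, the splitting into chunks of near-equal frame count, and the ordering of the output (10 pitch means followed by 10 volume means) are choices made for the example.

```python
from statistics import mean


def summarize_recording(frames, n_chunks=10):
    """Reduce per-frame (pitch, volume) pairs to n_chunks pitch means
    followed by n_chunks volume means, preserving time order."""
    # Drop frames with a zero pitch.
    voiced = [(p, v) for p, v in frames if p > 0]
    if len(voiced) < n_chunks:
        raise ValueError("not enough voiced frames to form the chunks")

    # Chunk boundaries that split the frames as evenly as possible (assumption).
    bounds = [i * len(voiced) // n_chunks for i in range(n_chunks + 1)]

    pitch_means, volume_means = [], []
    for start, end in zip(bounds, bounds[1:]):
        chunk = voiced[start:end]
        pitch_means.append(mean([p for p, _ in chunk]))
        volume_means.append(mean([v for _, v in chunk]))

    # Output layout is an assumption: pitch values first, then volume values.
    return pitch_means + volume_means
```

Each recording then maps to a vector of 20 numbers regardless of its length, which is what lets recordings of different durations share one model.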

Frontend

We used HTML, CSS, and JavaScript with Bootstrap 4 to design the website. Most requests go through Ajax. Fonts and icons come from Google Fonts/Icons.

For the recording feature and for drawing graphs, we used Recorder.js and D3.js.

Server

Our server runs Apache2 on an Amazon Web Services EC2 instance with Ubuntu 18.04, and the backend uses Python 3 with the Flask framework. The website is served over HTTPS, which makes it safer to transmit and save users' audio.
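
To show how a Flask backend of this kind might receive a recording and answer with a score, here is a minimal sketch. The route name /analyze, the form field name audio, the JSON keys, and the analyze_recording helper are all hypothetical; the helper returns placeholder values and stands in for the feature extraction and model described earlier.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def analyze_recording(audio_bytes):
    """Hypothetical stand-in for the real analysis pipeline."""
    # Placeholder values only; a real version would extract pitch and volume
    # from the audio and run the trained logistic regression model.
    return {"politeness": 0.5, "pitch": [], "volume": []}


@app.route("/analyze", methods=["POST"])  # hypothetical route name
def analyze():
    uploaded = request.files.get("audio")  # hypothetical form field name
    if uploaded is None:
        return jsonify(error="missing 'audio' file field"), 400
    return jsonify(analyze_recording(uploaded.read()))
```

The frontend's Ajax call would post the recorded audio as multipart form data and read the politeness, pitch, and volume fields from the JSON response to draw its graphs.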
