Welcome to my Capstone Project
Hello, I am Joshua Yoon, a senior at Fort Worth Country Day. Over the summer, I learned about computational linguistics with my mentor Charles Nainan at JPL. I wanted to extend my learning into the school year by making a Capstone! To do so, I made a program that will listen to a speaker and then classify them as Texan or not. I then consolidated the project into a Kermit the Frog puppet. On this website, you can read about the process that I went through to make the project. If you wish to see the code that I put together, you can check it out here.
I would like to thank my mentor, Mrs. Robinson, and Mrs. Blan for helping throughout the project. Certainly, I would not have been able to do it without their support.
The Goal
The idea is to write a Python program that will classify a speaker's accent and dialect. Since we live in Texas, the classification will be centered on Texan dialects. The program will determine whether you have a Texan accent or not. The project should be simple enough.
The Process
The project is not simple at all.
The Written Quiz
The first idea was to make a written quiz of sorts. Participants would type out their answers to a question. Based on their answers, the program would spit out a result like "You are Texan." The problem here is that classifying free-form typed answers is very difficult. Also, written language is different from spoken language, so the quiz would probably fail spectacularly.
The Multiple Choice Quiz
The quiz could be multiple choice. Participants would choose a response from listed options, and the program would spit out an answer. This solution is much too simple and would not be an extension of my academic interests. This too would fail spectacularly.
Speech Recognition: Phonetic Transcription
The third idea was to use speech recognition to recognize people's accents. If we use audio recordings of American dialects to train a program, then this could work. If we could use a speech-to-text program that writes down the phonetics of the words, then we could viably classify accents. Instead of writing out "quiz," the program would write out "kwiz." The problem here is that I found no practical way to do that. If I wanted to build it myself, then I would need years more of experience.
Speech Recognition: Confidence Score
In my research, I discovered that IBM has a speech-to-text service. The service gives a confidence score to its transcription. You can also train the model on audio that you feed it. So the idea here is that we could train the model on Texan audio. If the confidence score is over a certain value, then it is a Texan accent. If it is beneath the value, then the accent is some other accent. To keep the project simple and not too time-consuming, the results would be kept binary. The problem here is that to customize the speech-to-text model you have to pay IBM. Also, the confidence score would be high even for people with standard American accents, because the customized model would still recognize "standard" U.S. accents as well as Texan ones. Next idea.
Naive Bayes Classification
In computational linguistics, there is a method called the Naive Bayes classifier. It is named after the statistician Thomas Bayes. It takes in labeled data you give it and looks for patterns in the data. Then from there, it can classify new data. I will take transcripts of dialect recordings. Then, I will train the classifier with the transcript data. Afterwards, I will program a speech recognizer to transcribe your sentence. The program then compares the transcribed sentence against the patterns it learned from the training data.
Now that I have decided on my method, all that is left to do is to execute, which is much easier said than done. The first task is to find the data. For this purpose, I used the International Dialects of English Archive. They have a plethora of audio recordings and transcriptions of many different English speakers. I went through their Texas selection and chose sentences that seemed to me the most representative of the dialect. Then, I did the same with California. I used California as the "standard" kind of American dialect. Using this data, I trained the classification model.
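To make the classification step more concrete, here is a minimal sketch of a Naive Bayes text classifier written from scratch in Python. It is not the code from this project. The class name, the tokenizer, and the add-one smoothing are assumptions chosen for illustration, and the four training sentences are made-up placeholders, not transcripts from the International Dialects of English Archive.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,!?;:\"'"


def tokenize(sentence):
    """Lowercase a sentence and split it into words, trimming edge punctuation."""
    words = (w.strip(PUNCTUATION) for w in sentence.lower().split())
    return [w for w in words if w]


class NaiveBayesTextClassifier:
    """Hypothetical bag-of-words Naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.sentence_counts = Counter()         # label -> number of sentences
        self.vocabulary = set()

    def train(self, labeled_sentences):
        """Count how often each word appears under each label."""
        for sentence, label in labeled_sentences:
            self.sentence_counts[label] += 1
            for word in tokenize(sentence):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def log_scores(self, sentence):
        """Return log P(label) + sum of log P(word | label) for every label."""
        total_sentences = sum(self.sentence_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(sentence)
        scores = {}
        for label, n_sentences in self.sentence_counts.items():
            score = math.log(n_sentences / total_sentences)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def classify(self, sentence):
        """Pick the label with the highest score."""
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Made-up placeholder sentences for illustration only.
PLACEHOLDER_DATA = [
    ("y'all are fixin' to head on down to the rodeo", "Texas"),
    ("we might could go see y'all later", "Texas"),
    ("dude the traffic on the freeway was hella bad", "California"),
    ("we like went to the beach and surfed", "California"),
]

classifier = NaiveBayesTextClassifier()
classifier.train(PLACEHOLDER_DATA)
print(classifier.classify("y'all come back now"))
```

With so few sentences the classifier mostly keys on individual words such as "y'all," so a real model would need far more transcript data per dialect.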
The next step is speech recognition. Using Google's free speech recognition service, I put together a simple speech recognizer. Then, I put the two programs together. The computer can now listen to you talk and then give you an answer, but I was not completely satisfied with this result.
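The listen-then-classify step could look roughly like the sketch below. It assumes the third-party `SpeechRecognition` package (imported as `speech_recognition`) with PyAudio for microphone access, which may not be what this project actually used. The function names `transcribe_from_microphone` and `listen_and_classify` are hypothetical, and the keyword-based classifier at the bottom is only a toy stand-in for a trained model.

```python
def transcribe_from_microphone():
    """Record one utterance and return Google's transcription, or None on failure."""
    # Imported here so the rest of the sketch runs without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # requires PyAudio
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # the speech could not be understood
    except sr.RequestError:
        return None  # the recognition service could not be reached


def listen_and_classify(classify, transcribe=transcribe_from_microphone):
    """Transcribe one sentence, then hand the text to the classifier.

    `classify` is any function that maps a sentence to a label.
    Returns the label, or None when nothing was transcribed.
    """
    text = transcribe()
    if not text:
        return None
    return classify(text)


def toy_classifier(text):
    """Toy stand-in for a trained model (assumption, for demonstration only)."""
    return "Texas" if "y'all" in text.lower() else "California"


# Demonstration with a fake transcriber, so no microphone is needed.
print(listen_and_classify(toy_classifier, transcribe=lambda: "Howdy y'all"))
```

Passing the transcriber in as a parameter keeps the two programs separate, so the classifier can be tried out on typed sentences before a microphone is involved.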
Kermit the Frog
I thought that having people talk to a computer screen would be pretty boring and uneventful, so I purchased a Kermit the Frog puppet. Now, Kermit listens to you talk. Instead of computer text on a screen giving you feedback, Kermit the Frog now talks to you. Yes, I voiced Kermit.
Also pictured are the wireless speaker and the microphone that I used for the project.
Works Cited
"International Dialects of English Archive." International Dialects of English Archive, www.dialectsarchive.com/.