If you have a phone, you've almost certainly texted before. Texting is a revolutionary way to communicate, and it has become a useful, integral skill. Some people are simply faster and more accurate at typing on their phones than others. The Texting Speedometer estimates how fast you can type on a mobile phone by using accelerometer readings.
With cell phones now an integral part of daily life, many people have become accustomed to typing on a smartphone keyboard. Just as typing on a computer keyboard is a fundamental skill [1], typing on a phone is a useful one. It allows for communication with people, easy online searching, and much more. Touch screen keyboards are "one of the most prominent among interactive devices" [2], likely because they are so portable and easy to handle. But typing on a phone is not a skill that is often measured or formally taught; schools do not include it in their core curriculum the way they might include typing on a computer. People who can type quickly on their phone have therefore most likely built up the skill through texting a lot.
The purpose of this project is to help college students learn how fast they can text, and compare their speeds with others to see whether they are avid texters or not.
Additionally, studying touch screen keyboard inputs has a more serious application. Keystroke logging is possible on phones through the use of motion sensors; Cai and Chen were able to correctly infer more than 70% of the typed keys in their study purely by analyzing the phone's motion sensor data [3]. By understanding how accelerometer readings relate to typing on a phone, better safeguards can be developed against dangerous uses of keystroke logging.
Sensors
Use the Phyphox app to collect accelerometer data
Fourier Transforms
Apply Fourier transforms to the data and find the dominant (peak) frequency
MATLAB
Plot accelerations, Fourier Transforms, and data points
Statistical Analysis
Calculate the correlation between most common frequency and WPM
Initial Model
When we first began this project, we intended to analyze the typing speeds for people on their laptops. A phone was strapped to a person's forearm as they did a 30 second typing test on MonkeyType. The phone collected acceleration data in the x, y, and z directions with the Phyphox app.
The accelerometer data we gathered was converted into x, y, and z FFT plots. The peak value of each graph, representing the most common frequency along that axis, was plotted against the person's words per minute (WPM) as measured by MonkeyType. This generated three scatter plots (x, y, and z) of everyone's most common frequency vs. WPM. But instead of seeing a clear correlation in at least one of the scatter plots, we found just a random cloud of points (Figure 1).
We realized that people's typing habits varied significantly on a laptop keyboard; some would move their hands a lot, while others with larger hands would only move their fingers. All of this variation caused inconsistent sensor data that was nearly impossible to fix with any kind of calibration. We decided to modify our model to focus on typing speeds on a phone instead, with the idea that there would be far less variation in movement.
A phone was attached to a person's forearm to collect accelerometer data as they performed a typing test on their laptop.
Figure 1: Based on the Fourier Transforms, the most prevalent x, y, and z frequency for each typing test was plotted against the Words Per Minute. These three plots were generated with tests using the computer keyboard, so none of them appeared to have a strong correlation.
Final Model
In the final model, a person took a 30 second MonkeyType test in their phone's browser while holding a second phone right behind theirs to collect the acceleration data using Phyphox. We expected to pick up a vibration from each key the user pressed. Because the user presses into the screen to type (that is, exerts a force along the z-axis), we found that the acceleration data along the z-axis gave the clearest and most consistent graph. Along the x- and y-axes, there was more user movement that reflected each person's unique typing habits. We chose not to consider the x- and y-axis data because of all the unknown noise that we couldn't calibrate across all users.
Further Adjustments
In addition to pivoting to the phone, we realized that we should be using the raw typing speed rather than the typing speed that MonkeyType calculated. This calculated speed accounted for accuracy (lower accuracy = lower WPM), which was impossible to account for when analyzing the pure acceleration data. Before we made this change to our model, there was very little correlation between the most common z-acceleration frequency and the words per minute; the points looked like a scattered cloud. After we made this change and re-recorded the raw WPM for everybody, the plot clearly followed a linear trend.
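The gap between the two speeds can be illustrated with a small sketch. It assumes the common convention of five characters per word and simply counts all typed characters for the raw speed and only correct characters for the adjusted speed; MonkeyType's exact rules may differ, and the function name and character counts are hypothetical.

```python
def words_per_minute(char_count, seconds, chars_per_word=5):
    """Convert a character count into words per minute.

    Assumption: one "word" is five characters, a common typing-test convention.
    """
    return (char_count / chars_per_word) / (seconds / 60)

# Hypothetical 30 second test: 150 characters typed, 135 of them correct.
raw_wpm = words_per_minute(150, 30)       # every keypress counts
adjusted_wpm = words_per_minute(135, 30)  # only correct characters count

print(raw_wpm, adjusted_wpm)
```

Every keypress shakes the phone whether or not it was correct, which is why the raw figure is the more natural one to compare against accelerometer data.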
Some of our old data points actually fit our regression line very well. We chose to include them in our final analysis because we verified that they had a high accuracy (>95%), which meant that their calculated WPM would be very similar to their raw WPM.
After a person took a 30 second typing test on their phone, the accelerometer data along the x, y, and z axes were plotted (Figure 2). Each plot was cropped to only 25 seconds, as it was difficult to crop exactly 30 seconds without accidentally including movement before or after the test. In Figure 2, the accelerations along the z-axis generally have higher peaks, which makes the underlying frequencies easier to identify and analyze.
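The cropping step might look something like the sketch below. The function name, the 10 Hz sampling rate, and the 2.5 second start offset are all assumptions made for illustration; the report does not state how the window was positioned.

```python
def crop_window(times, values, start, duration=25.0):
    """Keep only samples whose timestamp lies in [start, start + duration)."""
    end = start + duration
    kept = [(t, v) for t, v in zip(times, values) if start <= t < end]
    return [t for t, _ in kept], [v for _, v in kept]

# Hypothetical 30 s recording sampled at 10 Hz (placeholder zeros for the z-axis).
times = [i / 10 for i in range(300)]
z_accel = [0.0] * 300

# Skip the first 2.5 s so that picking up the phone is not included.
t_crop, z_crop = crop_window(times, z_accel, start=2.5)
print(len(t_crop), t_crop[0], t_crop[-1])
```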
Figure 2: x, y, and z accelerometer data was plotted for a 30 second phone typing test. The data was cropped to only 25 seconds to allow some buffer for precision.
After trimming down the range of accelerometer data points, we applied a Fast Fourier Transform (FFT) to each axis. Plotting the FFT values directly generated noisy, difficult-to-read graphs because of the sheer number of data points, so we instead grouped every 50 adjacent points together, took their average, and plotted the averages as bars (Figure 3).
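A rough sketch of the transform and the 50-point binning is shown here. It uses a plain discrete Fourier transform from the Python standard library in place of an FFT (same values, just slower), and it subtracts the mean first so a constant offset such as gravity does not dominate; that mean removal, the function names, and the synthetic 4 Hz signal are assumptions, not details from the original analysis.

```python
import cmath
import math

def dft_magnitudes(samples, sample_rate):
    """Magnitude spectrum for frequencies 0..sample_rate/2 (plain DFT, O(n^2))."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # assumption: drop the constant offset
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        freqs.append(k * sample_rate / n)
        mags.append(abs(coeff))
    return freqs, mags

def bin_average(values, bin_size=50):
    """Average each group of `bin_size` adjacent values; the last group may be shorter."""
    groups = [values[i:i + bin_size] for i in range(0, len(values), bin_size)]
    return [sum(g) / len(g) for g in groups]

# Synthetic stand-in for z-axis data: a 4 Hz vibration sampled at 50 Hz for 4 s.
sample_rate = 50.0
z_accel = [math.sin(2 * math.pi * 4.0 * i / sample_rate) for i in range(200)]

freqs, mags = dft_magnitudes(z_accel, sample_rate)
binned = bin_average(mags, bin_size=50)      # values that would be drawn as bars
peak_freq = freqs[mags.index(max(mags))]     # strongest frequency component
print(peak_freq, len(binned))
```

For a real recording, a library FFT (for example MATLAB's `fft`) would be the practical choice; the quadratic-time loop here is only meant to keep the example dependency-free.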
In each of the FFT plots in Figure 3, there is a distinct peak, indicating that the person's movements were rather consistent across the test. Additionally, the z-axis FFT was the most slender, indicating that the vibrations experienced along that axis were far more consistent than in other directions. Because of this, we decided to focus on the z-axis data points to analyze typing speeds.
Figure 3: A Fast Fourier Transform was applied to the x, y, and z accelerometer data (Figure 2). For better viewing, these transforms were plotted after binning them into groups of 50 adjacent points.
The FFT plots allowed us to identify the peak, which represented the most frequent frequency (hehe) that occurred during the typing test. We took the peak frequencies along only the z-axis and plotted them against each person's WPM to analyze correlation. Using MATLAB's polyfit and polyval functions, we generated a line of best fit.
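The line of best fit can be sketched as an ordinary least-squares fit, which is what a first-degree `polyfit` computes. The Python version below is a stand-in for the MATLAB calls, with a Pearson correlation coefficient added as one possible way to quantify the trend; the function names and the four data points are made up for illustration and are not the study's measurements.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept, like a degree-1 polyfit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def evaluate_line(slope, intercept, x):
    """Evaluate the fitted line at x, like polyval."""
    return slope * x + intercept

def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Made-up points (WPM, peak z-axis frequency) lying exactly on a line.
wpm = [40, 50, 60, 70]
peak_freqs = [5.0, 6.0, 7.0, 8.0]

slope, intercept = fit_line(wpm, peak_freqs)
r = pearson_r(wpm, peak_freqs)
print(slope, intercept, r, evaluate_line(slope, intercept, 55))
```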
Figure 4 represents the data points we gathered before recognizing that we needed to use the raw WPM, and Figure 5 represents the data points after we implemented the raw WPM. The points form a much clearer trend line in Figure 5, verifying that using the raw WPM was the right way to go. In the plot in Figure 5, there are some points that were brought over from the "calculated WPM" plot in Figure 4. This is because their typing accuracies were near 100%, so their raw WPM was almost exactly their calculated WPM.
Figure 4: A scatter plot of the most common frequency vs. WPM. The points are spread far apart, suggesting our model here is not very accurate.
y = 0.072x + 3.16, where y represents the most common typing frequency, and x represents the calculated (accuracy-adjusted) WPM reported by MonkeyType.
Figure 5: A scatter plot of the most common frequency vs. WPM. The points fall close to the line, suggesting our model here is more accurate.
y = 0.117x + 0.1, where y represents the most common typing frequency, and x represents the raw WPM measured by MonkeyType.
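Because the Figure 5 line maps raw WPM to frequency, estimating a typing speed from a measured peak frequency means solving the equation for x. A minimal sketch, using the slope and intercept quoted above; the function name, the 6.5 input value, and the assumption that frequencies are in Hz are illustrative only.

```python
SLOPE = 0.117      # from the Figure 5 line of best fit
INTERCEPT = 0.1    # from the Figure 5 line of best fit

def estimate_wpm(peak_frequency):
    """Invert y = SLOPE * x + INTERCEPT to recover the raw WPM x."""
    return (peak_frequency - INTERCEPT) / SLOPE

# Hypothetical measured peak frequency (assumed to be in Hz).
print(round(estimate_wpm(6.5), 1))
```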
Using the model and its line of best fit, we plugged in a few test data sets that we gathered and tried to estimate the WPM of the typing test each one came from. The results are shown below:
Actual Typing Speed (WPM)    Calculated Typing Speed (WPM)    Percent Error
64                           54.7                             14.5%
62                           52.6                             15.2%
52                           49.4                             5.0%
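The percentages in these results match the relative difference between the actual and calculated speeds, which the short check below reproduces from the three pairs of values listed. Only the function name is an addition.

```python
def percent_error(actual, calculated):
    """Relative difference between actual and calculated speed, in percent."""
    return abs(actual - calculated) / actual * 100

actual_wpm = [64, 62, 52]
calculated_wpm = [54.7, 52.6, 49.4]

errors = [round(percent_error(a, c), 1) for a, c in zip(actual_wpm, calculated_wpm)]
differences = [round(a - c, 1) for a, c in zip(actual_wpm, calculated_wpm)]
print(errors)
print(differences)
```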
Given the issues with our initial data set, we are more than satisfied with the accuracy of our results. The largest difference between actual and calculated speed is less than 10 WPM. Hopefully, this means that with a larger set of test data, similar or better accuracy could be achieved. We consider this to be an acceptable level of error for the first version of our typing speed estimator. However, for continued development, a higher accuracy (±5 WPM) would be desired.
There were some limitations to our model and measurement system that contributed to a relative inaccuracy with our calculated correlation.
First, we used the most prevalent frequency (mode) as our measurement of typing speed, rather than an average of all the frequencies. MonkeyType uses the average frequency within the 30 seconds to calculate the words per minute. This difference caused slight errors in our measurement system, but was most significant when the person typing had a relatively inconsistent typing style (like if they backspaced to correct all their mistakes). In these scenarios, there was not necessarily a prominent typing frequency, which was visually signified by a more stout Fourier Transform plot. Therefore, our use of the mode was not always accurate in determining the typing speed.
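The difference between a "mode" and an "average" frequency can be made concrete with a small comparison. Here the magnitude-weighted mean frequency (sometimes called the spectral centroid) is used as one possible reading of "an average of all the frequencies"; that choice, the function names, and the two toy spectra are assumptions for illustration.

```python
def peak_frequency(freqs, mags):
    """Frequency with the largest magnitude (the 'mode' used in the model)."""
    return freqs[mags.index(max(mags))]

def mean_frequency(freqs, mags):
    """Magnitude-weighted average frequency (spectral centroid)."""
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)

freqs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Slender spectrum: a consistent typist, peak and mean agree.
slender = [1.0, 1.0, 10.0, 1.0, 1.0]
# Stout, lopsided spectrum: an inconsistent typist, peak and mean disagree.
stout = [1.0, 8.0, 6.0, 5.0, 4.0]

print(peak_frequency(freqs, slender), mean_frequency(freqs, slender))
print(peak_frequency(freqs, stout), mean_frequency(freqs, stout))
```

In the second spectrum the peak sits at 2.0 while the weighted mean is 3.125, the kind of gap that would show up as estimation error when the peak alone is used.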
Another limitation of our study was that our data set is relatively small; in our final model, we had fewer than 20 data points to work with. Had we acquired more data, our correlation analysis would have been more reliable.
To improve upon our study, we would aim to collect a wider data set, preferably at least 40 individuals with 3-5 trials each. This would give us a more accurate trend line that applies to more individuals, while the repeated trials would account for variance in any single typing test. We would likely take the median of each person's trials to reduce the effect of outliers and use that value in our data set. We would also make sure to use only the raw typing score rather than the accuracy-adjusted one, and then repeat the analysis described above.
To collect the cleanest data possible, it would be ideal to run the typing test and the accelerometer recording on a single phone rather than relying on the test subject holding two at once. We would also like to do this on the test subject's own phone so they can type as comfortably as possible. A real implementation would need to work on a wide variety of phones anyway, so testing on subjects' own devices would prepare for that.
It would also be intriguing to investigate how our results change on other types of devices, not only smartphones. It could be interesting to revisit the laptop experiment now that we have more experience and try to obtain usable results, and hopefully a usable trend line, from it.
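Taking the median across a person's trials could be done along these lines. The participant labels and peak-frequency values are invented purely to show the mechanics, including how a single outlier trial is ignored.

```python
from statistics import median

# Hypothetical peak frequencies from repeated trials per participant.
trials = {
    "P1": [5.0, 5.4, 8.0],        # the 8.0 trial is an outlier
    "P2": [6.1, 5.9, 6.0, 6.2],
}

# One representative value per participant for the scatter plot.
median_peak = {person: median(values) for person, values in trials.items()}
print(median_peak)
```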
[1] SkillsYouNeed. “Stop Pecking at Your Keyboard: The Many Benefits of Learning How to Type Fast and Accurately.” SkillsYouNeed, 2011-2022, www.skillsyouneed.com/rhubarb/benefits-of-typing.html.
[2] Abdusselam, Mustafa Serkan. “A Research on Students' Preferences for Mobile On-Screen Keyboard.” Shanlax International Journal of Education, vol. 9, no. 2, Mar. 2021, https://files.eric.ed.gov/fulltext/EJ1287537.pdf.
[3] Cai, Liang, and Hao Chen. “TouchLogger: Inferring Keystrokes on Touch Screen from ... - Usenix.” TouchLogger, UC Davis, https://www.usenix.org/legacy/events/hotsec11/tech/final_files/Cai.pdf.