The most challenging parts of this project were cleaning the tweets of hyperlinks and Unicode characters, organizing them by date, and figuring out how to iterate through them properly. To address these, I had to teach myself how to use datetime and NLTK's RegexpTokenizer. I also had to figure out how to perform an OLS regression in Python using plotly.express. Unfortunately, I couldn’t figure out how to label the axes on my OLS graphs, which is why I presented two graphs for each year instead of just the graph with the regression line.
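To give a sense of what the cleaning and date-organizing steps involve, here is a minimal sketch using only the standard library. It is not the code from the project: the function names, the sample tweets, and the timestamp format are all assumptions made for illustration, and the `\w+` pattern is only meant to approximate what a word-level RegexpTokenizer would do.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed patterns: links start with http(s):// or www., tokens are runs of word characters.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
WORD_PATTERN = re.compile(r"\w+")


def clean_tweet(text):
    """Remove hyperlinks and non-ASCII characters, then lowercase and tokenize."""
    without_links = URL_PATTERN.sub(" ", text)
    # Dropping non-ASCII characters removes emoji, but also accented letters.
    ascii_only = without_links.encode("ascii", errors="ignore").decode("ascii")
    return WORD_PATTERN.findall(ascii_only.lower())


def group_by_date(tweets, date_format="%Y-%m-%d %H:%M:%S"):
    """Group cleaned tweets by calendar day, returned in chronological order.

    `tweets` is an iterable of (timestamp_string, text) pairs; the default
    timestamp format is an assumption and would need to match the real data.
    """
    grouped = defaultdict(list)
    for timestamp, text in tweets:
        day = datetime.strptime(timestamp, date_format).date()
        grouped[day].append(clean_tweet(text))
    return dict(sorted(grouped.items()))


# Hypothetical sample data, not taken from the project.
sample_tweets = [
    ("2020-03-02 09:15:00", "Markets are up today \U0001F4C8 https://example.com/a"),
    ("2020-03-01 18:30:00", "Caf\u00e9 is closed, see www.example.com"),
    ("2020-03-02 21:05:00", "No links here"),
]

tweets_by_day = group_by_date(sample_tweets)
for day, token_lists in tweets_by_day.items():
    print(day, token_lists)
```

Sorting the grouped dictionary by its date keys makes iterating through the tweets in chronological order a plain `for` loop. One trade-off of the ASCII-only step is that it also strips accented letters, so "Café" becomes "caf"; a gentler approach would be needed for non-English tweets.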
There are libraries created specifically to handle tweet data, so if I had to do this project again, I would use one of them.