This analysis is based on the problem raised by Kaggle here.
Solved with Mathematica 11 (code here)
A sentiment analysis job about the problems of each major U.S. airline. Twitter data was scraped from February of 2015 and contributors were asked to first classify positive, negative, and neutral tweets, followed by categorizing negative reasons (such as "late flight" or "rude service").
The dataset contains 14640 tweets and 15 variables (columns).
We can see from the bar plot and the pie chart that most tweets contain negative sentiment.
Most of the tweets are directed towards United Airlines, followed by American and US Airways. Very few tweets are targeted towards Virgin America.
The second plot is more informative, in the sense that it allows us to see the proportion of negative sentiment tweets per airline. We see that tweets directed at American, United and US Airways are mostly negative. In contrast, tweets directed towards Delta, Southwest and Virgin contain a good proportion of neutral and positive sentiment tweets.
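The per-airline proportions behind a plot like this can be computed with a simple grouped count. The following is a minimal Python sketch, not the Mathematica code used for this analysis. The rows are made up for illustration, and the function name sentiment_shares is hypothetical.

```python
from collections import Counter, defaultdict

# Made-up (airline, sentiment) pairs standing in for the real rows.
# These values are illustrative only and are NOT taken from the dataset.
rows = [
    ("United", "negative"), ("United", "negative"),
    ("United", "neutral"), ("United", "positive"),
    ("Delta", "negative"), ("Delta", "neutral"),
    ("Delta", "positive"), ("Delta", "positive"),
]

def sentiment_shares(rows):
    """Return {airline: {sentiment: share of that airline's tweets}}."""
    counts = defaultdict(Counter)
    for airline, sentiment in rows:
        counts[airline][sentiment] += 1
    shares = {}
    for airline, counter in counts.items():
        total = sum(counter.values())
        shares[airline] = {s: n / total for s, n in counter.items()}
    return shares

shares = sentiment_shares(rows)
for airline, dist in sorted(shares.items()):
    print(airline, {s: round(p, 2) for s, p in sorted(dist.items())})
```

Normalizing within each airline, rather than over all tweets, is what makes airlines with very different tweet volumes comparable.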
We see that negative sentiment is mostly elicited by Customer Service Issues (presumably bad customer service), followed by Late Flights.
From the plots we can see that for American Airlines, negative sentiment is elicited mostly by Customer Service related Issues, and not so much by Late Flights. We could speculate that American flights depart mostly on time. The same seems to be true for Virgin and Southwest. Virgin seems to have a sub-optimal booking system, as booking problems are the second most common reason for bad sentiment in its tweets.
US Airways and United receive many complaints about Customer Service Issues, followed closely by Late Flights.
In contrast, for Delta most of the complaints are due to late flights. We could then speculate that Delta has problems getting its flights to depart on time, yet its customer service is perhaps better.
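A ranking of complaint reasons per airline, like the one read off the plots above, can be sketched as follows. This is an illustrative Python example under stated assumptions, not the original Mathematica code. The complaint rows are invented, and top_reasons is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Made-up (airline, negative reason) pairs, for illustration only.
# The real data has many more rows and more reason categories.
complaints = [
    ("Delta", "Late Flight"), ("Delta", "Late Flight"),
    ("Delta", "Customer Service Issue"),
    ("American", "Customer Service Issue"),
    ("American", "Customer Service Issue"),
    ("American", "Late Flight"),
    ("American", "Customer Service Issue"),
]

def top_reasons(complaints, n=2):
    """Return {airline: [(reason, count), ...]} with the n most frequent reasons."""
    by_airline = defaultdict(Counter)
    for airline, reason in complaints:
        by_airline[airline][reason] += 1
    # most_common sorts reasons by descending count.
    return {airline: counter.most_common(n) for airline, counter in by_airline.items()}

ranking = top_reasons(complaints)
for airline, reasons in sorted(ranking.items()):
    print(airline, reasons)
```

Comparing the first and second entry for each airline shows whether one reason dominates or two reasons are nearly tied.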
Last four retweeted tweets in the data:
'@USAirways 5 hr flight delay and a delay when we land . Is that even real life ? Get me off this plane , I wanna go home (3 heel clicks)'
"@USAirways of course never again tho . Thanks for tweetin ur concern but not Doin anythin to fix what happened. I'll choose wiser next time"
"STOP. USING.THIS.WORD. IF. YOU'RE. A. COMPANY. RT @JetBlue: Our fleet's on fleek. http://t.co/Fd2TNYcTrB"
'@USAirways with this livery back in the day. http://t.co/EEqWVAMmiy'
The first two tweets show clear anger directed at US Airways. According to the first tweet there was a substantial flight delay, whereas the second tweet does not make the reason clear. The third tweet is listed as directed towards Delta, although its text mentions @JetBlue, and it is not clear what the message is. The curator of the dataset identified this tweet as negative, perhaps after following the attached link for more information. I can't tell what the sentiment is from those lines. Finally, the fourth tweet is also targeted at US Airways, and its sentiment is neutral according to the curator of the dataset. I can't say what it was referring to from those lines.
Most tweets have negative sentiment (> 60%).
Most tweets are targeted towards United, followed by American and US Airways.
Virgin receives very few tweets.
Most of the tweets targeted towards American, United and US Airways contain negative sentiment.
Tweets targeted at Delta, Virgin and Southwest contain roughly similar proportions of negative, neutral and positive sentiment.
Main reasons for negative sentiment are Customer Service Issues and Late Flights.
Negative sentiment tweets towards Delta are based mostly on late flights and, unlike for the rest of the airlines, not so much on Customer Service Issues.
Most tweets are not re-tweeted.
Most tweets come from US & Canada time zones.
Most tweets come from the United States.