Sentiment Analysis with Viralheat's API

Viralheat's Sentiment Analysis API

The third approach to performing sentiment analysis in R relies on the Sentiment Analysis API from Viralheat, which infers the sentiment of a given piece of text. Jeff Allen shows a nice example of how to use this API from R in his Sermon-Sentiment-Analysis code.

Although this API only accepts up to 360 characters, this is no problem since a tweet's content is limited to 140 characters.
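If you ever feed the API text from a source other than tweets, a small guard keeps each request within the limit. This is a minimal sketch: the helper name fit_to_limit is made up for illustration, and it assumes the 360-character limit applies to the raw text before URL encoding.

```r
# Hypothetical helper (not part of the original code): cut a text down to
# the character limit mentioned above before sending it to the API.
max_chars <- 360L

fit_to_limit <- function(some_text, limit = max_chars) {
  # substr() returns the text unchanged when it is already short enough
  substr(some_text, 1L, limit)
}

long_text <- strrep("a", 500)
nchar(fit_to_limit(long_text))
```

Tweets pass through unchanged, so the guard only matters for longer inputs.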

In order to use the Sentiment Analysis API, you need an API key, which you can get by simply registering for a free developer account at Viralheat. The key should look something like this (you need to get your own):

JosYnzSHwhszhjhBgABC
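The later steps pass the key around in a variable called mykey. One way to define it without pasting the key into your script is to read it from an environment variable. This is only a sketch: the variable name VIRALHEAT_API_KEY and the helper get_api_key are assumptions made for this example, not something Viralheat prescribes.

```r
# Hypothetical helper: read the API key from an environment variable.
# VIRALHEAT_API_KEY is a name chosen for this sketch; set it yourself,
# for example with a line in ~/.Renviron.
get_api_key <- function(var = "VIRALHEAT_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Once the variable is set:
# mykey <- get_api_key()
```

Assigning the key directly (mykey = "your-own-key") works just as well for a quick experiment.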

Example comparing "mcdonalds" vs "burgerking"

Step 1: load required packages

    # load packages
    library(twitteR)
    library(RCurl)
    library(RJSONIO)
    library(stringr)

Step 2: Let's create a function to get the sentiment of a given tweet text

    # getSentiment function: query Viralheat and return the mood with a signed score
    getSentiment <- function (some_text, key)
    {
      # URL-encode the text
      code_text = URLencode(some_text)
      # temporarily turn the encoded spaces back into plain spaces
      code_text = str_replace_all(code_text, "%20", " ")
      # get rid of the weird characters that break the API
      code_text = str_replace_all(code_text, "%\\d\\d", "")
      # convert the spaces back to their URL-encoded form
      code_text = str_replace_all(code_text, " ", "%20")
      # viralheat sentiment url
      vh_url = "http://www.viralheat.com/api/sentiment/review.json?text="
      # send query and get an answer from viralheat
      vh_answer = getURL(paste(vh_url, code_text, "&api_key=", key, sep=""))
      # extract elements in the answer (i.e. prob, text, mood)
      js = fromJSON(vh_answer, asText=TRUE)
      # get mood probability
      score = js$prob
      # positive, negative or neutral?
      if (js$mood != "positive")
      {
        if (js$mood == "negative") {
          score = -1 * score
        } else {
          # neutral
          score = 0
        }
      }
      return(list(mood=js$mood, score=score))
    }
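To see what the encoding steps and the sign rule in getSentiment do without calling the API, here is a small offline sketch in base R. The helper names encode_for_api and signed_score are made up for this illustration, gsub() stands in for str_replace_all(), and the sample text and probability are invented.

```r
# Hypothetical helper mirroring the encoding steps of getSentiment
encode_for_api <- function(some_text) {
  code_text <- utils::URLencode(some_text)
  # put plain spaces back so they survive the next step
  code_text <- gsub("%20", " ", code_text, fixed = TRUE)
  # drop the remaining two-digit percent codes (same idea as "%\\d\\d")
  code_text <- gsub("%[0-9]{2}", "", code_text)
  # encode the spaces again
  gsub(" ", "%20", code_text, fixed = TRUE)
}

# Hypothetical helper mirroring the sign rule: positive keeps the
# probability, negative flips its sign, anything else counts as neutral
signed_score <- function(mood, prob) {
  if (mood == "positive") prob else if (mood == "negative") -prob else 0
}

encode_for_api('she said "yum" today')
signed_score("negative", 0.8)
```

In this example the double quotes are encoded as %22 and then stripped, so only letters and encoded spaces reach the query string.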

Step 3: Let's create a function to clean the text

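    clean.text <- function(some_txt)
    {
      # remove retweet markers ("RT @user", "via @user")
      some_txt = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", "", some_txt)
      # remove remaining @mentions
      some_txt = gsub("@\\w+", "", some_txt)
      # remove punctuation and digits
      some_txt = gsub("[[:punct:]]", "", some_txt)
      some_txt = gsub("[[:digit:]]", "", some_txt)
      # remove links (after the punctuation step a link is a single "http..." word)
      some_txt = gsub("http\\w+", "", some_txt)
      # collapse runs of spaces or tabs into one space so words stay separated
      some_txt = gsub("[ \t]{2,}", " ", some_txt)
      # trim leading and trailing whitespace
      some_txt = gsub("^\\s+|\\s+$", "", some_txt)
      # define "tolower error handling" function
      try.tolower = function(x)
      {
        y = NA
        try_error = tryCatch(tolower(x), error=function(e) e)
        if (!inherits(try_error, "error"))
          y = tolower(x)
        return(y)
      }
      some_txt = sapply(some_txt, try.tolower)
      # drop texts that ended up empty
      some_txt = some_txt[some_txt != ""]
      names(some_txt) = NULL
      return(some_txt)
    }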
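To make the cleaning steps concrete, the sketch below applies the same patterns one at a time to a made-up tweet and prints the intermediate results. Two things here are assumptions of the sketch rather than part of clean.text: perl = TRUE is passed so that the non-capturing group is handled by PCRE, and trimws() replaces the final trimming regex.

```r
# Made-up tweet, used only for illustration
tweet <- "RT @some_user: Loving the NEW burger!! http://t.co/abc123"

# The removal patterns from clean.text, in the order they are applied
steps <- c(
  retweet     = "(RT|via)((?:\\b\\W*@\\w+)+)",
  mentions    = "@\\w+",
  punctuation = "[[:punct:]]",
  digits      = "[[:digit:]]",
  links       = "http\\w+"
)

cleaned <- tweet
for (nm in names(steps)) {
  cleaned <- gsub(steps[[nm]], "", cleaned, perl = TRUE)
  # show the text after each step
  cat(sprintf("%-12s|%s|\n", nm, cleaned))
}

# trim and lowercase, as clean.text does at the end
cleaned <- tolower(trimws(cleaned))
cleaned
```

The order matters: once punctuation and digits are gone, the link has collapsed into the single word "httptcoabc", which is why the pattern "http\\w+" is enough to remove it. The final value is "loving the new burger".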

Step 4: Let's get tweets for mcdonalds and burgerking

    # harvest tweets
    mc_tweets = searchTwitter("mcdonalds", n=200, lang="en")
    bk_tweets = searchTwitter("burgerking", n=200, lang="en")
    # get text
    mc_txt = sapply(mc_tweets, function(x) x$getText())
    bk_txt = sapply(bk_tweets, function(x) x$getText())
    # clean text
    mc_clean = clean.text(mc_txt)
    bk_clean = clean.text(bk_txt)

Step 5.1: Get sentiment of each mcdonalds text

Since we are calling Viralheat's API once per tweet, this might take a while.

    # how many tweets
    mcnum = length(mc_clean)
    # data frame (text, sentiment, score)
    mc_df = data.frame(text=mc_clean, sentiment=rep("", mcnum),
                       score=numeric(mcnum), stringsAsFactors=FALSE)
    # apply function getSentiment (mykey holds your own Viralheat API key)
    for (i in seq_len(mcnum))
    {
      tmp = getSentiment(mc_clean[i], mykey)
      mc_df$sentiment[i] = tmp$mood
      mc_df$score[i] = tmp$score
    }

In my case, the first 10 rows look like this

(Note that the sentiments assigned by Viralheat's API are not always correct.)

Step 5.2: Get sentiment of each burgerking text

    # how many tweets
    bknum = length(bk_clean)
    # data frame (text, sentiment, score)
    bk_df = data.frame(text=bk_clean, sentiment=rep("", bknum),
                       score=numeric(bknum), stringsAsFactors=FALSE)
    # apply getSentiment (mykey holds your own Viralheat API key)
    for (i in seq_len(bknum))
    {
      tmp = getSentiment(bk_clean[i], mykey)
      bk_df$sentiment[i] = tmp$mood
      bk_df$score[i] = tmp$score
    }
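Steps 5.1 and 5.2 run the same loop on two different vectors, so the loop could be wrapped in a function. The sketch below is one possible way to do that, not part of the original code: score_texts is a hypothetical helper, and it takes the scoring function as an argument so it can be tried offline with a stand-in (fake_scorer, with invented rules) instead of getSentiment.

```r
# Hypothetical helper: build the (text, sentiment, score) data frame
# by calling scorer(text, key) once per text.
score_texts <- function(texts, key, scorer) {
  n <- length(texts)
  out <- data.frame(text = texts, sentiment = character(n),
                    score = numeric(n), stringsAsFactors = FALSE)
  for (i in seq_len(n)) {
    tmp <- scorer(texts[i], key)
    out$sentiment[i] <- tmp$mood
    out$score[i] <- tmp$score
  }
  out
}

# Stand-in scorer with invented rules, so the sketch runs without the API
fake_scorer <- function(some_text, key) {
  if (grepl("love", some_text, fixed = TRUE)) {
    list(mood = "positive", score = 0.9)
  } else if (grepl("hate", some_text, fixed = TRUE)) {
    list(mood = "negative", score = -0.7)
  } else {
    list(mood = "neutral", score = 0)
  }
}

demo_df <- score_texts(c("i love fries", "i hate waiting", "lunch time"),
                       "not-a-real-key", fake_scorer)
demo_df
```

With the real function, the calls would read score_texts(mc_clean, mykey, getSentiment) and score_texts(bk_clean, mykey, getSentiment).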

In my case, the first 10 rows look like this

Step 6: Checking sentiments

    # how many positives and negatives in mcdonalds
    table(mc_df$sentiment)
    # how many positives and negatives in burgerking
    table(bk_df$sentiment)
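Raw counts are hard to compare when the two searches return different numbers of usable tweets, so proportions and the mean signed score can be easier to read. The sketch below uses a tiny data frame with invented values purely to show the calls; summarise_sentiment is a hypothetical helper, and with real data you would pass mc_df or bk_df instead.

```r
# Invented values, only to illustrate the summary
demo_df <- data.frame(
  sentiment = c("positive", "negative", "positive", "neutral"),
  score     = c(0.8, -0.6, 0.7, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of each mood and the mean signed score
summarise_sentiment <- function(df) {
  list(
    share      = prop.table(table(df$sentiment)),
    mean_score = mean(df$score)
  )
}

demo_summary <- summarise_sentiment(demo_df)
demo_summary$share
demo_summary$mean_score
```

A mean score near zero can hide a mix of strongly positive and strongly negative tweets, so it is worth reading it next to the proportions.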
