Computers communicate in numbers, so if we want them to understand how we communicate, we first need to understand how language actually works and how people determine meaning.
People use a lot of emotion when they interact with each other, and determining whether somebody feels positively or negatively about what they're talking about may seem simple to us, but for computers it is much more complex. Furthermore, humans are big fans of sarcasm, slang, and improper grammar, which makes teaching our technology to understand what we mean even more difficult.
This process, of figuring out how people feel about what they say or write and representing it in a way computers can understand and respond to, is called sentiment analysis. We can determine the emotion people try to convey in language by using algorithms that analyze each word, propose potential meanings, and decide which meaning is most probable. People do this automatically, but for computers it can take a lot more time.
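To make the word-level scoring concrete, here is a minimal sketch in Python using the TextBlob library (the choice of library is just for illustration; many tools expose a similar polarity score):

```python
# Minimal sentiment-scoring sketch using TextBlob (an illustrative choice;
# many libraries compute a similar score from per-word sentiment values).
from textblob import TextBlob

def polarity(text: str) -> float:
    """Return a score from -1.0 (most negative) to 1.0 (most positive)."""
    return TextBlob(text).sentiment.polarity

print(polarity("I love this park!"))                # positive, close to 1.0
print(polarity("That movie was a waste of time."))  # negative, below 0.0
```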
So why do we even do it? Computers are capable of analyzing huge amounts of data, all at the same time, which humans simply can't do. By helping computers understand what we mean, we let them process billions of words. This has a lot of potential: NLP and sentiment analysis are already used in thousands of homes through voice-activated assistants such as Amazon's Alexa and Apple's Siri, but they can also be used to learn more about humans. With the ability to understand the words of millions of people on the internet, we can learn about the current state of our world and what people think about it. In the future, we may even be able to improve the world using this information.
With this in mind, we developed three projects that analyzed information from the internet. Using datasets consisting of millions of tweets and thousands of articles, we applied our knowledge of NLP and sentiment analysis to figure out how language is used to shape the perception of ideas and to understand the implications of text written by people all over the world.
We used data from Twitter to compare how positively or negatively Twitter users from different countries felt about assorted topics. To better picture this information, we created maps that are color-coded to reflect residents' attitudes; a sketch of the underlying aggregation appears below.
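As a rough sketch of the per-country step (not the exact project code), assuming tweets are available as (country, text) pairs:

```python
# Sketch of per-country sentiment aggregation; the real project pulled
# millions of tweets, but the aggregation idea is the same.
from collections import defaultdict
from textblob import TextBlob

# Placeholder data standing in for tweets labeled with a user's country.
tweets = [
    ("Canada", "I love the new transit line!"),
    ("Canada", "The weather today is miserable."),
    ("Brazil", "What a fantastic match last night!"),
]

scores = defaultdict(list)
for country, text in tweets:
    scores[country].append(TextBlob(text).sentiment.polarity)

# Average polarity per country; these values would drive the map colors.
for country, values in sorted(scores.items()):
    print(f"{country}: {sum(values) / len(values):+.2f}")
```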
We developed an algorithm that filters out articles above a certain subjectivity rating. Basically, it keeps only the articles calculated to be as objective as possible and removes the biased ones, as sketched below.
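A minimal sketch of that filter, assuming TextBlob's 0.0-to-1.0 subjectivity score and a hypothetical cutoff of 0.3 (not the project's actual threshold):

```python
# Sketch of the objectivity filter; the 0.3 threshold is a hypothetical
# example, not the cutoff the project actually used.
from textblob import TextBlob

def objective_only(articles, max_subjectivity=0.3):
    """Keep articles whose text scores at or below the subjectivity cutoff."""
    return [text for text in articles
            if TextBlob(text).sentiment.subjectivity <= max_subjectivity]

filtered = objective_only([
    "Stocks closed 2% higher on Friday.",          # objective: kept
    "This is the worst policy I have ever seen.",  # subjective: removed
])
print(filtered)
```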
We gathered data from Twitter to analyze the public's perception of a few billionaires and to see whether those opinions were based on objective or subjective factors; the sketch below shows the idea.
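In sketch form (the names and tweets here are placeholders, not the data we collected), the idea is to combine both scores per person:

```python
# Sketch of the perception analysis: average polarity (how positive) and
# subjectivity (how opinion-based) over tweets mentioning each person.
from textblob import TextBlob

mentions = {
    "Billionaire A": ["Their foundation funds great research.",
                      "I can't stand their interviews."],
    "Billionaire B": ["Shares in their company rose 4% today."],
}

for person, tweets in mentions.items():
    sentiments = [TextBlob(t).sentiment for t in tweets]
    avg_pol = sum(s.polarity for s in sentiments) / len(sentiments)
    avg_subj = sum(s.subjectivity for s in sentiments) / len(sentiments)
    print(f"{person}: polarity={avg_pol:+.2f}, subjectivity={avg_subj:.2f}")
```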
On the second day of the Invent the Future camp, we attended a lecture by Dr. Alona Fyshe, an expert in Natural Language Processing. She explained what NLP is, why it is so difficult, and what it is used for.
Throughout the three days we worked on our projects, we were taught and aided by our two amazing mentors, Murad Ali and Nadia Ghobadipasha. They answered all of our questions and walked us through how to complete our projects; we could never have done it without them. They are geniuses: endlessly helpful and informative.