Lecturer: Giovanni Pilato
The module presents several models of text representation and the basic techniques of natural language processing, with special emphasis on the vector representation of words and of texts in general. It introduces the Python NLTK (Natural Language Toolkit) library for text analysis. It also covers information extraction and sentiment analysis, which is useful for identifying opinions in data from social media and microblogs.
Textual data analysis can be used to study the opinions that users express about specific items and, more generally, about products on the market. It also makes it possible to build user profiles from the semantic content of their posts and to implement automatic content recommendation systems.
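
As a rough illustration of the vector representation of texts mentioned above, the sketch below builds simple bag-of-words count vectors and compares them with cosine similarity. It is only a minimal example under stated assumptions: it uses the Python standard library rather than NLTK, and the sample sentences and the helper names (tokenize, bag_of_words, cosine_similarity) are hypothetical, not taken from the course material.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text, vocabulary):
    """Return the count of each vocabulary word in the text, in vocabulary order."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample posts, invented for illustration only.
docs = [
    "The phone has a great camera",
    "Great camera, terrible battery",
    "The delivery was late",
]

# The vocabulary is the sorted set of all tokens seen in the sample posts.
vocabulary = sorted({token for doc in docs for token in tokenize(doc)})
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
sim_12 = cosine_similarity(vectors[1], vectors[2])

print(vocabulary)
print(sim_01, sim_02, sim_12)
```

In this toy setting, posts that share more words get a higher similarity score, which is the basic idea behind comparing texts in a vector space. Raw counts ignore word order and meaning, so this should be read as a starting point rather than a complete representation model.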