Opinion Mining for Portuguese

Concept-based Approaches and Beyond

According to Liu (2012), "sentiment analysis, also called opinion mining, is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes". Cambria (2013) and Cambria et al. (2015) go further and define what they call concept-level sentiment analysis, which, according to the authors, seeks a deeper understanding of the texts of interest in order to produce better results, taking into account more sophisticated NLP tasks for extracting opinionated information from text, including microtext analysis, semantic parsing, subjectivity detection, anaphora resolution, sarcasm detection, topic spotting, aspect extraction, and polarity detection.

The OPINANDO project investigates concept-level analysis for the Brazilian Portuguese language. We are particularly interested in three main research fronts: (i) the identification of relevant texts to mine, which includes assessing text importance and filtering deceptive content; (ii) the analysis of the selected texts, performing the necessary semantic and discourse analysis and identifying subjective content and the corresponding aspects and polarities; and (iii) the synthesis of the relevant information, using text summarization and generation strategies and dealing with the challenges these tasks raise.

The project was officially funded by the USP Research Office (PRP N. 668, from May 2019 to April 2020) and received student scholarships from the FAPESP, CAPES, and CNPq agencies.

Related publications (see more here)

  • Monteiro, R.A. (2018). Detecção Automática de Notícias Falsas. Trabalho de Conclusão de Curso. Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo. São Carlos-SP, November, 40p. pdf
  • Costa, R.W.M. (2018). Extração e qualificação de aspectos de opinião para o português. Trabalho de Conclusão de Curso. Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo. São Carlos-SP, November, 47p. pdf
  • Anchiêta, R.T. and Pardo, T.A.S. (2018). A Rule-Based AMR Parser for Portuguese. In the Proceedings of the 16th Ibero-American Conference on Artificial Intelligence (IBERAMIA) (LNCS 11238), pp. 341-353. November, 13-16. Trujillo/Peru. pdf (preprint version)
  • Nóbrega, F.A.A. and Pardo, T.A.S. (2018). Update Summarization: Building from Scratch for Portuguese and Comparing to English. Journal of the Brazilian Computer Society (JBCS), Vol. 24, N. 11, pp. 1-12. pdf
  • Monteiro, R.A.; Santos, R.L.S.; Pardo, T.A.S.; Almeida, T.A.; Ruiz, E.E.S.; Vale, O.A. (2018). Contributions to the Study of Fake News in Portuguese: New Corpus and Automatic Detection Results. In the Proceedings of the 13th International Conference on the Computational Processing of Portuguese (PROPOR) (LNAI 11122), pp. 324-334. September, 24-26. Canela-RS/Brazil. pdf (preprint version)
  • Machado, M.T.; Pardo, T.A.S.; Ruiz, E.E.S. (2018). Creating a Portuguese context sensitive lexicon for sentiment analysis. In the Proceedings of the 13th International Conference on the Computational Processing of Portuguese (PROPOR) (LNAI 11122), pp. 335-344. September, 24-26. Canela-RS/Brazil. pdf (preprint version)
  • Vargas, F.A. and Pardo, T.A.S. (2018). Aspect clustering methods for sentiment analysis. In the Proceedings of the 13th International Conference on the Computational Processing of Portuguese (PROPOR) (LNAI 11122), pp. 365-374. September, 24-26. Canela-RS/Brazil. pdf (preprint version)
  • Santos, R.L.S.; Monteiro, R.A.; Pardo, T.A.S. (2018). The Fake.Br corpus - a corpus of fake news for Brazilian Portuguese. Latin American and Iberian Languages Open Corpora Forum (OpenCor). September, 24. Canela-RS/Brazil. pdf
  • Sobrevilla Cabezudo, M.A. and Pardo, T.A.S. (2018). NILC-SWORNEMO at the Surface Realization Shared Task: Exploring Syntax-Based Word Ordering using Neural Models. In the Proceedings of the First Workshop on Multilingual Surface Realisation, pp. 1–7. July 19. Melbourne/Australia. pdf
  • Sousa, O.A.F. (2018). Sumarização contrastiva de opinião: uma abordagem com otimização. Trabalho de Conclusão de Curso. Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo. São Carlos-SP, June, 42p. pdf
  • Anchiêta, R.T. and Pardo, T.A.S. (2018). Towards AMR-BR: A SemBank for Brazilian Portuguese Language. In the Proceedings of the 11th edition of the Language Resources and Evaluation Conference (LREC), pp. 974-979. May 7-12. Miyazaki/Japan. pdf
  • Vargas, F.A. and Pardo, T.A.S. (2018). Hierarchical clustering of aspects for opinion mining: a corpus study. In M.J.B. Finatto, R.R. Rebechi, S. Sarmento and A.E.P. Bocorny (eds.), Linguística de Corpus: Perspectivas, pp. 69-91. Porto Alegre: Instituto de Letras da UFRGS. 580p. pdf
  • Anchiêta, R.T.; Sousa, R.F.; Moura, R.S.; Pardo, T.A.S. (2017). Improving Opinion Summarization by Assessing Sentence Importance in On-line Reviews. In the Proceedings of the 11th Brazilian Symposium in Information and Human Language Technology (STIL), pp. 32-36. October 2-4. Uberlândia-MG/Brazil. pdf
  • Machado, M.T.; Ruiz, E.E.S.; Pardo, T.A.S. (2017). Analysis of unsupervised aspect term identification methods for Portuguese reviews. In the Proceedings of the 14º Encontro Nacional de Inteligência Artificial e Computacional (ENIAC), pp. 239-249. October 2-5. Uberlândia-MG/Brazil. pdf
  • Machado, M.T.; Temporal, J.C.A.N.; Pardo, T.A.S.; Ruiz, E.E.S. (2017). Mineração de tópicos e aspectos em microblogs sobre Dengue, Chikungunya, Zika e Microcefalia. In Anais do XVII Workshop de Informática Médica (WiM), pp. 265-274. July 3-5. São Paulo-SP/Brazil. pdf
  • López Condori, R.E. and Pardo, T.A.S. (2017). Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches. Expert Systems with Applications (ESWA), Vol. 78, pp. 124-134. pdf
  • Vargas, F.A. and Pardo, T.A.S. (2017). Estudo Empírico sobre Agrupamento e Organização Hierárquica de Aspectos para Mineração de Opinião. Série de Relatórios Técnicos do Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, N. 418. São Carlos-SP, March, 48p. pdf

Resources and tools (see more here)