Below is a selection of the resources and tools produced by NILC. Many more research products can be found on the project websites.
Lexical resources
PortiLexicon-UD: a lexicon for Brazilian Portuguese according to Universal Dependencies
Unitex-PB: a lexicon for Brazilian Portuguese
Word embeddings for Portuguese
Lexical resources for Portuguese: PropBank, VerbNet, LIWC and other resources
WordNetBr: a lexical database for Portuguese
Thesaurus for Portuguese
Aeiouadô: a machine-readable pronunciation dictionary for Brazilian Portuguese
Corpora and datasets
NILC corpus: a large corpus for Brazilian Portuguese
Lácio-Web: Brazilian Portuguese corpora and analysis tools
TweetSentBR: a large corpus of tweets in Brazilian Portuguese manually labeled for polarity
Leg2Kids
Mac-Morpho: a reference corpus for POS tagging in Portuguese
MilkQA
ASSIN
PorSimplesSent
Datasets of neuropsychological language tests in Brazilian Portuguese
SIMPLEX
SIMPLEX 2.0
SIMPLEX 3.0
CSTNews: a corpus with several linguistic annotation layers
OpSums-PT: a corpus of opinion summaries
Text complexity for educational levels
Historical Portuguese corpora
Fake.Br corpus: a corpus of aligned true and fake news in Brazilian Portuguese
UTLCorpus: a corpus of online reviews in Brazilian Portuguese annotated with helpfulness classification
AMR-BP: semantically annotated corpora for Brazilian Portuguese (according to Abstract Meaning Representation)
PLN-BR corpus: a journalistic corpus for Brazilian Portuguese
Tools
Stemming for Portuguese
Lemmatization for Portuguese
A flexible normalizer for user-generated content in Portuguese
SENTER: sentence segmenter for Portuguese
Coh-Metrix-Dementia
NILC-Metrix
Neologism detection tool for Portuguese
e-Termos: a system for terminology management
HABLA project
Automatic phonetic transcription for Brazilian Portuguese
Part-of-speech tagging for Portuguese
Curupira parser
NLP with neural networks: part-of-speech tagging and semantic role labeling
Semantic parser (following the Abstract Meaning Representation) for Portuguese
OPCluster-PT: automatic identification and clustering of opinion aspects in Portuguese
DiZer: DIscourse analyZER for Portuguese (according to the Rhetorical Structure Theory)
CST Parser: a multi-document discourse parser for Portuguese (according to Cross-document Structure Theory)
Topic segmentation for Portuguese (an adaptation of TextTiling)
RST Toolkit: a collection of software for dealing with RST-based discourse annotated texts
Evaluation tool for RST-based discourse trees
Text aligners
A tool for sentence ordering for texts in Portuguese
NILC-Wise: web interface for summary evaluation
UDConcord: a concordancer for Universal Dependencies-annotated data
Applications
RSumm: multi-document summarization for Portuguese
GistSumm: a classical text summarization system for Portuguese
An English pronunciation checker
Machine translation portal
Educational Facilita
SciPo: Scientific Portuguese
SciPo-Farmácia
SciPo-Farmácia (for English)
CALeSE: Computer-Aided Learning Tool for Scientific Writing in English