Downloads

GermEval Shared Task 2018
(Saarland University, Hochschule Darmstadt, Heidelberg University)
First shared task on the identification of offensive language.

German Verbal Polarity Shifters
(Saarland University, Heidelberg University)
A bootstrapped lexicon of German verbal polarity shifters. The lexicon covers 2595 verbs of GermaNet. Polarity shifter labels are given for each word lemma. All labels were assigned by an expert annotator who is a native speaker of German.
LINK to resource

One Million Posts Corpus
(Austrian Research Institute for Artificial Intelligence)
This annotated data set consists of user comments, written in German, that were posted to an Austrian newspaper website. The annotation includes sentiment information.

Tools for Lexicon Induction
(University of Potsdam)
A collection of scripts and executable files for generating sentiment lexicons from lexical taxonomies (mainly GermaNet), raw text corpora, and neural word embeddings.

Morphologically Complex Words
(Heidelberg University, Saarland University)
A data set comprising about 9000 complex polar expressions (e.g. compounds) along with their polarity labels. This resource also includes very rare complex expressions (taken from Wortwarte.de) along with their polarity labels and morphological analyses.

Negation Modeling for German Sentiment Analysis
(Saarland University, Heidelberg University)
A data set focusing on the scope of German negation and a rule-based tool that automatically detects the scope of a wide range of negation words. The tool also supports sentence-level polarity classification, with negation modeling incorporated into the classifier.

German Irony Corpus
(Hochschule Darmstadt)
A text corpus of ironic tweets in the soccer domain.

German Opinion Spam Corpus
(Hochschule Darmstadt)
A text corpus with fake reviews and genuine reviews from the German Amazon portal.

German Opinion Role Extractor
(Saarland University)
This software is designed for the extraction of subjective expressions, sentiment sources and sentiment targets from German text. It has been developed according to the specification of the STEPS Shared Task (see below). The tool comes with pre-processing scripts (i.e. part-of-speech tagging, named entity recognition and syntactic parsing).

STEPS Shared Task 2016
(Heidelberg University, University of Hildesheim, Saarland University)
Second iteration of the Shared Task on Source, Subjective Expression and Target Extraction from Political Speeches. The annotation guidelines were heavily refined and new annotated data were produced.

German EffectSynSets
(University of Hildesheim)
The resource provides annotations of 1667 GermaNet synsets with effect functors, as detailed in the associated papers. Briefly, functors map constellations of participant evaluations to event evaluations, thereby supporting opinion inference. The functor scheme used is more comprehensive than that in Wiebe's work on EffectWordNet. The manually generated annotations in the resource can be used for experiments to automatically label all of GermaNet with functors. 

German Emotion Dictionary
(University of Stuttgart)
This resource contains word lists for 7 fundamental emotions. For more details, please refer to this paper abstract.

SCARE - The Sentiment Corpus of App Reviews with Fine-grained Annotations in German
(Humboldt-Universität zu Berlin, Neofonie GmbH, University of Stuttgart)
The SCARE corpus consists of fine-grained annotations for mobile application reviews from the Google Play Store. For each user review, the mentioned application aspects (e.g. the design or the usability) are annotated, as are the subjective phrases that evaluate these aspects. In addition, the polarity (positive, negative or neutral) of each subjective phrase is recorded, as well as the relationship of each aspect to the main app under discussion. Aspects that refer to an app, or an aspect of an app, other than the app under discussion are marked as “foreign”. All other aspects are “related”. In total, the corpus consists of 1,760 German application reviews with 2,487 aspects and 3,959 subjective phrases.
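The annotation layers described above can be pictured as a small data structure. The following Python sketch is only an illustration: the class names, field names and the example review are assumptions made here and do not reflect the actual file format of the SCARE release.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Aspect:
    text: str
    relation: str  # "related" or "foreign", as in the corpus description


@dataclass
class SubjectivePhrase:
    text: str
    polarity: str  # "positive", "negative" or "neutral"
    target: Optional[Aspect] = None  # the aspect this phrase evaluates


@dataclass
class Review:
    text: str
    aspects: List[Aspect] = field(default_factory=list)
    phrases: List[SubjectivePhrase] = field(default_factory=list)


def polarity_counts(review, relation="related"):
    """Count phrase polarities for aspects with the given relation."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for phrase in review.phrases:
        if phrase.target is not None and phrase.target.relation == relation:
            counts[phrase.polarity] += 1
    return counts


# Invented example review, not taken from the corpus.
design = Aspect("Design", "related")
other_app = Aspect("WhatsApp", "foreign")
review = Review(
    text="Das Design ist schön, aber WhatsApp ist schneller.",
    aspects=[design, other_app],
    phrases=[
        SubjectivePhrase("schön", "positive", design),
        SubjectivePhrase("schneller", "positive", other_app),
    ],
)
related_counts = polarity_counts(review)
foreign_counts = polarity_counts(review, relation="foreign")
```

Separating "related" from "foreign" aspects in this way keeps opinions about competing apps from being counted as opinions about the reviewed app.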

Sentiment Phrase List
(Hof University of Applied Sciences)
Sentiment Phrase List (SePL) is a generated list of opinion-bearing words and phrases. The list contains adjectives, verbs and nouns as well as adjective-, verb- and noun-based phrases and their opinion values on a continuous range between −1.00 and +1.00. For each word or phrase, two additional quality measures are given. The list was produced from a large number of product review titles, each of which provides a textual assessment and a numerical star rating.
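Because the list contains multi-word phrases as well as single words, a lookup usually has to prefer the longest matching phrase. The Python sketch below shows one possible way to do this; the entries and their values are invented for illustration, the function name is hypothetical, and the two quality measures are left out because their format is not described here.

```python
def match_phrases(tokens, phrase_values, max_len=3):
    """Greedy longest-match lookup of phrases in a token list.

    phrase_values maps a phrase (tokens joined by spaces) to an opinion
    value in [-1.0, 1.0]. Returns the matched (phrase, value) pairs.
    """
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in phrase_values:
                matches.append((phrase, phrase_values[phrase]))
                i += n
                break
        else:
            i += 1  # no phrase starts at this token
    return matches


# Invented entries; the real list and its values differ.
phrase_values = {"gut": 0.6, "nicht gut": -0.5, "sehr gut": 0.9}
matches = match_phrases("das ist nicht gut".split(), phrase_values)
```

With these invented entries, "nicht gut" is matched as one phrase, so the negated reading takes precedence over the single word "gut".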

Potsdam Twitter Sentiment Corpus
(University of Potsdam)
A dataset of 7,992 German tweets, which were manually annotated by two human experts with fine-grained opinion relations. The corpus annotation includes sentiment-relevant elements such as opinion spans, their respective sources and targets, and emotionally laden terms with their possible contextual negations and modifiers.
LINK

Opinion Compound Dataset
(Saarland University, University of Hildesheim)
Resource comprising German compounds (e.g. Expertenmeinung or Kinderlärm) that have been annotated with regard to opinion roles. The release comprises two datasets: one with 2000 opinion compounds in which the modifier is annotated as either conveying some opinion role or none, and one with 1000 opinion compounds in which the modifier is annotated as conveying either an opinion holder or an opinion target.

Verb View Lexicon
(Saarland University, University of Hildesheim)
Resource that classifies all opinion verbs from the German Zurich Sentiment Lexicon according to their sentiment views. Each verb is assigned to one of three view categories, which are inspired by the different argument positions an opinion holder can assume. The categories are: agent view, where the opinion holder is realized as the agent of the opinion verb (e.g. love, hate, think); patient view, where the opinion holder is realized as the patient of the opinion verb (e.g. please, disappoint, surprise); and speaker view, where the opinion holder is the implicit speaker of the utterance (e.g. succeed, cheat, lie).
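The three view categories can be read as a rule for where to look for the opinion holder. The Python sketch below illustrates that rule using the English example verbs given above; the dictionary, the function name and the "<speaker>" placeholder are assumptions for illustration only, and the actual lexicon lists German verbs in its own format.

```python
from enum import Enum


class View(Enum):
    AGENT = "agent"
    PATIENT = "patient"
    SPEAKER = "speaker"


# Example verbs taken from the description above (English glosses).
VERB_VIEWS = {
    "love": View.AGENT, "hate": View.AGENT, "think": View.AGENT,
    "please": View.PATIENT, "disappoint": View.PATIENT,
    "surprise": View.PATIENT,
    "succeed": View.SPEAKER, "cheat": View.SPEAKER, "lie": View.SPEAKER,
}


def opinion_holder(verb, agent, patient, speaker="<speaker>"):
    """Return the argument that holds the opinion for the given verb."""
    view = VERB_VIEWS[verb]
    if view is View.AGENT:
        return agent
    if view is View.PATIENT:
        return patient
    return speaker  # speaker view: the holder is not an argument of the verb


holder_love = opinion_holder("love", agent="Mary", patient="Peter")
holder_disappoint = opinion_holder("disappoint", agent="the film", patient="Peter")
holder_cheat = opinion_holder("cheat", agent="Tom", patient="Anna")
```

For a speaker-view verb such as "cheat", neither argument is returned, since the negative evaluation of the cheating belongs to whoever utters the sentence.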

Bibliography on German Sentiment Analysis
(Hochschule Darmstadt)
A bibliography listing all (known) research done on German sentiment analysis.
This list was initiated by Melanie Siegel and her group of students at Darmstadt University of Applied Sciences.
germanSentimentBibliography.xlsx

Multi-Domain Sentiment Lexicon for German
(Hochschule Darmstadt)
A sentiment lexicon combining sentiment terms extracted from the MLSA corpus and the pressrelations dataset, together with their polarities. The average polarity of each term was computed separately for each data source. In addition, the contents of SentiWS were added for evaluation purposes.
The lexicon was compiled by Kerstin Diwisch and Melanie Siegel.
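Computing the average polarity of a term separately for each data source amounts to grouping observations by (source, term) before averaging. The Python sketch below shows this under stated assumptions: the triple format, the function name and the polarity values are invented for illustration and are not taken from the lexicon or its sources.

```python
from collections import defaultdict


def average_polarity_by_source(observations):
    """Average polarity per (source, term) pair.

    observations is an iterable of (source, term, polarity) triples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (source, term) -> [total, count]
    for source, term, polarity in observations:
        entry = sums[(source, term)]
        entry[0] += polarity
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Invented observations; real values come from the annotated sources.
observations = [
    ("mlsa", "gut", 1.0),
    ("mlsa", "gut", 0.5),
    ("pressrelations", "gut", 1.0),
    ("pressrelations", "schlecht", -1.0),
]
averages = average_polarity_by_source(observations)
```

Keeping the source in the key means a term such as "gut" can carry a different average in each data source instead of one pooled value.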

MLSA: A Multi-Layered Reference Corpus for German Sentiment Analysis, v1.0
(University of Zurich, MODUL University Vienna, University of Leipzig, University of Hildesheim, University of Bielefeld, Saarland University)
This version consists of 270 sentences manually annotated for objectivity and subjectivity (Layer 1), word and phrase polarity (Layer 2) and expressions of private states (Layer 3).
More layer-specific information is provided in the README files contained in the archive listed below.
mlsa.tgz

STEPS Shared Task 2014
(University of Hildesheim, University of Potsdam, MODUL University Vienna, Bielefeld University, Saarland University)
Annotation guidelines for the STEPS Shared Task, co-located with KONVENS 2014.
guide.pdf
Small set of trial data that shows the gold annotation on 10 sentences.