Downloads

DeTox

(Darmstadt University of Applied Sciences, Fraunhofer Institute for Secure Information Technology, Mittweida University of Applied Sciences)

A Comprehensive Dataset for German Offensive Language and Conversation Analysis.

LINK to data

LINK to paper

Pro/Con Sentiment Inference Gold Standard Data

(Zurich University)

Annotation of pros and cons relations as well as positive actors and effects.

LINK to data

LINK to paper 1

LINK to paper 2

Quantified NP Silver Standard Data

(Zurich University)

Noun phrases quantified with respect to polarity (semi-automatically annotated).

LINK to data

LINK to paper

Negative Entities Quantified

(Zurich University)

A small set of sentences involving negative entities quantified (manually annotated).

LINK to data

LINK to paper

X-stance Writer Perspective

(Zurich University)

1000 texts of the X-stance corpus manually annotated regarding the writer's perspective: implicit and explicit pros/cons-relations.

LINK to data

LINK to paper

CLEF-2022 CheckThat! Lab -- Task 3

(FH Potsdam, Universität Duisburg-Essen, Hochschule Darmstadt, Universität Hildesheim, AIT Austrian Institute of Technology GmbH, Alpen-Adria-Universität Klagenfurt)

Shared task on the identification of fake news.

LINK to website

LINK to data

LINK to proceedings

GermEval Shared Task 2021

(deepset, Heinrich-Heine Universität Düsseldorf, Alpen-Adria-Universität Klagenfurt)

Shared task on the identification of toxic, engaging and fact-claiming comments.

LINK to website

LINK to data

LINK to proceedings

GermEval Shared Task 2019

(FH Potsdam, Saarland University, Hochschule Darmstadt, Heidelberg University)

Second shared task on the identification of offensive language.

LINK to website

LINK to data

LINK to proceedings (part of the proceedings of KONVENS 2019)

GermEval Shared Task 2018

(Saarland University, Hochschule Darmstadt, Heidelberg University)

First shared task on the identification of offensive language.

LINK to website

LINK to data

LINK to proceedings

German Verbal Polarity Shifters

(Saarland University, Heidelberg University)

A bootstrapped lexicon of German verbal polarity shifters. The lexicon covers 2595 verbs of GermaNet. Polarity shifter labels are given for each word lemma. All labels were assigned by an expert annotator who is a native speaker of German.

LINK to resource

LINK to paper

German Polarity Lexicon ("PolArt"-Lexicon)

(Zurich University)

A manually specified, word-level polarity lexicon for German comprising 3424 positive, 5294 negative and 662 neutral nouns, verbs and adjectives. The lexicon also includes 22 shifters and 36 diminishers/intensifiers.

LINK_to_paper

LINK to resource

One Million Posts Corpus

(Austrian Research Institute of Artificial Intelligence)

This annotated data set consists of German-language user comments posted to an Austrian newspaper website. The annotation includes sentiment information.

LINK to resource

LINK to paper

Tools for Lexicon Induction

(University of Potsdam)

A collection of scripts and executable files for generating sentiment lexicons from lexical taxonomies (mainly GermaNet), raw text corpora, and neural word embeddings.

LINK to resource

Morphologically Complex Words

(Heidelberg University, Saarland University)

A data set comprising about 9000 complex polar expressions (e.g. compounds) along with their polarity labels. This resource also includes very rare complex expressions (taken from Wortwarte.de) along with their polarity labels and morphological analyses.

LINK to resource

LINK to paper

Negation Modeling for German Sentiment Analysis

(Saarland University, Heidelberg University)

A data set focusing on the scope of German negation and a rule-based tool that automatically detects the scope of a wide range of negation words. The tool also supports sentence-level polarity classification, with negation modeling incorporated into the classifier.

LINK to resource

LINK to paper

German Irony Corpus

(Hochschule Darmstadt)

A text corpus of ironical tweets in the soccer domain.

LINK

German Opinion Spam Corpus

(Hochschule Darmstadt)

A text corpus of fake and genuine reviews from the German Amazon portal.

LINK

German Opinion Role Extractor

(Saarland University)

This software is designed for the extraction of subjective expressions, sentiment sources and sentiment targets from German text. It has been developed according to the specification of the STEPS Shared Task (see below). The tool comes with pre-processing scripts (i.e. part-of-speech tagging, named entity recognition and syntactic parsing).

LINK to resource

STEPS Shared Task 2016

(Heidelberg University, University of Hildesheim, Saarland University)

Second iteration of the Shared Task on Source, Subjective Expression and Target Extraction from Political Speeches. The annotation guidelines were substantially refined and new annotated data were produced.

LINK to website

LINK to data

LINK to proceedings

German EffectSynSets

(Hildesheim University)

The resource provides annotations of 1667 GermaNet synsets with effect functors, as detailed in the associated papers. Briefly, functors map constellations of participant evaluations to event evaluations, thereby supporting opinion inference. The functor scheme used is more comprehensive than that in Wiebe's work on EffectWordNet. The manually generated annotations in the resource can be used for experiments to automatically label all of GermaNet with functors.

LINK

LINK_to_paper1

LINK_to_paper2

German Emotion Dictionary

(University of Stuttgart)

This resource contains word lists for 7 fundamental emotions. For more details, please refer to this paper abstract.

LINK

SCARE - The Sentiment Corpus of App Reviews with Fine-grained Annotations in German

(Humboldt Universität zu Berlin, Neofonie GmbH, University of Stuttgart)

The SCARE corpus consists of fine-grained annotations for mobile application reviews from the Google Play Store. For each user review the mentioned application aspects, i.e., the design or the usability, as well as subjective phrases, which evaluate these aspects, are annotated. In addition, the polarity (positive, negative or neutral) of each subjective phrase is recorded as well as the relationship of an aspect to the main app in discussion. Aspects which refer to an app or an aspect of an app other than the app in discussion are marked as “foreign”. All other aspects are “related”. In total, the corpus consists of 1,760 German application reviews with 2,487 aspects and 3,959 subjective phrases.

LINK

Sentiment Phrase List

(Hof University of Applied Sciences)

Sentiment Phrase List (SePL) is a generated list of opinion-bearing words and phrases. The list contains adjectives, verbs and nouns as well as adjective-, verb- and noun-based phrases, each with an opinion value on a continuous range between −1.00 and +1.00. For each word or phrase, two additional quality measures are given. The list was produced from a large number of product review titles, each providing a textual assessment paired with a numerical star rating.

Potsdam Twitter Sentiment Corpus

(University of Potsdam)

A dataset of 7,992 German tweets, manually annotated by two human experts with fine-grained opinion relations. The corpus annotation includes sentiment-relevant elements such as opinion spans, their respective sources and targets, and emotionally laden terms with their possible contextual negations and modifiers.

LINK

Opinion Compound Dataset

(Saarland University, Hildesheim University)

A resource comprising German compounds (e.g. Expertenmeinung or Kinderlärm) that have been annotated with regard to opinion roles. The release comprises two datasets: one with 2000 opinion compounds in which the modifier is annotated as either conveying some opinion role or none, and one with 1000 opinion compounds in which the modifier is annotated as conveying either an opinion holder or an opinion target.

LINK_to_resource

LINK_to_publication

Verb View Lexicon

(Saarland University, Hildesheim University)

A resource that classifies all opinion verbs from the German Zurich Sentiment Lexicon according to their sentiment views. Each verb is assigned to one of three view categories, which are inspired by the different argument positions an opinion holder can assume. The categories are: agent view, where the opinion holder is realized as the agent of the opinion verb (e.g. love, hate, think); patient view, where the opinion holder is realized as the patient of the opinion verb (e.g. please, disappoint, surprise); and speaker view, where the opinion holder is the implicit speaker of the utterance (e.g. succeed, cheat, lie).

LINK_to_resource

LINK_to_publication
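As a rough illustration of the three view categories of the Verb View Lexicon, the following minimal sketch maps a verb to the position in which its opinion holder is realized. It does not reflect the actual file format of the resource: the names View, VERB_VIEWS and holder_position are hypothetical, and the entries are only the English example verbs quoted in the description, not data from the lexicon.

```python
from enum import Enum
from typing import Optional


class View(Enum):
    """The three sentiment view categories named in the description."""
    AGENT = "agent"      # opinion holder is the agent of the verb
    PATIENT = "patient"  # opinion holder is the patient of the verb
    SPEAKER = "speaker"  # opinion holder is the implicit speaker


# Hypothetical toy lexicon: only the example verbs from the description.
VERB_VIEWS = {
    "love": View.AGENT,
    "hate": View.AGENT,
    "think": View.AGENT,
    "please": View.PATIENT,
    "disappoint": View.PATIENT,
    "surprise": View.PATIENT,
    "succeed": View.SPEAKER,
    "cheat": View.SPEAKER,
    "lie": View.SPEAKER,
}


def holder_position(verb: str) -> Optional[View]:
    """Return the view category of a verb, or None if it is not listed."""
    return VERB_VIEWS.get(verb.lower())
```

Under these assumptions, holder_position("disappoint") yields the patient view, which tells a downstream extractor to look for the opinion holder in the patient argument rather than the agent.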

Multi-Domain Sentiment Lexicon for German

(Hochschule Darmstadt)

A sentiment lexicon combining sentiment terms extracted from the MLSA corpus and the pressrelations dataset, together with their polarities. The average polarity of each term was computed separately for each data source. In addition, the contents of SentiWS were added for evaluation purposes.

The lexicon was compiled by Kerstin Diwisch and Melanie Siegel.

LINK to resource
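The per-source averaging mentioned for the Multi-Domain Sentiment Lexicon can be sketched as follows. This is only an illustration under stated assumptions, not the compilers' actual procedure: the function name average_polarity, the (source, term, polarity) tuple layout, and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_polarity(observations):
    """Average the polarity of each term separately for each data source.

    observations: iterable of (source, term, polarity) tuples
    (a hypothetical input layout, not the format of the real lexicon).
    Returns a dict keyed by (source, term).
    """
    grouped = defaultdict(list)
    for source, term, polarity in observations:
        grouped[(source, term)].append(polarity)
    return {key: mean(values) for key, values in grouped.items()}


# Made-up sample observations, for illustration only.
observations = [
    ("MLSA", "gut", 1.0),
    ("MLSA", "gut", 0.5),
    ("pressrelations", "gut", 1.0),
    ("pressrelations", "schlecht", -1.0),
]
averages = average_polarity(observations)
```

Keeping the source in the key means the same term can carry a different average in each corpus, which matches the description of separate averages per data source.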

MLSA: A Multi-Layered Reference Corpus for German Sentiment Analysis, v1.0

(University of Zurich, MODUL University Vienna, University of Leipzig, University of Hildesheim, University of Bielefeld, Saarland University)

This version consists of 270 sentences manually annotated for objectivity and subjectivity (Layer 1), word and phrase polarity (Layer 2) and expressions of private states (Layer 3).

LINK to resource

LINK to paper

STEPS Shared Task 2014

(University of Hildesheim, University of Potsdam, MODUL University Vienna, Bielefeld University, Saarland University)

Annotation guidelines for the STEPS Shared Task, to be co-located with KONVENS 2014. guide.pdf

Small set of trial data that shows the gold annotation on 10 sentences.

Data available on website of shared task: click here