Check out the most recent version of ParlSpeech & ParlLawSpeech
As part of an internship at the WZB Berlin Social Science Center during my master's degree, I had the opportunity to participate in a research project by Christian Rauh and Pieter de Wilde on the salience of EU affairs in national parliamentary debates. In this context, we released the first version of ParlSpeech with annotated full-text vectors of 3.9 million plenary speeches in the key legislative chambers of seven European states. Over the course of several further projects, Christian and I decided to extend the dataset in a second version. The updated version includes new countries (e.g. New Zealand), new variables (such as agenda items wherever available), and extended time periods. We were also given the chance to summarize many of our insights in a book chapter on collecting large-scale comparative text data on legislative debates.
Within the OPTED project, we extended the ParlSpeech dataset to ParlLawSpeech (PLS), which enables the linkage of parliamentary speeches with the corresponding bill proposals and final laws. The new dataset offers a unique collection of machine-readable full-text vectors for legislative documents (parliamentary speeches, bills, and laws) alongside their metadata. PLS covers seven European countries (Austria, Czechia, Croatia, Denmark, Germany, Hungary, and Spain) and the European Parliament. The novelty of this dataset lies in the common identifier (see codebook) that links these three types of documents. This allows researchers to analyse the entire lawmaking process, from proposal through parliamentary debate to final enactment. Our website offers interactive visualisations and tutorials that help researchers explore relationships and trends in the PLS dataset and gain a deeper understanding of the European lawmaking process.
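To illustrate what a common identifier makes possible, here is a minimal Python sketch using made-up toy records. The field name `link_id`, the record contents, and the helper `link_documents` are assumptions for illustration only; the actual variable names are documented in the codebook.

```python
from collections import defaultdict

# Hypothetical toy records. "link_id" is an assumed name for the common
# identifier; consult the PLS codebook for the real variable names.
bills = [
    {"link_id": "DE-001", "bill_title": "Energy Act proposal"},
    {"link_id": "DE-002", "bill_title": "Data Protection Act proposal"},
]
laws = [
    {"link_id": "DE-001", "law_title": "Energy Act"},
]
speeches = [
    {"link_id": "DE-001", "speaker": "A", "text": "First toy speech."},
    {"link_id": "DE-001", "speaker": "B", "text": "Second toy speech."},
    {"link_id": "DE-002", "speaker": "C", "text": "Third toy speech."},
]


def link_documents(bills, speeches, laws, key="link_id"):
    """Group each bill with its debate speeches and its final law (if any)."""
    # Index speeches by identifier; one bill can have many speeches.
    speeches_by_id = defaultdict(list)
    for speech in speeches:
        speeches_by_id[speech[key]].append(speech)

    # Index laws by identifier; this sketch assumes at most one law per bill.
    laws_by_id = {law[key]: law for law in laws}

    linked = {}
    for bill in bills:
        identifier = bill[key]
        linked[identifier] = {
            "bill": bill,
            "speeches": speeches_by_id.get(identifier, []),
            "law": laws_by_id.get(identifier),  # None if not (yet) enacted
        }
    return linked


linked = link_documents(bills, speeches, laws)
for identifier, docs in linked.items():
    status = docs["law"]["law_title"] if docs["law"] else "no law recorded"
    print(identifier, len(docs["speeches"]), "speeches;", status)
```

Grouping by identifier like this keeps bills without a matching law in the result, which matters if you want to study proposals that were debated but never enacted.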
Should you plan to collect or analyse large parliamentary speech corpora yourself, feel free to get in touch with me. I am currently collecting parliamentary debates again for various projects. I am also always excited to learn about other projects with a similar focus and to think about possible collaborations.