What can Data Science tell us about Danish politics?

Danish Parliament. News Øresund – Johan Wessman.

Have you ever wondered whether politicians actually talk about the same things in parliament that they spend their energy on during elections? Have you wondered whether your favourite politician is actually fighting the battles that you elected them to fight for you? Do they care as much about the environment as they said? Which types of words are they using to address refugees, foreigners, and migrants? Do they care as much about elders as they claimed to during elections?

At Data Culture we have those same questions, and we believe some of them might be answered using computational methods – methods like Natural Language Processing (NLP). These are the same methods employed by companies like Amazon, Google and Facebook to better understand you and sell you ads. We learned that the debates in the Danish parliament are recorded and that the video files are available online.

We saw this as an amazing opportunity to keep track of politicians and make politics more transparent, but we also realized that nobody has the time or capacity to go through all of these hours of debate and make sense of them. Fortunately, we can employ sophisticated algorithms to help us. Typically we would use a speech-to-text model to transcribe these conversations, but by a stroke of luck, the Danish Parliament had already made transcripts available online, free for the public to access or scrape.

The parliamentary page provides only vague information on how the debates were transcribed. We did find a few typos and mistakes in the data, but overall the quality was high enough to support some interesting analyses of what goes on in the Danish Parliament.

In a series of posts, we will delve deeper into the findings from our analysis of this dataset. We set out to answer some of the following questions:

How can we approach and work with data from the Danish Parliament? How can we visualize and understand data about conversations and debates? Which approaches to the data are most useful for what? What can we learn from analyzing based on political parties versus individual politicians, or from analyzing words versus speeches, for instance?

With a good grasp of the data, we can move on to answer questions such as:

Who speaks the most in Parliament? What do different parties and politicians speak about? How do different parties and politicians speak about climate change, immigration and refugees or any other topic?
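As a rough illustration of the first of these questions, here is a minimal sketch of how speaking volume could be counted per politician or per party. The record layout (`speaker`, `party`, `text`), the sample speeches and the `words_per_group` helper are all hypothetical, made up for this example; they are not the actual format of the parliamentary transcripts or the scripts used in our analysis.

```python
from collections import Counter

# Hypothetical sample records: the real transcript format is not described
# in this post, so these fields and values are illustrative only.
speeches = [
    {"speaker": "Politician A", "party": "Party X",
     "text": "We must act on climate change now"},
    {"speaker": "Politician B", "party": "Party Y",
     "text": "Immigration policy needs reform"},
    {"speaker": "Politician A", "party": "Party X",
     "text": "Climate matters"},
    {"speaker": "Politician C", "party": "Party Y",
     "text": "I agree"},
]


def words_per_group(records, key):
    """Sum whitespace-separated word counts, grouped by the given field."""
    totals = Counter()
    for record in records:
        totals[record[key]] += len(record["text"].split())
    return totals


by_speaker = words_per_group(speeches, "speaker")
by_party = words_per_group(speeches, "party")

# Print groups from most to least words spoken.
for name, count in by_speaker.most_common():
    print(name, count)
for name, count in by_party.most_common():
    print(name, count)
```

Grouping by `party` instead of `speaker` is the same computation with a different key, which is one simple way to compare the party-level and politician-level views mentioned above. Splitting on whitespace is a crude word count; a real analysis would likely need proper tokenization for Danish.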

Info Box

Our analysis of the conversations in the Danish Parliament is the first in a series of projects that we plan to carry out based on freely available data, using methods that we make available as open source, and publishing our results here in a form that is digestible for the broader public. At Data Culture we work with data in two ways. We advise our clients on sustainable ways to use data science by critically examining the impacts and biases of their work with data, and we design and carry out data-driven solutions to sustainability challenges.

If you want to play around with data from the Danish Parliament and run your own queries, our scripts will be available on GitHub. We would love to hear about your analyses and findings, so please reach out to us by email or on Twitter. If you are interested in partnering on projects, have suggestions for further analyses we could do, or know of interesting datasets that could use crunching, please get in touch!
