From April to June 2012, second-year student Tjerk Smit from the Communication Media and Design course in Amsterdam did his internship at the Research & Development department of Sound and Vision. During those ten weeks he explored new ways of visualising the data of the items in the EUscreen collection. In this blog post he tells us a bit more about that visualisation. EUscreen keeps an overview of the projects we've done on visualising the collection on our demo page.
You can see Tjerk's data visualisation at http://demo.euscreen.eu/datavisualisation/. Any feedback, praise or tips for further improvement are highly welcome. If you're a developer and reading this, you're invited to make use of our linked open data capabilities to give visualising EUscreen a spin.
With the help of the d3.js library, I was able to create a broad range of visualisations. You can find many great examples on the d3.js website. Most of those examples use datasets that are based on a hierarchical structure. The data on EUscreen is not hierarchical, so I needed to find another way to visualise it and came upon the parallel diagram. A parallel diagram is used to visualise multidimensional categorical data. Such data can be summed up in a so-called cross-tabulation, a table that counts how often each combination of categories occurs. With this diagram you can explore and analyse the data in an interactive way.
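To make the idea of a cross-tabulation concrete, here is a minimal sketch in plain JavaScript. The sample rows and the field names `language` and `type` are made up for illustration and are not the real EUscreen columns, and this is not the code behind the demo.

```javascript
// Hypothetical sample rows; the field names are illustrative, not the real EUscreen schema.
const items = [
  { language: "Dutch", type: "video" },
  { language: "Dutch", type: "image" },
  { language: "English", type: "video" },
  { language: "Dutch", type: "video" },
];

// Count how often each pair of categories from two dimensions occurs together.
function crossTabulate(rows, dimA, dimB) {
  const table = {};
  for (const row of rows) {
    const a = row[dimA];
    const b = row[dimB];
    if (!table[a]) table[a] = {};
    table[a][b] = (table[a][b] || 0) + 1;
  }
  return table;
}

const table = crossTabulate(items, "language", "type");
console.log(table);
```

Each cell of the resulting table corresponds to one connection between a category in one dimension and a category in the next.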
How it works
The data is loaded through a comma-separated values (CSV) file. The first line of the CSV contains all the column names: all the different dimensions you can filter on. Below you can see a (simplified) example of such a CSV file. Every item on EUscreen is one row in the document, which contains 20083 lines. Minus the first line with the column names, that makes a total of 20082 items that are loaded and visualised: the entire EUscreen collection in April 2012.
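The structure described above (a header line with the dimensions, then one row per item) can be sketched as follows. The CSV content is invented for illustration, and the parser is a deliberately naive assumption that splits on commas without handling quoted fields; a real project would more likely rely on the CSV support that d3.js provides.

```javascript
// Hypothetical, simplified CSV; the column names and values are made up for illustration.
const csvText = [
  "title,language,type",
  "Item A,Dutch,video",
  "Item B,English,image",
  "Item C,Dutch,video",
].join("\n");

// Naive parser: splits on newlines and commas, so it does not handle quoted fields.
function parseCsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  const columns = lines[0].split(","); // first line: the dimensions
  const rows = lines.slice(1).map((line) => {
    const values = line.split(",");
    const row = {};
    columns.forEach((name, i) => {
      row[name] = values[i];
    });
    return row;
  });
  return { columns, rows };
}

const { columns, rows } = parseCsv(csvText);
console.log(columns); // the dimensions available for filtering
console.log(rows.length); // number of items: total lines minus the header line
```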
With the diagram I created, you can change the dimensions (column names) you want to compare. If you click on a filter, a menu will expand. Doing so will add an extra dimension to the diagram. Within the filters you can toggle different options (categories). For each dimension, a horizontal bar is shown for each of its possible categories. The width of the bar denotes the absolute number of matches for that category. Every category in the first dimension is connected to a number of categories in the next dimension, showing how that category is subdivided. Within the graph you can drag the dimensions and categories to reorder them. If you hover over the dimension names you will see two links: alpha and size. With alpha you can sort the categories in alphabetical order. With size you can sort the categories by size.
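The bar widths and the two sort options can be illustrated with a small sketch. The sample data, the function names and the 600-pixel total width are assumptions for the example, as is the choice to sort by size from largest to smallest; the demo itself may do this differently.

```javascript
// Hypothetical sample rows with a single dimension.
const items = [
  { language: "German" },
  { language: "Dutch" },
  { language: "Dutch" },
  { language: "English" },
  { language: "Dutch" },
  { language: "German" },
];

// Count the number of items per category within one dimension.
function countCategories(rows, dimension) {
  const counts = new Map();
  for (const row of rows) {
    counts.set(row[dimension], (counts.get(row[dimension]) || 0) + 1);
  }
  return Array.from(counts, ([name, count]) => ({ name, count }));
}

// "alpha" sorts by name; anything else sorts by count, largest first (an assumption).
function sortCategories(categories, mode) {
  const sorted = categories.slice(); // leave the input array untouched
  if (mode === "alpha") {
    sorted.sort((a, b) => a.name.localeCompare(b.name));
  } else {
    sorted.sort((a, b) => b.count - a.count);
  }
  return sorted;
}

// Give each category a bar width proportional to its number of matches.
function barWidths(categories, totalWidth) {
  const total = categories.reduce((sum, c) => sum + c.count, 0);
  return categories.map((c) => ({
    name: c.name,
    width: (c.count * totalWidth) / total,
  }));
}

const categories = countCategories(items, "language");
const byAlpha = sortCategories(categories, "alpha");
const bySize = sortCategories(categories, "size");
const widths = barWidths(bySize, 600);
console.log(byAlpha, bySize, widths);
```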
The demo that I made is not perfect and improvements could be made. The data is not loaded directly from the EUscreen website, so the dataset is not dynamic: if items are added on EUscreen, they are not automatically added to the CSV file. Another issue is that the data is not kept in memory, so every time you change a filter the whole CSV is loaded again. This means that the 4.5 MB CSV has to be downloaded again, which takes quite some time. A final improvement would be to create the menu for the filters dynamically. The filters are currently static and written by hand. Instead, they could be created by reading the first line of the CSV file (the column names), and the options (categories) within each filter could be created by reading the values of all the items in the CSV file.
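Two of these improvements, keeping the data in memory and generating the filter menu from the CSV, could look roughly like the sketch below. Everything here is hypothetical: `loadCsvText` is a stand-in for the real download that simply counts how often it is called, the CSV content is invented, and the parser ignores quoted fields.

```javascript
// Wrap a loader so that it runs only once; later calls reuse the stored result.
function once(load) {
  let loaded = false;
  let value;
  return function () {
    if (!loaded) {
      value = load();
      loaded = true;
    }
    return value;
  };
}

// Hypothetical stand-in for the real download; counts calls to make the caching visible.
let downloads = 0;
function loadCsvText() {
  downloads += 1;
  return "language,type\nDutch,video\nEnglish,image\nDutch,image";
}
const getCsvText = once(loadCsvText);

// Naive parser: no support for quoted fields.
function parseCsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  const columns = lines[0].split(",");
  const rows = lines.slice(1).map((line) => {
    const values = line.split(",");
    const row = {};
    columns.forEach((name, i) => {
      row[name] = values[i];
    });
    return row;
  });
  return { columns, rows };
}

// Build the filter menu from the data: one filter per column, one option per distinct value.
function buildFilterMenu(columns, rows) {
  const menu = {};
  for (const column of columns) {
    const options = new Set(rows.map((row) => row[column]));
    menu[column] = Array.from(options).sort();
  }
  return menu;
}

const data = parseCsv(getCsvText());
getCsvText(); // a second filter change is served from memory
const menu = buildFilterMenu(data.columns, data.rows);
console.log(downloads, menu);
```

Because `once` stores whatever the loader returns, the same wrapper would also work with a loader that returns a promise for an asynchronous download.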
Visit the EUscreen data visualisation at: http://demo.euscreen.eu/datavisualisation/