German train monitor provides access to train delay data
The German newspaper Süddeutsche Zeitung (SZ) worked together with OpenDataCity to create an online train monitor of the German network: Zugmonitor. This is another great example of the new form of data journalism.
The project provides access to data on train delays collected over 150 days, between 2 October 2011 and 1 March 2012, and lets you analyse the delays in more detail.
Here is an example showing the delays by station.
This SZ article (in German) gives you an overview of the data and how to access it. I believe the most convenient way to query the data is through Google Fusion Tables, which lets you import the data into R with the read.csv function. The file name to use is a URL mixed with a little bit of SQL syntax.
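As a rough sketch of what such a call could look like for the delays by station (table ID 3166152): the endpoint address in `fusion_base` and the helper name `fusion_query_url` are my assumptions for illustration and are not taken from the SZ article, and the service may no longer respond at that address.

```r
# Assumed address of the Fusion Tables query endpoint (not given in the post)
fusion_base <- "https://www.google.com/fusiontables/api/query"

# Hypothetical helper: build a query URL for a given Fusion table ID.
# The SQL statement is URL-encoded so that spaces and '*' are safe to send.
fusion_query_url <- function(table_id) {
  sql <- paste("SELECT * FROM", table_id)
  paste0(fusion_base, "?sql=", URLencode(sql, reserved = TRUE))
}

# URL for the "by station" table
url_station <- fusion_query_url("3166152")

# read.csv accepts a URL in place of a file name; left commented out here
# because it needs network access and a responding endpoint:
# delays_by_station <- read.csv(url_station)
# head(delays_by_station)
```

Keeping the table ID as the only argument means switching to another data set only requires a different ID.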
The other sources can be accessed in the same way:
Delay | Fusion table ID |
by station | 3166152 |
between stations (all trains) | 3166064 |
between stations (ICE trains only) | 3166328 |
by country | 3166042 |
by cause | 3165200 |
by daytime | 3164289 |
by train type | 3165124 |
I am curious what people will make of the data. Apparently more data will be made available in the future. I will keep an eye on the project page.