
Cars in Netherlands

[This article was first published on Wiekvoet, and kindly contributed to R-bloggers].
I am looking for a new car, so when I saw that Statistics Netherlands had updated its vehicle data, I just had to go and look. I learned that brown is getting more popular, that the number of cars from a given build year is often larger at six years of age than at five, and that lighter cars are getting more popular, especially in these later years of the crisis.

Color of Cars

The data I downloaded contain the number of cars by color, build year and reference date. Unfortunately it is all in Dutch, but I translated all the relevant parts. The data actually contain 14 colors, including ‘other’, but some of the colors were so infrequent that they made the plots confusing, so I merged those into ‘other’. To plot what was sold in a certain year, I took the reference date of 1 January in the year after the build year.
The data show grey and black as the most sold colors; white and brown are getting more popular, blue and green less popular. There is also a small dip in 2009 (the crisis) and a new decrease in 2012. The news tells us 2013 is worse.
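
The two preparation steps described above (folding rare colours into ‘other’ and keeping only the count on 1 January after the build year) can be sketched on toy data. This is a minimal illustration only: the column name `Kleur`, the Dutch colour labels and all counts are invented here and are not taken from the Statistics Netherlands file.

```r
# Toy data; column names, colour labels and counts are invented for illustration.
cars <- data.frame(
  Kleur     = c("Grijs", "Zwart", "Roze", "Paars", "Grijs", "Roze"),
  BuildYear = c(2010, 2010, 2010, 2010, 2010, 2010),
  RefYear   = c(2011, 2011, 2011, 2011, 2012, 2012),
  Waarde    = c(500, 400, 3, 2, 510, 3),
  stringsAsFactors = FALSE
)

# Named lookup vector: rare colours are folded into "Other".
lookup <- c(Grijs = "Grey", Zwart = "Black", Roze = "Other", Paars = "Other")
cars$Colour <- unname(lookup[as.character(cars$Kleur)])

# Sum the counts per colour, build year and reference year.
totals <- aggregate(Waarde ~ Colour + BuildYear + RefYear, data = cars, FUN = sum)

# Keep the first count after the build year: 1 January of the following year.
new_cars <- totals[totals$RefYear == totals$BuildYear + 1, ]
print(new_cars)
```

Looking up by name rather than by position keeps the translation readable; the script at the end of this post instead indexes a translation vector by the factor codes of the Dutch labels.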

The % change in the number of cars is interesting. I was expecting that black cars might get into more accidents or something, because they are less visible. What I did see is a marked decrease in white cars, especially at the beginning of the previous decade. And there is a marked increase in cars of, say, 5 or 6 years old. Finally, at around ten years of age the cars start disappearing. The marked increase at 5 or 6 years may be explained by imports of older cars. I sometimes read or hear that they are bought in Germany. The Dutch tax on imported cars is fairly high, so it is attractive to import a cheaper second-hand car, which carries significantly less tax. For example, Athlon Car Lease Germany has a Dutch-language site especially for this purpose.
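
The year-on-year % change can be computed by joining the table to a copy of itself shifted by one reference year. A minimal sketch on toy counts for a single build year follows; the numbers are invented, chosen only to mimic a bump followed by a decline.

```r
# Toy counts for one build year; the numbers are invented for illustration.
counts <- data.frame(
  BuildYear = 2005,
  RefYear   = 2006:2009,
  x         = c(1000, 1010, 1060, 1007)
)

# Copy the table, shift the reference year forward by one and rename the count,
# so that each row of `counts` lines up with the previous year's value.
previous <- counts
previous$RefYear <- previous$RefYear + 1
names(previous)[names(previous) == "x"] <- "LastYearAmount"

# merge() joins on the shared columns (BuildYear, RefYear); the first
# reference year has no predecessor and drops out of the result.
change <- merge(counts, previous)
change$Pchange <- 100 * (change$x - change$LastYearAmount) / change$LastYearAmount
change$Age <- change$RefYear - change$BuildYear
print(change)
```

Because the join is an inner join, any combination without a count in the previous year simply disappears rather than producing a missing value.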

Fuel

There are three kinds of frequently used fuel: gasoline (benzine), diesel and LPG (liquefied petroleum gas). Gasoline is the standard: the most expensive fuel, but the lowest in road tax. Diesel is cheaper to run but more expensive in tax, and LPG even more so. It boils down to this: diesel or LPG if you drive a lot, gasoline if you drive a little. It is possible to convert a car from gasoline to LPG, which may explain the big increases in the number of LPG cars among older cars.

Weight

Weight is also interesting. It is reported in tax categories, which I translated into continuous values by taking the lower end of each bracket. The first bracket is 0 to 450 kg, followed by 100 kg steps. Beats the hell out of me what cars are under 650 kg, but there is something. Obviously most cars are between about 700 and 1800 kg. There is also some shift around 750 kg; if you have good color discrimination you might see what exactly. My eyes don't, so the next plot shows that cars of 651 to 750 kg are getting less frequent and cars of 751 to 850 kg much more frequent. There are tax advantages to lower fuel consumption, and the state of the economy can only strengthen that. Somewhat heavier cars suffer from a decrease. Above 1500 kg the market seems to work differently.
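
Turning bracket labels into their lower bounds comes down to one regular expression. The sketch below uses hypothetical labels; the real labels in the downloaded file may be formatted differently.

```r
# Hypothetical bracket labels; the real labels in the file may differ.
labels <- c("751-850 kg", "0-450 kg", "2451 kg en meer", "651-750 kg")

# Translate the open-ended Dutch label.
labels <- gsub("en meer", "and more", labels)

# Drop everything from the first space or hyphen onward, leaving the lower bound.
lower <- as.numeric(gsub("( |-).*$", "", labels))

# Order the factor levels by lower bound so that facets and legends sort by weight.
weight_cat <- factor(labels, levels = labels[order(lower)])
print(data.frame(labels, lower))
print(levels(weight_cat))
```

Without the explicit level order, the brackets would sort alphabetically, which puts "2451 kg and more" before "651-750 kg".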

The final plots show that the number of heavier cars decreases most as they get older, which makes sense: they are still pretty expensive to run in terms of fuel and tax. I chose the final plot because it shows the big increase in love for small cars again: it would seem old small cars don't disappear, or their number even increases, while for bigger cars, the 1151 to 1250 kg bracket shows prominent decreases from 7 years of age.

R code.

library(ggplot2)
# stringsAsFactors=TRUE: the colour translation below indexes by factor codes,
# and R >= 4.0 no longer converts strings to factors by default
col1 <- read.csv2('Motorvoertuigen__per_260513101348.csv',na.strings='-',
        stringsAsFactors=TRUE)
col2 <- col1[!is.na(col1$Waarde),]
col2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(col2$Bouwjaren)))
col2$RefYear <- as.numeric(sub(', 1 januari','',as.character(col2$Peildatum)))
# translate the 14 Dutch colour levels, merging the infrequent ones into Other
col2$Colour <- factor( c("Beige", "Blue", "Brown", "Other", "Yellow", 
        "Grey", "Green", "Other", "Other",
        "Other", "Red", "Other", "White" ,"Black")[col2$Onderwerpen_2])
col3 <- aggregate(col2$Waarde,list(Colour=col2$Colour,BuildYear=col2$BuildYear,
        RefYear=col2$RefYear),sum)
# new cars: counted on 1 January of the year after the build year
col4 <- col3[col3$RefYear==col3$BuildYear+1,]
# one colour per level of Colour, in alphabetical order (Other is purple)
colourcode <-  c("#C8AD7F", "Black","Blue", "Brown", "Green" ,
    "Grey" ,  "Purple" , "Red","White" , "Yellow")
png('col1.png')
p <- ggplot(col4, aes(x=BuildYear, y=x, colour=Colour))
p + geom_line() + 
    scale_colour_manual(values=colourcode) +
    scale_y_log10("Number of vehicles") +
    scale_x_continuous(breaks=seq(2000,2012,2))
dev.off()

##############

# % change: merge each count with the count one reference year earlier
lastyeardata <- col3[,c('x','BuildYear','Colour','RefYear')] 
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='x'] <- 'LastYearAmount'
change <- merge(x=col3,y=lastyeardata)
change$Pchange <- with(change,100*(x-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('col2.png')
p <- ggplot(change[change$BuildYear<2010,], aes(x=Age, y=Pchange, colour=Colour))
p + geom_line() + 
    scale_colour_manual(values=colourcode) +
    scale_y_continuous("% Change in number of vehicles") +
    facet_wrap(~BuildYear,nrow=2)
dev.off()

############
fuel1 <- read.csv2('Motorvoertuigen__per_010613135050.csv',na.strings='-')
fuel2 <- fuel1[!is.na(fuel1$Waarde),]
fuel2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(fuel2$Bouwjaren)))
fuel2$RefYear <- as.numeric(sub(', 1 januari','',as.character(fuel2$Peildatum)))
fuel4 <- fuel2[fuel2$RefYear==fuel2$BuildYear+1,]
png('fuel1.png')
p <- ggplot(fuel4, aes(x=BuildYear, y=Waarde, colour=Onderwerpen_2))
p + geom_line() + 
    scale_y_continuous("Number of vehicles") +
    scale_x_continuous(breaks=seq(2000,2012,2)) +
    labs(colour="Fuel")
dev.off()
##

lastyeardata <- fuel2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')] 
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=fuel2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('fuel2.png')
p <- ggplot(change[change$BuildYear<2010,], 
    aes(x=Age, y=Pchange, colour=Onderwerpen_2))
p + geom_line() + 
    scale_y_continuous("% Change in number of vehicles") +
    facet_wrap(~BuildYear,nrow=2) +
    labs(colour="Fuel")
dev.off()

##############

# stringsAsFactors=TRUE: levels() below needs Onderwerpen_2 to be a factor
weight1 <- read.csv2('Motorvoertuigen__per_010613140907.csv',na.strings='-',
        stringsAsFactors=TRUE)
weight2 <- weight1[!is.na(weight1$Waarde),]
weight2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(weight2$Bouwjaren)))
weight2$RefYear <- as.numeric(sub(', 1 januari','',as.character(weight2$Peildatum)))
weightcats <- levels(weight2$Onderwerpen_2)
weightcats <- gsub('en meer','and more',weightcats)
levels(weight2$Onderwerpen_2) <- weightcats 
# lower end of each bracket: drop everything from the first space or hyphen
lweightcats <- as.numeric(gsub('( |-).*$','',weightcats))
weight2$lweight <- lweightcats[weight2$Onderwerpen_2]
# order the brackets by weight rather than alphabetically
weightcats <- weightcats[order(lweightcats)]
weight2$WeightCat <- factor(weight2$Onderwerpen_2,levels=weightcats)
weight4 <- weight2[weight2$RefYear==weight2$BuildYear+1,]
png('weight1.png')
p <- ggplot(weight4, aes(x=lweight, y=Waarde, colour=factor(BuildYear)))
p + geom_line() + 
    scale_y_continuous("Number of vehicles") +
    labs(colour='Build Year')
dev.off()

png('weight2.png')
p <- ggplot(weight4[weight4$lweight>600 & weight4$lweight<1800,], aes(x=BuildYear,y=Waarde))
p + geom_line() + 
    scale_y_continuous("Number of vehicles") + 
    facet_wrap(~WeightCat)
dev.off()
##

lastyeardata <- weight2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')] 
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=weight2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('weight3.png')
p <- ggplot(change[change$lweight>600 & change$lweight<2200 & change$BuildYear<2010,],
    aes(x=Age, y=Pchange, colour=WeightCat))
p + geom_line() + 
    scale_y_continuous("% Change in number of vehicles") +
    facet_wrap(~BuildYear,nrow=2)
dev.off()

png('weight4.png')
p <- ggplot(change[change$lweight>600 & change$lweight<1200,],
    aes(x=RefYear, y=Pchange, colour=factor(Age)))
p + geom_line() + 
    scale_y_continuous("% Change in number of vehicles") +
    facet_wrap(~WeightCat)
dev.off()
