Voronoi diagram with ggvoronoi package with Train Station data
I’ve always wanted to make a Voronoi diagram; I just think they are beautiful! When I came across a data set of train stations in Japan, I instantly thought it would be great data for one. I got the data from the Ekidata site (http://www.ekidata.jp/). I’m amazed at how many train stations we have in Japan, and at how much of the country the train system covers.
There are a couple of packages I could have used to make a Voronoi diagram, but I went with ggvoronoi. I really like passing “outline” to the geom_voronoi function to mask the diagram to a shape! (I wasn’t sure how to do that before, when I was using the deldir package.)
Voronoi Diagram with Train Stations as Seeds
ggvoronoi makes it easy to plot a Voronoi diagram! All I really needed to produce one was longitude and latitude.
Initially I plotted every train station as a point (using geom_point). You can see that the stations alone reveal the shape of Japan, as JR (Japan Railway) really covers the coastline. There are 10,828 points in total, as 10,828 stations were listed in the most recent data set, downloaded today.
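To make the idea concrete: the Voronoi cell of a station is the set of locations that are closer to that station than to any other. The sketch below illustrates only that rule in base R; it is not how the diagram in this post was computed. The `seeds` table, its coordinates, and the `nearest_seed()` helper are all made up for illustration.

```r
# Hypothetical seed points (illustrative lon/lat values, not taken from the Ekidata files).
seeds <- data.frame(
  name = c("A", "B", "C"),
  lon  = c(139.70, 139.77, 139.63),
  lat  = c(35.69, 35.68, 35.45)
)

# Return the name of the seed closest to a query location.
# Plain Euclidean distance in degrees is used, which is a simplification.
nearest_seed <- function(lon, lat, seeds) {
  d2 <- (seeds$lon - lon)^2 + (seeds$lat - lat)^2
  seeds$name[which.min(d2)]
}

# Every location whose nearest seed is "A" belongs to A's Voronoi cell.
nearest_seed(139.71, 35.70, seeds)
nearest_seed(139.65, 35.50, seeds)
```

A Voronoi diagram is simply the map you get by applying this rule to every location at once, which is why longitude and latitude are enough as input.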
I also used the treemap package to create a treemap.
I’ve colour-coded the rectangles inside the treemap by company type. 47% of the 10K+ stations are JR (Japan Railway) stations.
Tokyo (area: 2,188 sq. km) has 943 stations altogether, followed by Hokkaido with 650 stations. Hokkaido, however, is the biggest prefecture by area (83,456.87 sq. km). It would be interesting to get area data for every prefecture, so that we can calculate stations per unit area.
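As a small worked example of that stations-per-area idea, here is a base R sketch using only the two prefectures and the figures quoted above. The `pref_stats` table and its column names are my own for illustration; a full version would need an area figure for all 47 prefectures.

```r
# Figures quoted in the text above; areas are in square kilometres.
pref_stats <- data.frame(
  pref_name   = c("Tokyo", "Hokkaido"),
  station_cnt = c(943, 650),
  area_sqkm   = c(2188, 83456.87)
)

# Stations per 100 square kilometres.
pref_stats$stations_per_100sqkm <-
  100 * pref_stats$station_cnt / pref_stats$area_sqkm

pref_stats
```

With these numbers Tokyo comes out at roughly 43 stations per 100 sq. km, while Hokkaido has fewer than 1.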
## Packages used below: tidyverse (dplyr, ggplot2, stringr), ggvoronoi for the
## Voronoi geoms, mapdata for the "japan" map, patchwork for combining plots
library(tidyverse)
library(ggvoronoi)
library(mapdata)
library(patchwork)

jp <- ggplot2::map_data('world2', region='japan')
names(jp) <- c("lon","lat", "group","order","region","subregion")

## for train, I'm going to tidy up the map bit. (I've excluded Okinawa for now)
jp_outline <- jp %>%
  filter(subregion %in% c("Honshu","Hokkaido","Kyushu","Shikoku"))

## I also wanted prefecture level data, so I've used map data from mapdata package.
jp_outline_detailed <- map_data("japan")

## station_master lists all stations of all lines
plotPoints <- station_master %>%
  ggplot(aes(x=lon, y=lat)) +
  theme_void(base_family="Roboto Condensed") +
  geom_polygon(data=jp_outline, aes(group=group), fill="#ffffff", color="#33333380") +
  geom_point(aes(color=pref_cd), size=0.1, alpha=0.8) +
  scale_color_viridis_c(end=0.5, guide="none") +
  labs(title="Each Train Station as a point") +
  coord_quickmap()

## station_master2 is reduced version of station_master
plotVoronoi <- station_master2 %>%
  ggplot(aes(x=lon, y=lat)) +
  theme_void(base_family="Roboto Condensed") +
  geom_polygon(data=jp_outline, aes(group=group), fill="#ffffff00", color="#33333380") +
  geom_path(stat="voronoi", size=0.1, aes(color=pref_cd)) +
  coord_quickmap() +
  scale_color_viridis_c(end=0.5, guide="none") +
  labs(title="Voronoi Diagram with station as a seed")

## use patchwork package to plot 2 plots side by side
plotPoints + plotVoronoi
## All of Japan - Takes long time to draw on my machine.
station_master2 %>%
  ggplot(aes(x=lon, y=lat)) +
  theme_void(base_family="Hiragino Sans W5") +
  geom_voronoi(aes(fill=station_cnt), size=0.05, color="#ffffff",
               outline=jp_outline) + ## this outline feature is awesome!
  coord_quickmap() +
  scale_fill_viridis_c(end=0.8, option="magma", guide="none")
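The code above refers to station_master2 as a reduced version of station_master, but the post does not show how it was built. One plausible way, sketched here purely as an assumption, is to collapse rows that share a station name and coordinates (a station served by several lines appears once per line) and keep a count. A Voronoi diagram needs distinct seed points, so some step like this is required. The demo table, its approximate coordinates, and the meaning given to `station_cnt` are hypothetical.

```r
# Hypothetical miniature of station_master: one row per station per line.
# Coordinates are approximate and for illustration only.
station_master_demo <- data.frame(
  station_name = c("Shinjuku", "Shinjuku", "Shinjuku", "Yoyogi", "Harajuku"),
  line_name    = c("Yamanote", "Chuo", "Saikyo", "Yamanote", "Yamanote"),
  lon          = c(139.7003, 139.7003, 139.7003, 139.7021, 139.7027),
  lat          = c(35.6897, 35.6897, 35.6897, 35.6830, 35.6702),
  stringsAsFactors = FALSE
)

# Collapse rows that share a name and coordinates, counting how many rows
# were merged. Treating that count as station_cnt is an assumption.
station_demo2 <- aggregate(
  line_name ~ station_name + lon + lat,
  data = station_master_demo,
  FUN  = length
)
names(station_demo2)[names(station_demo2) == "line_name"] <- "station_cnt"

station_demo2
```

After this step each coordinate pair appears once, and `station_cnt` can be mapped to the fill colour of the cell.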
Treemap with treemap package
With a treemap, I can easily see which prefectures have more stations. I also wanted to see which railway companies are dominant in each prefecture.
## Treemap to see which prefecture has more stations.
station_master %>%
  count(pref_name, company_type_descr, company_name_r) %>%
  add_count(pref_name, wt=n) %>%
  mutate(pref_descr = paste(pref_name, ":", nn, "駅")) %>%
  treemap(index=c("pref_descr","company_type_descr","company_name_r"),
          vSize="n",
          vColor="company_type_descr",
          type="categorical",
          fontfamily.labels="Hiragino Sans W3",
          align.labels=list(c("left","top"), c("center","center"), c("right","bottom")),
          fontsize.labels=c(13,0,11),
          palette=viridis_pal(end=0.6)(4),
          border.col="white",
          bg.labels=0,
          position.legend="bottom",
          title.legend="",
          title="Number of Stations by Prefecture\ncoloured by operating company types")
Writing Function to Plot Prefecture Level Voronoi
There are 47 prefectures in Japan, so I decided to write a function to draw the Voronoi diagram for a given prefecture, as below. I think it can be simplified, but for now it does the job.
## function to draw voronoi map at prefecture level
draw_pref <- function(pref_no=1, zoom=T, save_file=F, folder_name="prefecture", ...){
  region <- prefs %>% filter(pref_cd==pref_no) %>% pull(pref_name_en)
  region_jp <- prefs %>% filter(pref_cd==pref_no) %>% pull(pref_name)

  pref_summary <- station_master %>%
    filter(pref_cd==pref_no) %>%
    summarise(station_cnt=n(),
              line_cnt=n_distinct(line_name),
              company_count=n_distinct(company_name_r))

  tmp_df <- station_master2 %>% filter(pref_cd==pref_no)
  pref_outline <- map_data("japan", region=region)
  capital <- jpnprefs %>%
    mutate(pref_cd=row_number()) %>%
    filter(pref_cd==pref_no)

  ## calculate distance between capital city & each station so i can colour the cell of voronoi.
  tmp_df <- tmp_df %>%
    mutate(dist_from_capital = sqrt((lon-capital$capital_longitude)^2 +
                                    (lat-capital$capital_latitude)^2))

  # finding bounding box from train station data... , so I can crop the map if I want to.
  bbox <- tmp_df %>%
    ungroup() %>%
    summarise(xmax=max(lon), xmin=min(lon), ymax=max(lat), ymin=min(lat))

  base_map <- tmp_df %>%
    ggplot(aes(x=lon, y=lat)) +
    theme_void(base_family="Hiragino Sans W5") +
    #geom_voronoi(aes(fill=comp_cd_min), size=0.1, color="#ffffff",
    #             outline=pref_outline) +
    geom_voronoi(aes(fill=dist_from_capital), size=0.1, color="#ffffff",
                 outline=pref_outline) +
    #scale_fill_gradientn(colors = c("#440154FF","#000000FF","#31688EFF", "#1F9E89FF","#6DCD59FF"),
    #                     breaks=c(0,7,30,100,200), limits=c(1,250), guide="none") +
    scale_fill_viridis_c(end=0.9, guide="none", option="magma") +
    labs(title=paste0(region_jp, " (", region, ")"),
         caption=paste0("Capital City of ", region, " is ", capital$capital, " @ (",
                        round(capital$capital_longitude,2), ",",
                        round(capital$capital_latitude,2), ")"),
         subtitle=paste(pref_summary$station_cnt, "stations",
                        pref_summary$line_cnt, " lines operated by",
                        pref_summary$company_count, "companies in",
                        str_to_title(region))) +
    geom_point(data=capital, aes(x=capital_longitude, y=capital_latitude),
               shape=4, color="#ffffff")

  if (zoom) {
    print(base_map + coord_quickmap(xlim=c(bbox$xmin-0.1, bbox$xmax+0.1),
                                    ylim=c(bbox$ymin-0.1, bbox$ymax+0.1)))
  } else {
    print(base_map + coord_quickmap())
  }

  if (save_file) {
    ggsave(paste0(folder_name, "/",
                  formatC(pref_no, width=2, flag="0"), "-", str_to_lower(region), ".png"),
           width=9, height=9, dpi=300)
  }
}

## function to draw treemap at prefecture level
draw_treemap <- function(pref_no=1, ...){
  station_master$color <- viridis_pal(end=0.6)(nlevels(station_master$company_type_descr))[station_master$company_type_descr]
  title_text <- prefs %>% filter(pref_cd==pref_no) %>% pull(pref_name_en)

  station_master %>%
    filter(pref_cd==pref_no) %>%
    count(company_type_descr, company_name_r, line_name, color, station_name) %>%
    treemap(index=c("company_type_descr","company_name_r","line_name","station_name"),
            vSize="n",
            vColor="color",
            type="color",
            fontfamily.labels="Hiragino Sans W3",
            fontfamily.title="Roboto Condensed",
            align.labels=list(c("center","center"), c("left","top"),
                              c("right","bottom"), c("center","center")),
            fontsize.labels=c(0,13,11,0),
            border.col=c("#ffffffff","#ffffff90","#ffffff30","#ffffff10"),
            border.lwds=c(3,2,1,0.2),
            bg.labels=0,
            title.legend="",
            title="",
            aspRatio=16/9)
}
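When save_file is switched on, draw_pref() writes one PNG per prefecture, named with a zero-padded prefecture code. The base R sketch below reproduces just that naming scheme so it can be checked without the station data. `pref_file_path()` is a hypothetical helper of mine, and it uses `tolower()` and `file.path()` in place of the stringr and `paste0()` calls in the function above.

```r
# Build "<folder>/<two-digit code>-<region in lower case>.png",
# mirroring the naming pattern used inside draw_pref().
pref_file_path <- function(pref_no, region, folder_name = "prefecture") {
  file.path(
    folder_name,
    paste0(formatC(pref_no, width = 2, flag = "0"), "-", tolower(region), ".png")
  )
}

pref_file_path(13, "Tokyo")
pref_file_path(1, "Hokkaido")

# Hypothetical batch run (not executed here: it needs the data and draw_pref()).
# The output folder has to exist before ggsave() can write into it.
# dir.create("prefecture", showWarnings = FALSE)
# for (i in 1:47) draw_pref(i, zoom = TRUE, save_file = TRUE)
```

The zero padding keeps the saved files in prefecture-code order when they are sorted by name.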
Tokyo!
While it’s interesting to see a Voronoi map of the whole of Japan, I wanted to zoom into a few prefectures that I care more about.
Firstly, Tokyo. I love looking at Tokyo’s train maps, such as the JR East Route Map (PDF).
For the Voronoi diagram below, I decided to colour each cell by the distance from Shinjuku (listed as the capital city of Tokyo) to the station that owns the cell. (I actually think it would be more interesting to colour the cells with train usage data, but because there are so many different operating companies, getting usage data seemed like a pretty hard task.)
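One caveat on that distance: draw_pref() computes it as a straight-line distance in degrees, which is fine for colouring but is not a distance in kilometres, because a degree of longitude is shorter than a degree of latitude away from the equator (by a factor of about cos(35.7°) ≈ 0.81 around Tokyo). If kilometres were wanted, a haversine great-circle distance is one common option. The sketch below is an alternative, not what the post's plots use; `haversine_km()`, the demo stations, and the capital coordinates are illustrative assumptions.

```r
# Great-circle distance in kilometres between lon/lat points given in degrees.
# Assumes a spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical stations and capital location (approximate, illustrative values).
stations_demo <- data.frame(
  station_name = c("East", "West"),
  lon = c(139.80, 139.10),
  lat = c(35.70, 35.80)
)
capital_lon <- 139.69
capital_lat <- 35.69

# The function is vectorised, so it works directly on data frame columns.
stations_demo$dist_km <- haversine_km(
  stations_demo$lon, stations_demo$lat, capital_lon, capital_lat
)
stations_demo
```

Because the function is vectorised, it could replace the sqrt(...) expression inside mutate() without other changes; the colour ranking of cells would shift only slightly at this scale.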
I like how densely the train stations are packed around central Tokyo (the east side), while the cells become bigger and bigger as you go west. In fact, on the far west side of Tokyo there are not many stations at all.
I’ve also created a treemap for Tokyo below. Personally, I was surprised that there may be more Tokyo Metro stations than JR stations in Tokyo. I also came to realize just how many companies there are.
draw_pref(13, zoom=T, save_file=F)
draw_treemap(13)
## jpn_pref() is assumed to come from the jpndistrict package; geom_sf()/coord_sf()
## need sf installed, and geom_text_repel() comes from ggrepel.
station_master %>%
  filter(pref_cd==13) %>%
  arrange(e_sort) %>%
  ggplot(aes(x=lon, y=lat)) +
  geom_sf(data=jpn_pref(13), inherit.aes=F, color="#33333320") +
  geom_point(aes(color=company_name_r, shape=company_type_descr), alpha=0.8) +
  ## theme_minimal() goes first: a complete theme added later would overwrite the theme() tweaks
  theme_minimal(base_family="Hiragino Sans W5") +
  theme(axis.text=element_blank(), axis.title=element_blank()) +
  geom_text_repel(data=station_master2 %>% filter(pref_cd==13 & station_cnt>6),
                  aes(label=station_name), family="Osaka",
                  min.segment.length=0, nudge_x=0.25, segment.color="#33333350") +
  scale_color_hue(l=45) +
  coord_sf(ylim=c(35.5,35.9), xlim=c(138.8,139.9)) ## to remove islands of Tokyo
Plotting Kanagawa Prefecture
Kanagawa prefecture is where Yokohama is, and also where one of my favourite places, Kamakura, is. I like the shape of the prefecture, as it sort of looks like a dog?! Maybe a camel?!
The capital city of Kanagawa prefecture is Yokohama, and I’ve again coloured the cells by distance from Yokohama. As in Tokyo, the east side of Kanagawa has a lot of stations but the west side is pretty sparse.
draw_pref(14, zoom=T, save_file=F)
draw_treemap(14)
## See Station in Kanagawa
station_master %>%
  filter(pref_cd==14) %>%
  arrange(e_sort) %>%
  ggplot(aes(x=lon, y=lat)) +
  geom_sf(data=jpn_pref(14), inherit.aes=F, color="#33333320") +
  geom_point(aes(color=company_name_r, shape=company_type_descr), alpha=0.8) +
  ## theme_minimal() goes first: a complete theme added later would overwrite the theme() tweaks
  theme_minimal(base_family="Hiragino Sans W5") +
  theme(axis.text=element_blank(), axis.title=element_blank()) +
  geom_text_repel(data=station_master2 %>% filter(pref_cd==14 & station_cnt>3),
                  aes(label=station_name), family="Osaka",
                  min.segment.length=0, nudge_x=0.25, segment.color="#33333350") +
  scale_color_hue(l=45)
Plotting Chiba Prefecture
Chiba is where Narita Airport is. I just had to plot it, because I like the shape of the prefecture 🙂 It looks like a hummingbird to me, but Chiba prefecture actually has a mascot called Chi-ba-kun, and it’s a dog character.
draw_pref(12, zoom=T, save_file=F)
draw_treemap(12)
## See Station in Chiba
station_master %>%
  filter(pref_cd==12) %>%
  arrange(e_sort) %>%
  ggplot(aes(x=lon, y=lat)) +
  geom_sf(data=jpn_pref(12), inherit.aes=F, color="#33333320") +
  geom_point(aes(color=company_name_r, shape=company_type_descr), alpha=0.8) +
  ## theme_minimal() goes first: a complete theme added later would overwrite the theme() tweaks
  theme_minimal(base_family="Hiragino Sans W5") +
  theme(axis.text=element_blank(), axis.title=element_blank()) +
  geom_text_repel(data=station_master2 %>% filter(pref_cd==12 & station_cnt>3),
                  aes(label=station_name), size=3, family="Osaka",
                  min.segment.length=0, nudge_x=0.25, segment.color="#33333350") +
  scale_color_hue(l=45)
Bonus: Plotting Hokkaido Prefecture
Hokkaido is the largest prefecture in Japan, and it has the second-largest number of train stations. (Although it ranks second by station count, Hokkaido is about 38 times bigger than Tokyo in area, going by the area figures quoted earlier.)
The shape of Hokkaido is pretty iconic (at least in my mind). I recently found out there’s a heart-shaped lake called Lake Toyoni in Hokkaido too, but I didn’t spot a heart-shaped Voronoi cell…
draw_pref(1, zoom=T, save_file=F)
draw_treemap(1)
## To see station on actual map
station_master %>%
  filter(pref_cd==1) %>%
  arrange(e_sort) %>%
  ggplot(aes(x=lon, y=lat)) +
  geom_sf(data=jpn_pref(1), inherit.aes=F, color="#33333320") +
  geom_point(aes(color=company_name_r, shape=company_type_descr), alpha=0.8) +
  ## theme_minimal() goes first: a complete theme added later would overwrite the theme() tweaks
  theme_minimal(base_family="Hiragino Sans W5") +
  theme(axis.text=element_blank(), axis.title=element_blank()) +
  geom_text_repel(data=station_master2 %>% filter(pref_cd==1 & station_cnt>=3),
                  aes(label=station_name), family="Osaka",
                  min.segment.length=0, nudge_x=2, segment.color="#33333350") +
  scale_color_hue(l=45, name="company name")