Basics and principles of D3partitionR
D3partitionR is a package for plotting sequential and hierarchical data using treemaps (including circle treemaps), sunburst and partition charts, and collapsible trees (indented or not).
The package has a single all-in-one function, D3partitionR(…), to create a partition chart. Two other functions, renderD3partitionR(…) and D3partitionROutput(…), are used to render partition charts in Shiny.
The goal of this tutorial is to build a Shiny app that plots the evolution of Japan's trade from 1988 to 2015.
You can find all the code here:
https://gitlab.com/ant-guillot/ExploringJapanTradeApp_Kaggle/tree/master
Exploring Japan Trade from 1988 to 2015: downloading the data
The data we are going to use are available on Kaggle.
They cover Japan's imports and exports by country and area, and by type of goods (with several classifications).
Download and unzip the data at the root of your project directory; we are going to need them later. The directory should then contain the data files, including the ones read later in this tutorial (year_latest.csv, hs2_eng.csv, hs4_eng.csv, hs6_eng.csv and country_eng.csv).
Creating the project.
First, create an RStudio Shiny project, then install the packages we'll need:
- shinydashboard
- highcharter
- data.table
- D3partitionR
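If you prefer to do this from the console, here is a minimal sketch that installs only the packages that are missing. The helper name missing_packages is made up for this example, and the install step assumes every package is available from your configured repository (worth checking for D3partitionR in particular).

```r
# Packages used in this tutorial
needed <- c("shiny", "shinydashboard", "highcharter", "data.table", "D3partitionR")

# Hypothetical helper: return the packages that cannot be loaded
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

to_install <- missing_packages(needed)

# Only install from an interactive session, so that sourcing this file
# in a script never triggers a download by surprise
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```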
Creating the layout:
Here is what we want the final application to contain:
- A tabBox with two tabs. Each tab contains:
  - A box with a time interval input and a radio button to switch between import and export
  - A partition chart to give a global overview of the period
  - A time series to see the evolution of the different areas and segments (linked with the partition chart).
To create the layout, we need to:
- Create ui.R and server.R
- Modify ui.R as described in the next steps
- Load the packages we need (also do this in server.R):
library(shiny)
library(D3partitionR)
library(data.table)
library(highcharter)
library(shinydashboard)
- Create the dashboard layout (the body will be created later):
dashboardPage(
  dashboardHeader(disable = TRUE),
  dashboardSidebar(disable = TRUE),
  body
)
- Create the body and the boxes:
body = dashboardBody(fluidPage(
  h2("Exploring Japan trade from 1988 to 2015", align = "center", style = "font-variant: small-caps;"),
  tabBox(width = 12,
    tabPanel("Export and import by country",
      fluidRow(
        box(width = 12, title = "Options", solidHeader = TRUE, status = "primary"),
        box(width = 6, height = 800),
        box(width = 6, height = 800)
      )
    ),
    tabPanel("Export and import by type of product",
      fluidRow(
        box(width = 12, title = "Options", solidHeader = TRUE, status = "primary"),
        box(solidHeader = TRUE, width = 6, height = 700),
        box(solidHeader = TRUE, width = 6)
      )
    )
  )
))
Once the output placeholders for the chart and the time series are added, your ui.R should look like this:
library(shiny)
library(D3partitionR)
library(data.table)
library(highcharter)
library(shinydashboard)

# Body which contains the boxes
body = dashboardBody(fluidPage(
  h2("Exploring Japan trade from 1988 to 2015", align = "center", style = "font-variant: small-caps;"),
  tabBox(width = 12,
    tabPanel("Export and import by country",
      fluidRow(
        # Options box where our slider input and the import/export switch will be put
        box(width = 12, title = "Options", solidHeader = TRUE, status = "primary"),
        box(D3partitionROutput("D3Part1"), width = 6, height = 800),
        box(highchartOutput("Graph", height = "600px"), width = 6)
      )
    ),
    tabPanel("Export and import by type of product",
      fluidRow(
        box(width = 12, title = "Options", solidHeader = TRUE, status = "primary"),
        box(solidHeader = TRUE, width = 6, height = 700),
        box(solidHeader = TRUE, width = 6)
      )
    )
  )
))

dashboardPage(
  dashboardHeader(disable = TRUE),
  dashboardSidebar(disable = TRUE),
  body
)
When you run the app, you should see the layout with empty boxes.
That's not very useful yet, so let's add some data and visualisations.
Data processing (server.R)
The data from Kaggle need some preprocessing before being used.
- Replace the hs2, hs4, hs6, area and country codes with their names and save the data.table as an .RDS file:
# Load the raw trade data and the lookup tables
year_latest = data.table(read.csv("year_latest.csv"))
hs2_eng = data.table(read.csv("hs2_eng.csv"))
hs4_eng = data.table(read.csv("hs4_eng.csv"))
hs6_eng = data.table(read.csv("hs6_eng.csv"))
country_eng = data.table(read.csv("country_eng.csv"))

# Attach the names to each code
year_latest = merge(year_latest, hs2_eng, by = "hs2")
year_latest = merge(year_latest, hs4_eng, by = "hs4")
year_latest = merge(year_latest, hs6_eng, by = "hs6")
year_latest = merge(year_latest, country_eng, by = "Country")
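One thing to keep in mind with these merges: by default, merge() keeps only the rows whose key is found in both tables, so a code missing from a lookup table silently drops the matching trade rows. Here is a small sketch on made-up data (the category names and values are invented for the example). It uses plain data frames, and the same by and all.x arguments are accepted by merge() on a data.table.

```r
# Toy trade data: the code 99 has no entry in the lookup table
trade <- data.frame(hs2 = c(1, 2, 99), VY = c(100, 50, 7))
hs2_lookup <- data.frame(hs2 = c(1, 2), hs2_name = c("Live animals", "Meat"))

# Default merge: rows without a match are dropped
inner <- merge(trade, hs2_lookup, by = "hs2")

# all.x = TRUE keeps every trade row and fills the missing name with NA
left <- merge(trade, hs2_lookup, by = "hs2", all.x = TRUE)

print(nrow(inner))
print(left)
```

Comparing the number of rows before and after each merge is a cheap way to check that nothing was lost.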
- Put the data in the format expected by D3partitionR (data for the first tab):
# Selecting the variables we want to plot
year_latest_proc = year_latest[, .(hs2_name, hs4_name, hs6_name, Country_name, Area, Year, VY, exp_imp)]

# Summing the value of exchanges
year_latest_proc_year = year_latest[, .(Value = sum(VY)), by = c("Country_name", "Area", "Year", "hs2_name", "exp_imp")]
year_latest_proc_year[, tot_value := sum(Value), by = c("Country_name", "Area", "hs2_name", "exp_imp")]
year_latest_proc_year[, prev_value := sum(Value), by = c("Country_name", "Area", "exp_imp")]

# Grouping small exchanges under "Other" (to keep the visualisation fluid)
year_latest_proc_year[tot_value / prev_value < 0.02, hs2_name := "Other"]
year_latest_proc_year = unique(year_latest_proc_year[, .(Value = sum(Value)), by = c("Country_name", "Area", "hs2_name", "Year", "exp_imp")])

# Path construction: the path needs to be a list with the different steps
year_latest_proc_year[, path_str := paste("World", Area, Country_name, hs2_name, sep = "/")]
year_latest_proc_year[, path := strsplit(path_str, "/")]
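To make the expected format more concrete, here is a minimal sketch of the path construction on a tiny made-up table (the countries, categories and values are invented). It uses base R only and builds the same kind of list that is later passed as the data argument.

```r
# Toy data with the columns used to build the hierarchy
toy <- data.frame(
  Area         = c("Asia", "Asia", "Europe"),
  Country_name = c("China", "Korea", "France"),
  hs2_name     = c("Machinery", "Vehicles", "Beverages"),
  Value        = c(10, 5, 3)
)

# One string per row, from the root of the hierarchy down to the leaf
path_str <- paste("World", toy$Area, toy$Country_name, toy$hs2_name, sep = "/")

# One character vector of steps per row
path <- strsplit(path_str, "/", fixed = TRUE)

# A list with the paths and the value attached to each of them
chart_data <- list(path = path, value = toy$Value)
str(chart_data)
```

Note that a name containing a "/" would be split into two steps, so it is worth checking the names or picking a separator that never appears in them.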
Adding the top inputs: time range and import/export switch
We want the users of our app to be able to select a specific time range, to understand Japan's trade during that period. Our input will be a slider:
- The maximum and the minimum will be the ones from our data
- The step size will be one since we only have yearly data.
# Line to add in our options box
column(3, sliderInput("DateRange1", "Time selection:", min = 1988, max = 2015, value = c(1988, 2015), step = 1))
We also want to add a switch input to let the user choose between import and export:
# Line to add in our options box, just after the previous line
column(width = 3, radioButtons("exchangeToShow", label = "Exchanges to show", choices = c("import", "export")))
This input needs to be converted into a condition that will be used for subsetting.
# Converting the input into a "string condition"
# To be added to the server
exchangesType = reactive({
  switch(input$exchangeToShow,
    "import" = "exp_imp==2",
    "export" = "exp_imp==1"
  )
})
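The string condition is later evaluated with eval(parse(text = ...)). As an alternative, and this is a variation rather than what the app does, you can map the input to the numeric code and compare directly. The sketch below shows the idea on a plain data frame with made-up values; exchange_code is a hypothetical helper name, and the codes (2 for import, 1 for export) are the ones used in the condition strings above.

```r
# Hypothetical helper: translate the radio button value into the exp_imp code
exchange_code <- function(choice) {
  switch(choice,
    "import" = 2L,
    "export" = 1L,
    stop("Unknown exchange type: ", choice)
  )
}

# Toy data: two export rows (1) and two import rows (2)
trade <- data.frame(exp_imp = c(1L, 2L, 2L, 1L), Value = c(10, 20, 30, 40))

# Direct comparison, no string to parse
imports <- trade[trade$exp_imp == exchange_code("import"), ]
print(imports)
```

In the server, this would amount to a reactive returning exchange_code(input$exchangeToShow) and a filter of the form exp_imp == that value.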
Selecting the data according to our input
Now that the inputs are built, the data need to be subsetted accordingly.
year_latest_proc_noyear_reac = reactive({
  # Selecting data in the time range
  year_latest_proc_noyear = unique(year_latest[Year >= input$DateRange1[1] & Year <= input$DateRange1[2], .(Value = sum(VY)), by = c("Country_name", "Area", "hs2_name", "exp_imp")])

  # Summing accordingly (we need to sum to have non-timed data for the partition plot)
  year_latest_proc_noyear[, prev_value := sum(Value), by = c("Country_name", "Area", "exp_imp")]
  year_latest_proc_noyear[Value / prev_value < 0.02, hs2_name := "Other"]
  year_latest_proc_noyear = unique(year_latest_proc_noyear[, .(Value = sum(Value)), by = c("Country_name", "Area", "hs2_name", "exp_imp")])

  # Building the path for the partition chart
  year_latest_proc_noyear[, path_str := paste("World", Area, Country_name, hs2_name, sep = "/")]
  year_latest_proc_noyear[, path := strsplit(path_str, "/")]

  # Subsetting to get only the import or export
  # Keeping only the value and path columns that are needed by the partition chart
  year_latest_proc_noyear[eval(parse(text = exchangesType())), .(Value = sum(Value), path), by = c("Country_name", "Area", "hs2_name", "path_str")]
})
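The grouping of small categories under "Other" is the least obvious part of this step, so here is a minimal sketch of it on a toy data.table (all names and values are invented; the 2% threshold is the one used above).

```r
library(data.table)

toy <- data.table(
  Country_name = c("China", "China", "China", "China", "France", "France"),
  hs2_name     = c("Machinery", "Vehicles", "Toys", "Paper", "Beverages", "Cheese"),
  Value        = c(900, 95, 5, 4, 60, 40)
)

# Total per country, used as the denominator of each category's share
toy[, country_total := sum(Value), by = Country_name]

# Relabel the categories that weigh less than 2% of their country's total
toy[Value / country_total < 0.02, hs2_name := "Other"]

# Aggregate again so that all relabelled rows collapse into a single "Other" row
collapsed <- toy[, .(Value = sum(Value)), by = .(Country_name, hs2_name)]
print(collapsed)
```

The second aggregation is what actually reduces the number of rows: the relabelling alone only renames them.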
Building the partition chart:
Once the data have been pre-processed, building the partition chart is easy.
output$D3Part1 = renderD3partitionR(
  D3partitionR(
    data = list(path = year_latest_proc_noyear_reac()$path, value = year_latest_proc_noyear_reac()$Value),
    Input = list(enabled = TRUE, Id = "D3Part1", clickedStep = TRUE, currentPath = TRUE, visiblePaths = TRUE, visibleLeaf = TRUE, visibleNode = TRUE),
    width = 600, height = 600
  )
)
Some comments on the parameters:
- The data parameter takes a list with a value item and the list of paths linked to each value.
- The Input parameter allows us to use the partition chart as a Shiny input:
  - clickedStep sends the name of the clicked step
  - currentPath sends back the current position in the chart
  - visiblePaths, visibleLeaf and visibleNode send all the paths, leaves and nodes visible from the current point of view.
When running the app, you should now see the partition chart summarizing Japan's trade over your selected range.
Building the evolution over time plot:
output$Graph <- renderHighchart(
  hchart(
    unique(year_latest_proc_year[
      path_str %like% input$D3Part1$clickedStep &
        eval(parse(text = exchangesType())) &
        Year >= input$DateRange1[1] & Year <= input$DateRange1[2],
      .(Value = sum(Value)),
      by = c("Year", "hs2_name")
    ][order(Year)]),
    "line", x = Year, y = Value, group = hs2_name
  )
)
As you can see, you can directly access the input from the partition chart using input$D3Part1. The inputs are sent to Shiny as a list, so you need to select the ones you're interested in. Here, we want to see the evolution of trade between Japan and the selected area or country. Using:
path_str %like% input$D3Part1$clickedStep
we select all the paths that contain the clicked step.
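A word of caution on this filter: %like% from data.table is a thin wrapper around grepl(), so the clicked step is treated as a regular expression. A name containing characters such as parentheses or dots may then fail to match itself. The sketch below illustrates this with base R on made-up paths; matching with fixed = TRUE is a possible variation, not what the app above does.

```r
# Toy paths; the names are invented for the example
path_str <- c(
  "World/Asia/China/Machinery",
  "World/Asia/Korea (Rep.)/Vehicles",
  "World/Europe/France/Beverages"
)

# A simple name behaves the same either way
asia_paths <- path_str[grepl("Asia", path_str, fixed = TRUE)]

# With parentheses, the regular expression no longer matches the literal text
clicked <- "Korea (Rep.)"
regex_match <- grepl(clicked, path_str)
fixed_match <- grepl(clicked, path_str, fixed = TRUE)

print(asia_paths)
print(regex_match)
print(fixed_match)
```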
Important comment:
The app is now working. However, the data preprocessing can take a long time (around 1 to 2 minutes on my computer). To avoid that, you can run the preprocessing step once and save the aggregated data as an RDS file. After that, you'll just have to load this file, which is much faster!
year_latest= readRDS("data_aggr.RDS")
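One way to organise this is a small cache helper that builds the data only when the file does not exist yet. This is only a sketch: load_or_build and slow_preprocessing are hypothetical names, the stand-in function returns made-up values, and the file is written to a temporary directory so the example can be run anywhere.

```r
# Hypothetical helper: read the cached object if it exists,
# otherwise build it, save it and return it
load_or_build <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- build()
  saveRDS(result, path)
  result
}

# Stand-in for the slow preprocessing step (assumption: yours returns the aggregated table)
slow_preprocessing <- function() {
  data.frame(Year = c(1988, 1989), Value = c(10, 12))
}

cache_file <- file.path(tempdir(), "data_aggr.RDS")

first_run  <- load_or_build(cache_file, slow_preprocessing)  # builds and saves if needed
second_run <- load_or_build(cache_file, slow_preprocessing)  # reads the saved file
```

Remember to delete the file whenever the raw data or the preprocessing code changes, otherwise the app keeps loading the old result.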
Conclusion and comments:
With all this code, you should get a first working tab. The process to create the second one is pretty much the same.
You can also try to change the type of partition visualisation using type="treeMap", type="partitionChart" or type="sunburst".
Thanks for reading, and I hope you enjoyed this walk-through!
Antoine