Joyplot Logo
Welcome to my data science blog datistics, where I will gradually post all the vignettes and programming POCs that I have written over the past two years. Most of them can already be found in my GitHub repository.
I am using blogdown to create this blog, working in R and RStudio. However, I have recently taken up Python programming for work again, so my first challenge will be to also add posts in the form of Jupyter notebooks.
For my first post I will share the code that I use to generate my page logo in R.
Tweedie distributions
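We often encounter distributions that are not normal. I regularly come across Poisson and gamma distributions, as well as distributions with an inflated zero value, all of which belong to the family of Tweedie distributions. The family is indexed by a power parameter \(p\): \(p = 0\) gives the Gaussian, \(p = 1\) the Poisson (for dispersion \(\phi = 1\)), and \(p = 2\) the gamma distribution, while values strictly between 1 and 2 give compound Poisson-gamma distributions with a point mass at zero. No Tweedie distribution exists for \(0 < p < 1\). By changing \(p\) we can sample the different Tweedie distributions.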
The tweedie package does not support values of p below 1, so we sample for 1 <= p <= 2.
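To make the inflated zero value concrete, here is a minimal base R sketch of the compound Poisson-gamma construction behind the Tweedie distributions with 1 < p < 2. It is an illustration under the standard parameterisation (mean mu, variance phi * mu^p), not necessarily the algorithm used by rtweedie, and the helper name r_cpg is made up for this example.

```r
# Base R sketch of a Tweedie draw for 1 < p < 2 (compound Poisson-gamma).
# r_cpg() is a hypothetical helper, not part of the tweedie package.
r_cpg <- function(n, mu, phi, power) {
  stopifnot(power > 1, power < 2)
  lambda <- mu^(2 - power) / (phi * (2 - power))  # Poisson rate (number of events)
  shape  <- (2 - power) / (power - 1)             # gamma shape of a single event
  scale  <- phi * (power - 1) * mu^(power - 1)    # gamma scale of a single event
  counts <- rpois(n, lambda)
  # A sum of k iid gamma(shape, scale) draws is gamma(k * shape, scale);
  # zero events give an exact zero.
  y <- numeric(n)
  pos <- counts > 0
  y[pos] <- rgamma(sum(pos), shape = counts[pos] * shape, scale = scale)
  y
}

set.seed(1)
y <- r_cpg(1e5, mu = 1, phi = 1, power = 1.5)
c(mean = mean(y), var = var(y), zeros = mean(y == 0), expected_zeros = exp(-2))
```

With mu = 1, phi = 1 and p = 1.5 the Poisson rate is 2, so roughly exp(-2), about 13.5%, of the draws should be exact zeros, and the sample mean and variance should both be close to 1.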
    suppressWarnings({
      suppressPackageStartupMessages({
        library(tidyverse)
        library(tweedie)
        library(ggridges)
      })
    })

    # Draw 500 samples for each power parameter p from 1 to 2
    df = tibble(p = seq(1, 2, 0.1)) %>%
      mutate(
        data = map(p, function(p) rtweedie(n = 500, mu = 1, phi = 1, power = p))
      ) %>%
      unnest(data)

    # One histogram per value of p
    df %>%
      ggplot(aes(x = data)) +
      geom_histogram(bins = 100, fill = '#77773c') +
      facet_wrap(~p, scales = 'free_y')
Joyplot
We will now transform these distributions into a joyplot in the style of the cover art of Joy Division's album Unknown Pleasures.
We will use ggridges, formerly known as ggjoy.
    joyplot = function(df) {
      # One ridge per value of p, filled by a gradient along the x-axis
      plt = df %>%
        ggplot(aes(x = data, y = as.factor(p), fill = after_stat(x))) +
        geom_density_ridges_gradient(color = 'white', size = 0.5, scale = 3) +
        theme(
          panel.background = element_rect(fill = 'white'),
          panel.grid = element_blank(),
          aspect.ratio = 1,
          axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank(),
          legend.position = 'none'
        ) +
        xlim(-1, 5) +
        scale_fill_viridis_c(option = "inferno")
      return(plt)
    }

    joyplot(df)
    ## Picking joint bandwidth of 0.24
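The message printed by joyplot() refers to a single kernel bandwidth shared by all ridges. The sketch below mimics the idea in base R: estimate one bandwidth, compute a density per group, and stack the curves with a vertical offset. Taking the mean of the per-group bw.nrd0() values is an assumption about how such a joint bandwidth can be chosen, and the three gamma samples are made-up data rather than the Tweedie draws used in this post.

```r
# Sketch: ridgeline-style stacking with one shared bandwidth, base R only.
set.seed(42)
groups <- list(
  a = rgamma(500, shape = 1),
  b = rgamma(500, shape = 2),
  c = rgamma(500, shape = 4)
)

# Assumption: joint bandwidth = mean of the per-group default bandwidths
joint_bw <- mean(vapply(groups, bw.nrd0, numeric(1)))

# Same bandwidth and same x grid for every group
dens <- lapply(groups, density, bw = joint_bw, from = -1, to = 12, n = 256)

# Offset curve i by its group index; scale controls how far ridges overlap
scale <- 3
peak  <- max(vapply(dens, function(d) max(d$y), numeric(1)))

plot.new()
plot.window(xlim = c(-1, 12), ylim = c(1, length(dens) + scale))
for (i in seq_along(dens)) {
  lines(dens[[i]]$x, i + scale * dens[[i]]$y / peak)
}
```

Sharing one bandwidth keeps the smoothness of the ridges comparable across groups, which is why a narrow and a wide distribution do not end up with visibly different amounts of wiggle.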
In order to distribute them a bit better over the x-axis, we will transform them using a sine wave pattern.
    df = tibble(
      p = seq(1, 2, 0.05),
      rwn = row_number(p)
    ) %>%
      mutate(
        data = map(p, function(p) rtweedie(500, mu = 1, phi = 1, power = p))
      ) %>%
      unnest(data) %>%
      filter(data <= 4) %>%
      # Mirror each sample and shift it by a per-row sine offset
      mutate(data = (4 * abs(sin(rwn))) - data)

    joyplot(df)
    ## Picking joint bandwidth of 0.206
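To see what the sine transformation does to each ridge, the following base R sketch tabulates the offset 4 * abs(sin(rwn)) for every value of p. It assumes that row_number(p) simply counts 1 to 21 here, which holds because p is strictly increasing; the column name shift is introduced only for this illustration.

```r
# Per-ridge offset used in the sine transformation, base R only.
p     <- seq(1, 2, 0.05)
rwn   <- seq_along(p)        # matches row_number(p) for an increasing p
shift <- 4 * abs(sin(rwn))   # always between 0 and 4

# A sample value d in [0, 4] is mapped to shift - d, so it lands
# in the interval [shift - 4, shift] and the ridge is mirrored.
offsets <- data.frame(p = p, rwn = rwn, shift = round(shift, 2))
head(offsets)
```

Since the samples are non-negative, the spike at zero of each distribution ends up at x = shift and its tail now points to the left. Values that land below -1 fall outside the xlim(-1, 5) set in joyplot() and are dropped from the plot.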