Finding the midpoint when creating intervals
Nothing groundbreaking here. I was dividing data into deciles and then creating some plots, and I needed the midpoint of each interval. I couldn’t find a function to calculate this from the output of cut, and I use cut quite a bit. So here we are.
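midpoints <- function(x, dp=2){
  # x: factor (or character vector) of interval labels as produced by cut()
  # dp: number of decimal places to round the midpoints to
  lower <- as.numeric(gsub(",.*","",gsub("\\(|\\[|\\)|\\]","", x)))
  upper <- as.numeric(gsub(".*,","",gsub("\\(|\\[|\\)|\\]","", x)))
  return(round(lower+(upper-lower)/2, dp))
}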
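To see what the two nested gsub calls are doing, here is a small walk-through on a single label. It is only a sketch of the same steps the function applies to every element; the variable names label, stripped and mid are made up for illustration.

```r
# One interval label of the kind cut() produces
label <- "(15.4,19.2]"

# Inner gsub: drop the round and square brackets
stripped <- gsub("\\(|\\[|\\)|\\]", "", label)

# Outer gsubs: keep what is before the comma (lower) or after it (upper)
lower <- as.numeric(gsub(",.*", "", stripped))
upper <- as.numeric(gsub(".*,", "", stripped))

# Midpoint of the interval
mid <- lower + (upper - lower) / 2
print(mid)
```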
And in an example:
midpoints <- function(x, dp=2){
  lower <- as.numeric(gsub(",.*","",gsub("\\(|\\[|\\)|\\]","", x)))
  upper <- as.numeric(gsub(".*,","",gsub("\\(|\\[|\\)|\\]","", x)))
  return(round(lower+(upper-lower)/2, dp))
}
mtcars$mpg
cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T)
midpoints(cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T))
Which looks like this:
> midpoints <- function(x, dp=2){
+ lower <- as.numeric(gsub(",.*","",gsub("\\(|\\[|\\)|\\]","", x)))
+ upper <- as.numeric(gsub(".*,","",gsub("\\(|\\[|\\)|\\]","", x)))
+ return(round(lower+(upper-lower)/2, dp))
+ }
>
> mtcars$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7
[18] 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4
> cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T)
[1] (19.2,22.8] (19.2,22.8] (19.2,22.8] (19.2,22.8] (15.4,19.2] (15.4,19.2] [10.4,15.4]
[8] (22.8,33.9] (19.2,22.8] (15.4,19.2] (15.4,19.2] (15.4,19.2] (15.4,19.2] [10.4,15.4]
[15] [10.4,15.4] [10.4,15.4] [10.4,15.4] (22.8,33.9] (22.8,33.9] (22.8,33.9] (19.2,22.8]
[22] (15.4,19.2] [10.4,15.4] [10.4,15.4] (15.4,19.2] (22.8,33.9] (22.8,33.9] (22.8,33.9]
[29] (15.4,19.2] (19.2,22.8] [10.4,15.4] (19.2,22.8]
Levels: [10.4,15.4] (15.4,19.2] (19.2,22.8] (22.8,33.9]
> midpoints(cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T))
[1] 21.00 21.00 21.00 21.00 17.30 17.30 12.90 28.35 21.00 17.30 17.30 17.30 17.30 12.90
[15] 12.90 12.90 12.90 28.35 28.35 28.35 21.00 17.30 12.90 12.90 17.30 28.35 28.35 28.35
[29] 17.30 21.00 12.90 21.00
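One caveat: cut rounds the break values when it builds the labels (three significant digits by default, controlled by dig.lab), so midpoints parsed back from the labels inherit that rounding. Here the 25% quantile of mpg is 15.425 but the label shows 15.4. If the breaks are still at hand, a rough alternative is to compute the midpoints from the breaks themselves. This is only a sketch; midpoints_from_breaks is a hypothetical helper name, not part of the original function or of base R.

```r
# Hypothetical helper: midpoints taken from the breaks, not from the labels
midpoints_from_breaks <- function(x, breaks) {
  breaks <- unname(breaks)                       # drop the quantile names
  f <- cut(x, breaks, include.lowest = TRUE)     # same binning as before
  mids <- (head(breaks, -1) + tail(breaks, -1)) / 2  # one midpoint per bin
  mids[as.integer(f)]                            # look up each value's bin
}

res <- midpoints_from_breaks(mtcars$mpg, quantile(mtcars$mpg))
print(res)
```

For the first bin this gives 12.9125 rather than 12.90, because the upper break is taken as 15.425 instead of the rounded 15.4. Whether that difference matters depends on what the midpoints are used for; for axis labels on a plot the label-based version is usually fine.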