Counting numbers in an interval

[This article was first published on One Tip Per Day, and kindly contributed to R-bloggers].
Say I have a list of values and I cut them at some break points. How do I find the number of values in each interval?

The cut() function in R serves this purpose. For example:


> tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
> x <- rep(0:8, tx0)
> x
 [1] 0 0 0 0 0 0 0 0 0 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 4 4 4 5 5 5 5 5 5 5 5 5 5 6
[39] 6 6 6 6 7 7 7 8 8 8 8 8
> table(cut(x, breaks = 8))

(-0.008,0.994]      (0.994,2]          (2,3]          (3,4]          (4,5] 
             9              4              6              5             13 
         (5,6]       (6,7.01]    (7.01,8.01] 
             5              3              5 

The cut() documentation includes a note that says:

Instead of table(cut(x, br)), hist(x, br, plot = FALSE) is more efficient and less memory hungry. Instead of cut(*, labels = FALSE), findInterval() is more efficient.
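
As a rough illustration of the second half of that note, the sketch below compares the integer codes from cut(*, labels = FALSE) with those from findInterval(). The break points in br are made up for this example, and left.open = TRUE assumes a version of R whose findInterval() supports that argument. It is needed because findInterval() uses [a, b) intervals by default, while cut() uses (a, b].

```r
x  <- rep(0:8, c(9, 4, 6, 5, 3, 10, 5, 3, 5))
br <- c(-1, 2, 5, 8)   # hypothetical break points that cover all of x

# Integer interval codes from cut(); intervals are (a, b] by default
codes_cut <- cut(x, breaks = br, labels = FALSE)

# findInterval() is left-closed by default; left.open = TRUE gives (a, b]
codes_fi <- findInterval(x, br, left.open = TRUE)

# Both approaches should assign every value to the same interval
print(all(codes_cut == codes_fi))

# Counts per interval from the integer codes
print(tabulate(codes_fi, nbins = length(br) - 1))
```

With these break points the values 0 to 2, 3 to 5 and 6 to 8 land in the first, second and third interval respectively.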

But if you follow that advice, you will see that the counts returned look different:


> hist(x, breaks = 8, plot = FALSE)
$breaks
[1] 0 1 2 3 4 5 6 7 8
$counts
[1] 13  6  5  3 10  5  3  5

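What's wrong?

Nothing is wrong. The two functions simply treat a single-number breaks argument differently. For cut(), the documentation says: "When breaks is specified as a single number, the range of the data is divided into breaks pieces of equal length, and then the outer limits are moved away by 0.1% of the range to ensure that the extreme values both fall within the break intervals. (If x is a constant vector, equal-length intervals are created, one of which includes the single value.)" hist(), by contrast, treats a single number only as a suggested number of cells and chooses "pretty" break points, which here are 0 to 8.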
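
To see which break points each function actually derives from breaks = 8, a small sketch like the following may help. It reads the break points back from the hist() result and from the factor labels produced by cut(). The variable names are only for illustration, and the exact cut() labels can depend on the R version, so none are shown here.

```r
x <- rep(0:8, c(9, 4, 6, 5, 3, 10, 5, 3, 5))

# hist(): a single number is only a suggestion; the break points are "pretty" values
hist_breaks <- hist(x, breaks = 8, plot = FALSE)$breaks
print(hist_breaks)

# pretty() on the data range, for comparison with the hist() break points
print(pretty(range(x), n = 8))

# cut(): the break points it derived can be read back from the factor labels
cut_labels <- levels(cut(x, breaks = 8))
print(cut_labels)
```

Comparing the printed break points side by side shows whether the two functions are binning on the same intervals.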

The conclusion is:
when breaks is a vector, table(cut(x, breaks = 0:8, include.lowest = TRUE)) gives the same counts as hist(x, breaks = 0:8, plot = FALSE)$counts; when breaks is a single number, the two may differ, because each function derives its own break points.
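
The vector case can be checked with a short sketch along these lines, using the same x as above. It also shows why include.lowest = TRUE matters: without it, the values equal to the lowest break point fall outside the first interval (0,1] and become NA.

```r
x  <- rep(0:8, c(9, 4, 6, 5, 3, 10, 5, 3, 5))
br <- 0:8

# Counts per interval via cut() + table(), closing the first interval on the left
counts_cut <- as.vector(table(cut(x, breaks = br, include.lowest = TRUE)))

# Counts per interval via hist(), which includes the lowest value by default
counts_hist <- hist(x, breaks = br, plot = FALSE)$counts

print(counts_cut)
print(counts_hist)
print(all(counts_cut == counts_hist))

# Without include.lowest, the zeros are not assigned to any interval
n_dropped <- sum(is.na(cut(x, breaks = br)))
print(n_dropped)
```

Here n_dropped equals the number of zeros in x, which is exactly what the first bin would otherwise be missing.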

