In David Smith’s latest blog post (which, in a sense, is a continued response to the latest public attack on R), there was a comment by Barry that caught my eye. Barry wrote:
Even I get caught out on R quirks after 20 years of using it. Compare letters[c(12,NA)] and letters[c(NA,NA)] for the most recent thing that made me bang my head against the wall.
So I did, and here’s the output:
> letters[c(12,NA)]
[1] "l" NA
> letters[c(NA,NA)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Interesting, isn’t it?
I had no clue why this happened, but luckily Barry followed up with an explanation. Here is what he wrote:
My example with ‘letters’ comes from a collision of three features:
- recycling of short subscripts
- silent coercion of types (boolean NA to numeric NA)
- and the existence of five different NA values that all print the same.
[…] to really understand that letters[c(1,NA)] is different from letters[c(NA,NA)] you have to see that:
- in the first case, the NA is coerced to a numeric NA because it’s in a vector with a numeric ‘1’.
- in the first case, you are selecting elements by supplying a vector of indexes
- in the second case, your NAs are boolean (logical) NA values
- hence your subscript is a logical vector
- logical vectors are recycled
- now your subscript is a vector of TRUE/FALSE values (which are all NA) of the same length as ‘letters’.
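Barry's points can be checked directly at the console. The following is a minimal sketch and not part of his comment; the variable names `num_idx` and `lgl_idx` are my own, and the last line shows one possible way to avoid the surprise by using a typed NA.

```r
# Subscript from the first case: NA is coerced to numeric alongside 12
num_idx <- c(12, NA)
typeof(num_idx)            # "double"

# Subscript from the second case: nothing forces a coercion, so it stays logical
lgl_idx <- c(NA, NA)
typeof(lgl_idx)            # "logical"

# A numeric subscript selects by position: one result per index
length(letters[num_idx])   # 2

# A logical subscript is recycled to the length of `letters`
length(letters[lgl_idx])   # 26

# Spelling out an integer NA makes the subscript positional again
typed <- letters[c(NA_integer_, NA_integer_)]
length(typed)              # 2
```

All of these NA values print as plain `NA`, which is why the difference is invisible until you ask for `typeof()`.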
To make sure I understood Barry correctly, I tried the following code:
> letters[c(T,NA)]
[1] "a" NA "c" NA "e" NA "g" NA "i" NA "k" NA "m" NA "o" NA "q" NA "s" NA "u" NA "w" NA "y" NA
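The alternating pattern is what recycling predicts: the two-element subscript is repeated until it is as long as `letters`, each TRUE keeps its element, and each NA produces an NA. As a rough check, the sketch below does the recycling by hand with `rep()`; the names `recycled` and `kept` are only illustrative.

```r
# Recycle the two-element logical subscript to the length of `letters`
recycled <- rep(c(TRUE, NA), length.out = length(letters))

# Indexing with the short subscript matches indexing with the recycled one
same <- identical(letters[c(TRUE, NA)], letters[recycled])
same        # TRUE

# TRUE keeps the element; NA yields NA in that position
kept <- letters[c(TRUE, NA)]
kept[1:4]   # "a" NA "c" NA
```

I wrote `TRUE` rather than `T` here because `T` is an ordinary variable that can be reassigned, while `TRUE` is a reserved word.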