| Feature | NumPy | R |
|---|---|---|
| contiguous (virtual) memory | ✔ | ✔ |
| ‘view’ memory model | ✔ | ✘ |
| subset-assignment | ✔ | ✔ |
| vectorized operations | ✔ | ✔ |
| memory-mapping | ✔ | ✘* |
| broadcasting rules | ✔ | ✔ |
| index arrays | ✔ | ✔ |
This comparison is current as of R 2.13.0 and NumPy 1.4.1, and draws on web resources available at the time of writing. Because this post was motivated by a recent article (cited below) promoting the NumPy array, the comparison above may seem one-sided. To be fair, I welcome corrections and additions to the feature table.
Comput. Sci. Eng. 13, 22 (2011)
http://link.aip.org/link/?CSENFA/13/22/1
contiguous (virtual) memory
Contiguous (virtual) memory means that the memory used by an array is allocated as a single block, and that the elements of the array are stored adjacently. This type of storage enables efficient operations on the array. The ‘virtual’ qualification signifies that the memory need only appear contiguous to the executing process; it may be noncontiguous in physical memory.
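As a small illustration of adjacent storage, the sketch below inspects the order in which R lays out the elements of a matrix. R stores arrays in column-major order, whereas NumPy defaults to row-major order. The variable name m is chosen here only for illustration.

```r
# A 2 x 3 matrix; R fills (and stores) it column by column.
m <- matrix(1:6, nrow = 2)

# Flattening returns the elements in storage order.
as.vector(m)      # 1 2 3 4 5 6

# Element [2, 1] is the second element in storage; [1, 2] is the third.
m[2, 1] == m[2]   # TRUE
m[1, 2] == m[3]   # TRUE
```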
‘view’ memory model
A ‘view’ memory model allows an array to be ‘viewed’ differently under certain operations (matrix transpose, many types of subsetting, reshaping) without copying the memory where the array’s data is stored. The NumPy array has a ‘view’ memory model, but the R array generally does not. However, the ‘view’ memory model may be viable for R arrays, since the memory model is mostly invisible to the user.
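The practical consequence for R can be seen in a short sketch: a subset or a transpose behaves as an independent copy, so modifying it leaves the original untouched. In NumPy, by contrast, a basic slice is a view, and the corresponding assignment would also change the original array. The names x, y, m and tm are illustrative only.

```r
x <- c(1, 2, 3, 4)
y <- x[1:2]      # behaves as an independent copy, not a view of x
y[1] <- 100
x                # 1 2 3 4 (unchanged)
y                # 100 2

m <- matrix(1:4, nrow = 2)
tm <- t(m)       # the transpose is likewise independent of m
tm[1, 1] <- 0L
m[1, 1]          # still 1
```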
subset-assignment
Subset assignment refers to assignments that modify one or more elements of an array. For example:
> x <- c(1,2,3,4)
> x[1] <- 100
> x
[1] 100   2   3   4
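Subset assignment is not limited to a single element. The following sketch (with illustrative variable names) assigns to several positions at once, to the elements selected by a logical condition, and to a whole row of a matrix.

```r
x <- c(1, 2, 3, 4)
x[c(2, 4)] <- 0      # assign to positions 2 and 4
x                    # 1 0 3 0
x[x > 0] <- -1       # logical index: every positive element
x                    # -1 0 -1 0

m <- matrix(0, nrow = 2, ncol = 2)
m[1, ] <- c(5, 6)    # replace the whole first row
m[1, ]               # 5 6
m[2, ]               # 0 0
```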
vectorized operations
Vectorized operations refer to expressions where an element-wise operation is implicit. Consider this R code:
> x <- c(1,2,3,4)
> x * 3
[1]  3  6  9 12
where x * 3 implicitly specifies that each element of x should be multiplied by 3. Vectorized operations avoid the need for looping in many cases.
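To make the comparison with looping concrete, here is a minimal sketch that computes the same result both ways. The names looped and vectorized are introduced only for this example.

```r
x <- c(1, 2, 3, 4)

# Explicit loop: multiply each element in turn.
looped <- numeric(length(x))
for (i in seq_along(x)) {
  looped[i] <- x[i] * 3
}

# Vectorized form: the element-wise operation is implicit.
vectorized <- x * 3

identical(looped, vectorized)   # TRUE
```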
memory-mapping
Memory mapping refers to the ability to map a file into a program’s address space, so that a large array stored on disk may be manipulated without loading the entire array into memory. *R itself doesn’t offer a memory-mapping facility for arrays. However, some memory-mapping functionality is provided by the bigmemory and mmap extension packages. R also provides a well-developed interface to DBMSs (see the R Data Import/Export manual), enabling random access to data stored on disk.
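The sketch below is not memory mapping. It only approximates one of its benefits, reading a small slice of an on-disk array without loading the rest, using the base R functions writeBin, seek and readBin. The temporary file, the array contents and the 8-byte element size (one double) are assumptions made for the example; the packages named above are the place to look for real memory-mapped arrays.

```r
# Write 1000 doubles (8 bytes each) to a temporary binary file.
path <- tempfile(fileext = ".bin")
writeBin(as.double(1:1000), path)

# Open the file and jump straight to element 500 (skip 499 elements).
con <- file(path, "rb")
invisible(seek(con, where = 8 * 499))

# Read only three elements; the rest of the file is never loaded.
chunk <- readBin(con, what = "double", n = 3)
close(con)
unlink(path)

chunk   # 500 501 502
```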
broadcasting rules
Broadcasting rules affect the behavior of binary operations (‘+’, ‘*’, etc.) on arrays of different dimensions. Without broadcasting rules, the behavior of such operations may not be defined. Both R and NumPy arrays have broadcasting rules, but they are not the same rules.
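A brief sketch of the R side of this difference: R recycles the shorter operand in storage (column-major) order. NumPy instead aligns trailing dimensions, so for a 2 x 3 array it would add a length-3 vector to each row and reject a length-2 vector. The variable names are illustrative.

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3, stored column by column

# A length-2 vector is recycled down the columns, so element i is added
# to every entry of row i.
a <- m + c(10, 20)
a[1, ]   # 11 13 15
a[2, ]   # 22 24 26

# A length-3 vector is recycled in the same storage order, so it is NOT
# added row-wise as it would be under NumPy's rules.
b <- m + c(100, 200, 300)
b[1, ]   # 101 303 205
b[2, ]   # 202 104 306
```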
index arrays
Index arrays may be used to index another array. For example:
> x <- array(rnorm(9), c(3, 3))
> y <- array(c(1, 1, 1, 2), c(2, 2))
> x[y]
[1] -0.9345381  0.5509239
However, the rules for index arrays are different for R and NumPy arrays.
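A deterministic variant of the example may help show the R rule at work: when the index is a numeric matrix with one column per dimension, each row is read as the coordinates of a single element. In NumPy, a two-dimensional integer index applied in the same way would instead select whole rows along the first axis. The names x and idx are illustrative.

```r
x <- matrix(1:9, nrow = 3)   # x[i, j] equals i + 3 * (j - 1)

# Each row of idx is one (row, column) pair; indices are 1-based.
idx <- rbind(c(1, 1), c(1, 2), c(3, 3))
x[idx]           # 1 4 9

# A plain vector index is treated as positions in column-major storage.
x[c(1, 4, 9)]    # 1 4 9
```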