Matlab and R (getting started)
Matlab and R are two popular languages for data analysis and visualization, and they have a lot in common. Both are interpreted languages that run in a shell-like environment (while also allowing you to run scripts or functions written off-line). Both tend to be slow if your code contains many loops but are fast when running vectorized code (vectorized code means that a repeated operation is cast as an operation on matrices or tensors).
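As a small illustration of what vectorizing means, here is a sketch in R that computes the same sum of squares once with an explicit loop and once as a single operation on a vector. The variable names and the problem size are arbitrary choices made for this example.

```r
# Sum of the squares of 1..n, computed two ways.
n <- 100000
x <- as.numeric(seq_len(n))

# Loop version: one interpreted iteration per element.
loop_sum <- 0
for (i in seq_len(n)) {
  loop_sum <- loop_sum + x[i]^2
}

# Vectorized version: squaring and summing operate on the whole vector.
vec_sum <- sum(x^2)

# Both approaches should agree (up to floating-point rounding).
same <- isTRUE(all.equal(loop_sum, vec_sum))
print(same)
```

Wrapping each version in system.time() is one way to compare the running times on your own machine.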
One difference between them is that Matlab is commercial while R is open-source. Another is that Matlab is traditionally more popular in engineering and scientific computing, while R is traditionally used by statisticians. As a result, Matlab is probably more polished and can probably handle large computations faster. R, on the other hand, has a larger library of data analysis and visualization routines, often contributed by a vibrant network of users. I would choose R for most statistical data analysis and visualization but turn to Matlab for the heavy lifting (although for really heavy lifting there is no escaping C/C++).
In this note and some subsequent ones I will describe a few commands and features of the two languages. I think it is useful to present the two languages side by side, since many people know one of them. Seeing how the same task is done in both languages helps Matlab programmers learn R, and R programmers learn Matlab.
Getting Help and Reference Material
Matlab’s online getting started guide is available here. Note in particular the link to the pdf manuals. R’s manuals are available here.
In Matlab, to get help on a certain function, type
help functionName
for a plain text explanation within the Matlab window, or
doc functionName
for help in html format (sometimes including figures) viewed through a browser.
In R, to get help type
help("functionName")
The help will typically appear inside the R environment as plain text. It is possible to get html help in a browser by typing help.start() before the help command. R also has the following command, which runs the examples from a function's help page
example("functionName")
Variables
In both languages, simply type the name of the variable followed by the return key to see its value. In Matlab type
whos
to see what variables are currently defined and
clear varName
to remove a variable from the workspace. Typing
save fileName
saves the workspace variables to a file which may be loaded using the command load.
In R, type
ls()
to see a list of defined variables and
rm(x,y)
to remove some variables from memory. The workspace is saved to the file .RData when the program quits (the user is prompted), and that file is automatically loaded when a new R session starts in the same directory. It is also possible to save all variables at any point using the command
save.image(file="fileName")
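The following R sketch walks through a save, remove, and restore round trip. It is an illustration under a few assumptions: it writes to a temporary file rather than .RData, and it uses save() on two named variables (a close relative of save.image that stores only the objects you list) so that the rest of the workspace is left alone.

```r
# Define two variables and list what is currently defined.
x <- c(1, 2, 3)
y <- "hello"
print(ls())

# Save the two variables to a temporary file.
# save.image(file = ws_file) would store the whole workspace instead.
ws_file <- tempfile(fileext = ".RData")
save(x, y, file = ws_file)

# Remove them, then confirm they are gone.
rm(x, y)
gone <- !exists("x", inherits = FALSE) && !exists("y", inherits = FALSE)

# Restore them from the file.
load(ws_file)
restored <- exists("x", inherits = FALSE) && exists("y", inherits = FALSE)
print(c(gone = gone, restored = restored))
```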
Indexing
Both R and Matlab make heavy use of vectors, matrices, and tensors. To access a specific element or elements in Matlab use
Arr=zeros(10,10,10);
...
Arr(3,2,5)
Arr(3,2:4,5)
Arr(3,:,5)
to extract one element, three consecutive elements, or an entire slice of a three-dimensional array.
To accomplish the same effect in R use
Arr=array(data=0,dim=c(10,10,10))
...
Arr[3,2,5]
Arr[3,2:4,5]
Arr[3,,5]
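To make the R indexing concrete, here is a short sketch that fills the array with the numbers 1 to 1000 instead of zeros, so that each extracted value shows where it came from. The fill values and variable names are choices made for this example only.

```r
# A 10 x 10 x 10 array filled with 1..1000 in column-major order,
# so element [i, j, k] holds i + 10 * (j - 1) + 100 * (k - 1).
Arr <- array(data = seq_len(1000), dim = c(10, 10, 10))

one   <- Arr[3, 2, 5]      # a single element
three <- Arr[3, 2:4, 5]    # three consecutive elements along dimension 2
slice <- Arr[3, , 5]       # everything along dimension 2

print(one)
print(three)
print(length(slice))

# By default R drops dimensions of length one; drop = FALSE keeps them.
kept <- Arr[3, 2:4, 5, drop = FALSE]
print(dim(kept))
```

Note that an empty index in R (as in Arr[3, , 5]) plays the role of the colon in Matlab's Arr(3,:,5).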
Interfacing with the OS and other Languages
In Matlab you can issue commands to the shell by prefixing them with an exclamation mark, for example
!ls -al
The same thing is accomplished in R using
system("ls -al")
Sooner or later you will want to call C programs from Matlab or R. This can be useful since loop-heavy Matlab and R code can be very slow. Coding the computational bottleneck in C may speed things up considerably with only a modest programming effort. The simplest way to do that is to save the data to a file in Matlab/R and execute a compiled program, as shown above, which reads the data from the file and writes the result to a new file. The process is completed by reading the file containing the results back into Matlab/R.
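The file-based approach can be sketched in R as follows. This is only an illustration: it assumes a Unix-like system with the standard sort utility on the PATH, and sort stands in for your own compiled C program. The file names are temporary files created for the example.

```r
# Temporary files used to pass data to and from the external program.
input_file  <- tempfile(fileext = ".txt")
output_file <- tempfile(fileext = ".txt")

# Step 1: write the data to a file, one number per line.
x <- c(3.5, -1, 10, 0.25)
writeLines(as.character(x), input_file)

# Step 2: run the external program through the shell.
# `sort -n` is a stand-in for a compiled C program that reads the input
# file and writes its result to the output file.
cmd <- paste("LC_ALL=C sort -n", shQuote(input_file), ">", shQuote(output_file))
status <- system(cmd)

# Step 3: read the result back into R.
result <- as.numeric(readLines(output_file))
print(result)
```

The value returned by system() is the exit status of the command, which is worth checking before trusting the output file.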
A smoother solution is to write the C code in a slightly modified way that allows it to be called from Matlab/R using standard parameter passing syntax, without using files to transfer input/output data.
More info on this may be found here for Matlab and here for R. Matlab’s version is more powerful but is more tedious and less programmer-friendly.
Cell Array and Lists
Standard arrays in both languages contain multiple elements of the same atomic type (double, character, logical, etc.). In many cases it is useful to have an array of non-atoms or an array where each element has a different type. This concept is called a cell array in Matlab and a list in R.
In Matlab, cell arrays are created using the command cell, for example
A=cell(1,10);
Assigning values to the elements and accessing them is done with curly braces (using regular parentheses in the example below would instead return a 1x1 cell array containing that element)
B=A{3};
In R the same thing is accomplished by creating a list (the [[i]] notation returns the content of the i-th list element, rather than the i-th element itself, which is a list of length 1)
A=vector("list",10);
B=A[[3]];
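The difference between double and single brackets is easier to see with a list that actually holds something. The sketch below fills a short list with elements of different types; the contents are arbitrary values chosen for the example.

```r
# A list whose elements have different types.
A <- vector("list", 3)
A[[1]] <- 1:5
A[[2]] <- "text"
A[[3]] <- matrix(0, nrow = 2, ncol = 2)

content <- A[[1]]   # the stored object itself (an integer vector)
sublist <- A[1]     # a list of length 1 that wraps the same object

print(class(content))
print(class(sublist))
print(length(sublist))
```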
Redirecting Input and Output
Programming in Matlab/R by interacting with the shell environment is ultimately limiting. Redirecting the input means that you run Matlab/R commands that were written off-line and saved to a file (sometimes called a script). Redirecting the output means that the output produced by Matlab/R is saved to a separate file for off-line investigation later on.
In Matlab, redirecting input and running the commands in a script file is done by simply typing the name of the script file (without the conventional .m extension).
scriptName
To redirect output use
diary('fileName.txt')
...
diary off
In R, we have the following commands
source("scriptName.R")
sink("fileName.txt")
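Here is a small sketch of output redirection in R. It uses a temporary file for illustration and shows one detail that is easy to miss: calling sink() with no arguments ends the redirection and sends output back to the console.

```r
# Redirect console output to a temporary file.
log_file <- tempfile(fileext = ".txt")
sink(log_file)

print(summary(c(1, 2, 3, 4)))   # goes to the file, not the screen
cat("done\n")

# Calling sink() with no arguments restores output to the console.
sink()

# Read the file back to see what was captured.
captured <- readLines(log_file)
print(captured)
```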
A different kind of output redirection is for graphic figures, which are normally drawn on the screen. In Matlab, to print an existing figure to an eps file, type (replace epsc with other options to create png, jpeg, etc.)
print -depsc fileName
To create a pdf file it’s best to first create an eps and then convert it rather than to use the pdf flag in the print command (see this post).
In R, redirecting figures to be printed to a file instead of the screen is done in the following way
pdf() # print all figures to a pdf file Rplots.pdf
...
dev.off()
Note that the pdf file is not created in its final form until after the dev.off command. To redirect to a ps or eps file use the command postscript. Several devices may be open at the same time, as follows. Plotting commands are drawn on the current device, which is the most recently opened one unless another is selected with dev.set, and dev.off closes only the current device.
X11() # screen device on linux
pdf() # pdf file Rplots.pdf
jpeg() # jpeg file
...
dev.off()
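A complete, minimal version of the pdf workflow might look like the sketch below. The file name is a temporary file chosen for the example, and the plotted data are made up.

```r
# Open a pdf device that writes to a temporary file.
fig_file <- tempfile(fileext = ".pdf")
pdf(file = fig_file)

# Each high-level plotting call starts a new page in the pdf.
plot(1:10, (1:10)^2, main = "Squares")
hist(c(1, 2, 2, 3, 3, 3), main = "A small histogram")

# Close the device; only now is the file complete.
invisible(dev.off())

print(file.exists(fig_file))
```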
Installing Contributed Libraries
Both languages have a set of user-contributed libraries which are very useful in extending the functionality of the language. Generally speaking, R’s libraries are more extensive and of higher quality, probably because R is open source.
To install a user contributed library on Matlab simply download the .m files and put them in a directory (and add it to the Matlab path using addpath if needed).
In R, new libraries may be installed using
install.packages("packageName")
which automatically downloads all the necessary files and installs them in an appropriate library directory. To use the library (in other words bring it into scope) use
library("packageName")
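A common pattern is to install a package only when it is missing and then load it. The sketch below wraps this in a helper called use_package, which is a hypothetical name invented for this example and not part of R. It is demonstrated with the tools package, which ships with R, so no download takes place here.

```r
# Hypothetical helper (not part of R): install a package if it is
# missing, then attach it so its functions are in scope.
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Needs network access and a configured CRAN mirror.
    install.packages(pkg)
  }
  # character.only = TRUE lets library() take the name from a variable.
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

use_package("tools")
loaded <- "package:tools" %in% search()
print(loaded)
```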