Introduction
Rstudio Server Open Source Edition (OSE) is offered with some key limitations compared to the Pro Edition. A few of these limitations are easy to circumvent using basic Linux sysadmin skills (such as encrypting traffic by using a reverse proxy), but most of RStudio Server OSE’s limitations are not so easy to work with, in my opinion.
The ability to use Rstudio server in the browser has been great and a major time-saver, but the one thing that caused me the most grief was the lack of support for multiple concurrent sessions (this is acutely felt when working within Rstudio projects). Now, Rstudio server Pro has this functionality and I’d be happy to support the Rstudio team, but the Pro license is firmly out of reach for a single user like myself (at 10000 USD/yr it’s clearly meant for institutions or large teams only).
Concurrent multiple sessions outside Pro is a feature that many others have asked for or tried to implement, e.g., see “Run multiple instances of RStudio in a web browser”, StackOverflow, “Unable to run multiple RStudio Server sessions at the same time”, Batch Connect – OSC RStudio Server, “Parallel processes for Rstudio-server, much like Jupyter Notebooks”, RStudio Community, or “Multiple Rstudio server sessions from one user account”, RStudio Support.
Several work-arounds or solutions have been offered by the community, for example by running two Rstudio server processes on different IP addresses, or by assigning different Linux user/group ownerships to the Rstudio server processes, but unfortunately Rstudio has also been locking down the free Rstudio server in what can only be interpreted as an attempt to negate these efforts, such as by hard-coding the path to the .rstudio/
directory, or by making Rstudio project folders read-only for all users except the owner (and overriding the machine’s system-wide permissions settings in the process), or by making the browser-side of the session logout even across incognito mode.
In short, none of the work-arounds actually achieve multiple sessions in the same browser.
In this post I will demonstrate how to setup multiple Rstudio Server OSE instances using a single, shared R and LaTeX installation on an Ubuntu host with KVM virtualisation. We will go through the setup of the host and the first virtual machine (VM) in some detail, and then we will see how more Rstudio Server instances can be (manually) created by cloning the existing VM.
The issue at hand
I wanted to have the ability to run multiple concurrent browser sessions of RStudio Server. In particular, my workflow required that I have different Rstudio projects, or even the same project, open in different browser windows/tabs at the same time. I also like to keep a particular RStudio project open, while working on other Rstudio projects in other tabs.
So, in short, we are trying to circumvent (without breaking any rules) the “multiple sessions” limitation that RStudio has put on its free-tier server offering. If you run an Rstudio server with a single user account, have some CPU/RAM to spare and want to run more than one concurrent RStudio Server session and don’t mind using virtual machines, read on.
The main advantage of this approach over simply provisioning an arbitrary number of VMs in the cloud (a perfectly valid solution to achieve “multiple sessions”) is that by keeping the VMs on a single host we can centralise the maintenance of the R and LaTeX installations. A disadvantage is that we have to compile R from source (but I will walk you through that, it’s not at all complicated).
The benefit of using full virtualisation is that as far as RStudio Server is concerned, we are running only one instance of it. Also, increasing the number of RStudio Server instances is trivial once the first VM is configured.
Requirements
You will need a server with sufficient CPU and RAM to run a hypervisor and the number of virtual machines of your choice. Still, nothing too fancy is required. For example, my own server runs Ubuntu 16.04 on an Intel Xeon E3-1200 series processor with 32 GB of RAM, and that’s been more than plenty to run 3 Rstudio Server VMs along with a multitude of other services.
If you want to use your Rstudio Server from outside your LAN (without having to fuss with virtual private networks) you will need to assign a (sub)domain to each Rstudio Server instance. If your router does not get a static IP address from your ISP, you will additionally have to setup dynamic DNS (DDNS). I will link to some guides on how to do that below.
Of course, you will need superuser permission on the server in question (it’s your server, right?). Also, I’m assuming your server already runs RStudio Server, so by following this project you will gain at least one more instance (if you setup just one virtual machine).
Outline of setup
For the purposes of this project, we will assume that your server sits behind your router. The server is reachable externally at yourhost.yourdomain.se
, and the router uses DDNS to keep that domain pointed at it. The server runs Ubuntu Server.
We will install a virtual machine hypervisor (KVM) and a single virtual machine which we will then install RStudio Server Open Source Edition on, and we will demonstrate how you can clone this VM to setup as many RStudio Server instances as you like (and the server has resources for). Using full virtualisation will of course incur a slight performance penalty in terms of CPU and RAM overhead. If resources are scarce, I guess using Docker or LXD may offer another way to achieve a similar result.
In KVM terminology, the host is the system that runs the VM hypervisor, and the guest is the virtual machine (VM). One host, multiple guests. The guests are controlled (started, stopped, cloned, destroyed) via the VM hypervisor on the host.
The guest (the virtual machine itself) will also run headless Ubuntu server 16.04 with RStudio Server.
NOTE: it’s actually very important that both the host and the guests run the same version of the same OS, since R will be compiled on the host and absolutely needs the same environment on the guest. Even pairing an Ubuntu 16.04 host with an Ubuntu 18.04 guest fails to run R properly on the guest (I’ve tried).
In terms of security, the setup described here assumes that 1) the server sits behind a router with a firewall, and 2) that the local network can be considered safe (the RStudio Server instances on the guests send their traffic to/from the host over unencrypted connections). But this is not so bad, since any setup that uses a proxy that terminates HTTPS connections will need to pass on the traffic unencrypted over part of the network. I guess if you distrust the LAN you could configure a separate VLAN for this traffic, or otherwise encrypt it somehow.
The central idea of this project is that we will install and maintain the R and LaTeX installations only on the host. We can expose the R and LaTeX trees to the guest by using KVM filesystem passthrough. For this to work, it’s critical that we use the same OS with the same version on host and guest.
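Since the whole approach hinges on host and guest running the same release, it can be worth checking this before going any further. The following is only a minimal sketch: `os_release_id` is a made-up helper name, and it assumes the usual `ID` and `VERSION_ID` fields of `/etc/os-release`.

```shell
# Print "<ID> <VERSION_ID>" (e.g. "ubuntu 16.04") from an os-release style file.
# os_release_id is a hypothetical helper name, not part of any package.
os_release_id() {
    file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into ours.
    ( . "$file" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Demo on a sample file, so the snippet runs on any machine.
sample=$(mktemp)
printf 'NAME="Ubuntu"\nVERSION_ID="16.04"\nID=ubuntu\n' > "$sample"
os_release_id "$sample"    # prints: ubuntu 16.04
rm -f "$sample"
```

Calling `os_release_id` without arguments on the host and on the guest should print the same line on both.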
The table below outlines the software components needed to complete our setup.
Installed on host | Installed on VM | Configured on VM |
---|---|---|
Apache web server | No | No |
Certbot/Let’s Encrypt | No | No |
LaTeX/TeXLive | No | Yes (added to $PATH ) |
R (compiled from source) | No | Yes (symlinks) |
KVM | No | No |
Fonts | No | No |
Rstudio server | YES | Yes |
As you can see, only RStudio Server has to be installed on both the host and the guest. The rest of the software stack can be almost exclusively managed on just the host. This is what makes this endeavour workable, in my opinion.
Apache is used as a reverse proxy to encrypt all traffic to/from each RStudio Server. Certbot is used to download the Let’s Encrypt certificates. Additionally, we will setup a CRON job or a Systemd timer to automatically renew those certificates.
We will use TeXLive and install a default LaTeX installation. LaTeX, as installed by TeXLive, lives entirely inside a single directory (except for a few files) and so is easy to passthrough to the virtual machine.
R, as installed from the Debian repos, is less well-behaved in that regard. Actually, it’s TeXLive that’s unusual and not very Debian-like in this regard, but since we are looking to passthrough the entire installation so that we can “trick” the guest into thinking the installation is local, having it all inside a single directory tree is ideal. And I’ve found no way to passthrough R, as installed from the repos, to a VM. My solution is to instead compile R from source. This allows us to fit all of it inside the /opt/R/
tree, and allows us to passthrough just that directory to the guest.
And we will of course need to install the VM hypervisor, KVM in this case, as well as libvirt, which provides the command-line tools (such as virsh) to manage KVM and its virtual machines. With all that in place, we will install RStudio Server on the guest (this will be instance no. 2, assuming you already have it installed on the host itself). Once the guest setup is complete, we can clone it to create additional RStudio Server instances.
For all of this to work, we will also need to learn a little about network interfaces and bridges so that we can provision each VM with its own IP address on our LAN. Each instance will get its own Apache virtualhost file which will simply setup a proxy pointing to its IP address on the LAN, redirecting all traffic from the subdomain we setup to it.
A note on naming. I will use the placeholders <yourhost> and <guestname> in this text. You should of course replace those with the hostname of your host machine and virtual machine, respectively. Also, the sooner you decide on names for your virtual machines, the better. My recommendation is that you use the same hostname in the subdomains you setup to point at each VM, but that’s optional. I also recommend that you setup the DNS records for the virtual machines early in the process, since DNS propagation may take some time, and you will need those records in place to install the Let’s Encrypt certificates.
And that’s about the gist of it. Still interested? Let’s get into the nitty-gritty!
Setup the host
The server used as an example in this post runs Ubuntu 16.04. The instructions here should be easily adapted to work for any Debian-based Linux distribution.
Install Apache, if you haven’t already. We will not go through that here. Plenty of guides describe the process. I should mention that my server runs Apache v2.4.18 (the syntax used in the virtualhost config files changed slightly between v2.2 and v2.4, so if your server happens to use an older Apache version, beware). We will get back to the web server configuration once the VM is up and running with its own IP address. Now, let’s move on to installing LaTeX and compiling R.
Install LaTeX (TeXLive)
If you plan to install LaTeX, you should do so before compiling R since the latter will look for it during compilation.
    $ wget http://mirror.ctan.org/systems/texlive/tlnet/install-tl-unx.tar.gz
    $ tar -xvf install-tl-unx.tar.gz
    $ cd install-tl-20190119/
    $ ./install-tl
Depending on your preferences, you might want to make the directory /usr/local/texlive
owned by your user account instead of root to simplify the maintenance of TeXLive. Note that downloading a full TeXLive installation can take quite a while.
To add LaTeX to the system-wide PATH
, I added the following line to /etc/profile
:
PATH="/usr/local/texlive/2018/bin/x86_64-linux:$PATH"
This will make the LaTeX executables findable by R and other programs on the host (you might have to logout and log back in for the change to take effect).
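After logging back in, a quick way to confirm that the change took effect is to test whether the directory is on `PATH` and whether the shell can find a TeX executable. This is just a sketch; `path_contains` is a helper name I made up for the example.

```shell
# Succeed if directory $1 is one of the entries in $PATH.
# path_contains is a hypothetical helper name used only for this sketch.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

texbin="/usr/local/texlive/2018/bin/x86_64-linux"
if path_contains "$texbin"; then
    echo "TeXLive bin directory is on PATH"
else
    echo "TeXLive bin directory is NOT on PATH"
fi

# command -v prints the full path of pdflatex if the shell can find it.
command -v pdflatex || echo "pdflatex not found"
```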
Compile R from source
I have found that compiling R from source (on the host) aids considerably in achieving a working R session on each guest, because whereas compiled R lives entirely inside the /opt/R/ directory, the R installed from the Debian repos gets parts of it installed all over the filesystem (/usr/lib/R/, /usr/local/lib/R/, among other places), which makes it unworkable to setup KVM filesystem passthrough.
For removing an existing R installed by the apt package manager, see the appendix. But please don’t remove your existing R installation before reading through the installation instructions below carefully.
Download and extract the R source tarball (at the time I compiled R, version 3.5.1 was available).
    chepec@yourhost:~/Downloads
    $ wget https://cran.r-project.org/src/base/R-3/R-3.5.1.tar.gz
    $ tar -xzvf R-3.5.1.tar.gz
    $ cd R-3.5.1
Depending on which R packages you plan to install, you might need to install certain dependencies before compiling R. How to figure out which dependencies you need may not be entirely straight-forward. You should at least have a look at the capabilities()
of your current R installation (before removing it):
    > capabilities()
           jpeg         png        tiff       tcltk         X11        aqua
           TRUE        TRUE        TRUE        TRUE       FALSE       FALSE
       http/ftp     sockets      libxml        fifo      cledit       iconv
           TRUE        TRUE        TRUE        TRUE        TRUE        TRUE
            NLS     profmem       cairo         ICU long.double     libcurl
           TRUE        TRUE        TRUE       FALSE        TRUE        TRUE
Some of these capabilities/dependencies appear to be packaged together with apt’s R installation, so when we remove it, we also uninstall the dependencies. Anyway, I found that, on my particular system, tcltk
was missing and had to be installed prior to compiling R for a whole bunch of R packages to work later. So we need to install the tcl
and tk
development packages:
    chepec@yourhost:~
    $ sudo apt install tcl8.6-dev tk8.6-dev
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following additional packages will be installed:
      libxft-dev libxss-dev x11proto-scrnsaver-dev
    Suggested packages:
      tcl8.6-doc tk8.6-doc
    The following NEW packages will be installed:
      libxft-dev libxss-dev tcl8.6-dev tk8.6-dev x11proto-scrnsaver-dev
    0 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Now let’s build R from the source tarball we downloaded (note the build flags, and adapt to your needs).
    chepec@yourhost:~/Downloads/R-3.5.1
    $ ./configure --prefix=/opt/R/$(cat VERSION) --enable-R-shlib --with-blas --with-lapack --with-tcltk
This should produce lots of output, ending with a block looking like this:
    R is now configured for x86_64-pc-linux-gnu

      Source directory:          .
      Installation directory:    /opt/R/3.5.1

      C compiler:                gcc -g -O2
      Fortran 77 compiler:       f95 -g -O2

      Default C++ compiler:      g++ -g -O2
      C++98 compiler:            g++ -std=gnu++98 -g -O2
      C++11 compiler:            g++ -std=gnu++11 -g -O2
      C++14 compiler:            g++ -std=gnu++14 -g -O2
      C++17 compiler:
      Fortran 90/95 compiler:    gfortran -g -O2
      Obj-C compiler:

      Interfaces supported:      X11, tcltk
      External libraries:        readline, BLAS(generic), LAPACK(generic), curl
      Additional capabilities:   PNG, JPEG, TIFF, NLS, cairo, ICU
      Options enabled:           shared R library, R profiling

      Capabilities skipped:
      Options not enabled:       shared BLAS, memory profiling

      Recommended packages:      yes
To figure out if anything was amiss during the compilation, you really should spend some time scrolling through config.log
in the compilation directory. It might save you a bunch of time later.
Now if you’re happy with the build, let’s make and install.
    $ make
    $ sudo make install
That’s it. Now either put the /opt/R/<version-number>/bin/ directory into the system PATH, or symlink the two R executables into somewhere already on the system PATH. I prefer the latter approach, so:
    $ sudo ln -s /opt/R/3.5.1/bin/R /usr/bin/R
    $ sudo ln -s /opt/R/3.5.1/bin/Rscript /usr/bin/Rscript
When we create our first VM we will create the same symlinks on there. And don’t forget that these symlinks should be changed once you have upgraded to a newer R version.
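Re-pointing the symlinks after an upgrade can be wrapped in a small function. This is a sketch under my own assumptions: `link_r_version` is a hypothetical helper name, and it expects each R version to be installed under `<prefix>/<version>/`, as in the build above.

```shell
# Point <bindir>/R and <bindir>/Rscript at the R build under <prefix>/<version>.
# link_r_version is a hypothetical helper; on a real system it needs root
# rights, with prefix=/opt/R and bindir=/usr/bin.
link_r_version() {
    prefix="$1"; version="$2"; bindir="$3"
    for exe in R Rscript; do
        target="$prefix/$version/bin/$exe"
        if [ ! -x "$target" ]; then
            echo "missing executable: $target" >&2
            return 1
        fi
        # -s symbolic link, -f replace an existing link,
        # -n do not follow an existing link that points to a directory
        ln -sfn "$target" "$bindir/$exe"
    done
}
```

From a root shell, `link_r_version /opt/R <new-version> /usr/bin` would then switch both links in one go, and it refuses to touch anything if the new build is not in place.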
NOTE: at this point, it’s perhaps worth underscoring that going forward, everything R-related should be kept inside this /opt/R/ tree. That includes config files (by default placed in /opt/R/3.5.1/lib/R/etc/) as well as installed system and user packages. Anything you place outside this tree will not be visible on the VM, which would defeat the purpose of this whole project.
Install KVM and the virtual machine OS
Install the VM hypervisor (KVM/libvirt):
$ sudo apt install qemu-kvm libvirt-bin virtinst bridge-utils cpu-checker
Verify the KVM installation by running kvm-ok
.
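If `kvm-ok` complains, it can help to look at the underlying facts yourself: whether the CPU advertises hardware virtualisation and whether `/dev/kvm` exists. The snippet below is a rough sketch of such a check, not a replacement for `kvm-ok`; `count_virt_cpus` is a made-up helper name.

```shell
# Count logical CPUs whose flags include hardware virtualisation support
# (vmx = Intel VT-x, svm = AMD-V). Reads /proc/cpuinfo unless another
# cpuinfo-style file is given. count_virt_cpus is a hypothetical helper name.
count_virt_cpus() {
    grep -E -c '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# grep -c exits non-zero when the count is 0, hence the "|| true".
n=$(count_virt_cpus 2>/dev/null || true)
echo "logical CPUs with vmx/svm: ${n:-unknown}"

if [ -e /dev/kvm ]; then
    echo "/dev/kvm is present"
else
    echo "/dev/kvm is missing"
fi
```

A count of zero usually means that virtualisation support is switched off in the BIOS/UEFI settings, or that the CPU lacks it.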
Also check that the libvirtd
service is running: systemctl status libvirtd.service
. If it’s not, start it: sudo systemctl start libvirtd.service
.
For networking, I decided to setup the server’s physical network interface card (NIC) as a bridge adapter. This way, each VM will act as its own NIC vis-à-vis the router, and I can simply let the router assign IP addresses as usual (and it allows me to use my router’s DHCP server to set a static IP address for each VM). This was simpler to setup than iptables redirects on the host or other alternatives (there are many ways to setup networking for VMs). And if you want SSH access to the VM, just setup port forwarding of the SSH port on the router, as per usual.
To set up your NIC as a bridge adapter, first make sure the bridge-utils
package is installed on the host. We will edit /etc/network/interfaces
, so make sure to take a backup copy of it first: sudo cp /etc/network/interfaces /etc/network/interfaces.backup
. For good measure, you might want to backup /etc/resolv.conf
as well.
This is how my /etc/network/interfaces
looked before any changes:
    # The loopback network interface
    auto lo
    iface lo inet loopback

    # The primary network interface
    auto eth1
    iface eth1 inet dhcp
and this is the same file now configured for bridging:
    # The loopback network interface
    auto lo
    iface lo inet loopback

    # The physical network interface
    auto eth1
    iface eth1 inet manual

    auto br0
    iface br0 inet dhcp
        bridge_ports eth1
        bridge_stp off
        bridge_fd 0
        bridge_maxwait 0
This sets up the new br0
interface, and effectively makes eth1
just a physical transport layer to it.
Bridging means the VM will be exposed to whatever the bridged NIC is connected to, in our case to our LAN and the router. As such it will get an IP address from the router’s DHCP server just like any other networked device on the LAN.
Note that by default, KVM creates its own bridge at 192.168.122.1
. This device acts as a virtual router for your VMs. (We will not use this virtual router for anything in this project).
With the edits in place, restart the networking service (but be warned: if your connection to the host is over SSH, at this point you will lose the connection! I found it useful to apply these networking changes while physically connected to the host).
$ sudo systemctl restart networking.service
With KVM installed and with the host’s network configured, we are now ready to install our first virtual machine. The installation is done over VNC.
All virtual machine files and other related files will be stored under /var/lib/libvirt/
.
The default path of saved ISO images is /var/lib/libvirt/boot/
.
We will go ahead and download the latest netinst Ubuntu Server 16.04 64-bit image, and save it in /var/lib/libvirt/boot/
. Installing is fairly straight-forward once you figure out the virt-install
command parameters (lots of guides online, see the Bibliography). I used the following command with success:
    $ sudo virt-install --virt-type=kvm --name <guestname> --ram 4096 --vcpus=2 \
        --os-variant=ubuntu16.04 --hvm \
        --cdrom=/var/lib/libvirt/boot/ubuntu-16.04.3-amd64-netinst.iso \
        --network=bridge=br0,model=virtio --graphics vnc \
        --disk path=/var/lib/libvirt/images/<guestname>.qcow2,size=20,bus=virtio,format=qcow2
Here we allocate a maximum of 4 GB RAM, 2 CPU cores and 20 GB disk space to the new virtual machine. KVM is pretty smart and won’t use up those resources until usage demands it, so most of the time the load on the host from this VM will be much less than this. As for the disk space, 20 GB should be plenty. In fact, if you don’t plan to keep your work files on the VM itself, even half that would be plenty. Anyway, it’s fairly easy to expand a qcow2 disk at a later date, but slightly complicated to shrink one.
The above command will simply tell you that installation has started – you will need to connect to the OS installation using VNC to actually proceed with the installation.
KVM gives every guest it launches a new VNC instance with its own port (starting with port 5900
). So we’ll check the current guest’s VNC port number, and then tunnel in via ssh to proceed with the OS installation:
    # find the VNC port number of the VM
    $ sudo virsh dumpxml <guestname> | grep vnc
    # sample output:
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
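If you would rather get the bare port number (say, to use it in a script), the same XML can be filtered with `sed`. This is a small sketch that assumes the `<graphics>` element looks like the sample output above; `vnc_port_from_xml` is a helper name I made up.

```shell
# Extract the VNC port from libvirt domain XML read on stdin.
# vnc_port_from_xml is a hypothetical helper name. The pattern also accepts
# a minus sign, since libvirt may report port='-1' for a guest that is not
# running.
vnc_port_from_xml() {
    sed -n "s/.*<graphics type='vnc' port='\([0-9-]*\)'.*/\1/p"
}

# Demo on the sample line; on the host you would instead pipe in the
# output of: sudo virsh dumpxml <guestname>
echo "<graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>" | vnc_port_from_xml
# prints: 5900
```

virsh also has a `vncdisplay` subcommand that reports the VNC display rather than the port (display 0 corresponds to port 5900).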
Use a VNC client (such as xtightvncviewer
) to connect to the port used by the libvirt installer on the server. For security, we will route the connection through an SSH tunnel:
    # Create an ssh tunnel from your workstation to the host
    # (-L takes <local port>:<destination host>:<destination port>)
    $ ssh <username>@yourhost.yourdomain.se -L <localport>:localhost:5900
    # on your workstation, use a local VNC client to connect to the tunnel
    $ xtightvncviewer localhost:<localport>
Et voilà! Now proceed with the OS installation as usual. If the VNC connection is dropped during the installation (happens sometimes), reconnect by repeating the last chunk and continue from where you were dropped off.
When the OS installation is done, it’s a good time to setup static DHCP for this VM and port forwarding for the SSH port. Both of these settings are done on your router. (There are some links in the Bibliography to help get you started). You probably also want to setup passwordless SSH access to the VM (since you already have a server, I’m going to assume you know how to do that). Make sure to set the VM’s hostname in both /etc/hostname and /etc/hosts.
With those things in place, we will also setup a new subdomain for accessing this Rstudio server instance from anywhere. If you configured SSH port forwarding earlier, you will also be able to SSH from anywhere. I cannot go into how to do this in detail since it depends on your registrar (Loopia, GoDaddy, Amazon Route 53, etc.). But in summary, go to your registrar, and create a CNAME
record that points <name-for-VM>.yourdomain.se
to the already existing (sub)domain of your host server (yourhost.yourdomain.se
). Once the new DNS settings have propagated across the web, you should be able to reach your new virtual machine using your chosen URL.
While still on the host, make sure that the virtual machine is running:
    $ virsh list --all
    $ virsh start <guestname>
Share directories on the host to the guests
Shutdown the virtual machine and edit its configuration XML file:
    $ virsh shutdown <guestname>
    $ virsh edit <guestname>
and add these blocks inside the devices
tree:
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/opt/R'/>
      <target dir='R'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/usr/local/texlive'/>
      <target dir='texlive'/>
    </filesystem>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/usr/shares'/>
      <target dir='s'/>
    </filesystem>
The string in target dir
is just a label, which we will have use for in the fstab (on the guest) to mount these shares. Make the label informative.
Restart the VM.
$ virsh start <guestname>
And with that, let’s move on to mounting the passed-through directories on the VM.
Setup the guest (virtual machine)
Filesystem passthrough for R, TeXLive and s
Log in to the VM. Append the following lines to the file /etc/modules
:
    loop
    virtio
    9p
    9pnet
    9pnet_virtio
then load those modules (make sure to do this step as root):
$ sudo service kmod start
Create the mountpoints for the shares on the guest. The mountpoints for R and TeXLive should not exist on the guest, so simply create those directories (use sudo if necessary):
    $ mkdir /opt/R
    $ mkdir /usr/local/texlive
The s directory probably exists already, so let’s delete it and recreate an empty one:
    $ sudo rm -rf /usr/shares
    $ sudo mkdir /usr/shares
Now you should be ready to mount the passed-through directories (here we make use of the label we set in target dir above):
    $ sudo mount R /opt/R -t 9p -o trans=virtio
    $ sudo mount texlive /usr/local/texlive -t 9p -o trans=virtio
    $ sudo mount s /usr/shares -t 9p -o trans=virtio
Now make sure everything works as expected on the guest. You should be able to read from these directories, but not write.
If everything works as expected, go ahead and add these shares to /etc/fstab
for automatic mounting at boot:
    R        /opt/R              9p  trans=virtio  0  0
    texlive  /usr/local/texlive  9p  trans=virtio  0  0
    s        /usr/shares         9p  trans=virtio  0  0
Note: by default, the security profile of KVM will not allow the guest to write to any directories mounted using the passthrough method. And this works just fine for our purposes, since all maintenance of the R, LaTeX and s directories should be done on the host machine anyway.
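After a reboot of the guest, you can confirm that the fstab entries were picked up by listing the 9p filesystems the kernel has mounted. A minimal sketch follows; `list_9p_mounts` is a made-up helper name, and the sample mounts table in the demo is illustrative rather than copied from a real guest.

```shell
# List label and mountpoint of all 9p filesystems found in a mounts table
# (default: /proc/mounts). Fields are: device mountpoint fstype options ...
# list_9p_mounts is a hypothetical helper name.
list_9p_mounts() {
    awk '$3 == "9p" { print $1 " -> " $2 }' "${1:-/proc/mounts}"
}

# Demo on an illustrative mounts table, so the snippet also runs off the guest.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/vda1 / ext4 rw,relatime 0 0
R /opt/R 9p rw,relatime,trans=virtio 0 0
texlive /usr/local/texlive 9p rw,relatime,trans=virtio 0 0
EOF
list_9p_mounts "$sample"
rm -f "$sample"
```

On the guest itself, running `list_9p_mounts` without arguments should print one line per passed-through directory.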
Make R and TeXLive executables accessible
For R and LaTeX to work as expected on the guest, it’s not enough to make the /opt/R
and /usr/local/texlive
trees available. We should also expose the executables where other programs expect to find them, namely somewhere on the system’s PATH.
Additionally, there are a few crucial TeX utilities that we need to get from the host to the guest.
    $ ls -1 /usr/bin/tex*
    /usr/bin/texi2any
    /usr/bin/texi2dvi
    /usr/bin/texi2pdf
    /usr/bin/texindex
These are actual files and not links. They are not available on the guest, but need to be for TeXLive to work properly in conjunction with R.
There’s no way to passthrough single files from host to guest, and we definitely do not want to passthrough the entire /usr/bin/
directory. So although inelegant, the best solution appears to be to simply copy these files from host to guest.
Note that you will need to redo this if these files should be updated by tlmgr
on the host in the future. Assuming you have setup SSH access from host to guest, you can copy them like this:
    chepec@yourhost:~
    $ scp -p /usr/bin/texi* <guestname>:/home/chepec/Downloads/
    $ ssh <guestname>
    chepec@<guestname>:~
    $ sudo cp -a /home/chepec/Downloads/texi* /usr/bin/
    $ sudo chown root:root /usr/bin/texi*
Create symlinks in /usr/bin/
for the R executables:
    $ sudo ln -s /opt/R/3.5.1/bin/R /usr/bin/R
    $ sudo ln -s /opt/R/3.5.1/bin/Rscript /usr/bin/Rscript
And add the TexLive path to the system PATH
. In /etc/profile
, append
PATH="/usr/local/texlive/2018/bin/x86_64-linux:$PATH"
Why not add the R/bin
directory to PATH as well, you ask? Well, you may well want to do that, especially if you have more than just two executables in there. Both methods (symlinking or editing $PATH
) achieve the same result.
NOTE: if you have a network share or similar where you keep your work files, you might want to mount that too on the guest. Since you most definitely want to read and write to this directory from the VM, you might want to use NFS (if it’s inside the LAN) or SSHFS (otherwise) to mount it instead of KVM passthrough. If you do, having your regular user account on the VM have the same username, uid and gid as that on the share will most certainly make your life as an admin easier.
Install Rstudio Server
Browse to https://www.rstudio.com/products/rstudio/download-server/ and follow the instructions to download and install the latest RStudio Server Open Source Edition.
The installation will automatically set up a systemd startup script (or upstart, depending on your system) for Rstudio server. Neat!
By default, RStudio Server listens on port 8787 on all interfaces, 0.0.0.0:8787
. At this point, you should be able to verify that RStudio Server is accessible via your browser by entering http://<ip-address-guest>:8787
.
Back to the host
Configure Apache reverse proxy and Let’s Encrypt certificates
The Apache web server on our host machine will act as a reverse proxy, and we will also install a free TLS (HTTPS) certificate from Let’s Encrypt for our RStudio Server instance.
Let’s say you named your virtual machine callisto
and created a subdomain rstudio.callisto.yourhost.se
for it.
We would thus create a new Apache virtualhost file /etc/apache2/sites-available/rstudio.callisto.yourhost.se.conf
with the following contents:
    <VirtualHost *:80>
       ServerAdmin webmaster@yourhost.se
       ServerName rstudio.callisto.yourhost.se
       Redirect permanent / https://rstudio.callisto.yourhost.se/

       ErrorLog ${APACHE_LOG_DIR}/rstudio.callisto.yourhost.se_error.log
       CustomLog ${APACHE_LOG_DIR}/rstudio.callisto.yourhost.se_access.log combined
    </VirtualHost>

    <VirtualHost *:443>
       ServerAdmin webmaster@yourhost.se
       ServerName rstudio.callisto.yourhost.se

       SSLEngine on
       SSLCertificateKeyFile /etc/letsencrypt/live/rstudio.callisto.yourhost.se/privkey.pem
       SSLCertificateFile /etc/letsencrypt/live/rstudio.callisto.yourhost.se/cert.pem
       SSLCertificateChainFile /etc/letsencrypt/live/rstudio.callisto.yourhost.se/chain.pem

       <Proxy *>
          Allow from localhost
       </Proxy>

       RewriteEngine on
       RewriteCond %{HTTP:Upgrade} =websocket
       RewriteRule /(.*) ws://192.168.100.110:8787/$1 [P,L]
       RewriteCond %{HTTP:Upgrade} !=websocket
       RewriteRule /(.*) http://192.168.100.110:8787/$1 [P,L]
       ProxyPass / http://192.168.100.110:8787/
       ProxyPassReverse / http://192.168.100.110:8787/
       ProxyRequests Off

       ErrorLog ${APACHE_LOG_DIR}/rstudio.callisto.yourhost.se_error.log
       CustomLog ${APACHE_LOG_DIR}/rstudio.callisto.yourhost.se_access.log combined
    </VirtualHost>
You should of course edit domain name, email address and IP address above to fit your setup.
With DNS in place, now is the time to install the Let’s Encrypt certificate. This will be managed by certbot
. EFF has excellent instructions on how to install the certbot
client. With certbot
installed, fetch the certificate (I like to manage the integration with Apache myself, so I use the certonly
option):
    $ sudo certbot certonly --authenticator standalone \
        --pre-hook "systemctl stop apache2.service" \
        --post-hook "systemctl start apache2.service" \
        --email webmaster@yourhost.se -d rstudio.callisto.yourhost.se
You should really setup a CRON job or a systemd timer for automatic renewal of your Let’s Encrypt certificates. Check the Certbot docs and the web for info on how to accomplish that.
With the TLS certificate installed, we are now ready to enable our virtualhost and reload Apache:
    $ sudo a2ensite rstudio.callisto.yourhost.se.conf
    $ sudo systemctl reload apache2.service
If no errors were thrown, you should now be able to reach your RStudio Server instance running on your VM from your browser using the URL http://rstudio.callisto.yourhost.se, which should automatically redirect to https.
Create additional VMs (KVM cloning)
Shutdown the VM before cloning.
The cloning operation is straight-forward (should be done as root):
    $ virt-clone --original callisto --name ganymede --file /var/lib/libvirt/images/ganymede.qcow2
    Allocating 'ganymede.qcow2'                     | 20 GB 00:00:39

    Clone 'ganymede' created successfully.
This clones callisto
and sets the new VM’s name to ganymede
. Note that this naming only affects the VM hypervisor, we still have to login to the new guest and edit the hostname (in /etc/hosts
and /etc/hostname
).
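The hostname edit on the freshly cloned guest is easy to get wrong by hand, so here is one way it could be scripted. Treat it as a sketch: `rename_guest` is a hypothetical helper, and the file paths are arguments so that you can try it on scratch copies before touching the real files.

```shell
# Write the new hostname to <hostname_file> and replace whole-word
# occurrences of the old name in <hosts_file>.
# rename_guest is a hypothetical helper name.
rename_guest() {
    old="$1"; new="$2"; hostname_file="$3"; hosts_file="$4"
    printf '%s\n' "$new" > "$hostname_file"
    tmp=$(mktemp) || return 1
    # Compare field by field so that only exact matches are replaced.
    # Lines that get modified are re-joined with single spaces.
    awk -v old="$old" -v new="$new" \
        '{ for (i = 1; i <= NF; i++) if ($i == old) $i = new; print }' \
        "$hosts_file" > "$tmp" && cat "$tmp" > "$hosts_file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

On the new guest, as root, this would be `rename_guest callisto ganymede /etc/hostname /etc/hosts`, followed by a reboot so that the new name takes effect everywhere.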
The bridged networking we setup above means that the new VM will get its own IP address assigned by the router (which you might want to assign statically by the way). We need to create a new subdomain that this instance will be reached on, and once that’s in place we will create a new Apache vhost and download a Let’s Encrypt certificate.
Create /etc/apache2/sites-available/rstudio.ganymede.yourhost.se.conf
with the following contents:
    <VirtualHost *:80>
       ServerAdmin webmaster@yourhost.se
       ServerName rstudio.ganymede.yourhost.se
       Redirect permanent / https://rstudio.ganymede.yourhost.se/

       ErrorLog ${APACHE_LOG_DIR}/rstudio.ganymede.yourhost.se_error.log
       CustomLog ${APACHE_LOG_DIR}/rstudio.ganymede.yourhost.se_access.log combined
    </VirtualHost>

    <VirtualHost *:443>
       ServerAdmin webmaster@yourhost.se
       ServerName rstudio.ganymede.yourhost.se

       SSLEngine on
       SSLCertificateKeyFile /etc/letsencrypt/live/rstudio.ganymede.yourhost.se/privkey.pem
       SSLCertificateFile /etc/letsencrypt/live/rstudio.ganymede.yourhost.se/cert.pem
       SSLCertificateChainFile /etc/letsencrypt/live/rstudio.ganymede.yourhost.se/chain.pem

       <Proxy *>
          Allow from localhost
       </Proxy>

       RewriteEngine on
       RewriteCond %{HTTP:Upgrade} =websocket
       RewriteRule /(.*) ws://192.168.100.111:8787/$1 [P,L]
       RewriteCond %{HTTP:Upgrade} !=websocket
       RewriteRule /(.*) http://192.168.100.111:8787/$1 [P,L]
       ProxyPass / http://192.168.100.111:8787/
       ProxyPassReverse / http://192.168.100.111:8787/
       ProxyRequests Off

       ErrorLog ${APACHE_LOG_DIR}/rstudio.ganymede.yourhost.se_error.log
       CustomLog ${APACHE_LOG_DIR}/rstudio.ganymede.yourhost.se_access.log combined
    </VirtualHost>
Note the different IP address and server name (edit to fit your setup).
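Since only the server name and the IP address differ between the vhosts, the new file could also be derived from the existing one instead of being typed by hand. The following is a minimal sketch, not part of the original setup: the function name derive_vhost is hypothetical, and the old IP address in the usage comment is an assumption that you must replace with whatever your callisto vhost actually proxies to.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy an existing vhost file, swapping VM name and IP.
# Usage: derive_vhost SRC_CONF DST_CONF OLD_NAME NEW_NAME OLD_IP NEW_IP
derive_vhost() {
    local src="$1" dst="$2" old_name="$3" new_name="$4" old_ip="$5" new_ip="$6"
    # Escape the dots in the old IP so sed matches them literally.
    local old_ip_re="${old_ip//./\\.}"
    # \b (GNU sed) keeps 192.168.100.11 from matching inside 192.168.100.110.
    sed -e "s/${old_name}/${new_name}/g" \
        -e "s/\b${old_ip_re}\b/${new_ip}/g" \
        "$src" > "$dst"
}

# Example (the old IP below is an assumed value, adjust to your network):
#   cd /etc/apache2/sites-available
#   derive_vhost rstudio.callisto.yourhost.se.conf \
#                rstudio.ganymede.yourhost.se.conf \
#                callisto ganymede 192.168.100.110 192.168.100.111
```

It is worth reading through the generated file before enabling it, since a blind search-and-replace will also rewrite log file names and certificate paths (which is what we want here, but may not be in a differently structured vhost).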
And install the Let’s Encrypt certificate:
$ sudo certbot certonly --authenticator standalone \
    --pre-hook "systemctl stop apache2.service" \
    --post-hook "systemctl start apache2.service" \
    --email webmaster@yourhost.se -d rstudio.ganymede.yourhost.se
Enable the site and reload the web server:
$ sudo a2ensite rstudio.ganymede.yourhost.se.conf
$ sudo systemctl reload apache2.service
And that should be it.
Autostart the virtual machines
Virtual machines created with KVM do not autostart by default when the host reboots. But for our application, it might be desirable to have them autostart. To achieve that, issue the following for each VM.
$ virsh autostart <guestname>
You can confirm the new setting by:
$ virsh list --all --autostart
$ virsh list --all --no-autostart
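With more than a couple of guests, the autostart command can be wrapped in a small loop. This is a sketch under assumptions: the function name autostart_guests and the VIRSH override variable are hypothetical conveniences, not something virsh itself provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mark every named guest as autostart.
# The VIRSH variable can be overridden, e.g. VIRSH=echo for a dry run.
autostart_guests() {
    local virsh_cmd="${VIRSH:-virsh}" guest
    for guest in "$@"; do
        "$virsh_cmd" autostart "$guest"
    done
}

# Example:
#   autostart_guests callisto ganymede
# Dry run that only prints the virsh arguments that would be used:
#   VIRSH=echo autostart_guests callisto ganymede
```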
Conclusion
This is admittedly a resource-demanding solution to the “single session per browser” limitation that RStudio puts on RStudio Server OSE. It also requires the admin to set up multiple VMs and the user to remember multiple URLs.
Still, it’s the most reliable and resource-efficient way I have seen to date to achieve actual multi-sessions without any restrictions in the browser, outside of the Pro offering of course. And the single R/LaTeX installation at its core makes it quite maintainable, and means that adding more VMs (i.e., more RStudio Server instances) is nearly cost-free in terms of setup.
I have been running this setup myself for about a year now, and I decided to write up this project in the hope that others (single researchers like me, or groups/departments) might find it useful. If you have comments or questions, please feel free to ask them on my Twitter or Mastodon accounts (and sorry for the lack of comments on this site – I’m still figuring out how to implement that with Hugo outside of Disqus).
Appendix
Remove an existing R aptitude installation
Here’s how to completely remove an existing R installation that was installed from the Debian repos. Please be careful to record any configuration you want to move over to your new installation. This will remove everything, including config files.
The installed R version was 3.4.4 (2018-03-15). List all of the installed R packages:
chepec@yourhost:~ $ dpkg -l | grep ^ii | awk '$2 ~ /^r-/ { print $2 }'
r-base
r-base-core
r-base-dev
r-base-html
r-cran-boot
r-cran-class
r-cran-cluster
r-cran-codetools
r-cran-crayon
r-cran-digest
r-cran-evaluate
r-cran-foreign
r-cran-irdisplay
r-cran-jsonlite
r-cran-kernsmooth
r-cran-lattice
r-cran-magrittr
r-cran-mass
r-cran-matrix
r-cran-memoise
r-cran-mgcv
r-cran-nlme
r-cran-nnet
r-cran-pbdzmq
r-cran-r6
r-cran-repr
r-cran-rgl
r-cran-rpart
r-cran-spatial
r-cran-stringi
r-cran-stringr
r-cran-survival
r-cran-uuid
r-doc-html
r-recommended
Uninstall all the R packages identified above in one fell swoop:
chepec@yourhost:~ $ dpkg -l | grep ^ii | awk '$2 ~ /^r-/ { print $2 }' | xargs sudo apt remove --purge -y
Clean up any remaining directories (adapt this to catch all of them):
$ sudo rm -rf /usr/lib/R
$ sudo rm -rf /etc/R
Jupyter, an alternative to RStudio Server?
One drastic solution is to abandon RStudio Server altogether and look for alternatives.
The most obvious one is Jupyter notebooks, which should offer a comparable web-based interface, in particular with JupyterLab/JupyterHub. The advantage is that Jupyter supports not only R but other kernels too. The disadvantage lies mainly in my own unfamiliarity with the JupyterHub setup.
Sharing RStudio Server projects with other users
I would like to include a few words on the things I tried that did not pan out, in the hope that my experiences may help someone else save some time, or perhaps spur someone to explore a weak spot in my reasoning.
One of the first things I tried was to use RStudio Server’s project feature with other users. This ultimately failed.
Files and folders inside the RStudio project directory, .Rproj.user/, always end up with no group write permissions, which means the last user of the project effectively takes the project hostage.
This effectively blocks the most obvious route, i.e., to create different user accounts belonging to the same group, and run concurrent RStudio Server sessions in different browsers while logged in as a different user in each one. I found that this works, technically, but only up until you try to actually open a project, which always fails (except by the user that created the project).
One idea is to set the umask value in the system Rprofile to something more permissive. I tried this, but found it had no apparent effect on RStudio Server’s newly created project directories (the setting did seem to affect the R session, except that it did not alter the permissions on the *.Rproj file). So, unfortunately, it seems the RStudio project feature is somehow setting its own more restrictive umask independently of the R session, effectively blocking this approach.
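For readers unfamiliar with umask, here is a small shell illustration of what a more permissive value normally changes. It is a sketch that only demonstrates ordinary process behaviour and says nothing about what RStudio Server does internally; the function name mode_under_umask is hypothetical, and stat -c assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the octal mode a newly created file gets
# under the given umask (assumes GNU stat, as found on Ubuntu).
mode_under_umask() {
    local mask="$1" dir
    dir="$(mktemp -d)"
    # Run in a subshell so the umask change does not leak into the caller.
    (
        umask "$mask"
        touch "$dir/probe"
        stat -c '%a' "$dir/probe"
    )
    rm -rf "$dir"
}

# Example:
#   mode_under_umask 022   # common default, group has no write permission
#   mode_under_umask 002   # group members may write
```

A process that passes an explicit, restrictive mode when it creates files, or that sets its own umask, will override whatever the surrounding session has configured, which would be consistent with the behaviour described above.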
In short, I found no way to make RStudio projects work with more than one Linux user. Of course, abandoning the projects feature “solves” this issue, but that’s a trade-off I was not willing to make.
Sidenote: this could possibly, maybe, be solved by using ACL permissions. But that is a whole other can of worms, and I am far from convinced it is a viable approach. Haven’t tried it, though.
Links
Sort of a bibliography. May or may not have been cited/linked above.
R compilation from source
- https://rviews.rstudio.com/2018/03/21/multiple-versions-of-r/
- http://docs.rstudio.com/ide/server-pro/r-versions.html#recommended-installation-directories
- https://cran.r-project.org/doc/manuals/R-admin.html
- https://stackoverflow.com/questions/7541101/logic-of-installation-location-of-r-packages-under-linux
KVM setup
- https://linuxnewbieguide.org/how-to-setup-a-kvm-server-the-fast-way/
- https://www.ostechnix.com/setup-headless-virtualization-server-using-kvm-ubuntu/
- https://www.dedoimedo.com/computers/kvm-intro.html
- https://www.cyberciti.biz/faq/installing-kvm-on-ubuntu-16-04-lts-server/
- https://serverfault.com/questions/208693/difference-between-kvm-and-qemu
- http://rabexc.org/posts/how-to-get-started-with-libvirt-on
- http://www.jaredlog.com/?p=1484
- https://www.dedoimedo.com/computers/kvm-clone.html
KVM filesystem passthrough
- https://wiki.qemu.org/Documentation/9psetup
- http://www.linux-kvm.org/page/9p_virtio
- https://troglobit.github.io/2013/07/05/file-system-pass-through-in-kvm-slash-qemu-slash-libvirt/
- https://dustymabe.com/2012/09/11/share-a-folder-between-kvm-host-and-guest/
- https://pascalandreas.wordpress.com/2015/04/24/setting-up-kvm-shared-directory-in-ubuntu-14-04/
- https://serverfault.com/questions/342801/read-write-access-for-passthrough-9p-filesystems-with-libvirt-qemu
- https://askubuntu.com/questions/548208/sharing-folder-with-vm-through-libvirt-9p-permission-denied/567541
Network configuration, bridging, static DHCP, port forwarding
- https://www.linux-kvm.org/page/Networking#Public_Bridge
- https://www.dedoimedo.com/computers/kvm-bridged.html
- https://help.ubuntu.com/community/KVM/Networking
- https://www.cyberciti.biz/faq/debian-ubuntu-linux-kvm-guest-shared-physical-network/
- https://www.howtogeek.com/184310/ask-htg-should-i-be-setting-static-ip-addresses-on-my-router/
- https://wiki.dd-wrt.com/wiki/index.php/Static_DHCP
- https://superuser.com/questions/284051/what-is-port-forwarding-and-what-is-it-used-for
- https://www.lorextechnology.com/self-serve/port-forwarding-a-router/R-sc2900030
- https://help.ubnt.com/hc/en-us/articles/217367937-EdgeRouter-Port-Forwarding
Apache virtual hosts, proxies, etc.
- https://serverfault.com/questions/195611/how-do-i-redirect-subdomains-to-a-different-port-on-the-same-server/195831#195831
- https://serverfault.com/questions/370724/proxypass-apache-over-nonstandard-port-how-can-it-be-that-the-192-168-9999-w
- https://serverfault.com/questions/472482/proxypass-redirect-directory-url-to-non-standard-port
- https://stackoverflow.com/questions/21598787/retaining-protocol-and-port-number-from-reverse-proxy-request
Rstudio server configuration
Rstudio server multiple sessions
- Run multiple instances of RStudio in a web browser, StackOverflow
- Multiple R Sessions in RStudio Server Pro, RStudio Support
- Unable to run multiple RStudio Server sessions at the same time, Batch Connect – OSC RStudio Server
- Parallel processes for Rstudio-server, much like Jupyter Notebooks, RStudio Community
- Multiple Rstudio server sessions from one user account, RStudio Support
- Sharing a Project with multiple users
- Troubleshooting Project Sharing in RStudio Server Pro
- Using RStudio projects AND Git with multiple users
- https://discuss.ropensci.org/t/peace-between-git-and-dropbox-with-git-worktree/289
R and Rstudio environment (package management, etc.)
sessionInfo()
sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] ggplot2_3.0.0     dplyr_0.7.6       magrittr_1.5      common_0.0.0.9012
## [5] knitr_1.20
##
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.18         bindr_0.1.1          munsell_0.5.0
##  [4] tidyselect_0.2.4     colorspace_1.3-2     R6_2.2.2
##  [7] rlang_0.2.2          plyr_1.8.4           stringr_1.3.1
## [10] tools_3.5.1          grid_3.5.1           gtable_0.2.0
## [13] xfun_0.3             withr_2.1.2          htmltools_0.3.6.9003
## [16] lazyeval_0.2.1       yaml_2.2.0           rprojroot_1.3-2
## [19] digest_0.6.17        assertthat_0.2.0     tibble_1.4.2
## [22] crayon_1.3.4         bookdown_0.7         bindrcpp_0.2.2
## [25] purrr_0.2.5          glue_1.3.0           evaluate_0.11
## [28] rmarkdown_1.10       blogdown_0.8         stringi_1.2.4
## [31] compiler_3.5.1       pillar_1.3.0         scales_1.0.0
## [34] backports_1.1.2      pkgconfig_2.0.2