RStudio Workbench Load Balancing Changes
As we’re putting the finishing touches on the RStudio Workbench 2021.09.0 “Ghost Orchid” release, we’d like to share one of the new sets of features we’re most excited about. We’ve revisited and revamped the administration experience for load balancing clusters.
Specifically, we’ve worked to improve the cluster management and troubleshooting. To make this possible, cluster data is now stored within the internal database. The load balancing configuration file no longer requires a list of each node in the cluster. In fact, the file can be completely empty – though its presence is required. This means nodes can join and leave the cluster without bringing down and re-configuring every node – scaling your cluster has never been easier!
When provided an empty configuration file, RStudio Workbench predicts the address that other nodes can reach each node at. For more complicated configurations, we’ve included an escape hatch through the new www-host-name
option which be can included in the file to instruct RStudio Workbench to use a specified hostname. A detailed explanation of the approach taken to determine each node’s address and the new option can be found in the Admin Guide.
Furthermore, we’ve added several new commands to the rstudio-server
admin tool to improve load balancing cluster management.
The first command, rstudio-server list-nodes
displays each node and information about its current status. It is intended to be use in conjunction with the existing status endpoint (accessed through curl http://localhost:8787/load-balancer/status
) to monitor the status of your nodes and aid in identifying and addressing issues.
The following is an example of this output:
$ sudo rstudio-server list-nodes Cluster ------- Protocol Http Nodes ----- ID Host IPv4 Port Status Last Seen 1 rsw-primary 172.98.8.241 80 Online 2021-Sep-20 17:08:53 2 rsw-secondary 172.98.14.255 80 Invalid secure cookie key 2021-Sep-20 17:10:25 3 rsw-tertiary 172.98.6.205 80 Offline 2021-Sep-20 17:10:34
Because load balancing now makes use of the internal database, each node validates its secure cookie key and configured protocol against the database before coming online. The first node online sets the values used for validation. The results of that validation are stored in the database and easily retrievable through the rstudio-server list-nodes
command, allowing for easy troubleshooting when encountering unexpected issues with your cluster.
We’ve added the command rstudio-server reset-cluster
to reset the cluster’s state used for validation. This should be run after replacing the secure cookie key on each node or after updating the protocol the cluster is using (http
, https
, or https-no-verify
). Again, the first node brought online or restarted after this reset will determine the configuration used for validation.
Finally, the command rstudio-server delete-node <node-id>
allows you to easily remove nodes from the cluster. The required node-id
parameter can be retrieved from the output of the rstudio-server list-nodes
command. When a node is deleted, the other nodes in the cluster will no longer try to contact that node; there is no need to restart the active nodes after running this. This command should only be used for nodes that are offline and will not be coming back online.
There are many more features coming with this release. If you’re interested in giving them a try, check out the RStudio 2021.09.0 Preview for the latest installers and release notes.