Securely storing your secrets in R code
by Andrie de Vries
Last month I wrote about How to store and use webservice keys and authentication details, a summary of the options mentioned in a Twitter discussion started by Jennifer Bryan. All of the options in that article stored the secrets in plain text somewhere on your system, but in such a way as to minimize the risk of accidentally publishing them. Since then, I've received several comments (via Twitter as well as the blog comments) about alternative options that store your keys securely:
Using an encrypted disk
The first solution is to use an encrypted disk for the storage. As long as the encrypted disk is mounted, you work with its contents as if they were available in plain text. Thus, from the point of view of the R user, you simply store your secrets in “plain text” somewhere on the encrypted disk.
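As a minimal sketch of what this looks like from R, assuming the secrets are kept as `name: value` lines in a file on the mounted disk: the helper name `read_secrets()`, the file layout and the use of a temporary file in place of the real mount point are all assumptions made for illustration.

```r
# Hypothetical helper: read "name: value" pairs from a file on the encrypted disk.
read_secrets <- function(path) {
  if (!file.exists(path)) {
    stop("Secrets file not found; is the encrypted disk mounted?")
  }
  fields <- read.dcf(path)                      # character matrix, one column per field
  setNames(as.list(fields[1, ]), colnames(fields))
}

# For demonstration a temporary file stands in for a path on the encrypted disk.
secrets_path <- tempfile(fileext = ".dcf")
writeLines(c("login: foo", "password: bar"), secrets_path)

secrets <- read_secrets(secrets_path)
secrets$login
```

If the disk is not mounted, the call fails with an explicit error instead of silently returning nothing.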
Using the digest package
The second alternative is to use the digest package, maintained by Dirk Eddelbuettel. Stephane Doyen provided the following solution:
- I use the digest package that allows AES encryption.
- I use two functions: one that writes AES-encrypted files, and another that reads and decrypts those files. You can find those functions on GitHub.
- Finally, I use the digest package to generate the key required to encrypt and decrypt files.
- Once all of this is in place I create a dataframe that contains the login and the password.
- I use the write.aes() function to write the credentials locally to an encrypted file.
- The read.aes() function decrypts the credentials and imports them into R.
That way no credential appears in plain text or in the code. Additionally, one could decide to store the key elsewhere (remote server, USB drive, etc.). Also, this solution does not require prompting for a password each time.
Stephane provides this sample code to illustrate:
    source("crypt.R")
    load("key.RData")
    credentials <- data.frame(login = "foo", password = "bar", stringsAsFactors = FALSE)
    write.aes(df = credentials, filename = "credentials.txt", key = key)
    rm(credentials)
    credentials <- read.aes(filename = "credentials.txt", key = key)
    print(credentials)
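The helper functions themselves live in Stephane's repository and are not reproduced here. As a rough sketch of how such helpers could be built on digest's `AES()` and `digest()` functions, the following uses the hypothetical names `write_aes()` and `read_aes()`, a placeholder passphrase, CSV encoding with zero-byte padding, and ECB mode to keep the example short. All of these are assumptions for illustration, not Stephane's code.

```r
library(digest)

# Derive a 32-byte (AES-256) key by hashing a passphrase with SHA-256.
# The passphrase is a placeholder; in practice the key is kept outside the code.
key <- digest("correct horse battery staple", algo = "sha256",
              serialize = FALSE, raw = TRUE)

# Hypothetical stand-in for write.aes(): CSV-encode, zero-pad, encrypt, write.
write_aes <- function(df, filename, key) {
  txt <- paste(capture.output(write.csv(df, row.names = FALSE)), collapse = "\n")
  bytes <- charToRaw(txt)
  # AES works on 16-byte blocks, so pad with zero bytes to a multiple of 16.
  bytes <- c(bytes, raw((16 - length(bytes) %% 16) %% 16))
  aes <- AES(key, mode = "ECB")  # simplification: ECB reveals repeated blocks
  writeBin(aes$encrypt(bytes), filename)
  invisible(filename)
}

# Hypothetical stand-in for read.aes(): read, decrypt, strip padding, parse CSV.
read_aes <- function(filename, key) {
  enc <- readBin(filename, what = "raw", n = file.size(filename))
  aes <- AES(key, mode = "ECB")
  bytes <- aes$decrypt(enc, raw = TRUE)
  txt <- rawToChar(bytes[bytes != as.raw(0)])
  read.csv(text = txt, stringsAsFactors = FALSE)
}

credentials <- data.frame(login = "foo", password = "bar", stringsAsFactors = FALSE)
path <- tempfile(fileext = ".aes")
write_aes(credentials, path, key)
restored <- read_aes(path, key)
```

ECB mode is chosen only for brevity. For anything beyond a demonstration, a mode with an initialization vector, or an authenticated scheme such as the one sodium provides, would be a better choice.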
Using the sodium package
The third option is to use the sodium package, written by Jeroen Ooms. The sodium package is an R wrapper around the libsodium cryptographic library.
Bindings to libsodium: a modern, easy-to-use software library for encryption, decryption, signatures, password hashing and more. Sodium uses curve25519, a state-of-the-art Diffie-Hellman function by Daniel Bernstein, which has become very popular after it was discovered that the NSA had backdoored Dual EC DRBG.
What this means is that you can use sodium to configure secure communication, including using asymmetric keys, directly from R.
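A minimal sketch of asymmetric encryption with sodium, using its `keygen()`, `pubkey()`, `simple_encrypt()` and `simple_decrypt()` functions. The token value is a placeholder, and in real use the private key would be generated once and kept by the recipient only.

```r
library(sodium)

# The recipient generates a key pair and shares only the public key.
private_key <- keygen()
public_key <- pubkey(private_key)

# Anyone holding the public key can encrypt a message for the recipient.
msg <- charToRaw("my-api-token")  # placeholder secret
cipher <- simple_encrypt(msg, public_key)

# Only the holder of the private key can decrypt it.
plain <- rawToChar(simple_decrypt(cipher, private_key))
```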
To use sodium to encrypt your keys, you can use a similar strategy as described above in the section on digest.
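One way this could look, as a sketch under assumptions: the key is derived from a placeholder passphrase with `sha256()`, the credentials are serialized and encrypted with `data_encrypt()`, and the result is stored with `saveRDS()` so that the random nonce sodium attaches to the ciphertext is preserved. The file location is a temporary file chosen for illustration.

```r
library(sodium)

# Derive a 32-byte symmetric key from a passphrase kept outside the code.
key <- sha256(charToRaw("a long passphrase kept outside the code"))

credentials <- data.frame(login = "foo", password = "bar", stringsAsFactors = FALSE)

# Encrypt the serialized data frame; a random nonce is stored as an attribute.
cipher <- data_encrypt(serialize(credentials, NULL), key)

# saveRDS() keeps attributes, so the nonce travels with the ciphertext.
path <- tempfile(fileext = ".rds")
saveRDS(cipher, path)
rm(credentials)

# Later: read the file back, decrypt, and restore the data frame.
stored <- readRDS(path)
credentials <- unserialize(data_decrypt(stored, key))
```

Because `data_encrypt()` is authenticated, decryption with the wrong key or a modified file raises an error instead of returning garbage.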
Jeroen provides two excellent vignettes for the package. These vignettes give an easy-to-follow and illuminating overview of encryption as well as symmetric and asymmetric keys.
Using the secure package
The final option is to use the secure package, written by Hadley Wickham. From the package readme:
The secure package provides a secure vault within a publicly available code repository. It allows you to store private information in a public repository so that only select people can read it. This is particularly useful for testing because you can now store private credentials in your public repo, without them being readable by the world.
Secure is built on top of asymmetric (public/private key) encryption. Secure generates a random master key and uses that to encrypt (with AES256) each file in vault/. The master key is not stored unencrypted anywhere; instead, an encrypted copy is stored for each user, using their own public key. Each user can then decrypt the encrypted master key using their private key, then use that to decrypt each file.
Understanding how this works might require careful study.
However, the bottom line is this:
- The secrets are stored in the repository, encrypted with a master key that is itself encrypted with the public key of each person you want to be able to decrypt them.
- You can use the public key available on GitHub for every user.
- You can also use a public key of Travis, if you use continuous integration.
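The scheme described above (a random master key that encrypts the files, plus one encrypted copy of that master key per user) can be illustrated with a short sketch. This is not the secure package's API or implementation: it swaps in the sodium package for the cryptography, and the user names and secret values are made up.

```r
library(sodium)

# Each user has a key pair; only the public halves are needed to grant access.
alice_key <- keygen()
bob_key <- keygen()
alice_pub <- pubkey(alice_key)
bob_pub <- pubkey(bob_key)

# A random master key encrypts the actual secret (the "vault" file).
master_key <- random(32)
secret <- serialize(list(api_key = "abc123"), NULL)  # placeholder secret
vault_file <- data_encrypt(secret, master_key)

# The master key is stored only in encrypted form, one copy per user.
wrapped_keys <- list(
  alice = simple_encrypt(master_key, alice_pub),
  bob = simple_encrypt(master_key, bob_pub)
)

# Bob recovers the master key with his private key, then decrypts the vault.
bob_master <- simple_decrypt(wrapped_keys$bob, bob_key)
recovered <- unserialize(data_decrypt(vault_file, bob_master))
```

Adding a user only requires encrypting the master key once more with the new public key; the vault files themselves do not need to be re-encrypted.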
Hadley gives step-by-step instructions for using the package at the GitHub repository.
Conclusion
You can use several mechanisms to store your secrets using packages that are readily available on CRAN and GitHub.
Let me know about your experience in the comments!