
{filebin} Quick & Easy File Sharing

[This article was first published on R - datawookie, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

At Fathom Data we have a number of workflows that require us to share various bits of data for a short time. The data are not sensitive, so we can share them freely. We have been doing this manually via platforms like Google Drive, Box or Dropbox, but then we need to remember to go back and delete the file some time later. This is not ideal. What we needed was a simple “fire and forget” solution: share a file and have it disappear automatically after some time. This is precisely what Filebin does.

Filebin allows you to upload and share a file. The file can be deleted at any time; if it is not deleted manually, it is removed automatically after 6 days.

{filebin} R Package

There’s a neat Filebin API, so I built a little wrapper package, {filebin}, which allows direct access from R.

Install the package.

remotes::install_github("datawookie/filebin")

Load the package and check the version. The examples below also use %>% and select() from {dplyr}, so load that too.

library(filebin)
library(dplyr)

packageVersion("filebin")
[1] ‘0.0.3’

Posting a File

I’ve got copies of a selection of Open Source licenses.

licenses
[1] "license-AGPL-3.md"   "license-apache-2.md" "license-cc0.md"     
[4] "license-ccby-4.md"   "license-GPL-2.md"    "license-GPL-3.md"   
[7] "license-LGPL-2.1.md" "license-LGPL-3.md"   "license-mit.md"     

Let’s upload the LGPL to Filebin.

lgpl <- post("license-LGPL-3.md")
str(lgpl)
tibble [1 × 9] (S3: tbl_df/tbl/data.frame)
 $ url         : chr "https://filebin.net/d4i1rhv6ic6kl8fz/license-LGPL-3.md"
 $ bin         : chr "d4i1rhv6ic6kl8fz"
 $ filename    : chr "license-LGPL-3.md"
 $ content_type: chr "text/plain; charset=utf-8"
 $ bytes       : int 7560
 $ md5         : chr "YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY="
 $ sha256      : chr "446e755fae55ff034bbb21be44670b5f116c2b2667947e7036f2bfe6632539a8"
 $ created     : chr "2021-11-18T07:25:11.148268Z"
 $ updated     : chr "2021-11-18T07:25:11.148268Z"

The result contains the following fields:

  • url — the URL at which the file can be accessed
  • bin — the bin containing the file
  • filename — the file name
  • content_type — the inferred MIME type of the file
  • bytes — the file size
  • md5 — the MD5 checksum
  • sha256 — the SHA256 hash
  • created — the time at which the file was uploaded
  • updated — the time at which it was updated (or created if not updated).

The md5 field is not the raw digest: it is the hexadecimal MD5 checksum, Base64 encoded. You can reproduce it locally.

tools::md5sum("license-LGPL-3.md") %>% charToRaw() %>% base64enc::base64encode()
[1] "YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY="
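Going the other way is handy for checking that a local copy matches what Filebin reports. The following is a minimal sketch, not part of {filebin}: verify_md5() is a hypothetical helper, and it assumes the md5 field always holds the Base64-encoded hex digest, as it does in the output above.

```r
# Hypothetical helper (not part of {filebin}): compare a local file with the
# md5 field returned by post() or bin_get().
verify_md5 <- function(path, md5_field) {
  # tools::md5sum() returns the hex digest as a named character vector.
  local_hex <- unname(tools::md5sum(path))
  # Assumption: the md5 field is the hex digest, Base64 encoded.
  remote_hex <- rawToChar(base64enc::base64decode(md5_field))
  identical(local_hex, remote_hex)
}

# Self-contained demonstration on a temporary file.
tmp <- tempfile(fileext = ".md")
writeLines("Example license text", tmp)
encoded <- base64enc::base64encode(charToRaw(unname(tools::md5sum(tmp))))
verify_md5(tmp, encoded)
```

In practice you would call something like verify_md5("license-LGPL-3.md", lgpl$md5) after an upload or download.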

Bins

Files are organised into bins, which are analogous to folders or directories. By default the name of the bin is a random selection of text characters (see output above, where the bin name is d4i1rhv6ic6kl8fz). However, you can use the bin argument to specify a bin name.

gpl <- post("license-GPL-3.md", bin = "licenses")
str(gpl)
tibble [1 × 9] (S3: tbl_df/tbl/data.frame)
 $ url         : chr "https://filebin.net/licenses/license-GPL-3.md"
 $ bin         : chr "licenses"
 $ filename    : chr "license-GPL-3.md"
 $ content_type: chr "text/plain; charset=utf-8"
 $ bytes       : int 34904
 $ md5         : chr "MjlhOTAxMjk0MWE2YmNiMjZiZjBmYjQzODJjNWRkNzU="
 $ sha256      : chr "585e25ef8f5946a52bf2aed68d5becfc38be94a8663aa01c1b31d88aa57f1de3"
 $ updated     : chr "2021-11-18T07:28:21.876551Z"
 $ created     : chr "2021-11-18T07:28:21.876551Z"
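If you want to choose the bin up front (say, to share the link before the upload finishes) but would rather not hand-pick a name like licenses, you can generate one locally and pass it via the bin argument. This is only a sketch: random_bin_name() is a hypothetical helper, and the choice of 16 lowercase letters and digits simply mirrors the names shown above. Filebin's actual naming rules are not covered here.

```r
# Hypothetical helper (not part of {filebin}): build a random bin name.
# Assumption: 16 characters drawn from a-z and 0-9, like the default names above.
random_bin_name <- function(n = 16) {
  paste(sample(c(letters, 0:9), n, replace = TRUE), collapse = "")
}

bin_name <- random_bin_name()
bin_name

# Then upload into that bin (requires network access):
# post("license-GPL-3.md", bin = bin_name)
```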

Multiple Files

You can simultaneously upload multiple files.

post(c(
  "license-AGPL-3.md",
  "license-GPL-2.md",
  "license-GPL-3.md",
  "license-LGPL-2.1.md",
  "license-LGPL-3.md"
)) %>% select(url, created, updated)
# A tibble: 5 × 3
  url                                                      created                    updated                    
  <chr>                                                    <chr>                      <chr>                      
1 https://filebin.net/87ve2dy4mif2ci9v/license-AGPL-3.md   2021-11-18T07:49:22.51542Z 2021-11-18T07:49:22.51542Z
2 https://filebin.net/87ve2dy4mif2ci9v/license-GPL-2.md    2021-11-18T07:49:23.31774Z 2021-11-18T07:49:23.31774Z
3 https://filebin.net/87ve2dy4mif2ci9v/license-GPL-3.md    2021-11-18T07:49:23.54684Z 2021-11-18T07:49:23.54684Z 
4 https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md 2021-11-18T07:49:24.14585Z 2021-11-18T07:49:24.14585Z
5 https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-3.md   2021-11-18T07:49:26.82930Z 2021-11-18T07:49:26.82930Z

When you upload multiple files they all end up in the same bin. The files are uploaded one after another, so each gets its own created and updated times.

Updating a File

You can update an existing file. In order to update a file rather than simply create a new upload, you need to specify the bin of the existing upload.

post("license-LGPL-2.1.md", bin = "87ve2dy4mif2ci9v") %>% select(url, created, updated)
# A tibble: 1 × 3
  url                                                      created                    updated                    
  <chr>                                                    <chr>                      <chr>                      
1 https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md 2021-11-18T07:49:24.14585Z 2021-11-18T07:50:57.52473Z

The created time still matches the time of the original upload (see above), but the updated time has changed.

Retrieving a File

You can share either the url or the filename and bin. The file can then be downloaded via a browser, on the command line using curl or wget, or in R. Of course we are interested in the last option.

# Retrieve file using URL.
#
file_get("https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md")

# Retrieve file using filename and bin.
#
file_get(
  "license-LGPL-2.1.md",
  "87ve2dy4mif2ci9v",
  overwrite = TRUE
)

In the second call to file_get() we need to specify the overwrite option so that the second download overwrites the result of the first download.
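The two forms are interchangeable because the URLs in the outputs above follow the pattern https://filebin.net/<bin>/<filename>. A minimal sketch of building the URL yourself, for example to paste into an email: filebin_url() is a hypothetical helper, not part of {filebin}, and percent-encoding the path segments is an assumption about how awkward file names should be handled.

```r
# Hypothetical helper (not part of {filebin}): build the share URL for a file.
filebin_url <- function(bin, filename, base = "https://filebin.net") {
  # Assumption: percent-encode each path segment (e.g. spaces become %20).
  parts <- vapply(
    c(bin, filename),
    utils::URLencode,
    character(1),
    reserved = TRUE,
    USE.NAMES = FALSE
  )
  paste(c(base, parts), collapse = "/")
}

url <- filebin_url("87ve2dy4mif2ci9v", "license-LGPL-2.1.md")
url

# The result can be passed straight to file_get() (requires network access):
# file_get(url)
```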

Checking on a Bin

We can interrogate a bin using the bin_get() function.

licenses <- bin_get("87ve2dy4mif2ci9v")

The result is a list with two components, bin and files. The bin component includes the number of files and their total size. The readonly field indicates whether the bin has been locked against further updates.

str(licenses$bin)
tibble [1 × 7] (S3: tbl_df/tbl/data.frame)
 $ id      : chr "87ve2dy4mif2ci9v"
 $ readonly: logi FALSE
 $ bytes   : int 121022
 $ files   : int 5
 $ updated : chr "2021-11-18T07:50:57.529291Z"
 $ created : chr "2021-11-18T07:49:22.39557Z"
 $ expired : chr "2021-11-25T07:50:57.52929Z"
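The timestamps come back as character strings, so they need parsing before you can do arithmetic on them. Here is a rough sketch for working out how long a bin has left; parse_filebin_time() and days_left() are hypothetical helpers, and the format string assumes timestamps always look like the ones above (UTC, fractional seconds, trailing Z).

```r
# Hypothetical helpers (not part of {filebin}).
# Assumption: timestamps are UTC with fractional seconds and a trailing "Z".
parse_filebin_time <- function(x) {
  # %OS reads seconds including the fractional part.
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
}

# Days remaining until a bin expires, relative to `now`.
days_left <- function(expired, now = Sys.time()) {
  as.numeric(difftime(parse_filebin_time(expired), now, units = "days"))
}

# Values copied from the bin output above.
updated <- parse_filebin_time("2021-11-18T07:50:57.529291Z")
lifetime <- days_left("2021-11-25T07:50:57.52929Z", now = updated)
lifetime
```

With a live bin you would call days_left(licenses$bin$expired) and leave now at its default.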

The files component has the details of each of the files in the bin.

licenses$files %>% select(filename, content_type, bytes, md5)
# A tibble: 5 × 4
  filename            content_type              bytes md5                                         
  <chr>               <chr>                     <int> <chr>                                       
1 license-AGPL-3.md   text/plain; charset=utf-8 34303 ZmIwMTYyNWVmMDE5NzM0OTBiY2Y0ZWZiOWFkZTIzYWU=
2 license-GPL-2.md    text/plain; charset=utf-8 17941 M2Q4Mjc4MGU4OTE3YjM2MGNiZWU3YjllYzNlNDA3MzQ=
3 license-GPL-3.md    text/plain; charset=utf-8 34904 MjlhOTAxMjk0MWE2YmNiMjZiZjBmYjQzODJjNWRkNzU=
4 license-LGPL-2.1.md text/plain; charset=utf-8 26314 OGY1MTA3ZDk4NzU3NzExZWNjMWIwN2FjMzM4Nzc1NjQ=
5 license-LGPL-3.md   text/plain; charset=utf-8  7560 YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY=
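A quick sanity check is that the per-file sizes add up to the bytes total reported for the bin. The sketch below rebuilds the relevant columns by hand from the output above so that it runs without network access; with a live bin you would use licenses$files and licenses$bin directly.

```r
# File names and sizes copied from the output above.
files <- data.frame(
  filename = c(
    "license-AGPL-3.md",
    "license-GPL-2.md",
    "license-GPL-3.md",
    "license-LGPL-2.1.md",
    "license-LGPL-3.md"
  ),
  bytes = c(34303L, 17941L, 34904L, 26314L, 7560L)
)

# Total reported in the bin component above.
bin_bytes <- 121022L

# Compare the number of files and the summed sizes with the bin summary.
nrow(files) == 5L
sum(files$bytes) == bin_bytes
```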

Locking a Bin

It’s possible to lock a bin, making it read only. Once locked, a bin will not accept new file uploads nor updates of existing files.

bin_lock("87ve2dy4mif2ci9v")

If we check back on the readonly field for this bin we find that it’s now TRUE.

bin_get("87ve2dy4mif2ci9v")$bin$readonly
[1] TRUE

Bin QR Code

A QR code is a handy way to share content. You can generate a QR code pointing to a bin, saved as a PNG file, with bin_qr_code().

bin_qr_code("87ve2dy4mif2ci9v")
[1] "87ve2dy4mif2ci9v.png"

Try it out. Scanning the code gives the URL https://filebin.net/87ve2dy4mif2ci9v, which will no longer be valid once the bin expires on 25 November 2021.

Deleting

You can delete individual files with file_delete() and whole bins with bin_delete(). Note: It’s still possible to delete a locked bin.

Conclusion

We’re going to be integrating the {filebin} package into a number of existing workflows.


