Compute/Visualize Drive Space Consumption of Your Installed R Packages

[This article was first published on R – rud.is, and kindly contributed to R-bloggers.]

The fs package makes it super quick and easy to find out just how much “package hoarding” you’ve been doing:

library(fs)
library(ggalt) # devtools::install_github("hrbrmstr/ggalt")
library(igraph) 
library(ggraph) # devtools::install_github("thomasp85/ggraph")
library(hrbrthemes) # devtools::install_github("hrbrmstr/hrbrthemes")
library(tidyverse)

installed.packages() %>%
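  as_tibble() %>% # as_data_frame() is deprecated in current tibble releases
  mutate(pkg_dir = sprintf("%s/%s", LibPath, Package)) %>%
  select(pkg_dir) %>%
  mutate(pkg_dir_size = map_dbl(pkg_dir, ~{
    # newer fs releases name this argument `recurse` (formerly `recursive`)
    fs::dir_info(.x, all=TRUE, recurse=TRUE) %>%
      summarise(tot_dir_size = as.numeric(sum(size))) %>% # drop the fs_bytes class
      pull(tot_dir_size)
  })) %>% 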
  summarise(
    total_size_of_all_installed_packages=ggalt::Gb(sum(pkg_dir_size))
  ) %>% 
  unlist()
## total_size_of_all_installed_packages 
##                             "1.6 Gb"
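To see which packages account for that total, a per-package table helps. The following is a minimal sketch that swaps in base R (list.files() and file.size()) for fs, so it runs without any extra packages. The helper name dir_size_bytes and the size_mb column are illustrative and not from the original post. Because list.files() returns only files, the totals can differ slightly from what fs::dir_info() reports, since that function also lists directory entries.

```r
# Hypothetical helper (not from the original post): total size in bytes of
# every file under `path`, including hidden files, using base R only.
dir_size_bytes <- function(path) {
  files <- list.files(path, all.files = TRUE, recursive = TRUE, full.names = TRUE)
  sum(file.size(files), na.rm = TRUE)
}

# One row per installed package, with the directory it lives in.
ip <- installed.packages()
pkgs <- data.frame(
  Package = unname(ip[, "Package"]),
  LibPath = unname(ip[, "LibPath"]),
  stringsAsFactors = FALSE
)
pkgs$pkg_dir <- file.path(pkgs$LibPath, pkgs$Package)
pkgs$dir_size <- vapply(pkgs$pkg_dir, dir_size_bytes, numeric(1), USE.NAMES = FALSE)

# Largest packages first, with a size column in mebibytes for readability.
pkg_sizes <- pkgs[order(-pkgs$dir_size), c("Package", "dir_size")]
pkg_sizes$size_mb <- round(pkg_sizes$dir_size / 1024^2, 1)

head(pkg_sizes[, c("Package", "size_mb")], 10)
```

Sorting in descending order puts the biggest packages at the top, which is usually where any cleanup should start.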

You can adapt the code above to browse the per-package sizes as a table or work with them programmatically. With a bit more work, you can also get a visual overview of package size:

installed.packages() %>%
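  as_tibble() %>% # as_data_frame() is deprecated in current tibble releases
  mutate(pkg_dir = sprintf("%s/%s", LibPath, Package)) %>%
  # newer fs releases name this argument `recurse` (formerly `recursive`)
  mutate(dir_info = map(pkg_dir, fs::dir_info, all=TRUE, recurse=TRUE)) %>% 
  mutate(dir_size = map_dbl(dir_info, ~as.numeric(sum(.x$size)))) -> xdf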

select(xdf, Package, dir_size) %>% 
  mutate(grp = "ROOT") %>% 
  add_row(grp = "ROOT", Package="ROOT", dir_size=0) %>% 
  select(grp, Package, dir_size) %>% 
  arrange(desc(dir_size)) -> gdf

select(gdf, -grp) %>% 
  mutate(lab = sprintf("%s\n(%s)", Package, ggalt::Mb(dir_size))) %>% 
  mutate(lab = ifelse(dir_size > 1500000, lab, "")) -> vdf

g <- graph_from_data_frame(gdf, vertices=vdf)

ggraph(g, "treemap", weight=dir_size) +
  geom_node_tile(fill="lightslategray", size=0.25) +
  geom_text(
    aes(x, y, label=lab, size=dir_size), 
    color="#cccccc", family=font_ps, lineheight=0.875
  ) +
  scale_x_reverse(expand=c(0,0)) +
  scale_y_continuous(expand=c(0,0)) +
  scale_size_continuous(trans="sqrt", range = c(0.5, 8)) +
  ggraph::theme_graph(base_family = font_ps) +
  theme(legend.position="none")

treemap of package disk consumption

Challenge

Do some wrangling with the above data and turn it into a package “disk explorer” with @timelyportfolio’s d3treeR package.
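As a possible starting point for that wrangling, a disk explorer needs more than one level of hierarchy, for example package, then top-level entry inside the package. The sketch below builds that two-level breakdown for a single package using base R only. The helper name subdir_sizes and its column names are hypothetical, and the step that hands the result to d3treeR is deliberately left out.

```r
# Hypothetical helper (not from the original post): break a package
# directory down by its top-level entries (e.g. R, help, libs, DESCRIPTION).
subdir_sizes <- function(pkg_dir) {
  # Paths relative to pkg_dir, so the first path component is the top-level entry.
  files <- list.files(pkg_dir, all.files = TRUE, recursive = TRUE)
  if (length(files) == 0) {
    return(data.frame(package = character(0), subdir = character(0),
                      bytes = numeric(0), stringsAsFactors = FALSE))
  }
  top <- sub("/.*$", "", files)
  sizes <- file.size(file.path(pkg_dir, files))
  agg <- tapply(sizes, top, sum, na.rm = TRUE)
  data.frame(
    package = basename(pkg_dir),
    subdir = names(agg),
    bytes = as.numeric(agg),
    stringsAsFactors = FALSE
  )
}

# Example: the breakdown for the stats package that ships with R.
stats_breakdown <- subdir_sizes(system.file(package = "stats"))
stats_breakdown[order(-stats_breakdown$bytes), ]
```

Applying the helper to every package directory and row-binding the results would give a package/subdir/bytes table, which is the general shape that hierarchical treemap tools tend to expect.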
