New features in {overviewR}

[This article was first published on R-post on Cosima Meyer, and kindly contributed to R-bloggers.]

overviewR (v 0.0.10) is on CRAN and comes with new features 🚀

The package is meant to serve as a Swiss army knife for exploratory data analysis. Its basic functions let you investigate sample coverage across different time points, missing values across your variables, and the overlap between two data sets.

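Here are the changes in a nutshell: overview_tab accepts multiple time arguments, overview_plot gains color and dot-size options, overview_crossplot visualizes cross tables, overview_overlap compares two data sets, overview_tab and overview_na work with data.table objects, and the package website has a new theme.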


First, we install the newest version and load other packages that might be helpful.

# Install the newest CRAN version
install.packages("overviewR")
library(overviewR) # Easily Extracting Information About Your Data
library(dplyr)     # A Grammar of Data Manipulation
library(magrittr)  # A Forward-Pipe Operator for R

Multiple time arguments

overview_tab now accepts multiple time arguments. Here are some examples of how to use the function:

time can be a single time variable (in YYYY or YYYY-MM-DD format, stored either as an integer or in the POSIXt format):

overview_tab(dat = toydata, id = ccode, time = year)

# A tibble: 5 × 2
# Groups:   ccode [5]
  ccode time_frame
  <chr> <chr>
1 AGO   1990 - 1992
2 BEN   1995 - 1999
3 FRA   1993, 1996, 1999
4 GBR   1991, 1993, 1995, 1997, 1999
5 RWA   1990 - 1995
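
The time_frame column above shows consecutive years as a range and non-consecutive years as a list. As a rough illustration of that idea, here is a minimal base R sketch. The helper collapse_years is hypothetical and not part of overviewR, and it is not meant to reflect how the package does this internally; it also assumes a non-empty numeric vector.

```r
# Hypothetical helper (not part of overviewR): collapse years into
# "start - end" for consecutive runs and single values otherwise.
# Assumes `years` is a non-empty numeric vector.
collapse_years <- function(years) {
  years <- sort(unique(years))
  # Start a new run whenever the gap to the previous year is larger than 1
  run_id <- cumsum(c(1, diff(years) > 1))
  runs <- tapply(years, run_id, function(run) {
    if (length(run) > 1) paste(min(run), "-", max(run)) else as.character(run)
  })
  paste(runs, collapse = ", ")
}

collapse_years(c(1990, 1991, 1992))
collapse_years(c(1993, 1996, 1999))
```

The first call mirrors the AGO row and the second mirrors the FRA row.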

It can also be a list containing multiple time variables (time = list(year = NULL, month = NULL, day = NULL)).

overview_tab(dat = toydata,
             id = ccode,
             time = list(year = toydata$year,
                         month = toydata$month,
                         day = toydata$day),
             complex_date = TRUE)

# A tibble: 5 × 2
# Groups:   ccode [5]
  ccode time_frame
  <chr> <chr>
1 AGO   1990-01-01, 1990-02-02, …
2 BEN   1995-01-01, 1995-02-02, …
3 FRA   1993-01-01, 1993-02-02, …
4 GBR   1991-01-01, 1991-02-02, …
5 RWA   1990-01-01 - 1990-01-12, …
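
To see what combining separate year, month, and day variables into one date means in practice, here is a small base R sketch. The values are made up for illustration and are not rows from toydata, and the sketch says nothing about how overview_tab builds its dates internally.

```r
# Illustrative values only; these are not rows from toydata
parts <- data.frame(
  year  = c(1990, 1990, 1995),
  month = c(1, 2, 1),
  day   = c(1, 2, 1)
)

# ISOdate() builds a POSIXct date-time from numeric components;
# as.Date() then drops the time part
parts$date <- as.Date(ISOdate(parts$year, parts$month, parts$day))
format(parts$date)
```

The result is a Date column in YYYY-MM-DD format, the same format shown in the time_frame column above.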

Colors in overview_plot

You can use colors in overview_plot to identify time periods. Here, we introduce a dummy variable that indicates whether the year was before 1995 or not. We use this dummy to color the time lines using the color argument.

# Code whether a year was before 1995
toydata %<>%
  dplyr::mutate(before = ifelse(year < 1995, 1, 0))

# Plot using the `color` argument
overview_plot(dat = toydata, id = ccode, time = year, color = before)


Change dot size in overview_plot

You can also change the dot size in overview_plot.

# Plot using the `dot_size` argument
overview_plot(dat = toydata, id = ccode, time = year, dot_size = 5)


Visualize cross plots with overview_crossplot

overview_crosstab now has a visual counterpart: overview_crossplot!

overview_crossplot(
  toydata,
  id = ccode,
  time = year,
  cond1 = gdp,
  cond2 = population,
  threshold1 = 25000,
  threshold2 = 27000,
  color = TRUE,
  label = TRUE
)
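
The two thresholds split the observations along two conditions. The base R sketch below shows that kind of two-way split on made-up values. It assumes a condition counts as met when the value is at or above the threshold, which this post does not confirm, so treat it as an illustration of the idea rather than of the package's exact rule.

```r
# Illustrative values only; these are not rows from toydata
obs <- data.frame(
  ccode      = c("AAA", "BBB", "CCC", "DDD"),
  gdp        = c(30000, 20000, 26000, 24000),
  population = c(28000, 29000, 21000, 22000)
)

# Assumption: a condition is met when the value is at or above the threshold
obs$cond1_met <- obs$gdp >= 25000
obs$cond2_met <- obs$population >= 27000

# 2 x 2 cross table of the two conditions
cross <- table(cond1 = obs$cond1_met, cond2 = obs$cond2_met)
cross
```

Each cell of the cross table corresponds to one combination of the two conditions being met or not.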


Compare two datasets directly

Using overview_overlap, you can now compare the overlap in time and id variables across two data sets visually.

# Subset one data set for comparison
toydata2 <- toydata %>% dplyr::filter(year > 1992)

overview_overlap(
  dat1 = toydata,
  dat2 = toydata2,
  dat1_id = ccode,
  dat2_id = ccode,
  plot_type = "bar" # This is the default
)
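
If you want to check the numbers behind such a comparison by hand, a base R sketch along the following lines can help. It uses made-up values rather than toydata, and it is not the computation overview_overlap performs; it only counts observations per id in each data set and finds the id-year pairs that both share.

```r
# Illustrative values only; these are not rows from toydata
dat1 <- data.frame(
  ccode = c("AGO", "AGO", "BEN", "BEN"),
  year  = c(1990, 1993, 1992, 1995)
)

# Keep only later years, mirroring the filter step above
dat2 <- dat1[dat1$year > 1992, ]

# Id-year pairs present in both data sets
shared <- merge(dat1, dat2, by = c("ccode", "year"))

# Observations per id in each data set
ids <- sort(unique(dat1$ccode))
counts <- data.frame(
  ccode = ids,
  dat1  = as.vector(table(factor(dat1$ccode, levels = ids))),
  dat2  = as.vector(table(factor(dat2$ccode, levels = ids)))
)
counts
```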


Use data.table under the hood

And, last but not least, overview_tab and overview_na now also work with data.table objects 🥳 (Thanks to my old team @ Kienbaum for being patient enough to explain the (not so intuitive) syntax and let me learn it 👩🏼‍💻)
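
A minimal sketch of what this looks like in practice, assuming the data.table package is installed and that toydata can be loaded with data(toydata). The calls mirror the data.frame examples above; the object name toydata_dt is only for illustration.

```r
library(overviewR)
library(data.table)

data(toydata)

# Convert the example data to a data.table
toydata_dt <- data.table::as.data.table(toydata)

# Same call as for a data.frame
overview_tab(dat = toydata_dt, id = ccode, time = year)

# Missing values per variable
overview_na(toydata_dt)
```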

Here’s a more detailed overview of what each function can do:

Function             Works with data.frame objects   Works with data.table   Multiple time arguments
overview_tab         ✓                               ✓                       ✓
overview_na          ✓                               ✓
overview_plot        ✓
overview_crossplot   ✓
overview_crosstab    ✓
overview_heat        ✓
overview_overlap     ✓
overview_print       ✓

New website

And, as a bonus, we also updated our package website using the {preferably} theme ✨
