validate version 0.1.5 is out
A new version of the validate package for data validation was just accepted on CRAN and will be available on all mirrors in a few days.
The most important addition is that you can now reference the data set as a whole, using the “dot” syntax like so:
    iris %>% check_that( nrow(.)>100 , "Sepal.Width" %in% names(.)) %>% summary()
      rule items passes fails nNA error warning                  expression
    1   V1     1      1     0   0 FALSE   FALSE               nrow(.) > 100
    2   V2     1      1     0   0 FALSE   FALSE "Sepal.Width" %in% names(.)
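The same rules can also be stored in a validator object and applied with confront, which is handy when a rule set is reused across data sets. The following is a minimal sketch, assuming the dot is resolved in the same way for rules defined through validator as for rules passed to check_that; the variable names v and cf are only illustrative.

```r
library(validate)

# Rules that refer to the whole data set through "."
v <- validator(
  nrow(.) > 100,
  "Sepal.Width" %in% names(.)
)

# Confront the built-in iris data with the rule set and summarise
cf <- confront(iris, v)
summary(cf)
```

Each of these rules produces a single value for the whole data set rather than one value per record.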
It is also now possible to get a definite logical value (TRUE or FALSE) when a rule evaluates to NA, by passing the na.value option.
    dat = data.frame(x=c(1,NA,-1))
    v = validator(x > 0)
    values(confront(dat,v))
            V1
    [1,]  TRUE
    [2,]    NA
    [3,] FALSE
    values(confront(dat,v,na.value=FALSE))
            V1
    [1,]  TRUE
    [2,] FALSE
    [3,] FALSE
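Conceptually, na.value just decides what an undecided (NA) outcome is mapped to. The base R sketch below mimics that behaviour with a hypothetical helper, fill_na, which is not part of validate and says nothing about how the package implements the option internally.

```r
# Hypothetical helper, not part of validate:
# replace NA outcomes of a rule by a fixed value.
fill_na <- function(result, na.value = NA) {
  result[is.na(result)] <- na.value
  result
}

x   <- c(1, NA, -1)
raw <- x > 0                         # the rule evaluated without any NA handling

strict  <- fill_na(raw, FALSE)       # treat a missing outcome as a failure
lenient <- fill_na(raw, TRUE)        # treat a missing outcome as a pass
```

Choosing FALSE is the conservative option: a record with a missing value then counts as a failure rather than as undecided.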
A complete list of changes and bugfixes can be found in the NEWS file. Below I also include the changes in 0.1.4, since I did not write about that release before.
I will be talking about this package at the upcoming useR!2016 event, so join me if you’re interested!
version 0.1.5
– The ‘.’ is now used to reference the validated data set as a whole.
– Small change in output of ‘compare’ to match the table in van den Broek et al. (2013)
version 0.1.4
– ‘confront’ now emits a warning when a variable name conflicts with the name of a reference data set
– Deprecated ‘validate_reset’, in favour of the shorter ‘reset’ (use ‘validate::reset’ in case of ambiguity)
– Deprecated ‘validate_options’ in favour of the shorter ‘voptions’
– New option na.value with default value NA, controlling the output when a rule evaluates to NA.
– Added rules from the ESSnet on validation (deliverable 17) to automated tests.
– added ‘grepl’ to allowed validation syntax (suggested by Dusan Sovic)
– exported a few functions w/ keywords internal for extensibility
– Bugfix: blocks sometimes reported the wrong number of blocks (in case of a single connected block).
– Bugfix: macro expansion failed when macros were reused in other macros.
– Bugfix: certain nonlinear relations were recognized as linear
– Bugfix: rules that use (anonymous) function definitions raised error when printed.
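A few of the 0.1.4 items can be tried together. The sketch below assumes that na.value can be set globally through voptions (the changelog lists it as an option, but the global route is my assumption), that grepl is accepted as a top-level rule, and that calling validate::reset without arguments restores the defaults. The data frame and its code column are made up for illustration.

```r
library(validate)

# Made-up data: one numeric column with a missing value, one character column
dat <- data.frame(
  x    = c(1, NA, -1),
  code = c("A1", "B2", "A3"),
  stringsAsFactors = FALSE
)

# Assumption: set na.value globally instead of per confront() call
voptions(na.value = FALSE)

# One numeric rule and one pattern rule using grepl
v   <- validator(x > 0, grepl("^A", code))
out <- values(confront(dat, v))

# Restore the default options afterwards
validate::reset()
```

Setting the option globally affects every later confrontation in the session, so resetting it afterwards avoids surprises.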