R 3.2.0 (codename “Full of Ingredients”) was released yesterday. You can get the latest binaries from here (or the .tar.gz source code from here). The full list of new features and bug fixes is provided below.
Upgrading to R 3.2.0 on Windows
If you are using Windows you can easily upgrade to the latest version of R using the installr package. Simply run the following code:
    # installing/loading the latest installr package:
    install.packages("installr"); library(installr) # install + load installr
    updateR() # updating R
Running “updateR()” will detect if there is a new R version available, and if so it will download+install it (etc.).
I try to keep the installr package updated and useful, so if you have any suggestions or remarks on the package – you are invited to leave a comment below.
CHANGES IN R 3.2.0:
As always, David Smith mentioned some of the main changes in his post, noting that many of the changes in this release happened behind the scenes to improve R’s engine for performance and reliability. These include:
- A number of fixes proposed by Radford Neal, bringing some of the performance improvements of pqR to R while maintaining backwards compatibility.
- more progress in handling big in-memory data objects (for example, you can now cbind/rbind matrices with more than 2 billion elements).
- some significant updates to R’s byte compiler with new instructions that allow many scalar subsetting and assignment and scalar arithmetic operations to be handled more efficiently. This can result in significant performance improvements in scalar numerical code.
- the package-checking system now does a more thorough job of making sure contributed packages comply with CRAN policies.
Here is the full list of new features, bug fixes, etc.:
NEW FEATURES
- anyNA() gains a recursive argument.
- When x is missing and names is not false (including the default value), Sys.getenv(x, names) returns an object of class "Dlist" and hence prints tidily.
- (Windows.) shell() no longer consults the environment variable SHELL: too many systems have been encountered where it was set incorrectly (usually to a path where software was compiled, not where it was installed). R_SHELL, the preferred way to select a non-default shell, can be used instead.
- Some unusual arguments to embedFonts() can now be specified as character vectors, and the defaults have been changed accordingly.
- Functions in the Summary group duplicate less. (PR#15798)
- (Unix-alikes.) system(cmd, input = ) now uses ‘shell-execution-environment’ redirection, which will be more natural if cmd is not a single command (but requires a POSIX-compliant shell). (Wish of PR#15508)
- read.fwf() and read.DIF() gain a fileEncoding argument, for convenience.
- Graphics devices can add attributes to their description in .Device and .Devices. Several of those included with R use a "filepath" attribute.
- pmatch() uses hashing in more cases and so is faster at the expense of using more memory. (PR#15697)
- pairs() gains new arguments to select sets of variables to be plotted against each other.
- file.info(, extra_cols = FALSE) allows a minimal set of columns to be computed on Unix-alikes: on some systems without properly-configured caching this can be significantly faster with large file lists.
- New function dir.exists() in package base to test efficiently whether one or more paths exist and are directories.
- dput() and friends gain new controls hexNumeric and digits17 which output double and complex quantities as, respectively, binary fractions (exactly, see sprintf("%a")) and as decimals with up to 17 significant digits.
- save(), saveRDS() and serialize() now support ascii = NA which writes ASCII files using sprintf("%a") for double/complex quantities. This is read-compatible with ascii = TRUE but avoids binary->decimal->binary conversions with potential loss of precision. Unfortunately the Windows C runtime’s lack of C99 compliance means that the format cannot be read correctly there in R before 3.1.2.
- The default for formatC(decimal.mark =) has been changed to be getOption("OutDec"); this makes it more consistent with format() and suitable for use in print methods, e.g. those for classes "density", "ecdf", "stepfun" and "summary.lm".
  getOption("OutDec") is now consulted by the print method for class "kmeans", by cut(), dendrogram(), plot.ts() and quantile() when constructing labels and for the report from legend(trace = TRUE).
  (In part, wish of PR#15819.)
- prettyNum() and hence format() and formatC() give a warning if big.mark and decimal.mark are set to the same value (period and comma are not uncommonly used for each, and this is a check that conventions have not got mixed).
- merge() can create a result which uses long vectors on 64-bit platforms.
- dget() gains a new argument keep.source which defaults to FALSE for speed (dput() and dget() are most often used for data objects where this can make dget() many times faster).
- Packages may now use a file of common macro definitions in their help files, and may import definitions from other packages.
- A number of macros have been added in the new ‘share/Rd’ directory for use in package overview help pages, and promptPackage() now makes use of them.
- tools::parse_Rd() gains a new permissive argument which converts unrecognized macros into text. This is used by utils:::format.bibentry to allow LaTeX markup to be ignored.
- options(OutDec =) can now specify a multi-byte character, e.g., options(OutDec = "\u00b7") in a UTF-8 locale.
- is.recursive(x) is no longer true when x is an external pointer, a weak reference or byte code; the first enables all.equal(x, x) when x <- getClass(.).
- ls() (aka objects()) and as.list.environment() gain a new argument sorted.
- The "source" attribute (which has not been added to functions by R since before R version 2.14.0) is no longer treated as special.
- Function returnValue() has been added to give on.exit() code access to a function’s return value for debugging purposes.
- crossprod(x, y) allows more matrix coercions when x or y are vectors, now equalling t(x) %*% y in these cases (also reported by Radford Neal). Similarly, tcrossprod(x,y) and %*% work in more cases with vector arguments.
- Utility function dynGet() useful for detecting cycles, aka infinite recursions.
- The byte-code compiler and interpreter include new instructions that allow many scalar subsetting and assignment and scalar arithmetic operations to be handled more efficiently. This can result in significant performance improvements in scalar numerical code.
- apply(m, 2, identity) is now the same as the matrix m when it has named row names.
- A new function debuggingState() has been added, allowing to temporarily turn off debugging.
- example() gets a new optional argument run.donttest and tools::Rd2ex() a corresponding commentDonttest, with a default such that example(..) in help examples will run \donttest code only if used interactively (a change in behaviour).
- rbind.data.frame() gains an optional argument make.row.names, for potential speedup.
- New function extSoftVersion() to report on the versions of third-party software in use in this session. Currently reports versions of zlib, bzlib, the liblzma from xz, PCRE, ICU, TRE and the iconv implementation.
  A similar function grSoftVersion() in package grDevices reports on third-party graphics software.
  Function tcltk::tclVersion() reports the Tcl/Tk version.
- Calling callGeneric() without arguments now works with primitive generics to some extent.
- vapply(x, FUN, FUN.VALUE) is more efficient notably for large length(FUN.VALUE); as extension of PR#16061.
- as.table() now allows tables with one or more dimensions of length 0 (such as as.table(integer())).
- names(x) <- NULL now clears the names of call and ... objects.
- library() will report a warning when an insufficient dependency version is masking a sufficient one later on the library search path.
- A new plot() method for class "raster" has been added.
- New check_packages_in_dir_changes() function in package tools for conveniently analyzing how changing sources impacts the check results of their reverse dependencies.
- Speed-up from Peter Haverty for ls() and methods:::.requirePackage() speeding up package loading. (PR#16133)
- New get0() function, combining exists() and get() in one call, for efficiency.
- match.call() gains an envir argument for specifying the environment from which to retrieve the ... in the call, if any; this environment was wrong (or at least undesirable) when the definition argument was a function.
- topenv() has been made .Internal() for speedup, based on Peter Haverty’s proposal in PR#16140.
- getOption() no longer calls options() in the main case.
- Optional use of libcurl (version 7.28.0 from Oct 2012 or later) for Internet access:
  - capabilities("libcurl") reports if this is available.
  - libcurlVersion() reports the version in use, and other details of the "libcurl" build including which URL schemes it supports.
  - curlGetHeaders() retrieves the headers for http://, https://, ftp:// and ftps:// URLs: analysis of these headers can provide insights into the ‘existence’ of a URL (it might for example be permanently redirected) and is so used in R CMD check --as-cran.
  - download.file() has a new optional method "libcurl" which will handle more URL schemes, follow redirections, and allows simultaneous downloads of multiple URLs.
  - url() has a new method "libcurl" which handles more URL schemes and follows redirections. The default method is controlled by a new option url.method, which applies also to the opening of URLs via file() (which happens implicitly in functions such as read.table).
  - When file() or url() is invoked with a https:// or ftps:// URL which the current method cannot handle, it switches to a suitable method if one is available.
- (Windows.) The DLLs ‘internet.dll’ and ‘internet2.dll’ have been merged. In this version it is safe to switch (repeatedly) between the internal and Windows internet functions within an R session.
  The Windows internet functions are still selected by flag --internet2 or setInternet2(). This can be overridden for an url() connection via its new method argument.
  download.file() has new method "wininet", selected as the default by --internet2 or setInternet2().
- parent.env<- can no longer modify the parent of a locked namespace or namespace imports environment. Contributed by Karl Millar.
- New function isLoadedNamespace() for readability and speed.
- names(env) now returns all the object names of an environment env, equivalently to ls(env, all.names = TRUE, sorted = FALSE) and also to the names of the corresponding list, names(as.list(env, all.names = TRUE)). Note that although names() returns a character vector, the names have no particular ordering.
- The memory manager now grows the heap more aggressively. This reduces the number of garbage collections, in particular while data or code are loaded, at the expense of slightly increasing the memory footprint.
- New function trimws() for removing leading/trailing whitespace.
- cbind() and rbind() now consider S4 inheritance during S3 dispatch and also obey deparse.level.
- cbind() and rbind() will delegate recursively to methods::cbind2 (methods::rbind2) when at least one argument is an S4 object and S3 dispatch fails (due to ambiguity).
- (Windows.) download.file(quiet = FALSE) now uses text rather than Windows progress bars in non-interactive use.
- New function hsearch_db() in package utils for building and retrieving the help search database used by help.search(), along with functions for inspecting the concepts and keywords in the help search database.
- New function .getNamespaceInfo(), a no-check version of getNamespaceInfo() mostly for internal speedups.
- The help search system now takes keyword entries in Rd files which are not standard keywords (as given in ‘KEYWORDS’ in the R documentation directory) as concepts. For standard keyword entries the corresponding descriptions are additionally taken as concepts.
- New lengths() function for getting the lengths of all elements in a list.
- New function toTitleCase() in package tools, tailored to package titles.
- The matrix methods of cbind() and rbind() allow matrices as inputs which have 2^31 or more elements. (For cbind(), wish of PR#16198.)
- The default method of image() has an explicit check for a numeric or logical matrix (which was always required).
- URLencode() will not by default encode further URLs which appear to be already encoded.
- BIC(mod) and BIC(mod, mod2) now give non-NA numbers for arima() fitted models, as nobs(mod) now gives the number of “used” observations for such models. This fixes PR#16198, quite differently than proposed there.
- The print() methods for "htest", "pairwise.htest" and "power.htest" objects now have a digits argument defaulting to (a function of) getOption("digits"), and influencing all printed numbers coherently. Unavoidably, this changes the display of such test results in some cases.
- Code completion for namespaces now recognizes all loaded namespaces, rather than only the ones that are also attached.
- The code completion mechanism can now be replaced by a user-specified completer function, for (temporary) situations where the usual code completion is inappropriate.
- unzip() will now warn if it is able to detect truncation when unpacking a file of 4GB or more (related to PR#16243).
- methods() reports S4 in addition to S3 methods; output is simplified when the class argument is used. .S3methods() and methods::.S4methods() report S3 and S4 methods separately.
- Higher order functions such as the apply functions and Reduce() now force arguments to the functions they apply in order to eliminate undesirable interactions between lazy evaluation and variable capture in closures. This resolves PR#16093.
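Several of the new helper functions listed above are easiest to understand by trying them. The following is a minimal sketch, assuming R 3.2.0 or later; the object name hypothetical_missing_object and the sub-directory name are made up for illustration and are assumed not to exist.

```r
# All functions used below are listed as new (or extended) in R 3.2.0.

# trimws(): strip leading/trailing whitespace
trimmed <- trimws("  R 3.2.0  ")
print(trimmed)

# lengths(): the length of each element of a list
sizes <- lengths(list(a = 1:3, b = letters, c = NULL))
print(sizes)

# dir.exists(): vectorised test that paths exist and are directories
scratch <- tempdir()
print(dir.exists(c(scratch, file.path(scratch, "no-such-subdir"))))

# get0(): exists() and get() in one call, with a fallback value
value <- get0("hypothetical_missing_object", ifnotfound = "fallback")
print(value)

# anyNA(recursive = TRUE): also look inside nested list elements
nested <- list(1, list(2, NA))
print(anyNA(nested))                    # top level only
print(anyNA(nested, recursive = TRUE))  # descends into the inner list
```

get0() is mostly a convenience for code that would otherwise call exists() followed by get() on the same name.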
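The last item (higher order functions now forcing their arguments) is worth a small illustration. The sketch below assumes R 3.2.0 or later; the names adders and make_adder are hypothetical. Before this change, closures created inside lapply() could all end up seeing the last value of the loop variable, because the argument was only evaluated lazily when a closure was first called.

```r
# Build a list of closures with lapply(); each closure captures its own `i`.
adders <- lapply(1:3, function(i) function(x) x + i)

# From R 3.2.0 on, lapply() forces `i` for each call, so every adder keeps
# the value it was created with.
results <- vapply(adders, function(f) f(10), numeric(1))
print(results)

# In a hand-written function factory, force() is still the explicit way to
# evaluate an argument at creation time.
make_adder <- function(i) {
  force(i)
  function(x) x + i
}
print(make_adder(5)(10))
```

Code that has to run on older versions of R as well can keep using force() inside the anonymous function; it is harmless in 3.2.0.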
INSTALLATION and INCLUDED SOFTWARE
- The \donttest sections of R’s help files can be tested by make check TEST_DONTTEST=TRUE.
- It is possible to request the use of system valgrind headers via configure option --with-system-valgrind-headers: note the possible future incompatibility of such headers discussed in the ‘R Installation and Administration’ manual. (Wish of PR#16068.)
- The included version of liblzma has been updated to xz-utils 5.0.7 (minor bug fixes from 5.0.5).
- configure options --with-system-zlib, --with-system-bzlib and --with-system-pcre are now the default. For the time being there is fallback to the versions included in the R sources if no system versions are found or (unlikely) if they are too old.
  Linux users should check that the -devel or -dev versions of packages zlib, bzip2/libbz2 and pcre as well as xz-devel/liblzma-dev (or similar names) are installed.
- configure by default looks for the texi2any script from texinfo 5.1 or later, rather than the makeinfo program. (makeinfo is a link to the Perl script texi2any in texinfo 5.x.)
- R CMD INSTALL gains an option --built-timestamp=STAMP allowing 100% reproducible package building, thanks to Dirk Eddelbuettel.
UTILITIES
- There is support for testing the \dontrun and \donttest parts of examples in packages.
  tools::testInstalledPackage() accepts new arguments commentDontrun = FALSE and commentDonttest = FALSE.
  R CMD check gains options --run-dontrun and --run-donttest.
- The HTML generated by tools::Rd2HTML() and tools::toHTML() methods is now ‘XHTML 1.0 Strict’.
- The compiler package’s utility function setCompilerOptions() now returns the old values invisibly. The initial optimization level can also be set with the environment variable R_COMPILER_OPTIMIZE.
- R CMD build adds a NeedsCompilation field if one is not already present in the ‘DESCRIPTION’ file.
- R CMD check gains option --test-dir to specify an alternative set of tests to run.
- R CMD check will now by default continue with testing after many types of errors, and will output a summary count of errors at the end if any have occurred.
- R CMD check now checks that the Title and Description fields are correctly terminated.
- R CMD check --as-cran now:
  - checks a ‘README.md’ file can be processed: this needs pandoc installed.
  - checks the existence and accessibility of URLs in the ‘DESCRIPTION’, ‘CITATION’, ‘NEWS.Rd’ and ‘README.md’ files and in the help files (provided the build has libcurl support).
  - reports non-ASCII characters in R source files when there is no package encoding declared in the ‘DESCRIPTION’ file.
  - reports (apparent) S3 methods exported but not registered.
  - reports overwriting registered S3 methods from base/recommended packages. (Such methods are replaced in the affected package for the rest of the session, even if the replacing namespace is unloaded.)
  - reports if the Title field does not appear to be in title case (see ‘Writing R Extensions’: there may be false positives, but note that technical words should be single-quoted and will then be accepted).
  Most of these checks can also be selected by environment variables: see the ‘R Internals’ manual.
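Package authors who want a rough idea of what the new title-case check might flag can try tools::toTitleCase(), which is new in this release. This is only a sketch: the title below is a made-up example, and the comparison is an assumption about how one might pre-check a title, not a description of what R CMD check does internally.

```r
# tools::toTitleCase() is new in R 3.2.0; draft_title is a hypothetical title.
draft_title <- "tools for the analysis of tabular data"
suggested <- tools::toTitleCase(draft_title)
print(suggested)

# Rough pre-check: TRUE means the draft already matches toTitleCase()'s output.
print(identical(draft_title, suggested))
```

As the NEWS entry notes, false positives are possible, so treat the suggestion as a hint and single-quote technical words where appropriate.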
C-LEVEL FACILITIES
- New C API utility logspace_sum(logx[], n).
- Entry points rbinom_mu, rnbinom_mu and rmultinom are remapped (by default) to Rf_rbinom_mu etc. This requires packages using them to be re-installed.
- .C(DUP = FALSE) and .Fortran(DUP = FALSE) are now ignored, so arguments are duplicated if DUP = TRUE would do so. As their help has long said, .Call() is much preferred.
- New entry point R_allocLD, like R_alloc but guaranteed to have sufficient alignment for long double pointers.
- isPairList() now returns TRUE for DOTSXP.
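The NEWS entry does not spell out what logspace_sum() computes. Assuming it returns the log of a sum given the logs of its terms, i.e. log(sum(exp(logx))), the idea can be sketched in R as follows. The function logspace_sum_sketch is a hypothetical helper written for this post, not the C implementation.

```r
# R sketch of a numerically stable log-sum-exp (an illustration only).
logspace_sum_sketch <- function(logx) {
  if (length(logx) == 0L) return(-Inf)  # log of an empty sum, i.e. log(0)
  m <- max(logx)
  if (!is.finite(m)) return(m)          # all terms -Inf, or an Inf is present
  # Subtracting the maximum keeps exp() away from underflow/overflow.
  m + log(sum(exp(logx - m)))
}

logx <- c(-1000, -1001, -1002)
print(log(sum(exp(logx))))        # naive version: exp() underflows to 0
print(logspace_sum_sketch(logx))  # stays finite
```

Working on the log scale like this is the usual reason for wanting such a utility when summing very small probabilities.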
WINDOWS BUILD CHANGES
A number of changes to the Windows build system are in development. The following are currently in place.
- Installation using external binary distributions of zlib, bzip2, liblzma, pcre, libpng, jpeglib and libtiff is now required, and the build instructions have been revised.
- A new make target rsync-extsoft has been added to obtain copies of the external libraries from CRAN.
- Building the manuals now requires texi2any from texinfo 5.1 or later. CRAN binary builds include the manuals, but by default builds from source will not, and they will be accessed from CRAN. See the comments in ‘src/gnuwin32/MkRules.dist’ for how to specify the location of texi2any.
- (Windows) Changes have been made to support an experimental Windows toolchain based on GCC 4.9.2. The default toolchain continues to be based on GCC 4.6.3, as the new toolchain is not yet stable enough. A change to a new toolchain is expected during the R 3.2.x lifetime.
PACKAGE INSTALLATION
- (Windows) The use of macro ZLIB_LIBS in file ‘src/Makevars.win’ (which has not been documented for a long time) now requires an external ‘libz.a’ to be available (it is part of the ‘goodies’ used to compile Windows binary packages). It would be simpler to use -lz instead.
- The default for option pkgType on platforms using binary packages is now "both", so source packages will be tried if binary versions are not available or not up to date.
  There are options for what install.packages(type = "both") (possibly called via update.packages()) will do if compilation of a source package is desirable: see ?options (under utils).
  If you intend not to accept updates as source packages, you should use update.packages(type = "binary").
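For readers on Windows or OS X who prefer to stay on binary packages, the advice above can be sketched as follows. The wrapper update_binary_only is a hypothetical helper introduced here for illustration; the call is left commented out because it needs a configured repository and network access.

```r
# Inspect the current default; on platforms that use binary packages,
# R 3.2.0 sets this to "both" (elsewhere it is typically "source").
current <- getOption("pkgType")
print(current)

# Hypothetical helper: only accept binary updates, as the NEWS entry suggests.
update_binary_only <- function(...) {
  utils::update.packages(type = "binary", ...)
}

# update_binary_only(ask = FALSE)  # not run: requires a repository and network
```

See ?options (under utils) for the settings that control what happens when a newer source version is available.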
DEPRECATED AND DEFUNCT
- download.file(method = "lynx") is defunct.
- Building R using the included versions of zlib, bzip2, xz and PCRE is deprecated: these are frozen (bar essential bug-fixes) and will be removed for R 3.3.0.
- The configure option --with-valgrind-instrumentation=3 has been withdrawn, as it did not work with recent valgrind headers: it is now treated as level 2.
- The MethodsList class in package methods had been deprecated in R 2.11.0 and is defunct now. Functions using it are defunct if they had been deprecated in R 2.11.0, and are deprecated now, otherwise.
BUG FIXES
- Fixed two obscure bugs in pairlist subassignment, reported by Radford Neal as part of pqR issue 16.
- Fixes for bugs in handling empty arguments and argument matching by name in log().
- all.equal() gains methods for environments and refClasses.
- [<- and [[<- gain S4 data.frame methods to avoid corruption of S4 class information by the S3 methods.
- callNextMethod() should now work within a .local call when ... is absent from formals(.local).
- dput(pairlist(x)) generates a call to the pairlist constructor instead of the list constructor.
- Fix missing() when arguments are propagated through ... . (PR#15707)
- eigen(m) now defaults to symmetric = TRUE even when the dimnames are asymmetric if the matrix is otherwise symmetric. (PR#16151)
- Fix issues with forwarding ... through callGeneric() and callNextMethod(). (PR#16141)
- callGeneric() now works after a callNextMethod().
- Subclass information is kept consistent when replacing an ordinary S4 class with an “old class” via the S4Class argument to setOldClass(). Thus, for example, a data.frame is valid for a list argument in the signature, and a factor is valid for vector arguments.
- In qbeta() the inversion of pbeta() is much more sophisticated. This works better in corner cases some of which failed completely previously (PR#15755), or were using too many iterations.
- Auto-printing no longer duplicates objects when printing is dispatched to a method.
- kmeans(x, k) would fail when nrow(x) >= 42949673. (Comment 6 of PR#15364)
- ‘Abbreviated’ locale-specific day and month names could have been truncated in those rare locales where they are the same as the full names.
- An irrelevant warning message from updating subclass information was silenced (the namespace would not be writable in this case).