Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
R 3.3.0 (codename “Supposedly Educational”) was released today. You can get the latest binary version from here (or the .tar.gz source code from here). The full list of new features and bug fixes is provided below.
Upgrading to R 3.3.0 on Windows
If you are using Windows you can easily upgrade to the latest version of R using the installr package. Simply run the following code in Rgui:
install.packages("installr") # install
setInternet2(TRUE)
installr::updateR() # updating R.
Running “updateR()” will detect if there is a new R version available, and if so it will download and install it (etc.). There is also a step-by-step tutorial (with screenshots) on how to upgrade R on Windows using the installr package. If you only see the option to upgrade to an older version of R, then change your mirror or try again in a few hours (it usually takes around 24 hours for all CRAN mirrors to get the latest version of R).
I try to keep the installr package updated and useful, so if you have any suggestions or remarks on the package, you are invited to open an issue on the GitHub page.
CHANGES IN R 3.3.0
SIGNIFICANT USER-VISIBLE CHANGES
- nchar(x, *)’s argument keepNA governing how the result for NAs in x is determined, gets a new default keepNA = NA which returns NA where x is NA, except for type = "width" which still returns 2, the formatting / printing width of NA.
- All builds have support for https: URLs in the default methods for download.file(), url() and code making use of them. Unfortunately that cannot guarantee that any particular https: URL can be accessed. For example, server and client have to successfully negotiate a cryptographic protocol (TLS/SSL, …) and the server’s identity has to be verifiable via the available certificates. Different access methods may allow different protocols or use private certificate bundles: we encountered a https: CRAN mirror which could be accessed by one browser but not by another nor by download.file() on the same Linux machine.
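As a rough illustration of the new keepNA default, here is a minimal sketch that assumes R 3.3.0 or later. The vector x and the result names are made up for the example.

```r
# Illustrative sketch; 'x' is a made-up example vector.
x <- c("apple", NA_character_)

n_default <- nchar(x)                  # keepNA = NA: the missing string gives NA
n_width   <- nchar(x, type = "width")  # type = "width" still reports 2 for NA
n_old     <- nchar(x, keepNA = FALSE)  # pre-3.3.0 behaviour: NA counts as 2

print(n_default)
print(n_width)
print(n_old)
```

Code that relied on nchar() never returning NA for character input may want to pass keepNA = FALSE explicitly.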
NEW FEATURES
- The print method for methods() gains a byclass argument.
- New functions validEnc() and validUTF8() to give access to the validity checks for inputs used by grep() and friends.
- Experimental new functionality for S3 method checking, notably isS3method(). Also, the names of the R ‘language elements’ are exported as character vector tools::langElts.
- str(x) now displays "Time-Series" also for matrix (multivariate) time-series, i.e. when is.ts(x) is true.
- (Windows only) The GUI menu item to install local packages now accepts ‘*.tar.gz’ files as well as ‘*.zip’ files (but defaults to the latter).
- New programmeR’s utility function chkDots().
- D() now signals an error when given invalid input, rather than silently returning NA. (Request of John Nash.)
- formula objects are slightly more “first class”: e.g., formula() or new("formula", y ~ x) are now valid. Similarly, for "table", "ordered" and "summary.table". Packages defining S4 classes with the above S3/S4 classes as slots should be reinstalled.
- New function strrep() for repeating the elements of a character vector.
- rapply() preserves attributes on the list when how = "replace".
- New S3 generic function sigma() with methods for extracting the estimated standard deviation aka “residual standard deviation” from a fitted model.
- news() now displays R and package news files within the HTML help system if it is available. If no news file is found, a visible NULL is returned to the console.
- as.raster(x) now also accepts raw arrays x assuming values in 0:255.
- Subscripting of matrix/array objects of type "expression" is now supported.
- type.convert("i") now returns a factor instead of a complex value with zero real part and missing imaginary part.
- Graphics devices cairo_pdf() and cairo_ps() now allow non-default values of the cairographics ‘fallback resolution’ to be set. This now defaults to 300 on all platforms: that is the default documented by cairographics, but apparently was not used by all system installations.
- file() gains an explicit method argument rather than implicitly using getOption("url.method", "default").
- Thanks to a patch from Tomas Kalibera, x[x != 0] is now typically faster than x[which(x != 0)] (in the case where x has no NAs, the two are equivalent).
- read.table() now always uses the names for a named colClasses argument (previously names were only used when colClasses was too short). (In part, wish of PR#16478.)
- (Windows only) download.file() with default method = "auto" and a ftps:// URL chooses "libcurl" if that is available.
- The out-of-the box Bioconductor mirror has been changed to one using https://: use chooseBioCmirror() to choose a http:// mirror if required.
- The data frame and formula methods for aggregate() gain a drop argument.
- available.packages() gains a repos argument.
- The undocumented switching of methods for url() on https: and ftps: URLs is confined to method = "default" (and documented).
- smoothScatter() gains a ret.selection argument.
- qr() no longer has a ... argument to pass additional arguments to methods.
- [ has a method for class "table".
- It is now possible (again) to replayPlot() a display list snapshot that was created by recordPlot() in a different R session. It is still not a good idea to use snapshots as a persistent storage format for R plots, but it is now not completely silly to use a snapshot as a format for transferring an R plot between two R sessions. The underlying changes mean that packages providing graphics devices (e.g., Cairo, RSvgDevice, cairoDevice, tikzDevice) will need to be reinstalled. Code for restoring snapshots was contributed by Jeroen Ooms and JJ Allaire. Some testing code is available at https://github.com/pmur002/R-display-list.
- tools::undoc(dir = D) and codoc(dir = D) now also work when D is a directory whose normalizePath()ed version does not end in the package name, e.g. from a symlink.
- abbreviate() has more support for multi-byte character sets – it no longer removes bytes within characters and knows about Latin vowels with accents. It is still only really suitable for (most) European languages, and still warns on non-ASCII input. abbreviate(use.classes = FALSE) is now implemented, and that is more suitable for non-European languages.
- match(x, table) is faster (sometimes by an order of magnitude) when x is of length one and incomparables is unchanged, thanks to Peter Haverty (PR#16491).
- More consistent, partly not back-compatible behavior of NA and NaN coercion to complex numbers, operations less often resulting in complex NA (NA_complex_).
- lengths() considers methods for length and [[ on x, so it should work automatically on any objects for which appropriate methods on those generics are defined.
- The logic for selecting the default screen device on OS X has been simplified: it is now quartz() if that is available even if environment variable DISPLAY has been set by the user. The choice can easily be overridden via environment variable R_INTERACTIVE_DEVICE.
- On Unix-like platforms which support the getline C library function, system(*,intern = TRUE) no longer truncates (output) lines longer than 8192 characters, thanks to Karl Millar. (PR#16544)
- rank() gains a ties.method = "last" option, for convenience (and symmetry).
- regmatches(invert = NA) can now be used to extract both non-matched and matched substrings.
- data.frame() gains argument fix.empty.names; as.data.frame.list() gets new cut.names, col.names and fix.empty.names.
- plot(x ~ x, *) now warns that it is the same as plot(x ~ 1, *).
- recordPlot() has new arguments load and attach to allow package names to be stored as part of a recorded plot. replayPlot() has new argument reloadPkgs to load/attach any package names that were stored as part of a recorded plot.
- S4 dispatch works within calls to .Internal(). This means explicit S4 generics are no longer needed for unlist() and as.vector().
- Only family names starting with “Hershey” (and not “Her” as before) are given special treatment by the graphics engine.
- S4 values are automatically coerced to vector (via as.vector) when subassigned into atomic vectors.
- findInterval() gets a left.open option.
- The version of LAPACK included in the sources has been updated to 3.6.0, including those ‘deprecated’ routines which were previously included. Ca 40 double-complex routines have been added at the request of a package maintainer. As before, the details of what is included are in ‘src/modules/lapack/README’ and this now gives information on earlier additions.
- tapply() has been made considerably more efficient without changing functionality, thanks to proposals from Peter Haverty and Suharto Anggono. (PR#16640)
- match.arg(arg) (the one-argument case) is faster; so is sort.int(). (PR#16640)
- The format method for object_size objects now also accepts “binary” units such as "KiB" and e.g., "Tb". (Partly from PR#16649.)
- Profiling now records calls of the form foo::bar and some similar cases directly rather than as calls to <Anonymous>. Contributed by Winston Chang.
- New string utilities startsWith(x, prefix) and endsWith(x, suffix). Also provide speedups for some grepl("^...",*) uses (related to proposals in PR#16490).
- Reference class finalizers run at exit, as well as on garbage collection.
- Avoid parallel dependency on stats for port choice and random number seeds. (PR#16668)
- The radix sort algorithm and implementation from data.table (forder) replaces the previous radix (counting) sort and adds a new method for order(). Contributed by Matt Dowle and Arun Srinivasan, the new algorithm supports logical, integer (even with large values), real, and character vectors. It outperforms all other methods, but there are some caveats (see ?sort).
- The order() function gains a method argument for choosing between "shell" and "radix".
- New function grouping() returns a permutation that stably rearranges data so that identical values are adjacent. The return value includes extra partitioning information on the groups. The implementation came included with the new radix sort.
- rhyper(nn, m, n, k) no longer returns NA when one of the three parameters exceeds the maximal integer.
- switch() now warns when no alternatives are provided.
- parallel::detectCores() now has default logical = TRUE on all platforms – as this was the default on Windows, this change only affects Sparc Solaris. Option logical = FALSE is now supported on Linux and recent versions of OS X (for the latter, thanks to a suggestion of Kyaw Sint).
- hist() for "Date" or "POSIXt" objects would sometimes give misleading labels on the breaks, as they were set to the day before the start of the period being displayed. The display format has been changed, and the shift of the start day has been made conditional on right = TRUE (the default). (PR#16679)
- R now uses a new version of the logo (donated to the R Foundation by RStudio). It is defined in ‘.svg’ format, so will resize without unnecessary degradation when displayed on HTML pages; there is also a vector PDF version. Thanks to Dirk Eddelbuettel for producing the corresponding X11 icon.
- New function .traceback() returns the stack trace which traceback() prints.
- lengths() dispatches internally.
- dotchart() gains a pt.cex argument to control the size of points separately from the size of plot labels. Thanks to Michael Friendly and Milan Bouchet-Valat for ideas and patches.
- as.roman(ch) now correctly deals with more diverse character vectors ch; also arithmetic with the resulting roman numbers works in more cases. (PR#16779)
- prcomp() gains a new option rank. allowing to directly aim for less than min(n,p) PC’s. The summary() and its print() method have been amended, notably for this case.
- gzcon() gains a new option text, which marks the connection as text-oriented (so e.g. pushBack() works). It is still always opened in binary mode.
- The import() namespace directive now accepts an argument except which names symbols to exclude from the imports. The except expression should evaluate to a character vector (after substituting symbols for strings). See Writing R Extensions.
- New convenience function Rcmd() in package tools for invoking R CMD tools from within R.
- New functions makevars_user() and makevars_site() in package tools to determine the location of the user and site specific ‘Makevars’ files for customizing package compilation.
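To give a feel for a few of the additions listed above, here is a small sketch that assumes R 3.3.0 or later. All object names (files, scores, breaks, values and so on) are made up for illustration, and the example only touches a handful of the new features.

```r
# Illustrative sketch, assuming R >= 3.3.0; all object names are made up.

# New string utilities
files     <- c("report.csv", "notes.txt", "report_old.csv")
is_report <- startsWith(files, "report")  # TRUE FALSE TRUE
is_csv    <- endsWith(files, ".csv")      # TRUE FALSE TRUE
rule      <- strrep("-", 10)              # a string of ten dashes
steps     <- strrep("ab", 1:3)            # "ab" "abab" "ababab"
utf8_ok   <- validUTF8(c("abc", "\xff"))  # TRUE FALSE

# rank() with ties.method = "last": tied values are ranked in reverse
# order of appearance, mirroring ties.method = "first"
scores     <- c(10, 20, 10, 30)
rank_first <- rank(scores, ties.method = "first")  # 1 3 2 4
rank_last  <- rank(scores, ties.method = "last")   # 2 3 1 4

# findInterval() with left.open = TRUE uses intervals (v[i], v[i+1]]
breaks     <- c(1, 2, 3)
idx_closed <- findInterval(c(1, 2, 3), breaks)                    # 1 2 3
idx_open   <- findInterval(c(1, 2, 3), breaks, left.open = TRUE)  # 0 1 2

# order() with an explicit method, and the new grouping()
values  <- c(3L, 1L, 2L, 1L)
ord     <- order(values, method = "radix")  # 2 4 3 1
grp     <- grouping(values)
grouped <- values[as.integer(grp)]          # identical values end up adjacent

# sigma() extracts the residual standard deviation from a fitted model
fit <- lm(dist ~ speed, data = cars)
s   <- sigma(fit)
```

For lm fits, sigma(fit) should agree with summary(fit)$sigma, so it mainly saves a detour through summary().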
UTILITIES
- R CMD check has a new option --ignore-vignettes for use with non-Sweave vignettes whose VignetteBuilder package is not available.
- R CMD check now by default checks code usage (via codetools) with only the base package attached. Functions from default packages other than base which are used in the package code but not imported are reported as undefined globals, with a suggested addition to the NAMESPACE file.
- R CMD check --as-cran now also checks DOIs in package ‘CITATION’ and Rd files.
- R CMD Rdconv and R CMD Rd2pdf each have a new option --RdMacros=pkglist which allows Rd macros to be specified before processing.
DEPRECATED AND DEFUNCT
- The previously included versions of zlib, bzip2, xz and PCRE have been removed, so suitable external (usually system) versions are required (see the ‘R Installation and Administration’ manual).
- The unexported and undocumented Windows-only devices cairo_bmp(), cairo_png() and cairo_tiff() have been removed. (These devices should be used as e.g. bmp(type = "cairo").)
- (Windows only) Function setInternet2() has no effect and will be removed in due course. The choice between methods "internal" and "wininet" is now made by the method arguments of url() and download.file() and their defaults can be set via options. The out-of-the-box default remains "wininet" (as it has been since R 3.2.2).
- [<- with an S4 value into a list currently embeds the S4 object into its own list such that the end result is roughly equivalent to using [[<-. That behavior is deprecated. In the future, the S4 value will be coerced to a list with as.list().
- Package tools’ functions package.dependencies(), pkgDepends(), etc are deprecated now, mostly in favor of package_dependencies() which is both more flexible and efficient.
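For code that still calls the deprecated helpers, a minimal sketch of tools::package_dependencies() might look as follows. The db matrix and the package names pkgA, pkgB and pkgC are made up so the example runs offline; in practice db would normally come from available.packages().

```r
# Illustrative sketch; 'db' is a hypothetical stand-in for the matrix
# returned by available.packages(), so no network access is needed.
db <- rbind(
  c(Package = "pkgA", Depends = "R (>= 3.3.0), pkgB",
    Imports = "pkgC (>= 1.0)", LinkingTo = NA),
  c(Package = "pkgB", Depends = NA, Imports = "pkgC", LinkingTo = NA),
  c(Package = "pkgC", Depends = NA, Imports = NA, LinkingTo = NA)
)
rownames(db) <- db[, "Package"]

# Direct dependencies of pkgA (Depends, Imports and LinkingTo by default)
direct <- tools::package_dependencies("pkgA", db = db)

# Reverse dependencies: which packages in 'db' depend on pkgC?
rev_deps <- tools::package_dependencies("pkgC", db = db, reverse = TRUE)

print(direct)
print(rev_deps)
```

Both calls return a list named by the queried package, with version requirements and the dependency on R itself stripped from the names.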
INSTALLATION and INCLUDED SOFTWARE
- Support for very old versions of valgrind (e.g., 3.3.0) has been removed.
- The included libtool script (generated by configure) has been updated to version 2.4.6 (from 2.2.6a).
- libcurl version 7.28.0 or later with support for the https protocol is required for installation (except on Windows).
- BSD networking is now required (except on Windows) and so capabilities("http/ftp") is always true.
- configure uses pkg-config for PNG, TIFF and JPEG where this is available. This should work better with multiple installs and with those using static libraries.
- The minimum supported version of OS X is 10.6 (‘Snow Leopard’): even that has been unsupported by Apple since 2012.
- The configure default on OS X is --disable-R-framework: enable this if you intend to install under ‘/Library/Frameworks’ and use with R.app.
- The minimum preferred version of PCRE has since R 3.0.0 been 8.32 (released in Nov 2012). Versions 8.10 to 8.31 are now deprecated (with warnings from configure), but will still be accepted until R 3.4.0.
- configure looks for C functions __cospi, __sinpi and __tanpi and uses these if cospi etc are not found. (OS X is the main instance.)
- (Windows) R is now built using gcc 4.9.3. This build will require recompilation of at least those packages that include C++ code, and possibly others. A build of R-devel using the older toolchain will be temporarily available for comparison purposes. During the transition, the environment variable R_COMPILED_BY has been defined to indicate which toolchain was used to compile R (and hence, which should be used to compile code in packages). The COMPILED_BY variable described below will be a permanent replacement for this.
- (Windows) A make and R CMD config variable named COMPILED_BY has been added. This indicates which toolchain was used to compile R (and hence, which should be used to compile code in packages).
PACKAGE INSTALLATION
- The make macro AWK which used to be made available to files such as ‘src/Makefile’ is no longer set.
C-LEVEL FACILITIES
- The API call logspace_sum introduced in R 3.2.0 is now remapped as an entry point to Rf_logspace_sum, and its first argument has gained a const qualifier. (PR#16470) Code using it will need to be reinstalled. Similarly, entry point log1pexp also defined in ‘Rmath.h’ is remapped there to Rf_log1pexp.
- R_GE_version has been increased to 11.
- New API call R_orderVector1, a faster one-argument version of R_orderVector.
- When R headers such as ‘R.h’ and ‘Rmath.h’ are called from C++ code in packages they include the C++ versions of system headers such as ‘<cmath>’ rather than the legacy headers such as ‘<math.h>’. (Headers ‘Rinternals.h’ and ‘Rinterface.h’ already did, and inclusion of system headers can still be circumvented by defining NO_C_HEADERS, including as from this version for those two headers.) The manual has long said that R headers should not be included within an extern "C" block, and almost all the packages affected by this change were doing so.
- Including header ‘S.h’ from C++ code would fail on some platforms, and so gives a compilation error on all.
- The deprecated header ‘Rdefines.h’ is now compatible with defining R_NO_REMAP.
- The connections API now includes a function R_GetConnection() which allows packages implementing connections to convert R connection objects to Rconnection handles used in the API. Code which previously used the low-level R-internal getConnection() entry point should switch to the official API.
BUG FIXES
- C-level asChar(x) is fixed for when x is not a vector, and it returns "TRUE"/"FALSE" instead of "T"/"F" for logical vectors.
- The first arguments of .colSums() etc (with an initial dot) are now named x rather than X (matching colSums()): thus error messages are corrected.
- A coef() method for class "maov" has been added to allow vcov() to work with multivariate results. (PR#16380)
- method = "libcurl" connections signal errors rather than retrieving HTTP error pages (where the ISP reports the error).
- xpdrows.data.frame() was not checking for unique row names; in particular, this affected assignment to non-existing rows via numerical indexing. (PR#16570)
- tail.matrix() did not work for zero rows matrices, and could produce row “labels” such as "[1e+05,]".
- Data frames with a column named "stringsAsFactors" now format and print correctly. (PR#16580)
- cor() is now guaranteed to return a value with absolute value less than or equal to 1. (PR#16638)
- Array subsetting now keeps names(dim(.)).
- Blocking socket connection selection recovers more gracefully on signal interrupts.
- The data.frame method of rbind() construction row.names works better in borderline integer cases, but may change the names assigned. (PR#16666)
- (X11 only) getGraphicsEvent() miscoded buttons and missed mouse motion events. (PR#16700)
- methods(round) now also lists round.POSIXt.
- tar() now works with the default files = NULL. (PR#16716)
- Jumps to outer contexts, for example in error recovery, now make intermediate jumps to contexts where on.exit() actions are established instead of trying to run all on.exit() actions before jumping to the final target. This unwinds the stack gradually, releases resources held on the stack, and significantly reduces the chance of a segfault when running out of C stack space. Error handlers established using withCallingHandlers() and options("error") specifications are ignored when handling a C stack overflow error as attempting one of these would trigger a cascade of C stack overflow errors. (These changes resolve PR#16753.)
- The spacing could be wrong when printing a complex array. (Report and patch by Lukas Stadler.)
- pretty(d, n, min.n, *) for date-time objects d works again in border cases with large min.n, returns a labels attribute also for small-range dates and in such cases its returned length is closer to the desired n. (PR#16761) Additionally, it finally does cover the range of d, as it always claimed.
- tsp(x) <- NULL did not handle correctly objects inheriting from both "ts" and "mts". (PR#16769)
- install.packages() could give false errors when options("pkgType") was "binary". (Reported by Jose Claudio Faria.)
- A bug fix in R 3.0.2 fixed problems with locator() in X11, but introduced problems in Windows. Now both should be fixed. (PR#15700)
- download.file() with method = "wininet" incorrectly warned of download file length difference when reported length was unknown. (PR#16805)
- diag(NULL, 1) crashed because of missed type checking. (PR#16853)