Articles by David Smith

Teach kids about R with Minecraft

June 2, 2017 | David Smith

As I mentioned earlier this week, I was on a team at the ROpenSci Unconference (with Brooke Anderson, Karl Broman, Gergely Daróczi, and my Microsoft colleagues Mario Inchiosa and Ali Zaidi) to work on a project to interface the R language with Minecraft. The resulting R package, miner, is ... [Read more...]

Python and R top 2017 KDnuggets rankings

June 1, 2017 | David Smith

The results of KDnuggets' 18th annual poll of data science software usage are in, and for the first time in three years Python has edged out R as the most popular software. While R increased its share of usage from 45.7% in last year's poll to 52.1% this year, Python's usage among ... [Read more...]

Watch presentations from R/Finance 2017

May 31, 2017 | David Smith

It was another great year for the R/Finance conference, held earlier this month in Chicago. This is normally a fairly private affair: with attendance capped at around 300 people every year, it's a somewhat exclusive gathering of the best and brightest minds from industry and academia in financial data analysis ... [Read more...]

Reflections on ROpenSci Unconference 2017

May 30, 2017 | David Smith

Last week I attended the ROpenSci Unconference in Los Angeles, and it was fantastic. Now in its fourth year, the ROpenSci team brought together a talented and diverse group of about 70 R developers from around the world to work on R-related projects in an intense 2-day hackathon. Not only did ... [Read more...]

Love is all around: Popular words in pop hits

May 25, 2017 | David Smith

Data scientist Giora Simchoni recently published a fantastic analysis of the history of pop songs on the Billboard Hot 100 using the R language. Giora used the rvest package in R to scrape data from the Ultimate Music Database site for the 350,000 chart entries (and 35,000 unique songs) since 1940, and used those ... [Read more...]

Microsoft R Open 3.4.0 now available

May 24, 2017 | David Smith

Microsoft R Open (MRO), Microsoft's enhanced distribution of open source R, has been upgraded to version 3.4.0 and is now available for download for Windows, Mac, and Linux. This update upgrades the R language engine to R 3.4.0, reduces the size of the installer image, and updates the bundled packages. R 3.4.0 (upon ... [Read more...]

Create smooth animations in R with the tweenr package

May 23, 2017 | David Smith

There are several tools available in R for creating animations (movies) from statistical graphics. The animation package by Yihui Xie will create an animated GIF or video file, using a series of R charts you generate as the frames. And the gganimate package by David Robinson is an extension to ... [Read more...]

Preview of EARL San Francisco

May 22, 2017 | David Smith

The first ever EARL (Enterprise Applications of the R Language) conference in San Francisco will take place on June 5-7 (and it's not too late to register). The EARL conference series is now in its fourth year, and the prior conferences in London and Boston have each been a fantastic ... [Read more...]

R/Finance 2017 livestreaming today and tomorrow

May 19, 2017 | David Smith

If you weren't able to make it to Chicago for R/Finance, the annual conference devoted to applications of R in the financial industry, don't fret: the entire conference is being livestreamed (with thanks to the team at Microsoft). You can watch the proceedings at aka.ms/r_finance, and ... [Read more...]

An Introduction to Spatial Data Analysis and Visualization in R

May 17, 2017 | David Smith

The Consumer Data Research Centre, the UK-based organization that works with consumer-related organisations to open up their data resources, recently published a new course online: An Introduction to Spatial Data Analysis and Visualization in R. Created by James Cheshire (whose blog Spatial.ly regularly features interesting R-based data visualizations) and ... [Read more...]

R in Financial Services: Challenges and Opportunities

May 16, 2017 | David Smith

At the New York R Conference earlier this year, my colleague Lixun Zhang gave a presentation on the challenges and opportunities financial services companies encounter when using R. In the talk, he shares some lessons learned while working with a couple of international banks that have been using SAS, but ... [Read more...]

R and Python support now built in to Visual Studio 2017

May 15, 2017 | David Smith

The new Visual Studio 2017 has built-in support for programming in R and Python. For older versions of Visual Studio, support for these languages has been available via the RTVS and PTVS add-ins, but the new Data Science Workloads in Visual Studio 2017 make them available without a separate add-in. Just choose ... [Read more...]

Analyzing the home advantage in English soccer, with R

May 12, 2017 | David Smith

It's well-known that the home team has an advantage in soccer (or football, as it's called in England). But which teams have made the most of their home-field advantage over the years? Evolutionary biologist (and Liverpool fan) Joe Gallagher analyzed the percentage of points won in the UK Premier League (... [Read more...]

Analyzing data on CRAN packages

May 11, 2017 | David Smith

There's a handy new function in R 3.4.0 for anyone interested in data about CRAN packages. It's not documented, but it's pretty simple: tools::CRAN_package_db() returns a data frame with one row for every package on CRAN and 65 columns of data on those packages, as shown below. names(tools::... [Read more...]
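
As a rough sketch of how the function named in this teaser might be explored (assuming R 3.4.0 or later and a working connection to a CRAN mirror; the variable name `db` is only illustrative, and the number of columns may differ from the 65 reported at the time):

```r
# Download the CRAN package metadata as a data frame.
# Assumption: a CRAN mirror is reachable from this session.
db <- tools::CRAN_package_db()

# One row per package entry, one column per metadata field.
dim(db)

# Peek at the first few column names and package names.
head(names(db))
head(db$Package)
```

Because the result is an ordinary data frame, it can be filtered and summarised with the usual base R or tidyverse tools.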

Stack Overflow Trends

May 10, 2017 | David Smith

Developer Q&A site Stack Overflow recently introduced Stack Overflow Trends, a useful tool for tracking the growth and decline in the rate of questions asked on various topics (by their Stack Overflow tag). For example, you can see that activity around both R and Python has been increasing over ... [Read more...]

Real-time scoring with Microsoft R Server 9.1

May 4, 2017 | David Smith

Once you've built a predictive model, in many cases the next step is to operationalize the model: that is, generate predictions from the pre-trained model in real time. In this scenario, latency becomes the critical metric: new data typically become available a single row at a time, and it's important ... [Read more...]

Technical Foundations of Informatics: A modern introduction to R

May 3, 2017 | David Smith

Informatics (or Information Science) is the practice of creating, storing, finding, manipulating and sharing information. These are all tasks that the R language was designed for, and so Technical Foundations of Informatics, the online course guide for the University of Washington course of the same name, also provides an excellent ... [Read more...]

The Datasaurus Dozen

May 2, 2017 | David Smith

There's a reason why data scientists spend so much time exploring data using graphics. Relying only on data summaries like means, variances, and correlations can be dangerous, because wildly different data sets can give similar results. This is a principle that has been demonstrated in statistics classes for decades with ... [Read more...]
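
One classic illustration of this principle (not necessarily the one the post goes on to name) is Anscombe's quartet, which ships with base R as the `anscombe` data frame. The sketch below computes the same summaries for each of its four x/y pairs; the names `stats` and `set1` to `set4` are only illustrative.

```r
# `anscombe` is part of the built-in datasets package:
# columns x1..x4 and y1..y4 form four separate x/y data sets.
data(anscombe)

# Compute the same summary statistics for each of the four sets.
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y),
    var_x = var(x), var_y = var(y),
    cor_xy = cor(x, y))
})
colnames(stats) <- paste0("set", 1:4)

# The columns come out nearly identical across the four sets.
round(stats, 2)
```

Plotting each pair, for example with `plot(anscombe$x1, anscombe$y1)`, shows how different the four sets look despite the near-identical summaries.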

Using Microsoft R with Alteryx

May 1, 2017 | David Smith

Alteryx Designer, the self-service analytics workflow tool, recently added integration with Microsoft R. This allows you to train models provided by Microsoft R, and create predictions from them, without needing to write R code — you simply drag-and-drop to create a workflow. In a recent post at the Microsoft R blog, ... [Read more...]