Articles by John Mount

What can be in an R data.frame column?

April 9, 2015 | John Mount

As an R programmer, have you ever wondered what can be in a data.frame column? The documentation is a bit vague: help(data.frame) returns some comforting text including: Value A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and ... [Read more...]

New video course: Campaign Response Testing

April 8, 2015 | John Mount

I am proud to announce a new Win-Vector LLC statistics video course: Campaign Response Testing (John Mount, Win-Vector LLC). This course works through the very specific statistics problem of trying to estimate the unknown true response rates of one or more populations in responding to one or more sales/marketing campaigns ... [Read more...]

How and why to return functions in R

April 3, 2015 | John Mount

One of the advantages of functional languages (such as R) is the ability to create and return functions “on the fly.” We will discuss one good use of this capability and what to look out for when creating functions in R. Why wrap/return functions? One of my favorite uses ... [Read more...]

Using closures as objects in R

March 27, 2015 | John Mount

For more and more clients we have been using a nice coding pattern taught to us by Garrett Grolemund in his book Hands-On Programming with R: make a function that returns a list of functions. This turns out to be a classic functional programming technique: use closures to implement objects (... [Read more...]

The Win-Vector R data science value pack

March 11, 2015 | John Mount

Win-Vector LLC is proud to announce the R data science value pack. 50% off our video course Introduction to Data Science (available at Udemy) and 30% off Practical Data Science with R (from Manning). Pick any combination of video, e-book, and/or print-book you want. Instructions below. Please share and Tweet! For 50% ... [Read more...]

Announcing: Introduction to Data Science video course

February 25, 2015 | John Mount

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data ... [Read more...]

Check your return types when modeling in R

January 27, 2015 | John Mount

Just a warning: double check your return types in R, especially when using different modeling packages. We consider ourselves pretty familiar with R. We have years of experience, many other programming languages to compare R to, and we have taken Hadley Wickham’s Master R Developer Workshop (highly recommended). We ... [Read more...]

R bracket is a bit irregular

January 17, 2015 | John Mount

While skimming Professor Hadley Wickham’s Advanced R I got to thinking about the nature of the square-bracket or extract operator in R. It turns out “[,]” is a bit more irregular than I remembered. The subsetting section of Advanced R has a very good discussion on the subsetting and selection operators ... [Read more...]

A comment on preparing data for classifiers

December 4, 2014 | John Mount

I have been working through (with some honest appreciation) a recent article comparing many classifiers on many data sets: “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, Dinani Amorim; 15(Oct):3133−3181, 2014 (which we will call “the DWN paper” in ... [Read more...]

Excel spreadsheets are hard to get right

November 18, 2014 | John Mount

Any practicing data scientist is eventually going to have to work with data stored in a Microsoft Excel spreadsheet. A lot of analysts use this format, so if you work with others you are going to run into it. We have already written how we ... [Read more...]

Factors are not first-class citizens in R

September 23, 2014 | John Mount

The primary user-facing data types in the R statistical computing environment behave as vectors. That is: one-dimensional arrays of scalar values that have a nice operational algebra. There are additional types (lists, data frames, matrices, environments, and so on) but the most common data types are vectors. In fact vectors ... [Read more...]

Reading the Gauss-Markov theorem

August 26, 2014 | John Mount

What is the Gauss-Markov theorem? From “The Cambridge Dictionary of Statistics” B. S. Everitt, 2nd Edition: A theorem that proves that if the error terms in a multiple regression have the same variance and are uncorrelated, then the estimators of the parameters in the model produced by least squares estimation ... [Read more...]

Automatic bias correction doesn’t fix omitted variable bias

July 4, 2014 | John Mount

Page 94 of Gelman, Carlin, Stern, Dunson, Vehtari, Rubin “Bayesian Data Analysis” 3rd Edition (which we will call BDA3) provides a great example of what happens when common broad frequentist bias criticisms are over-applied to predictions from ordinary linear regression: the predictions appear to fall apart. BDA3 goes on to exhibit ... [Read more...]

Frequentist inference only seems easy

July 1, 2014 | John Mount

Two of the most common methods of statistical inference are frequentism and Bayesianism (see Bayesian and Frequentist Approaches: Ask the Right Question for some good discussion). In both cases we are attempting to perform reliable inference of unknown quantities from related observations. And in both cases inference is made possible ... [Read more...]

R has some sharp corners

May 15, 2014 | John Mount

R is definitely our first choice go-to analysis system. In our opinion you really shouldn’t use something else until you have an articulated reason (be it a need for larger data scale, different programming language, better data source integration, or something else). The advantages of R are numerous: Single ... [Read more...]