Articles by matloff

Any Forward Progress on p-Values?

February 9, 2017 | matloff

Statisticians have long known that the use of p-values has major problems. Some of us have long called for reform, weaning the profession away from these troubling beasts. At one point, I was pleased to see Frank Harrell suggest that R should stop computing them. That is not going to ... [Read more...]

Threading in R?

December 11, 2016 | matloff

I was pleased to see today’s post, “(A Very) Experimental Threading in R,” by Lukasz Bartnik, as this is a long-standing interest of mine. My own effort in this direction has been my package Rdsm. The notion of threading, for those who may not have this background, refers to ... [Read more...]

Using CART: Implementation Matters

December 11, 2016 | matloff

In preparing the following example for my forthcoming book, I was startled by how differently two implementations of Classification and Regression Trees (CART) performed on a particular data set. Here is what happened: For my example, I first used the Vertebral Column data set from the UCI Machine Learning Repository. ... [Read more...]

My regtools Package Is Now on CRAN

November 7, 2016 | matloff

In my posts to this blog (less frequent than I would like, hopefully more frequent in the future), I’ve often mentioned my R package regtools, which contains a number of functions useful for regression and classification. None of them duplicate what is available in the excellent packages on CRAN, ... [Read more...]

Manny Parzen Used R!

August 20, 2016 | matloff

Prof. Manny Parzen, a pioneer of modern statistics, passed away in February, aged 87. I should have commented back then, but it’s still worth saying something today. I happened to be thinking of him this morning. I did not know Manny personally. This makes it odd that I refer to ... [Read more...]

StatET IDE for R

July 22, 2016 | matloff

I personally do not use Integrated Development Environments (IDEs) for R, or for that matter for any programming language. From my point of view, they take up too much precious real estate on the screen, and most important, they generally do not allow me to use my own text editor ... [Read more...]

New Release of partools Package

July 17, 2016 | matloff

My new release of partools is now on CRAN. The package is aimed at doing parallel data science in what I call an “un-MapReduce” manner. It takes the point of view that MapReduce-based frameworks such as Hadoop and Spark are fine for the types of applications their designers had in ... [Read more...]

Bad Coder, Bad Coder!

July 7, 2016 | matloff

My title here is in the sense of “Bad dog, bad dog!”, a scolding I sometimes see dog owners use to tame their pets, and is also an allusion to Bad Reporter, a sometimes hilarious and always irreverent political comic strip in the San Francisco Chronicle. And my title is ... [Read more...]

Latest on the Julia Language (vs. R)

July 6, 2016 | matloff

I’ve written before about the Julia language. As someone who is very active in the R community, I am biased of course, and have been (and remain) a skeptic about Julia. But I would like to report on a wonderful talk I attended today at Stanford. To my surprise ... [Read more...]

Student-Run Conference in Data Science

May 5, 2016 | matloff

I’d like to urge all of you in Northern California to attend iidata, a student-run conference in data science, to be held on the UC Davis campus on May 21. According to the Web page, iidata is a one-day, collegiate-level Data Science convention aimed at educating students in the new, ... [Read more...]

Talk on regtools and P-Values

April 28, 2016 | matloff

I’m deeply grateful to Hui Lin and the inimitable Yihui Xie for arranging for me to give a “virtual seminar talk” to the Central Iowa R Users Group. You can view my talk, including an interesting Q&A session, online. (The actual start is at 0:34.) There are two separate ... [Read more...]

GTC 2016

March 29, 2016 | matloff

I will be an invited speaker at GTC 2016, a large conference on GPU computation. The main topic will be usage of GPUs in conjunction with R, and I will also speak on my Software Alchemy method, especially in relation to GPU computing. GTC asked me to notify my “network” about ... [Read more...]

Even Businessweek Is Talking about P-Values

March 28, 2016 | matloff

The March 28 issue of Bloomberg Businessweek has a rather good summary of the problems of p-values, even recommending the use of confidence intervals and — wonder of wonders — “[looking] at the evidence as a whole.” What, statistics can’t make our decisions for us?  :-) It does make some vague and ... [Read more...]

P-values: the Continuing Saga

March 10, 2016 | matloff

I highly recommend the blog post by Yoav Benjamini and Tal Galili in defense of (carefully used) p-values. I disagree with much of it, but the exposition is very clear, and there is a nice guide to relevant R tools, including for simultaneous inference, a field in which Yoav is ... [Read more...]

Further Comments on the ASA Manifesto

March 9, 2016 | matloff

On Tuesday I commented here on the ASA (in their words) “Position on p-values:  context, process, and purpose.” A number of readers replied, some of them positive, some mistakenly thinking I don’t think statistical inferences are needed, and some claiming I overinterpreted the ASA’s statement. I’ll respond ... [Read more...]

After 150 Years, the ASA Says No to p-values

March 7, 2016 | matloff

Sadly, the concept of p-values and significance testing forms the very core of statistics. A number of us have been pointing out for decades that p-values are at best underinformative and often misleading. Almost all statisticians agree on this, yet they all continue to use them and, worse, teach them. ... [Read more...]

Quick Intro to NMF (the Method and the R Package)

March 6, 2016 | matloff

Nonnegative matrix factorization (NMF) is a popular tool in many applications, such as image and text recognition. If you’ve ever wanted to learn a little bit about NMF, you can do so right here, in this blog post, which will summarize the (slightly) longer presentation here. The R package ... [Read more...]

Innumeracy, Statistics and R

March 1, 2016 | matloff

A couple of years ago, when an NPR journalist was interviewing me, the conversation turned to quantitative matters. The reporter said, only half jokingly, “We journalists are innumerate and proud.” :-) Sometimes it shows, badly. This morning a radio reporter stated, “Hillary Clinton beat Bernie Sanders among South Carolina ... [Read more...]

50% Draft of Forthcoming Book Available

March 1, 2016 | matloff

As I’ve mentioned here a couple of times, I am in the midst of writing a book, From Linear Models to Machine Learning: Regression and Classification, with Examples in R. As has been my practice with past books, I have now placed a 50% rough draft of the book on ... [Read more...]

Some Comments on Donoho’s “50 Years of Data Science”

January 23, 2016 | matloff

An old friend recently called my attention to a thoughtful essay by Stanford statistics professor David Donoho, titled “50 Years of Data Science.” Given the keen interest these days in data science, the essay is quite timely. The work clearly shows that Donoho is not only a grandmaster theoretician, but also ... [Read more...]
