statistics

Significant Figures in R and Rounding

April 16, 2010 | Neil Gunther

This is a follow-on to my previous post about determining significant digits, or sigdigs, in performance and capacity planning calculations. Once we know how to do that, inevitably we will be faced with rounding the result of a calculation to the least...
[Read more...]

Plotting “time of day” data using ggplot2

April 14, 2010 | nsaunders

William asks: How can I make a graph that looks like this, “tweet density” style, showing time intervals? He then helpfully describes his input data: a CSV file with headers “time started, time finished, date”. Here’s a simple CSV file, tasks.csv:
task,date,start,end
task1,2010-03-05,09:00:00,13:00:00
... [Read more...]

Repeated measures ANOVA with R (tutorials)

April 13, 2010 | Tal Galili

Repeated measures ANOVA is a common task for the data analyst. There are (at least) two ways of performing “repeated measures ANOVA” using R, but neither is really trivial, and each way has its own complications/pitfalls (explanations/solutions to which I was usually able to find through searching ... [Read more...]

Arizona court rules statistical sampling is legal

April 12, 2010 | David Smith

A court in Arizona has ruled that statistical sampling is legal for determining damages awarded to individual claimants when there are thousands of similar cases to be assessed simultaneously. In a case where 30,000 claims were filed in Maricopa County, AZ by hospitals for improper reimbursement, the trial judge appointed a former ... [Read more...]

Significant Figures in R and Info Zeros

April 11, 2010 | Neil Gunther

The other day, I stumbled upon the signif function in R, so I thought I'd take a look at what it does and compare it with some results discussed in Chap. 3 "Damaging Digits in Capacity Calculations" of my GCaP book, viz., Example 3.5 on page 31. The m...
[Read more...]
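
For readers who have not met signif before, here is a minimal sketch of what it does next to round. The input values are made up for illustration and are not the numbers from Example 3.5 in the book.

```r
# Illustrative values only; they are NOT the numbers from Example 3.5.
x <- c(3.14159, 123456, 0.00123456)

# signif(x, digits) rounds each value to the given number of significant digits.
x_sig <- signif(x, digits = 3)
print(x_sig)

# round(x, digits) counts decimal places instead, so the results can differ.
x_dec <- round(x, digits = 3)
print(x_dec)
```

Note that signif returns ordinary numbers, so any trailing zeros that carry information are not shown when the result is printed.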

Poor man’s pairs trading…

April 11, 2010 | M. Parzakonis

There is a central notion in Time Series Econometrics: cointegration. Loosely, it refers to finding the long-run equilibrium of two non-stationary series. As the best-known examples of non-stationary series come from finance, cointegration is nowadays a tool for traders (not a common one though!). They use it as the ... [Read more...]

The Future of Math is Statistics

April 9, 2010 | JD Long

The future of math is statistics… and the language of that future is R: I’ve often thought there was way too little “statistical intuition” in the workplace. I think Arthur Benjamin would agree. [Read more...]

An obscure integral

April 7, 2010 | xi'an

Here is an email from Thomas I received yesterday about a computation in our book Introducing Monte Carlo Methods with R: I’m currently reading your book “Introduction to Monte Carlo Methods with R” and I quite highly appreciate your work. I’m not able to see how the integral ... [Read more...]

Correlation scatter-plot matrix for ordered-categorical data

April 7, 2010 | Tal Galili

When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5). When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be ... [Read more...]

Matrix determinant with the Lapack routine dspsv

April 6, 2010 | Matt Shotwell

The Lapack routine dspsv solves the linear system of equations Ax=b, where A is a symmetric matrix in packed storage format. However, there appear to be no Lapack functions that compute the determinant of such a matrix. We need to compute the determinant, for instance, in order to compute ... [Read more...]

Le Monde rank test (corr’d)

April 6, 2010 | xi'an

Since my first representation of the rank statistic as paired was incorrect, here is the histogram produced by the simulation perm=sample(1:20); saple[t]=sum(abs(sort(perm[1:10])-sort(perm[11:20]))) when . It is obviously much closer to zero than previously. An interesting change is that the regression of the log-mean ... [Read more...]
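
The simulation quoted in the excerpt above can be sketched as a complete loop. This is a minimal sketch only: the number of replications (n_sim), the seed, and the vector name stat are assumptions made for illustration, since the excerpt does not give them.

```r
# Assumed settings (not stated in the excerpt): seed and number of replications.
set.seed(1)
n_sim <- 1000

# One value of the statistic per simulated permutation of 1:20.
stat <- numeric(n_sim)
for (t in seq_len(n_sim)) {
  perm <- sample(1:20)
  # Sum of absolute differences between the sorted first and second halves.
  stat[t] <- sum(abs(sort(perm[1:10]) - sort(perm[11:20])))
}

# Numerical summary; hist(stat) would draw the corresponding histogram.
print(summary(stat))
```

Because the two halves share no values, each of the ten differences is at least 1, so every simulated value lies between 10 and 100.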

Le Monde rank test (cont’d)

April 5, 2010 | xi'an

Following a comment from efrique pointing out that this statistic is called Spearman footrule, I want to clarify the notation in namely (a) that the ranks of and are considered for the whole sample, i.e. instead of being computed separately for the ‘s and the ‘s, and then (b) ... [Read more...]

Le Monde rank test

April 4, 2010 | xi'an

In the puzzle found in Le Monde of this weekend, the mathematical object behind the silly story is defined as a pseudo-Spearman rank correlation test statistic, where the difference between the ranks of the paired random variables and is in absolute value instead of being squared as in the Spearman ... [Read more...]

A free book on Geostatistical Mapping with R

April 2, 2010 | David Smith

Tomislav Hengl of the University of Amsterdam has published a new book, A Practical Guide to Geostatistical Mapping. It's jam-packed with 291 pages on mapping and analyzing spatial data using free software including R, SAGA, GRASS, ILWIS and Google Earth, and freely available map data. The book itself is also available for free, ... [Read more...]
