Site icon R-bloggers

Improved evolution of correlations

[This article was first published on Nicebread » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Update June 2013: A systematic analysis of the topic has been published:
Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47, 609-612. doi:10.1016/j.jrp.2013.05.009

Check also the supplementary website, where you can find the PDF of the paper.

As an update of this post: here’s an improved version of “The evolution of correlations”.

From the original post:
“This is the evolution of a bivariate correlation between two questionnaire scales, “hope of power” and “fear of losing control”. Both scales were administered in an open online study. The video shows how the correlation evolves from r = .69*** (n=20) to r = .26*** (n=271). It does not stabilize until n = 150.

Data has not been rearranged – it is the random order how participants dropped into the study. This had been a rather extreme case of an unstable correlation – other scales in this study were stable right from the beginning. Maybe this video could help as an anecdotal caveat for a careful interpretation of correlations with small n’s (and with ‘small’ I mean n < 100) …”

The right panel now displays the correlation in each step. The horizontal green line is the final correlation that is approached, the curved dotted line shows the marginal correlation that would be significant at that sample size. As the empirical curve always is above this dotted line, it is significantly different from zero in each step.

Evolution of correlations – improved from Felix Schönbrodt on Vimeo.

 

Here the code that created the movie. It’s not fully self-contained – the function plotReg plots the dual-panel display, dat0, A, and B are parameters passed to this function. You can insert any other function here. The function loops through the rows of a data frame and saves a plot at every step into a subfolder. Finally, the function needs the command line version of ffmpeg, which connects the pictures to a movie.

?View Code RSPLUS
makeMovie <- function(fname, dat0, A, B, fps=15) {
 
	# create a new directory for the pictures
	dir.create(fname)
 
	# create the picture sequence
	picName <- paste(fname, "/", fname, "_%03d.jpg", sep="")
	jpeg(picName, width=800, height=450, quality=95)
	for (i in 15:nrow(dat0)) {
	  print(i)
	  plotReg(A, B, i, keep=15)
	}
	dev.off()
 
	# delete any existing movie file
	unlink(paste(fname,".mpg",sep=""))
 
	# point system to R's working directory
	system(paste("cd ", gsub(" ", "\\ ", getwd(), fixed=TRUE)))
 
	# show & execute the command line expression for ffmpeg to glue the pictures together
	print(paste(paste0("ffmpeg -r ", fps, " -i ", fname, "/", fname, "_%03d.jpg -sameq -r 25 ",  paste0(fname,".avi"))))
	system(paste(paste0("ffmpeg -r ", fps, " -i ", fname, "/", fname, "_%03d.jpg -sameq -r 25 ",  paste0(fname,".avi"))))
}

To leave a comment for the author, please follow the link and comment on their blog: Nicebread » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.