Toronto Data Science Group – A Survey of Data Visualization Techniques and Practice
[This article was first published on everyday analytics, and kindly contributed to R-bloggers].
Recently I spoke at the Toronto Data Science group. The folks at Mozilla were kind enough to record it and put it on Air, so here it is for your viewing pleasure (and critique):
Overall it was quite well received. Aside from the usual “omg, does my voice really sound like that??” (which is to be expected), here are a few thoughts on the business of giving presentations which were quite salient here:
- Talk slower and enunciate
- Gesture, but not too much
- Tailor sizing and colouring of visuals, depending on projection & audience size
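For the last point, one way to adjust sizing in base R is through `par()`. The following is only a minimal sketch: the specific values (`cex = 1.5`, `lwd = 2`, the margins) are assumptions for illustration rather than the settings used in the talk, and would need tuning for a given room and projector.

```r
# Sketch: enlarge text, symbols and lines for a projected slide.
# The specific values below are illustrative assumptions, not the talk's settings.
old_par <- par(cex = 1.5, lwd = 2, mar = c(5, 5, 2, 1))

set.seed(1)  # arbitrary seed, for reproducibility only
x <- rnorm(200)
y <- rnorm(200)

# Opaque black symbols on the default white background for maximum contrast
plot(x, y, pch = 16, col = "black", xlab = "x", ylab = "y")

par(old_par)  # restore the previous graphics settings
```

Saving the return value of `par()` and restoring it afterwards keeps the larger sizing from leaking into later plots in the same session.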
I’ve reproduced the code used to create the figures made in R (including the bubble chart example, with code and data from FlowingData), since I regrettably neglected to save the original at the time:
# Toronto Data Science Group Talk plots
# Myles Harrison
# http://www.everydayanalytics.ca/2014/02/toronto-data-science-group-talk.html
library(hexbin)
library(RColorBrewer)
# Create random data
x <- rnorm(5000, mean=1000)
y <- rnorm(5000, mean=2000)
# Scatterplot
plot(x, y, pch=16, col='black')
# Scatter with transparency
plot(x, y, pch=16, col=rgb(0,0,0,0.1))
# Smaller plotting symbols
plot(x, y, pch=16, col='black', cex=0.1)
# Simulate kernel density estimation with Euclidean distance to means
d <- sqrt((x-mean(x))^2 + (y-mean(y))^2)
# Select palette and normalize to fit in color range
palette(rainbow(32))
# Scale to 1..32 so the largest distance does not index past the 32-colour palette
d <- d/max(d)*31+1
# Plot
plot(x, y, pch=16, col=d)
# Hexbinning
h <- hexbin(x,y)
plot(h)
# With color
plot(h, colramp=BTY)
# DISTRIBUTION
# Regular histogram
hist(x, breaks=125, col='red', xlab='x', main='Histogram of x')
# Boxplot
x2 <- rnorm(1000, mean=1000)
boxplot(x2)
# Add jitter
stripchart(x2, vertical=TRUE, method="jitter", jitter=0.1, add=TRUE, pch=16,
           cex=0.1, col=rgb(0,0,0,0.5))
# Bubble chart demo
# Data from Flowing Data: http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/
crime <- read.csv("http://datasets.flowingdata.com/crimeRatesByState2005.tsv", header=TRUE, sep="\t")
# Radius from the square root of population, so circle area (not radius) scales with population
radius <- sqrt( crime$population/ pi )
plot(crime$murder, crime$burglary, col='red', pch=16, ylab='Burglary Rate', xlab='Murder Rate', xlim=c(0, 10), ylim=c(150, 1250))
symbols(crime$murder, crime$burglary, circles=radius, inches=0.35, fg="black", bg="red", xlab="Murder Rate", ylab="Burglary Rate", xlim=c(0, 10), ylim=c(150, 1250))
# Category-like colouring
palette(colorRampPalette(c("red", "green", "darkgreen"))(3))
# Random category index (1 to 3) per row; named col_idx to avoid masking base c()
col_idx <- round(runif(nrow(crime))*2)+1
symbols(crime$murder, crime$burglary, circles=radius, inches=0.35, fg="black", bg=col_idx, xlab="Murder Rate", ylab="Burglary Rate", xlim=c(0, 10), ylim=c(150, 1250))
# Quantitative-like colouring
palette(colorRampPalette(c("lightblue", "yellow", "red"), interpolate='spline')(11))
# Scale population to a palette index between 1 and 11
col_idx <- crime$population/max(crime$population)*10+1
symbols(crime$murder, crime$burglary, circles=radius, inches=0.35, fg="black", bg=col_idx, xlab="Murder Rate", ylab="Burglary Rate", xlim=c(0, 10), ylim=c(150, 1250))
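The listing above only simulates a kernel density estimate, by colouring each point according to its distance from the means. For comparison, here is a hedged sketch of colouring by an actual 2D kernel density estimate using `densCols()` from base R's grDevices (which relies on the recommended KernSmooth package being installed). This was not part of the talk; the seed, colour ramp and symbol size are assumptions chosen for illustration.

```r
# Sketch: colour points by an actual 2D kernel density estimate.
# densCols() lives in grDevices and needs the recommended KernSmooth package.
set.seed(1)  # arbitrary seed, for reproducibility only
x <- rnorm(5000, mean = 1000)
y <- rnorm(5000, mean = 2000)

# colorRampPalette() returns a function of n, which is what colramp expects
ramp <- colorRampPalette(c("lightblue", "yellow", "red"))

# One colour per point, chosen from the ramp according to local density
dens_cols <- densCols(x, y, colramp = ramp)

plot(x, y, pch = 16, cex = 0.5, col = dens_cols)

# smoothScatter() plots a smoothed density of the same data directly
smoothScatter(x, y)
```

Unlike the distance-to-means approach, this does not assume the data form a single round cluster, so it should also behave sensibly for skewed or multi-modal data.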
Lessons learned: talk slower, always save your code, and Google stuff before starting – because somebody’s probably already done it before you.