Correctly interpreting predictive models can be tricky. One solution to this problem is to create interactive simulators, where users can manipulate the predictor variables and see...
Back in the “olden days” of the 1970s it was apparently not unknown for statisticians to create 3D visualizations using tinkertoys. For some inexplicable reason, the advent...
Lots of different visualizations have been proposed for understanding distributions. In this post, I am going to show how to create my current favorite, which is...
Interactive infographics can communicate a lot of information in an engaging way. With the right tools, they are also relatively straightforward to create. In this post, I show step-by-step how to...
Infographics, dashboards, and reports often need to highlight or visualize a single number. But how do you highlight a single number so that it has an impact and looks good? It can be a big...
Using multiple visual elements to represent one variable in a chart can increase accuracy and improve readability. This is called adding redundancy or redundant encoding and, if done right, it will improve the chances of a reader...
Creating a meaningful visualization from data with long lists can be challenging. While word clouds are often the popular choice, they are not always the best option. This post illustrates seven alternatives to word clouds...
R functions can be used to create chart templates, which keep the look and feel of reports consistent. This post gives a step-by-step guide to creating chart templates using R functions. R...
In my sordid past, I was a data science consultant. One thing about data science that they don’t teach you at school is that senior managers in most large companies require reports to be in PowerPoint....
If you have tried to communicate research results and data visualizations using R, there is a good chance you will have come across one of its great limitations. R is painful when you need to...
What is a choice simulator? A choice simulator is an online app or an Excel workbook that allows users to specify different scenarios and get predictions. Here is an example of a choice simulator. Choice simulators have...
Correspondence analysis is a popular tool for visualizing the patterns in large tables. To many practitioners it is probably a black box. Table goes in, chart comes out. In this post I explain the mathematics...
If you have ever looked in any depth at statistical computing for multivariate analysis, there is a good chance you have come across the singular value decomposition (SVD). It is a workhorse for techniques that decompose data, such as correspondence analysis and principal...
In this post I explore two different methods for computing the relative importance of predictors in regression: Johnson’s Relative Weights and Partial Least Squares (PLS) regression. Both techniques solve a problem with Multiple Linear Regression, which can perform poorly when there are correlations...