Can Money Really Buy Happiness? Or How to Lie with Statistics in Science

Learning Machines

[This article was first published on R-Bloggers – Learning Machines, and kindly contributed to R-bloggers].

It’s a widely accepted notion that money influences happiness, a concept famously associated with Nobel laureate Daniel Kahneman, who purportedly demonstrated that emotional wellbeing increases with income but plateaus beyond an annual threshold of about $75,000.

This idea has permeated both academic circles and popular media, reinforcing the belief that there’s a direct correlation between financial prosperity and happiness. But how accurate is this belief when we scrutinize the data more closely? To find out, read on!

Recent research (indeed the last paper ever published by Kahneman!) attempts to delve deeper into this relationship, suggesting that the connection between income and happiness is real and relevant. However, a closer examination of the publicly available data tells a different story.

My own analysis reveals a Pearson correlation coefficient of just 0.07 between wellbeing and income, indicating a very weak relationship despite its statistical significance:

data <- read.csv("Data/Income_and_emotional_wellbeing_a_conflict_resolved.csv") # modify path accordingly
data$income_factor <- factor(data$income, levels = sort(unique(data$income)))

cor.test(data$wellbeing, data$income)
## 
##  Pearson's product-moment correlation
## 
## data:  data$wellbeing and data$income
## t = 13.293, df = 33389, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.06187719 0.08321620
## sample estimates:
##      cor 
## 0.072555
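
One way to put this coefficient into perspective is to square it, which gives the share of the variance in wellbeing that a linear relationship with income accounts for. The following is a small illustrative calculation on the value printed above, not part of the original analysis; the variable names are made up for the example.

```r
# Correlation coefficient copied from the cor.test() output above
r <- 0.072555

# Coefficient of determination: share of variance accounted for linearly
r_squared <- r^2

cat(sprintf("r^2 = %.4f (about %.1f%% of the variance)\n",
            r_squared, 100 * r_squared))
```

In other words, roughly half a percent of the variation in wellbeing lines up with income, which leaves more than 99% to everything else.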

One of the more subtle lessons of academic research is that a relationship between two variables can be statistically highly significant while being useless in practice because the effect is so minuscule. Paradoxically, the more data points you have, the higher the chances that you will find something statistically significant that has no practical significance.
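
The mechanics behind this can be sketched with the standard test statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2): for a fixed r, t grows with the square root of the sample size. The sketch below holds r at the value reported above and varies n. Only n = 33391 (df + 2) comes from the output above; the smaller sample sizes and the helper names `cor_t` and `cor_p` are hypothetical and serve only as illustration.

```r
# t statistic and two-sided p-value for a Pearson correlation r on n pairs
cor_t <- function(r, n) r * sqrt(n - 2) / sqrt(1 - r^2)
cor_p <- function(r, n) 2 * pt(-abs(cor_t(r, n)), df = n - 2)

r <- 0.072555                     # from the cor.test() output above
n <- c(100, 1000, 10000, 33391)   # 33391 = df + 2; the others are hypothetical

result <- data.frame(n = n, t = cor_t(r, n), p_value = cor_p(r, n))
print(result)
```

With the full sample size, the formula reproduces the t value of about 13.29 shown earlier, whereas the very same correlation measured on only 100 observations would be far from significant. The effect size is identical in every row; only the amount of data changes.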

In this case, we have more than 33,300 data points, and while there is a tiny increase in happiness with greater income, the effect is so slight that its real-world implications are negligible. Indeed, the difference between the medians of happiness at household incomes of $15,000 and $250,000 is only about five points on a 100-point scale!

To put the observed effect into a more relatable context, consider this: the difference in happiness resulting from an approximately fourfold difference in income is roughly equivalent to the happiness boost one might feel over a typical weekend! This comparison starkly illustrates the insignificance of income effects relative to everyday life experiences.

Yet this research manages to persuade us that there is indeed something substantial going on. How is this achieved in such studies? The devil is in the details, or in this case, the methodology. Three statistical choices in such studies stand out as particularly problematic: the logarithmic transformation of income, the use of z-scores and the use of averages without referring to dispersion measures for wellbeing:

Logarithmic transformation: Plotting wellbeing against the logarithm of income makes the relationship look linear, although on the original dollar scale it is not. The transformation masks the reality of diminishing returns, where increases in income result in progressively smaller gains in happiness. Log transformations are often applied in practice to reduce skewness, but they can present a distorted view of the underlying data. Apart from that, how should one interpret “log income” anyway?
Z-scores: The application of z-scores is another area where the graphical representation can be misleading. Z-scores standardize data points and effectively cut off the y-axis of the original data, which can visually exaggerate minor differences. When we depict wellbeing scores on the complete 0-100 scale, the supposed effect of income on happiness nearly vanishes, revealing a much less compelling story.
Averages without dispersion measures: While reporting medians (or means) is not problematic per se, reporting them alone hides the dispersion of the data, which could be indicated by interquartile ranges (IQR), standard deviations, variances, or confidence intervals. Especially when the data are extremely dispersed, as wellbeing is in this case, it is hard to interpret results and draw meaningful conclusions without that context.
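
How much aggregation alone can change the picture may be illustrated with simulated data. The following is a minimal sketch under stated assumptions, not the study data: the income brackets, the sample size, the seed, and the data-generating process (a small log-income effect buried in large individual noise) are all made up for the example.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical income brackets and a simulated sample (not the study data)
income_levels <- c(15000, 25000, 35000, 45000, 55000, 65000, 75000, 85000,
                   95000, 112500, 137500, 175000, 250000, 400000, 625000)
n <- 30000
income <- sample(income_levels, n, replace = TRUE)

# Assumed data-generating process: small log-income effect plus large noise
wellbeing <- 50 + 1.3 * log(income) + rnorm(n, sd = 15)
wellbeing <- pmin(100, pmax(0, wellbeing))  # keep values on a 0-100 scale

sim <- data.frame(income = income, log_income = log(income),
                  wellbeing = wellbeing)

# Correlation across individuals
cor_raw <- cor(sim$wellbeing, sim$log_income)

# Correlation across bracket medians, i.e. after aggregating away the noise
med <- aggregate(wellbeing ~ log_income, data = sim, FUN = median)
cor_agg <- cor(med$wellbeing, med$log_income)

cat(sprintf("individual level: r = %.2f\n", cor_raw))
cat(sprintf("bracket medians:  r = %.2f\n", cor_agg))
```

In a setup like this, the individual-level correlation should come out small, while the correlation across the fifteen bracket medians should be much larger, simply because taking medians discards the dispersion within each bracket.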

The plots I’ve created from the original data starkly illustrate these points. I often start my own data analyses with a scatter plot, but in this case, I first thought that I had made a mistake or got the wrong data:

plot(data$wellbeing ~ data$income,
     main = "Scatterplot of Wellbeing Across Income Levels",
     xlab = "Income", ylab = "Wellbeing")
grid()

This plot shows a dense cluster of data points that scatter broadly across the graph, displaying no apparent trend or meaningful pattern linking income to wellbeing.

But worry not, by making use of the three statistical techniques from above, it is quite easy to create plots like the ones shown in the pertinent literature:

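# Median of z-scored wellbeing per log-income level; scale() returns a
# one-column matrix, so aggregate() names the resulting column V1
mean_well_being_zscore <- aggregate(scale(wellbeing) ~ log_income, data = data, median)
plot(mean_well_being_zscore,
     main = "Median of Z-Scored Wellbeing Across Log-Income Levels",
     xlab = "Log income", ylab = "Wellbeing (z-score)",
     pch = 16)
grid()
# Fit a straight line through the aggregated medians
LinReg <- lm(V1 ~ log_income, data = mean_well_being_zscore)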
LinReg
## 
## Call:
## lm(formula = V1 ~ log_income, data = mean_well_being_zscore)
## 
## Coefficients:
## (Intercept)   log_income  
##    -1.08108      0.09396

abline(LinReg)

Mirroring Figure 1B from the above paper, this plot suggests a clear relationship between rising levels of income and resulting wellbeing. Alas, upon closer scrutiny, this proves to be more a product of clever statistical handling than any relevant effect.
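
To get a feeling for what the fitted slope means, it can be translated into the effect of multiplying income by a given factor. The sketch below is illustrative arithmetic on the coefficient printed above. It assumes that `log_income` is the natural logarithm of income, which the output itself does not state; the helper `effect_of_factor` is hypothetical.

```r
# Slope copied from the lm() output above:
# z-score units of wellbeing per unit of log income
slope <- 0.09396

# Assumption: log_income is the natural logarithm of income.
# Multiplying income by a factor k then shifts log income by log(k).
effect_of_factor <- function(k) slope * log(k)

effect_doubling    <- effect_of_factor(2)
effect_quadrupling <- effect_of_factor(4)

cat(sprintf("doubling income:    %.3f SD\n", effect_doubling))
cat(sprintf("quadrupling income: %.3f SD\n", effect_quadrupling))
```

Under that assumption, doubling income corresponds to roughly 0.065 standard deviations of wellbeing and quadrupling it to about 0.13, which is a small shift compared with the spread between individuals.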

Now let us have a look at some less sophisticated plots to see what is really going on… or better, “isn’t going on”. We recreate the same plot but this time without the log transformation of income and without z-scoring the wellbeing values:

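# Median wellbeing per income level (despite the "mean" in the name,
# the statistic used here is the median)
mean_well_being <- aggregate(wellbeing ~ income, data = data, median)

# Version 1: default y-axis, which spans only the range of the medians
plot(mean_well_being,
     main = "Median of Wellbeing Across Income Levels",
     xlab = "Income", ylab = "Wellbeing",
     pch = 16)
grid()

# Version 2: same data on the full 0-100 wellbeing scale
plot(mean_well_being,
     main = "Median of Wellbeing Across Income Levels",
     xlab = "Income", ylab = "Wellbeing",
     ylim = c(0, 100),
     pch = 16)
grid()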

The first version of this plot truncates the y-axis to the narrow range of the medians, which suggests a strong, meaningful relationship; the second version, showing the full axis, demonstrates the near absence of any real effect. Here the manipulation of the y-axis becomes more apparent because we can now see the real wellbeing values instead of the hard-to-interpret z-scored ones.

And these plots do not deceive the eye: for example, nearly doubling one’s income from $35,000 to $65,000 does not even produce a statistically significant difference in the level of wellbeing:

t.test(data$wellbeing[data$income == 35000], data$wellbeing[data$income == 65000])
## 
##  Welch Two Sample t-test
## 
## data:  data$wellbeing[data$income == 35000] and data$wellbeing[data$income == 65000]
## t = -1.8237, df = 4923.9, p-value = 0.06825
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.32549954  0.04789018
## sample estimates:
## mean of x mean of y 
##  62.37602  63.01482

Or even more extreme, the effect of more than quadrupling one’s income from $137,500 to $625,000 isn’t statistically significant either:

t.test(data$wellbeing[data$income == 137500], data$wellbeing[data$income == 625000])
## 
##  Welch Two Sample t-test
## 
## data:  data$wellbeing[data$income == 137500] and data$wellbeing[data$income == 625000]
## t = -1.6919, df = 533.15, p-value = 0.09125
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.4851637  0.1852509
## sample estimates:
## mean of x mean of y 
##  64.19116  65.34111
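
For orientation, the two comparisons can also be expressed in raw points on the 100-point wellbeing scale. The following is a small sketch that only reuses the group means printed in the two outputs above; the variable names are made up for the example.

```r
# Group means copied from the two t.test() outputs above
means_35k_65k   <- c(62.37602, 63.01482)
means_137k_625k <- c(64.19116, 65.34111)

diff_1 <- diff(means_35k_65k)     # $35,000 vs. $65,000
diff_2 <- diff(means_137k_625k)   # $137,500 vs. $625,000

ratio_1 <- 65000 / 35000          # income factor of the first comparison
ratio_2 <- 625000 / 137500        # income factor of the second comparison

cat(sprintf("x%.2f income: %.2f points\n", ratio_1, diff_1))
cat(sprintf("x%.2f income: %.2f points\n", ratio_2, diff_2))
```

Both differences in means are close to one point on a scale that runs from 0 to 100.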

In the last chart, we now also add a measure of dispersion in the form of boxplots. Here it becomes even clearer that across income levels, wellbeing hardly changes and is extremely dispersed at that:

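# income_factor already has its levels in ascending order; wrapping it in
# sort() would reorder its values and break the pairing with wellbeing
boxplot(wellbeing ~ income_factor, data = data,
        main = "Boxplot of Wellbeing Across Income Levels",
        xlab = "Income", ylab = "Wellbeing",
        col = rainbow(length(unique(data$income))))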

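
The same information can also be tabulated. Below is a minimal sketch of a helper that reports the median together with the interquartile range per income level. The function name `dispersion_by_income` and the toy data are hypothetical; the helper only assumes a data frame with numeric columns `income` and `wellbeing`, like the one used throughout this post.

```r
# Median and interquartile range of wellbeing per income level.
# Expects a data frame with numeric columns `income` and `wellbeing`.
dispersion_by_income <- function(df) {
  med <- tapply(df$wellbeing, df$income, median)
  iqr <- tapply(df$wellbeing, df$income, IQR)
  data.frame(income = as.numeric(names(med)),
             median = as.vector(med),
             iqr    = as.vector(iqr))
}

# Tiny made-up example (not the study data)
toy <- data.frame(income    = rep(c(35000, 65000), each = 5),
                  wellbeing = c(40, 55, 60, 70, 90, 45, 55, 62, 72, 88))
summary_table <- dispersion_by_income(toy)
print(summary_table)
```

In the toy data the medians differ by two points while the interquartile ranges are 15 and 17 points wide, so the spread within each group dwarfs the gap between them. On the real data, `dispersion_by_income(data)` should give the corresponding table.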
To be fair, the authors do briefly address some of these criticisms, but those discussions are buried deep within the paper and serve mainly to downplay their significance. It’s important to remember that “lying with statistics” doesn’t necessarily involve outright falsehoods; rather, it involves presenting results in a way that suggests misleading conclusions or exaggerates irrelevant findings.

As a side note, the possibility of reverse causality, where inherently happier individuals might earn more, should also be considered. A related possibility is that personal disposition (what self-proclaimed “life-coaches” call “mindset” nowadays!) drives both happiness and higher income. It would be interesting to see the results with the roles of the two variables swapped, i.e. with wellbeing as the predictor and income as the outcome. Moreover, it would be insightful to examine how changes in a person’s income over time affect their wellbeing, as the current research only compares the wellbeing of different individuals at their existing income levels.

To conclude, this critique is not just about debunking a popular myth; it’s a call for greater integrity and clarity in how statistical research is conducted and reported. I would be very interested in your feedback and in whether you have encountered similar overstatements in other research.
