Articles by John Mount

Best Before Dates by Bass

March 2, 2025 | John Mount

I was searching for one last real-world example for my upcoming video talk on time series forecasting on March 13th. Hope to see you there! Or reach out to Win Vector LLC for custom training! I had the seemingly harmless thought: “Let’s look at Stack Overflow trends”. In particular ...

Reposting Partial Pooling

January 8, 2025 | John Mount

Nina Zumel had some good articles on partial pooling estimators that I want to return to. It is a great technique to get more reliable models when using categorical variables. I wrote an introduction to them here some time ago. More importantly Nina has now repaired the damage to the […]

Examining Meta-Analysis

November 27, 2024 | John Mount

Joseph Rickert and I put together an experiment in which we run a standard meta-analysis and then try to reproduce similar results directly using Bayesian methods. I think it came out really interesting and we share it here at R Works and also here on GitHub. A meta-analysis is an attempt to […]

Examining Meta-Analysis

November 25, 2024 | John Mount

In this post, we would like to review the idea of meta-analysis and compare a traditional, frequentist-style, random-effects meta-analysis to Bayesian methods. We will do this using the meta R package and a Bayesian analysis conducted with R but...

100 Bushels of Corn, Revisited

November 21, 2024 | John Mount

About the authors: John Mount is a data scientist based in San Francisco, with 20+ years of experience in machine learning, statistics, and analytics. He is the co-founder of the data science consulting firm Win-Vector LLC, and (with Nin...

Calculating at Pencil and Paper Scale

November 6, 2024 | John Mount

Introduction: It can be fun to drive a problem all the way into the ground. I don’t always get to do that on paying projects; however, sometimes I can do it with hobby projects. In this case I am going to re-solve Dudeney’s Remainder Problem again and again ...

Dyson’s Algorithm for the Twelve Coins Problem

October 24, 2024 | John Mount

Nina continues with the 12 coins problem by transcribing Dyson’s algorithm into R. It is kind of a fun article. Most of us see the 12 coins problem as a one-off puzzle that we spend a little time with and give up on. In her earlier “The Twelve Coins Puzzle” […]

Dudeney’s Remainder Problem

October 6, 2024 | John Mount

The remainder problem: The description of this puzzle really cracks me up (Dudeney, Strand Magazine, January 1924). Health risks aside, how do we find the maximal integer d such that (480608 % d) = (508811 % d) = (723217 % d)? The solution: A good puzzle strategy is to try […]
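The condition in this excerpt can be checked by machine. What follows is a minimal sketch and not necessarily the approach the article takes: if several integers leave the same remainder on division by d, then d divides each of their differences, so the largest such d is the greatest common divisor of those differences. The helper name `max_common_remainder_divisor` is made up for this illustration.

```python
from math import gcd


def max_common_remainder_divisor(values):
    """Largest d for which every value leaves the same remainder mod d.

    Hypothetical helper for illustration only.
    Assumes at least two distinct integers are supplied.
    """
    base = values[0]
    d = 0
    for v in values[1:]:
        # d must divide every difference from the first value;
        # gcd(0, x) == x, so the first pass just records the difference.
        d = gcd(d, abs(v - base))
    return d


numbers = [480608, 508811, 723217]
d = max_common_remainder_divisor(numbers)
remainders = [n % d for n in numbers]
print(d, remainders)  # prints: 79 [51, 51, 51]
```

Any larger d would have to divide the same differences, which is impossible once d is already their greatest common divisor.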

An Easy Puzzle: The Perplexed Banker

October 4, 2024 | John Mount

Nina Zumel continues with the puzzles. This one is “The Perplexed Banker”. In my opinion, this one captures the essence of the “mathematical” aspect of a puzzle. For a mathematical puzzle one often hopes there is a systematic method that makes the puzzle easy. In this case there is indeed […]

The 100 Bushels Puzzle

September 26, 2024 | John Mount

Nina Zumel shares the following puzzle from the December 1908 issue of The Strand Magazine: 100 bushels of corn are distributed to 100 people such that every man receives 3 bushels, every woman 2 bushels, and every child 1/2 a bushel. How many men, women, and children are there? Check […]
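The puzzle as quoted is small enough to enumerate directly. The sketch below is a brute-force check under stated assumptions, not the method of the linked article: counts are non-negative integers, zero counts are allowed (the puzzle may not intend that), and the function name `bushel_solutions` is invented for this example.

```python
def bushel_solutions(people=100, bushels=100):
    """Enumerate (men, women, children) matching both puzzle totals.

    Hypothetical helper for illustration; allows zero counts.
    """
    solutions = []
    for men in range(people + 1):
        for women in range(people - men + 1):
            children = people - men - women
            # Double every ration to stay in integers: 3 -> 6, 2 -> 4, 1/2 -> 1.
            if 6 * men + 4 * women + children == 2 * bushels:
                solutions.append((men, women, children))
    return solutions


for men, women, children in bushel_solutions():
    print(men, women, children)
```

The two totals alone do not pin down a single answer, so the sketch lists every candidate rather than picking one.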

Please Version Data

September 9, 2024 | John Mount

Introduction: An important goal of our Win Vector LLC teaching offerings is to instill in engineers some familiarity with, and empathy for, how data is likely to be used for analytics and business. Having such engineers in your organization greatly increases the quality of the data later available to your […]

Pulling a Loose Thread on Pull()

August 1, 2024 | John Mount

Richard Layton recently shared a neat article: A subtle flaw in pull(). This is the usual loss of reliable programmable semantics just to avoid a few quote marks (at the cost of many more force eval and paste marks). It is well considered and well writ...

ARMAX Offerings Remain a Muddle

July 15, 2024 | John Mount

Introduction: To borrow a term from Hyndman, the following is going to come out as a bit of a muddle. However, that may be the honest way to survey a muddled situation. I am going to write on time series problems and solvers, try to run some details to ground, […]

What Good is Analysis of Variance?

February 28, 2024 | John Mount

Introduction: I’d like to demonstrate what “analysis of variance” (often abbreviated as “anova” or “aov”) does for you as a data scientist or analyst. After reading this note you should be able to determine how an analysis of variance style calculation can or cannot help with your project. (...

Omitted Variable Effects in Logistic Regression

August 18, 2023 | John Mount

Introduction: I would like to illustrate a way in which omitted variables interfere in logistic regression inference (or coefficient estimation). These effects are different than what is seen in linear regression, and possibly different than some expectations or intuitions. Our Example Data: Let’s start with a data example in R. # […]

A Time Series Apologia

May 7, 2023 | John Mount

I would like to share a new article on some of the methods and pitfalls of time series forecasting: “A Time Series Apologia”. In it I work the seemingly simple problem of forecasting a noisy copy of sin(t). The purpose of the article is to demonstrate using ARIMA methods, ...