The anatomy of a useful chart: NOAA’s flood forecasts
With thanks to NOAA's incredible data gathering and forecasting activities, I've been obsessed with this chart for the past few days:
We used to live near the Napa river where this river gage is located, and still have many friends in the area. We were in the area last weekend, when a “pineapple express” weather event brought an atmospheric river over much of California, with much rain and some flooding in low-lying areas. This was just before the first peak in the chart above, which shows the water level in the Napa river (in blue) along with a NOAA forecast (in purple). I was checking this chart obsessively, as the observed water level approached the “Major Flood” level, and experienced alternate bouts of hope and fear as the forecast skirted above the line from time to time.
Relying on this chart so intently made me appreciate what it takes to make a useful chart, so let's look at the ways this particular chart stands out. (While NOAA does use R for some hydrological charts, I don't think R was used for this one.)
The chart is updated frequently, and the most recent data point is highlighted. New river levels were posted every 15 minutes, and as the river approached its crest, knowing how recent the data were was critical.
A forecast is provided. The purple dots are based on a hydrological forecast, which includes information from upstream gages, weather forecasts, and the river formation around this particular location. This was an incredibly useful tool during the flood threat. However, the forecast is only updated every few hours, so having the recency of the forecast on the chart was incredibly helpful.
Context is provided for the measurements and forecast. I hadn't really paid much mind to the river level before: most of the time it's not much more than a minor stream. But knowing what river levels represented minor, moderate or major flooding (with their detailed definitions) was important. (As you can see, the river just avoided the major flooding stage on Sunday, and indeed the local town stayed mostly dry. Some vineyards were flooded, though.)
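To make the idea of flood-stage context concrete, here is a minimal Python sketch that labels a river level with a category. The threshold values and the names `THRESHOLDS` and `flood_category` are hypothetical, chosen only for illustration; they are not the actual flood stages for this gage.

```python
from bisect import bisect_right

# Hypothetical thresholds in feet (NOT the real values for the Napa gage).
THRESHOLDS = [(20.0, "minor"), (24.0, "moderate"), (28.0, "major")]


def flood_category(stage_ft):
    """Return the highest flood category whose threshold the stage has reached."""
    levels = [level for level, _ in THRESHOLDS]
    names = ["below flood stage"] + [name for _, name in THRESHOLDS]
    # bisect_right counts how many thresholds are <= stage_ft,
    # so a reading exactly at a threshold is treated as having reached it.
    return names[bisect_right(levels, stage_ft)]


for reading in (12.5, 21.0, 27.9, 28.0):
    print(reading, flood_category(reading))
```

With these made-up thresholds, a reading of 27.9 ft is still "moderate", the same kind of near miss the chart showed on Sunday.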
Time zones are provided with times. There's nothing more frustrating than looking at a date or time and not knowing what time zone the data are provided in. This chart includes both the local time zone (PST) for the main axis and annotations and, on the top axis, Coordinated Universal Time. (17Z refers to 5PM Zulu time, which is 8 hours ahead of PST, so 17Z is 9AM PST.)
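The Zulu-to-PST conversion is easy to check in code. The sketch below uses only Python's standard library and a fixed UTC-8 offset, so it deliberately ignores daylight saving time. The date is arbitrary; only the 17Z hour comes from the chart. The helper name `zulu_to_pst` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zone for Pacific Standard Time (UTC-8); no daylight saving handling.
PST = timezone(timedelta(hours=-8), "PST")


def zulu_to_pst(utc_time):
    """Convert a timezone-aware UTC ("Zulu") datetime to PST."""
    return utc_time.astimezone(PST)


# Arbitrary illustrative winter date; only the hour (17Z) is taken from the chart.
reading = datetime(2020, 1, 15, 17, 0, tzinfo=timezone.utc)
local = zulu_to_pst(reading)
print(local.isoformat())
```

Keeping the timestamp timezone-aware from the start is the code equivalent of what the chart does by labelling both axes.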
The second Y axis. Having a second Y axis on a chart is rarely a good idea, but this is one of the examples where it's useful. The river flow is directly (but nonlinearly) related to river height, so presenting it here on the Y axis is useful for those who need it. (This is actually the value, not river height, used as input to the forecasts.) But while bridge engineers care about river flow, most people are more concerned about the height, which is given top billing on the main Y axis. Bonus credit: units are provided for both axes: always a must-have, but lamentably often forgotten.
Annotations are provided for context. Having the recent and forecast peak heights and the record flood height included on the chart provided context for the severity of the current flood threat, especially if you had experienced prior flood events in the area.
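To illustrate what "directly but nonlinearly related" can look like, here is a small Python sketch of a power-law rating curve, one common way hydrologists relate stage (height) to flow. The functional form is used here as an assumption, and the parameters `a`, `h0` and `b` are invented for the example; this is not NOAA's actual curve for this gage.

```python
def flow_from_stage(stage_ft, a=40.0, h0=2.0, b=2.0):
    """Estimate flow (cubic feet per second) from stage (feet).

    Uses an assumed power-law rating curve: Q = a * (h - h0) ** b.
    a, h0 and b are hypothetical illustration values, not fitted to any gage.
    """
    if stage_ft <= h0:
        # At or below the assumed zero-flow height there is no flow.
        return 0.0
    return a * (stage_ft - h0) ** b


low = flow_from_stage(12.0)   # 10 ft above the zero-flow height
high = flow_from_stage(22.0)  # 20 ft above the zero-flow height
print(low, high, high / low)
```

With an exponent of 2, doubling the height above the zero-flow level quadruples the flow, which is why the second axis cannot simply be a rescaled copy of the first.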
The chart is in PNG format. That's a good choice for a chart like this: it's a lossless format, which means the data points appear in perfect fidelity. It's also a fairly compact format that keeps image sizes small — important for a website that may experience a lot of traffic from many people constantly refreshing the report. (Thoughtfully, an auto-refresh option was also provided.) JPG, a lossy format that blurs small data points and straight lines, would have been a terrible choice here.
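The "lossless and compact" claim can be sketched with Python's standard `zlib` module, which implements DEFLATE, the compression method PNG uses. This is only a rough stand-in: a real PNG encoder also applies per-row filters and wraps the data in chunks, and the tiny "chart" below (a white grayscale image with one dark line) is synthetic.

```python
import zlib

WIDTH, HEIGHT = 200, 100
WHITE, DARK = 255, 40

# Build a synthetic 8-bit grayscale "chart": white background, one dark horizontal line.
rows = []
for y in range(HEIGHT):
    value = DARK if y == HEIGHT // 2 else WHITE
    rows.append(bytes([value]) * WIDTH)
pixels = b"".join(rows)

compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)

print(len(pixels), len(compressed))
print(restored == pixels)  # lossless: every pixel survives the round trip
```

Large flat areas, typical of charts, compress very well, and decompression returns the original bytes exactly, which is the property a lossy format like JPG gives up.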
That's not to say this chart gets everything right. More resolution would have been helpful (especially when trying to compare the last data point to the prior: is the river level rising or falling?). The color key for the flood stages is far from the chart on the webpage, rather than being included on the chart itself. The NOAA logo is a bit intrusive (though I understand why it's there). And in general the styling could use an update (pseudo-3D chrome is so last century). But this chart gets many more things right than it gets wrong, and provides a useful lesson in presenting data graphically that people can actually use.
As you can see from the chart, another flood-level crest is now heading down the river. As things stand now, it doesn't seem like it's going to be as severe as the one on Sunday, but to everyone in the affected areas: good luck, take care, and give thanks to NOAA for keeping us all informed.