The Planetary Health Check 2025 updates the status of Johan Rockstrom’s “planetary boundaries” using new data. I’ve pointed out before that the indicators chosen are a sort of muddle of stocks and flows from a systems perspective, but nonetheless I think it is a good attempt at scientific communication. It distills complex underlying data into a set of indicators ordinary busy/distracted (I try to avoid words like “ignorant”, but the result is the same whether we can’t or won’t educate ourselves) people can understand. I still like the “ecological footprint” concept personally because it is a single system-based metric, and you can drill down a level into its individual components if you want to. Nonetheless, it seems to be out of fashion and replaced by this. Anyway, 6 of the 9 planetary boundaries were already outside the “safe space”, and this time ocean acidification is added to the list. Only stratospheric ozone and atmospheric aerosol loading are in the safe space, and paradoxically staying in the safe space on the latter exacerbates global warming somewhat, since aerosols have a net cooling effect. There is some nuance, with indicators like nitrogen pollution and biodiversity loss in the “high risk zone” and others like land use and carbon dioxide in the “increasing risk zone” and headed in the wrong direction.
Tag Archives: visualization
the ggplot2 “ecosystem”
In the beginning there was R. Or, S? As I understand it, R itself rests largely on a foundation of C and Fortran. Anyway, then there was the tidyverse, sort of another whole programming language that rests on R (or a metastasizing cancer that has grown to dominate R, if you ask certain people, but I personally am a big fan). Within the tidyverse there was always ggplot2, which I have grown to rely on almost exclusively for plotting. Now ggplot2 itself has grown into an “ecosystem” of related programs and extensions. Here is a useful guide. I’ve always been interested in finding the really good ones for things like interactive charts (plotly) and animations (gganimate). And awesome as ggplot2 is, there are some things that are just clunky, like scales and legends (seriously, legends are a big pain point for me – I hope there is an extension out there that really streamlines legends). But I am also wary of using extensions that might be buggy or not updated/supported long term, which could make my code obsolete sooner. So I usually try to do things with ggplot2 proper first, and if that doesn’t work with a reasonable effort I will try one of the extensions. So this guide seems timely and useful.
most popular R books of 2023
Here is something useful (to me, personally, and maybe to others), and thankfully not too pessimistic or morally fraught.
A Crash Course in Geographic Information Systems (GIS) using R – yes, please! We must end the tyranny of the monopolistic Environmental Systems “Research Institute”. Okay, they make some nice products, but just admit you are a rapacious for-profit corporation, please!
A ggplot2 Tutorial for Beautiful Plotting in R – Who doesn’t need to improve their data visualization and communication game?
just start your y-axis at zero
Seriously, just do that and it will work out most of the time. The only exception in my mind is if you are comparing the range or spread of two data sets and neither one is close to zero.
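To put a rough number on why this matters, here is a minimal Python sketch that compares how much longer one bar is drawn than another depending on where the axis starts. The function name `drawn_ratio` and the two height values are made up purely for illustration; they are not taken from any real data set.

```python
def drawn_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar lengths for values a and b
    when the value axis begins at axis_start."""
    return (b - axis_start) / (a - axis_start)


# Hypothetical average heights in cm, invented for illustration only.
shorter, taller = 160.0, 175.0

honest = drawn_ratio(shorter, taller)            # axis starts at zero
truncated = drawn_ratio(shorter, taller, 150.0)  # axis starts at 150

print(f"axis at 0: {honest:.2f}x, axis at 150: {truncated:.2f}x")
```

With these invented numbers, the taller bar is drawn about 1.09 times as long when the axis starts at zero, but 2.5 times as long when the axis starts at 150, even though the underlying difference is under 10%.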

I’ve been to Indonesia, and people there are normal human beings who are in fact somewhat shorter than Europeans on average. But their heads were typically around my shoulder height, not my knees. Some political violence has occurred there in the not-so-distant past, but I found the culture warm and hospitable. Like almost any country not at war, the biggest risk to your physical safety is probably being in a car accident or hit by a car. The next biggest, if you are there for any length of time, might be air pollution and secondhand smoke. Once an Indonesian woman yelled at me to not sit next to her on a ferry. The ferry was crowded and there was nowhere else to sit, but I was eventually able to solve the problem by swapping seats with another woman (my gender being what made her uncomfortable, apparently). Other times I had groups of female Indonesian tourists stop me on the street and ask to take vacation pictures with me to show their friends back home. This was when I was quite a bit younger than I am now.
tile maps
Tile maps, which draw regions of unequal geographic size as tiles of equal size, are, somewhat obviously, appropriate when you don’t want the unequal geographic area to distort the message you are trying to communicate. An example might be if you want to show a variable by congressional districts, which have (roughly) equal populations but variable (spatial) areas.
A couple of other ideas with tile maps are (1) to use rectangles of equal area but different length/width ratios, and (2) to use words spatially arranged and with a variety of properties (font, size, color) to denote a variety of variables.
538 – best charts of 2022
There is nothing in 538’s best charts of 2022 that truly bowled me over. I mean, there are some graphics and maps that are effective at telling a story about their underlying data. There just aren’t any types of charts or applications of old types of charts that were a big surprise to me and that I thought I would want to copy if I could. Just purely for personal interest in the subject matter, the one I found most interesting was the map showing how college football conferences are losing all geographic meaning. I find myself slowly being less interested in college football with each passing year, and this is one reason why. My team’s losing campaign, loss to the NFL or “transfer portal” of many of their best players, blowout of the junior varsity squad in the mid-December bowl game they were lucky to even be selected for, and lackluster recruiting class are other reasons.
jobs, jobs, jobs, families, infrastructure, and more jobs…and Richard Nixon, from the bottom of my heart go fuck yourself!
Adam Tooze has a nice visualization of Biden’s spending proposals. Is this a tree plot? a cartogram? I’m not sure, experts please weigh in. A few things I noticed:
- What Biden talks most and least about does not always match the largest and smallest proposed spending amounts. I think this is called “messaging”. For example, more would be spent on electric vehicle subsidies than on community college.
- There is no clear line between the infrastructure package and the families package. For example, there is spending on public schools in the former and child care facilities in the latter.
That’s just scratching the surface. You could (and should) stare at this graphic for hours, and then there is a long article to go with it. But I have to go make breakfast now because I can hear the children getting grumpy, which means my precious little bit of early morning quiet thinking time as a working-parent-of-small-children-with-no-childcare-or-grandparent-support is now over. If Biden gets this stuff through our dysfunctional Congress, it will be mostly too late to help my family but I hope it helps others. Thanks Obama…Bush, Clinton, other Bush, Reagan, Carter, Ford, and Nixon at least. Especially Nixon, fuck you – a quick skim of the article reminded me of the bipartisan childcare program of the 1970s that you vetoed. Oh and also, fuck you Ralph Nader because maybe Al Gore would have gotten some of this stuff back on track 20 years ago. And last but not least, thank you once again Bernie Sanders for not pulling a Nader.
junkiest junk charts of 2020
Junk Charts is a great blog that takes an example of a data visualization, critiques it systematically, and then either improves it or shows a different way of displaying the same data. The site doesn’t go for overly elaborate graphics, just clear and effective ones. This post has a roundup of the most viewed posts and the author’s favorite posts of 2020.
One thing you probably shouldn’t do is describe interesting graphics in words. Nonetheless, here is some data, which I am not putting in a visual form because it would take exponentially longer than just listing it out:
- There are 12 graphics covered by the post.
  - 2 scatter plots
  - 3 bar charts
    - 2 horizontal, not stacked – one of these gets changed to a bump chart
    - 1 horizontal, stacked – actually this is more of a “tree plot” where two data points are stacked and then a third is placed underneath
  - 2 pie charts
    - 1 3D pie chart – gets converted to a bump chart
    - 1 is allowed to continue to exist as a pie chart, with minor tweaks
  - 1 “dot matrix” (I’m not even sure if this is the best name, but basically you have empty squares or circles showing the total number of a thing, then some of them get filled in to illustrate how many of that thing fit a certain category)
  - 3 time series plots
    - 2 conventional – although one has two vertical axes, and the author illustrates how the limits can be manipulated to suggest to the eye that two trends are related, or not
    - 1 showing shaded regions over time – basically a stacked bar changing over time
  - 538’s election snake
There is something intuitive about pie charts – that is why we explain fractions and percentages to children in terms of pizza or pie, and they grasp it instantly. Pie charts are obviously the wrong way to compare the absolute magnitudes of things.
I do like tree plots. I made one in 2020 and I was proud of myself – it showed the number of acres served by stormwater management controls implemented by three different administrative programs. And then I made a second one where I broke the numbers down further within each of the categories. This was very effective in conveying how much is actually achieved by each of the programs compared to the effort and expense that goes into them.
Resolution for 2021 is to play with “dot matrix” plots at some point (and maybe learn what the best name for these is.) I think these are effective in putting numbers in context of bigger numbers, regardless of units. For example, my city has around 80,000 cumulative confirmed coronavirus cases, maybe 5,000 confirmed active infections (about the number of confirmed cases in the last 10 days), maybe between 80,000 and 800,000 actual cumulative infections, and a population of about 1.6 million. I don’t know how many have been vaccinated at this point, but probably a few thousand. So maybe I would make 16 or 160 boxes each representing a chunk of people, and start coloring them in. Then we could see at a glance how much of the population might have some immunity to the virus right now, and how much does not. You could slice and dice the data many ways. Of course, some people died or moved away, and others were born or moved in. Incidentally, about 2,600 people died of Covid, 400 were murdered, and 120 died in and around motor vehicles. I haven’t seen numbers on suicides or drug overdoses but they are always horrifying. Around 1% of any given population dies in any given year from a combination of preventable and not preventable causes, which is sad but news flash: we are mortal beings.
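As a first stab at the idea, here is a minimal text-only sketch in Python, using the rough figures quoted above and 160 boxes of 10,000 people each. The function name `dot_matrix`, the grid shape, and the fill characters are arbitrary choices of mine; a real version would presumably be drawn with a plotting library rather than printed.

```python
# Rough figures quoted in the paragraph above (estimates, not official data).
POPULATION = 1_600_000
CONFIRMED_CASES = 80_000
UPPER_ESTIMATED_INFECTIONS = 800_000  # high end of the guessed range


def dot_matrix(count, total, boxes=160, per_row=16, filled="X", empty="."):
    """Return a text grid in which each box stands for total / boxes units
    and the first round(count / total * boxes) boxes are filled in."""
    n_filled = min(boxes, round(count * boxes / total))
    cells = [filled] * n_filled + [empty] * (boxes - n_filled)
    rows = [cells[i:i + per_row] for i in range(0, boxes, per_row)]
    return "\n".join(" ".join(row) for row in rows)


confirmed_grid = dot_matrix(CONFIRMED_CASES, POPULATION)
upper_grid = dot_matrix(UPPER_ESTIMATED_INFECTIONS, POPULATION)

print(confirmed_grid)
print()
print(upper_grid)
```

With 160 boxes, the confirmed cases fill 8 boxes and the high-end infection estimate fills 80, so the gap between "what we have counted" and "what might be true" is visible at a glance.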
This site doesn’t do maps, which is fine. I am a big fan of maps. But I have a very simple test – is the data geographic in nature? Then make a map. But often, some other types of graphs and tables will further illuminate the data, and those often work well alongside your map rather than being shoehorned into your map where they don’t really belong. And I also find it clunky trying to do any type of mathematical analysis in mapping software when the analysis is not spatial in nature.
2020 visualizations from FiveThirtyEight
Fivethirtyeight.com has a roundup of interesting visualizations they did in 2020. There’s a lot here, but one theme I think I would like to try to make use of is pretty simple. When you are counting something, put the count in context by first showing a bunch of empty squares that represent the potential or total number of something (voters, or citizens stopped by police, or human beings with potential Covid exposure). Then put dots in some of the boxes, or color in some of the boxes, to illustrate the count. If you want to introduce some additional categories, you can use colors or put boxes around the boxes, or to get really fancy, put groups of boxes on a map. This technique undoubtedly has a name, but the article doesn’t tell me what the name is.
one more covid tracker
I thought I was over covid trackers, but I just can’t help it. I know this isn’t my first “one more”, and it might not be my last. This one plots new cases over the past week on the vertical axis vs. total confirmed cases on the horizontal, then animates over time. You can add any country or U.S. state. The simulation starts whenever 10 cases were reported in that location, and you can see them grow at first exponentially and then deviate from the line when they start to get it under control. You can pick a log or arithmetic axis – log is good for the math, but it kind of lets you forget that there is a difference between 10 people dying and 10,000 people dying. Anyway, it’s nice and thanks to this person for posting it for free.
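For anyone who wants to reproduce the coordinates behind a chart like this, here is a minimal Python sketch of how I understand the calculation. It is my own guess at the method, not the tracker’s actual code; the function name `trajectory` and the case series are invented for illustration.

```python
def trajectory(cumulative, window=7, start_at=10):
    """Return (total_cases, new_cases_in_window) pairs, one per day.

    `cumulative` is a list of running case totals, one entry per day.
    Points are emitted only once the running total has reached `start_at`.
    """
    points = []
    for day, total in enumerate(cumulative):
        if total < start_at:
            continue
        # Total as of `window` days earlier (zero before the series began).
        earlier = cumulative[day - window] if day >= window else 0
        points.append((total, total - earlier))
    return points


# Made-up series: cases double daily for 12 days, then growth slows
# to a steady 100 new cases per day.
doubling = [2 ** d for d in range(12)]
slowing = [doubling[-1] + 100 * d for d in range(1, 8)]
points = trajectory(doubling + slowing)

for total, new in points:
    print(total, new, round(new / total, 2))
```

While growth is exponential, the past week’s cases stay a nearly constant share of the total, so the points fall along a straight line on log-log axes. Once growth slows, that share drops and the points peel away below the line.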