
junkiest junk charts of 2020

Junk Charts is a great blog that takes an example of a data visualization, critiques it systematically, and then either improves it or shows a different way of displaying the same data. The site doesn’t go for overly elaborate graphics, just clear and effective ones. This post has a roundup of the most viewed posts and the author’s favorite posts of 2020.

One thing you probably shouldn’t do is describe interesting graphics in words. Nonetheless, here is some data, which I am not putting in a visual form because it would take exponentially longer than just listing it out:

  • There are 12 graphics covered by the post.
    • 2 scatter plots
    • 3 bar charts
      • 2 horizontal, not stacked – one of these gets changed to a bump chart
      • 1 horizontal, stacked – actually this is more of a “tree plot” where two data points are stacked and then a third is placed underneath
    • 2 pie charts
      • 1 3D pie chart – gets converted to a bump chart
      • 1 is allowed to continue to exist as a pie chart, with minor tweaks
    • 1 “dot matrix” (I’m not even sure if this is the best name, but basically you have empty squares or circles showing the total number of a thing, then some of them get filled in to illustrate how many of that thing fit a certain category)
    • 3 time series plots
      • 2 conventional – although one has two vertical axes, and the author illustrates how the limits can be manipulated to suggest to the eye that two trends are related, or not
      • 1 showing shaded regions over time – basically a stacked bar changing over time
    • 538’s election snake

There is something intuitive about pie charts – that is why we explain fractions and percentages to children in terms of pizza or pie, and they grasp it instantly. Pie charts are obviously the wrong way to compare the absolute magnitudes of things.

I do like tree plots. I made one in 2020 and I was proud of myself – it showed the number of acres served by stormwater management controls implemented by three different administrative programs. And then I made a second one where I broke the numbers down further within each of the categories. This was very effective in conveying how much is actually achieved by each of the programs compared to the effort and expense that goes into them.

Resolution for 2021 is to play with “dot matrix” plots at some point (and maybe learn what the best name for these is.) I think these are effective in putting numbers in context of bigger numbers, regardless of units. For example, my city has around 80,000 cumulative confirmed coronavirus cases, maybe 5,000 confirmed active infections (about the number of confirmed cases in the last 10 days), maybe between 80,000 and 800,000 actual cumulative infections, and a population of about 1.6 million. I don’t know how many have been vaccinated at this point, but probably a few thousand. So maybe I would make 16 or 160 boxes each representing a chunk of people, and start coloring them in. Then we could see at a glance how much of the population might have some immunity to the virus right now, and how much does not. You could slice and dice the data many ways. Of course, some people died or moved away, and others were born or moved in. Incidentally, about 2,600 people died of Covid, 400 were murdered, and 120 died in and around motor vehicles. I haven’t seen numbers on suicides or drug overdoses but they are always horrifying. Around 1% of any given population dies in any given year from a combination of preventable and not preventable causes, which is sad but news flash: we are mortal beings.
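Here is a rough sketch of what the 160-box version might look like, done as plain text in Python rather than in a real plotting library. The function name `dot_matrix`, the symbols, and the choice of the 800,000 upper estimate for actual infections are all my own assumptions for illustration; the population and case figures are the approximate ones from the paragraph above.

```python
def dot_matrix(population, groups, boxes=160, per_row=16):
    """Return a text grid with one symbol per box of people.

    groups is a list of (symbol, count) pairs drawn in order, where each
    symbol is a single character. Boxes left over are drawn as '.' and
    stand for the rest of the population.
    """
    per_box = population / boxes
    cells = []
    for symbol, count in groups:
        # Round each group to a whole number of boxes.
        cells.extend(symbol * round(count / per_box))
    cells = cells[:boxes]
    cells.extend("." * (boxes - len(cells)))
    rows = [" ".join(cells[i:i + per_row]) for i in range(0, boxes, per_row)]
    return "\n".join(rows)


# Approximate figures from the text: 1.6 million people, about 80,000
# confirmed cumulative cases, and (as an upper-bound assumption) 800,000
# actual infections, of which 720,000 were never confirmed.
grid = dot_matrix(
    1_600_000,
    [("C", 80_000), ("i", 800_000 - 80_000)],
)
print(grid)
```

Each box here stands for 10,000 people, so anything smaller than that (active infections, vaccinations so far) would round down to nothing. That is a real limitation of the coarse grid, and probably an argument for using more boxes.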

This site doesn’t do maps, which is fine. I am a big fan of maps. But I have a very simple test – is the data geographic in nature? Then make a map. But often, some other types of graphs and tables will further illuminate the data, and those often work well alongside your map rather than being shoehorned into your map where they don’t really belong. And I also find it clunky trying to do any type of mathematical analysis in mapping software when the analysis is not spatial in nature.

2020 in Review

2020 has been quite a year for the U.S. and the world, but you don’t need me to tell you that! My work and family life was disrupted, but I have been lucky enough not to lose any family members or close friends to Covid-19 so far. If anyone reading this has lost someone, I want to express my condolences.

Now I’ll get right down to some highlights of my 2020 posts.

Monthly Highlights from 2020

Most frightening or depressing stories:

  • JANUARY: Open cyberwarfare became a thing in the 2010s. We read the individual headlines but didn’t connect the dots. When you do connect the dots, it’s a little shocking what’s going on.
  • FEBRUARY: The Amazon rain forest may reach a tipping point and turn into a dry savanna ecosystem, and some scientists think this point could be reached in years rather than decades. Meanwhile, Africa is dealing with a biblical locust plague. Also, bumble bees are just disappearing because it is too hot.
  • MARCH: Hmm…could it be…THE CORONAVIRUS??? The way the CDC dropped the ball on testing and tracking, after preparing for this for years, might be the single most maddening thing of all. There are big mistakes, there are enormously unfathomable mistakes, and then there are mistakes that kill hundreds of thousands of people (at least) and cost tens of trillions of dollars. I got over-excited about Coronavirus dashboards and simulations towards the beginning of the month, and kind of tired of looking at them by the end of the month.
  • APRIL: The coronavirus thing just continued to grind on and on, and I say that with all due respect to anyone reading this who has suffered serious health or financial consequences, or even lost someone they care about. After saying I was done posting coronavirus tracking and simulation tools, I continued to post them throughout the month – for example here, here, here, here, and here. After reflecting on all this, what I find most frightening and depressing is that if the U.S. government wasn’t ready for this crisis, and isn’t able to competently manage this crisis, it is not ready for the next crisis or series of crises, which could be worse. It could be any number of things, including another plague, but what I find myself fixating on is a serious food crisis. I find myself thinking back to past crises – We got through two world wars, then managed to avoid getting into a nuclear war to end all wars, then worked hard to secure the loose nuclear weapons floating around. We got past acid rain and closed the ozone hole (at least for a while). Then I find myself thinking back to Hurricane Katrina – a major regional crisis we knew was coming for decades, and it turned out no government at any level was prepared or able to competently manage the crisis. The unthinkable became thinkable. Then the titans of American finance broke the global financial system. Now we have a much bigger crisis in terms of geography and number of people affected all over the world. The crises may keep escalating, and our competence has clearly suffered a decline. Are we going to learn anything?
  • MAY: Potential for long-term drought in some important food-producing regions around the globe should be ringing alarm bells. It’s a good thing that our political leaders’ crisis management skills have been tested by shorter-term, more obvious crises and they have passed with flying colors…doh!
  • JUNE: The UN just seems to be declining into irrelevancy. I have a few ideas: (1) Add Japan, Germany, India, Brazil, and Indonesia to the Security Council, (2) transform part of the UN into something like a corporate risk management board, but focused on the issues that cause the most suffering and existential risk globally, and (3) have the General Assembly focus on writing model legislation that can be debated and adopted by national legislatures around the world.
  • JULY: Here’s the elevator pitch for why even the most hardened skeptic should care about climate change. We are on a path to (1) lose both polar ice caps, (2) lose the Amazon rain forest, (3) lose our productive farmland, and (4) lose our coastal population centers. If all this comes to pass it will lead to mass starvation, mass refugee flows, and possibly warfare. Unlike even major crises like wars and pandemics, by the time it is obvious to everyone that something needs to be done, there will be very little that can be done.
  • AUGUST: We just had the 15-year anniversary of Hurricane Katrina, a major regional crisis that federal, state, and local governments failed to competently prepare for or respond to. People died, and decades later the recovery is incomplete. Coronavirus proves we learned nothing, as it is unfolding in a similar way on a much larger and longer scale. There are many potential crises ahead that we need to prepare for today, not least the inundation of major cities. I had a look at the Democratic and (absence of a) Republican platforms, and there is not enough substance in either when it comes to identifying and preparing for the risks ahead.
  • SEPTEMBER: The Covid recession in the U.S. is pretty bad and may be settling in for the long term. Demand for the capital goods we normally export (airplanes, weapons, airplanes that unleash weapons, etc.) is down, demand for oil and cars is down, and the service industry is on life support. Unpaid bills and debts are mounting, and eventually creditors will have to come to terms with this (nobody feels sorry for “creditors”, but what this could mean is we get a full-blown financial panic to go along with the recession in the real economy).
  • OCTOBER: Global ecological collapse is most likely upon us, and our attention is elsewhere. The good news is we still have enough to eat (on average – of course we don’t get it to everyone who needs it), for now.
  • NOVEMBER: It seems likely the Clinton-Bush-Obama-Trump U.S. foreign wars may just grind on endlessly under Biden. Prove us wrong, Joe! (I give Trump a few points for trying to bring troops home over the objections of the military-industrial complex. But in terms of war and peace, this is completely negated and then some by slippage on nuclear proliferation and weapons on his watch.)
  • DECEMBER: The “Map of Doom” identifies risks that should get the most attention, including antibiotic resistance, synthetic biology (also see below), and some complex of climate change/ecosystem collapse/food supply issues.

Most hopeful stories:

  • JANUARY: Democratic socialism actually does produce a high quality of life for citizens in many parts of the world. Meanwhile, the hard evidence shows that the United States is slipping behind its peer group in many measures of economic vibrancy and quality of life. The response of our leaders is to tell us we are great again because that is what we want to hear, but not do anything that would help us to actually be great again or even keep up with the middle of the pack. This is in the hopeful category because solutions exist and we can choose to pursue them.
  • FEBRUARY: A proven technology exists called high speed rail.
  • MARCH: Some diabetics are hacking their own insulin pumps. Okay, I don’t know if this is a good thing. But if medical device companies are not meeting their patient/customers’ needs, and some of those customers are savvy enough to write software that meets their needs, maybe the medical device companies could learn something.
  • APRIL: Well, my posts were 100% doom and gloom this month, possibly for the first time ever! Just to find something positive to be thankful for, it’s been kind of nice being home and watching my garden grow this spring.
  • MAY: E.O. Wilson is alive and kicking somewhere in Massachusetts. He says if we want to save our fellow species and ourselves, we should just let half the Earth revert to a natural state. Somewhat related to this, and not implying my intellect or accomplishments are on par with E.O. Wilson, I have been giving some thought to “supporting” ecosystem services in cities. When I need a break from intellectual anything, I have been gardening in Pennsylvania with native plants.
  • JUNE: Like many people, I was terrified that the massive street demonstrations that broke out in June would repeat the tragedy of the 1918 Philadelphia war bond parade, which accelerated the spread of the flu pandemic that year. Not only does it appear that was not the case, it is now a source of great hope that Covid-19 just does not spread that easily outdoors. I hope the protests lead to some meaningful progress for our country. Meaningful progress to me would mean an end to the “war on drugs”, which I believe is the immediate root cause of much of the violence at issue in these protests, and working on the “long-term project of providing cradle-to-grave (at least cradle-to-retirement) childcare, education, and job training to people so they have the ability to earn a living, and providing generous unemployment and disability benefits to all citizens if they can’t earn a living through no fault of their own.”
  • JULY: In the U.S. every week since schools and businesses shut down in March, about 85 children lived who would otherwise have died. Most of these would have died in and around motor vehicles.
  • AUGUST: Automatic stabilizers might be boring but they could have helped the economy in the coronavirus crisis. Congress, you failed us again but you can get this done before the next crisis.
  • SEPTEMBER: The Senate Democrats’ Special Committee on the Climate Crisis had the courage to take aim at campaign finance corruption as a central reason for why the world is in its current mess. I hate to be partisan, folks, but right now our government is divided into responsible adults and children. The responsible adults who authored this report are the potential leaders who can lead us forward.
  • OCTOBER: We have almost survived another four years without a nuclear war. Awful as Covid-19 has been, we will get through it despite the current administration’s complete failure to plan, prevent, prepare, respond or manage it. There would be no such muddling through a nuclear war.
  • NOVEMBER: The massive investment in Covid-19 vaccine development may have major spillover effects to cures for other diseases. This could even be the big acceleration in biotechnology that seems to have been on the horizon for a while. These technologies also have potential negative and frivolous applications, of course.
  • DECEMBER: The Covid-19 vaccines are a modern “moonshot” – a massive government investment driving scientific and technological progress on a particular issue in a short time frame. Only unlike nuclear weapons and the actual original moonshot, this one is not military in nature. (We should be concerned about biological weapons, but let’s allow ourselves to enjoy this victory and take a quick trip to Disney Land before we start practicing for next season…) What should be our next moonshot, maybe fusion power?

Most interesting stories, that were not particularly frightening or hopeful, or perhaps were a mixture of both:

  • JANUARY: Custom-grown human organs and gene editing and micro-satellites, oh my!
  • FEBRUARY: Corporate jargon really is funny. I still don’t know what “dropping a pin” in something means, but I think it might be like sticking a fork in it.
  • MARCH: I studied up a little on the emergency powers available to local, state, and the U.S. federal government in a health crisis. Local jurisdictions are generally subordinate to the state, and that is more or less the way it has played out in Pennsylvania. For the most part, the state governor made the policy decisions and Philadelphia added a few details and implemented them. The article I read said that states could choose to put their personnel under CDC direction, but that hasn’t happened. In fact, the CDC seems somewhat absent in all this other than as a provider of public service announcements. The federal government officials we see on TV are from the “National Institute of Allergy and Infectious Diseases”, which most people have never heard of, and to a certain extent the surgeon general. I suppose my expectations on this were created mostly by Hollywood, and if this were a movie the CDC would be swooping in with white suits and saving us, or possibly incinerating the few to save the many. If this were a movie, the coronavirus would also be mutating into a fog that would seep into my living room and turn me inside out, so at least there’s that.
  • APRIL: There’s a comet that might be bright enough to see with the naked eye from North America this month. [Update: It wasn’t. Thanks, 2020.]
  • MAY: There are unidentified flying objects out there. They may or may not be aliens, that has not been identified. But they are objects, they are flying, and they are unidentified.
  • JUNE: Here’s a recipe for planting soil using reclaimed urban construction waste: 20% “excavated deep horizons” (in layman’s terms, I think this is just dirt from construction sites), 70% crushed concrete, and 10% compost.
  • JULY: The world seems to be experiencing a major drop in the fertility rate. This will lead to a decrease in the rate of population growth, changes to the size of the work force relative to the population, and eventually a decrease in the population itself.
  • AUGUST: Vehicle miles traveled have crashed during the coronavirus crisis. Vehicle-related deaths have decreased, but deaths per mile driven have increased, most likely because people drive faster when there is less traffic, absent safe street designs which we don’t do in the U.S. Vehicle miles will rebound, but an interesting question is whether they will rebound short of where they were. One study predicts about 10% lower. This accounts for all the commuting and shopping trips that won’t be taken, but also the increase in deliveries and truck traffic you might expect as a result. It makes sense – people worry about delivery vehicles, but if each parcel in the vehicle is a car trip to the store not taken, overall traffic should decrease. Even if every 5 parcels are a trip not taken, traffic should decrease. I don’t know the correct number, but you get the idea. Now, how long until people realize it is not worth paying and sacrificing space to have a car sitting there that they seldom use? How long before U.S. planners and engineers adopt best practices on street design that are proven to save lives elsewhere in the world?
  • SEPTEMBER: If the universe is a simulation, and you wanted to crash it on purpose, you could try to create a lot of nested simulations of universes within universes until you overload whatever the operating system is. Just hope it’s backed up.
  • OCTOBER: There are at least some bright ideas on how to innovate faster and better.
  • NOVEMBER: States representing 196 electoral votes have agreed to support the National Popular Vote Compact, in which they would always award their state’s electoral votes to the national popular vote winner. Colorado has now voted to do this twice. Unfortunately, the movement has a tough road to get to 270 votes, because of a few big states that would be giving up a lot of power if they agreed to it.
  • DECEMBER: Lists of some key technologies that came to the fore in 2020 include (you guessed it) mRNA vaccines, genetically modified crops, a variety of new computer chips and machine learning algorithms, which seem to go hand in hand (and we are hearing more about “machine learning” than “artificial intelligence” these days), brain-computer interfaces, private rockets and moon landings and missions to Mars and mysterious signals and micro-satellites and UFOs, virtual and mixed reality, social media disinformation and work-from-home technologies. The wave of self-driving car hype seems to have peaked and receded, which probably means self-driving cars will arrive quietly in the next decade or so. I was surprised not to see cheap renewable energy on any lists that I came across, and I think it belongs there. At least one economist thinks we are on the cusp of a big technology-driven productivity pickup that has been gestating for a few decades.

That’s a lot to unpack, and I’m not sure I can offer a truly brilliant synthesis, but below are a few things that are on my mind as I think through all this.

We Americans affirmed that we care about our parents and grandparents (then failed to fully protect them).

One thing I think we learned is that we still value human lives more than a cold, purely economic calculation might suggest, including the lives of our elderly parents and grandparents. (Though we had significant failures of execution when it came to actually protecting people – more on that later.) We have had this debate before in the U.S., for example when thinking about how much to invest in environmental and safety regulations as I was reminded of by this Planet Money podcast. At one point, politicians (can you guess from which party) proposed valuing the lives of senior citizens at lower rates than everyone else. The backlash was fierce and instant, and the proposal was withdrawn. This year, we did not really have that debate – it was simply accepted, for the most part, that we would be willing to endure significant economy-wide pain to try to protect our parents and grandparents.

I kind of liked how Mr. Money Mustache put it back in April. He gave a “worst case scenario” with 3 million deaths and a “best case scenario” with 200,000 deaths, and the reality is on track to be somewhere in between.

In the worst case, our public officials would all downplay the risk of COVID-19, and we’d keep working and traveling and spreading it freely. We’d maximize our economic activity and let the disease run its course…

In the more compassionate case which we are currently following, we drastically reduce the amount of contact we have with each other for a few months, which cuts the number of deaths in the US down from 3-6 million, down to perhaps 200,000. In exchange, our economy shrinks by several trillion dollars (it was about 21 trillion in 2019) for a year or more.

Assuming we are preventing 3 million early deaths, this means our society is foregoing about one million dollars of economic activity for each person’s life that we extend and frankly, it makes me happy to know we are capable of that.

Mr. Money Mustache
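The arithmetic behind that "one million dollars per life" figure is easy to check. This is just a back-of-the-envelope sketch: the quote only says "several trillion dollars", so the 3 trillion used here is my assumption, and the function name is made up for illustration.

```python
def cost_per_life(economic_loss_dollars, deaths_prevented):
    """Economic activity given up per early death prevented."""
    return economic_loss_dollars / deaths_prevented


# "Several trillion dollars" is taken as 3 trillion here (an assumption).
loss = 3e12
lives = 3_000_000        # early deaths prevented, per the quote
gdp_2019 = 21e12         # "about 21 trillion in 2019", per the quote

print(f"${cost_per_life(loss, lives):,.0f} per life extended")
print(f"{loss / gdp_2019:.0%} of 2019 economic output")
```

With those inputs the cost works out to a million dollars per person, or roughly a seventh of one year's economic output in total.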

The leaders of some countries like Russia, Brazil, and even Sweden seem to have chosen to accept the consequences of business as usual. Most other countries have chosen to try to save human lives at the expense of short-term economic activity, and some executed this strategy much more effectively than others. In the U.S. and UK, we seem to be bumbling idiots who feel some compassion for one another.

The United States has been slipping for a while, and in 2020 we faltered.

The U.S. continues to slip below average among its developed country peers in many statistical categories like life expectancy, violence, incarceration, suicide, poverty, and public infrastructure. I picture us like a horse that used to be leading the race, then slipped into the middle of the leading pack, and has now drifted toward the back of the leading pack and is continuing to lose ground. Keep slipping and we would no longer be part of the leading pack.

But then came Covid-19, our horse faltered, and all the other horses went thundering past, leaving us in last place. With the possible exception of the UK, we had the least effective response in the world. Like I said, I think a few countries like Russia, Brazil, and Sweden basically chose to accept the consequences of a limited response, and that is different than a failed response (though not to the people who died or whose loved ones died). We tried to respond, and it turned out our government was unprepared and incompetent even compared to developing countries.

So what happened? Some particular failing of the Anglo-American countries doesn’t explain it, because Canada and Australia both did pretty well. Our lack of a public health system (or even universal access to private care) doesn’t explain it, because the UK, Canada, and Australia all have similar systems to each other and divergent outcomes.

The difference between the extraordinarily low rates in Asia, and the higher rates in Europe and the Americas is particularly stark. There are a couple things that I think may explain it. First is good airport screening. I traveled in Asia during the swine flu pandemic, and the screening is robust. The U.S. obviously has to beef up its health infrastructure at international airports and other border crossings (yes, there is a certain irony here that is lost on anti-immigrant types.) Part of this is also beefing up the data systems that track who is coming in from where, where they are going and what their status is. It became obvious within weeks that the CDC’s databases were a complete failure.

I think beyond border screening and data management, the other big difference between East and West is that Asian countries were willing to restrict physical movement and enforce quarantine, whereas western countries mostly were not. Had I exhibited symptoms while I was traveling in Singapore or Thailand during the swine flu, either country would have detained me in a government facility (with three meals a day and wi-fi, one would hope) for 14 days. Asian countries have also been willing to shut down domestic airports, train systems, and highways at times. Most western countries are simply not willing to do this. In the U.S., I think it is partly a matter of law and politics, but also a stupid idea that it would be “too expensive” when quite obviously it would have saved trillions of dollars in the long run. We simply don’t have the political will, the institutional mechanisms, or the basic competence. Covid-19 was a borderline crisis – a lot of people will lose cherished parents and grandparents but it is not an existential threat to our country’s survival. The U.S. needs to plan now to quarantine effectively in an even worse pandemic or god forbid, an incident involving biological weapons.

A few words on government agencies. Hurricane Katrina came up a few times in the monthly picks above. That was a major failure of federal, state, and local governments in the U.S. to plan, respond, and rebuild after a disaster. Before that, I would have assumed FEMA was up to the task, as they seem to have been in the past. Most people’s faith in the CDC was similar or even greater, and they turned out to be bumbling fools. The U.S. will need to fund its public agencies, stock them with competent, well-trained technocrats, and appoint talented political leaders to integrate them with the rest of society if they are going to function competently in the future.

In a hurricane, FEMA basically rolls into your city and takes charge, for better or worse. Early on, there was speculation that the CDC might try to do something similar in a disease outbreak. That didn’t happen. We will also need to adequately fund and train state and local agencies, if we are going to continue to put the lion’s share of the burden on them in a decentralized disaster like this. We could just get rid of the states and have the federal government work directly with metro areas, but this seems like a pretty pie in the sky idea politically.

What other government agencies do we have faith in that might have turned into rotten hollow logs while we weren’t paying attention? The Treasury and Federal Reserve do in fact seem to know what they are doing, which has saved us a couple times now in the last couple decades. We assume the military can fight a war if they need to. We assume the Department of Agriculture can feed us. Are we sure?

The democratization of propaganda.

Governments in general, and the U.S. government in particular, are having trouble getting messages out to their citizens. We used to worry about governments and big business controlling the media to put out purely ideological or purely profit-driven messages. Now anyone in the world can pretty much say anything anytime. People have trouble telling which messages are truthful and which are more reliable than others. In the U.S., this is combined with low trust in government and low trust in experts, and the result is that people either didn’t receive important messages about public health, or received a variety of conflicting information and noise and didn’t reach reasonable conclusions leading to reasonable decisions.

We hear a lot about “following the science” and “listening to scientists”, but this is really about policy communication not science communication. Scientists are trained to communicate uncertainty to each other. Often though, the uncertainty is low enough that it is clear one course of action has better odds of a good outcome than others. Media do not communicate this well – they tend to focus on the uncertainty statements scientists make, even when uncertainty is low and the best course of action is clear. The public is not prepared to process this information in a way that will lead to reasonable conclusions and decisions.

So we need to try to educate children to evaluate the source of information and think critically about whether it makes sense in the context of what they know. We need to educate them about uncertainty and decision making. We need to train journalists better to communicate scientific information but especially policy choices. Regulating social media companies might play some small role in this, but in the U.S. at least we don’t want to see a move toward censorship.

Back to the CDC. When Covid-19 hit, I was expecting the CDC to step in and dominate communications from the beginning on the issue. They needed to use all the tools modern advertising has to get messages across. I would have trusted what they said, and I think a lot of people would. If they had seized the initiative, it would have been hard for other voices to compete, and we might be in a better place now. Unfortunately, they have probably suffered a permanent loss of credibility both through poor communication and inadequate action, but better communication would definitely have helped. Make this one more U.S. institution that has lost credibility in my eyes as I have gotten older – Congress, the State Department, and the New York Times after weapons of mass destruction (I never trusted intelligence agencies), the military after the failures in Afghanistan and Iraq (I’m not saying I trusted them per se, but I thought they were good at fighting wars), FEMA after Hurricane Katrina (and more recently the horrific non-response in Puerto Rico), and now the CDC and federal public health establishment.

I have come to respect local public health authorities more through all of this. I actually work in the same building as my local public health agency, and know some people who work there, but I never really saw the connection to the larger health care system or my daily life before this. Part of the federal government’s communication strategy should be to package crystal clear messages for delivery by trusted local individuals like public health workers, family doctors, and school nurses.

Preparing for the big (and small) risks

Covid-19 has caused me to think even more about risk management. A major pandemic was something we knew was virtually certain to happen at some point, and we knew the consequences could be severe. And yet we still failed to adequately plan, prepare, and respond. There are a few other things in this category, like (obviously) another pandemic, a major earthquake, and sea level rise. Then there are risks where we are not sure of the probability, but the consequences could be catastrophic, like nuclear and biological war, ecological collapse, and major food shortages. (Alien invasion? No, I’m not really taking this seriously, but along with things like “gray goo” it should be on the list and discussed, providing a rational basis for taking action or not.) Then there are things that are certain to happen but are geographically limited (storms, fires, floods) or steadily kill a few people here and there adding up to a lot over time (car crashes, air pollution, poor nutrition). I am not sure where some risks fit in, for example cyberattacks or antibiotic resistance – but this is the point of gathering the information and having the discussions in a rational framework. In a rational world, a risk management framework provides a way to allocate finite resources (money, effort, expertise, research) to planning, preparing, mitigating, or simply choosing to accept each of these.
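To make the framework idea a little more concrete, here is a minimal sketch of a risk register in Python. It only groups the risks named above by the rough likelihood and consequence labels I used in the paragraph; the label wording, the "unclassified" bucket, and the function name are assumptions for illustration, and there are deliberately no numbers in it because I don't have any.

```python
from collections import defaultdict

# Each entry is (risk, likelihood, consequence), using only the rough
# groupings from the paragraph above. The labels are illustrative.
RISKS = [
    ("another pandemic", "virtually certain", "severe"),
    ("major earthquake", "virtually certain", "severe"),
    ("sea level rise", "virtually certain", "severe"),
    ("nuclear or biological war", "unknown", "catastrophic"),
    ("ecological collapse", "unknown", "catastrophic"),
    ("major food shortages", "unknown", "catastrophic"),
    ("storms, fires, floods", "certain", "geographically limited"),
    ("car crashes, air pollution, poor nutrition", "certain", "chronic"),
    ("cyberattacks", "unclassified", "unclassified"),
    ("antibiotic resistance", "unclassified", "unclassified"),
]


def group_by_category(risks):
    """Group risk names by their (likelihood, consequence) pair."""
    groups = defaultdict(list)
    for name, likelihood, consequence in risks:
        groups[(likelihood, consequence)].append(name)
    return dict(groups)


register = group_by_category(RISKS)
for (likelihood, consequence), names in register.items():
    print(f"{likelihood} / {consequence}: {', '.join(names)}")
```

The interesting work would be in replacing those labels with actual estimates and then deciding how much money and effort each category deserves, which is exactly the discussion I am arguing we should be having.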

The state of scientific and technological progress (is the Singularity near yet?)

I had a decent technology list under “most interesting post” for December, so I won’t repeat it here.

Above, I find myself referring to the Covid vaccine as a “moon shot”. It is clearly an example of how a big government push can get a new technology over the finish line and bring it into widespread use quickly. I am wondering though if it is a true example of accelerating a scientific breakthrough, an example of accelerating application of a scientific breakthrough to new technology, or simply a case of government correcting a market failure. We had been hearing about mRNA vaccine technology for a while, and we know a vaccine was developed for SARS but not widely deployed. We have also been hearing for a while that drug companies were still growing basic childhood vaccines in chicken eggs, and not investing heavily in the mRNA technology, because the market demand and profit potential were not there in the rich countries to make it worth their while. So this was at least partially a case of the U.S. and other governments making that market failure go away by simply paying for everything and simply transferring the profits to those companies. I am not saying this is bad – we do it for arms manufacturers all the time, so why not vaccines?

Vaccines for HIV, dengue fever and other similar mosquito-borne diseases would be nice. One solution to antibiotic resistance might be bacteriophages – viruses tailored specifically to infect and kill specific bacteria. It seems like the same technology could be applied to that problem. If antibiotic resistance is really the medium- to long-term emergency some say it is, maybe this should be a top priority.

This technology is also scary. It is the ability to create a custom organism that can go into a person’s body and have a specific desired effect. Vaccines are obviously a benign application, but somebody, somewhere, sometime will use this technology for evil. This seems like a near-existential risk on the horizon that needs to be dealt with.

I am going to say no, the Singularity is not imminent in 2021. Then again, the idea is that if at some point we hit the knee of the curve on technology and productivity, it will seem to accelerate all at once, because that is the nature of exponential change. If that happens, we will shrug and say we knew it all along. The trick is to find ways to drive innovation and progress while managing the risks that could temporarily but repeatedly set back or permanently derail that path, and without destroying our planetary ecosystem in the process. I am not ready to put odds on what outcome we are headed for, but I am hoping 2021 will at least bring a gradual return to the pre-Covid status quo, and allow us to set the stage for the future.

If anyone has actually read my ramblings all the way to this point, or just skipped to the end, Happy New Year!

2020 visualizations from FiveThirtyEight

Fivethirtyeight.com has a roundup of interesting visualizations they did in 2020. There’s a lot here, but one theme I think I would like to try to make use of is pretty simple. When you are counting something, put the count in context by first showing a bunch of empty squares that represent the potential or total number of something (voters, or citizens stopped by police, or human beings with potential Covid exposure). Then put dots in some of the boxes, or color in some of the boxes, to illustrate the count. If you want to introduce some additional categories, you can use colors or put boxes around the boxes, or to get really fancy, put groups of boxes on a map. This technique undoubtedly has a name, but the article doesn’t tell me what the name is.
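
The empty-squares-then-filled-squares idea is simple enough to sketch in a few lines of Python. This is only a text-mode illustration of the technique as I understand it, not how FiveThirtyEight builds theirs; the function name `dot_matrix`, the 10-per-row layout, and the numbers in the example are all made up for illustration.

```python
def dot_matrix(total, count, per_row=10, filled="■", empty="□"):
    """Return a text grid of `total` squares with the first `count` filled in."""
    if not 0 <= count <= total:
        raise ValueError("count must be between 0 and total")
    # Filled squares come first, then the empty ones that give the context.
    cells = [filled] * count + [empty] * (total - count)
    rows = [cells[i:i + per_row] for i in range(0, total, per_row)]
    return "\n".join(" ".join(row) for row in rows)


# Hypothetical numbers: 7 of 30 things fall in the category of interest.
print(dot_matrix(total=30, count=7))
```

Extra categories could be handled by passing a different fill character per category, which is the text equivalent of using colors.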

why we’re numb to mass death

I’ve always found close-up pictures of Hiroshima victims to be some of the most affecting images I’ve ever seen, and yet knowing that 100,000* people were vaporized in a fraction of a second has less emotional effect. We also get numb to hearing about steady numbers of deaths that add up to a lot over time, like car accidents. I’m not a monster – this is a bug in human psychology. This article in Axios gives other examples of the phenomenon, from the Holocaust to the Rwanda genocide to the U.S. coronavirus meltdown. The article links to an academic paper by Paul Slovic at the University of Oregon, who studies this “psychic numbing” effect.

A defining element of catastrophes is the magnitude of their harmful consequences. To help society prevent or mitigate damage from catastrophes, immense effort and technological sophistication are often employed to assess and communicate the size and scope of potential or actual losses. This effort assumes that people can understand the resulting numbers and act on them appropriately. However, recent behavioral research casts doubt on this fundamental assumption. Many people do not understand large numbers. Indeed, large numbers have been found to lack meaning and to be underweighted in decisions unless they convey affect (feeling). As a result, there is a paradox that rational models of decision making fail to represent. On the one hand, we respond strongly to aid a single individual in need. On the other hand, we often fail to prevent mass tragedies – such as genocide – or take appropriate measures to reduce potential losses from natural disasters. We believe this occurs, in part, because as numbers get larger and larger, we become insensitive; numbers fail to trigger the emotion or feeling necessary to motivate action. We shall address this problem of insensitivity to mass tragedy by identifying certain circumstances in which it compromises the rationality of our actions and by pointing briefly to strategies that might lessen or overcome this problem.

The More Who Die, the Less We Care: Psychic Numbing and Genocide

I’ve often thought about a class that would teach the history of a war or tragedy by the numbers, by focusing on the number of deaths, who the people were, where they occurred and when they occurred. I think that would be educational (if depressing). But to put it in perspective you might need some visuals. One idea would be a stadium with people vanishing from seats. (This would work for, say, up to 100,000 deaths.) For even larger numbers, maybe you could start with a point in the center of the town where the class is being held or where students live, and then expand the dot outward as though all the people who live inside it were to vanish. You could even make this an app based on census data, and let the user pick the center of the bubble. Then finally, you probably should tie some of the deaths to individual stories, or interviews with survivors, friends and family. For me personally though, the numbers are important to put the emotional stories in context, and I am wary of news stories that don’t have numbers. Morbid stuff!
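
The expanding-bubble idea boils down to a small calculation: sort places by distance from the chosen center, add up population until you reach the death toll, and that distance is the radius of the bubble. Here is a rough sketch, assuming the census data has already been reduced to a list of (x, y, population) points on a flat grid measured in kilometers. The function name, the data layout, and the numbers are hypothetical; a real app would need actual census geography and proper map distances.

```python
import math


def vanishing_radius(center, places, deaths):
    """Smallest radius around `center` whose enclosed population reaches `deaths`.

    `places` is a list of (x_km, y_km, population) tuples, a hypothetical
    stand-in for census blocks. Returns None if everyone listed is not enough.
    """
    cx, cy = center
    # Distance of every place from the chosen center, nearest first.
    by_distance = sorted(
        (math.hypot(x - cx, y - cy), population) for x, y, population in places
    )
    running_total = 0
    for distance, population in by_distance:
        running_total += population
        if running_total >= deaths:
            return distance
    return None


# Made-up example: three neighborhoods at 0 km, 5 km, and 10 km from the center.
places = [(0, 0, 1000), (3, 4, 5000), (6, 8, 20000)]
print(vanishing_radius((0, 0), places, deaths=4000))
```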

* Okay, I admit the “100,000 people in a fraction of a second” is just a number I picked somewhat for shock value. According to Wikipedia, 70,000-80,000 people were either vaporized instantly or burned to death shortly after the blast. Then a bunch more died of radiation poisoning of course. Does this make it any better? No, when it’s my turn please just vaporize me.

my official election prediction

There’s plenty of election coverage out there, so who needs this post? Well, I’ve been looking for one source of information on when the swing state polls close, what the vote counting situation is, and what the current poll/forecast situation is. I don’t see all of that in one place so here, just for myself, is some info.

I’m a little partial to FiveThirtyEight, just because I’ve been following them for a few elections now. There are other polling and modeling sources out there. I got poll closing times from 270 to Win.

Florida

  • Poll closing: 7:00 p.m. ET (for most of the state including all the sizable cities, except that little bit of the panhandle including Pensacola at 8:00 p.m. ET)
  • The counting situation, according to 538: Despite their bad reputation from that election year that shall not be named around the turn of the century, they expect to have most or all results within two hours of closing. They count absentee and mail-in votes in advance, so they just need to combine them with live results and it should result in a more or less complete count. Unless things are really really close, like, you know, that one year…
  • 538 poll average on Friday 10/30 around 4:30 p.m.: Biden +2.2%
  • 538 odds on Friday 10/30 around 4:30 p.m.: Biden 66/34

Georgia

  • Poll closing: 7:00 p.m. ET
  • The counting situation: quick. They’ve counted mail-in ballots in advance. They expect overseas ballots to trickle in, but things would have to be really close for those to matter.

Ohio

  • Poll closing: 7:30 p.m. ET
  • The counting situation: They count absentee ballots in advance, then in-person votes, but they will still count absentee ballots received up to November 13. So if it is close enough that outstanding mail-in ballots could make a difference, news organizations won’t call it on election night.

North Carolina

  • Poll closing: 7:30 p.m. ET
  • The counting situation: About 80% should be counted right away, and more over the next few hours. But then they will still count ballots arriving by November 12, so same story: news organizations won’t call it if it is close.

Texas

  • Poll closing: 8:00 p.m. ET (locations in the Central Time Zone, which is almost all of Texas), 9:00 p.m. ET (locations in the Mountain Time Zone, which is basically El Paso)
  • The counting situation: Almost everything early on election night. They will still count ballots received by 5 p.m. the day after election day.

Pennsylvania

  • Poll closing: 8:00 p.m. ET
  • The counting situation: Oh, my beloved home state. Pennsyltucky as some call it, but that is completely unfair to the great state of Kentucky which plans to count 90% of ballots on election night. Under state law, we will not start counting mail-in ballots until polls open on election day. The process is supposed to conclude around Friday. Enormous numbers of people have voted by mail, including yours truly. Republicans will tend to vote in person, Democrats by mail. The state is about equally split (basically the Philadelphia metro region and downtown Pittsburgh vs. pretty much everyone else). So it could look like things are trending Republican on election night, but there will be enormous numbers of outstanding ballots expected to skew Democratic. Pennsylvania will also count ballots received up to three days after election day, as allowed in not one but two Supreme Court cases over the past few weeks. Bottom line, it seems unlikely this one will be called on election night.

Michigan

  • Poll closing: 9:00 p.m. ET
  • The counting situation: “a few days”. They will start counting mail-in ballots one day early, but are not expecting to finish until around Friday.

Arizona

  • Poll closing: 9:00 p.m. ET
  • The counting situation: “most” on election night

Iowa

  • Poll closing: 10:00 p.m. ET
  • The counting situation: “most” on election night, and they are counting mail-in ballots early

Wisconsin

  • Poll closing: 9:00 p.m. ET
  • The counting situation: “all results by Wednesday morning”

Nevada

  • Poll closing: 10:00 p.m. ET
  • The counting situation: expecting to get most votes from the Vegas area on election night, but counting all votes could take until November 10

Okay, so how might election night unfold? First, I went to 270 to Win’s interactive map. You can pre-populate it with a variety of forecasts from a variety of sources, which is cool. I stuck with 538. Then I turned all the states above into “tossups”. I gave Trump one bonus electoral vote from Maine’s second district, which I don’t know anything about or what to do with.

This starting point is: Biden 227, Trump 126 (remember, you need 270 to win)

Let’s do a scenario where things go unexpectedly well for Trump.

  • Florida closes and is counted quickly. Biden 227, Trump 155
  • Counting also goes well in Georgia. Biden 227, Trump 171
  • Let’s say things go well for Trump in Ohio (where he is a slight favorite), and news organizations are willing to call it: Biden 227, Trump 189
  • North Carolina is counted quickly and goes to Trump: Biden 227, Trump 204
  • Texas goes to Trump quickly and decisively: Biden 227, Trump 242
  • Pennsylvania: no call on election night
  • Michigan: no call on election night
  • Arizona goes to Biden: Biden 238, Trump 242
  • Iowa is counted quickly and goes to Trump: Biden 238, Trump 248
  • Wisconsin is not really close. Even with some outstanding ballots, let’s say news organizations call it for Biden on election night. Biden 248, Trump 248
  • Nevada is not really close, but let’s say there is no call on election night.

We are tied. We go to bed, and every politician in America from President on down starts running their mouth on Wednesday. Lawsuits ensue. But those votes from Pennsylvania, Michigan, and Nevada trickle in during the week, and Biden has substantial leads in all three. It would take an extraordinary amount of luck just for Trump to get close to 50/50 odds.

Here’s a more likely scenario, so let’s consider this my prediction:

  • Florida is called for Biden around 8 p.m. The call is made by the same news organizations that called Florida for Al Gore precisely 20 years ago, but they are much more conservative (in the statistical sense, meaning looking for a higher degree of certainty) these days. Biden 256, Trump 126
  • Georgia goes narrowly for Trump. Biden 256, Trump 142
  • Ohio goes to Trump. Biden 256, Trump 160
  • North Carolina is called for Biden around 9 p.m. Biden 271, Trump 160. IT’S OVER!!!
  • I’m going to stop doing math now. Texas and Iowa go narrowly to Trump, but Arizona, Wisconsin, and Nevada pile on for Biden late Tuesday night or sometime on Wednesday, and the rout is on.
  • It doesn’t matter if Pennsylvania and Michigan take a long time to count their votes, but eventually they do, and the rout becomes a landslide. I’ll call 300+ a landslide, although it certainly falls short of the near-sweep (525/538) Ronald Reagan pulled off in 1984. Like the guy or not, that was a clear victory.
  • My final prediction: Biden 334, Trump 204
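
For anyone who wants to check the running totals in the two scenarios above, the arithmetic is easy to script. This is just a sketch: the starting point (Biden 227, Trump 126) comes from the post, while the per-state electoral vote counts are the 2020 allocations, which the post does not list, so treat that table as my addition. The names `EV`, `START`, and `tally` are made up for the example.

```python
# 2020 electoral votes for the states treated as tossups (not listed in the post).
EV = {"FL": 29, "GA": 16, "OH": 18, "NC": 15, "TX": 38, "PA": 20,
      "MI": 16, "AZ": 11, "IA": 6, "WI": 10, "NV": 6}

# Starting point from the post, after turning the states above into tossups.
START = {"Biden": 227, "Trump": 126}


def tally(calls):
    """Add the electoral votes of each called state to the starting totals."""
    totals = dict(START)
    for state, winner in calls.items():
        totals[winner] += EV[state]
    return totals


# Scenario 1: election night goes well for Trump; PA, MI, and NV are not called.
trump_good_night = {"FL": "Trump", "GA": "Trump", "OH": "Trump", "NC": "Trump",
                    "TX": "Trump", "AZ": "Biden", "IA": "Trump", "WI": "Biden"}

# Scenario 2: the prediction, with every state eventually called.
prediction = {"FL": "Biden", "GA": "Trump", "OH": "Trump", "NC": "Biden",
              "TX": "Trump", "PA": "Biden", "MI": "Biden", "AZ": "Biden",
              "IA": "Trump", "WI": "Biden", "NV": "Biden"}

print(tally(trump_good_night))  # {'Biden': 248, 'Trump': 248}
print(tally(prediction))        # {'Biden': 334, 'Trump': 204}
```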

coronavirus trackers and simulations revisited

Update: December 13, 2020 (and from time to time since then, I update links if I notice they are broken)

This post is getting a surprising amount of attention. I don’t normally update posts, but I am updating this one since it is getting attention and the commentary in the original post is significantly outdated. Rest assured, if you are a historian in the far future studying what I was thinking back in June 2020, I have kept the original post at the bottom. I am keeping all the links, just grouping them somewhat and removing (from this section) the outdated commentary. (Thank you, WordPress, for making a simple copy-and-paste operation like this beyond excruciating.)

Data Trackers

  • Johns Hopkins – map, stats, access to data sets
  • New York Times – a national (U.S.) map by county and plots by state (now, with a paywall! as of 7/30/21. Which I will never pay because WEAPONS OF MASS DESTRUCTION!)
  • Financial Times – similar to others, but they look at excess deaths a little differently and have some interesting graphics
  • BBC – similar to NYT, but international
  • CDC – changed this link to their “COVID-19 by County” page on 2/26/22; the updated recommendation is to mask indoors if new cases in your county are 200 per 100,000 population per week, AND if the number of people entering the hospital and/or in the hospital is above certain thresholds. It’s a little hard to find the data and figure out yourself, so if you trust the CDC (and who wouldn’t?) you can just type in your county and they will tell you if it is high/medium/low.
  • https://coronavirus.thebaselab.com/ – a variety of maps and plots
  • City Observatory – intermittent data-based articles and maps
  • Our World in Data – excellent interactive country-level data, maps, and plots. A tip – you can also type in “world” or the name of a continent in the country box.
  • https://aatishb.com/covidtrends/ – a very clever animated time series of growth in cases over time, by country
  • Reuters – just more numbers and maps, similar to NYT
  • Covid Act Now – state-level data and communication in a simple, easy to understand index format
  • Harvard Global Health Institute COVID Risk Levels Dashboard – similar to Covid Act Now, but less simple and less easy to understand. Seems to have more ability to drill down into county-level data, although when you do that much of it is blank.
  • Wastewater surveillance from “Biobot Analytics” – added 4/30/22.

Simulations

  • University of Washington IHME – the best place I have found for understandable future projections. At the state level.
  • FiveThirtyEight – compares different models (no longer updating as of 7/30/21)
  • https://covid19risk.biosci.gatech.edu/ – This site calculates the probability that someone in a group of a given size is infected, based on the estimated rate of active cases in a U.S. state.
  • MicroCOVID – a risk calculator based on local data and allowing you to adjust your risk tolerance and try out various scenarios (added 8/8/21), such as “one night stand with a random person” (on the latter, please remember there are other diseases besides just Covid-19, for example antibiotic-resistant syphilis…)
  • Covid-19 Forecast hub – another visualization of various models and ensembles of models
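
The group-risk idea behind the Georgia Tech calculator listed above can be illustrated with basic probability. If a fraction p of the population is currently infected, and you assume each person in a group of n is an independent draw from that population, the chance that at least one of them is infected is 1 − (1 − p)^n. The sketch below uses that textbook formula; the site's actual method and its estimate of active cases may well differ, and the 1% prevalence in the example is a made-up number.

```python
def group_risk(prevalence, group_size):
    """Chance that at least one of `group_size` people is infected.

    Assumes each person is independently infected with probability
    `prevalence`, which is a simplification of how real groups form.
    """
    if not 0 <= prevalence <= 1:
        raise ValueError("prevalence must be between 0 and 1")
    # Probability that nobody is infected, subtracted from 1.
    return 1 - (1 - prevalence) ** group_size


# Hypothetical example: 1% of people infected, a gathering of 50.
print(round(group_risk(0.01, 50), 3))  # 0.395
```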

Vaccine Trackers

Local Pennsylvania/Philadelphia Interest

  • The state of Pennsylvania has a useful dashboard which they have now made public (or it was public before and I didn’t notice.) It compares cases, positive tests, and hospital data for the current and last 7-day period, at the county level.
  • Speaking of Philadelphia, a shout out to the Philadelphia Health Department which provides some open downloadable data.

Miscellaneous Stuff

Original Post (June 27, 2020)

I decided to list out and summarize the variety of trackers and simulations I’ve mentioned in previous posts. Like many people (in the U.S. Northeast at least), I was glued to coronavirus info on various screens from roughly mid-March to mid-May, then my attention started to gradually drift to other things as the situation got better. Now, it seems that it has either stabilized at a not-quite-out-of-the-woods level, or is slowly reversing itself as we see other parts of the country start to be affected more seriously (sorry if you are reading this and are being affected, we in the Northeast take no pleasure in your suffering, I promise, although we suggest you turn out any bigoted anti-science politicians in your area who are letting this happen.) Anyway, I find that I am interested in starting to look at trackers and simulations again on a daily basis. These are in the order I discovered them.

  • Johns Hopkins – a neat map early on, although now the entire world has become a blob. Still a good place to stare at data.
  • New York Times – a national (U.S.) map by county and plots by state. Seems to load even though I have used all my free articles for the month.
  • BBC – they update continuously but I’m not sure if this link will be to the latest
  • CDC – this is what I would have predicted would be the go-to source of information and expertise if you asked me before all this started…but it’s mediocre at best. Yes, that just about sums it up.
  • https://coronavirus.thebaselab.com/ – a variety of maps and plots to stare at, not my first stop but a little different if I am tired of others
  • University of Washington IHME – still the most informative state-level simulations I have found, accounting for hospital capacity among other things
  • City Observatory – they did an awesome analysis by U.S. metro area, which I have not seen anyone else do (human beings interact with each other socially and economically in cities and their suburbs, which often cut across states, and states often contain metro areas that are not connected much socially or economically. Economists, social scientists and urban planners know this of course, but nobody else studying the epidemic seems to have figured this out. Seriously, other data visualization and simulation sites, you can do this, it’s just a matter of grouping data by counties.) Unfortunately, they quit updating it and have not automated it. I still check every now and then to see if they have picked it up.
  • Our World in Data – pretty much every conceivable way of looking at data by country. I like to look at confirmed deaths per million across countries. By this measure, the starkest contrast is east vs. west. The eastern countries were hit first, hard, and without warning, and their death rates are very, very low. They have a variety of government types, responses, ethnicities and cultures. I just don’t think anybody has come close to explaining it. The U.S. is in the middle of the pack of western countries, which somewhat contradicts conventional wisdom and suggests news organizations are making the obvious error of not normalizing by population.
  • https://aatishb.com/covidtrends/ – an animated time series of new confirmed cases in the past week vs. total confirmed cases, both on a log scale, by country. As I write this, shows the beginning of a concerning uptick for the United States, and Brazil out of control.
  • Reuters – I actually never wrote about this one, but it has a map and some numbers.
  • FiveThirtyEight – they have an aggregation of various simulation models out there. New York and New Jersey look like a stream sprayed horizontally out of a garden hose, while Texas and Florida (today) look more like a fire hose.
  • https://covid19risk.biosci.gatech.edu/ – This site calculates the probability that someone in a group of a given size is infected, based on the estimated rate of active cases in a U.S. state. I assume it’s estimated active cases, anyway, or it wouldn’t make sense. It would be better by metro area (seriously guys, someone just get this done), but still a nice idea. I’m in Philadelphia, but I figure the New Jersey numbers are probably the most applicable.
  • Covid Act Now – provides a composite risk index at the state level, and county when county level data is available in the right format (which is not that often)
  • Harvard Global Health Institute COVID Risk Levels Dashboard – keeps it simple with just data on new cases, but gives you a variety of nice mapping, charting, and tabular formats to slice and dice the data at country, (U.S.) state or county level.
  • The state of Pennsylvania has a useful dashboard which they have now made public (or it was public before and I didn’t notice.) It compares cases, positive tests, and hospital data for the current and last 7-day period, at the county level.
  • Speaking of Philadelphia, a shout out to the Philadelphia Health Department which provides some open downloadable data.
  • I look at the FAO food price index on occasion. It’s falling lately. Sometimes I look at oil and gold prices, and how many Special Drawing Rights can be bought with one U.S. dollar. Oh and, the Rapture Index is at an all time high!

April 2020 in Review

Most frightening and/or depressing story:

  • The coronavirus thing just continued to grind on and on, and I say that with all due respect to anyone reading this who has suffered serious health or financial consequences, or even lost someone they care about. After saying I was done posting coronavirus tracking and simulation tools, I continued to post them throughout the month – for example here, here, here, here, and here. After reflecting on all this, what I find most frightening and depressing is that if the U.S. government wasn’t ready for this crisis, and isn’t able to competently manage this crisis, it is not ready for the next crisis or series of crises, which could be worse. It could be any number of things, including another plague, but what I find myself fixating on is a serious food crisis. I find myself thinking back to past crises – We got through two world wars, then managed to avoid getting into a nuclear war to end all wars, then worked hard to secure the loose nuclear weapons floating around. We got past acid rain and closed the ozone hole (at least for a while). Then I find myself thinking back to Hurricane Katrina – a major regional crisis we knew was coming for decades, and it turned out no government at any level was prepared or able to competently manage the crisis. The unthinkable became thinkable. Then the titans of American finance broke the global financial system. Now we have a much bigger crisis in terms of geography and number of people affected all over the world. The crises may keep escalating, and our competence has clearly suffered a decline. Are we going to learn anything?

Most hopeful story:

  • Well, my posts were 100% doom and gloom this month, possibly for the first time ever! Just to find something positive to be thankful for, it’s been kind of nice being home and watching my garden grow this spring.

Most interesting story, that was not particularly frightening or hopeful, or perhaps was a mixture of both:

  • There’s a comet that might be bright enough to see with the naked eye from North America this month.

one more covid tracker

I thought I was over covid trackers, but I just can’t help it. I know this isn’t my first “one more”, and it might not be my last. This one plots new cases over the past week on the vertical axis vs. total confirmed cases on the horizontal, then animates over time. You can add any country or U.S. state. The simulation starts whenever 10 cases were reported in that location, and you can see them grow at first exponentially and then deviate from the line when they start to get it under control. You can pick a log or arithmetic axis – log is good for the math, but it kind of lets you forget that there is a difference between 10 people dying and 10,000 people dying. Anyway, it’s nice and thanks to this person for posting it for free.
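
To see why exponential growth shows up as a straight line on this kind of plot, it helps to compute the two series yourself. The sketch below turns a list of daily cumulative case counts into (total cases, new cases in the past week) pairs, starting once the total reaches 10. It is my own reconstruction of the idea, not the site's code, and the function name and the doubling series in the example are made up. While cases grow exponentially, new weekly cases stay roughly proportional to the total, so the points line up on log axes; when growth slows, they fall below the line.

```python
def trend_points(cumulative, window=7, start_at=10):
    """Pair total confirmed cases with new cases over the trailing `window` days.

    `cumulative` is a list of daily cumulative case counts, assumed to start
    at or before the first reported case. Points begin on the first day the
    total reaches `start_at`.
    """
    points = []
    for day, total in enumerate(cumulative):
        if total < start_at:
            continue
        # Before a full window has elapsed, every case so far counts as new.
        earlier = cumulative[day - window] if day >= window else 0
        points.append((total, total - earlier))
    return points


# Made-up series that doubles every day.
print(trend_points([1, 2, 4, 8, 16, 32, 64, 128, 256]))
```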

more coronavirus tracking

This massive data analysis entry from Our World in Data is a pretty good example of how to take a data set and beat the crap out of it from every angle.

I like what they did. Since it’s by country, it allows interesting comparisons across countries but is not meant to provide local or regional-specific information. Countries are pretty big. My favorite trackers that are most relevant to my situation are still the City Observatory analyses of U.S. metro areas and the University of Washington simulations of available hospital capacity. The latter are by state.