Tag Archives: statistics

an agent-based stock market simulator

This agent-based stock market simulator, which was originally programmed in NetLogo and later ported to R, captures the behavior of the market in a statistical sense. Which is to say, it shows how multiple traders following logical strategies can add up to a whole lot of randomness and unpredictability. The bursty volatility it produces is the kind of thing modeled by autoregressive conditional heteroscedasticity (ARCH) and its generalized cousin, GARCH, if I remember my statistics class correctly. But the article does not go into that.
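To give a flavor of how this works, here is a minimal sketch of an agent-based market of my own construction (not the article's actual model): half the traders are "fundamentalists" who buy when the price is below a fixed fundamental value, and half are "chartists" who chase the last move. The price responds to net demand, and the resulting return series is noisy and unpredictable even though every agent is following a simple rule.

```python
import random

random.seed(0)

# Illustrative sketch only -- the trader types, parameters, and price rule
# below are my assumptions, not those of the simulator in the article.
N_TRADERS = 100
FUNDAMENTAL = 100.0

traders = ["fundamentalist" if random.random() < 0.5 else "chartist"
           for _ in range(N_TRADERS)]

price = FUNDAMENTAL
last_return = 0.0
returns = []
for _ in range(1000):
    demand = 0
    for kind in traders:
        if kind == "fundamentalist":
            # buy below fundamental value, sell above it
            demand += 1 if price < FUNDAMENTAL else -1
        else:
            # chartist follows the most recent trend, with some noise
            demand += 1 if last_return + random.gauss(0, 0.01) > 0 else -1
    # price moves with net demand, plus exogenous noise
    new_price = max(price * (1 + 0.001 * demand / N_TRADERS
                             + random.gauss(0, 0.005)), 1e-6)
    last_return = (new_price - price) / price
    returns.append(last_return)
    price = new_price

print(f"{len(returns)} simulated returns, "
      f"range {min(returns):.4f} to {max(returns):.4f}")
```

Plotting `returns` from a run like this gives the sort of jagged, random-looking series shown in the figure below.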

stock market returns as simulated by an agent-based model

the (thing everybody calls the) Nobel Prize in economics

This year’s Nobel prize for economics, which we are supposed to call the “Sveriges Riksbank Prize in Economic Sciences”, is for a method of identifying natural experiments in data. The importance of this is to get over the “correlation is not causation” hump and actually be able to make some statements about causation. In my very simplistic understanding, you would find at least two data sets where at least two variables are correlated in one but not the other, and the state of the system they represent is roughly the same except for one other variable. Then you can infer that other variable had some role in causing the correlation or lack thereof. This is how you design an experiment of course, but in this case you are looking in existing data sets for cases where this occurred “naturally”.

That’s my simplistic understanding. Let’s look at how Nature describes it.

In 1994, Angrist and Imbens developed a mathematical formalization for extracting reliable information about causation from natural experiments, even if their ‘design’ is limited and compromised by unknown circumstances such as incomplete compliance by participants. Their approach showed which causal conclusions could and could not be supported in a given situation.

Nature

It seems like this would have applications well beyond economics and social science. For example, ecology, and environmental science in general, where there are just so many variables and complex interactions that setting up randomized controlled experiments is daunting. (Although it can be done – in ecological microcosms, for example). It must have evil applications too of course, from advertising to politics.
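To make the "incomplete compliance" idea concrete, here is a toy example of my own (not from the Nature piece): people get a random nudge toward a treatment, but some take it regardless and some ignore the nudge. Naively comparing treated to untreated people is confounded by hidden motivation; the Angrist–Imbens style instrumental-variable (Wald) estimator uses only the random nudge and recovers the causal effect for the people the nudge actually moved.

```python
import random

random.seed(1)

# Toy natural experiment (all numbers are my assumptions for illustration).
TRUE_EFFECT = 2.0
n = 100_000
data = []
for _ in range(n):
    motivation = random.gauss(0, 1)        # hidden confounder
    z = random.random() < 0.5              # random encouragement (instrument)
    # highly motivated people take the treatment anyway;
    # the nudge converts some of the rest (imperfect compliance)
    d = (motivation > 1.0) or (z and random.random() < 0.6)
    y = TRUE_EFFECT * d + 3.0 * motivation + random.gauss(0, 1)
    data.append((z, d, y))

def mean(xs):
    return sum(xs) / len(xs)

# Naive comparison: badly biased, because motivation drives both d and y
naive = (mean([y for _, d, y in data if d])
         - mean([y for _, d, y in data if not d]))

# Wald/IV estimator: effect of the nudge on y, divided by
# the effect of the nudge on actually taking the treatment
y1 = mean([y for z, _, y in data if z]); y0 = mean([y for z, _, y in data if not z])
d1 = mean([d for z, d, _ in data if z]); d0 = mean([d for z, d, _ in data if not z])
late = (y1 - y0) / (d1 - d0)

print(f"naive: {naive:.2f}, IV estimate: {late:.2f} (true effect {TRUE_EFFECT})")
```

The naive estimate comes out well above the true effect of 2, while the instrumental-variable estimate lands close to it, which is the whole point of the method.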

more on lottery winners

I followed up on a link in yesterday’s story about lottery winners. In 2017 a group publishing in the Columbia Journalism Review submitted Freedom of Information Act requests to basically all the U.S. state lotteries and analyzed all the data they were able to get. The results are really surprising, verging on basically impossible.

  • Clarance Jones of Lynn, Massachusetts, the nation’s most frequent winner, claimed more than 7,300 tickets worth $600 or more in only six years.
  • Jones would have had to spend at least $300 million to have a 1-in-10 million chance of winning so often, according to a statistician we consulted at the University of California, Berkeley. (Jones did not respond to requests for comment.)
  • The odds are extraordinary even for winners with far smaller win tallies. According to the analysis, Nadine Vukovich, Pennsylvania’s most frequent winner, would have had to spend $7.8 million to have a 1-in-10 million chance of winning her 209 tickets worth $600 or more.
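For a sense of where numbers like these come from, here is a back-of-the-envelope version of the calculation. The per-ticket win probability and the $1 ticket price below are my assumptions, not CJR's (the real figure depends on the actual game odds), but the logic is the same: approximate the number of wins as Poisson, then search for how many tickets you would need before 209 wins is even a 1-in-10-million event.

```python
import math

def poisson_tail(lam, k):
    """P(X >= k) for X ~ Poisson(lam), computed from the lower tail."""
    term = math.exp(-lam)   # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return 1.0 - cdf

def tickets_needed(p, k, target=1e-7):
    """Smallest n (searched in 5% steps) with P(at least k wins) >= target."""
    n = k  # you need at least k tickets to win k times
    while poisson_tail(n * p, k) < target:
        n = int(n * 1.05) + 1
    return n

p = 1 / 1000   # assumed chance a single ticket wins $600+ (hypothetical)
k = 209        # Vukovich's win count from the article
n = tickets_needed(p, k)
print(f"~{n:,} tickets (${n:,} at $1 each) for a 1-in-10-million shot at {k} wins")
```

Even with these made-up, fairly generous per-ticket odds, the required spend is enormous; with realistic odds for $600+ prizes it balloons to the multimillion-dollar figures CJR reports.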

What could explain any of this? I don’t know, of course. But here are a few explanations that would fit the evidence.

  1. Psychic powers, or just straight up magic. Let’s rule this out.
  2. The data is flawed and/or the analysis of the data is flawed. An intern filled down the same name next to all the winning numbers in a spreadsheet. Something like this seems likely.
  3. Corruption. Certainly plausible.
  4. Computer bugs or computer hacking. This does not seem impossible to me. A pseudo-random number generator could be programmed wrong, using a seed that is predictable somehow. Or someone stole the code and figured out the seed. This has happened with slot machines. I don’t know how similar lottery machines are to slot machines but they would seem similar.
  5. People are figuring out ways to exploit certain obscure, flawed games. We know this has happened. The people who run the lottery know this too, and it is hard to imagine them making these mistakes often, and not correcting them quickly when they occasionally do.
  6. Shadowy crime syndicates, corporations, middle eastern princes, Russian oligarchs, Professor Moriarty (etc.) are funding corruption and/or exploiting flaws on a large scale and/or hacking into lottery computers. The world is not what it seems, and if you are not one of the chosen few you are just another victim plugged into the blood-sucking matrix.
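To illustrate what #4 could look like, here is a toy version of the predictable-seed failure. Everything here is illustrative; I have no idea how real lottery machines seed their generators. But if a draw machine seeded its PRNG from something guessable, like the draw time in whole seconds, anyone who could narrow down the seed could reproduce the draw exactly.

```python
import random

def draw_numbers(seed):
    """A hypothetical draw machine: 6 numbers from 1-49, fully
    determined by the seed it was started with."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, 50), 6))

draw_time = 1_700_000_000          # the (guessable) draw time, epoch seconds
official = draw_numbers(draw_time)

# An attacker who knows the seeding scheme just tries every second
# in a plausible window around the draw:
for guess in range(draw_time - 60, draw_time + 60):
    if draw_numbers(guess) == official:
        print("seed recovered:", guess, "numbers:", official)
        break
```

This is essentially the slot-machine attack mentioned above: once the seed is recoverable, the "random" outcome is just a lookup.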

I’d place most of my bets on #2 and #3, and a small side bet on #4 or #5.

beating the lottery

Here’s a long, interesting article in Huffington Post about a couple who developed a system to beat flawed lottery games in Michigan and Massachusetts. Eventually, they got found out, but not before making over $7 million. They reported all their earnings and paid all their taxes. Nobody really got in trouble, except some store owners who lost their licenses to sell lottery tickets for breaking minor rules. Some other groups of people managed to exploit this same game too.

As interesting as the whole story is, there are a few paragraphs buried in the middle that really caught my eye. There really are people out there who win the lottery more than anyone should by random chance.

A 2017 investigation by the Columbia Journalism Review found widespread anomalies in lottery results, difficult to explain by luck alone. According to CJR’s analysis, nearly 1,700 Americans have claimed winning tickets of $600 or more at least 50 times in the last seven years, including the country’s most frequent winner, a 79-year-old man from Massachusetts named Clarance W. Jones, who has redeemed more than 10,000 tickets for prizes exceeding $18 million.

It’s possible, as some lottery officials have speculated, that a few of these improbably lucky individuals are simply cashing tickets on behalf of others who don’t want to report the income. There are also cases in which players have colluded with lottery employees to cheat the game from the inside; last August, a director of a multistate lottery association was sentenced to 25 years in prison after using his computer programming skills to rig jackpots in Colorado, Iowa, Kansas, Oklahoma and Wisconsin, funneling $2.2 million to himself and his brother.

But it’s also possible that math whizzes like Jerry Selbee are finding and exploiting flaws that lottery officials haven’t noticed yet. In 2011, Harper’s wrote about “The Luckiest Woman on Earth,” Joan Ginther, who has won multimillion-dollar jackpots in the Texas lottery four times. Her professional background as a PhD statistician raised suspicions that Ginther had discovered an anomaly in Texas’ system. In a similar vein, a Stanford- and MIT-trained statistician named Mohan Srivastava proved in 2003 that he could predict patterns in certain kinds of scratch-off tickets in Canada, guessing the correct numbers around 90 percent of the time. Srivastava alerted authorities as soon as he found the flaw. If he could have exploited it, he later explained to a reporter at Wired, he would have, but he had calculated that it wasn’t worth his time. It would take too many hours to buy the tickets in bulk, count the winners, redeem them for prizes, file the tax forms. He already had a full-time job.

Bayes’ Theorem

The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy

There aren’t that many popular books on hard-core statistical approaches to predicting the future. Here is the Amazon description of this book:

Drawing on primary source material and interviews with statisticians and other scientists, “The Theory That Would Not Die” is the riveting account of how a seemingly simple theorem ignited one of the greatest scientific controversies of all time. Bayes’ rule appears to be a straightforward, one-line theorem: by updating our initial beliefs with objective new information, we get a new and improved belief. To its adherents, it is an elegant statement about learning from experience. To its opponents, it is subjectivity run amok. In the first-ever account of Bayes’ rule for general readers, Sharon Bertsch McGrayne explores this controversial theorem and the human obsessions surrounding it. She traces its discovery by an amateur mathematician in the 1740s through its development into roughly its modern form by French scientist Pierre Simon Laplace. She reveals why respected statisticians rendered it professionally taboo for 150 years – at the same time that practitioners relied on it to solve crises involving great uncertainty and scanty information, even breaking Germany’s Enigma code during World War II, and explains how the advent of off-the-shelf computer technology in the 1980s proved to be a game-changer. Today, Bayes’ rule is used everywhere from DNA decoding to Homeland Security. “The Theory That Would Not Die” is a vivid account of the generations-long dispute over one of the greatest breakthroughs in the history of applied mathematics and statistics.

Dense as all this might seem, it matters as we enter a more data-driven future, and we need people with the knowledge and training to deal with it. We should no longer assume that steering our sons into math, statistics, and actuarial science majors means they will never get a date.

There’s a much more hard-core set of slides on Bayes’ Theorem available on R-bloggers.
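The "one-line theorem" from the excerpt can be shown with a few lines of arithmetic. The numbers below are made up for illustration (the classic medical-test example), but the update rule is exactly Bayes' rule: posterior = likelihood × prior / evidence.

```python
# P(H | E) = P(E | H) * P(H) / P(E), with illustrative numbers.
prior = 0.01            # initial belief: 1% of people have the condition
sensitivity = 0.95      # P(test positive | condition)
false_positive = 0.05   # P(test positive | no condition)

# Total probability of the new evidence (a positive test):
p_evidence = sensitivity * prior + false_positive * (1 - prior)

# Bayes' rule: updated belief after seeing the evidence
posterior = sensitivity * prior / p_evidence
print(f"belief after one positive test: {posterior:.3f}")   # ~0.161

# "Learning from experience": the posterior becomes the new prior
prior2 = posterior
p_evidence2 = sensitivity * prior2 + false_positive * (1 - prior2)
posterior2 = sensitivity * prior2 / p_evidence2
print(f"belief after a second positive test: {posterior2:.3f}")
```

A single positive test moves the belief from 1% to only about 16%, because the condition is rare; a second positive test pushes it near 80%. That feedback loop of belief revision is the whole controversy in miniature.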