Tag Archives: machine learning

My Dear Watson

Sir Arthur Conan Doyle’s Watson was, in fact, a medical doctor. IBM’s Watson is not a doctor, but tried to play one in real life, and apparently failed. The idea was to use machine learning to crunch huge amounts of medical records and treatment data, and provide recommendations to real doctors treating real patients. And according to this article in Slate, it just didn’t work and the division is being “sold for parts”. I assume this means parts of the IBM legal entity and/or its “intellectual property” are being sold, not the actual computer hardware, which must be obsolete by now.

Along with the recent Zillow house flipping failure, this seems like another high-profile failure for machine learning/AI-based business plans. It might be that the business plans are ahead of the technology, or the technology is ahead of the data (one gets tired of the phrase “garbage in, garbage out”, but it is a real thing – a lot of what is in my medical records is garbage, anyway), or both.

the (thing everybody calls the) Nobel Prize in economics

This year’s Nobel prize for economics, which we are supposed to call the “Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel”, is for methods of drawing causal conclusions from natural experiments in data. The importance of this is to get over the “correlation is not causation” hump and actually be able to make some statements about causation. In my very simplistic understanding, you would find at least two data sets where at least two variables are correlated in one but not the other, and the state of the system they represent is roughly the same except for one other variable. Then you can infer that the other variable had some role in causing the correlation or lack thereof. This is how you would design an experiment, of course, but in this case you are looking in existing data sets for cases where this occurred “naturally”.

That’s my simplistic understanding. Let’s look at how Nature describes it.

In 1994, Angrist and Imbens developed a mathematical formalization for extracting reliable information about causation from natural experiments, even if their ‘design’ is limited and compromised by unknown circumstances such as incomplete compliance by participants. Their approach showed which causal conclusions could and could not be supported in a given situation.

Nature
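
To make the “incomplete compliance” part a bit more concrete, here is a minimal simulation sketch. It is my own toy illustration, not the laureates’ actual method or data. Everything in it is assumed: a made-up lottery (`z`) offers a treatment at random, only 60% of people (“compliers”) take it up when offered, and compliers also happen to have a higher baseline outcome. Simply comparing treated and untreated people overstates the effect, while dividing the lottery’s effect on the outcome by its effect on take-up (usually called the Wald estimator) recovers the effect for the compliers.

```python
import random
from statistics import mean


def simulate(n=100_000, true_effect=2.0, seed=0):
    """Generate hypothetical (z, d, y) rows for a lottery with incomplete compliance."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5         # instrument: random offer of treatment
        complier = rng.random() < 0.6  # assumed: 60% take treatment if offered
        d = z and complier             # treatment actually received
        # Assumed confounding: compliers have a higher baseline outcome.
        baseline = 1.0 if complier else 0.0
        y = baseline + true_effect * d + rng.gauss(0, 1)
        rows.append((z, d, y))
    return rows


def naive_difference(rows):
    """Treated minus untreated mean outcome; biased by who chooses treatment."""
    treated = [y for _, d, y in rows if d]
    untreated = [y for _, d, y in rows if not d]
    return mean(treated) - mean(untreated)


def wald_estimate(rows):
    """Effect of the offer on the outcome divided by its effect on take-up."""
    y1 = mean(y for z, _, y in rows if z)
    y0 = mean(y for z, _, y in rows if not z)
    d1 = mean(float(d) for z, d, _ in rows if z)
    d0 = mean(float(d) for z, d, _ in rows if not z)
    return (y1 - y0) / (d1 - d0)


if __name__ == "__main__":
    rows = simulate()
    print("naive difference:", round(naive_difference(rows), 2))
    print("Wald estimate:   ", round(wald_estimate(rows), 2))
```

With these made-up numbers the naive comparison should land noticeably above the true effect of 2.0, while the Wald estimate should land close to it. The catch is that the answer only applies to the people whose behavior the lottery actually changed, which is roughly what “which causal conclusions could and could not be supported” means here.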

It seems like this would have applications well beyond economics and social science. For example, in ecology, and environmental science in general, there are so many variables and complex interactions that setting up randomized controlled experiments is daunting (although it can be done, in ecological microcosms, for example). It must have evil applications too, of course, from advertising to politics.