Saturday 9 March 2013

What happened to the scientific process?


David Greenlaw, James Hamilton, Peter Hooper, and Rick Mishkin (no, those names did not mean much to me either) have published an op-ed in the Wall Street Journal based on their recent paper on debts, deficits and borrowing interest rates.

Thus far, the sentence reveals nothing extraordinary. What they end up concluding (the most policy-absorbed might have surmised it from where their op-ed was published) is that there is a strong relationship between debt level and interest rates. There is a detailed takedown of the conclusions by Matt O'Brien, so I'll focus on something else. The data they based their conclusion on can be plotted on a graph easily enough, like this:

[Figure: stratified scatterplot of interest rates against gross debt (% of GDP), euro and non-euro countries shown with different markers; borrowed from Paul Krugman]

You can see that on this graph (which I borrowed from Paul Krugman) there are two different kinds of dots. This is called a stratified scatterplot. You plot all your data points according to the two characteristics between which you think there might be a relation (in this case, interest rates and gross debt, measured as a percentage of GDP), but you stratify the data into categories that are both exhaustive (each data point is in one of the groups) and mutually exclusive (no point is in two groups). Here, the categories are euro and non-euro. Each category then gets its own marker.
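
To show how little machinery this takes, here is a minimal sketch in Python with matplotlib. The numbers are invented for illustration; they are not the paper's data.

    import matplotlib.pyplot as plt

    # Invented points: (gross debt as % of GDP, interest rate in %).
    # NOT the paper's data -- just two populations that behave differently.
    euro = [(80, 4.0), (95, 5.5), (120, 6.8), (160, 15.0)]
    non_euro = [(70, 3.0), (90, 2.8), (110, 2.9), (230, 1.0)]

    # A stratified scatterplot is nothing more than one marker per category.
    plt.scatter(*zip(*euro), marker="o", label="euro")
    plt.scatter(*zip(*non_euro), marker="s", label="non-euro")
    plt.xlabel("Gross debt (% of GDP)")
    plt.ylabel("Interest rate (%)")
    plt.legend()
    plt.show()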

And what it shows here is that the two populations (euro and non-euro) behave totally differently. If you were to fit a line to each, the euro line would be fairly steep, while the non-euro line would be flat, possibly even very slightly downward-sloping.

I tried to explain it thoroughly because this is a written medium and I can't see the feedback of a nod of understanding, but one thing should be clear: in terms of data analysis, this is very basic. I've been teaching Lean Six Sigma for some years, and it's a field in which the scientifically minded can feel some frustration that there is not enough time to properly cover analytical tools.
Well, the stratified scatterplot was mentioned in the half-day introduction sessions, and taught in every course, from the two-day Yellow Belt to the four-week Black Belt. Bear in mind that analytical tools are only one of the topics covered (they get a third of the time at most), and the candidates may have no background at all in data analysis.

Typically, a similar graph would be given as an exercise in class, and we wouldn't move on until the class had correctly explained what was going on. It never delayed a class much. I have interviewed candidates for Lean Six Sigma roles, and if one had stumbled on such a simple exercise there would have had to be a VERY pressing reason for his application not to end there and then. I swear that, even on a bad day, it would not take me five seconds to think "there are two populations there", and that is without the distinct markers for euro and non-euro.

What you then learn is that, when there is such a clear stratification, you should never fit the kind of model called a "regression" to the whole of the data (though you might do so on each stratum).
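
To make that concrete, here is a sketch of the difference, using the same invented numbers as above and numpy's ordinary least-squares line fit:

    import numpy as np

    # Same invented data as before: (debt % of GDP, interest rate %).
    euro_x, euro_y = np.array([80, 95, 120, 160]), np.array([4.0, 5.5, 6.8, 15.0])
    non_x, non_y = np.array([70, 90, 110, 230]), np.array([3.0, 2.8, 2.9, 1.0])

    # Regression on the whole of the data: lumps two populations together.
    all_x = np.concatenate([euro_x, non_x])
    all_y = np.concatenate([euro_y, non_y])
    pooled_slope, _ = np.polyfit(all_x, all_y, 1)

    # Regression on each stratum: one line per population.
    euro_slope, _ = np.polyfit(euro_x, euro_y, 1)
    non_slope, _ = np.polyfit(non_x, non_y, 1)

    print(f"pooled slope:   {pooled_slope:+.4f}")  # a meaningless blend
    print(f"euro slope:     {euro_slope:+.4f}")    # steep and positive
    print(f"non-euro slope: {non_slope:+.4f}")     # flat, slightly negative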

So guess what our four op-ed writers had done? They ran a regression on the whole of the data.

Now, that this could be published in the Wall Street Journal should not necessarily come as a surprise. The WSJ probably has less truth in it than Pravda ever had (although at least its title is not misleading in the way Pravda, "truth", was: it really is what Wall Street wants to read). But their op-ed was based on a published paper.

Now, the way things are supposed to go in science (even in those fields where the word "science" is stretched almost to breaking point) is that, for your paper to be published, it needs to be peer-reviewed. This paper was based on the most horrible use of regression you could imagine. Yet it was published. What is going on?

In case you're tempted to be charitable and think that maybe everyone just drew a blank and overlooked it:
  • In the paper they explain how they collected the data: they restricted their analysis to countries that borrow in their own currency ... yet over half of their points are euro countries, which don't even have their own currency. Note that the peer reviewers did not see fit to object.
  • After writing that it makes a big difference to have your own currency, they do not stratify the data by it.
  • Even if you got that far without noticing, any software that will run a regression for you will give you something called residuals analysis, which would scream bloody murder here (a sketch of such a check follows this list). It would also yell that two points (they are, as some will have guessed, Greece and Japan) carry far too much weight for a regression on the data to be meaningful. The authors must have had that in front of their eyes, as it is probably what makes them say that their model "does not have predictive power". But it does not even have descriptive power. Yet they submitted their paper, and it was accepted.
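
Here is what such a check might look like, still with the invented numbers; statsmodels reports both the residuals and influence measures such as Cook's distance, which flags the points that dominate a fit:

    import numpy as np
    import statsmodels.api as sm

    # Pooled invented data again -- NOT the paper's actual figures.
    x = np.array([80, 95, 120, 160, 70, 90, 110, 230])       # debt, % of GDP
    y = np.array([4.0, 5.5, 6.8, 15.0, 3.0, 2.8, 2.9, 1.0])  # rate, %

    model = sm.OLS(y, sm.add_constant(x)).fit()

    # With two mixed populations, the residuals show a pattern, not noise.
    print(model.resid)

    # Cook's distance flags the points that dominate the fit
    # (here, the Greece-like and Japan-like outliers).
    cooks_d, _ = model.get_influence().cooks_distance
    print(cooks_d)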

To really rub it in, one of the authors, David Greenlaw, happens to be ... Morgan Stanley's chief economist! We have been told for years that extremely high salaries in finance were justified and needed in order to attract the best and brightest. Well... I must admit that, when teaching in investment banks, I often amused myself by picking up a few of their economics notes from the reception and making the class work on finding the mistakes, but the mistakes were never that trivial.
Quite a few of the people I taught in two-day classes would not have been paid much (if at all) above the median wage. But not one of them would have made the simplistic error made by the "best and brightest" David Greenlaw, who must earn over 40 times as much.

And who then submitted it, got it peer-reviewed, and published.

Something has gone very wrong with the peer-review process.
