(As a side note, this is the problem with not having enough time. I caught on
to this story very early and wrote most of this post then, but I was offline.
By now, it has done the rounds and more. Still, here’s another take.)
Oh my!
Regular readers (yes, you two, I’m talking to you) will remember that I wrote
an article (“What happened to the scientific process?”) expressing some dismay
regarding what can apparently be published these days, even when it fails the
simplest rules of data analysis, the kind that you learn in a two-day course if
you work in a lowish operational job.
OK, maybe I should have phrased that as “what can apparently be published when
it happens to suit what the powerful want to read”, but that would have been a
stronger statement, and all I was concerned with was the (very) poor quality of
the analysis.
Still, it seems that this was small fry. Those who follow macroeconomics at
all, or who follow politics beyond poll-tracking, will probably have heard
either of Carmen Reinhart and Kenneth Rogoff or at least of one of their
results, even if they were not named. Despite being highly controversial in
academic circles, that result has been treated as consensus by the mainstream
media, aka Very Serious People, who say things like “90% of GDP, the level of
debt that economists recognise as strongly hampering economic prospects”.
Actually, since I’ve also been posting about chess, some people have heard of
them even without following economics or politics at all: Kenneth Rogoff is a
chess grandmaster, and articles about him, and his research, have been
published in chess media.
Anyway. It turns out that their most discussed result (by no means their only
one, and they seem to have done some properly conducted and interesting
research as well) rests on analysis even worse than what I had been discussing.
Before I briefly discuss what they did (it’s been done very well by others on
the web, and it’s only fair to link to their work there, there, or there for
the original refutation paper), let’s state their “conclusion”: when public
debt to GDP goes over 90%, growth is strongly hampered and in fact probably
turns negative.
The study was embarrassingly poor in any case (the 90% threshold was arbitrary,
the data samples were extremely small, and everything was done through a simple
correlation despite a well-established reverse causation that they did not even
try to correct for…), but it gets worse when you simply try to replicate their
results. Because if you’re honest and careful, you can’t do it.
It appears that there are no fewer than three howlers, none of which was
discussed in their description of the study. One of them was possibly a simple
clicking mistake (I do mean possibly, since it’s hard to prove intention but
even harder to disprove it).
First, they eliminated quite a few data points, for reasons that we can only
guess, since they don’t even mention that they have done it. ALL of the points
they took out would have gone against their claim. In fact, the 6 points they
took out for New Zealand would, had they been included, have added 1.5% per
year to the calculated growth worldwide, annihilating their claims. Instead,
what they did was to eliminate all the years that had average growth despite
high debt, and keep the final year, which had plummeting growth. This is beyond
ineptitude. If a country brings its debt back down through one heroic effort at
cutting public spending (which will tank the economy), you’d expect to see
hugely negative growth until the country goes back under the 90% threshold, at
which point it stops appearing in the data. But then it’s the silly massive
spending cut that is the cause, not the initial debt, as shown by the previous
years of high debt and average growth.
Second, they then used one point per “episode” of high debt, disregarding the
number of years it lasted (or indeed the size of the country affected, a major
factor in estimating the effectiveness of Keynesian stimulus). So, having taken
out all the years of good growth but high debt for several countries, they just
kept the one year of tanking economy and called it an episode. For instance,
New Zealand was deemed to have had an episode of high debt that resulted in
-7.4% growth.
And that had the same weighting as the 25 straight years of healthy growth in
the UK despite much higher debt. The reasoning behind that extraordinary choice
is… well, about to be published I should think.
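To see how much the weighting choice alone can matter, here is a minimal sketch in Python. It is not their spreadsheet or their data: New Zealand’s single year at -7.4% and the UK’s 25 years are the figures quoted above, while the UK growth rate of 2.5% is a placeholder I made up to stand for “healthy growth”.

```python
# Growth (% per year) for each year a country spent above the debt threshold.
# New Zealand's -7.4 and the UK's 25 years are the figures quoted in the post;
# the UK's 2.5% is a made-up placeholder standing in for "healthy growth".
episodes = {
    "New Zealand": [-7.4],
    "United Kingdom": [2.5] * 25,
}


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# One point per episode: each country counts once, however many years it has.
per_episode = mean([mean(years) for years in episodes.values()])

# One point per country-year: every observed year counts once.
per_year = mean([g for years in episodes.values() for g in years])

print(f"one point per episode: {per_episode:+.2f}%")
print(f"one point per year:    {per_year:+.2f}%")
```

With these two countries, the per-episode average comes out negative and the per-year average positive, from exactly the same observations. Only the weighting changed.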
Third, having a nice set of by now completely unrepresentative numbers, they…
forgot to include the last 5 countries in their average. Had they not done
that, the result would have been +0.2% rather than -0.1%. And of course, the
sign change made a big difference in the overall impact.
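For those who have never been bitten by a spreadsheet range that stops a few rows short, here is a small sketch of the mechanism. The numbers are entirely invented for illustration (they are not the study’s figures, and the list is much shorter than the real one). The only thing taken from the story is that the last 5 rows are left out of the average.

```python
# Entirely made-up per-country average growth figures (%), in row order.
# Only the "last 5 rows left out of the average" part mirrors the post.
country_growth = [-2.0, -1.5, -1.0, 0.5, -0.5, 1.0, 1.5, 2.0, 2.5, 3.0]


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


truncated = mean(country_growth[:-5])  # the range stops 5 rows too early
complete = mean(country_growth)        # the range covers every row

print(f"last 5 rows missing: {truncated:+.2f}%")
print(f"all rows included:   {complete:+.2f}%")
```

In this toy list the omitted rows happen to be the good ones, so the truncated average is negative while the complete one is positive. Nothing in the result tells you that rows are missing, which is why this kind of slip survives until someone tries to replicate the numbers.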
Now, this time, I’m not going to say that we ran similar exercises in
introductory classes and that we would not go further until everyone
understood. No, it’s much worse. It’s not that a simple thing went unnoticed,
it’s that they came up with very creative ways of causing grievous bodily harm
to their dataset. We would not even have discussed selectively removing data,
or averaging between samples whose sizes differ by a ratio of 1 to 25 (or 1 to
500 if you want to take the size of the country into account).
It’s just a whole new dimension. So next time you go talk to your banker,
insist that the stats on your account eliminate any day that is not payday.
Then average the one month where you get your bonus with the same weight as the
other 11 put together. Then remove the less flattering years. Then state that
the result is your typical daily income. I’m not sure that you’ll get a much
bigger mortgage, but you may become one of the most famous macro-economists of
our time.