In the French local elections of March 2014, voter turnout hit a record low (around 64%) and the National Front (FN) won 11 town councils. The link between the two was discussed extensively after the vote (see here and here for examples, in French): was the National Front's score boosted because traditional voters for mainstream parties did not bother to turn out? Or, as the FN's leaders would argue, is the National Front automatically at a disadvantage when turnout is low, since more "outside the system" citizens stay home?
Answering that question seems rather simple: let’s look at voter turnout and the score of the National Front in a sample of towns, and a correlation will do the trick. Actually, no, it won’t. The answer is not that easy, because such a correlation is biased.
How can a simple correlation be biased? Because of a crucial statistical issue: endogeneity. The two questions above are actually badly worded. They look for a one-way causal link from voter turnout to the National Front vote when, most probably, the two are decided simultaneously: either I don't show up, or I vote FN.
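To make the source of the bias concrete, here is the standard textbook expression for a one-variable linear model. The notation is ours and purely illustrative: $y_i$ stands for the FN score in town $i$, $x_i$ for the abstention rate, and $\varepsilon_i$ for everything else that drives the FN score.

```latex
y_i = \alpha + \beta x_i + \varepsilon_i ,
\qquad
\operatorname{plim}\, \hat{\beta}_{\mathrm{OLS}}
  = \beta + \frac{\operatorname{Cov}(x_i, \varepsilon_i)}{\operatorname{Var}(x_i)} .
```

If abstaining and voting FN are decided together, $x_i$ and $\varepsilon_i$ are correlated, the second term does not vanish, and the naive estimate is off by that amount, upward or downward depending on the sign of the covariance.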
Endogeneity is a complex concept. Those interested in it can read our article. But it is not new. It can be shown, for example, that the impact of advertising expenditure on a movie's revenue is badly overestimated if endogeneity is not taken into account in the analysis. In much the same way, the impact of a shorter working week on productivity would be overestimated.
In our article, we estimate a model linking voter turnout and the score of the National Front. We use data from the 2012 parliamentary elections. Estimating the model with correlation techniques (which are therefore biased) leads us to believe there is a small positive link between the two: 1% more abstaining voters is associated with 0.1% or 0.2% more for the FN.
If endogeneity is taken into account, the impact is roughly three times larger: around 0.5% more for the National Front for 1% more non-voters. These figures show the size of the diagnostic error made when the wrong analysis tools are used.
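The gap between a naive estimate and an endogeneity-corrected one can be reproduced on simulated data. The sketch below is not the model from our article and uses no real election data: the variable names, the coefficients, and above all the instrument (a hypothetical variable that shifts abstention without directly affecting the FN score) are assumptions chosen for illustration. It compares an ordinary least-squares slope with a simple instrumental-variable (Wald) estimator.

```python
import numpy as np

# Purely simulated data: every number below is an assumption, not an estimate.
rng = np.random.default_rng(0)
n_towns = 20_000
true_effect = 0.5  # assumed causal effect of abstention on the FN score

# Hypothetical instrument: moves abstention, has no direct effect on the FN score.
instrument = rng.normal(size=n_towns)

# Unobserved local factor that turns would-be abstainers into FN voters.
# It lowers abstention AND raises the FN score: this is the endogeneity.
mobilisation = rng.normal(size=n_towns)

abstention = instrument - 0.8 * mobilisation + rng.normal(size=n_towns)
fn_score = true_effect * abstention + mobilisation + rng.normal(size=n_towns)


def ols_slope(x, y):
    """Slope of the least-squares line of y on x: Cov(x, y) / Var(x)."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]


def iv_slope(z, x, y):
    """Instrumental-variable (Wald) estimator: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


ols_estimate = ols_slope(abstention, fn_score)
iv_estimate = iv_slope(instrument, abstention, fn_score)

print(f"naive OLS slope: {ols_estimate:.2f}")
print(f"IV slope:        {iv_estimate:.2f}")
print(f"assumed effect:  {true_effect:.2f}")
```

With these assumed coefficients, the naive slope settles around 0.2 in large samples, well below the assumed effect of 0.5, while the instrumental-variable slope lands close to it. The direction and size of the gap depend entirely on the simulated correlation between abstention and the unobserved factor.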
Two final points:
Another reason why simple correlation analyses are irrelevant is that they don't take into account the heterogeneity across towns. Our model assesses the impact of a dozen characteristics of the areas concerned on the National Front vote. We will come back to that in a forthcoming blog post.
Endogeneity is also an issue in many quantitative marketing analyses. Deciding whether to launch a new product, understanding how customer loyalty is structured: these analyses can be biased if endogeneity is not correctly taken into account. As always, an expert knowledge of statistical theory has a direct impact on how successful your operational decisions will be.