


Introduction
"Regression to the mean" is well understood in road safety circles. It is all the more shameful, then, that this statistical error has not been eliminated from the official reviews of speed camera effectiveness. But what is regression to the mean? It happens when a random grouping is treated. In the context of road accidents we obviously wish to "treat" the most dangerous places to make them safer. But accidents are distributed with a fair degree of randomness: if a road is treated after it has suffered an untypical random grouping of accidents, then with or without the benefit of the treatment the random grouping is unlikely to recur next year, and the figures will improve. This effect is an absolute cornerstone of the benefit claims we see for speed cameras. Most of the benefits claimed for most of the cameras most of the time are actually a regression to the mean benefit illusion. See the diagrams below:
How does it work?
Regression to the mean is likely when a blip in the circumstances triggers a remedial treatment. Consider the following diagrams. In this case, as in road accidents, we would like to reduce the average level by applying a treatment. Naturally, in road safety we would like to treat the worst places first, but we must be very careful to establish whether the places we treat are true long term danger spots or are simply showing some sort of random blip. There are certainly a lot of random blips, and even some genuine long term danger spots show random blips that make them appear even more dangerous. Applying a safety treatment when the figures are unusually high will always create a regression to the mean benefit illusion. Obviously all four types are likely to exist across the road network. We freely admit that there will be cameras which show a genuine benefit, but they are in very special and unusual circumstances. More typically cameras are installed with no safety benefit, but very frequently with a benefit illusion. Cameras also affect the way that drivers think, and that effect extends across the entire road network. We believe that cameras make the roads in general more dangerous. (click here) Even where an unusual camera installation actually produces a local benefit, it could still produce an increase in accidents overall.
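The benefit illusion described above can be reproduced with a short simulation. This is only a minimal sketch under assumptions that are ours and not taken from any official data: every site has the same true risk (4 accidents per period on average), counts follow a Poisson distribution, and a site is "selected" when its before count reaches 8. The function names, the threshold and the rates are all hypothetical. No treatment is applied at any point.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson-distributed count (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def untreated_change(true_rate=4.0, n_sites=20000, threshold=8, seed=1):
    """Select sites whose 'before' count reaches the threshold, then do nothing.

    All parameters are illustrative assumptions. Returns the mean 'before'
    and mean 'after' count of the selected sites.
    """
    rng = random.Random(seed)
    selected = before_total = after_total = 0
    for _ in range(n_sites):
        before = poisson(rng, true_rate)
        after = poisson(rng, true_rate)  # same underlying risk, nothing changed
        if before >= threshold:
            selected += 1
            before_total += before
            after_total += after
    return before_total / selected, after_total / selected


before_mean, after_mean = untreated_change()
print(f"selected sites: before {before_mean:.2f}, after {after_mean:.2f}")
```

Under these assumptions the selected sites average at least 8 accidents before and fall back towards 4 afterwards, even though nothing at all was done to them.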
The official "Road Safety Good Practice Guide"
The Road Safety Good Practice Guide (published by the Department for Transport) has this to say about regression to the mean:
5.119 This effect, sometimes called bias by selection, complicates evaluations at sites with high accident numbers (blackspot sites) in that these sites have often been chosen following a year with particularly high numbers occurring. In practice their accidents will tend to reduce in the next year even if no treatment is applied. Even if three-year accident totals are considered at the worst accident sites in an area, it is likely that the accident frequencies were at the high end of the naturally occurring random fluctuations, and in subsequent years these sites will experience lower numbers. This is known as regression-to-the-mean.
And in the context of the rules for Speed Camera placement...
Notice above that they are talking about a site with "more than 8 injury accidents per year"? But the rules for fixed speed camera placement from "The Handbook", as reported in the report of the two year pilot, are as follows:
At least 4 killed or seriously injured accidents per km in last three calendar years (not per annum)
Let's consider how that compares with the table above...
Firstly, we'd expect 8 accidents per year to lead us to an effect on the high side of the probable range suggested (while 25 per year might give rise to an effect on the low side of the range suggested).
Secondly, suppose we have a speed camera placed in accordance with the minimum criteria from The Handbook. 4 KSI per three years is equivalent to 1.333 KSI per year, or just 1/6th of the rate considered above.
Thirdly, it isn't the rate per year that matters for regression to the mean. It's the number of accidents per time period, and there's nothing special about a year as such. The table above would be perfectly sensible and equally accurate with 8, 16 and 24 accidents instead of 1, 2 and 3 years for a sample site.
Fourthly, we therefore created the graph below and extrapolated the range downward. We also added the "total accidents" scale. The extrapolated section (i.e. below 8 accidents or 1 year) is paler.
Fifthly, using the extrapolated graph as a crude estimation tool, we deduce that the probable regression to the mean error for a "minimum KSI" (i.e. 4 accidents) site under the current rules would be likely to be in the range 20% to 36%. And note that we're again considering the low end of the accident number scale, which will tend to correlate with the high end of the error scale: it's more likely to be around 36% than around 20% for our "minimum KSI" site.
Sixthly, we should be very worried about the "per km" specification in the site size. Fixed speed cameras are highly unlikely to affect an entire kilometre of road. This just encourages genuinely random clusters to be considered and treated, so increasing the regression to the mean error: some of the accidents had no right to be counted in the group because they were simply too far away.
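The rate comparison above is simple arithmetic, sketched below. The second function rests on an assumption of ours, not on anything in the Guide or The Handbook: if accident counts follow a Poisson distribution, the standard deviation is the square root of the mean, so the relative random scatter is 1/sqrt(expected count). That would be one reason why small counts sit at the high end of the error scale. The function names are hypothetical.

```python
import math


def per_year(count, years):
    """Convert a count over several years to an annual rate."""
    return count / years


def relative_scatter(expected_count):
    """Relative random scatter of a Poisson count: sd / mean = 1 / sqrt(mean).

    Assumes Poisson-distributed accidents, which is our assumption only.
    """
    return 1 / math.sqrt(expected_count)


handbook_rate = per_year(4, 3)      # minimum KSI criterion from The Handbook
guide_rate = 8.0                    # "more than 8 injury accidents per year"
ratio = handbook_rate / guide_rate  # about one sixth

for total in (4, 8, 16, 24):
    print(total, f"{relative_scatter(total):.0%}")
```

On this assumption a site with 4 accidents in the period has twice the relative scatter of a site with 16.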
The Australian Government "Blackspot program" (click here) has this to say:
"Regression-to-mean"
Evaluating the treatment applied is discussed, and various estimates are made of the effectiveness of the treatment. It's worth a read. Obviously there's a significant possibility that the post treatment results do not actually represent any benefit at all from the treatment applied, while equally the claim might be made that the treatment has cut crashes by 76% (from 21 to 5).
It's a setup
Whether by design or by accident, the current rules encourage massive regression to the mean errors. The report of the two year pilot claimed a reduction in KSI of 35% at speed camera sites, but with the rules for speed camera placement as they are and the regression to the mean error laid out here, the entire benefit claimed could be due to regression to the mean. A regression to the mean error as described here could show a benefit illusion at "speed camera sites", but show absolutely no improvement in area wide figures. And this is exactly what we observe. See for example the table (click here) where we discover that counties outside the hypothecation scheme returned better overall road safety results than those within it.
The following quote from the report of the two year pilot illustrates the deliberate obfuscation about the vital regression to the mean error:
"We could not obtain data for the before period for individual sites other than at camera sites. It was therefore not possible to check fully for regression to the mean at the site level. The results for areas that bid unsuccessfully for participation in the pilot could be used as a comparison for what might have occurred in participating areas if they had not been treated. The PIA and KSI frequencies for these areas do not differ significantly from other similar areas that did not bid for pilot status at all. On this basis, there is no evidence in the present data for any substantial illusory benefit due to the regression to the mean effect."
Those are weasel words. We expect that having insufficient data to check for regression to the mean benefit illusions was a considerable relief to those who wished the report to show a benefit. We wrote to the report's authors requesting clarification of this point and others. You can read the correspondence (here).
What is a black spot?
Years ago, perhaps in the 1960s, something like half of all accidents took place at "black spots". In those days we knew what a black spot was: it was a location where many drivers made the same mistake and had similar accidents. Perhaps a deceptive bend, a hidden dip to catch out overtaking traffic or a dangerous junction. But over the last 40 years we've applied very good engineering treatments to most of the "old fashioned" black spots. Some are very easy to treat effectively: with the hidden dip example, all that was probably needed was a double white centre line. These black spot treatments have largely been effective and accidents are now much more likely to occur away from specific danger spots. Some estimates put the traditional "black spot accidents" at under 20% of all modern accidents. So accidents are no longer predominantly focused on dangerous features, and the definition of an accident black spot has tended to follow the trend. Now, sometimes, a 20 mile stretch of A road might be regarded as dangerous and might even be termed a black spot. There are two ways of measuring the danger of a road. You could count the number of accidents, or you could count the number of accidents per vehicle using the road. Both are valid. A little used road with 3 accidents per year might represent more danger to each driver than a busy stretch of A road with 50 accidents each year. Safety treatments are generally better applied to roads with a higher accident count because more people benefit.
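The two measures of danger can be compared with a few lines of arithmetic. The accident counts (3 and 50 per year) come from the example above; the traffic flows are invented purely for illustration, and the function name is hypothetical.

```python
def accidents_per_million_vehicles(accidents_per_year, vehicles_per_year):
    """Risk to the individual driver, expressed per million vehicles."""
    return accidents_per_year / vehicles_per_year * 1_000_000


# Hypothetical traffic flows, chosen only to illustrate the point.
quiet_road = accidents_per_million_vehicles(3, 100_000)
busy_road = accidents_per_million_vehicles(50, 5_000_000)

print(f"quiet road: {quiet_road:.0f} per million vehicles")
print(f"busy road:  {busy_road:.0f} per million vehicles")
```

With these made-up flows the quiet road is three times as risky per vehicle, while the busy road still has far more accidents in total, which is why the two measures can point at different roads.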
Regression to the mean: further reading, history and solutions
History: Francis Galton documented the phenomenon in 1886. Galton measured the height of 930 adult children and their parents and calculated the average height of the parents. He noted that when the average height of the parents was greater than the mean of the population, the children tended to be shorter than the parents. Likewise, when the average height of the parents was shorter than the population mean, the children tended to be taller than their parents. Galton called this phenomenon regression towards mediocrity, and it is now known as regression to the mean. (from here)
A solution: Bayes before-after procedure described (here) and (here)
Further reading:
Accident reduction factors and causal inference in traffic safety studies: a review. Gary A. Davis, 1999 (click here) (it will probably be necessary to rename this file to a .pdf)
Classification Society of North America Newsletter #47. A nice overview with quotes and references. (click here)
The Regressive Fallacy (click here)
Examples
We complain to Professor Heydecker about regression to the mean benefit illusion in the 2003 report: (Safe Speed Heydecker letters)
Professor Heydecker admits to regression to the mean in 2003 report: (BBC Radio 4: More or Less)
2004 official report depends on regression to the mean error: (Safe Speed PR126)
We challenge Professor Heydecker over 2004 report: (Safe Speed Heydecker letter, 2004)
Numberwatch web site (edited by Professor John Brignall) agrees: (Numberwatch on Heydecker)
Conclusions
Regression to the mean is a well known and well documented source of error when analysing road safety treatments. The rules for speed camera placement very specifically encourage regression to the mean benefit illusions, yet the official claims for speed camera effectiveness deliberately obfuscate the issue and fail to apply compensation. It's also notable that the rules for speed camera placement specifically encourage placing them at short term clusters of accidents. The rules are based on accidents "in the last three years". If they had wished to avoid placing the cameras at random clusters they could have made the figure 7 years, or required a measure of year on year consistency. The fact that they didn't make any attempt to avoid "wasting" the effects of a camera on a random cluster speaks volumes about the motivation. Of course, regression to the mean is just one of the errors. You can read about the entire catalogue (here) 
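The effect of the length of the qualifying period can be explored with a rough simulation. This is a sketch under strong assumptions of ours: every site has the same true risk (1 accident per year), counts are Poisson distributed, the worst 5% of sites by count in the window are selected, and nothing is done to them. All names and parameters are hypothetical, and real sites differ in true risk, so the numbers indicate direction only.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson-distributed count (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def apparent_reduction(window_years, rate_per_year=1.0, n_sites=20000,
                       worst_fraction=0.05, seed=7):
    """Fractional 'reduction' at the worst sites with no treatment applied.

    Sites are ranked on their count over the qualifying window and compared
    with an equally long following period at unchanged risk.
    """
    rng = random.Random(seed)
    mean = rate_per_year * window_years
    sites = [(poisson(rng, mean), poisson(rng, mean)) for _ in range(n_sites)]
    sites.sort(key=lambda pair: pair[0], reverse=True)  # worst 'before' first
    worst = sites[:int(n_sites * worst_fraction)]
    before = sum(b for b, _ in worst)
    after = sum(a for _, a in worst)
    return 1 - after / before


three_year = apparent_reduction(3)
seven_year = apparent_reduction(7)
print(f"3 year window: {three_year:.0%} illusory reduction")
print(f"7 year window: {seven_year:.0%} illusory reduction")
```

Under these assumptions the longer window shows a smaller illusory reduction but does not eliminate it, which is consistent with the suggestion that a longer qualifying period, or a test of year on year consistency, would waste fewer cameras on random clusters.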