Friday, March 05, 2010

Why is Stephen Joyce using old data?

According to Stephen Joyce on Wednesday, March 3, 2010:

"Young drivers make up 14.5 percent of New Zealand's population and 16 percent of all licensed drivers, but in 2008 they were involved in around 37 percent of all serious injury crashes."

The latest annual road crash statistics to the end of January 2010 show that young drivers were involved in 32 percent of all injury crashes.

From the same release:

"Across the Tasman, for example, young Australians have a road fatality rate of 13 per 100,000 of population, while young New Zealanders have a fatality rate of 21 per 100,000 of population."

In NZ, 47 drivers aged 15-24 were killed in the most recent year.

According to the Household Labour Force Survey (HLFS) estimated working-age population at December 2009, there were 627,300 people aged 15-24.

That means the fatality rate for young drivers was 13.3 per 100,000. The fatality rate for all 15-24 year-old drivers, passengers, pedestrians, cyclists and motorcyclists is 17.7 per 100,000.

If he is going to use annual statistics, why not the most recent? Because they are not as bad?


Eric Crampton said...

The comparison of youth accident rates on the two sides of the Tasman is utterly useless without a baseline comparison. The roads here are more dangerous than the ones there. What we really need is a ratio OF ratios: get the NZ-to-Australia fatality rate ratio for, say, 40-year-olds, then compare that ratio to the youth ratio.
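As a rough sketch of that ratio of ratios in Python: the youth rates are the 21 and 13 per 100,000 quoted in the release, while the 40-year-old baseline rates are made-up placeholders (no real ones have been supplied here), as are the function and variable names.

```python
def ratio_of_ratios(nz_youth, au_youth, nz_baseline, au_baseline):
    """Return (youth_ratio, baseline_ratio, ratio_of_ratios) for NZ vs Australia."""
    youth_ratio = nz_youth / au_youth            # how much worse NZ youth fare
    baseline_ratio = nz_baseline / au_baseline   # how much worse NZ fares generally
    return youth_ratio, baseline_ratio, youth_ratio / baseline_ratio


# Youth fatality rates per 100,000, as quoted in the release.
NZ_YOUTH, AU_YOUTH = 21.0, 13.0

# HYPOTHETICAL rates per 100,000 for 40-year-olds; placeholders only,
# to be replaced with real figures before drawing any conclusion.
NZ_BASE, AU_BASE = 9.0, 6.0

youth, base, ror = ratio_of_ratios(NZ_YOUTH, AU_YOUTH, NZ_BASE, AU_BASE)
print(f"youth ratio:     {youth:.2f}")
print(f"baseline ratio:  {base:.2f}")
print(f"ratio of ratios: {ror:.2f}")
```

A result near 1 would suggest the trans-Tasman gap is about the roads in general; a result well above 1 would point to something specific to young drivers.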

Anonymous said...

"If he is going to use annual statistics, why not the most recent? Because they are not as bad?"

Or it could be that the summary eventually published was the tip of a larger policy document, which might have considered other factors. That is particularly likely with something like road injuries, where other information such as hospital data and driver licensing stats would all have to be collected, collated, analysed and then written up in one coherent document. Every dataset needs to be purchased from the relevant data supplier, each of which has its own reporting period and its own systems to ensure data outputs conform to the purchase request while preserving individual confidentiality.

Analysis needs to compare like with like, and that takes time, so I'm not overly surprised that they used year-end annual statistics. If they started the work halfway through 2009, it's not unreasonable to use calendar-year stats (for one thing, running by year means you only have to filter by a year-date function, rather than month-year, when dealing with multi-million-observation datasets).
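To illustrate the difference between the two filters, here is a minimal Python sketch. The crash records are invented for the example, and the helper names are my own, not anything taken from the datasets involved.

```python
from datetime import date

# HYPOTHETICAL crash records as (date, severity); not real data.
crashes = [
    (date(2008, 3, 14), "serious"),
    (date(2008, 11, 2), "minor"),
    (date(2009, 1, 20), "serious"),
    (date(2009, 6, 5), "serious"),
    (date(2010, 1, 31), "minor"),
    (date(2010, 2, 1), "serious"),
]


def in_calendar_year(d, year):
    """Calendar-year filter: a single comparison on the year."""
    return d.year == year


def in_year_ending(d, end_year, end_month):
    """True if d falls in the 12 months ending with end_month of end_year."""
    # Convert year and month to a running month count so the window
    # becomes a simple integer range.
    month_index = d.year * 12 + d.month
    end_index = end_year * 12 + end_month
    return end_index - 11 <= month_index <= end_index


calendar_2008 = [c for c in crashes if in_calendar_year(c[0], 2008)]
year_to_jan_2010 = [c for c in crashes if in_year_ending(c[0], 2010, 1)]

print(len(calendar_2008), "crashes in calendar year 2008")
print(len(year_to_jan_2010), "crashes in the year to January 2010")
```

Neither filter is hard to write, but the rolling window needs both year and month on every record, and every dataset being cross-referenced has to be cut to the same window.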

I'm no National voter (so I won't discount the possibility that he made a snap decision last week and looked for stats to support his case), but cross-referencing half a dozen datasets to ensure that the gold statistic hasn't been confounded in some way takes time.

Just because a Google search can find a more recent statistic doesn't mean they're hiding something - it might mean they did a bit more work than writing a press release.