


Binocular prepped for the Champion Hurdle at Sandown in February 2010 and may do so again this year
Our resident blogger Simon Rowlands points out the shortcomings of some types of conventional trends analysis...
"The fact is that “trends” often seem to exist when they are looked at crudely and weaken or disappear altogether when the data is looked at in a more sophisticated manner. This is a point well worth bearing in mind at this time of year more than most."
Mark Twain is most often cited as the source of the phrase "...there are lies, damned lies, and statistics..." Whether or not it was he who first said it, it is a good job that the author of Huckleberry Finn and Tom Sawyer did not then cast his sceptical eye over racing, else he might have added "...and, finally, there are trends."
It is not that all trends are worthless, any more than that all statistics are worthless, just that rather a lot of them are. Their value depends not only on the information that is used but on how that information is interrogated and the conclusions that are drawn from it. Trends frequently fall down in this area.
Views about trends are never so much in evidence as in the lead-up to the Cheltenham Festival. There is some justification in this. No race meeting in Britain - and possibly in the world - exerts the kind of pull that the Festival in March does. Horses are targeted at it from months, and sometimes years, in advance. Only good horses win there - well, almost only - and any systemic biases are likely to be crucial.
With that in mind, I thought I would do a quick reprise of some of the possible pitfalls of so-called trends analysis with six "golden" rules. My apologies to anyone who has read something similar from me before, but bad trends analysis seemingly won't go away.
1. Beware of small samples. If you look at most data on a microscopic basis you could easily convince yourself that a trend exists. I backed three winners out of four bets yesterday, but it would be wrong (VERY wrong!) to take that as representative of my betting as a whole.
2. Beware of big samples. The way to offset small samples is not to go so far back into history that the information you are using has little or no relevance to what is about to happen now. Trends, if they exist at all, tend to change over time. Keep things as contemporary as seems suitable.
3. Do not consider only winners. This is a common mistake. While winners are what most people get paid out on, a trend (if it exists) should potentially apply to all horses under consideration. Winners represent only a small fraction of the runners in races.
4. Take into account not just whether a horse wins or loses, but the degree to which it wins or loses. Should a non-winner that was beaten a short head be accorded the same significance as a non-winner that was tailed off? No.
5. Compare realisation with expectation. Horses of a certain age group might have won a given race 80% of the time, but that is neither here nor there if 80% of the time is exactly what you could expect from that age group's representation.
6. Do not apply filtering techniques unless they are strictly justified. The flawed reasoning is that you should rule horses out successively according to whether they "pass" certain criteria until only one or a few "qualifiers" are left. As an example of the folly of this, one criterion could be that a horse is rated a certain figure or higher, another could be that it had been born in a certain month. It is very unlikely that the criteria are of equal worth, and yet crude filtering implies that they are.
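For anyone who wants to put rule 5 into practice, here is a minimal sketch in Python of one way to compare realisation with expectation: divide a group's share of the winners by its share of the runners. The function name and all of the figures are my own illustrations, not real race data, and a ratio of this kind still needs treating with caution when the sample is small (rule 1).

```python
def impact_ratio(group_wins, total_wins, group_runners, total_runners):
    """Share of winners divided by share of runners for one group.

    A value near 1.0 means the group won about as often as its
    representation alone would suggest; well above 1.0 means it
    outperformed, well below 1.0 means it underperformed.
    """
    if total_wins <= 0 or total_runners <= 0 or group_runners <= 0:
        raise ValueError("counts must be positive")
    share_of_winners = group_wins / total_wins
    share_of_runners = group_runners / total_runners
    return share_of_winners / share_of_runners


# Hypothetical figures only: a group that won 8 of 10 runnings
# while supplying 80 of the 100 runners did no better than expected.
no_edge = impact_ratio(group_wins=8, total_wins=10,
                       group_runners=80, total_runners=100)

# The same 8 wins from only 50 of the 100 runners would be a
# different matter.
some_edge = impact_ratio(group_wins=8, total_wins=10,
                         group_runners=50, total_runners=100)

print(round(no_edge, 2), round(some_edge, 2))
```

The first case is the "80% of the time" example from rule 5: the headline figure sounds impressive, but the ratio shows it is exactly what the group's numbers would lead you to expect.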
I was reminded of the pitfalls of conventional trends analysis by a recent internet discussion as to the worth of a horse prepping for the Champion Hurdle in a given month, a discussion that seems more significant than usual given the number of horses that have already been put away for the Festival.
Seven of the nine winners of the Champion Hurdle since 2000 had their previous race in February. Then again, more horses - both winners and losers - had run most recently in February than in any other month. However, that is still a strike-rate roughly twice what could be expected.
Sixteen of the twenty-seven placed horses in the Champion Hurdle in the same period had prepped in February. That is roughly one and a half times what could be expected from that month's representation among the runners.
However, if you consider all horses over the period - and the degree to which losers were beaten, not simply the fact that they were beaten - things become less cut and dried still. In terms of percentage of rivals beaten, which is usually a better way of looking at such matters, horses running in the Champion Hurdle that had prepped in February came out top again, but only just, and not to a degree that should be a cause for excitement or the basis of a betting strategy.
The figures were: February 52.8%; December 51.8%; January 50.1%; and other months well behind. In other words, there is little between those three months, which together provided over 90% of all runners.
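For readers unfamiliar with the measure, percentage of rivals beaten can be sketched in a few lines of Python. This is one common way of calculating it and an assumption on my part about the details: a horse finishing in position p in a field of n is credited with (n - p) / (n - 1) of its rivals, and the figure is then averaged across every runner in a group. The function names and the sample runs below are invented for illustration and are not the Champion Hurdle data.

```python
from collections import defaultdict


def percent_rivals_beaten(position, field_size):
    """Percentage of rivals beaten: 100 for the winner, 0 for last."""
    if field_size < 2 or not 1 <= position <= field_size:
        raise ValueError("need at least 2 runners and a valid position")
    return 100.0 * (field_size - position) / (field_size - 1)


def average_by_group(runs):
    """Average percentage of rivals beaten for each group.

    `runs` is an iterable of (group, position, field_size) tuples,
    one per runner, winners and losers alike.
    """
    scores = defaultdict(list)
    for group, position, field_size in runs:
        scores[group].append(percent_rivals_beaten(position, field_size))
    return {group: sum(vals) / len(vals) for group, vals in scores.items()}


# Hypothetical runs: (month of prep race, finishing position, field size)
sample_runs = [
    ("February", 1, 11),
    ("February", 6, 11),
    ("December", 2, 11),
    ("December", 11, 11),
]

print(average_by_group(sample_runs))
```

Because every runner contributes, a horse beaten a short head in second counts for far more than one that was tailed off, which is the point of rules 3 and 4. How non-finishers are scored is a further choice this sketch leaves open.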
The fact is that "trends" often seem to exist when they are looked at crudely and weaken or disappear altogether when the data is looked at in a more sophisticated manner.
This is a point well worth bearing in mind at this time of year more than most.