The ThoroughMetrics Blog: June 2008

Sunday, June 29, 2008

Sample Size

Part of my purpose here at the ThoroughMetrics Blog is to help you evaluate the available research on the thoroughbred industry. As I've said before, the majority of it is flawed in one way or another, and insufficient sample size is one of the most frequent offenders.

One blog that I read fairly regularly is Scot Gillies' Five Cross Files at The Bloodhorse. Scot has a lot of interesting thoughts on the breeding industry, and he's also been very generous with his time, providing me with really detailed feedback on some of the research I'm working on. That said, in his latest post, Scot falls prey to the temptation of basing conclusions on ridiculously small sample sizes. The article is about undiscovered 'breed to race' sires. He does warn the reader that some of his selections are lightly bred, and that he's tried to keep the small number of offspring in mind when ranking them. However, he goes on to include Raj Waki and Intidab in his top six. Raj Waki has only 29 starters, but "over 10% stakes winners". Intidab has 27 starters, but "tops 11% black type winners". The problem here is that "over 10%" of 29 is 3, and "topping 11%" of 27 is also 3. A sample of 3 stakes winners is not good news, and there is almost certainly no predictive value to the statistics these sires have compiled. We simply have no idea if they're any good or not. When you're thinking about whether a small sample size is meaningful, there's a good 'mental test' you can use. If you removed 1 or 2 from the total, would the result still be impressive? If I remove 2 from 3...well, we wouldn't be talking about Raj Waki with 1 stakes winner out of 29 starters. When you have a small sample size, the results have to be mind-blowing to have any predictive value.
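For anyone who wants to put rough numbers on the 'mental test', here is a minimal Python sketch. It is not how Scot or anyone else ranked these sires. The Wilson score interval is simply one common way to show how wide the plausible range is for a small sample, and the function names are made up for illustration. The only inputs taken from the post are the 3 stakes winners from 29 and 27 starters.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

def rate_after_removing(successes, n, removed):
    """The 'mental test': the strike rate if `removed` of the successes had not happened."""
    return max(successes - removed, 0) / n

# Stakes winners and starters as quoted in the post.
sires = {"Raj Waki": (3, 29), "Intidab": (3, 27)}

for name, (winners, starters) in sires.items():
    low, high = wilson_interval(winners, starters)
    print(f"{name}: {winners}/{starters} = {winners / starters:.1%}, "
          f"interval {low:.1%} to {high:.1%}, "
          f"minus 2 winners = {rate_after_removing(winners, starters, 2):.1%}")
```

The interval for each sire is very wide, which is another way of saying that three winners from fewer than 30 starters can't separate a good sire from an ordinary one.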

In fairness, drawing conclusions based on insufficient sample sizes can be very, very tempting. When I read Scot's article, my first thought was "he should have included Repent instead of one of those others". I made the exact same error as Scot! Repent has fewer than 100 offspring so far and just a handful of stakes winners. It's too soon to really tell if he's going to be a star at stud.

Friday, June 27, 2008

More About Blood-Ex

Along with horse racing and financial markets, I’m also an obsessive fantasy baseball player. Last year, I participated (and finished 8th out of 4,000+ people) in a new game that used a stock market type trading mechanism. Because it was the first year the game had been run, nobody knew exactly how the ‘market’ would function. I learned two lessons there that I think apply to Blood-Ex as well:

1. There’s no way to know exactly what will happen once trading starts, or what strategies will be successful.
2. Whatever happens, you’ll be in a better position to succeed if you’ve already given some thought to the possibilities prior to the start of trading.

I think the second point is especially true, because ‘immature’ or new markets are often the least efficient, and provide the greatest opportunities for profit. Once Blood-Ex has thousands of participants, including some with large sums of money available to them, the inefficiencies in the market will begin to disappear, and opportunities will be harder to find.

In the meantime, here are four things to think about as you prepare for the start of trading:

1. The 5% commission on sales is huge. It’s very possible that it will turn all short term trading strategies into losers. It should be less of an issue for those who plan to buy and hold shares. Also, it appears that you will not have to pay the commission when horses you own shares of are sold at auction. So if you’re holding shares of a valuable horse, it will generally be better to simply hang on to them until they’re sold than to sell them on Blood-Ex.

2. Most male horses end their careers worthless as bloodstock. While trading in fillies and mares may have some resemblance to stock trading, with relatively modest price gains and losses, trading in colts and stallions is going to look like options trading…a few big winners, and a lot of 100% losses. Because of that, any long-term ‘portfolio’ of shares should hold very small positions in colts, and diversify across many of them. Because of the lower volatility, it should be ok to hold fewer, more concentrated positions of shares in fillies and mares.

3. Pay attention to the spread, or the difference between the best bid (offer to buy) and the best ask (offer to sell). Generally speaking, the spread will act as an additional transaction cost. If you want to make sure you get shares in a horse, you’ll have to make your offer to buy at a price equal to the best ask. But if you try to sell those same shares, even if the price hasn’t moved at all, you’ll need to sell them at the level of the best bid to guarantee yourself a buyer. So you’ll lose money on the round trip. That said, if you’re patient there are situations where the spread can work to your advantage. Instead of placing your buy orders at the best ask, place them just above the best bid, and wait to see if someone is willing to lower the asking price of their shares. And instead of placing your sell orders at the best bid, place them just below the best ask, and wait to see if an eager buyer is willing to pay what you’re asking. Until trading starts we really have no idea how wide the spreads on Blood-Ex are going to be. Most likely they’re going to be relatively narrow on well-known horses which are actively traded. But for more obscure horses, spreads could be 15-20% or more. If that’s the case, there’s an opportunity to overcome the impact of the commission simply by placing buy orders on horses at a price just above the best bid and placing sell orders on those you acquire just below the best ask. However, if you try this, you should probably not leave your orders open during the horse’s races, when the price (and perceived value) of the horse is likely to move sharply in one direction or the other.
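To see how the commission and the spread interact, here is a small Python sketch of the round trip described in points 1 and 3. The prices are hypothetical, and I'm assuming the 5% commission is taken out of the seller's proceeds. The actual Blood-Ex fee mechanics may differ.

```python
def round_trip_return(buy_price, sell_price, commission=0.05):
    """Net return from buying at buy_price and later selling at sell_price.

    Assumption: the commission is deducted from the seller's proceeds.
    """
    proceeds = sell_price * (1 - commission)
    return proceeds / buy_price - 1

# Hypothetical quote on an obscure horse: a 15% spread.
best_bid, best_ask = 100.0, 115.0

# Impatient trader: buys at the ask, sells at the bid.
impatient = round_trip_return(best_ask, best_bid)

# Patient trader: buys just above the bid, sells just below the ask,
# and is lucky enough to have both orders filled.
patient = round_trip_return(best_bid + 1, best_ask - 1)

print(f"Impatient round trip: {impatient:.1%}")
print(f"Patient round trip:   {patient:.1%}")
```

The patient result only holds if both orders actually get filled, which is exactly the risk of leaving orders open while the horse is racing.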

4. Pay attention to how responsive the market is to news. Initially there may be relatively few participants, and prices may react slowly to news. This is likely to be most true for horses that aren’t stars. If you follow the news closely, you may be one of the first to learn of an injury or a particularly fast workout. Once the market has more participants, it’s likely that price moves in reaction to news will be so fast that there isn’t an opportunity for most to profit from it. One aspect of this that should be particularly interesting to see is how prices will move during the running of a race. As far as I can tell, trading in horses will not be frozen during their races, although it wouldn’t surprise me if this is done to avoid wild price swings.

Wednesday, June 25, 2008

Blood-Ex

I'll be writing a LOT more about this over the next few months, but wanted to post something to let everyone know about it right away. The world's first online bloodstock trading exchange is close to its official launch. Blood-Ex will start with UK horses being listed, but plans to expand to the US and other countries later this year. Basically owners will put up shares in a horse's breeding rights for sale on the exchange, and traders will be able to buy and sell those shares. Each share is equal to 1/150th of 1% of the horse's breeding rights. It comes with no obligations to pay for expenses, and no rights to any share of the horse's earnings on the track. Owners who post shares to the exchange sign an agreement that they will eventually put the horse up for public auction in order to set a final value for the shares of the horse. For someone like myself who is fascinated both by horse racing and financial markets, this is exciting stuff, and I'll certainly be an active participant, and will post my thoughts on Blood-Ex here on an ongoing basis.

Tuesday, June 17, 2008

ABE Examples

I thought it would be interesting to look at some examples of ABEs for auction buyers. These are from the 1999 Fasig-Tipton Saratoga Selected Yearlings Sale. I only included the four buyers listed in the data who bought at least five horses at the sale. Several of these were acting as agents on behalf of other buyers. Keep in mind that the sample sizes here are microscopic. To draw meaningful conclusions from the data, you’d need to compile data across multiple sales. And even then, ABE suffers from the biases and problems I mentioned in the previous post.

Baden P. Chase - Horses: 6, Total Cost: $1,190,000, Earnings: $279,759, ABE = 0.24
Gatsas Thoroughbreds LLC - Horses: 5, Total Cost: $282,000, Earnings: $364,099, ABE = 1.29
John C. Oxley - Horses: 5, Total Cost: $1,780,000, Earnings: $533,010, ABE = 0.30
Todd A. Pletcher - Horses: 7, Total Cost: $850,000, Earnings: $874,726, ABE = 1.03
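The ABE figures above are just total earnings divided by total cost. A minimal Python sketch that reproduces them from the listed numbers follows. The dictionary layout and the pooled figure for all four buyers together are my own additions for illustration.

```python
def abe(total_cost, total_earnings):
    """Auction Buyer Effectiveness: total earnings divided by total amount spent."""
    return total_earnings / total_cost

# (horses bought, total cost, total earnings), as listed above.
buyers = {
    "Baden P. Chase": (6, 1_190_000, 279_759),
    "Gatsas Thoroughbreds LLC": (5, 282_000, 364_099),
    "John C. Oxley": (5, 1_780_000, 533_010),
    "Todd A. Pletcher": (7, 850_000, 874_726),
}

for name, (horses, cost, earnings) in buyers.items():
    print(f"{name}: {horses} horses, avg price ${cost / horses:,.0f}, "
          f"ABE = {abe(cost, earnings):.2f}")

# Pooled ABE across all four buyers (sum of earnings over sum of costs).
pooled_cost = sum(cost for _, cost, _ in buyers.values())
pooled_earnings = sum(earnings for _, _, earnings in buyers.values())
pooled_abe = abe(pooled_cost, pooled_earnings)
print(f"Pooled ABE = {pooled_abe:.2f}")
```

Printing the average price next to each ABE makes it easier to spot whether the higher ratios line up with the cheaper horses, though four buyers is far too few to say anything firm.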

At some point I'd like to do a study across multiple sales looking at this data, and determining whether ABE is a 'repeatable skill' with predictive value...in other words, whether successful buyers are just getting lucky or not.

Tuesday, June 10, 2008

Auction Buyer Effectiveness

To the best of my knowledge, nobody has ever tried to measure the effectiveness of buyers at thoroughbred auctions and to compare the success of different buyers. Until now, that is. I'll be writing about some of my thoughts on how best to measure this over the next few weeks. I also plan to study whether buyer success is consistent over extended time periods. I'll start by introducing a very crude measure of buyer success. I'll call it 'Auction Buyer Effectiveness' (ABE). To calculate a buyer's ABE, simply add up the total earnings of the horses purchased, and divide by the total amount spent. As I said, this is a VERY crude measure, and it remains to be seen whether it provides any really valuable information. I believe we can come up with some better measures that address some of ABE's shortcomings, but for now will simply list a few of the problems with ABE:

1. Ignores variability in total expenses caused by different lengths of career.
2. Ignores impact of expenses of buying younger horses vs. older horses.
3. Like any measure based on earnings, may be skewed by state bred races.
4. Like any measure that uses a total or mean (rather than median) may be skewed by a single outstanding success.
5. Likely to be systematically higher for lower priced horses than higher priced horses.

If I have time, I'd like to take a look at whether even a crude measure like ABE shows correlation from one year to another within various price tiers.

Sunday, June 8, 2008

Pinhooking

In the course of doing research, I find myself putting together data from a number of sources, since the 'keepers of the data' for the industry (BRIS and Equineline) choose not to make the raw data in their databases available at any price. Along the way, I'm finding a lot of data that's interesting without being valuable enough to justify asking people to pay for it.

I did a study on horses sold at yearling auctions compared to those sold at two year old auctions. Because I selected data from sales in consecutive years, there are actually some horses that were sold as yearlings in 1999, and then sold again as two year olds in 2000. I thought it might be interesting to look at those:

Capeless, 1999: $140,000, 2000: $50,000, Earned: $35,750
Cat Tracks, 1999: $150,000, 2000: $250,000, Earned: $85,262
De Rose Colony, 1999: $100,000, 2000: $250,000, Earned: $58,200
Fax and Go, 1999: $40,000, 2000: $675,000, Earned: $0
Fistfite, 1999: $150,000, 2000: $250,000, Earned: $158,183
Illusionary, 1999: $300,000, 2000: $360,000, Earned: $176,274
Lady Katie, 1999: $65,000, 2000: $30,000, Earned: $152,728
Lady Victoriate, 1999: $130,000, 2000: $90,000, Earned: $48,405
Let's Behave, 1999: $110,000, 2000: $285,000, Earned: $294,029
Perfect Stranger, 1999: $90,000, 2000: $150,000, Earned: $159,690
Red Carpet, 1999: $375,000, 2000: $825,000, Earned: $37,760
Songandaprayer, 1999: $470,000, 2000: $1,000,000, Earned: $380,480

What can we learn from this? Probably not that much. The sample size is just too small, and even with a larger group it's not clear what we could conclude, since many horses bought with the intention of being resold may not have been successfully sold. The two things that stand out here are that the original buyers of Fax and Go did an unbelievable job reselling a horse who ultimately never earned a penny on the track, and that while Songandaprayer was resold for a huge profit, his later buyers obviously got their money's worth if they held him until his stud career took off.
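If you'd like to poke at the list yourself, here is a short Python sketch that tallies the gross resale results. It uses only the prices and earnings listed above and ignores every expense in between (training, sales commissions, and so on), so the figures are gross, not profit. The variable names are mine.

```python
# (name, 1999 yearling price, 2000 two-year-old price, racetrack earnings)
horses = [
    ("Capeless", 140_000, 50_000, 35_750),
    ("Cat Tracks", 150_000, 250_000, 85_262),
    ("De Rose Colony", 100_000, 250_000, 58_200),
    ("Fax and Go", 40_000, 675_000, 0),
    ("Fistfite", 150_000, 250_000, 158_183),
    ("Illusionary", 300_000, 360_000, 176_274),
    ("Lady Katie", 65_000, 30_000, 152_728),
    ("Lady Victoriate", 130_000, 90_000, 48_405),
    ("Let's Behave", 110_000, 285_000, 294_029),
    ("Perfect Stranger", 90_000, 150_000, 159_690),
    ("Red Carpet", 375_000, 825_000, 37_760),
    ("Songandaprayer", 470_000, 1_000_000, 380_480),
]

# Horses resold for more than their yearling price.
resold_higher = [name for name, p1999, p2000, _ in horses if p2000 > p1999]

# Horses whose track earnings exceeded what the second buyer paid.
earned_back = [name for name, _, p2000, earned in horses if earned > p2000]

total_1999 = sum(p1999 for _, p1999, _, _ in horses)
total_2000 = sum(p2000 for _, _, p2000, _ in horses)

print(f"Resold higher: {len(resold_higher)} of {len(horses)}")
print(f"Earned back the 2000 price: {', '.join(earned_back)}")
print(f"Gross resale multiple for the group: {total_2000 / total_1999:.2f}")

# Largest individual resale multiple.
best = max(horses, key=lambda h: h[2] / h[1])
print(f"Biggest multiple: {best[0]} at {best[2] / best[1]:.1f}x")
```

Keep in mind the caveat above still applies: horses that were bought to resell but never made it back to the sales ring don't appear in this list at all.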

Monday, June 2, 2008

Survivorship Bias

One of the things that I'd like to do with this blog is help create more 'educated consumers' of the vast amount of information that's available on the thoroughbred industry. An awful lot of that information is presented to bolster arguments that may or may not be supported by fact. Sometimes the information is intended to mislead, but more often the authors are simply unaware of some of the easier statistical errors to make when forming theories.

One of the easiest mistakes to make is the error of 'survivorship bias'. This is a well known phenomenon in the financial industry, where it can distort the appearance of past performance of indexes or funds. An example would be if a new ‘index’ is created which tracks the prices of a number of the largest companies in a specific industry. To show how the index would have performed historically, the historical price of the index is reported, going back many years. Invariably, these calculations show unrealistically strong performance. The problem is that by using the current largest firms, any firms that went out of business, or simply did poorly enough to shrink substantially were left out of the index. So the index by definition includes the firms that have done the best in the past. This doesn’t give any indication of how it might do in the future, or how it would have done if the largest firms from some time in the past had been used to create the index.
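Here is a toy illustration of the index example in Python. The six firms and their values are entirely made up. The only point is to show how picking the members by today's size inflates the back-tested return.

```python
# Hypothetical firms: (name, value ten years ago, value today).
# A value of 0 today means the firm went out of business.
firms = [
    ("A", 100, 300),
    ("B", 100, 150),
    ("C", 100, 40),
    ("D", 100, 0),
    ("E", 100, 250),
    ("F", 100, 10),
]

def total_return(group):
    """Return of a basket that held every firm in `group` from start to today."""
    start = sum(then for _, then, _ in group)
    end = sum(now for _, _, now in group)
    return end / start - 1

# Biased back-test: build the 'index' from the three largest firms TODAY.
largest_today = sorted(firms, key=lambda f: f[2], reverse=True)[:3]
biased = total_return(largest_today)

# Honest back-test: use every firm that existed at the start.
unbiased = total_return(firms)

print(f"Index of today's largest firms: {biased:.0%}")
print(f"All firms that existed at the start: {unbiased:.0%}")
```

The failed and shrunken firms drag the honest number down, and they are exactly the ones the back-tested index never sees.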

So what does this have to do with the thoroughbred industry, and how would it impact research on racing or pedigree? Here’s an example, where I almost designed a study with the same flaw. I’ve mentioned before that I want to study what factors might predict potential future ‘breakout performance’ for claiming horses. I have access to a database of several thousand races of past performance data, and was thinking I could use that data for the study. I’d look at former claimers who had allowance or stakes wins, and identify some patterns in their history, and then use the past performance data to test the performance of all horses that exhibited the same patterns. The problem is that by using past performance data to test the patterns, I’d be automatically excluding all the horses that had the same pattern, but then didn’t make it back to the races, and I’d be reducing the impact on the overall data sample of horses that weren’t good enough to run often after exhibiting the pattern. The bias this would introduce to the data would have made my findings almost useless. It’s subtle problems like this in most existing research that lead me to believe that there’s a need for better research in the industry.