There is an urban legend of mathematical modelling of soccer matches. It is the legend of the mathematical genius, the Einstein of gambling, who has worked out the formula for beating the bookmakers and winning money. If only, the legend goes, you can find the tips that this person can provide, the source of the magic equation, you can become rich beyond your wildest dreams.
After I published the book Soccermatics last year, a few people seemed to believe I might hold the magical equation. I would get messages on Twitter and emails to my work address asking me if I could help them with tips and advice. I was a professor of mathematics who had studied soccer, so maybe I knew the secret?
A simple way to find value in the betting market
In one section of the book, I did manage to beat the bookies. But it wasn’t because I found a magical formula that predicts who will win soccer matches.
The way I did it was much simpler. I looked at the odds and found a very small but significant bias in how they were set. Bookmakers and bettors hadn’t paid enough attention to predicting the draw in soccer.
Maybe it is because of the popularity of the Over/Under markets. Maybe it is because bettors don’t like betting on a draw. But, whatever the explanation, it turned out that draws in the Premier League were not properly priced.
Below is a plot of the real frequency of draws in four seasons of the Premier League (2011/12, 2012/13, 2013/14, 2014/15) and the prediction of draws implied by the bookmakers’ odds.
This figure is created by taking the odds provided by four leading bookmakers (including Pinnacle), converting the odds to implied probabilities and then looking at the difference between the implied probability of a home win and that of an away win.
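For readers who want to try the conversion themselves, here is a minimal sketch in Python. It assumes decimal odds and removes the bookmaker's margin by simple proportional scaling; the article does not say how the margin was handled, so that step and the function names are illustrative assumptions rather than the method used in the book.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds to implied probabilities that sum to 1.

    Assumption: the bookmaker's margin is removed by proportional scaling.
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    overround = sum(raw)  # exceeds 1 by the bookmaker's margin
    return tuple(p / overround for p in raw)


def home_away_gap(home_odds, draw_odds, away_odds):
    """Implied probability of a home win minus that of an away win."""
    p_home, _, p_away = implied_probabilities(home_odds, draw_odds, away_odds)
    return p_home - p_away
```

A gap close to zero marks a well-matched fixture, while a large positive or negative gap marks a strong favourite.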
It turns out that when two well-matched teams meet (i.e. the probability of a home win is only slightly bigger than the probability of an away win) then draws are under-priced: they happen more often than the odds imply (circles above red line). When matches are skewed so there is a strong favourite (i.e. the probability of one team winning is much larger than that of the other) then draws are over-priced (circles below red line).
Want it made simpler? If two teams are about as good as each other then the draw could be a value bet. If one team is much stronger than the other, don’t bet on the draw (betting on the favourite is normally the smartest move in this case).
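The comparison described above can be sketched as a simple calibration check: group matches by the gap between the home and away probabilities, then compare the average implied draw probability in each group with how often draws actually occurred. This is a rough sketch, not the analysis from the book. The input format, the bin width of 0.1 and the function name are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def draw_calibration(matches, bin_width=0.1):
    """Compare implied and observed draw rates, binned by home-away gap.

    matches: iterable of (p_home, p_draw, p_away, was_draw) tuples, where
    the probabilities are implied by the odds (hypothetical input format).
    Returns {bin_start: (mean implied draw prob, observed draw rate, n)}.
    """
    bins = defaultdict(list)
    for p_home, p_draw, p_away, was_draw in matches:
        key = math.floor((p_home - p_away) / bin_width)
        bins[key].append((p_draw, 1.0 if was_draw else 0.0))

    result = {}
    for key in sorted(bins):
        rows = bins[key]
        n = len(rows)
        implied = sum(r[0] for r in rows) / n
        observed = sum(r[1] for r in rows) / n
        result[round(key * bin_width, 10)] = (implied, observed, n)
    return result
```

A bin where the observed rate sits clearly above the implied probability is one where the draw may be a value bet. With only a few matches per bin the difference can easily be noise, so each bin needs a reasonable sample before drawing any conclusion.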
Testing out the theory of under-priced draws
That was what I found by plotting the odds. I then took that observation and made some money from it. Below are profits for this model for the 2015/16 season.
I tripled my money over the season. Well, actually I didn’t bet throughout the season. But I had doubled my money by Christmas.
Soccermatics came out in May 2016, just as the Premier League was coming to a close. I monitored how it went for my model the season after. Here is the result.
Not so good. There was a small profit to be made in the first few weeks, but then it flatlined for the rest of the season. Not losing money is a small achievement in itself when the odds are in the bookmaker’s favour, but obviously making money is the objective for most bettors.
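Profit curves like the ones discussed in this section can be built by adding up the result of each bet in order. Here is a minimal sketch, assuming decimal odds and a flat stake of one unit per bet. The staking plan behind the results above is not stated in this article, so flat staking is an assumption, as is the function name.

```python
def cumulative_profit(bets, stake=1.0):
    """Running profit for a sequence of bets placed in chronological order.

    bets: iterable of (decimal_odds, won) pairs (hypothetical input format).
    Assumption: the same flat stake is placed on every bet.
    """
    total = 0.0
    curve = []
    for odds, won in bets:
        # A winning bet returns stake * odds, so the profit is stake * (odds - 1).
        total += stake * (odds - 1.0) if won else -stake
        curve.append(total)
    return curve
```

A curve that rises steadily suggests an edge, while one that flatlines, as in the second season, suggests the edge has gone.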
Lessons learnt from using my model
There are four lessons to be learnt from my model.
Firstly, I didn’t make money by creating a magic formula. Although I did write down a single equation that I then used to decide my bets (it is footnote 17 for chapter 12 in the book if you don’t want to read the rest of it), this equation came from an analysis of the odds.
The basis of my model was far from complicated. It didn’t come from me working out the strength of the teams based on past performance, advanced metrics, expected goals or anything else. It came from a small error in how the odds were being set.
Secondly, I wasn’t just lucky. The original model was consistent with the previous four years of bookmakers’ odds. I downloaded my odds from Oddsportal and then double-checked my model against those on football-data.co.uk. I then made a prediction and applied it to the next year and it continued to work.
There is a lot of randomness in betting and it is possible to win for quite a long period of time with luck alone. But this was a long-term trend that was profitable.
Thirdly, nothing lasts forever. In moments of self-aggrandisement I like to think that my book led to a market correction. Maybe the traders at Pinnacle and other bookmakers read my book and thought “we’ve been pricing draws wrong. See those odds for Liverpool at home against Manchester United at the weekend... move the draw odds up by 0.1.” That’s all it takes and my small margin disappears.
This is just one explanation, though. Another is that managers realised that in those big matches between equally good teams they should go for the three points (this is also something I look at in the book). There are other explanations too. The fact is, I will never know for sure, but the odds bias I found has gone.
My fourth and final conclusion is: I am a total idiot. I spent three months developing a betting model. I found a way to win. But instead of placing all my free capital on the model, I published a book with the secret in it, only to see the profits disappear.
Yes, I got paid for writing the book. Yes, I have enjoyed talking about soccer and engaging in the analytics community, but the money would have been nice too.
There is no secret equation for predicting the outcome of soccer matches. Not an equation that ignores the odds, in any case. If you want to create your own model of sporting outcomes you need to use the odds as the starting point.
Wisdom of the crowd tells us that the betting market can be hard to beat, but sometimes it makes a few small mistakes. It is these you have to look for.
In part two of this article I will see if I can find one of those cracks using a combination of an expected goals model and potential biases in recent odds.