Using Pythagorean expectation for soccer betting

Pythagorean expectation is predominantly used for betting on US sports such as baseball and basketball, but can it be used for betting on soccer? Analytics expert Mark Taylor explains why this could be a potentially profitable strategy when betting on long-term markets.

One of the most famous equations in mathematics and geometry is Pythagoras’ theorem, which relates the lengths of the three sides of a right-angled triangle.

Pythagorean expectation explained

Over two thousand years later, renowned baseball analyst Bill James reworked the equation as the basis for his Pythagorean expectation, which attempts to estimate a team’s likely true winning percentage from the points or runs they score and allow, rather than from their actual winning percentage alone.

Win % = (points or runs scored ^X) / (points or runs scored ^X + points or runs allowed ^X).
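As a minimal sketch, the formula above can be written as a short Python function. The function name, the argument names and the sample totals are illustrative assumptions rather than anything taken from James' work.

```python
def pythagorean_expectation(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Return the expected win fraction (0 to 1) from scoring totals."""
    if scored < 0 or allowed < 0:
        raise ValueError("scoring totals must be non-negative")
    s = scored ** exponent
    a = allowed ** exponent
    if s + a == 0:
        return 0.5  # no scoring data at all: assume an even record
    return s / (s + a)


# Hypothetical team: 800 runs scored, 700 allowed, exponent X = 2.
print(round(pythagorean_expectation(800, 700), 3))  # 0.566
```

The exponent is left as a parameter because, as discussed later, the best-fitting value differs from sport to sport.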

Scoring events are more numerous than wins and better explain a side’s true abilities, since scores might not always arrive at the most advantageous time in a single match.

A team may score when they are already comfortably ahead or concede when they lead by a narrow margin, and this distribution of when and where goals are scored or conceded may unsustainably inflate or depress their winning record, and in turn their league position, over relatively small samples.

In short, a team that outperforms its Pythagorean expectation may be considered “lucky” and one that underperforms “unlucky”, but neither state is guaranteed to persist.

This concept has spread from baseball to basketball and American football, and more recently to soccer.

Using Pythagorean expectation in soccer betting

Unlike the predominantly US-based sports, where draws or ties are very rarely a factor, football presents a number of unique challenges when applying a Pythagorean approach.

Draws are the biggest challenge when using Pythagorean expectation in football, followed by the goal-scoring environment in which matches are played. With the wide variety of leagues within soccer, some are likely to be higher scoring than others.

Also, teams may sometimes find themselves forced to play with fewer than 11 players due to red cards, which invariably distorts the scoring events.

Many of these issues have been addressed, notably in a paper by Howard Hamilton.

James’ initial exponent was 2, in keeping with the origins of his equation. Altering the value of the exponent, X, helped to reduce the root mean square error between James’ predicted number of wins and actual wins, and a value of 1.83 is often used in baseball instead of 2.

Football has taken the same approach and the discrepancy between Pythagorean expectation and actual outcomes in this sport approaches a minimum at an exponent of 1.35 rather than 2.

Win percentage is straightforward in sports with few or no drawn games, but in football this figure will include a substantial number of draws. Therefore, win percentage is often equated to the percentage of possible points that may be won, to account for a side being able to pick up a point in a drawn game despite not scoring in that match.

In a 38-game season with three points for a win, there are a possible 114 points at stake. So, a side whose scoring record implied a true win percentage of 50% could expect, under this adaptation, to end a season with 57 league points.
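The arithmetic above can be sketched in a few lines of Python. The function name and defaults are assumptions made for illustration; the 1.35 exponent and the 38-game, three-points-per-win season come from the text.

```python
def expected_points(goals_for, goals_against, games=38, exponent=1.35, points_per_win=3):
    """Scale the Pythagorean share by the maximum points available in a season."""
    share = goals_for**exponent / (goals_for**exponent + goals_against**exponent)
    return share * games * points_per_win


# A side that scores exactly as many as it concedes has a 50% share,
# so it is credited with half of the 114 points at stake.
print(expected_points(50, 50))  # 57.0
```

Note that this simple points-share version will not necessarily reproduce the expectations quoted for individual clubs elsewhere in the article, which may come from a more refined model.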

Further refinements include varying the exponent separately for the numerator and denominator terms of the equation, and including a term based on the number of goals scored and allowed in the exponent itself, to allow for variations in the goal-scoring environment.
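One possible shape for a goal-environment exponent is sketched below. The log-linear form and the `slope` and `intercept` parameters are assumptions chosen purely for illustration; the text does not state the functional form, and real coefficients would have to be fitted to league data.

```python
import math


def environment_exponent(goals_for, goals_against, games, slope, intercept):
    """Exponent that rises with goals per game (assumed log-linear form)."""
    goals_per_game = (goals_for + goals_against) / games
    return slope * math.log10(goals_per_game) + intercept


def environment_points(goals_for, goals_against, games, slope, intercept):
    """Expected points using an exponent that depends on the scoring rate."""
    x = environment_exponent(goals_for, goals_against, games, slope, intercept)
    share = goals_for**x / (goals_for**x + goals_against**x)
    return share * games * 3


# Placeholder coefficients only, not fitted values.
print(round(environment_points(60, 45, 38, slope=1.0, intercept=1.0), 1))
```

The design intent is simply that matches in a higher-scoring league carry a larger exponent, so the same goal ratio translates into a more decisive points share.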

Teams who both score and allow few goals will be more likely to be involved in draws than those who score and concede in greater numbers.

The effect of the choice of exponent is shown by the gradual fall in the root mean square error (RMSE) between expected and actual points as the models become more refined.

An exponent of 2 gives an RMSE of nearly 10 points per team for the 2014/15 Premier League. This figure falls to six points if 1.35 is used, and further to just over 4.4 points if goal environment is included in the exponent.
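A rough sketch of how such a comparison might be run is shown below: compute the RMSE between expected and actual points for each candidate exponent and keep the one with the lowest error. The four team records are invented for illustration and are not 2014/15 Premier League data, so the output says nothing about the figures quoted above.

```python
import math


def pythag_points(goals_for, goals_against, games, exponent):
    """Expected points from the simple points-share adaptation."""
    share = goals_for**exponent / (goals_for**exponent + goals_against**exponent)
    return share * games * 3


def rmse(teams, exponent):
    """Root mean square error between expected and actual points."""
    squared = [
        (pythag_points(gf, ga, games, exponent) - points) ** 2
        for gf, ga, games, points in teams
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical rows: (goals for, goals against, games, actual points)
teams = [(70, 35, 38, 80), (55, 50, 38, 58), (45, 55, 38, 47), (30, 65, 38, 30)]

# Try exponents from 1.00 to 2.50 in steps of 0.05 and keep the best fit.
candidates = [x / 100 for x in range(100, 251, 5)]
best = min(candidates, key=lambda x: rmse(teams, x))
print(best, round(rmse(teams, best), 2))
```

A grid search is used here only because it is easy to read; any one-dimensional minimiser would serve the same purpose.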

Gauging points total over a season

The most widespread use of Pythagoras in soccer is to determine if the number of points won in one season is justified by their scoring records, and therefore is likely to be a good indicator of performance in a subsequent campaign.

A team’s goal difference serves a similar purpose in attempting to separate the possibly unsustainable element of luck from repeatable skill.

Newcastle’s 65 points in 2011/12 was nearly 10 in excess of that expected by a typical Pythagorean expectation for a side that scored 56 goals and conceded 51.

Newcastle’s numerous single-goal victories and a few heavy defeats were unlikely to be repeated by a side that could only score five more goals than they conceded. Therefore, it was unsurprising to see them achieve fewer points in 2012/13.

The table below charts the number of Premier League campaigns in which a current member of the top flight exceeded or fell short of their Pythagorean expectation. The majority of teams show the roughly equal split that you would expect if luck were the major cause of a team’s over- or underperformance.

| Team | Overperforming seasons | Underperforming seasons | Average points above or below expectation |
|---|---|---|---|
| Arsenal | 11 | 12 | 0.0 |
| Aston Villa | 10 | 13 | 0.3 |
| Crystal Palace | 3 | 3 | 0.0 |
| Chelsea | 13 | 10 | 0.4 |
| Everton | 7 | 16 | -0.6 |
| Leicester City | 4 | 5 | -1.8 |
| Liverpool | 5 | 18 | -1.7 |
| Manchester City | 8 | 10 | -1.1 |
| Manchester United | 17 | 6 | 3.1 |
| Newcastle United | 11 | 10 | 0.8 |
| Norwich City | 5 | 2 | 3.3 |
| Southampton | 4 | 12 | -1.4 |
| Stoke City | 5 | 2 | 1.7 |
| Sunderland | 7 | 7 | -0.5 |
| Swansea City | 1 | 3 | -1.7 |
| Tottenham Hotspur | 14 | 9 | 1.2 |
| Watford | 0 | 2 | -2.4 |
| W.B.A | 3 | 6 | -0.6 |
| West Ham United | 12 | 7 | 1.1 |
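A tally of this kind could be assembled roughly as follows, given each season's actual and expected points for a team. The function name and the three sample seasons are hypothetical and do not correspond to any club in the table.

```python
def summarise(seasons):
    """seasons: list of (actual_points, expected_points) pairs for one team.

    Returns (overperforming seasons, underperforming seasons,
    average points above or below expectation).
    """
    diffs = [actual - expected for actual, expected in seasons]
    over = sum(1 for d in diffs if d > 0)
    under = sum(1 for d in diffs if d < 0)
    average = sum(diffs) / len(diffs)
    return over, under, round(average, 1)


# Three invented seasons: one above expectation, two below.
example = [(60, 54.0), (48, 52.5), (50, 51.0)]
print(summarise(example))  # (1, 2, 0.2)
```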

Manchester United are a notable possible exception, as their final league points total has exceeded the total implied by their scoring and conceding record in 17 of their 23 Premier League seasons.

A near-constant factor in this record was, of course, Sir Alex Ferguson, and additional research does appear to support the case for United’s extraordinary ability to score winning goals in the dying minutes of a match.

So there is tentative evidence that United, under Ferguson if not currently, may have been responsible for at least part of their own apparent overachievement.

On the other hand, Liverpool have underachieved in 18 of 23 seasons, but evidence of a persistent trend is less compelling.

Eight of their underperforming years were by three points or fewer, and their average underperformance of 1.7 points per season is only a little over half of United’s average overperformance of 3.1 points.

Narrative-driven validation of a side’s ability to consistently perform above or below their Pythagorean expectation may be enticing, but in many cases a side’s subsequent league performance is more in keeping with their previous Pythagorean expectation than with their previous actual points total.

Profitable Pythagorean expectation trends

Over the Premier League era, actual points won in one season are slightly better correlated to Pythagorean expectation from the previous year than are actual points from the previous season.
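The comparison described above can be sketched as two Pearson correlations against next-season points, one using the previous season's actual points and one using the previous season's Pythagorean expectation. All three lists below are invented for illustration, so the printed values carry no evidential weight.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical figures, one entry per team.
prev_actual = [80, 65, 58, 47, 40, 33]
prev_pythag = [76.0, 57.5, 60.0, 50.5, 43.0, 36.5]
next_actual = [78, 55, 61, 52, 41, 38]

print(round(pearson(prev_actual, next_actual), 3))
print(round(pearson(prev_pythag, next_actual), 3))
```

With real data, a higher value on the second line than on the first would correspond to the pattern the article describes.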

The most eye-catching overperformance came in 1992/93, when Norwich finished third with 72 points in the then 42-game Premier League, despite scoring 61 goals and conceding 65. Sixteen games were won by a single-goal margin, but their Pythagorean expectation was just 55 points, and in 1993/94 they dropped to 12th with 53 points.

Of the ten teams that have exceeded expectation the most in Premier League history, eight saw their total points fall in the next season, while nine of the ten most underperforming teams against their Pythagorean expectation gained more points in the subsequent season.

Chelsea, Tottenham and Liverpool appeared “lucky” in 2014/15, overperforming their Pythagorean expectation by 9, 9 and 7 points respectively. So, with more usual levels of luck, they could expect to gain fewer points in 2015/16.

By contrast, “unlucky” Leicester, Southampton and Everton could expect to exceed their 2014/15 points totals in the 2015/16 Premier League.

Source: pinnacle.com