
Writer, analyst, podcaster, Spurs fan. Three out of four is not bad. If there is a data angle, I will find it.
Can AI Beat The Bookmakers
Premier League betting is a global phenomenon, with punters everywhere testing their knowledge each week and taking on football betting sites. Despite the unpredictability, many rely on their insights, but what if AI could enhance those predictions? At OLBG, we're exploring how two predictive models could forecast Premier League match outcomes and scores.
With those predictions, we are going to place two £1 bets on each of the 380 Premier League matches: one on the outcome and one on the correct score.
That is £760 in bets per model across the two markets, and with two models the eventual total outlay is £1,520 - can we end the season with a profit to show for it?
The Super Models
Doing their best to get the better of the bookmakers are two AI models.
In the red corner is the offering from OpenAI: ChatGPT.
In the blue corner is the offering from xAI: Grok.
Each has been asked the same question - can you predict the upcoming set of Premier League fixtures?
To make things slightly different, though, the level of input given to each model is different, which lets us see which of the two performs better.
With the 2024/25 season having come to a close, we can look at how successful the two models have been.
Results In Business
380 Premier League matches have taken place in the 2024/25 season and if we look solely at how the two models have performed in terms of purely predicting results, their scorecard is as follows:
Model | Matches | Correct | Incorrect | Correct % | Incorrect % |
---|---|---|---|---|---|
ChatGPT | 380 | 186 | 194 | 48.95% | 51.05% |
Grok | 380 | 175 | 205 | 46.05% | 53.95% |
With 380 games in the record books, we can see a difference between the two models. ChatGPT has come out on top with 186 correct outcomes across the season - 11 better off than its Grok rival with 175.
That puts ChatGPT at 48.95% correct on outcomes and Grok at 46.05%. The consensus is that you need 55% of all Premier League predictions to be correct across the season to be profitable in flat stakes betting, due to the nature of the odds involved, so it is fair to say that both models have fallen short of the necessary benchmark.
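To make the strike-rate arithmetic concrete, here is a minimal Python sketch using the figures from the table above. The function names are our own, and the break-even calculation rests on a simplifying assumption that every winning bet pays the same decimal odds, which is not how a real season works.

```python
# Outcome records from the results table: (correct predictions, total matches).
records = {"ChatGPT": (186, 380), "Grok": (175, 380)}


def strike_rate(correct: int, total: int) -> float:
    """Share of predictions that were correct, as a fraction between 0 and 1."""
    return correct / total


def break_even_odds(rate: float) -> float:
    """Average decimal odds winners must pay for flat stakes to break even.

    Assumption (ours, for illustration): all winning bets pay the same odds.
    """
    return 1 / rate


for model, (correct, total) in records.items():
    rate = strike_rate(correct, total)
    print(f"{model}: strike rate {rate:.2%}, "
          f"break-even average odds {break_even_odds(rate):.2f}")
```

Under that assumption, a lower strike rate can still pay its way if the winners come in at bigger prices, which is why the strike rate alone does not settle the question.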
Then again, for all the analysis in percentage terms, both models also need to churn out a profit. It is no good if a model simply predicts Manchester City and Liverpool to win each week; these two AI-based boffins need to get the closer games correct to have any chance of turning this into a profitable venture.
This is where the second table comes into play, showing what £380 staked on each model returned by the end of the season:
Model | Stake | Return | P/L | ROI % |
---|---|---|---|---|
ChatGPT | 380 | 337.78 | -42.22 | -11.11% |
Grok | 380 | 337.05 | -42.95 | -11.30% |
In terms of ChatGPT's offering, an outlay of £380 has returned £337.78 - a loss of £42.22, giving us a negative ROI of 11.11%. By comparison, the £380 spent on Grok has returned slightly less at £337.05 - a loss of £42.95, giving us a negative ROI of 11.30%.
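For anyone who wants to check the sums, the profit/loss and ROI figures can be reproduced with a short Python sketch. The helper name `profit_and_roi` is ours, and we assume the "Return" column includes the stake handed back on winning bets.

```python
def profit_and_roi(stake: float, returned: float) -> tuple[float, float]:
    """Return (profit or loss in pounds, ROI as a percentage of the stake)."""
    profit = returned - stake
    roi = profit / stake * 100
    return profit, roi


# (total stake, total return) for the outcome bets, taken from the table.
outcome_bets = {"ChatGPT": (380, 337.78), "Grok": (380, 337.05)}

for model, (stake, returned) in outcome_bets.items():
    profit, roi = profit_and_roi(stake, returned)
    print(f"{model}: P/L £{profit:+.2f}, ROI {roi:+.2f}%")
```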
Both models have missed out on a 55% strike rate and returned a negative yield of around 11% across the campaign. But what happens if we spend the same outlay on correct score betting?
What's The Score
Here the same rules apply - the same flat stake of £1 for each game - but this time we are not betting on the outcome that either model has provided but on the correct score that it has served up.
The most important thing to remember here is that we do not need the same strike rate as for the outcome predictions, because correct score bets are priced at much greater odds.
Therefore, if we look at the data that comes from correct score bets, it is as follows:
Model | Stake | Correct Score Return | P/L | ROI % | Correct Scores | Correct Score % |
---|---|---|---|---|---|---|
ChatGPT | 380 | 335.50 | -44.50 | -11.71% | 35 | 9.21% |
Grok | 380 | 389.50 | 9.50 | 2.50% | 39 | 10.26% |
Here we can see that predicting correct scores alone could be where the money is, although with a strike rate of 9.21%, ChatGPT is still coming up short in terms of profit at the end of the season.
The same cannot be said for Grok, which has managed to exceed expectations and secure a small profit on correct scores. A 10.26% hit rate - the equivalent of roughly one correct score per game week - has tipped matters out of the red and into the black.
The models made 35 (ChatGPT) and 39 (Grok) successful predictions (not always the same matches across the two samples) and the final results are:
ChatGPT: £44.50 Loss, -11.71% ROI
Grok: £9.50 Profit, 2.50% ROI
As you can see, this is where Grok starts to show its talents. A number of long shots leave Grok with a £9.50 profit to bank after all 380 scores have been predicted - a much healthier return than the loss that ChatGPT recorded.
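One way to see why a hit rate of around 10% can be enough is to compare what each winning bet returned on average with what it needed to return to cover all 380 stakes. The Python sketch below does that with the table figures; the function names are ours, and we assume the return column includes the £1 stake on winning bets.

```python
# (total stake, total return, number of correct scores) from the table.
correct_score = {"ChatGPT": (380, 335.50, 35), "Grok": (380, 389.50, 39)}


def average_winning_return(returned: float, hits: int) -> float:
    """Average amount returned per winning £1 bet."""
    return returned / hits


def required_winning_return(stake: float, hits: int) -> float:
    """Average return per winner needed to cover every stake placed."""
    return stake / hits


for model, (stake, returned, hits) in correct_score.items():
    actual = average_winning_return(returned, hits)
    needed = required_winning_return(stake, hits)
    verdict = "profit" if actual > needed else "loss"
    print(f"{model}: averaged £{actual:.2f} per winner, "
          f"needed £{needed:.2f} -> {verdict}")
```

On these numbers Grok's winners paid a little more than it needed on average, while ChatGPT's paid a little less, which matches the profit and loss in the table.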
Overall Results
To wrap up the analysis, we need to look at the total spend of both models across outcome and correct score bets. When we combine the data, are the two models turning a profit or not?
Model | Stake | Combined Return | P/L | ROI % |
---|---|---|---|---|
ChatGPT | 760 | 673.28 | -86.72 | -11.41% |
Grok | 760 | 726.55 | -33.45 | -4.40% |
Grand Total | 1520 | 1399.83 | -120.17 | -7.91% |
When we combine the total spends of each of the two models, we can see the following:
ChatGPT's model has spent £760 and returned £673.28 - a negative ROI of 11.41% (£86.72 down)
Grok's model has spent £760 and returned £726.55 - a negative ROI of 4.40% (£33.45 down)
And if we also account for overall total spend across the two models, the result is as follows:
Total Spend £1,520, return £1,399.83 - a negative ROI of 7.91% (£120.17 down)
Both models finish in the red, which means a total loss of 7.91% - just under 8p lost for every £1 staked.
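The overall ROI comes from pooling the stakes and returns, not from adding the two models' percentages together. A minimal Python sketch with the combined figures from the table (variable names are ours):

```python
# (total stake, combined return) per model, taken from the combined table.
combined = {"ChatGPT": (760, 673.28), "Grok": (760, 726.55)}

total_stake = sum(stake for stake, _ in combined.values())
total_return = sum(returned for _, returned in combined.values())
total_profit = total_return - total_stake

# Pooled ROI: total profit or loss as a percentage of the total stake.
pooled_roi = total_profit / total_stake * 100

print(f"Stake £{total_stake}, return £{total_return:.2f}, "
      f"P/L £{total_profit:+.2f}, ROI {pooled_roi:+.2f}%")
```

Because both models staked the same £760, the pooled figure sits midway between their individual ROIs.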
Considering there was minimal input bar asking ChatGPT or Grok to predict the scores each week, it's a respectable return, and with more insightful prompts there is no reason why the models cannot improve next time around.
If we were to re-ask the question as to whether AI can predict Premier League results and beat the bookmakers, we would have to say the bookmakers came out on top overall, although Grok's correct score profit kept it a close-run fight until the final day of the season.
Can AI come out on top for 2025/26?