Handicappers Don't Make Money by Betting Sports

Published: May 18, 2018

Most of this was written a while back, I wouldn't code some of the simulations the same way now, but everything still works so it's being left as is.

The packages needed for examples below.

library(gridExtra)
library(tidyverse)
theme_set(theme_minimal())

Don't Buy Sports Betting Picks From Losers

Handicappers that sell their picks to the general public don't win and are a bad bet for gamblers.

note: the only long term documented exception to this rule I know of is RAS but as they say "one point does not a trend make."

It's unfortunate that this even needs to be said. All of the common sense arguments against touts seem totally logical to me. Why if someone could win often enough to be a profitable bettor would he sell picks? If you find a pot of gold at the end of the rainbow do you call everyone over to share in your new fortunes? Maybe you do-- but professional gamblers' wouldn't. For the same reason the you can't find a book that will teach you how to become a winning bettor you won't find real professionals selling their winning picks. Once the information becomes public it loses its value.

But I'm not here to try and convince you with words why touts are long term losers, we will run some simulations to see how common it is for a coin-flipper to win and exactly how long until the law of large numbers catches up with him.

Simulating Bettors

For the experiment we will create a number of bettors and watch their performance over a set number of bets.

To start off we will make some assumptions:

  1. All bets are -110.
  2. Each bettor has a 50/50 chance of winning any given bet.
  3. Each bettor has an infinite bankroll

The sample function in R makes it easy to run the simulations. First we will define the parameters, i.e. the number of bettors, number of bets, win/loss amount.

Next, the bet simulation is accomplished by repeatedly sampling from the vector c(win, loss). The likelihood of either being selected is the same by default. All of the simulations are put inside a matrix and arranged by bettor.

If there's 100 bettors and they place 100 bets that's 10,000 simulated bets. Since the number of samples to be taken is bigger than the sample pool (win/lose) the setting replace = must be set to TRUE. All of the sims are put into a matrix and the the shape of the matrix is the number of bets by the number of bettors. We stick that matrix inside a dataframe.

bets <- 100
bettors <- 100
win <- 100
loss <- -110

df <-
  as_data_frame(
    matrix(
      sample(c(win, loss),
             bettors * bets,
             replace = TRUE),
      nrow = bets))

Summing up the bets

To keep a running total of the bets across time we will use the mutate_all() function. Then we will add a bet_number column by using the 1:n() function. Finally, use the gather() function to manipulate the data into a tidy format

df <- df %>% 
  mutate_all(funs(cumsum)) %>% 
  mutate(bet_number = 1:n()) %>% 
  gather(bettor, balance, -bet_number)

Graphing the Results

Set the bet number as the x-axis and the balance as the y-axis. This layout tracks how the each bettor performs over the number of bets.

ggplot(df, aes(x = bet_number, y = balance)) +
  geom_line(alpha = .55) +
  labs(x = "Bet Number", y = "Balance ($)") +
  scale_y_continuous(labels = scales::dollar)

bettor performance-1

Everyone Get's Lucky Sometime

The x-axis is the bet number. For an example "25" is the 25th bet in the sequence. We're interested in the far right of the graph that represents the amount of money the bettor ends up with after the sequence of bets. While each line represents a unique bettor our focus is the overall distribution (the average is the slightly darker line in the middle of the wave like shape).

After 100 bets quite a few bettors end their sequence up money. Remember that these bettors are coin-flippers and are just as likely to win as lose. However, since the bettors lose more money than they win (-110 vs +100) it's more likely that a bettor will end up negative. We can figure out how often we need to win to breakeven with some simple math.

110/(100 + 110) = .5238

In order to just breakeven after a sequnce of n bets we need to win 52.4% of our bets betting into -110 lines. The line between winning and losing is razor thin. The question then becomes how long until your luck runs out. Since we're going to changing parameters like number of bets, win and loss percentage and number of bettors we're going to turn our code into a reusable function.

bettor_sim <-
  function(bets = 100, bettors = 100,
           win = 100, loss = -110, winPct = .5) {
  df <- as_data_frame(
    matrix(
      sample(c(win, loss),
             bettors * bets,
             replace = TRUE,
             prob = c(winPct, (1 - winPct))
             ),
      nrow = bets))
  
  df %>% 
    mutate_all(funs(cumsum)) %>%
    mutate(bet_number = 1:n()) %>%
    gather(bettor, balance, -bet_number) %>%
    ggplot(aes(x = bet_number, y = balance)) +
    geom_line(alpha = .35) +
    scale_y_continuous(labels = scales::dollar) +
    labs(x = "Bet Number", y = "Balance ($)")
  }

bettor_sim(bets = 5000)

bettor sim output-1

The last row of the dataframe is the ending balance so it is possible to look at a distribution of the bettor results. We will have to alter the function a little to return the actual data instead of a plot.

bettor_sim_data <-
  function(bets = 100, bettors = 100,
           win = 100, loss = -110, winPct = .5) {
  df <- as_data_frame(
    matrix(
      sample(c(win, loss),
             bettors * bets,
             replace = TRUE,
             prob = c(winPct, (1 - winPct))
             ),
      nrow = bets)) %>%
    mutate_all(funs(cumsum)) %>%
    mutate(bet_number = 1:n()) %>%
    gather(bettor, balance, -bet_number)
  
  df
  }

bettor_sim_data(bets = 100, bettors = 500) %>%
  filter(bet_number == 100) %>% 
  ggplot(., aes(balance)) +
  geom_histogram(fill = "bisque", color = "white",binwidth = 500) +
  scale_x_continuous(labels = scales::dollar)

bettor sim dist-1

Still a loser

The mean is slightly below zero after 100 bets meaning that most bettors broke even in their bets but the cost of juice brought them into the red. There's still quite a few bettors in the positive, some as high as $2,000. At 100 a bet 2k is 20 units. Quite a score. When the number of bets increases the number of bettors to the right of zero starts to decrease.

First let's increase the number of bets from 100 to 500, then to 2500, and finally 5000. (while 5000 bets sounds like a lot, if you placed 3 bets a day on average for 350 of the 365 days in a year you would hit five thousand in less than five years. In reality you will hit this number much sooner).

# Plot 1: 500 bettors 100 bets
p1 <- bettor_sim_data(bets = 100, bettors = 500) %>%
  filter(bet_number == 100) %>% 
  ggplot(., aes(balance)) +
    geom_histogram(fill = "bisque", color = "white",binwidth = 500) +
    geom_vline(aes(xintercept=0), col ="black") +
    scale_x_continuous(labels = scales::dollar) +
    labs(title = "Bets: 100")

# Plot 2: 500 bettors 500 bets - remember to change bet_number in filter
p2 <- bettor_sim_data(bets = 500, bettors = 500) %>%
  filter(bet_number == 500) %>% 
  ggplot(., aes(balance)) +
    geom_histogram(fill = "chartreuse3", color = "white",binwidth = 2000) +
    geom_vline(aes(xintercept=0), col ="black") +
    scale_x_continuous(labels = scales::dollar) +
    labs(title = "Bets: 500") 


# Plot 3: 500 bettors 2500 bets - remember to change bet_number in filter
p3 <- bettor_sim_data(bets = 2500, bettors = 500) %>%
  filter(bet_number == 2500) %>% 
  ggplot(., aes(balance)) +
    geom_histogram(fill = "cyan2", color = "white",binwidth = 2500) +
    geom_vline(aes(xintercept=0), col ="black") +
    scale_x_continuous(labels = scales::dollar) +
    labs(title = "Bets: 2500") 

# Plot 4: 500 bettors 5000 bets - remember to change bet_number in filter
p4 <- bettor_sim_data(bets = 5000, bettors = 500) %>%
  filter(bet_number == 5000) %>% 
  ggplot(., aes(balance)) +
    geom_histogram(fill = "darkslateblue", color = "white",binwidth = 5000) +
    geom_vline(aes(xintercept=0), col ="black") +
    scale_x_continuous(labels = scales::dollar) +
    labs(title = "Bets: 5000") 


grid.arrange(p1, p2, p3, p4, nrow = 2, ncol = 2)

bettor sim multi-1

The black vertical line in each plot marks the $0 point. Everything to the left of the line is a negative result. As the number of bets rise the number of players that end in the negative increases. By 2,500 bets almost all of the bettors are negative. After 5,000 bets all of the bettors are far below zero costing them thousands of dollars. These results assume that each bettor wins 50% of his bets.

What if touts hit at the rate they claim to?

Before we get ahead of ourselves looking at a 70% win rate let's be a little more realistic and see what a 55% winner looks like. Rather than rerunning the same four functions from above and combining them again (with new win percentage parameters) we will turn the code above into a resuable function so we can easily rerun our simulations with new parameters.

bet_hist_dist <- function(bets, bettors, winPct, color,bw) {
  bettor_sim_data(bets = bets,
                  bettors = bettors,
                  winPct = winPct) %>%
    filter(bet_number == bets) %>%
    ggplot(aes(balance)) +
    geom_histogram(fill = color,
                   color = "white",
                   binwidth = bw) +
    geom_vline(aes(xintercept = 0), color = "black") +
    scale_x_continuous(labels = scales::dollar) +
    labs(title = paste("Bets", bets))
}

bet_hist_dist(100, 500, .50, "azure3", 500)

bettor sim breakeven-1

The function will plot histogram similar to the histograms above. There's a few extra parameters now; color and bw (which is short for binwidth). These two parameters make it easier when we combine multiple historgrams. Color is obvious but we turn bindwidth into a parameter because increasing the number of bets also increases the amount won and lost.vWe could just use a formula like bets * 3 for binwidth but setting the parameter lets us control the shape of the histrogram to fit in with the other histograms on the page.

Now that we have a working function that plots the data in a format it's time to rerun the function to explore how a 55% fares over different number of bets.

Instead of running the function four seperate times each with different parameters we can use the pmap function from the purrr package. Pass a list of lists containing the various parameters and the function to pmap. To keep things clean assign the list of lists to a object named params. Next assign the results of pmap to po which is now a list containing the four different histograms (the reason the number of histograms is four is because our params list has a list containing four items, pmap will iterate over each setting).

Select each list item using the [[ ]] notation and put all four of them in the grid.arrange function.

# The parameters that go to the bet_hist_dist function above.
params <- list(
  bets = list(100, 500, 2500, 5000),
  bettors = list(500),
  winPct = list(.55),
  color = list("bisque", "chartreuse3", "cyan2", "darkslateblue"),
  bw = list(500, 1000, 2500, 5000)
)

po <- pmap(params, bet_hist_dist)
grid.arrange(po[[1]], po[[2]], po[[3]], po[[4]], nrow = 2, ncol = 2)

bettor sim 55-1

Considering that it's nearly impossible to go broke hitting 55% after 2500 bets and after 5000 bets you'll be up on average 30k (remember this is on $100 units) running the same sim on 60%+ winner is redundent.

If you can hit 55% consistently and get down it's like having your own private ATM. Think about that the next time you seeing someone on twitter selling their "70% Weekend Special" for $199.

Variable Bet Sizing

So far we've been using a flat betting model where our bettors bet size never varies. While this might be easier to model it isn't very realistic. We could go with something like the Kelly criterion to determine the bet size but we'll opt for easy again and use "x" percentage of the bettor's bankroll. This means instead of using an infinite bankroll where a player can never run out of money we will start with a fixed sum. The bankroll size will determine the bet amount.

bettor_bankroll_sim <- function(bets=500, bet_size_p=0.05, bank_roll=1000, winPct=.5) {
  results <- rep(0, bets)
  for (i in 1:length(results)) {
    bet <- bet_size_p * bank_roll
    bet <- sample(c(bet, (bet+(bet*.1))*-1),1, prob = c(winPct, (1-winPct)))
    bank_roll <- sum(bank_roll, bet)
    results[i] <- bank_roll
  }
  results
}

bettor_bankroll_sim(500, .05, 1000) %>% head()

## [1] 1050.0000  992.2500  937.6762  886.1041  837.3683  879.2367

df <- bind_cols(as_data_frame(replicate(500, bettor_bankroll_sim())))
names(df) <- map_chr(1:500, function(.x) {
  paste0("bettor_", .x)
})
sum(tail(df, 1) > 1000)

## [1] 15

The last bit of code gets the number of bettors that, after 500 bets, have a higher bank roll than they started with. That's out of 500 bettors, hitting 50% winners. Let's look at how 55% winners do.

df <-
  bind_cols(as_data_frame(replicate(500, bettor_bankroll_sim(winPct = .55))))
names(df) <- map_chr(1:500, function(.x) {
  paste0("bettor_", .x)
})
sum(tail(df, 1) > 1000)

## [1] 368

Let's take a look at the average balance of a 55% winner after 500 bets and a bet size equal to 5% of the bettor's bankroll.

df %>% 
  tail(1) %>% 
  rowMeans()

## [1] 3945.657

Now let's plot it.

df %>% 
  tail(1) %>% 
  gather(bettor, balance, 1:500) %>% 
  ggplot(., aes(balance)) +
    geom_histogram(fill = "azure3", col="white", binwidth = 1500) +
    scale_x_continuous(labels = scales::dollar) +
    theme_minimal()

Simulated 55 percent bettor

Plotting the final balance of each bettor gives a striking visual of how profitable being a 55% winner can be. This is only after 500 bets starting with $1,000. Let's do something silly and see what it looks like when we start hitting 60% and we will bump the number of bets up to 1,000.

df <-
  bind_cols(as_data_frame(replicate(
    500, bettor_bankroll_sim(bets = 1000, winPct = .60)
  )))

df %>% 
  tail(1) %>% 
  gather(bettor, balance, 1:500) %>% 
  ggplot(., aes(balance)) +
    geom_histogram(fill = "azure3", col="white") +
    scale_x_continuous(labels = scales::dollar) +
    theme_minimal()

Simulated 60 percent bettor

You're not seeing that wrong. The plot shows the majority of bars between $0 and $20 million dollars. We purposely didn't set the binwidth in order not to spoil the surprise. Since that's such an extreme spread let's look at the how the average bettor did after 1,000 bets.

df %>% 
  tail(1) %>% 
  rowMeans()

## [1] 2941267

The mean is approx 2.7 million dollars. Take that in... On average if you hit 60% of your bets, betting into -110 lines, while betting 5% of your bankroll after starting with $1,000, you will end up 2.7 million dollars richer after 1,000 bets.

How does the best and worst bettors' fare?

df %>% 
  tail(1) %>% 
  t() %>% 
  summary()

##        V1          
##  Min.   :   14045  
##  1st Qu.:  298185  
##  Median :  855186  
##  Mean   : 2941267  
##  3rd Qu.: 2452648  
##  Max.   :52071785

The bettor with the worst 'luck' ends up 12x his starting balance while the bettor with the best outcome is sitting on roughly 64 million dollars.