ggplot(raw) +
  geom_point(aes(x = pick_no, y = points, color = status), alpha = .5) +
  labs(x = 'Pick', y = 'Fantasy Points') +
  theme_bw() +
  theme(legend.position = 'bottom') +
  guides(color = guide_legend(title = 'Pick Type'))
Lukas Hager
May 18, 2025
I suck at fantasy football.
Now that that’s out of the way, some background: my friend Matt Shintaku kindly invited me to be a part of his fantasy football league in 2021. It’s a draft keeper league – basically, every year you can keep eight players from your 26-player roster, at the cost of using a pick one round higher than where you took them. So if you drafted Lamar Jackson in the fifth round, you may keep him the following year in exchange for your fourth-round pick. Outside of that, the draft is completely standard: pick the player you want when it’s your turn, or trade your pick.
One of my all-time favorite Economics papers is Thaler and Massey’s analysis of the NFL draft, which concludes that teams are overconfident and trade up too much (among a lot of other interesting findings). I’ve been thinking about doing something similar with our fantasy draft, and now that we have four drafts’ worth of data, I figured it might be a good time to take a look at how we’ve actually done.
First, here’s a plot of all of our draft picks and the fantasy points they accrued in the season in which they were drafted:
Keepers obviously distort the picture somewhat; by construction, keepers should outperform their draft slot in expectation, which we see to some degree in this plot (especially for keepers taken near the end of the draft). For the rest of the project, we’ll just use the non-keeper picks.
First, we’ll follow Thaler and Massey’s idea of modeling pick value as an exponential curve. Annoyingly, in our context the first overall pick has been quite a bad pick, so indexing to that pick makes the exponential fit work poorly. Instead, we’ll just model things in terms of absolute points. We’ll use the following hierarchical model for the number of points scored by pick \(i\in \{1,...,312\}\), denoted \(y_i\):
\[\begin{split} \mu_i &= \beta + \exp\left(\mu_a + \mu_b \times i\right) \\ \sigma_i &= \sigma_a + \sigma_b \times i \\ y_i &\sim \mathcal{N}(\mu_i, \sigma_i) \end{split}\]
This is not a particularly nuanced Bayesian model; we’re simply modeling the expected value of the pick as an exponential function, and the standard deviation as a linear one. This model is sufficiently flexible to handle the non-linearity that is clearly present for the early picks (left-hand side of the graph), and can handle the possibility of more or less variance in the value of the picks as the draft continues. We’re primarily going to be interested in the \(\mu_i\) expression, but understanding the variance is helpful for contextualizing how important gaining expected points in a trade is.
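To make the functional form concrete, here’s a small base R sketch of the two curves. The parameter values are illustrative placeholders, not fitted estimates:

```r
# mean curve: exponential decay toward an asymptote (the intercept)
mu_curve <- function(i, intercept = 100, mu_a = 5.4, mu_b = -0.012) {
  intercept + exp(mu_a + mu_b * i)
}

# standard deviation curve: linear in the pick number
sigma_curve <- function(i, sigma_a = 95, sigma_b = 0.03) {
  sigma_a + sigma_b * i
}

# expected value falls steeply over the first rounds, then flattens
head(mu_curve(1:312))
```

The intercept keeps the curve from decaying to zero: even a final-round pick retains some baseline expected value.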
We’ll fit the model in Stan – the Stan model is as follows:
data {
  int<lower=0> N;
  vector<lower=0>[N] y;
  vector<lower=1, upper=312>[N] x;
  // grid of all pick numbers, used for posterior prediction below
  vector[312] picks;
}
parameters {
  real mu_a;
  real<upper=0> mu_b;
  real sigma_a;
  real sigma_b;
  real<lower=0> intercept;
}
model {
  mu_a ~ normal(6, 2);
  mu_b ~ normal(-.03, .1);
  y ~ normal(intercept + exp(mu_a + mu_b * x), sigma_a + sigma_b * x) T[0,];
}
generated quantities {
  array[312] real pick_value
    = normal_rng(intercept + exp(mu_a + mu_b * picks), sigma_a + sigma_b * picks);
}
Note that we’re putting some (fairly weak) priors on the \(\mu\) parameters; this is to ensure that the Hamiltonian Monte Carlo sampler doesn’t go too far afield and cause the exponent to blow up. Using cmdstanr, we can fit the model like so:
model <- cmdstan_model('/Users/hlukas/git/fantasy/sleeper/data_analysis/draft_stan_model.stan')
raw_no_keepers <- raw %>%
filter(status != 'Keeper')
data_list <- list(
N = nrow(raw_no_keepers),
x = raw_no_keepers$pick_no,
y = pmax(raw_no_keepers$points,0),
picks = c(1:312)
)
fit <- model$sample(
data = data_list,
seed = 123,
chains = 4,
parallel_chains = 4,
iter_warmup = 4000,
iter_sampling = 4000,
show_messages = FALSE,
show_exceptions = FALSE
)
out <- fit$summary()
samples <- fit$draws(format = 'df') %>%
tibble()
Inspecting the diagnostics via fit$diagnostic_summary():
$num_divergent
[1] 0 0 0 0
$num_max_treedepth
[1] 0 0 0 0
$ebfmi
[1] 1.0180230 1.0076873 0.9698419 0.9698432
All clear there. Looking at the summary output:
| Variable | Q5 | Mean | Median | Q95 | SD | R Hat |
|---|---|---|---|---|---|---|
| mu_a | 5.2662260 | 5.3784368 | 5.3782650 | 5.4908425 | 0.0683239 | 1.001278 |
| mu_b | -0.0160566 | -0.0116183 | -0.0114510 | -0.0076729 | 0.0025784 | 0.999952 |
| sigma_a | 86.7827550 | 94.2083632 | 94.0338500 | 102.2472000 | 4.7304756 | 1.000238 |
| sigma_b | -0.0254999 | 0.0258107 | 0.0254513 | 0.0779757 | 0.0315054 | 1.000332 |
| intercept | 70.9216150 | 101.0184737 | 103.3615000 | 122.9521500 | 16.2532651 | 1.000206 |
This gives rise to what I’m calling the Shintaku Equation:
\[\text{Pick $p$ value} = 101.018 + \exp(5.378 - p \times .0116)\]
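As a quick lookup, the fitted curve (posterior means from the summary table) can be wrapped in a function; this helper is my own sketch, not part of the original analysis:

```r
# expected fantasy points for pick p, plugging in posterior means
# (intercept = 101.018, mu_a = 5.378, mu_b = -0.0116)
shintaku <- function(p) {
  101.018 + exp(5.378 - 0.0116 * p)
}

shintaku(c(1, 50, 100, 312))  # expected points by pick
```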
If we just use the mean values (with 90% credible intervals from the posterior predictive draws), we can compare against the raw data we plotted earlier to see how reasonable the curve looks.
out_for_plot <- out %>%
  mutate(pick_no = as.numeric(str_extract(variable, '\\d+'))) %>%
  filter(!is.na(pick_no))

ggplot() +
  geom_point(
    data = raw_no_keepers,
    aes(x = pick_no, y = points),
    color = 'dodgerblue3',
    alpha = .5
  ) +
  geom_line(
    data = out_for_plot,
    aes(x = pick_no, y = mean),
    color = 'red',
    linetype = 'dashed'
  ) +
  geom_errorbar(
    data = out_for_plot,
    aes(x = pick_no, ymin = q5, ymax = q95),
    alpha = .2,
    color = 'red'
  ) +
  labs(
    x = 'Pick',
    y = 'Fantasy Points'
  ) +
  theme_bw()
We can leverage the posterior samples to simulate trades. We have to do this in two steps: first, compute each pick’s implied value distribution from the posterior draws; then, simulate season outcomes for the picks on each side of a proposed trade.
samples <- fit$draws(format = 'df') %>%
tibble() %>%
tail(4*4000) %>%
select(intercept, mu_a, mu_b, sigma_a, sigma_b, .chain, .iteration)
all_implied_values <- bind_rows(
lapply(
c(1:312), function(x){
samples %>%
mutate(pick_no = x)
}
)
) %>%
mutate(pick_mean = intercept + exp(mu_a + pick_no * mu_b),
pick_sd = sigma_a + pick_no * sigma_b)
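The bind_rows/lapply expansion above just evaluates the mean and sd curves at every pick for every posterior draw. The same draws-by-picks structure can be sketched with base R’s outer() – here on toy draws with hypothetical values roughly matching the summary table, since the real draws live in the samples tibble:

```r
set.seed(1)
n_draws <- 100
# toy posterior draws -- hypothetical numbers in the ballpark of the fit
draws <- data.frame(
  intercept = rnorm(n_draws, 101, 16),
  mu_a      = rnorm(n_draws, 5.38, 0.07),
  mu_b      = rnorm(n_draws, -0.0116, 0.0026),
  sigma_a   = rnorm(n_draws, 94.2, 4.7),
  sigma_b   = rnorm(n_draws, 0.026, 0.032)
)
picks <- 1:312

# rows index draws, columns index picks; length-n_draws vectors recycle down columns
pick_mean <- draws$intercept + exp(draws$mu_a + outer(draws$mu_b, picks))
pick_sd   <- draws$sigma_a + outer(draws$sigma_b, picks)
dim(pick_mean)  # n_draws x 312
```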
# define a comparison function: simulates season outcomes for the picks
# on each side of a trade and tallies who comes out ahead
comparison <- function(team_a_gives, team_b_gives, n_sims = 100){
  dist <- bind_rows(
    lapply(1:n_sims, function(sim_id){
      all_implied_values %>%
        filter(pick_no %in% c(team_a_gives, team_b_gives)) %>%
        mutate(sim_no = sim_id,
               # one simulated outcome per posterior draw and pick;
               # n() avoids hard-coding the number of draws
               sim = rnorm(n = n(), pick_mean, pick_sd),
               given_by = ifelse(pick_no %in% team_a_gives, 'team_a', 'team_b')) %>%
        select(.chain, .iteration, sim_no, given_by, sim)
    })
  ) %>%
    group_by(.chain, .iteration, given_by, sim_no) %>%
    summarise(total = sum(pmax(sim, 0)), .groups = 'drop') %>%
    pivot_wider(names_from = 'given_by', values_from = 'total') %>%
    mutate(diff = team_a - team_b,
           winner = ifelse(diff > 0, 'Team B', 'Team A'))
  ggplot(dist) +
    geom_histogram(aes(x = diff, fill = winner), color = 'white', bins = 40) +
    geom_vline(xintercept = 0, linetype = 'dashed', color = 'red') +
    scale_y_continuous(expand = c(0, 0)) +
    labs(
      x = 'Points Increase for Team B',
      y = 'Simulations',
      title = str_interp('Team B Wins In ${round(sum(dist$winner == "Team B") * 100 / nrow(dist), 2)}% of Simulations')
    ) +
    theme_bw() +
    # theme() must come after theme_bw(), or the complete theme resets it
    theme(legend.position = 'bottom')
}
We’ll try it out by proposing a trade of the first overall pick from Team A for the 100th overall pick owned by Team B:
This is pretty remarkable; there’s so much variance in the historical outcomes of the picks that the trade works out for Team A 15% of the time! The more realistic trades we might consider are all going to wind up being pretty close. Let’s look at one of the first trades in our league’s history – picks 45 and 244 in exchange for 56 and 113:
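As a back-of-the-envelope check on that 15% figure, we can plug the posterior means from the summary table into independent simulations in base R. This sketch ignores posterior parameter uncertainty (the full comparison above propagates it), so it should only land in the same neighborhood:

```r
set.seed(123)
# mean and sd curves evaluated at the posterior means from the summary table
mean_curve <- function(p) 101.018 + exp(5.378 - 0.0116 * p)
sd_curve   <- function(p) 94.208 + 0.0258 * p

# share of simulations in which the picks Team A receives outscore those it gave up
trade_sim <- function(team_a_gives, team_b_gives, n_sims = 1e5) {
  draw_total <- function(picks) {
    sims <- matrix(
      pmax(rnorm(n_sims * length(picks), mean_curve(picks), sd_curve(picks)), 0),
      nrow = length(picks)
    )
    colSums(sims)  # one season total per simulation
  }
  mean(draw_total(team_b_gives) > draw_total(team_a_gives))
}

trade_sim(1, 100)  # Team A trades pick 1 for pick 100
```

With the fitted means this comes out near the 15% reported above.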
Even a trade that seemed fairly lopsided at the time turns out, with the benefit of hindsight, to be decided by a relatively minor margin!
In conclusion: will this help me in fantasy football? Almost certainly not. But is it fun to pretend to be Richard Thaler? Absolutely.