By using both last year's record AND draft pick values in the same model, you are violating one of the most sacred rules in regression. Draft pick values, by definition, are (overwhelmingly) a function of last year's record. The 2 variables are therefore highly collinear.

The same problem exists for M-T surplus values and last year's record. The former is a function of the latter. Using all 3 variables in the same model would be even worse.

It would be like using height and arm length in a regression model that estimates how high someone can do the "vertical" at the combine. Height and arm length are too closely related to each other. It's dividing up the dependent variable's variance among variables that represent the same thing.
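To make the height/arm-length point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the data are synthetic (not combine or NFL numbers), and `fit_two_predictors` and `slope_spread` are made-up helper names. It fits a two-predictor regression many times and measures how much the first coefficient jumps around when the second predictor is nearly a copy of the first.

```python
import random
import statistics

def fit_two_predictors(x1, x2, y):
    """OLS slopes for y ~ x1 + x2 (with intercept), via the 2x2 normal equations."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def slope_spread(noise_in_x2, runs=200, n=50, seed=1):
    """Std. dev. of the fitted x1 slope across repeated synthetic samples."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(runs):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 is x1 plus noise: small noise means near-perfect collinearity
        x2 = [a + rng.gauss(0, noise_in_x2) for a in x1]
        # assumed true model: y depends equally on both predictors, plus noise
        y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        slopes.append(fit_two_predictors(x1, x2, y)[0])
    return statistics.stdev(slopes)

collinear_spread = slope_spread(noise_in_x2=0.05)   # x1 and x2 almost identical
independent_spread = slope_spread(noise_in_x2=5.0)  # x2 mostly unrelated to x1
print("spread of x1 slope, collinear predictors:  ", collinear_spread)
print("spread of x1 slope, independent predictors:", independent_spread)
```

In this toy setup the true coefficient on `x1` is 1 in both cases. What changes is how reliably a single regression recovers it, and with that the significance reported for each variable.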

Here's perhaps a better idea: Create a new variable "Delta Wins" representing the change in the # of wins from year to year for each team (DeltaWINSn = WINSn - WINSn-1, i.e., this year's wins minus the previous year's). Use that as your dependent variable. Now you can account for the previous year's record without the collinearity problem. Do 3 separate single-variable regressions using conventional pick values, then M-T surplus values, and finally previous year's wins.

DeltaWINSn vs. Conventional draft value

DeltaWINSn vs. M-T surplus value

DeltaWINSn vs. WINSn-1 (for comparison)
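A rough sketch of what those single-variable regressions could look like in code. The numbers below are synthetic stand-ins generated by an assumed toy process, not real NFL records or real pick values, and `simple_regression` is a hypothetical helper. An M-T surplus column would be run through the same function as a third predictor.

```python
import random
import statistics

def simple_regression(x, y):
    """Return (slope, intercept, r_squared) for the one-variable fit y ~ x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

# Synthetic stand-in data, NOT real NFL records: 160 team-seasons
rng = random.Random(0)
wins_prev = [rng.randint(2, 14) for _ in range(160)]
# assumed toy process: teams drift halfway back toward 8 wins, plus noise
wins_now = [8 + 0.5 * (w - 8) + rng.gauss(0, 2.5) for w in wins_prev]
delta_wins = [now - prev for now, prev in zip(wins_now, wins_prev)]
# hypothetical pick-value column: worse records get more valuable picks
pick_value = [16 - w + rng.gauss(0, 1) for w in wins_prev]

results = {}
for label, predictor in [("draft value", pick_value), ("WINSn-1", wins_prev)]:
    results[label] = simple_regression(predictor, delta_wins)
    slope, intercept, r2 = results[label]
    print(f"DeltaWINSn vs. {label}: slope={slope:.2f}, r^2={r2:.2f}")
```

With real data, the comparison of interest is whether either draft-value regression explains noticeably more of DeltaWINSn than the WINSn-1 baseline does.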

I'd do it myself but you did the legwork and got the data! You'd probably need several years of data to see significance.

Again, I love your site.

]]>it reads as if you used the theoretical draft order to compute your surplus value variable. In reality, picks don’t go #1…#32. Cleveland, for example, had 2 1st round picks this year. You’d have to compute the real surplus value team by team, year by year for several years to get a good number.

No, I did use the value of the actual picks, not the originally-owned picks.

Assuming you did this and the .07 number is valid, this is no small result. It is very large when compared to other stats conventionally accepted as very predictive of a team’s success. Regressing previous year’s wins on next year wins yields an r-squared of … .06.

You're misunderstanding (I think). The regression included surplus draft value *and* last year's record as input variables. The last season record was highly significant, but the draft value was not.

Assuming you did this and the .07 number is valid, this is no small result. It is very large when compared to other stats conventionally accepted as very predictive of a team's success. Regressing previous year's wins on next year wins yields an r-squared of ... .06. Surprising but true, and using 2002-2006 seasons it's significant at the p=0.05 level. To me, this means that the 7 players picked up in a draft account for as much or more variance in the following year's record as the other 45 players already on the team. Or, more precisely, their surplus value accounts for as much variance as all the other players.

Another example: if you regress a team's wins on its previous year's pass efficiency (CurrentWins vs. LastYrPassEff), you get an r-squared of only .04. Assuming the bulk of most teams' passing offenses remain in place year to year, passing game proficiency only accounts for 4% of the variance in next year's records. Would you rather have the NFL's #1 passing offense or the NFL's best draft class? Your methodology says the draft is more important.

Why is there such a small r-squared for each of these variables? My guess is randomness. The better team doesn't always win in the NFL. Due to the salary cap, there is so much parity in the NFL that it doesn't take much "luck" for an inferior team to win. So there is a large amount of noise/luck/randomness in win-loss records. By my studies, probably about 1/4 of a team's record is random. Moreover, the best statistical win-prediction models rarely achieve better than 70% correct.

]]>The coefficient is less likely to be accurate, for a given limited sample size, if the variation within the sample is large however. (Right? I'm thinking of the jars and this point seems obvious, although I haven't done the exercise of running any regressions for them.)

In any case, my instinct is still that we haven't shown much about the practical significance of M-T value.

We are looking for the significance of *expected* draft value (i.e., M-T value or draft chart value) by measuring its correlation with future wins, knowing that expected draft value doesn't directly affect future wins. Expected draft value affects actual draft value, which -- along with many other factors -- affects future wins.

Some of the problems with doing things this way are that (a) the differences between teams' expected draft values each year are relatively small; (b) the difference between a team's expected draft value in a given year and its actual draft value that year can be huge due to random luck (e.g., Ryan Leaf); (c) even apart from random luck, expected draft value (in the M-T or value chart sense) is only one of several variables that affect actual draft value, and these other variables are hard to control for (e.g., some teams have consistently better scouting departments than others); (d) a team's future wins are subject to random luck (strength of schedule, etc.); and (e) even apart from random luck, a team's future wins are affected by a great many variables other than the team's actual draft value in Year N, and these other variables are hard to control for (e.g., some teams are consistently better than others at signing free agents, some teams are consistently better than others at coaching, and so on).

It occurs to me, however, that I may not be saying anything different from what's in the final (revised) paragraph of your post.

]]>This is probably the first and last time I'll ever get a chance to say this to you, but I think you're wrong.

the small coefficient is the result of statistical insignificance

You've got it backwards. The statistical insignificance is a result of the small coefficient (where "small" of course depends on the sample size and the variation within the sample).

Assuming the usual assumptions are in place, regression coefficients are unbiased, which means that their expected value is the same as the true (unknown) value of the coefficient.

If you ran your 100-sample jar experiment 1000 times with the first set of jars and averaged the regression coefficients, you'd get one or something very close to it.

If you ran your 100-sample jar experiment 1000 times with the second set of jars and averaged the regression coefficients, you'd also get one or something close to it.

[NOTE: I actually wrote a quick program with the R statistical package to verify this.]

That's what unbiased means. The difference between the two situations is that you get a wider spread of guesses in the first case than in the second. But the point is that the variability in the data does not systematically make the coefficient smaller.
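For anyone who wants to try this without R, here is a minimal Python sketch of the same kind of check. It is not the program mentioned above, just an illustration under stated assumptions: each jar's output is uniform, jar k is centered on k - 1 in both sets (half-width 1000 for the first set, 1 for the second), each experiment takes 20 draws from each of the five jars, and `jar_slope` is a made-up helper name.

```python
import random
import statistics

def jar_slope(half_width, rng, draws_per_jar=20):
    """One experiment: regress output on jar number (1..5), return the slope."""
    xs, ys = [], []
    for jar in range(1, 6):
        center = jar - 1  # jar k is centered on k - 1 in both sets
        for _ in range(draws_per_jar):
            xs.append(jar)
            ys.append(rng.uniform(center - half_width, center + half_width))
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(42)
wide = [jar_slope(1000, rng) for _ in range(1000)]   # first set of jars
narrow = [jar_slope(1, rng) for _ in range(1000)]    # second set of jars

print("wide jars:   mean slope", statistics.fmean(wide), "spread", statistics.stdev(wide))
print("narrow jars: mean slope", statistics.fmean(narrow), "spread", statistics.stdev(narrow))
```

One caveat on reading the output: for the wide jars a single experiment's slope has a spread of roughly 40, so even the average of 1000 runs should only be expected to land within a unit or two of one. "Close to one" is relative to that spread, and the average tightens further as the number of runs grows.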

Back to the M-T draft stuff....

The estimate of the coefficient on M-T draft value was .32. That might be too big, and it might be too small, but we have no reason to suspect that it'd be one rather than the other. Further, whether it's too big or too small isn't related to the sample size or the variability in the data.

The standard error on that coefficient was about .44, which means that we shouldn't be terribly surprised to learn that the true coefficient is not .32 but is .8 or even 1.2 (or, of course, -.2 or -.7). We should be very, very, very confident that the true coefficient is less than 2. If it were 2, then a #1 for #32 swap would net about .09 wins per year over the next three. That's not nothing, I guess, but bear in mind that that was a very optimistic estimate and that it's just as likely that the true effect is NEGATIVE .05 wins per year.
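The arithmetic behind that paragraph can be laid out in a few lines. This is only a sketch using the figures quoted above (.32, .44, and .09 wins per year at a coefficient of 2). The one added assumption is that the swap's effect scales linearly with the coefficient, which is how a linear regression term behaves.

```python
coef, std_err = 0.32, 0.44   # estimate and standard error quoted above
wins_per_year_at_2 = 0.09    # quoted #1-for-#32 swap effect if the coefficient were 2

# assumption: the swap's effect scales linearly with the coefficient
wins_per_unit = wins_per_year_at_2 / 2

low, high = coef - 2 * std_err, coef + 2 * std_err  # rough two-standard-error band
z_for_2 = (2 - coef) / std_err   # standard errors between the estimate and 2
mirror = coef - (2 - coef)       # the point equally far below the estimate

print(f"two-standard-error band: {low:.2f} to {high:.2f}")
print(f"a coefficient of 2 sits {z_for_2:.1f} standard errors above the estimate")
print(f"swap effect at the estimate: {coef * wins_per_unit:.3f} wins/yr")
print(f"swap effect at the mirror point ({mirror:.2f}): {mirror * wins_per_unit:.3f} wins/yr")
```

The mirror-point line is where the "just as likely to be negative" comparison comes from: a coefficient as far below .32 as 2 is above it implies a slightly negative swap effect of about the same size.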

That, IMO, justifies the "no practical significance" conclusion.

]]>Even if they were statistically significant, they’re small enough that it’s clear that they have no practical significance, as the following thought experiment shows.

I don't think the regression says very much one way or the other about practical significance. The thought experiment purporting to show the minuscule effects of swapping the #1 and #32 picks is based on the rather small coefficient of the M-T value input, but the small coefficient is the result of statistical insignificance, not (necessarily) practical insignificance.

I'll pause to mention that I don't really know what I'm talking about here. So I welcome corrections wherever my shooting from the hip goes wrong.

Suppose there are five jars (labeled one through five) that each spit out a random number. The first jar spits out a random number between -1000 and +1000 (evenly distributed); the second jar spits out a random number between -999 and +1001, the third between -998 and +1002, and so on.

If you take, say, twenty samples of output from each jar and run a regression analysis trying to predict a jar's output using its jar number as input (finding the correlation between jar number and output), the coefficient for jar number will be pretty small. Because the standard deviation of each jar's output is so huge, it will take a very large sample size before the jar number coefficient becomes statistically significant (such that the regression analysis would "know" that the fifth jar spits out higher numbers than the first jar).

Now consider a second set of five jars, where the first spits out a random number between -1 and +1, the second spits out a random number between 0 and +2, the third between +1 and +3, and so on.

Here, even with just twenty samples of output from each jar, if we did a regression analysis using jar number as input, the jar number coefficient will be quite significant. The regression would "know" very quickly that the fifth jar spits out higher numbers than the first jar (and would even know roughly by how much).

With both sets of jars, jar number has the same practical significance. The fifth jar will produce output that is, on average, four points higher than the first jar. If we are bidding on the right to receive a jar's next numerical output in dollars, in both sets of jars we should be willing to pay $4 more for Jar #5 than for Jar #1.

For a given limited sample of jar outputs, however, a regression analysis will treat the difference between Jar #5 and Jar #1 as being way less significant in the first group of jars than in the second group of jars.
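Here is a small sketch of that contrast: one 100-draw sample from each set of jars, with the fitted slope, its standard error, and the t-statistic. It assumes uniform jar outputs and ordinary least squares with the usual n - 2 residual degrees of freedom, and `jar_fit` is a made-up helper name.

```python
import math
import random
import statistics

def jar_fit(half_width, rng, draws_per_jar=20):
    """Fit output ~ jar number for five jars; return (slope, std_error, t_stat)."""
    xs, ys = [], []
    for jar in range(1, 6):
        for _ in range(draws_per_jar):
            xs.append(jar)
            # jar k is centered on k - 1, as in both sets described above
            ys.append(rng.uniform(jar - 1 - half_width, jar - 1 + half_width))
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    # residual variance with n - 2 degrees of freedom
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(rss / (len(xs) - 2) / sxx)
    return slope, std_error, slope / std_error

rng = random.Random(7)
wide_slope, wide_se, wide_t = jar_fit(1000, rng)        # first group of jars
narrow_slope, narrow_se, narrow_t = jar_fit(1, rng)     # second group of jars

print(f"first group:  slope={wide_slope:.2f}, std. error={wide_se:.2f}, t={wide_t:.2f}")
print(f"second group: slope={narrow_slope:.2f}, std. error={narrow_se:.3f}, t={narrow_t:.1f}")
```

The true per-jar difference is the same in both groups. What differs between the two fits is the standard error, and with it the reported significance.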

It seems to me that rookie draft picks in the NFL are very much like the first group of jars. The difference between Peyton Manning and Ryan Leaf absolutely dwarfs the difference between the [i]average[/i] #1 pick and the [i]average[/i] #2 pick -- which means that a regression analysis isn't going to discover the difference between the average #1 pick and the average #2 pick -- even if there is a very real difference -- without an insanely large sample size.

The fact that the M-T value coefficient is [i]statistically[/i] insignificant in predicting wins based on a regression analysis over a limited sample of data does not, as I understand it, say anything about whether M-T value is insignificant as a practical matter.

]]>Couldn't he trade pick #32 for pick #43?

I agree with JKL in that there are other important findings in the M-T study besides their draft surplus value. The false consensus and overconfidence effects described should encourage teams to think of the available talent in terms of tiers rather than absolute rankings.

BTW, the M-T paper was discussed at the Sabermetric Research blog a few months ago where another flaw was discussed.

]]>On your thought experiment, there is no chance a team would actually make that trade, nor would they have to.

Granted. That's why it's a thought experiment. The point is that M and T think the Raiders should make that trade, even if for some reason they couldn't get a better offer.

a team could employ M-T principles and gain a not insignificant advantage until the market corrected itself and recognized the fallacy of the current charts

The funny thing is that the only teams in position to use this information to their advantage are the crummy teams. If Tony Dungy reads the Massey-Thaler paper tonight and buys into it completely, there is really no way he can take advantage of the market inefficiency he has just discovered.

]]>As an analogy, if I think Team A has 50% chance to win, I do not have to pay even money if the conventional thought and consensus is that they are a 7 point underdog. I take advantage of that by demanding the 7 points before I take Team A.

A team employing M-T principles could make a trade similar to the NY Giants and San Diego swap from a few years ago, and get the 4th overall, and the 2nd and 3rd rounders, plus a future pick. The net effect here is closer to 1 extra win over a 3 year period, which is not insignificant. And this would be a socially acceptable "fair market" trade.

Even though I don't think M-T is exactly correct, because their values are better than the out-of-whack NFL charts, a team could employ M-T principles and gain a not insignificant advantage until the market corrected itself and recognized the fallacy of the current charts.

]]>