What Best Predicts a First Round MLS Playoff Winner?
Can you name one thing that all four MLS playoff quarterfinal winners had in common this past season? They all played fewer games than the teams that they defeated:
Once again, the curse of the CONCACAF Champions League comes into play. Only one MLS team that has made the group stage has won a playoff series (Houston 2009) in six tries. This year, Columbus, Real Salt Lake, and Seattle are all now included in that 1-5 record. I want to talk a bit more about the best predictors of a team advancing. I looked at all 32 quarterfinal matchups since the current two-leg aggregate format was put in place (2003), and I compared the two teams in each matchup across a variety of statistics. It turns out that the best predictor of the ones I looked at is actually fewer games played, just beating out goal difference.
All records used were regular season only; for teams that didn't have a full 2 or 3 year record, I used the years they had. For results from 1996-9 (such as with the coach records), shootouts were counted as draws. PPG was used for all comparisons. Momentum was a comparison of the final 5 regular season games. For this table, each "even" listing is counted like it's usually done in sports: half a win and half a loss. If you have any other ideas to look at, please comment below.
Comments on "What Best Predicts a First Round MLS Playoff Winner?"
I don't think this could possibly affect the outcome of these games, but sadly for the MLS, the team with the lower season average attendance advanced in all four series as well.
What about number of injuries and red card suspensions?
Also, I know you only have 32 observations, but it would be cool to do this in a regression setting where you could run a true horse race between the different candidates.
How about team playoff experience? Coaching playoff experience?
Sadly, even the "fewer games" category is not statistically significant when compared to the 50/50 chance most teams have of advancing. Using this site (http://www.stat.ubc.ca/~rollin/stats/ssize/b1.html) and entering the following values:
p0 = 0.5
p1 = 0.672
1 sided Test
and default alpha and power levels
Yields a sample size of 51. This means that win percentage would have to be observed over 51 series before you could conclude that fewer games was a better predictor than a coin flip.
What may be more interesting is plotting win percentage vs. the actual number of fewer games. One could then run a Pearson correlation test to determine if correlation did exist, and if it did then run a linear regression to determine the strength. This could even be run for various metrics (win percentage, goal differential, etc.).
Sorry to be such a stickler, but I come from a viewpoint that most sports statistics actually aren't statistically significant. And it's a good thing they aren't - it's the random, 50/50 nature of any one sporting event that makes them interesting.
Zach,
Very good points.
If you only count the scenarios where one team has actually played fewer games, then the percentage is 18/25 or 72%.
At that percentage, to get 95% statistical confidence we'd need 30 series, which is pretty close to where we are. However, to get 90% confidence, we'd only need 22 series.
It's obviously not as significant as you'd hope (yet), but I think it's a solid trend.
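For anyone who wants to check the sample-size figures quoted in the last two comments without the web calculator, here is a minimal Python sketch. It uses the standard normal-approximation formula for a one-sided, one-sample test of a proportion. The function name is made up for illustration, and the "default" settings are assumed to be alpha = 0.05 and power = 0.80, which is a guess that happens to reproduce the numbers above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_one_proportion(p0, p1, alpha=0.05, power=0.80):
    """Approximate sample size for a one-sided test of p1 against p0.

    Assumed defaults: alpha = 0.05 and power = 0.80 (not stated in the thread).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


# 67.2% (evens counted as half a win) against a coin flip
n_all = sample_size_one_proportion(0.5, 0.672)
# 72% (18 of 25 decided cases) at 95% and 90% confidence
n_95 = sample_size_one_proportion(0.5, 0.72, alpha=0.05)
n_90 = sample_size_one_proportion(0.5, 0.72, alpha=0.10)

print(n_all, n_95, n_90)  # 51 30 22
```

Under those assumptions the formula gives 51, 30, and 22, matching the figures in the comments. A different power setting would change all three.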
Also, scaryice, I'd love to look at how good the Elo ratings I developed (http://mlselo.f2f2s.com) are at predicting playoff performance.
If you can share the matchups, winners, and who had fewer games played, I can match up their values at the end of the regular seasons.
MLS is experiencing growing pains. San Jose and Colorado play for the Eastern Conference final? How absurd. They should have simply changed the name to MLS Cup semifinals. New York won the East, the Galaxy the West. The playoffs should eliminate all the conference names, unless they plan to have four from each conference qualify.
MLS needs to help San Jose acquire a real stadium, and not one with an open end as is currently planned (please!). The Quakes have a rich, successful history in MLS. Build it and they will come. The Quakes only draw about 9000 per game, but that's because Buck Shaw only holds 10000. That's 90% filled. More people would attend a professional soccer game if it weren't played at an amateur stadium.
Phillip -
Looking at team playoff experience, the team with more previous playoff games is 17-14-1.
That's kind of unfair since the early years had more playoff games, so I also looked at just the total number of years a team made the playoffs.
The teams with more previous years in the playoffs were 12-11-9.
I forgot when writing the post that Houston did win last year, so I edited the CCL record to 1-5 rather than 0-6. It's still a curse, just less of one. :)
Zach, I don't mind you being a stickler. I really appreciate comments like yours, since I'm not so knowledgeable when it comes to actual statistical analysis (go figure).
Ryan, here's the list of matchups along with the difference in games played (winners on the left):
2008 NY vs HOU -14
2007 CHI vs DC -10
2010 COL vs CLB -10
2010 LA vs SEA -8
2008 CHI vs NE -7
2010 DAL vs RSL -7
2009 RSL vs CLB -6
2003 SJ vs LA -4
2005 CHI vs DC -4
2005 COL vs DAL -4
2008 RSL vs CHV -4
2009 LA vs CHV -3
2010 SJ vs NY -3
2003 NE vs NY -1
2003 KC vs COL -1
2004 NE vs CLB -1
2004 KC vs SJ -1
2008 CLB vs KC -1
2003 CHI vs DC 0
2004 DC vs NY 0
2004 LA vs COL 0
2005 NE vs NY 0
2006 NE vs CHI 0
2006 COL vs DAL 0
2007 KC vs CHV 0
2006 DC vs NY 1
2009 CHI vs NE 1
2005 LA vs SJ 2
2006 HOU vs CHV 2
2007 HOU vs DAL 2
2007 NE vs NY 3
2009 HOU vs SEA 5
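The record for "fewer games played" can be re-derived from this list in a few lines of Python. This is just a sketch: the numbers are copied by hand from the list above (winner's games played minus loser's), and the variable names are illustrative.

```python
# Games-played difference (winner minus loser) for the 32 series listed above.
diffs = [-14, -10, -10, -8, -7, -7, -6, -4, -4, -4, -4, -3, -3,
         -1, -1, -1, -1, -1,
         0, 0, 0, 0, 0, 0, 0,
         1, 1, 2, 2, 2, 3, 5]

fewer = sum(d < 0 for d in diffs)   # winner had played fewer games
even = sum(d == 0 for d in diffs)   # both teams had played the same number
more = sum(d > 0 for d in diffs)    # winner had played more games

# Win rate when "even" series are ignored entirely.
decided_pct = fewer / (fewer + more)
# Win rate when each "even" series counts as half a win and half a loss.
half_credit_pct = (fewer + 0.5 * even) / len(diffs)

print(f"{fewer}-{more}-{even}")      # 18-7-7
print(round(decided_pct, 3))         # 0.72
print(round(half_credit_pct, 3))     # 0.672
```

The 0.72 figure is the 18/25 quoted earlier in the comments, and 0.672 is the value used in the sample-size calculation.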
Ryan -
I agree. The trend makes sense, and I'd love to be able to point to it being statistically significant after next year's season.
I think this points to the challenges of MLS being a salary capped league vs. Europe not being one. The elite European teams who regularly qualify for UCL have two things going for them:
1) Qualifying for Champions League - finishing top of the table - is also the thing that determines who's the champion in their domestic league.
2) The fact that they can spend as much money as they like (ignoring the soon-to-be-phased-in Fair Play rules), which means their Starting XI is better than everyone else's and typically their talent on the bench is too.
Combine these two, and you see why this is likely less of an issue in UEFA leagues than it is in MLS. MLS is attempting a delicate balancing act here: trying to keep costs from running away, trying to continuously improve the league internally, and trying to improve its stature within CONCACAF.
Let's see how much of this LA team returns next year, and see how they do if they get serious about US Open Cup and do well in CCL.
Interesting conundrum for my Sounders though - do they ditch the effort for the US Open Cup three-peat, play scrubs in CCL, and focus on getting their first MLS playoff win and (hopefully) an MLS Cup?
Scaryice -
Thanks for the encouragement. My stats nerdiness sometimes comes off as abrasive. Nonetheless, what you've highlighted is very cool data. Do you have the goal differential data for each of the series/matches? I could combine that with the match differential data you already provided to make a nice correlation/regression analysis.
scary,
Thanks for the stats. The team with the higher Elo rating won 21 of the 32 series:
2007 HOU vs DAL 2 ELO-diff: 92.41
2006 HOU vs CHV 2 ELO-diff: 56.99
2008 CLB vs KC -1 ELO-diff: 56.84
2006 DC vs NY 1 ELO-diff: 53.85
2003 CHI vs DC 0 ELO-diff: 53.84
2005 NE vs NY 0 ELO-diff: 43.24
2009 CHI vs NE 1 ELO-diff: 42.04
2007 NE vs NY 3 ELO-diff: 40.12
2008 CHI vs NE -7 ELO-diff: 34.88
2003 NE vs NY -1 ELO-diff: 34.07
2010 LA vs SEA -8 ELO-diff: 27.33
2006 NE vs CHI 0 ELO-diff: 23.31
2003 SJ vs LA -4 ELO-diff: 20.61
2003 KC vs COL -1 ELO-diff: 16.81
2004 LA vs COL 0 ELO-diff: 12.96
2005 COL vs DAL -4 ELO-diff: 12.82
2009 LA vs CHV -3 ELO-diff: 12.48
2004 DC vs NY 0 ELO-diff: 12.3
2010 COL vs CLB -10 ELO-diff: 8.94
2009 HOU vs SEA 5 ELO-diff: 8.71
2004 KC vs SJ -1 ELO-diff: 4.91
2008 RSL vs CHV -4 ELO-diff: -30.67
2010 DAL vs RSL -7 ELO-diff: -39.07
2010 SJ vs NY -3 ELO-diff: -41.22
2006 COL vs DAL 0 ELO-diff: -51.11
2004 NE vs CLB -1 ELO-diff: -61.74
2009 RSL vs CLB -6 ELO-diff: -68.02
2007 KC vs CHV 0 ELO-diff: -70.75
2007 CHI vs DC -10 ELO-diff: -77.37
2005 CHI vs DC -4 ELO-diff: -81.97
2005 LA vs SJ 2 ELO-diff: -99.84
2008 NY vs HOU -14 ELO-diff: -105.34
What's interesting is that there is only one case where a team played more games, had a lower Elo rating, and still managed to win the series (LA in 2005).
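Both counts in this comment can be checked against the list with a short Python sketch. The tuple layout and variable names are only illustrative, and the values are copied by hand from the list above (series winner first, then games-played difference, then Elo difference).

```python
# (series, games-played difference, Elo difference), winner listed first.
series = [
    ("2007 HOU vs DAL", 2, 92.41), ("2006 HOU vs CHV", 2, 56.99),
    ("2008 CLB vs KC", -1, 56.84), ("2006 DC vs NY", 1, 53.85),
    ("2003 CHI vs DC", 0, 53.84), ("2005 NE vs NY", 0, 43.24),
    ("2009 CHI vs NE", 1, 42.04), ("2007 NE vs NY", 3, 40.12),
    ("2008 CHI vs NE", -7, 34.88), ("2003 NE vs NY", -1, 34.07),
    ("2010 LA vs SEA", -8, 27.33), ("2006 NE vs CHI", 0, 23.31),
    ("2003 SJ vs LA", -4, 20.61), ("2003 KC vs COL", -1, 16.81),
    ("2004 LA vs COL", 0, 12.96), ("2005 COL vs DAL", -4, 12.82),
    ("2009 LA vs CHV", -3, 12.48), ("2004 DC vs NY", 0, 12.3),
    ("2010 COL vs CLB", -10, 8.94), ("2009 HOU vs SEA", 5, 8.71),
    ("2004 KC vs SJ", -1, 4.91), ("2008 RSL vs CHV", -4, -30.67),
    ("2010 DAL vs RSL", -7, -39.07), ("2010 SJ vs NY", -3, -41.22),
    ("2006 COL vs DAL", 0, -51.11), ("2004 NE vs CLB", -1, -61.74),
    ("2009 RSL vs CLB", -6, -68.02), ("2007 KC vs CHV", 0, -70.75),
    ("2007 CHI vs DC", -10, -77.37), ("2005 CHI vs DC", -4, -81.97),
    ("2005 LA vs SJ", 2, -99.84), ("2008 NY vs HOU", -14, -105.34),
]

# Series where the winner had the higher Elo rating.
higher_elo_wins = sum(1 for _, _, elo in series if elo > 0)

# Winners that had played more games AND had the lower Elo rating.
double_underdogs = [name for name, games, elo in series
                    if games > 0 and elo < 0]

print(higher_elo_wins, len(series))  # 21 32
print(double_underdogs)              # ['2005 LA vs SJ']
```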
Alright, here's the same list with the first number being the difference in games played, and the second the goal differential of the series:
2008 NY vs HOU -14 3
2007 CHI vs DC -10 1
2010 COL vs CLB -10 0
2010 LA vs SEA -8 2
2008 CHI vs NE -7 3
2010 DAL vs RSL -7 1
2009 RSL vs CLB -6 2
2003 SJ vs LA -4 1
2005 CHI vs DC -4 4
2005 COL vs DAL -4 0
2008 RSL vs CHV -4 1
2009 LA vs CHV -3 1
2010 SJ vs NY -3 1
2003 NE vs NY -1 2
2003 KC vs COL -1 2
2004 NE vs CLB -1 1
2004 KC vs SJ -1 1
2008 CLB vs KC -1 2
2003 CHI vs DC 0 4
2004 DC vs NY 0 4
2004 LA vs COL 0 1
2005 NE vs NY 0 1
2006 NE vs CHI 0 0
2006 COL vs DAL 0 0
2007 KC vs CHV 0 1
2006 DC vs NY 1 1
2009 CHI vs NE 1 1
2005 LA vs SJ 2 2
2006 HOU vs CHV 2 1
2007 HOU vs DAL 2 2
2007 NE vs NY 3 1
2009 HOU vs SEA 5 1
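As a first pass at the correlation idea raised earlier in the comments, here is a minimal Python sketch that computes a Pearson correlation between the two columns above. The helper `pearson_r` is written out by hand for illustration, and the data is copied from the list. Keep in mind that winners are listed first, so the goal differential is never negative; this is a rough look, not a proper regression.

```python
from math import sqrt

# (games-played difference, series goal differential), winner listed first.
rows = [
    (-14, 3), (-10, 1), (-10, 0), (-8, 2), (-7, 3), (-7, 1), (-6, 2),
    (-4, 1), (-4, 4), (-4, 0), (-4, 1), (-3, 1), (-3, 1),
    (-1, 2), (-1, 2), (-1, 1), (-1, 1), (-1, 2),
    (0, 4), (0, 4), (0, 1), (0, 1), (0, 0), (0, 0), (0, 1),
    (1, 1), (1, 1), (2, 2), (2, 1), (2, 2), (3, 1), (5, 1),
]


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


r = pearson_r(rows)
print(round(r, 3))
```

On these 32 series the coefficient comes out weakly negative (roughly -0.11), meaning winners who had played fewer games tended to win by very slightly more, but the relationship is faint.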
Thanks for posting those. I am on my honeymoon right now, but I will do some analysis of the data when I get back and will send a link to the results.
Any way you can post the goal differential and coach experience differential data as well?