What can THOME tell us about 2018?

Just recently, I added a rear-facing element to the THOME Projections website: 2018 "projections."

Back in 2018, leading up to the season, I did a version of these projections that I kept to myself. They weren't quite as robust as the version of the algorithm I have since developed. Basically, they didn't have the fifth step listed on the FAQs page. After determining an Expected Win Percentage, I simply used that as each team's winning percentage for the purposes of the projected standings.

So there was no Monte Carlo simulation of the season, and that meant two things: there was no normal distribution of outcomes from which to derive the Vegas Number Confidences, and the number of wins throughout the league did not come close to matching the number of losses. I think there were something like 2,490 wins and 2,370 losses. There should be 2,430 of each. As of this writing, the current 2019 Projected Standings show 2,436 wins and 2,424 losses, which is much closer to a possible reality.

But more importantly for picking which season-long wagers to place, I did not have confidences for each Over/Under number. I simply had to go with the teams whose projected win totals were furthest from the Vegas numbers. I picked eight teams this way and went 5-3 over the season, for a return on investment of about 21%. Not bad at all for a six-month investment.
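For a rough idea of how a simulation can turn an Expected Win Percentage into an Over/Under confidence, here is a minimal sketch. It is not the actual THOME algorithm. It assumes every game is an independent weighted coin flip, and the function name, the .550 team, and the 81.5 line are all made up for illustration.

```python
import random


def over_confidence(expected_win_pct, vegas_number, games=162, sims=5000, seed=2019):
    """Share of simulated seasons in which the team finishes over the Vegas number."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    overs = 0
    for _ in range(sims):
        # Assumption: each game is an independent coin flip weighted by the
        # team's Expected Win Percentage (no schedule, no opponent strength).
        wins = sum(rng.random() < expected_win_pct for _ in range(games))
        if wins > vegas_number:
            overs += 1
    return overs / sims


# Hypothetical team: .550 Expected Win Percentage against a line of 81.5 wins.
print(f"Over confidence: {over_confidence(0.550, 81.5):.1%}")
```

Simulated win totals under this assumption pile up in a roughly bell-shaped curve around the expected total, which is what makes a confidence number possible in the first place.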

But what if the THOME system that exists today existed 12 months ago? What would that have looked like? I refactored the algorithm over the last couple weeks to allow for this type of simulation. Let's start by looking at the Projected Standings:

A few things jump out at me here. First, the number of teams where THOME's projected record was within five wins of the actual result is 11, including nailing the St. Louis Cardinals on the nose. The number of teams where THOME was off by 10 or more wins is 14, including a delta of more than 20 for Boston (they were very good) and Baltimore (they were very bad).

So almost the same number of very close hits as very wild misses. Not bad. Now looking at the Vegas Number Confidences, and what would have happened if you'd placed these bets:

Overall, THOME recommended the correct play for 20 out of 30 teams. I think it bears repeating that for a full two-thirds of the teams in Major League Baseball, THOME knew which way to play. If you had played all 30 recommendations, assuming average odds of -110 and an even risk across all wagers, you would have seen a return on investment of 27%.
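The ROI arithmetic is easy to check. Here is a small sketch under the same assumptions stated above (every wager at -110, the same amount risked on each); the function name is mine and is only for illustration.

```python
def flat_stake_roi(wins, losses, american_odds=-110):
    """ROI for equal-sized bets all placed at the same negative American odds."""
    if american_odds >= 0:
        raise ValueError("this sketch only handles negative odds such as -110")
    profit_per_win = 100 / abs(american_odds)   # risk 1 unit to win about 0.909 at -110
    net_units = wins * profit_per_win - losses  # each loss costs the full unit risked
    return net_units / (wins + losses)


for record in [(20, 10), (5, 3), (6, 2)]:
    print(record, f"{flat_stake_roi(*record):.1%}")
# (20, 10) 27.3%
# (5, 3) 19.3%
# (6, 2) 43.2%
```

The 5-3 and 6-2 records are the eight-team scenarios discussed further down.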

If I could say that you could hand me any amount of money, and in six months I'll hand it back to you with a 27% bonus on top, you'd be a fool not to do it.

Now, that having been said, most people aren't going to play all 30 of these. And admittedly, the teams where THOME felt most confident did not play out as well as the entire group. If you played just the eight teams with a confidence above 90%, with the same assumptions as before, your ROI drops to 19.3%. That is still very good money at the end of the season.

Also, take a close look at the hits and misses in that group. The misses were small: Cleveland and Arizona by four wins, and Miami by just two, for an average of 3.33. The hits were by generally bigger margins: SF by eight, Toronto by eight, Yankees by five, Angels by four, and Mets by four, for an average of 5.8.

If the Marlins had won two more games, 5-3 becomes 6-2 and a 19.3% ROI becomes 43.2%. That's a huge swing on just two games in a 162-game season.

At any rate, I think these "historical" pages will stand as a bit of proof that the system works. And hopefully you'll join along with me for the 2019 season. Let's get that bread.

Using BaseRuns to handle cluster luck

One of the main tenets of Joe Peta's kickass work Trading Bases (by which THOME is largely inspired) is that you need to deal with what he calls "cluster luck," or the occurrences of a team stringing together hits in such a way as to score more runs than would otherwise be expected.

Hits turn into runs at an average rate somewhere near 2:1. That is, most teams, most of the time, have about twice as many hits as they have runs scored. But this isn't always the case. In a classic example I'm citing from memory but definitely stole from somewhere else, if a team tallies 9 hits in a game, how many runs will they score? 4? 5?

If the 9 hits came one in each inning, they very likely scored 0 (barring walks, homers, etc.). If all 9 hits came in just one inning, then even 9 singles scored 6-8 runs, and the number could be much higher than even 9 if some runners reached base via something other than a hit.
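Here is a toy sketch of that example. It assumes every hit is a single, every single moves each runner up exactly one base, and nobody is thrown out on the bases, so it gives the low end of the range for the bunched case. The function name is hypothetical.

```python
def runs_from_singles(singles_per_inning):
    """Runs scored when every hit is a single and runners advance one base per hit."""
    total = 0
    for singles in singles_per_inning:
        # The first three singles of an inning only load the bases;
        # every single after that pushes exactly one run across.
        total += max(0, singles - 3)
    return total


spread_out = [1] * 9      # one single in each of nine innings
bunched = [9] + [0] * 8   # all nine singles in the same inning

print(runs_from_singles(spread_out))  # 0
print(runs_from_singles(bunched))     # 6
```

Same nine hits, zero runs one way and six the other. Let runners take an extra base here and there and the bunched inning climbs from there.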

The point being, a hit is not a hit is not a hit. And a run is not a run is not a run. A team that scores 800 runs in a season sounds like they killed it. That's an average of nearly 5 runs a game. That should be a playoff team (unless their pitching gave up many more runs than the offense could score). But what if they scored 20 runs in 40 games each, and were shut out the other 122 times? Now we have an historically pathetic record. Yes, I also realize that's not freakin' possible... but you should get the point.

So, cluster luck. Peta's right, you have to take account of it, most especially when evaluating +EV positions on season win total O/U wagers. But he doesn't go into the details of his calculations in Trading Bases. He discusses, at a high level, how you need to deal with it, and how this affected the outlook of some teams' 2011 seasons based on their 2010 run scoring... but I need specifics!

Enter BaseRuns.

Allow me to allow FanGraphs to describe this sucker:

BaseRuns is a formula designed to estimate how many runs a team would be expected to score (or allow) given their underlying offensive (or defensive) performance. In other words, BaseRuns is a context-neutral run estimator used to evaluate teams.

FanGraphs, BaseRuns

The important part here is context-neutral. This essentially does away with the variance of cluster luck and gives us an appropriate Runs Scored and Runs Allowed for a team based on their underlying peripherals.
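For the curious, here is a sketch of one commonly published basic version of the BaseRuns formula (David Smyth's). FanGraphs uses a more detailed variant with extra inputs, and I'm not claiming this exact version is what THOME runs. The team line in the example is invented.

```python
def base_runs(ab, h, tb, bb, hr):
    """Basic BaseRuns estimate: runs = A * B / (B + C) + D."""
    a = h + bb - hr                                        # baserunners, not counting homers
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02    # advancement factor
    c = ab - h                                             # outs
    d = hr                                                 # homers always score
    return a * b / (b + c) + d


# Hypothetical team season line (made-up numbers, not a real club).
estimate = base_runs(ab=5500, h=1400, tb=2300, bb=550, hr=200)
print(f"Estimated runs: {estimate:.1f}")
```

Notice that nothing in the inputs says when the hits happened. That is the whole point: only season totals go in, so sequencing can't leak into the estimate.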

After we figure that for every team, we can discover a true context-neutral (but league- and scoring-environment-sensitive) expected winning percentage by utilizing the Pythagenpat variation of Bill James' original Pythagorean Win-Loss estimation.
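As a sketch of what Pythagenpat does: instead of a fixed exponent, it derives the exponent from the run environment (total runs per game) and then applies the usual Pythagorean form. The 0.287 constant is one commonly cited value and is an assumption here, as are the function name and the sample team; published versions use slightly different constants.

```python
def pythagenpat_win_pct(runs_scored, runs_allowed, games=162, power=0.287):
    """Expected winning percentage with a run-environment-dependent exponent."""
    # Exponent grows with total runs per game; 0.287 is an assumed constant.
    x = ((runs_scored + runs_allowed) / games) ** power
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
pct = pythagenpat_win_pct(800, 700)
print(f"Expected win pct: {pct:.3f} (about {pct * 162:.0f} wins)")
```

Feed it BaseRuns estimates for runs scored and allowed, rather than the actual totals, and the result is the context-neutral expectation described above.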

That is, after dealing with roster changes using projected WAR. But that's a story for another time.