If Dimitris Bertsimas is right, look for the Boston Red Sox to capture the American League East title in September, one game ahead of the Tampa Bay Rays, whose predicted wins would make them the team to beat for the AL Wild Card spot.
In a burst of post-preseason prognostication, the MIT economist has his favorite team, the Red Sox, finishing the regular season with 101 wins, followed by Tampa Bay with 100 wins, the Yankees at 93, Baltimore at 83 wins, and Toronto with 80.
The foundation for this stick-your-neck-out exercise is a modeling approach Bertsimas and colleague Allison O'Hair developed while pulling together a "you, too, can advise Theo Epstein" exercise for one of Bertsimas's classes.
"I'm a big believer that quantitative analytics can have a major impact on businesses, including sports teams," Bertsimas notes.
Where British-born mathematician Keith Devlin has described baseball as looking "like rounders played by men in pajamas who seemed to wear very scratchy underpants that required constant adjustment and who had an unusual propensity for spitting," Bertsimas sees in each player "a vector of numbers" from which "we can make accurate predictions of how many runs they will score."
Those predictions can then be translated into team statistics that feed into his Opening Day predictions of the teams most likely to play into October.
Bertsimas and O'Hair, one of his PhD students, put their own spin on techniques Oakland Athletics general manager Billy Beane used to guide the team with the third-lowest payroll in major-league baseball in 2002 to that year's playoffs. In addition, the duo drew on approaches developed by baseball-stats guru Bill James.
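For readers curious how run totals can turn into win totals, one widely cited Bill James formula, the "Pythagorean expectation," estimates a team's winning percentage from runs scored and runs allowed. The sketch below is only an illustration of that formula; the article does not say which of James's approaches Bertsimas and O'Hair used, and the function name and the run totals are hypothetical.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Estimate season wins with Bill James's Pythagorean expectation.

    Assumption: this is NOT confirmed to be the Bertsimas/O'Hair model;
    it is a well-known formula shown only for illustration.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    win_pct = rs / (rs + ra)  # expected share of games won
    return games * win_pct


# Hypothetical team: 800 runs scored, 700 runs allowed over 162 games.
example_wins = pythagorean_wins(800, 700)
print(round(example_wins, 1))
```

A team that scores exactly as many runs as it allows comes out at 81 wins, an even .500 season, which is a quick sanity check on the formula.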
While much of the credit for pioneering the extensive use of player stats has gone to Mr. Beane and the Oakland A's, the approach has a deeper historical pedigree, according to Dr. Devlin, who teaches at Stanford University.
The Brooklyn Dodgers used some number-crunching as far back as the 1940s and '50s, he explained in a column for the Mathematical Association of America penned in 2004, just after he attended his first major-league ball game. Managers used the data to inform player trades, set batting orders, and swap players in and out of rosters based on their performance against opposing teams.
Earl Weaver's stack of index cards
During Earl Weaver's 14-year run as manager of the Baltimore Orioles beginning in the late '60s, the Hall of Fame manager frequently consulted player stats he kept in a ubiquitous stack of index cards, according to Devlin.
As for the Athletics, the statistical approach the team raised to an art form was introduced by an engineer-turned-National Public Radio reporter whose lingering mathematical antennae started to twitch as he began noticing, at ballgames he attended, how much a team benefited from players' high on-base percentages.
After devising his model, Bertsimas tested it against the Red Sox's 2010 season. Based on player stats from the beginning of the 2009 season through 2010's spring training, the model had the BoSox winning 90 games; they ended the 2010 season at 89 wins. Close enough for modeling work.
So far, the American League East is the only division Bertsimas and O'Hair have subjected to their digital crystal ball. And they stop short of predicting who will head a World Series victory parade in the fall, although, ahem, some other analysts have picked the BoSox.
With 162 games in a regular season, the model has plenty of data points to chew on as it produces its forecast, he notes.
However, "In a five-game series, the worst team in baseball will still beat the best team in baseball 15 percent of the time," he says. "Any general manager worth his salt sees his job as getting the team to the playoffs, but once they get there, luck plays a much larger role."
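The arithmetic behind that 15 percent figure can be sketched with a simple binomial calculation. The snippet below assumes, purely for illustration, that every game is independent and that the underdog has the same chance of winning each one; the function name and the 29 percent per-game figure are assumptions, not numbers from Bertsimas.

```python
from math import comb


def series_win_probability(p, games=5):
    """Chance of winning a best-of-`games` series.

    Assumption: games are independent and the team wins each one
    with the same probability p. Counting every possible outcome of
    all `games` games gives the same answer as stopping the series
    once one side has clinched.
    """
    needed = games // 2 + 1  # wins required to take the series
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(needed, games + 1)
    )


# Hypothetical underdog that wins 29 percent of its games against the favorite.
underdog_chance = series_win_probability(0.29)
print(f"{underdog_chance:.3f}")  # about 0.150
```

Under those assumptions, a team that wins only about 29 percent of individual games against a stronger opponent still takes a five-game series roughly 15 percent of the time. Lengthening the series shrinks that chance, which is why a 162-game season rewards quality more reliably than a short playoff round does.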