Monday, August 20, 2007

Can You Feel It? - How does one pitcher affect the other?

Baseball games frequently develop a “feel.” A slugfest feels much different to watch than does a pitching duel. If we’ve watched enough baseball games, we feel like we can sense the type of game we’re watching while it’s in progress and get a general sense of where it’s going to go.

Is this a very subtle sense that a baseball fan can develop, or is it just an illusion? Do we just think a given game has a given familiar feel, when in reality it’s just a randomly ordered set of random events?

Statheads will typically call this an illusion. The events of a baseball game are overwhelmingly unaffected by one another. This is a necessary and fairly accurate assumption to make. If we say each event is independent, then each event is suitable for analysis, as it is not all tied up in the context of its game and season.

If each event is independent, games can have no feel. Or at least, the feel is only experienced by the viewer and is not actually part of the game.

To begin to investigate this, I looked at a type of game that most certainly has a feel: a no-hitter. In many of the no-hitters I know about, the winning team (the team not being no-hit) also had a poor offensive showing. That is to say, while its pitcher was dominating the opposition and not allowing hits, the team’s offense was also sputtering. This would be evidence of a feel.

Since 1957 (an arbitrary year, dictated by available data), 128 games have completed 9 innings with one team not accumulating any hits (some got hits in extra innings). In those games, the other team (the one getting hits) averaged a paltry 3.9688 runs per game. Over that span of time, Major League teams averaged 4.3849 r/g. So, in those no-hit games, the other team’s offense was suppressed by 9.4912%.

Though that certainly suggests that no-hitters have a feel that spreads to both offenses, the 128-game sample is too small to be certain. Let’s take a step back from the rare no-hitter and look at the much more common shutout, a game of any length in which one team does not score.

To do this, I looked at data from the 1996 season through the 2006 season. According to my research, there were 26,389 regular season games played in that span, 2,573 of which were shutouts. The following table shows the shutout data separated by season. ShoR is r/g for the winning team in the shutouts, ShoG is the number of shutouts, NormR is r/g/team in all games, and NormG is the total number of games played.


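| Season | 96 | 97 | 98 | 99 | 00 | 01 | 02 | 03 | 04 | 05 | 06 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ShoR | 4.9077 | 4.3886 | 4.5861 | 4.6963 | 4.5862 | 4.5286 | 4.6145 | 4.3774 | 4.8964 | 4.5423 | 4.6834 |
| ShoG | 195 | 211 | 244 | 191 | 203 | 227 | 275 | 257 | 251 | 260 | 259 |
| NormR | 5.0377 | 4.767 | 4.7936 | 5.0846 | 5.1423 | 4.7756 | 4.6183 | 4.7299 | 4.8138 | 4.5936 | 4.8578 |
| NormG | 2266 | 2266 | 2430 | 2428 | 2428 | 2429 | 2426 | 2429 | 2428 | 2430 | 2429 |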

In each season except 2004, the winning team in a shutout averaged fewer runs than an average team that season. In 2000 in particular, the difference is pronounced.

From 1996 through 2006, the winning team in a shutout averaged 4.6152 r/g, while teams averaged 4.8368 r/g overall. That means that in this rather large sample, the pitching staff shutting the opponent out got 4.5821% less support from its team’s offense than it would have expected.
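The pooled figures can be rechecked from the season table. This is only a sketch: it assumes the pooled averages are game-weighted means of the per-season rates (ShoR weighted by ShoG, NormR weighted by NormG), which is my reading rather than something the post states. The variable names are made up for the illustration.

```python
# Per-season values copied from the table above (1996 through 2006).
sho_r = [4.9077, 4.3886, 4.5861, 4.6963, 4.5862, 4.5286,
         4.6145, 4.3774, 4.8964, 4.5423, 4.6834]
sho_g = [195, 211, 244, 191, 203, 227, 275, 257, 251, 260, 259]
norm_r = [5.0377, 4.767, 4.7936, 5.0846, 5.1423, 4.7756,
          4.6183, 4.7299, 4.8138, 4.5936, 4.8578]
norm_g = [2266, 2266, 2430, 2428, 2428, 2429, 2426, 2429, 2428, 2430, 2429]


def weighted_mean(values, weights):
    """Average of `values`, each weighted by the matching entry in `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Assumption: pooled r/g is the game-weighted mean of the season rates.
sho_avg = weighted_mean(sho_r, sho_g)
norm_avg = weighted_mean(norm_r, norm_g)

# Percent by which the shutout winner's scoring trails the overall rate.
suppression_pct = (norm_avg - sho_avg) / norm_avg * 100

print(f"shutout winners: {sho_avg:.4f} r/g over {sum(sho_g)} games")
print(f"all teams:       {norm_avg:.4f} r/g over {sum(norm_g)} games")
print(f"suppression:     {suppression_pct:.2f}%")
```

Weighting by games matters because the seasons are not the same size: 1996 and 1997 had fewer games, and shutout counts range from 191 to 275.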

The 2,573 shutouts provide a little over ½ a season’s worth of data (a current MLB season has 2430 games, but the shutouts were only providing data for one of the two teams). The 4.8368 r/g overall is extremely offensive, evidence of the era of offense that sandwiched the millennium, due to steroids, smaller parks, smaller strike zones or whatever. The 4.6152 r/g in the shutouts is much more moderate, though it is still offensive in a larger historical context.

Is this evidence of a game’s feel?

It is clear that suppressed offense on one side indicates a likelihood of suppressed offense on the other side. “Why,” however, is still unclear. Undeniably, some of it is due to a shared environment. A shutout is more likely to occur in Networks Associates Coliseum in Oakland with the wind blowing in than in an average environment. The difficult hitting environment that is helping one pitcher toss a shutout is also holding back his team’s offense. Or perhaps, the umpire has a large strike zone, causing the same effect as the difficult ballpark.

On the other hand, there is reason to think these games should snowball into blowouts. If a team is carrying a lead into the later innings, as a team throwing a shutout almost always will be, its last few at-bats will usually come against its opponent’s middle relief, the weak spot of a team’s pitching staff. If events in the game are truly unrelated, we would then be more likely to see a 2-0 lead turn into a 4-0 lead, a 5-0 lead turn into an 8-0 lead, and a 12-0 lead balloon to a 16-0 lead. In a typical game, on the other hand, a team has less than a 50% chance of being ahead late, and thus less than a 50% chance of getting to bat against the weak pitching. From this point of view, we would expect runs to be held down more effectively in the non-shutouts, but this is not the case.

I think it is safe to say that games undoubtedly develop a feel. I did the same type of analysis for instances when a team scores 10 or more runs in a game (from 1996 to 2006). The opponents of a team that scored 10+ runs averaged 5.2618 r/g, an 8.7865% increase from the expected 4.8368 r/g overall.

What we cannot know (or at least, what I can’t tell you) is how much of that feel is created by the shared suppressive environment and how much is created by the intangible factors we might suspect. I do think, though, that there is some extent to which an opposing pitching staff will raise its game to try to meet the challenge set by a dominating opposing pitcher. Also, it is very possible that a team’s offense relaxes a little once it gets a lead and sees that its pitcher is dominating. On the other hand, when a pitcher is given a big lead by his team, he may try to throw lots of strikes, surrendering runs in the hopes of avoiding walks and big innings. Or perhaps, he just relaxes too much when spotted a lead.

Let’s look at one last thing. We know that a team in a nondescript game (from 1996 through 2006) averaged 4.8368 r/g. Over that span, a team on the right end of a shutout averaged 4.6152 r/g. Since 1957, a team whose pitchers gave up no hits through 9 innings averaged 3.9688 r/g (against a league average of 4.3849 r/g over that span). The more extreme the lack of offense on one side becomes, the less the other side scores.

So what about perfect games? Nineteen times in history, a pitcher has completed 9 innings without allowing a baserunner. In that set, the team not being completely shut down has scored just 2.5789 r/g. The sample size is so small that this number is basically meaningless, but it still makes you think…

Sunday, August 19, 2007

Excitement Factor takes on the Playoffs

I thought it would be interesting to apply the principles of the Excitement Factor to playoff series to see how exciting a series was as a whole. The way to do this is not to see how exciting each individual game was and sum up the totals, but to track the probability of a given team winning the whole series throughout all the action.

The first set of things to be determined was the probabilities of winning the series after each game. So, if you’re up 2-1 in games, what are your chances of winning the whole thing?

Assuming that each team has a 50% chance of winning each game (a fair simplification to make), this part is fairly simple with some statistical tact. For an example, let’s determine the chances for a team up 2-0 in the series. (For this next part, “W” means “win” and “L” means “loss” and order matters.) Up 2-0, here’s how you can win the series: WW; WLW, LWW; LLWW, WLLW, LWLW; LLLWW, LLWLW, LWLLW, WLLLW. There’s one way to win in 4 games and you have a (1/2)^2 chance of doing it, so 1*1/4 = .25; there are 2 ways to win in 5 games and you have a (1/2)^3 chance of doing either one of them, so 2*1/8 = .25. If you complete the rest of them in this fashion (3*1/16 = .1875 for 6 games and 4*1/32 = .125 for 7), you’ll find that, with a 2-0 series lead, your chances of prevailing are .8125. The following table shows the probabilities for every single situation in a 7-game series.
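The counting argument above is easy to mechanize. Here is a minimal sketch that recomputes the series win probability for every record in a best-of-7, assuming a fixed per-game win chance. The function names are made up for the illustration and the loop mirrors the "count the orderings, multiply by the chance of each" reasoning.

```python
from math import comb

WINS_NEEDED = 4  # best-of-7 series


def series_win_prob(wins, losses, p_game=0.5):
    """Chance of winning the series from a (wins, losses) record.

    Counts the orderings in which the team gets its remaining wins
    before the opponent gets theirs, as in the W/L listing above.
    """
    need = WINS_NEEDED - wins          # wins still required
    opp_need = WINS_NEEDED - losses    # losses that would end the series
    if need == 0:
        return 1.0
    if opp_need == 0:
        return 0.0
    total = 0.0
    # k = number of losses absorbed before the clinching win
    for k in range(opp_need):
        # The last game must be a win; the k losses can fall anywhere
        # among the need + k - 1 games before it.
        orderings = comb(need + k - 1, k)
        total += orderings * p_game ** need * (1 - p_game) ** k
    return total


print("W-L  P(win series)")
for w in range(WINS_NEEDED):
    for l in range(WINS_NEEDED):
        print(f"{w}-{l}  {series_win_prob(w, l):.4f}")
```

For the 2-0 case the loop adds up exactly the four terms worked out by hand: 1 ordering in 4 games, 2 in 5, 3 in 6, and 4 in 7.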

This is a good start, but integrating these series win expectancies into the games’ play-by-play account can be difficult. Again, I think this is most easily explained by example. Say, once again, you’re up 2-0. At the time of the first pitch we know your chances of winning the series are .8125. Game 3 is going to lead you to one of two situations: leading 3-0 or leading 2-1. If you lead 3-0, your chances jump to .9375; if you lead 2-1, they fall to .6875. So, to find your probability of winning the series at a moment in Game 3, you multiply your chances of winning Game 3 by .9375 and add to that your chances of losing Game 3 multiplied by .6875. This covers all the possible ways you can win the series: the set that includes winning Game 3 and the set in which you lose it. For the opening pitch of Game 3 (when the team’s chance of winning is .5), you could just trust me that your chances of winning are .8125 or you can check for yourself: .5*(.9375)+(1-.5)*(.6875) = .46875+.34375 = .8125. This can of course be done as the game situation changes and the probabilities of winning the given game and the series as a whole move up and down. Then, as with the original excitement factor, it’s just a matter of summing up the absolute value of every change that the series factor (excitement factor deluxe, from now on) undergoes.
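As a rough sketch of that blending step, the snippet below converts in-game win expectancies into series win probabilities and sums the absolute changes. The .9375 and .6875 branch values come from the example above; the in-game win expectancies are invented purely for illustration, and the function names are hypothetical.

```python
def in_game_series_prob(game_wp, p_if_win, p_if_lose):
    """Series win probability at a moment inside a game.

    game_wp   -- current chance of winning this game
    p_if_win  -- series win probability if this game is won
    p_if_lose -- series win probability if this game is lost
    """
    return game_wp * p_if_win + (1 - game_wp) * p_if_lose


def summed_absolute_change(probs):
    """Sum of the absolute value of every step in a probability track."""
    return sum(abs(after - before) for before, after in zip(probs, probs[1:]))


# Up 2-0 entering Game 3: a win leads to 3-0 (.9375), a loss to 2-1 (.6875).
P_IF_WIN, P_IF_LOSE = 0.9375, 0.6875

# Made-up Game 3 win expectancies, first pitch through a final-out win.
game_wps = [0.50, 0.62, 0.41, 0.73, 1.00]

series_track = [in_game_series_prob(wp, P_IF_WIN, P_IF_LOSE) for wp in game_wps]
print([round(p, 4) for p in series_track])
print(round(summed_absolute_change(series_track), 4))
```

A full series rating would chain every game's track together, swapping in the two branch values that apply to the record at the start of each game.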

Keep in mind, this deluxe version follows the same principles as the original excitement factor. Just as scoring lots of runs was important in having a high excitement factor, playing lots of games is important in the excitement factor deluxe because it allows for more mobility. The nature of a playoff series is also nice in that the more games the teams play the closer the series is, since blowouts end quickly. In the original excitement factor, action in the late innings was important; in the excitement factor deluxe, the later games are more important. This has a very easy mathematical explanation that agrees with the viewing experience. Let’s consider a leadoff single in the bottom of the 6th of a tie game. This raises the home team’s win expectancy from .577 to .627. In the original excitement factor, that’s worth .05. Moving to the deluxe, that same event in Game 1 would shift the probability of winning the series from .577(.6563)+(1-.577)(1-.6563) = .5241 to .627(.6563)+(1-.627)(1-.6563) = .5397, a shift of .0156. In Game 7, you don’t have to multiply the probability by all that mumbo jumbo to account for the rest of the games in the series; if you win the game you win, if not, you lose. So, in Game 7, that single would shift the probability from .577 to .627, a .05 boost. The possibility of future games dilutes the importance of the current one, so as fewer games remain, the importance grows.
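The dilution effect can be put in one number per game: the gap between the series win probability after a win and after a loss. Any in-game swing gets multiplied by that gap. The sketch below assumes 50/50 games, and `series_leverage` is a name invented here, not a term from the original excitement factor.

```python
WINS_NEEDED = 4  # best-of-7 series


def series_win_prob(wins, losses, p_game=0.5):
    """Series win probability from a record, by stepping one game at a time."""
    if wins == WINS_NEEDED:
        return 1.0
    if losses == WINS_NEEDED:
        return 0.0
    return (p_game * series_win_prob(wins + 1, losses, p_game)
            + (1 - p_game) * series_win_prob(wins, losses + 1, p_game))


def series_leverage(wins, losses):
    """How much of an in-game swing carries through to the series odds."""
    return series_win_prob(wins + 1, losses) - series_win_prob(wins, losses + 1)


swing = 0.627 - 0.577  # the leadoff single from the example above

for label, record in [("Game 1", (0, 0)), ("Game 7", (3, 3))]:
    lev = series_leverage(*record)
    print(f"{label}: leverage {lev:.4f}, series shift {swing * lev:.4f}")
```

With the series tied 0-0 the gap is .6563 - .3438, so only about 31% of any swing survives; at 3-3 the gap is 1 and the whole swing counts.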

The following table shows the excitement factors deluxe for the World Series from 2002 through 2006. To remind you, 2002 went the maximum 7 games, 2003 went 6, 04 and 05 were 4-game sweeps, and 06 took 5 games. As you can see, the only instance where the EFD isn’t in line with the number of games is 2005, but the White Sox and Astros played some unbelievable games in that series, though the former wound up winning all of them.

This of course isn’t a large enough sample size to draw conclusions from, but there’s something evident that we would expect to be true and I think it’s worth pointing out. The relationship between games and EFD is not linear. The 4-game 04 Series rated 2.639, the 6-game 03 Series rated 6.448, and the 7-game 02 Series rated 9.368. As we said, the games are not of equal importance. Playing deeper into the Series not only adds more games, but it adds games that are more important/exciting.

So how about those classic ALCSs between New York and Boston, semifinal matchups that certainly resound more clearly than their ensuing finals? Let’s take a look at how exciting those really were. The EFD for 2003 (Aaron Boone's series) was 10.597. For 2004 (the Red Sox's historic comeback) it was 6.859.

The 2003 ALCS pretty much dwarfs the one from 2004. Of course, 2004 was the one where the Red Sox made the greatest playoff series comeback in baseball history, but that wasn’t enough. One truth we found in the original excitement factor is that, within a game, one huge comeback doesn’t compare very well to multiple smaller comebacks. Evidently, this holds true for series. Amazingly, in 2004, the Red Sox won a series in which they had, at one point, a 1.9% chance of prevailing. However, that series only saw the Red Sox fall really far, then rise really high really quickly. That is not as exciting as things can get. In 2003, the ALCS went 1-0 Red Sox, 1-1, 2-1 Yankees, 2-2, 3-2 Yankees, 3-3, and the Yankees ultimately won 4-3.

So, the excitement factor has met its first real obstacle, and it has conquered it.

Excitement Factor

On Monday, September 18th, the Dodgers and Padres played one of the most exciting games in recent memory. The Dodgers twice overcame 4-run deficits, including an almost humorously ridiculous run of back-to-back-to-back-to-back home runs to lead off the bottom of the 9th to tie the game. Though the Padres scored a run in the 10th, the Dodgers again made hearts leap in the bottom of the inning with a two-run, walk-off, come-from-behind home run.
Clearly, that game was Bayer-worthy. Some have taken to calling it the game of the millennium, some going as far as to call it one of the best games in baseball history. The question is, how exciting was the game? Excitement is a strictly emotional response to the action on the field and thus is very hard to evaluate numerically. However, there are some subjective, emotional facts we know to be true. Excitement is generated when the outcome of the game is in doubt. Even more so, a game is exciting when an outcome seems apparent and then events take place to completely alter the complexion of the contest.
Fortunately, we are able to measure and track the probability of a given outcome. Win expectancy percentages are available for every potential game situation (any combination of score, inning, outs, and base runners). This allows us to track the chances of a team winning throughout the entire game.

Here, we see a graph in which the win expectancy is shown as a function of the event number. Any occurrence that moves a base runner or makes an out is an event. This mostly consists of plate appearances, but includes wild pitches and stolen base attempts as well.
The graph, though it looks nice and curvy, is actually made up of a good deal of straight lines. Obviously, the sum of these lines is equal to about +/-.5 every time, as a team enters with about a 50% chance of winning and finishes with chances of either 0% or 100%. However, an interesting thing happens when instead of just adding up the lines, we add up the absolute value of each line. That is to say, when a team increases its chances of winning by 10%, that is worth .1, and when their chances decrease by 10%, that is also worth .1. This accurately allows us to see how much the probability of an outcome changed over the course of the game, the essence of excitement.
The above graph is the progression of the Brewers’ chances during their game with the Dodgers on September 4th. That game seemed to me to be pretty standard. The Dodgers scored a run in the top of the 1st, with the Brewers responding in the 2nd and taking a 4-1 lead in the 5th. The Dodgers rallied to within one, but Milwaukee pulled away to a 6-3 win. By taking the absolute value of each line and adding them up, this game has an Excitement value of 2.5.
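For anyone who wants to try this on a game log, here is a minimal sketch of the calculation. It assumes you already have the win expectancy after each event as a list of numbers. The sample values are invented, not taken from any of the games discussed here.

```python
def excitement(win_expectancy):
    """Sum of the absolute value of each event-to-event change."""
    steps = zip(win_expectancy, win_expectancy[1:])
    return sum(abs(after - before) for before, after in steps)


def net_change(win_expectancy):
    """Plain signed sum of the changes: just last value minus first."""
    return win_expectancy[-1] - win_expectancy[0]


# Made-up win expectancies for one team, first pitch to final out (a win).
sample = [0.50, 0.45, 0.58, 0.40, 0.75, 1.00]

print(f"net change: {net_change(sample):+.2f}")
print(f"excitement: {excitement(sample):.2f}")
```

The signed sum always lands near +/-.5 no matter what happens in between, which is why only the absolute version separates a coaster from a seesaw.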
On September 19th, the Twins took an early 6-0 lead on the Red Sox and though the lead was cut in half, the Twins coasted to an easy 7-3 victory. Though the final score doesn’t suggest a blowout, in many ways this game was, as the outcome was never really put in doubt. Below is the graph of the Twins’ chances throughout the game.

This pounding provided an understandably low Excitement rating of 2.08.
So how good was the Dodgers-Padres game? Well first, admire the below graph of the Dodgers’ evolving win expectancy.


What we see are large dips and spikes as the teams made runs at each other, one taking a commanding position, only to watch as the other turned what looked like imminent defeat into a contest. This game had a staggering Excitement level of 8.56.
A flaw with this system is apparent by how it analyzes the Dodgers’ 9th. The game would have rated exactly the same if instead of the first three homers, the Dodgers had walked three times. The oversight here is that home runs are intrinsically exciting. I believe the system accurately suggests that in terms of putting the game in doubt, a walk is as good as a homer, but watching those balls fly into the night successively was just so special and exhilarating and Excitement by Win Expectancy can’t capture that.
Now, until a greater number of games is given an Excitement rating, this number has little meaning, as there’s no real frame of reference (there aren’t even units). We do know that it’s not unusual for a game to be rated between 2 and 2.5, so the 8.56 rating is quite impressive, despite our relative lack of understanding of how it really stacks up.
However, I found one thing particularly interesting during this study. The game between the White Sox and Cardinals on June 22nd saw 5 hits and 1 run, coming on a 7th-inning Jim Thome homer, and can certainly be classified as a pitching duel. Conventional opinion will tell you that a pitching duel is just as exciting as a slugfest. A good subjective point can be made to support this case, as a pitching duel keeps the run total low and thus the score very close throughout the course of the game. Below is the graph of the Cardinals’ chances during that game.

This game ranked lower than any of the other three discussed, with an Excitement rating of 1.98, despite the fact that it was tied into the late innings and never put out of reach. A simple conclusion can be drawn from this: runs are exciting. Though the first six innings of this game set the stage for an exciting conclusion, they in themselves provided no excitement. This all makes sense, though, as we already know that excitement is generated when the course of a game shifts from its apparent outcome. When runs are not scoring, nothing is changing.
This system can help us to understand why baseball and football dominate the American sports landscape. In both sports, scoring happens fairly regularly, but is not excessive. An early touchdown in football or a pair of early runs in baseball puts a team at an advantage and can be made to hold up, but by no means puts the game away. In basketball, the baskets at the moment they are scored are basically meaningless because they account for such a small portion of the final score. In hockey (or soccer for that matter) goals come so rarely that the excitement (though a good save is exciting in its own way) comes infrequently, so when it does come, it greatly shifts the win expectancy. In “goal sports”, it’s very difficult to get the back and forth that creates uber-excitement like we witnessed in LA this September.
(All graphs taken from fangraphs.com)

Charlie Manuel


I try to like Charlie Manuel. I try because most people think he's a moron just because he looks and sounds like a moron. I'll admit that his slow, stammering drawl isn't exactly inspiring, but it isn't substantive.

However, the Manager of the Year talk around him seems misguided. First off, the Baseball Writers' Association of America is a joke, and Manager of the Year is probably their worst award. It could pretty much be called "Manager who made the playoffs with the most injuries." Now, if the Phillies make the playoffs, that means Manuel will probably win.

Part of a manager's job is to get the players to play hard every day. Manuel seems to be good at that, so kudos to him, but it's hard to give him too much credit for the fact that a lot of his players are having great seasons.

Strategically, though, he leaves a lot to be desired. The Phillies are 8-19 in 1-run games so far this season. This is a fun stat to blame on the manager, but any real analysis of these games shows that the results are largely luck, and what isn't luck is determined by the quality of the bullpen. The Phillies' bullpen was decent in Manuel's first two years and they were 43-46 in 1-run games. This year, with a poor bullpen for about 4 months, the 8-19 record is somewhat a result of bad luck but the bullpen is also to blame.

That isn't Manuel's fault, but some of his late-game strategy is infuriating. Whenever the Phils lead late in the game, he disembowels the offense with pretty much useless defensive changes, switching Burrell out for Bourn/Roberson and Dobbs/Helms for Nunez. Despite these defensive upgrades, we manage to blow a lot of leads, because it's not up to Roberson or Nunez to keep runs off the board, but to the reliever. Thus, when we do blow leads, which happens pretty frequently, we then have trouble taking them back because two of our offensive threats (Burrell: .267/.417/.489; Dobbs: .286/.329/.487) are replaced with two serious downgrades (Nunez: .246/.321/.299; Bourn: .284/.361/.394).

In fact, his use of Nunez in general is really frustrating. Nunez can be a useful player; he can't hit a lick but he's a good fielder. He can be useful in support of ground ball pitching. Unfortunately, of our starters, only one induces ground balls on more than 45% of balls in play: Kyle Kendrick (48.6%). Yet, when Kendrick pitches, Manuel uses the Dobbs/Helms platoon. Nunez starts, instead, when Jamie Moyer, a flyball pitcher (38.5 GB%), is on the mound. This is a baffling decision, cutting away at the offense for a defensive replacement that isn't necessary.

His bullpen usage isn't great either, though there are few managers in the Majors who optimize their bullpen. If you're going to take Brett Myers, one of your two best pitchers, out of the rotation, you'd damn well better use him in more important situations than in the 9th inning with a three-run lead, when the team's chances of winning are about 98%. It's somewhat excusable when Willie Randolph uses Billy Wagner in that fashion because Wagner was not yanked out of the rotation, a move that cut Myers's innings by about 60%. If Myers is going to throw 75 innings instead of 200, Manuel has to be smarter about using him in the most dire of situations, not just save situations.

Basically, Charlie Manuel is "blah", completely unremarkable, not really sabotaging the Phils, but doing little to give the team a leg up.

Saturday, August 18, 2007


I went through a bit of a PhotoShop phase, which consisted of me pasting Ryan Howard into pictures...some are historical, some are from current events, and some are just ridiculous. I plan on sprinkling them throughout the blog. This one's my favorite of the bunch.

Intro

I am a Phillies fan and stathead living in the Philadelphia area. First off, yes, there are people living in the metropolitan area who like the Phillies more than the Eagles, though we try to keep it under wraps. On Friday night, the Phillies played a game in Pittsburgh in the middle of a pennant race while the Eagles played a preseason game at the Linc. The Inquirer's sports page the next day was dominated by coverage of the Eagles' impressive showing, with a column on the side about the Phils' 11-8 victory. But I'm used to it.
Baseball statheads are forced under small rocks in the desert, too, because most of the baseball-loving population is either too lazy or too stupid to learn about sabermetrics. And nobody likes being told they're wrong, so people curse statistics as some demonic witchcraft meant to corrupt the good ol' American game.
I, however, live in Philadelphia and love the Phillies (don't tell anyone); I also love everything holistic about baseball, but embrace statistics as my window into the game (don't tell anyone). (Actually, tell people. It would be cool if people actually read my blog).
This blog, if I keep it up, will be about my opinions on the Phillies, opinions formed by a curious, pensive, passionate, and, yes, statistically-based mind.
I guess as a sort of intro or whatever, I'll explain the title of this blog. Tinker-to-Evers-to-Chance was the famed Chicago Cubs double play trio in the early 1900s. Many rallies were killed by double plays flipped between these three players.
Well, the Phillies of the early 2000s have a prodigious 1B, 2B, and SS. Jimmy Rollins is an offensive stud playing one of the most demanding positions on the field at a very high level. He is overshadowed by fellow Phillies Howard and Utley and fellow SS, the Mets' Jose Reyes, but his value to a team rivals all of them. According to Baseball Prospectus, he's added 10.4 wins over a theoretical replacement player through August 17th so far in the 2007 season, most on the team. In 2006, he added 9.3, .1 less than Howard, the 2006 NL MVP, did. Rollins is an elite player that few take notice of because much of his value is due to strong defense and the fact that he plays a position that usually doesn't produce offense.
Howard and Utley make the right side of the Phillies' infield the most feared in baseball. Howard is a god to me, but is admittedly very flawed. His defense can be painful to watch, but his hitting is downright imposing. People complain about his strikeouts...screw that! A .260/.386/.570 (avg/obp/slg) line is hella valuable, especially considering he was injured for the first two months. Unfortunately, his hitting even appears to be flawed. In his MVP campaign, though he crushed righties and lefties, he did have a significant platoon split, hitting .331/.453/.711 against righties while hitting a somewhat less impressive .279/.364/.558 against lefties. His 2007 return to Earth has seen this issue become more of a problem: Righties-.287/.425/.605, Lefties-.221/.326/.519. Though still a power threat, he clearly isn't seeing pitches well off of lefties, as the 100-point drop in OBP would suggest. Still, though some things can be done to slow him down, he is one of the NL's unstoppable offensive forces (I'm pretty sure he's a superhero of some sort, too. For more, see 24).
Utley's the hard-nosed, gritty player of the trio. He doesn't always make things look easy (like Howard) or pretty (like Rollins) but no 2B playing today compares to Utley. He's a solid but not spectacular fielder, but he's a hitting machine. Unlike with Howard, there's not much you can do to try to derail Utley's .336/.414/.581 line...bring in a lefty, perhaps you keep his power down, but he'll have no problem drawing a walk or going gap to gap. Utley hits better than Rollins and fields better than Howard, and in 2007, most people would probably give him the nod over the other two in the spectacular trio in a Phillies' MVP vote. I, personally, abstain (Actually, I'd vote for Rhino, but my vote probably shouldn't count).
Rollins-to-Utley-to-Howard is a privilege that baseball fans should be lucky enough to see over the course of the next half-decade or so. Thank God.