MLB.com provides, among other things, all of the pitch information for each MLB and AAA game in XML format. It’s the data that drives their wonderful little service Gameday. If you want to take a spin through what the data looks like, start here and poke around. What’s key, most folks agree, is the PITCHf/x information, but there are also pitch-by-pitch logs for every game.
I’ve put together a little package which includes (1) a schema for a MySQL database to retain the information, and (2) a Python script which will handle fetching and parsing the XML data found on MLB.com servers. If you’re interested in such a thing, you can download it from the GitHub project page.
--type=[mlb, aaa] optional: which league to process. Default is ‘mlb’. Any of the categories found here (AA, etc.) should work; I’ve just worked with MLB and AAA.
--verbose Shows every HTTP request
--delta Uses delta mode.
When delta mode is run, the script will store the last date it processed in the database. Upon next execution, it will start from where it left off. This is useful for running the thing nightly to grab the latest stuff.
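To make the delta idea concrete, here is a minimal sketch of the bookkeeping. It is not the script's actual code. It uses Python's built-in sqlite3 in place of MySQL so it runs on its own, and the table name `last_run`, the helper names, and the choice to resume on the day after the stored date are all assumptions made for illustration.

```python
import sqlite3
from datetime import date, timedelta


def get_last_date(conn):
    """Return the last processed date, or None if nothing is stored yet."""
    # Hypothetical single-row table; the real schema may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS last_run ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), last_date TEXT)"
    )
    row = conn.execute("SELECT last_date FROM last_run WHERE id = 1").fetchone()
    return date.fromisoformat(row[0]) if row else None


def save_last_date(conn, day):
    """Remember the most recent date that was fully processed."""
    conn.execute(
        "INSERT OR REPLACE INTO last_run (id, last_date) VALUES (1, ?)",
        (day.isoformat(),),
    )
    conn.commit()


def run_delta(conn, today, default_start, process_day):
    """Process every day from the stored position through today."""
    last = get_last_date(conn)
    # Assumption: resume on the day after the last stored date.
    current = last + timedelta(days=1) if last else default_start
    processed = []
    while current <= today:
        process_day(current)          # e.g. fetch and parse that day's XML
        save_last_date(conn, current)  # saved per day, so a crash loses little
        processed.append(current)
        current += timedelta(days=1)
    return processed
```

Called nightly with `today=date.today()`, a function like this only touches the days added since the previous run. Saving the date after each day, rather than once at the end, means an interrupted run picks up at the first unfinished day.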
A friend of mine was watching Nolan Ryan’s seventh no-hitter, broadcast today on MLB Network. One of the color guys – Tommy Hutton or Fergie Oliver – mentioned that the current Ranger hitter, Gary Pettis, was close to the all-time record for most seasons with 100+ strikeouts and a small handful of home runs. I don’t think the announcer clarified the number of home runs, but my friend sent me an email asking about such a record.
I pulled up ye olde Baseball Databank, one of the best resources around, and plugged in a query to show the players who put up seasons of at least 100 strikeouts and no more than ten home runs. Pettis indeed leads the bigs with six such seasons, actually passing the previous record holder, Omar Moreno, in 1990, the year before Ryan’s no-hitter. So it was all in the books by the time it was brought up on the air, but never mind. What a glorious little record.
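For the curious, a query along these lines can be sketched as follows. This is a reconstruction, not the exact query I ran. It assumes a `Batting` table with `playerID`, `yearID`, `stint`, `HR`, and `SO` columns, uses an in-memory SQLite database so it runs without a Databank install, and uses placeholder player IDs. The Pettis and Moreno rows are taken from the table below; the fourth row is made up purely to show a season that gets filtered out.

```python
import sqlite3

# Sum across stints so a player traded mid-season counts as one season,
# keep seasons with SO >= 100 and HR <= 10, then count seasons per player.
QUERY = """
SELECT playerID, COUNT(*) AS seasons
FROM (
    SELECT playerID, yearID, SUM(HR) AS total_hr, SUM(SO) AS total_so
    FROM Batting
    GROUP BY playerID, yearID
    HAVING SUM(SO) >= 100 AND SUM(HR) <= 10
) AS qualifying
GROUP BY playerID
ORDER BY seasons DESC, playerID
"""

# (playerID, yearID, stint, HR, SO); IDs are placeholders, not Databank IDs.
SAMPLE_ROWS = [
    ("pettis", 1985, 1, 1, 125),
    ("pettis", 1987, 1, 1, 124),
    ("moreno", 1978, 1, 2, 104),
    ("made_up_slugger", 2000, 1, 25, 110),  # invented row: too many home runs
]


def build_sample_db():
    """Create an in-memory database holding a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Batting ("
        "playerID TEXT, yearID INTEGER, stint INTEGER, HR INTEGER, SO INTEGER)"
    )
    conn.executemany("INSERT INTO Batting VALUES (?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def qualifying_season_counts(conn):
    """Return (playerID, number of qualifying seasons), most seasons first."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for player, seasons in qualifying_season_counts(build_sample_db()):
        print(player, seasons)
```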
Five active players show up with two seasons meeting the criteria: Michael Bourn, Chone Figgins, Akinori Iwamura, Mark Teahen, and Michael Young.
| Name | Year | HR/SO |
|---|---|---|
| Gary Pettis (6) | 1985 | 1/125 (.0080) |
| | 1987 | 1/124 (.0081) |
| | 1989 | 1/106 (.0094) |
| | 1984 | 2/115 (.0174) |
| | 1990 | 3/118 (.0254) |
| | 1986 | 5/132 (.0379) |
| Omar Moreno (5) | 1978 | 2/104 (.0192) |
| | 1980 | 2/101 (.0198) |
| | 1982 | 3/121 (.0248) |
| | 1977 | 7/102 (.0686) |
| | 1979 | 8/104 (.0769) |
| Royce Clayton (4) | 2005 | 2/105 (.0190) |
| | 1995 | 5/109 (.0459) |
| | 2004 | 8/125 (.0640) |
| | 1997 | 9/109 (.0826) |
| Bobby Knoop (4) | 1968 | 3/128 (.0234) |
| | 1964 | 7/109 (.0642) |
| | 1967 | 9/136 (.0662) |
| | 1965 | 7/101 (.0693) |
| Lou Brock (4) | 1968 | 6/124 (.0484) |
| | 1973 | 7/112 (.0625) |
| | 1971 | 7/107 (.0654) |
| | 1963 | 9/122 (.0738) |
All of these guys with the exception of Knoop are speed guys: Brock you may have heard of, and Moreno, Pettis, and Clayton have 1,072 stolen bases among them.
Michael Bourn had a breakthrough season in 2009 with 61 steals in 73 attempts and 678 plate appearances, solidifying himself by May 21 as the Astros’ lead-off guy. Given that he also struck out 140 times and hit only three home runs, he stands the best chance of moving up this dubious list should he keep up the pace.
To illustrate the depths of their futility vis-à-vis the longball, the average HR/SO ratio for people hitting more than ten home runs breaks down per decade like so:
| Decade | HR/SO |
|---|---|
| 2000s | .234 |
| 1990s | .241 |
| 1980s | .243 |
| 1970s | .253 |
| 1960s | .258 |
| 1950s | .346 |
| 1940s | .357 |
| 1930s | .402 |
| 1920s | .453 |
| 1910s | .247 |
posted: 29 January 2010, 5:45 pm by Wells
tags: baseball
For all of my hard luck, at least I’m not the woman who tripped and fell into a Picasso painting at the Metropolitan Museum of Art in NYC.
The museum said the Picasso work was damaged Friday when a visitor lost her balance and fell onto the unusually large 6-foot, 4-inch work.
What she needs to do is take the high road here: she didn’t trip, she intentionally fell and altered the art in order to create a new piece entitled “Torn Picasso”. She needs a 500-word artist statement and an agent, not a lawyer. What she’s done here is shift paradigms and challenge world views. I applaud her and the New Art.
posted: 25 January 2010, 3:06 pm by Wells