Rate statistics for Baseball Databank

Baseball Databank is one of the best sources for free, reliable baseball information on the net, and they just recently released the data for the 2009 season. I put together a small MySQL script to pull together rate statistics for both pitchers and hitters. The script will create supplemental views with the following information: for batters, AB (though that’s in the standard Batting table, it’s helpful to have as to avoid unnecessary joins), PA, TB, OBP, SLG, OPS, ISO, and BABIP; for pitchers, IP, ERA (this is duplicated from the Pitching table), WHIP, K/9, BB/9, HR/9, H/9, K/BB, and LOB%. I’ve added a Qualified column on the batting view which indicates whether or not the person qualified for the batting title that year, according to the formulas laid out here. My totals for qualifiers don’t quite match up before 1957, but I swear I’ve been re-checking the math and it looks square to me. Let me know if you see something obvious (or not so obvious).

You’ll forgive the… ugly… formatting here; I’ve never found a comfortable way to properly indent SQL. It’s just a messy, toolsy language which gets uglier and toolsier the more nested you get. You do what you need to do.

I’d like at some point to add some league-averaged stuff such as OPS+ and ERA+, but for now it is as it stands. Here you go.

DROP VIEW IF EXISTS AVG_Batting;
 
CREATE VIEW AVG_Batting AS
SELECT playerID, yearID,
SUM(AB + BB + IFNULL(HBP, 0) + IFNULL(SF, 0) + IFNULL(SH, 0)) AS PA,
SUM(AB) AS AB,
SUM(H - HR - 3B - 2B) + (SUM(2B) * 2) + (SUM(3B) * 3) + (SUM(HR) * 4)  AS TB,
SUM(H + BB + IFNULL(HBP, 0)) / SUM(AB + BB + IFNULL(HBP, 0) + IFNULL(SF, 0)) AS OBP,
(SUM(H - HR - 3B - 2B) + (SUM(2B) * 2) + (SUM(3B) * 3) + (SUM(HR) * 4)) / SUM(AB) AS SLG,
(SUM(H + BB + IFNULL(HBP, 0)) / SUM(AB + BB + IFNULL(HBP, 0) + IFNULL(SF, 0))) +
	((SUM(H - HR - 3B - 2B) + (SUM(2B) * 2) + (SUM(3B) * 3) + (SUM(HR) * 4)) / SUM(AB)) AS OPS,
(SUM(2B + (3B * 2) + (HR * 3))) / SUM(AB) AS ISO,
(SUM(H - HR)) / (SUM(AB - SO - HR + IFNULL(SF, 0))) AS BABIP,
(CASE
		WHEN yearID >= 1957 AND (SUM(AB + BB + IFNULL(HBP, 0) + SF + IFNULL(SH, 0)) >= (SELECT FLOOR(G * 3.1) FROM Teams WHERE teamID = Batting.teamID AND yearID = Batting.yearID)) THEN 1
		WHEN yearID BETWEEN 1945 AND 1956 AND (SUM(AB) / (SELECT G FROM Teams WHERE teamID = Batting.teamID AND yearID = Batting.yearID) >= 2.6) THEN 1
		WHEN yearID BETWEEN 1938 AND 1944 AND (SELECT lgID FROM Teams WHERE teamID = Batting.teamID AND yearID = Batting.yearID) = 'NL' AND (SUM(G) >= 100) THEN 1
		WHEN yearID BETWEEN 1938 AND 1944 AND (SELECT lgID FROM Teams WHERE teamID = Batting.teamID AND yearID = Batting.yearID) = 'AL' AND (SUM(AB) >= 400) THEN 1
		WHEN yearID BETWEEN 1920 AND 1937 AND (SUM(G) >= 100) THEN 1
		when yearID < 1920 AND (ROUND(SUM(G) / (SELECT G FROM Teams WHERE teamID = Batting.teamID AND yearID = Batting.yearID) * 100) >= 60) THEN 1
		ELSE 0
		END) AS Qualified
FROM Batting
GROUP BY yearID, playerID;
 
DROP VIEW IF EXISTS AVG_Pitching;
 
CREATE VIEW AVG_Pitching AS
SELECT playerID, yearID,
ROUND((IPouts / 3), 1) AS IP,
ROUND(((ER * 9) / (IPouts / 3)), 2) AS ERA,
ROUND(((BB + H) / (IPouts / 3)), 2) AS WHIP,
ROUND(((9 * SO) / (IPouts / 3)), 1) AS K9,
ROUND(((9 * BB) / (IPouts / 3)), 1) AS BB9,
ROUND(((9 * HR) / (IPouts / 3)), 1) AS HR9,
ROUND(((9 * H) / (IPouts / 3)), 1) AS H9,
ROUND((SO / BB), 2) AS KBB,
(H + BB + IFNULL(HBP, 0) - R) / (H + BB + IFNULL(HBP, 0) - (1.4 * HR)) AS LOB
FROM Pitching
GROUP BY yearID, playerID;

Leave a comment

RSS feed for comments on this post