I wrote a PHP library for pulling MLB data, but...

Inspired by "Gameday_API", a Ruby library for pulling MLB data,
I wrote a PHP library that, for now, pulls batter stats.
It can be used roughly like this.



Getting Ichiro's player information

require_once 'Gameday/Batter.php';
$batter = new Gameday_Batter(400085);
var_dump($batter);
Result ↓
object(Gameday_Batter)[1]
protected 'gid' => string '2011_06_18_phimlb_seamlb_1' (length=26)
protected 'pid' => int 400085
protected 'position' => string 'RF' (length=2)
protected 'type' => string 'batter' (length=6)
protected 'firstName' => string 'Ichiro' (length=6)
protected 'lastName' => string 'Suzuki' (length=6)
protected 'jerseyNumber' => int 51
protected 'height' => float 180
protected 'weight' => float 77.1
protected 'bats' => string 'L' (length=1)
protected 'throws' => string 'R' (length=1)
protected 'birthday' =>
object(DateTime)[18]
public 'date' => string '1973-10-22 00:00:00' (length=19)
public 'timezone_type' => int 3
public 'timezone' => string 'Asia/Tokyo' (length=10)
protected 'team' => string 'sea' (length=3)



Getting Ichiro's most recent game stats

require_once 'Gameday/Batter.php';
$batter = new Gameday_Batter(400085);
var_dump($batter->getLatestStat());
Result ↓
object(Gameday_Stat)[35]
protected 'date' =>
object(DateTime)[36]
public 'date' => string '2011-06-18 00:00:00' (length=19)
public 'timezone_type' => int 3
public 'timezone' => string 'Asia/Tokyo' (length=10)
protected 'gameNumber' => int 1
protected 'runs' => int 0
protected 'hits' => int 1
protected 'steal' => int 0
protected 'singles' => int 1
protected 'doubles' => int 0
protected 'triples' => int 0
protected 'walks' => int 0
protected 'steelFails' => int 0
protected 'strikeOuts' => int 1
protected 'errors' => int 0
protected 'hitByPitches' => int 0
protected 'lastAtBat' => string 'Strikeout' (length=9)
protected 'average' => float 0.274
protected 'atBats' => int 5
protected 'homeruns' => int 0
protected 'rbi' => int 0
protected 'ops' => null



Getting Ichiro's 2011 season stats

require_once 'Gameday/Batter.php';
$batter = new Gameday_Batter(400085);
var_dump($batter->getSeasonStatByYear(2011));
Result ↓
object(Gameday_SeasonStat)[35]
protected 'year' => int 2011
protected 'empty' =>
object(Gameday_StatLite)[49]
protected 'average' => float 0.276
protected 'atBats' => int 192
protected 'homeruns' => int 0
protected 'rbi' => int 0
protected 'ops' => float 0.638
protected 'menOn' =>
object(Gameday_StatLite)[47]
protected 'average' => float 0.27
protected 'atBats' => int 100
protected 'homeruns' => int 0
protected 'rbi' => int 21
protected 'ops' => float 0.687
protected 'risp' =>
object(Gameday_StatLite)[46]
protected 'average' => float 0.317
protected 'atBats' => int 63
protected 'homeruns' => int 0
protected 'rbi' => int 21
protected 'ops' => float 0.81
protected 'loaded' =>
object(Gameday_StatLite)[45]
protected 'average' => float 0.25
protected 'atBats' => int 4
protected 'homeruns' => int 0
protected 'rbi' => int 3
protected 'ops' => float 0.5
protected 'vsLHP' =>
object(Gameday_StatLite)[44]
protected 'average' => float 0.302
protected 'atBats' => int 86
protected 'homeruns' => int 0
protected 'rbi' => int 7
protected 'ops' => float 0.688
protected 'vsRHP' =>
object(Gameday_StatLite)[43]
protected 'average' => float 0.262
protected 'atBats' => int 206
protected 'homeruns' => int 0
protected 'rbi' => int 14
protected 'ops' => float 0.642
protected 'date' => null
protected 'gameNumber' => null
protected 'runs' => int 38
protected 'hits' => int 80
protected 'steal' => int 18
protected 'singles' => int 65
protected 'doubles' => int 13
protected 'triples' => int 2
protected 'walks' => int 22
protected 'steelFails' => int 4
protected 'strikeOuts' => int 23
protected 'errors' => int 3
protected 'hitByPitches' => int 0
protected 'lastAtBat' => null
protected 'average' => float 0.274
protected 'atBats' => int 292
protected 'homeruns' => int 0
protected 'rbi' => int 21
protected 'ops' => float 0.656



Still, it's hard to use after all

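Looking closely at the data, there is a surprisingly wide variety of it available,
and if you felt like it you could even replay the flow of a particular game as it happened.
In the end, though, it is just data stored in XML, so as soon as you try anything slightly complex
you need to parse pages and load multiple pages, and performance drops off very quickly.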

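For example, to get a player's information there is no such thing as a "player list",
so you have to track down the player's most recent game and fetch his profile from that game.
There is also no way to search for a player's ID (Ichiro's is 400085).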
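
As a rough illustration of the "find the game, then fetch the profile from that game" step, here is a minimal sketch that turns a game ID and a player ID into a per-game profile URL. The function name gamedayBatterUrl, the constant GAMEDAY_BASE, and the whole URL layout are my assumptions for illustration. They are not taken from the library above, so check them against the real data before relying on them.

```php
<?php
// Hypothetical helper, not part of the library shown above.
// ASSUMPTION: the base URL and directory layout are a guess at how the
// per-game files are organized; verify against the actual site.
const GAMEDAY_BASE = 'http://gd2.mlb.com/components/game/mlb';

function gamedayBatterUrl(string $gid, int $pid): string
{
    // A gid such as "2011_06_18_phimlb_seamlb_1" starts with year_month_day.
    if (!preg_match('/^(\d{4})_(\d{2})_(\d{2})_/', $gid, $m)) {
        throw new InvalidArgumentException("Unexpected gid format: $gid");
    }
    return sprintf(
        '%s/year_%s/month_%s/day_%s/gid_%s/batters/%d.xml',
        GAMEDAY_BASE, $m[1], $m[2], $m[3], $gid, $pid
    );
}

echo gamedayBatterUrl('2011_06_18_phimlb_seamlb_1', 400085), PHP_EOL;
```

Even with a helper like this, the gid still has to be discovered first, which is exactly the extra round trip that makes the lookup slow.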

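If I wanted to produce a day-by-day list of batting stats for 2010,
it would take more than 300 requests in total, including working out when the season starts and ends,
and fetching the data takes about 3 minutes.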
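
Roughly 3 minutes for 300-odd requests works out to a bit over half a second per request, so one thing that might help is never fetching the same URL twice. Below is a minimal sketch of a file-backed cache wrapped around an arbitrary fetcher. The name cachedFetcher and the cache file naming are hypothetical, and this is not how the library above works. It only illustrates the idea.

```php
<?php
// Hypothetical sketch, not part of the library shown above.
// Wraps any fetcher (a callable taking a URL and returning the body as a
// string) with a file cache so each URL is downloaded at most once.
function cachedFetcher(callable $fetch, string $cacheDir): callable
{
    if (!is_dir($cacheDir) && !mkdir($cacheDir, 0777, true)) {
        throw new RuntimeException("Cannot create cache directory: $cacheDir");
    }
    return function (string $url) use ($fetch, $cacheDir): string {
        $path = $cacheDir . '/' . sha1($url) . '.xml';
        if (is_file($path)) {
            return file_get_contents($path); // cache hit: no network access
        }
        $body = $fetch($url);                // cache miss: fetch once
        file_put_contents($path, $body);
        return $body;
    };
}

// Demo with a stand-in fetcher that counts calls instead of using the network.
$calls = 0;
$fake = function (string $url) use (&$calls): string {
    $calls++;
    return "<stub url=\"$url\"/>";
};
$fetch = cachedFetcher($fake, sys_get_temp_dir() . '/gameday_cache_' . uniqid());
$first = $fetch('http://example.com/a.xml');
$second = $fetch('http://example.com/a.xml'); // served from the cache file
echo $calls, PHP_EOL;
```

In real use the stand-in could be swapped for something like 'file_get_contents' (assuming allow_url_fopen is enabled). This only helps on the second and later runs, though. The first pass over a whole season would still need every request.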

That really isn't practical...
It's fine if you just want to look up something like the data shown above, though.


I guess it was never going to be that easy.



Still exploring...