The Korean Journal of Applied Statistics (2015) 28(6), 1133-1146
DOI: http://dx.doi.org/10.5351/KJAS.2015.28.6.1133
Analysis of Horse Races: Prediction of Winning Horses in Horse Races Using Statistical Models
Hyemin Choe^a, Nayoung Hwang^a, Chankyoung Hwang^a, Jongwoo Song^a,1
^a Department of Statistics, Ewha Womans University
(Received September 14, 2015; Revised October 13, 2015; Accepted October 19, 2015)
Abstract
The horse racing industry accounts for the largest share of the domestic legal gambling industry. However, statistical analysis of horse races is limited compared with that of other sports. We propose prediction models for winning horses in horse races using data mining techniques such as logistic regression, linear regression, and random forest. The horse race data come from the Korea Racing Authority; we use horse racing reports and information on racehorses, jockeys, and horse trainers. We consider two types of models, one based on ranks and one based on time records. The analysis results show that the prediction of ranks is affected by information on racehorses and by the number of wins of racehorses and jockeys. Wagers placed on the last month of races according to our prediction models yield substantial profits.
Keywords: horse race, linear regression, stepwise regression, random forest, logistic regression, important variables