Worked examples for talk: Producing and evaluating machine learning models.
Lecture slides: CV.pdf
Files:
The files below are telegraphic examples used to generate the graphs and numbers in the presentation. One can in principle work through them using R ( https://cran.r-project.org ), RStudio ( https://www.rstudio.com ), and the referenced packages. They are not complete tutorials; they exist to generate the numbers for the included presentation slides.
For a free video lecture on gradient boosting (one of the methods used) please see here: http://www.win-vector.com/blog/2015/11/free-gradient-boosting-lecture/ .
For a description of the vtreat package (used for data preparation) please see here: http://www.win-vector.com/blog/2016/06/a-demonstration-of-vtreat-data-preparation/ .
CV.pdf : lecture slides.
project.Rproj : RStudio project file (see https://www.rstudio.com ).
installH2O.R : Instructions to install the h2o deep learning kit.
kdd2009.Rmd : R knitr/r-markdown neural net fitting/scoring.
kdd2009.html : HTML rendering of above file.
KDD2009vtreat.Rmd : R knitr/r-markdown demonstration fitting/scoring.
KDD2009vtreat.html : HTML rendering of above file.
kdd2009tree.Rmd : R knitr/r-markdown decision tree fitting/scoring.
kdd2009tree.html : HTML rendering of above file.
kdd2009xgboost.Rmd : R knitr/r-markdown demonstration fitting/scoring.
kdd2009xgboost.html : HTML rendering of above file.
orange_small_train.data.gz : Example data.
orange_small_train_churn.labels.txt : Example data.
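
The two example data files can be read into R along the following lines. This is a minimal sketch, not code taken from the worked examples: the helper name load_kdd2009 is hypothetical, and it assumes the data file is tab-separated with a header row and the labels file holds one label per line with no header. Check both assumptions against the files before relying on it.

```r
# Hypothetical helper (not part of the original files): read the predictors
# and the churn labels, then attach the labels as a new column.
# Assumptions: data file is tab-separated with a header row; labels file has
# one label per line and no header; rows are in the same order in both files.
load_kdd2009 <- function(data_path = "orange_small_train.data.gz",
                         labels_path = "orange_small_train_churn.labels.txt") {
  # read.table() reads gzip-compressed files directly.
  # Treat both "NA" and empty fields as missing values.
  d <- read.table(data_path, header = TRUE, sep = "\t",
                  na.strings = c("NA", ""), stringsAsFactors = FALSE)
  labels <- read.table(labels_path, header = FALSE, sep = "\t")
  # The two files must describe the same rows.
  stopifnot(nrow(d) == nrow(labels))
  d$churn <- labels$V1
  d
}
```

With the working directory set to this folder, calling d <- load_kdd2009() should return a data frame with one row per example and an added churn column.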