Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
We have dabbled with RevoScaleR before. In this exercise set we will work with H2O, another high-performance R library that handles big data very effectively. The exercises increase in difficulty, so please do them in sequence.
H2O requires Java to be installed on your system, so please install Java before trying H2O. As always, check the documentation before attempting this exercise set.
Answers to the exercises are available here.
If you want to install the latest release from H2O, follow these instructions.
Exercise 1
Download the latest stable release of H2O and initialize the cluster.
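As a starting point, here is a minimal sketch of installing and starting H2O from R. It assumes the CRAN build of the h2o package is acceptable (CRAN may lag behind the release published on H2O's own site) and that Java is already installed; the repos URL is just one common CRAN mirror.

```r
# Install the h2o package from CRAN only if it is not already available.
# Assumption: the CRAN build is recent enough for these exercises.
if (!requireNamespace("h2o", quietly = TRUE)) {
  install.packages("h2o", repos = "https://cloud.r-project.org")
}

library(h2o)

# Start a local H2O cluster, or connect to one that is already running.
# nthreads = -1 asks H2O to use all available cores.
h2o.init(nthreads = -1)
```

Calling h2o.init() again in the same session simply reconnects to the running cluster, so it is safe to leave at the top of a script.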
Exercise 2
Check the cluster information via h2o.clusterInfo().
Exercise 3
You can see how H2O works via the demo function. Run the demo for H2O’s GLM.
Exercise 4
Download loan.csv from H2O’s GitHub repo and import it using H2O.
Exercise 5
Check the type of the imported loan data and notice that it is not a data frame. Then check the summary of the loan data.
Hint: use h2o.summary().
Exercise 6
One might want to transfer a data frame from the R environment to H2O. Use as.h2o to convert the mtcars data frame to an H2OFrame.
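For the as.h2o conversion in Exercise 6, the round trip between R and H2O can be sketched as follows. The frame key "mtcars_hex" is an arbitrary name chosen for illustration, and a local cluster is assumed to be reachable through h2o.init().

```r
library(h2o)
h2o.init()

# Copy the built-in mtcars data frame into the H2O cluster.
# "mtcars_hex" is an illustrative key, not a required name.
mtcars_hex <- as.h2o(mtcars, destination_frame = "mtcars_hex")

# The result is a handle to data held by H2O, not an R data frame.
class(mtcars_hex)
dim(mtcars_hex)

# Pull the data back into R when a regular data frame is needed.
mtcars_back <- as.data.frame(mtcars_hex)
```

The H2OFrame lives in the cluster's memory, so most base R functions will not operate on it directly; the h2o.* functions or a conversion back with as.data.frame are the usual routes.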
Exercise 7
Check the dimensions of the loan H2OFrame via h2o.dim.
Exercise 8
Find the column names of the loan H2OFrame.
Exercise 9
Check the histogram of the loan amount in the loan H2OFrame.
Exercise 10
Find the mean loan amount for each home ownership group in the loan H2OFrame.
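Grouped aggregation in H2O goes through h2o.group_by. The sketch below shows the pattern on a small made-up frame rather than the real loan data: the column names loan_amnt and home_ownership are assumptions about how the loan file is laid out, and the values are purely illustrative.

```r
library(h2o)
h2o.init()

# Toy stand-in for the loan data; names and values are illustrative only.
toy <- data.frame(
  home_ownership = c("RENT", "OWN", "RENT", "MORTGAGE", "OWN"),
  loan_amnt      = c(5000, 12000, 7000, 20000, 8000),
  stringsAsFactors = TRUE
)
toy_hex <- as.h2o(toy)

# Mean loan amount within each home ownership group.
means <- h2o.group_by(toy_hex, by = "home_ownership", mean("loan_amnt"))

# Bring the small result back into R for inspection.
as.data.frame(means)
```

The same call should carry over to the imported loan frame once the actual column names have been confirmed with h2o.colnames.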