BatchGetSymbols is now parallel!
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
BatchGetSymbols is my most downloaded package by any count. Computation time, however, has always been an issue. While downloading data for 10 or less stocks is fine, doing it for a large ammount of tickers, say the SP500 composition, gets very boring.
I’m glad to report that time is no longer an issue. Today I implemented a parallel option for BatchGetSymbols. If you have a high number of cores in your computer, you can seriously speep up the importation process. Importing SP500 compositition, over 500 stocks, is a breeze.
Give a try. The new version is already available in Github:
devtools::install_github('msperlin/BatchGetSymbols')
It should be in CRAN soon.
How to use parallel
Very simple. Just set you parallel plan with future::plan
and use input do.parallel = TRUE
in BatchGetSymbols
. If you are not sure how many cores you have available, just run the following code to figure it out:
future::availableCores() ## system ## 16 #devtools::install_github('msperlin/BatchGetSymbols') library(BatchGetSymbols) # get tickers from SP500 df.sp500 <- GetSP500Stocks() tickers <- df.sp500$tickers future::plan(future::multisession, workers = 10) # use 10 cores (future::availableCores()) # dowload data for 50 stocks l.out <- BatchGetSymbols(tickers = tickers[1:50], first.date = '2010-01-01', do.parallel = TRUE, do.cache = FALSE) ## ## Running BatchGetSymbols for: ## tickers = MMM, ABT, ABBV, ABMD, ACN, ATVI, ADBE, AMD, AAP, AES, AMG, AFL, A, APD, AKAM, ALK, ALB, ARE, ALXN, ALGN, ALLE, AGN, ADS, LNT, ALL, GOOGL, GOOG, MO, AMZN, AEE, AAL, AEP, AXP, AIG, AMT, AWK, AMP, ABC, AME, AMGN, APH, APC, ADI, ANSS, ANTM, AON, AOS, APA, AIV, AAPL ## Downloading data for benchmark ticker ## ^GSPC | yahoo (1|1) ## Running parallel BatchGetSymbols with 10 cores (16 available) ## ## Progress: ──────────────────────────────────────────────────────────────────────────────────────── 100% Progress: ────────────────────────────────────────────────────────────────────────────────────────────── 100% Progress: ────────────────────────────────────────────────────────────────────────────────────────────── 100% Progress: ──────────────────────────────────────────────────────────────────────────────────────────────── 100% ## ## ## MMM | yahoo (1|50) - Got 100% of valid prices | Good job! ## ABT | yahoo (2|50) - Got 100% of valid prices | OK! ## ABBV | yahoo (3|50) - Got 67.7% of valid prices | OUT: not enough data (thresh.bad.data = 75.0%) ## ABMD | yahoo (4|50) - Got 100% of valid prices | OK! ## ACN | yahoo (5|50) - Got 100% of valid prices | Good job! ## ATVI | yahoo (6|50) - Got 100% of valid prices | OK! ## ADBE | yahoo (7|50) - Got 100% of valid prices | Good stuff! ## AMD | yahoo (8|50) - Got 100% of valid prices | Feels good! ## AAP | yahoo (9|50) - Got 100% of valid prices | Good stuff! ## AES | yahoo (10|50) - Got 100% of valid prices | OK! ## AMG | yahoo (11|50) - Got 100% of valid prices | Feels good! ## AFL | yahoo (12|50) - Got 100% of valid prices | Good stuff! ## A | yahoo (13|50) - Got 100% of valid prices | Nice! ## APD | yahoo (14|50) - Got 100% of valid prices | Feels good! ## AKAM | yahoo (15|50) - Got 100% of valid prices | Nice! ## ALK | yahoo (16|50) - Got 100% of valid prices | Nice! ## ALB | yahoo (17|50) - Got 100% of valid prices | Youre doing good! ## ARE | yahoo (18|50) - Got 100% of valid prices | Got it! ## ALXN | yahoo (19|50) - Got 100% of valid prices | OK! ## ALGN | yahoo (20|50) - Got 100% of valid prices | OK! ## ALLE | yahoo (21|50) - Got 58.2% of valid prices | OUT: not enough data (thresh.bad.data = 75.0%) ## AGN | yahoo (22|50) - Got 100% of valid prices | Nice! ## ADS | yahoo (23|50) - Got 100% of valid prices | Good stuff! ## LNT | yahoo (24|50) - Got 100% of valid prices | Got it! ## ALL | yahoo (25|50) - Got 100% of valid prices | OK! ## GOOGL | yahoo (26|50) - Got 100% of valid prices | OK! ## GOOG | yahoo (27|50) - Got 100% of valid prices | Good job! ## MO | yahoo (28|50) - Got 100% of valid prices | Got it! ## AMZN | yahoo (29|50) - Got 100% of valid prices | Looking good! ## AEE | yahoo (30|50) - Got 100% of valid prices | Youre doing good! ## AAL | yahoo (31|50) - Got 100% of valid prices | Got it! ## AEP | yahoo (32|50) - Got 100% of valid prices | OK! ## AXP | yahoo (33|50) - Got 100% of valid prices | Well done! ## AIG | yahoo (34|50) - Got 100% of valid prices | Nice! ## AMT | yahoo (35|50) - Got 100% of valid prices | Youre doing good! ## AWK | yahoo (36|50) - Got 100% of valid prices | Mais contente que cusco de cozinheira! ## AMP | yahoo (37|50) - Got 100% of valid prices | Good job! ## ABC | yahoo (38|50) - Got 100% of valid prices | Looking good! ## AME | yahoo (39|50) - Got 100% of valid prices | Got it! ## AMGN | yahoo (40|50) - Got 100% of valid prices | Looking good! ## APH | yahoo (41|50) - Got 100% of valid prices | Well done! ## APC | yahoo (42|50) - Got 100% of valid prices | Well done! ## ADI | yahoo (43|50) - Got 100% of valid prices | Well done! ## ANSS | yahoo (44|50) - Got 100% of valid prices | Looking good! ## ANTM | yahoo (45|50) - Got 100% of valid prices | Feels good! ## AON | yahoo (46|50) - Got 100% of valid prices | Got it! ## AOS | yahoo (47|50) - Got 100% of valid prices | Well done! ## APA | yahoo (48|50) - Got 100% of valid prices | Good stuff! ## AIV | yahoo (49|50) - Got 100% of valid prices | Youre doing good! ## AAPL | yahoo (50|50) - Got 100% of valid prices | Looking good! glimpse(l.out) ## List of 2 ## $ df.control:Classes 'tbl_df', 'tbl' and 'data.frame': 50 obs. of 6 variables: ## ..$ ticker : chr [1:50] "MMM" "ABT" "ABBV" "ABMD" ... ## ..$ src : chr [1:50] "yahoo" "yahoo" "yahoo" "yahoo" ... ## ..$ download.status : chr [1:50] "OK" "OK" "OK" "OK" ... ## ..$ total.obs : int [1:50] 2335 2335 1581 2335 2335 2335 2335 2335 2335 2335 ... ## ..$ perc.benchmark.dates: num [1:50] 1 1 0.677 1 1 ... ## ..$ threshold.decision : chr [1:50] "KEEP" "KEEP" "OUT" "KEEP" ... ## $ df.tickers:'data.frame': 112080 obs. of 10 variables: ## ..$ price.open : num [1:112080] 83.1 82.8 83.9 83.3 83.7 ... ## ..$ price.high : num [1:112080] 83.4 83.2 84.6 83.8 84.3 ... ## ..$ price.low : num [1:112080] 82.7 81.7 83.5 82.1 83.3 ... ## ..$ price.close : num [1:112080] 83 82.5 83.7 83.7 84.3 ... ## ..$ volume : num [1:112080] 3043700 2847000 5268500 4470100 3405800 ... ## ..$ price.adjusted : num [1:112080] 65.8 65.4 66.3 66.4 66.8 ... ## ..$ ref.date : Date[1:112080], format: "2010-01-04" ... ## ..$ ticker : chr [1:112080] "MMM" "MMM" "MMM" "MMM" ... ## ..$ ret.adjusted.prices: num [1:112080] NA -0.006264 0.014182 0.000717 0.007046 ... ## ..$ ret.closing.prices : num [1:112080] NA -0.006264 0.014182 0.000717 0.007046 ...
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.