[This article was first published on isomorphismes, and kindly contributed to R-bloggers.]
I’ve googled “How do I find out how big my workspace is” too many times. Here’s the explicit code to run; hopefully the next googler sees this post:
for (thing in ls()) { message(thing); print(object.size(get(thing)), units='auto') }
Fin. You can stop there.
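If you’d rather see the biggest objects first, here’s a small variation (a sketch using the same base-R `ls()`, `get()`, and `object.size()` as above, nothing extra assumed):

```r
# Collect each object's size in bytes, then print
# them largest-first in human-readable units.
sizes <- sapply(ls(), function(nm) object.size(get(nm)))
for (nm in names(sort(sizes, decreasing = TRUE))) {
  message(nm)
  print(object.size(get(nm)), units = 'auto')
}
```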
Or, for a bit of context, here’s some example code that generates objects of varying sizes, where you might not be sure how big they are:

require(boot)
require(datasets)
data(sunspot.year)

system.time(boot.1 <- boot(sunspot.year, max, R=1e3, parallel='multicore', ncpus=4))
system.time(boot.2 <- boot(sunspot.year, max, R=1e4))
system.time(boot.3 <- tsboot(sunspot.year, max, R=1e5, parallel='multicore', ncpus=4))
system.time(boot.4 <- boot(sunspot.year, max, R=1e5, parallel='multicore', ncpus=8))
system.time(boot.5 <- boot(sunspot.year, max, R=1e6, parallel='multicore', ncpus=8))

print(boot.1)
plot(boot.1)
par(col=rgb(0,0,0,.1), pch=20)
plot(boot.2)

for (thing in ls()) { message(thing); print(object.size(get(thing)), units='auto') }

(Note: the argument to boot() is ncpus, not ncpu, and the boot.5 call was missing a comma before parallel; both are fixed above.)
This code is doing a few things:

- resampling the sunspot dataset to try to estimate the most sunspots we “should” see in a year (with a very stylised meaning of “should”). This is worth looking into because some people say global warming is caused by sunspots rather than, e.g., carbon emissions multiplying greenhouse effects. History only happened once, but by bootstrapping we try to overcome this.
- noodling around with multiple cores (my laptop has 8; run lscpu to check yours. It ships with util-linux, so there’s nothing to apt-get install.) Nothing interesting happens in this case; still, multicore is an option.
- timing how long fake reprocessings of history take with various amounts of resampling and various numbers of cores.
- showing how big those bootstrap objects are. Remember, R runs entirely in memory, so big datasets or derived objects of any kind can cramp your home system or bork your EC2 instance.
- printing the size of the objects, as promised.

On my system (where I didn’t run the exact code above), the output was:
> for (obj in ls()) { message(obj); print(object.size(get(obj)), units='auto') }
b.1
89.1 Kb
b.2
792.2 Kb
b.3
7.6 Mb
b.4
7.6 Mb
b.5
7.6 Mb
b.6
792.2 Kb
obj
64 bytes
sunspot.year
2.5 Kb
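If you just want one number for the whole workspace, you can sum the per-object sizes. This is a sketch built from the same base-R functions; the structure() call merely reuses the object_size print method so the total comes out in auto-scaled units:

```r
# Sum every object's size (in bytes), then print the
# total with the same human-readable formatting as
# object.size() itself.
total <- sum(sapply(ls(), function(nm) as.numeric(object.size(get(nm)))))
print(structure(total, class = 'object_size'), units = 'auto')
```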
PS To find out how much memory you have (on linux; Macs don’t ship free, though vm_stat is roughly analogous) do:
$ free -mt
             total       used       free     shared    buffers     cached
Mem:         15929      12901       3028          0        214       9585
-/+ buffers/cache:       3102      12827
Swap:        10123          0      10123
Total:       26053      12901      13152
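You can also ask from inside R itself: gc() (base R’s garbage-collector report, not an add-on) shows how much memory R is currently holding:

```r
# gc() returns a matrix; column 2 is megabytes currently used
# (one row for cons cells, one for the vector heap).
g <- gc()
print(g)
sum(g[, 2])   # total Mb R is using right now
```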