Converting plots to data
[This article was first published on Wiekvoet, and kindly contributed to R-bloggers].
It is a problem that comes up every so often in applied work: you have a plot, but you want the data. At least two programs can help you there: PlotDigitizer and Engauge Digitizer. I installed both on my openSUSE machine. Both are available for Windows; for Mac there are only older versions of Engauge.
I tried these programs on a relatively simple problem. I saw a plot in a book and wanted to calculate that line myself. So I took my camera, photographed the plot and got to work.
Engauge Digitizer
Engauge has been around for quite a while. It has many features, but looks a bit outdated. It was not able to import my original figure (2992*2992 pixels, 694 KB) but had no problems after I resized it to 500*500 pixels, 55.9 KB. It is clearly the program that can handle the more exotic plots. For me it is not intuitive; for instance, it took me quite some time to figure out how to export the results. Initially I copy-pasted the results to a spreadsheet; later I managed to create a .csv after all. Engauge comes with a manual, so everything can be resolved. Engauge can also do point detection. To use that it is probably best to crop the figure as much as possible, since Engauge has no qualms about finding points in text, black blobs, axis labels and such. Automatic detection would probably work better in a colored plot, and there are some settings to guide it.
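Once the .csv is there, getting it into R is a small step. Below is a minimal sketch, assuming the export has a header row with one x column and one column per curve; the header names and the values are made up for illustration and are not taken from my actual export.

```r
# Made-up stand-in for an Engauge .csv export (header names and values are assumptions)
csv_text <- "x,Curve1
1.5,3.75
0.5,1.25
1.0,2.50"

# With a real export, point read.csv() at the file instead of text =
digitized <- read.csv(text = csv_text)

# Use plain column names and sort by x so the points follow the line
names(digitized) <- c("x", "y")
digitized <- digitized[order(digitized$x), ]
rownames(digitized) <- NULL

str(digitized)
```

If the export contains several curves, there would be more columns to rename, so the names() line needs adapting.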
PlotDigitizer
PlotDigitizer looks much more modern. It had no problems with the large photo, except that it could not scale the photo down enough to fit on the screen. The interface allows adding, removing and moving points manually. It is also possible to trace a line on screen, and it will add the points it detects there. PlotDigitizer exports to .xml. It is also possible to copy-paste the results. While I see the advantage of a file that includes documentation, it would also be nice to get the data out of the file directly. The file I got needed some extra processing before I had the data.frame.
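library(XML)
# Parse the PlotDigitizer export and turn it into a nested list
mytree <- xmlTreeParse('test12.xml')
mylist <- xmlToList(mytree)
# Drop the first three elements, keeping the entries with the points
mylist2 <- mylist[4:length(mylist)]
# Bind the points into a matrix, one row per point
mydf <- do.call(rbind, mylist2)
# dx and dy are stored as text, so convert them to numeric
convert <- data.frame(x = as.numeric(mydf[, 'dx']),
                      y = as.numeric(mydf[, 'dy']))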
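The code above depends on the points starting at the fourth element of the list. A possible alternative is to select the points by their attributes with XPath, so the position in the file does not matter. This is only a sketch: the XML below is a made-up stand-in, its element names and values are assumptions, and only the dx and dy attribute names come from the code above.

```r
library(XML)

# Made-up stand-in for a PlotDigitizer export; element names and values are
# assumptions, only the dx/dy attribute names are taken from the code above
xml_text <- '<data>
  <title>example</title>
  <point n="1" dx="0.5" dy="1.25"/>
  <point n="2" dx="1.0" dy="2.50"/>
  <point n="3" dx="1.5" dy="3.75"/>
</data>'

# With a real export, use xmlParse("test12.xml") instead
doc <- xmlParse(xml_text, asText = TRUE)

# Select every element that carries both a dx and a dy attribute
nodes <- getNodeSet(doc, "//*[@dx and @dy]")

# Attributes come back as text, so convert them to numeric
convert <- data.frame(
  x = as.numeric(sapply(nodes, xmlGetAttr, "dx")),
  y = as.numeric(sapply(nodes, xmlGetAttr, "dy"))
)
print(convert)
```

Elements without dx and dy, such as the title here, are skipped automatically, which is what the 4:length(mylist) indexing did by hand.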
Conclusion
The programs complement each other. Engauge is great for automated extraction and complex plots, but it is not so easy for occasional use. PlotDigitizer is easy to use and great if you want to select your points manually.