R and the Geotechnical Exchange Format
[This article was first published on Bart Rogiers - Sreigor, and kindly contributed to R-bloggers.]
Quite some time ago, I wrote this function to read some *.gef files into R. “gef” stands for Geotechnical Exchange Format. Details on this data format, as well as several software tools, can be found at http://www.geffiles.nl/. I had a long list of very specific *.gef files, so the function is not very generic, but it might give you a good starting point if you need to do something similar. I might update it in the future, though.
read.gef <- function(filename)
{
  # Read the file as one string per line
  gef.lines <- scan(filename, what=character(), sep='\n')
  # Header lines start with '#'; everything else is data
  is.comment <- substr(gef.lines, 1, 1) == '#'
  gef.lines.comments <- gef.lines[is.comment]
  gef.lines.data <- gef.lines[!is.comment]
  gef <- NULL
  nc <- length(grep('COLUMNINFO=', gef.lines.comments))
  nr <- length(gef.lines.data)
  # Parse the space-separated data rows into a numeric matrix
  gef$data <- matrix(ncol=nc, nrow=nr)
  for(i in seq_len(nr))
  {
    for(j in seq_len(nc))
    {
      gef$data[i,j] <- as.numeric(remove.empty.strings(strsplit(gef.lines.data[i], ' ')[[1]])[j])
    }
  }
  gef$data <- as.data.frame(gef$data)
  # Column names: fourth space-separated token of each COLUMNINFO line
  for(i in seq_len(nc)) names(gef$data)[i] <- gef.lines.comments[grep('COLUMNINFO=', gef.lines.comments)[i]]
  for(i in seq_len(nc)) names(gef$data)[i] <- strsplit(strsplit(names(gef$data), ' ')[[i]][4], ',')[[1]][1]
  # Coordinates and surface level from the XYID and ZID header lines
  gef$x <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('XYID=', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$y <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('XYID=', gef.lines.comments)], ' ')[[1]])[4], ',')[[1]])
  gef$surface <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('ZID=', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  if(length(grep('PARENT=', gef.lines.comments)) == 1) # file is child
  {
    gef$depth <- as.numeric(strsplit(remove.empty.strings(strsplit(
      gef.lines.comments[grep('PARENT=', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    gef$z <- gef$surface - gef$depth
  }
  # else # file is parent
  # {
  # }
  # Measurement variables 1 to 4
  gef$cone <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('MEASUREMENTVAR= 1', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$sleeve <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('MEASUREMENTVAR= 2', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$a.cone <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('MEASUREMENTVAR= 3', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$a.sleeve <- as.numeric(strsplit(remove.empty.strings(strsplit(
    gef.lines.comments[grep('MEASUREMENTVAR= 4', gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  return(gef)
}
The function below is used in the code above:
remove.empty.strings <- function(stringArray)
{
  # Drop the empty strings that strsplit() leaves between repeated spaces
  newStringArray <- NULL
  for(i in seq_along(stringArray))
  {
    if(stringArray[i] != '') {newStringArray <- c(newStringArray, stringArray[i])}
  }
  return(newStringArray)
}
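If you want something a bit more compact, the two parsing steps used above (splitting a data row on spaces, and pulling values out of a header line) could be sketched roughly as follows. This is only a sketch and not part of the original function: the helper names gef.split.row and gef.header.fields are hypothetical, and the sample lines are made up to match the layout read.gef expects, not taken from a real file. Note that the helper splits header lines on commas, so the x and y values are fields 2 and 3 here, whereas read.gef picks tokens 3 and 4 after splitting on spaces.

```r
# Hypothetical helper: split one data row on runs of whitespace and convert
# the pieces to numbers, so no empty strings have to be removed afterwards.
gef.split.row <- function(row)
{
  as.numeric(strsplit(trimws(row), '[[:space:]]+')[[1]])
}

# Hypothetical helper: return the comma-separated fields of the first header
# line that starts with '#<keyword>=', or character(0) if there is none.
gef.header.fields <- function(header.lines, keyword)
{
  hit <- grep(paste0('^#', keyword, '='), header.lines, value=TRUE)
  if(length(hit) == 0) return(character(0))
  fields <- sub('^#[^=]*=', '', hit[1])
  trimws(strsplit(fields, ',')[[1]])
}

# Made-up lines for illustration only (assumed layout, not real data)
header <- c('#XYID= 31000, 155000.5, 210000.25',
            '#ZID= 31000, 25.3')
row <- '  0.02   1.25  0.013'

xy <- as.numeric(gef.header.fields(header, 'XYID')[2:3])
surface <- as.numeric(gef.header.fields(header, 'ZID')[2])
values <- gef.split.row(row)
```

Splitting on a whitespace pattern avoids the element-by-element loop in remove.empty.strings, and returning character(0) for a missing keyword makes it easy to check whether a header such as PARENT is present at all.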