Moving From system() calls to Rcpp Interfaces
[This article was first published on rud.is » R, and kindly contributed to R-bloggers.]
Over on the Data Driven Security Blog there's a post on how to use Rcpp to interface with an external library (in this case, ldns for DNS lookups). It builds on another post, which uses system() to call dig to look up DNS TXT records.
The core code is below, and is also available in the aforementioned blog post and in this gist. The post walks you through creating a simple interface; a future post will cover how to build a full package interface to an external library.
export PKG_LIBS=`Rscript --vanilla -e 'Rcpp:::LdFlags()'`
export PKG_CPPFLAGS=`Rscript --vanilla -e 'Rcpp:::CxxFlags()'`
R CMD SHLIB -lldns txt.cpp
# yes, this (dyn.load) is all it takes to expose the function we
# just created to R. and, yes, it's a bit more complicated than
# that, but for now bask in the glow of simplicity
dyn.load("txt.so")

# this function should look more than vaguely familiar
# http://dds.ec/blog/posts/2014/Apr/firewall-busting-asn-lookups/
ip2asn <- function(ip="216.90.108.31") {
  orig <- ip
  # reverse the octets and append the lookup zone
  ip <- paste(paste(rev(unlist(strsplit(ip, "\\."))), sep="", collapse="."),
              ".origin.asn.cymru.com", sep="", collapse="")
  # in essence, we replaced the `system("dig ...")` call with this
  result <- .Call("txt", ip)
  # strip the quotes, then split on "|" and any spaces around it
  out <- unlist(strsplit(gsub("\"", "", result), " *\\| *"))
  return(list(ip=orig, asn=out[1], cidr=out[2], cn=out[3], registry=out[4]))
}
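The R wrapper above does two bits of string handling around the lookup: it reverses the octets to build the query name, and it splits the quoted, pipe-delimited TXT answer into fields. If you ever wanted that logic on the compiled side as well, a minimal sketch in plain C++ might look like the following. The helper names `to_origin_query` and `split_txt_fields` are made up for illustration, and the sample record in the usage note is invented (it uses documentation-reserved values), not a real lookup result.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: "216.90.108.31" -> "31.108.90.216.origin.asn.cymru.com"
std::string to_origin_query(const std::string& ip,
                            const std::string& zone = "origin.asn.cymru.com") {
    assert(!zone.empty());  // the zone suffix is required

    // split the address on dots
    std::vector<std::string> octets;
    std::istringstream in(ip);
    std::string part;
    while (std::getline(in, part, '.')) octets.push_back(part);

    // reverse the octets and join them back together in front of the zone
    std::reverse(octets.begin(), octets.end());
    std::string name;
    for (const std::string& o : octets) name += o + ".";
    return name + zone;
}

// Hypothetical helper: drop the double quotes, split on '|', and trim
// the spaces around each field.
std::vector<std::string> split_txt_fields(std::string txt) {
    txt.erase(std::remove(txt.begin(), txt.end(), '"'), txt.end());

    std::vector<std::string> fields;
    std::istringstream in(txt);
    std::string field;
    while (std::getline(in, field, '|')) {
        const std::string::size_type first = field.find_first_not_of(' ');
        const std::string::size_type last = field.find_last_not_of(' ');
        fields.push_back(first == std::string::npos
                             ? std::string()
                             : field.substr(first, last - first + 1));
    }
    return fields;
}
```

Calling `split_txt_fields("\"64496 | 192.0.2.0/24 | ZZ | example\"")` on that invented record gives four fields, in the same order the R function assigns to `asn`, `cidr`, `cn` and `registry`. Unlike the R regular expression, this version also trims spaces at the very start and end of the record.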
R:
  install.packages("Rcpp")
Linux:
  sudo apt-get install libldns-dev libbsd-dev
OS X:
  brew install ldns
Windows: o_O
// these three includes do a great deal of heavy lifting
// by making the necessary structures, functions and macros
// available to us for the rest of the code
#include <Rcpp.h>
#include <Rinternals.h>
#include <Rdefines.h>

// strlcpy() lives in libbsd on Linux
#ifdef __linux__
#include <bsd/string.h>
#endif

// REF: http://www.nlnetlabs.nl/projects/ldns/ for API info
#include <ldns/ldns.h>

// need this for 'wrap()' which *greatly* simplifies dealing
// with return values
using namespace Rcpp;

// the sole function that does all the work. it accepts an
// R character vector as input (even though we're only expecting
// one string to look up) and returns a character vector (one row
// of the DNS TXT records)
RcppExport SEXP txt(SEXP ipPointer) {

  ldns_resolver *res = NULL;
  ldns_rdf *domain = NULL;
  ldns_pkt *p = NULL;
  ldns_rr_list *txt = NULL;
  ldns_status s;
  ldns_rr *answer;

  // SEXP passes in an R vector, we need this as a C++ StringVector
  Rcpp::StringVector ip(ipPointer);

  // we only passed in one IP address
  domain = ldns_dname_new_frm_str(ip[0]);
  if (!domain) { return(R_NilValue) ; }

  s = ldns_resolver_new_frm_file(&res, NULL);
  if (s != LDNS_STATUS_OK) {
    ldns_rdf_deep_free(domain); // don't leak the name on failure
    return(R_NilValue) ;
  }

  p = ldns_resolver_query(res, domain, LDNS_RR_TYPE_TXT, LDNS_RR_CLASS_IN, LDNS_RD);

  ldns_rdf_deep_free(domain); // no longer needed

  if (!p) {
    ldns_resolver_deep_free(res);
    return(R_NilValue) ;
  }

  // get the TXT record(s)
  txt = ldns_pkt_rr_list_by_type(p, LDNS_RR_TYPE_TXT, LDNS_SECTION_ANSWER);
  if (!txt) {
    // no list was allocated, so only the packet and resolver need freeing
    ldns_pkt_free(p);
    ldns_resolver_deep_free(res);
    return(R_NilValue) ;
  }

  // get the first TXT record (could be more than one, but not for our IP->ASN)
  answer = ldns_rr_list_rr(txt, 0);

  // pop the rdata field off that record; it is ours to free from here on
  ldns_rdf *rd = ldns_rr_pop_rdf(answer) ;

  // get the character version (ldns_rdf2str returns a newly allocated copy)
  char *answer_str = ldns_rdf2str(rd) ;

  // A single TXT character-string maxes out at 255 chars, but for
  // this example the Team CYMRU ASN resolver TXT records should never
  // exceed 80 characters (from bulk analysis of large sets of IPs);
  // strlcpy truncates and NUL-terminates anything longer
  char ret[80] ;
  strlcpy(ret, answer_str, sizeof(ret)) ;

  Rcpp::StringVector result(1);
  result[0] = ret ;

  // clean up memory
  free(answer_str);
  ldns_rdf_deep_free(rd); // popped off the record, so not freed along with txt
  ldns_rr_list_deep_free(txt);
  ldns_pkt_free(p);
  ldns_resolver_deep_free(res);

  // return the TXT answer string which is ridiculously
  // simple even for wonkier structures thanks to `wrap()`
  return(wrap(result));

}
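One thing the function above has to get right by hand is freeing every ldns object on every exit path, including the early `return(R_NilValue)` ones. Since this is C++ anyway, one option (not what the original post does) is to let `std::unique_ptr` with a custom deleter handle the cleanup. The sketch below uses a made-up `FakeHandle` type and `fake_new()`/`fake_free()` functions as stand-ins so it compiles without ldns installed; in real code the deleters would presumably call the matching ldns free functions used above, such as `ldns_pkt_free` or `ldns_resolver_deep_free`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an ldns object such as a resolver or packet.
struct FakeHandle { int id; };

// Counts handles that were allocated and not yet freed.
int live_handles = 0;

FakeHandle* fake_new(int id) { ++live_handles; return new FakeHandle{id}; }
void fake_free(FakeHandle* h) { if (h) { --live_handles; delete h; } }

// Deleter type so std::unique_ptr calls fake_free() instead of delete.
struct FakeDeleter {
    void operator()(FakeHandle* h) const { fake_free(h); }
};
using HandlePtr = std::unique_ptr<FakeHandle, FakeDeleter>;

// Mimics the shape of txt(): acquire two resources, then maybe bail out early.
bool lookup(bool fail_midway) {
    HandlePtr resolver(fake_new(1));
    HandlePtr packet(fake_new(2));
    assert(resolver && packet);

    if (fail_midway) return false;  // both handles are released here

    return true;                    // and here as well
}
```

Whichever way `lookup()` returns, `live_handles` is back to zero afterwards, which is the property the hand-written cleanup code has to maintain line by line.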