Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Data on more than 10,000
species of ants recorded worldwide are available through from California Academy of Sciences' AntWeb, a repository that boasts a wealth of natural history data, digital images, and specimen records on ant species from a large community of museum curators.
Digging through some of the earliest announcements of AntWeb, I came across a Nature News piece titled "Mashups mix data into global service" from January 2006. The article contains this great quote from Roderic Page "If you could pool data from every museum or lab in the world, you could do amazing things". The article also says "So far, only researchers with advanced programming skills, working in fields organized enough to have data online and tagged appropriately, have been able to do this." In many ways this really is motivation for why we develop interfaces to these rich data repositories. Our express intent is to facilitate researchers explore amazing opportunities that lie within such data by lowering techinical barriers to use. Right on the heels of our most recent package (ecoengine
), we are now happy to first release of an interface to AntWeb.
A stable version of our R package AntWeb is now available from CRAN. The API currently does not require a key for access but larger requests will be throttled on the server side. It is worth noting that much of these same data are also ported through the Global Biodiversity Information Facility and accessible through our gbif
package. This package provides a more direct interface to more of the ant specific natural history data.
Installing the package
A stable version of the package (0.5
) is now available on CRAN.
install.packages("AntWeb")
or you can install the latest development version (the master branch is also always stable & deployable and most up-to-date. Current version is 0.5.3
at the time of this writing).
library(devtools) install_github("ropensci/AntWeb")
Searching through the database
As with most of our packages, there are several ways to search through an API. In the case of AntWeb, you can search by a genus or full species name or by other taxonomic ranks like sub-phylum.
Data on ants
To obtain data on any taxonomic group, you can make a request using the aw_data()
function. It's possible to search easily by a taxonomic rank (e.g. a genus) or by passing a complete scientific name.
Searching by Genus
library(AntWeb) # To get data on an ant genus found widely through Central and South America data_genus_only <- aw_data(genus = "acanthognathus") leaf_cutter_ants <- aw_data(genus = "acromyrmex") unique(leaf_cutter_ants$meta.species) #> [1] "(indet)" "alw01" "alw02" "alw03" #> [5] "alw04" "ambiguus" "aspersus" "asperus" #> [9] "balzani" "coronatus" "crassispinus" "disciger" #> [13] "echinatior" "evenkul" "fracticornis" "heyeri" #> [17] "hispidus" "hystrix" "indet" "landolti" #> [21] "laticeps" "lobicornis" "lundi" "lundii" #> [25] "moelleri" "muticinoda" "niger" "nigrosetosus" #> [29] "nobilis" "octospinosus" "pubescens" "pulvereus" #> [33] "rugosus" "santschii" "silvestrii" "striatus" #> [37] "subterraneus" "versicolor" "volcanus"
Searching by species
# You can request data on any particular species acanthognathus_df <- aw_data(scientific_name = "acanthognathus brevicornis") head(acanthognathus_df) #> code taxon_name tribe subfamily #> 1 casent0280684 myrmicinaeacanthognathus brevicornis dacetini myrmicinae #> 2 casent0637708 myrmicinaeacanthognathus brevicornis dacetonini myrmicinae #> genus species country localityname #> 1 acanthognathus brevicornis Colombia Las Naranjas near Josc Maria #> 2 acanthognathus brevicornis Peru Tambopata Research Center #> localitycode collectioncode biogeographicregion last_modified #> 1 Josc Maria ANTC19540 Neotropic 2014-02-18 12:57:40 #> 2 JTL060117 TRC-S06-R1C04 Neotropic 2014-02-17 16:00:34 #> ownedby collectedby caste access_group locatedat medium #> 1 BMNH, London, U. K. D. Jackson 1w 1 BMNH pin #> 2 <NA> D. Feener worker 2 JTLC dry mount #> access_login specimennotes created family #> 1 23 BMNH(E)1017559 2014-02-18 12:57:40 formicidae #> 2 2 <NA> 2014-02-17 16:00:34 formicidae #> datecollectedstart datecollectedstartstr kingdom_name phylum_name #> 1 1977-08-08 8 Aug 1977 animalia arthropoda #> 2 2001-11-01 1 Nov 2001 animalia arthropoda #> class_name order_name image_count adm1 decimal_latitude #> 1 insecta hymenoptera 5 <NA> <NA> #> 2 insecta hymenoptera 0 Madre de Dios -13.14142 #> decimal_longitude habitat method determinedby #> 1 <NA> <NA> <NA> <NA> #> 2 -69.623 Mixed terra firme forest winkler J. Longino #> elevation latlonmaxerror microhabitat datedetermined #> 1 <NA> <NA> <NA> <NA> #> 2 252 100m ex sifted leaf litter 2013-09-12 #> datedeterminedstr #> 1 <NA> #> 2 12 Sep 2013 # You can also limit queries to observation records that have been geoferenced acanthognathus_df_geo <- aw_data(genus = "acanthognathus", species = "brevicornis", georeferenced = TRUE)
It's also possible to search for records around any location by specifying a search radius.
data_by_loc <- aw_coords(coord = "37.76,-122.45", r = 2) # This will search for data on a 2 km radius around that latitude/longitude
Image data
Most specimens in the database have images associated with them. These include high, medium, and low resolution version of the head, dorsal side, full profile, and the specimen label. For example we can retrieve data on a specimen of Ecitoninaeeciton burchellii with the following call:
# Data and images for Ecitoninaeeciton burchellii eb <- aw_code("casent0003205") eb$image_data$high[[2]] #> [1] "http://www.antweb.org/images/casent0003205/casent0003205_h_1_high.jpg"
If you're primarily interested in ant images and would like to keep up with recent additions to the database, you can also use the aw_images
function. This function takes two arguments: since
, the number of days to search backward, and a type
. Possible options for type are h
for head, d
for dorsal, p
for profile, and l
for label. If a type is not specified, all available images are retrieved.
# Retrieve only dorsal images for the last five days aw_images(since = 5, type = "d")
It's also possible to retrieve unique lists of any taxonomic rank using the aw_unique
function.
subfamily_list <- aw_unique(rank = "subfamily") nrow(subfamily_list) #> [1] 69 head(subfamily_list) #> subfamily #> 1 (apidae) #> 2 (bethylidae) #> 3 (braconidae) #> 4 (cynipidae) #> 5 (diapriidae) #> 6 (diaspididae) genus_list <- aw_unique(rank = "genus") nrow(genus_list) #> [1] 470 head(genus_list) #> genus #> 1 (aenictinae) #> 2 (amblyoponinae) #> 3 (apidae) #> 4 (attini) #> 5 (basicerotini) #> 6 (bethylidae) species_list <- aw_unique(rank = "species") nrow(species_list) #> [1] 10480 head(species_list) #> species #> 1 (basicerotini) #> 2 (indet) #> 3 (indet.) #> 4 (orizabanum #> 5 abbreviata #> 6 abdelazizi
If you work with existing specimens, you can also query directly by a specimen ID.
asphinctanilloides_amazona <- aw_code(code = "casent0104669") # This will return a list with a metadata data.frame and a image data.frame
If you have a multiple specimen IDs, as is often the case when working with research data, you can get data on all of them at the same time. The function automatically retuns NULL
values when no data are found and you can have these removed using plyr::compact
(this happens automatically when you use a function call like ldply
.)
specimens <- c("casent0908629", "casent0908650", "casent0908637") results <- lapply(specimens, function(x) aw_code(x)) names(results) <- specimens length(results) #> [1] 3
Mapping ant specimen data
As with the previous ecoengine package, you can also visualize location data for any set of species. Adding georeferenced = TRUE
to a data retrieval call will filter out any data points without location information. Once retrieved the data are mapped with the open source Leaflet.js and pushed to your default browser. Maps and associated geoJSON
files are also saved to a location specified (or defaults to your /tmp
folder). This feature is only available on the development version on GitHub (0.5.2
or greater; see above on how to install) and will be available from CRAN in version 0.6
acd <- aw_data(genus = "acanthognathus") aw_map(acd)
Integration with the rest of our biodiversity suite
Our newest package on CRAN, spocc
(Species Occurrence Data), currently in review at CRAN, integrates AntWeb
among other sources. More details on spocc
in our next blog post.
As always please send suggestions, bug reports, and ideas related to the AntWeb R package directly to our repo.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.