taxize v0.3.0 update – a new data source, taxonomy in writing, and uBio examples
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
We just released v0.3
of taxize
. For details on the update, see the release notes.
Some new features
- New function
iplant_resolve()
to do name resolution using the iPlant name resolution service. Note, this is different from http://taxosaurus.org/ that is wrapped in thetnrs()
function. - New function
ipni_search()
to search for names in the International Plant Names Index (IPNI). See below for more. - New function
resolve()
that unifies name resolution services from iPlant's name resolution service (viaiplant_resolve()
), Taxosaurus' TNRS (viatnrs()
), and GNR's name resolution service (viagnr_resolve()
). - All
get_
functions now returning a new uri attribute that is a link to the taxon on the web. If NA is given back (e.g. nothing found), the uri attribute is blank. You can go directly to the uri in your default browser by doing, for example:browseURL(attr(result, "uri"))
.
Updating to v0.3
Since taxize
is not updated to v0.3
on CRAN yet at the time of writing this, install taxize
from GitHub:
devtools::install_github("ropensci/taxize")
Then load taxize
library("taxize")
International Plant Names Index (IPNI)
We added the IPNI as a new data source in taxize
in v0.3
. Currently, there is only one function to interact with IPNI: ipni_search()
. What follows are a few examples of how you can use ipni_search()
.
Search for the genus Brintonia
ipni_search(genus='Brintonia')[,c(1:3)] ## id version family ## 1 7996-1 1.3 Asteraceae ## 2 296073-2 1.3 Asteraceae ## 3 36551-2 1.3 Asteraceae ## 4 186337-1 1.3 Asteraceae
Search for the species Pinus contorta
head(ipni_search(genus='Pinus', species='contorta')[,c(1:3)]) ## id version family ## 1 262873-1 1.1.2.1.1.2 Pinaceae ## 2 262872-1 1.2.2.1.1.1 Pinaceae ## 3 30000492-2 1.1.2.1 Pinaceae ## 4 196950-2 1.4 Pinaceae ## 5 921291-1 1.4 Pinaceae ## 6 196949-2 1.5 Pinaceae
Different output formats (the default is minimal)
head(ipni_search(genus='Ceanothus')[,c(1:3)]) ## id version family ## 1 55268-3 1.1 Rhamnaceae ## 2 30006383-2 1.1.2.1.1.3 Rhamnaceae ## 3 55269-3 1.1 Rhamnaceae ## 4 33421-1 1.5 Rhamnaceae ## 5 60461578-2 1.1 Rhamnaceae ## 6 331948-2 1.4 Rhamnaceae head(ipni_search(genus='Ceanothus', output='extended'))[,c(1:3)] ## id version family ## 1 55269-3 1.1 Rhamnaceae ## 2 33421-1 1.5 Rhamnaceae ## 3 55268-3 1.1 Rhamnaceae ## 4 30006383-2 1.1.2.1.1.3 Rhamnaceae ## 5 60461578-2 1.1 Rhamnaceae ## 6 331948-2 1.4 Rhamnaceae
If you do something wrong, you get a message, and the actual output is NA
ipni_search(genus='Brintoniaasasf') ## Warning: No data found ## [1] NA
uBio examples
Until now, we have had functions to interact with uBio's API, but it probably hasn't been too clear how to use them, and they were a little buggy for sure. We have squashed many bugs in ubio functions. Here is an example workflow of how to use ubio functions.
ubio_search
Search uBio by taxonomic name. This is sort of the entry point for uBio where you can search by taxonomic name, from which you can get namebankID's that can be passed to the ubio_classification_search
and ubio_namebankID
functions
lapply(ubio_search(searchName = 'elephant'), head) ## $scientific ## namebankid namestring fullnamestring packageid packagename ## 1 6938660 Cerylon elephant Cerylon elephant 80 Cerylonidae ## basionymunit rankid rankname ## 1 6938660 24 species ## ## $vernacular ## namebankid namestring languagecode languagename packageid ## 1 8118711 Elephant fish 115 Creole, English 3 ## 2 8118714 Elephant fish 115 Creole, English 3 ## 3 8118726 Elephant fish 115 Creole, English 3 ## 4 8115700 Elephant fish 115 Creole, English 3 ## 5 8115663 Elephant fish 115 Creole, English 3 ## 6 8114377 Elephant fish 115 Creole, English 2463 ## packagename namebankidlink namestringlink ## 1 Pisces 132263 Mormyrus tapirus ## 2 Pisces 132258 Mormyrus tapirus ## 3 Pisces 181174 Mormyrus tapirus ## 4 Pisces 128971 Mormyrus tapirus ## 5 Pisces 128972 Mormyrus tapirus ## 6 Mormyridae 2299821 Mormyrus tapirus ## fullnamestringlink ## 1 Mormyrus tapirus Pappenheim, 1905 ## 2 Mormyrus tapirus Pappenheim, 1905 ## 3 Mormyrus tapirus Pappenheim, 1905 ## 4 Mormyrus tapirus Pappenheim, 1905 ## 5 Mormyrus tapirus Pappenheim, 1905 ## 6 Mormyrus tapirus Pappenheim, 1905 id <- ubio_search(searchName = 'elephant')$scientific$namebankid[1] ## Error: CHAR() can only be applied to a 'CHARSXP', not a 'NULL'
ubio_id
Get data on a specific uBio namebankID
. Use the id from the previous code block
ubio_id(namebankID = id) ## $data ## namebankid namestring fullnamestring packageid packagename ## 1 6938660 Cerylon elephant Cerylon elephant 80 Cerylonidae ## basionymunit rankid rankname ## 1 6938660 24 Species ## ## $synonyms ## NULL ## ## $vernaculars ## NULL ## ## $cites ## NULL ## ## $mappings ## NULL
ubio_classification_search
Return hierarchiesID
that refer to the given namebankID
ubio_classification_search(namebankID = 3070378) ## hierarchiesid classificationtitleid classificationtitle ## 1 2477072 84 NCBI Taxonomy ## 2 11166818 100 NCBI Taxonomy ## 3 17950600 104 uBiota 2008-03-20T10:36:50-04:00
ubio_classification
Return all ClassificationBank data pertaining to a particular hierarchiesID
ubio_classification(hierarchiesID = 2483153) ## Error: XML content does not seem to be XML: ''
ubio_synonyms
Search for taxonomic synonyms by hierarchiesID
ubio_synonyms(hierarchiesID = 4091702) ## Error: invalid subscript type 'list'
Examples of using taxize in writing
Let's say one is writing a paragraph in which you are using taxonomic or common names, and perhaps you want to have the number of taxa in a particular group. You can write a paragaph like:
I studied the common weed species _Tragopogon dubius_ (`r sci2comm('Tragopogon dubius', db='itis')[[1]][1]`; `r tax_name(query = "Tragopogon dubius", get = "family", db = "ncbi")[[1]]`) and _Cirsium arvense_ (`r sci2comm('Cirsium arvense', db='itis')[[1]][1]`; `r tax_name(query = "Cirsium arvense", get = "family", db = "ncbi")[[1]]`).
Which renders to:
I studied the common weed species Tragopogon dubius (yellow salsify; Asteraceae) and Cirsium arvense (Canada thistle; Asteraceae).
Notice how inside backticks you can execute code by starting with an r, then doing something like searching for common names for a taxon.
Another example:
We found that `r sci2comm('Tragopogon dubius', db='itis')[[1]][1]` was very invasive.
Renders to:
We found that yellow salsify was very invasive.
Another example:
There are `r nrow(downstream('Tragopogon', db = "col", downto = "Species")$Tragopogon)` species (source: Catalogue of Life) in the _Tragopogon_ genus, meaning there is much more to study :)
Renders to:
There are 142 species (source: Catalogue of Life) in the Tragopogon genus, meaning there is much more to study 🙂
el fin
Please do update to v0.3
, try it out, report bugs, and get back to us with any questions!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.