[This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers].
doctors prescribe medications for patients all the time. all the time. sometimes the results are beneficial, other times the drug has no discernible effect, but occasionally those substances actually cause harm. since the drug is already on the market, there needs to be a post-approval mechanism for detecting health hazards that might’ve slipped past the clinical trials. this is it. if a side-effect alarms a physician or patient enough, either party can make a (voluntary) submission to the fda or the manufacturer (who then must report that event). think of this as the central repository of skeletal xylophoning.
these public use files are the first in my experience to admit possessing yet fail to release a proper data dictionary. the steps to learn about their contents: (1) read the full faers homepage, not too long. (2) download and unzip one of the recent quarterly files by hand, for example 2012 quarter four. (3) read yes read the faqs.doc and readme.doc files included in that microdata file. once you’re convinced these have what you need, let the download and import automation do the rest. this new github repository contains two scripts:
download and import.R
- figure out all zipped files containing quarterly microdata for both laers (legacy) and faers
- loop through each available quarter, download and unzip onto the local disk
- import each dollar-sign-delimited text file into an r data.frame object, cleaning up as you go
- save each object as a fresh yet familiar rda file in a convenient pattern within the working directory
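the import-and-save steps above can be sketched in a few lines of base r. this is a minimal illustration, not the code in the repository: the helper name `read_dollar_delimited`, the sample columns, and the `2012 q4` folder layout are all made up for the example.

```r
# hypothetical helper (not from the repository): read one dollar-sign-delimited
# text file into a data.frame, keeping every column as character
read_dollar_delimited <- function(path) {
  x <- read.table(
    path,
    sep = "$",                 # fields are separated by dollar signs
    header = TRUE,
    quote = "",                # treat quote characters as ordinary text
    comment.char = "",         # treat '#' as ordinary text
    fill = TRUE,               # tolerate short rows
    stringsAsFactors = FALSE,
    colClasses = "character"
  )
  names(x) <- tolower(names(x))  # small clean-up: lowercase column names
  x
}

# invented sample file standing in for one unzipped quarterly table
tf <- tempfile(fileext = ".txt")
writeLines(
  c("ID$DRUG$OUTCOME",
    "1001$drug a$hospitalized",
    "1002$drug b$recovered"),
  tf
)

demo <- read_dollar_delimited(tf)

# save the object as an rda file in an assumed "<year> q<quarter>" folder
rda_path <- file.path(tempdir(), "2012 q4", "demo.rda")
dir.create(dirname(rda_path), recursive = TRUE, showWarnings = FALSE)
save(demo, file = rda_path)
```

reading everything as character avoids guessing column types before you have read the faqs.doc and readme.doc files; convert columns once you know what they hold.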
year stacks.R
- find each quarterly data file for both laers (legacy) and faers on the local disk and sort them by year
- stack all similar-system files into single-year files that nearly match the fda-published annual statistics. but not exactly. even though the individual quarterly files do match their control counts. can’t win ’em all.
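the stacking step might look something like the sketch below. it assumes every quarterly table of a given system shares the same columns and that the quarter names start with a four-digit year; the object names and the tiny tables are invented for illustration and are not taken from the repository.

```r
# invented quarterly tables, named "<year> q<quarter>" (an assumed convention)
quarters <- list(
  "2012 q3" = data.frame(id = c("1", "2"), drug = c("a", "b"), stringsAsFactors = FALSE),
  "2012 q4" = data.frame(id = "3", drug = "c", stringsAsFactors = FALSE),
  "2013 q1" = data.frame(id = c("4", "5"), drug = c("d", "e"), stringsAsFactors = FALSE)
)

# pull the year out of each name and group the quarters by it
yr <- substr(names(quarters), 1, 4)
by_year <- split(quarters, yr)

# stack each year's quarters into one data.frame
stacks <- lapply(by_year, function(qs) {
  x <- do.call(rbind, unname(qs))
  rownames(x) <- NULL
  x
})

# sanity check: each annual stack has as many rows as its quarters combined
quarter_rows <- tapply(sapply(quarters, nrow), yr, sum)
stack_rows <- sapply(stacks, nrow)
stopifnot(all(stack_rows == quarter_rows[names(stack_rows)]))
```

a row-count check like the last one only confirms that nothing was lost while stacking. it will not explain why the annual totals drift from the fda-published statistics.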
for more detail about the fda adverse event reporting system (faers), visit:
- the structured public labeling website, the basis of the data dictionary that they won’t share, whatever that means
- the national drug code directory, just in case you need to merge things with other things
- a stackexchange link with the openfda tag for your questions. they respond fast.
notes:
in pursuit of what’s hip and stylish, the fda has set up an api where users might query this database for up-to-the-minute case reports. but unless you’re setting up a bot to tweet adverse events as they happen or researching something that cannot wait for the quarterly file to be released – like google flu trends – the api seems too sexy for anyone other than right said fred. you probably ought to load the entire data set onto your computer and explore it on your own first.
confidential to sas, spss, stata, and sudaan users: heavy doses of those programs may cause statococcal infection. time to transition to r. 😀