Finding Economic Articles with Data and Specific Empirical Methods
Do you want to find reproducible empirical economic studies that use a particular method or concept, such as random forests or instrumental variable estimation? This is now even easier with my freshly updated Shiny-powered app “Find Economic Articles with Data”:
https://ejd.econ.mathematik.uni-ulm.de
The app lets you search among more than 5800 articles with data and code supplements from several top economic journals. The previous version already allowed searching the title and abstract for arbitrary phrases. While a research area like `climate change` or `financial crisis` can typically be detected well from the abstract, in applied papers the abstract only rarely provides information about the empirical methods used.
To improve the app, I counted the number of occurrences of special methodological phrases like `random forest` in the full texts of more than 5200 articles. Often several phrases are mapped to a single keyword. For example, the keyword `lab experiment` aggregates full-text occurrences of the phrases `laboratory experiment`, `laboratory study`, `lab experiment` and `experimental laboratory`.
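The phrase-to-keyword aggregation described above can be sketched in a few lines of Python. This is only an illustration and not the app's actual code: the mapping `KEYWORD_PHRASES` and the function `count_keywords` are hypothetical names, and the simple case-insensitive substring count is an assumption about how matching might work.

```python
from collections import Counter

# Hypothetical mapping from a keyword to the phrases it aggregates.
# The "lab experiment" entry follows the example given in the text.
KEYWORD_PHRASES = {
    "lab experiment": [
        "laboratory experiment",
        "laboratory study",
        "lab experiment",
        "experimental laboratory",
    ],
    "random forest": ["random forest"],
}


def count_keywords(full_text, keyword_phrases=KEYWORD_PHRASES):
    """Return a Counter mapping each keyword to its total phrase occurrences."""
    text = full_text.lower()  # assumption: matching ignores case
    counts = Counter()
    for keyword, phrases in keyword_phrases.items():
        # Sum the occurrences of all phrases that map to this keyword.
        n = sum(text.count(phrase) for phrase in phrases)
        if n > 0:
            counts[keyword] = n
    return counts


sample = (
    "We ran a laboratory experiment. The lab experiment was repeated "
    "in an experimental laboratory. A random forest predicts choices."
)
result = count_keywords(sample)
```

In this toy text, three different phrases all contribute to the single keyword `lab experiment`, which is the point of the aggregation.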
Here is a screenshot of a search result:
In that example, I search for `electricity`, which is not a special keyword, and for the method keyword `DID`, which indicates a difference-in-differences approach. The search results show for each article the detected method keywords and their number of occurrences in the full text. This gives a quick overview of an article’s methodology. A simple way to add such a keyword to your search query is to click on it in the search results. Alternatively, go to the `Help` panel for a list of all keywords.
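Conceptually, such a query combines a free-text match on title and abstract with a filter on the detected method keywords. The following Python sketch shows that idea on made-up records. The field names (`title`, `abstract`, `keyword_counts`), the function `search`, and the toy articles are all hypothetical and say nothing about how the app is implemented.

```python
def search(articles, phrase, method_keyword):
    """Return articles whose title or abstract contains `phrase` and whose
    full text matched `method_keyword` at least once."""
    phrase = phrase.lower()
    hits = []
    for article in articles:
        text = (article["title"] + " " + article["abstract"]).lower()
        has_phrase = phrase in text
        has_method = article["keyword_counts"].get(method_keyword, 0) > 0
        if has_phrase and has_method:
            hits.append(article)
    return hits


# Made-up example records, for illustration only.
toy_articles = [
    {"title": "Electricity prices and retail competition",
     "abstract": "We study a market reform.",
     "keyword_counts": {"DID": 12, "fixed effect": 8}},
    {"title": "Electricity demand forecasting",
     "abstract": "We compare forecasting models.",
     "keyword_counts": {"time series": 20}},
    {"title": "Minimum wages and employment",
     "abstract": "Evidence from a policy change.",
     "keyword_counts": {"DID": 30}},
]
found = search(toy_articles, "electricity", "DID")
found_titles = [a["title"] for a in found]
```

Only the first toy article satisfies both conditions: the second lacks the `DID` keyword and the third does not mention electricity.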
Note that a substantial share of data supplements unfortunately does not contain all data sets required to replicate the study, for example due to confidentiality agreements. This can typically be checked by looking at the README file of the data supplement, which I tried to link for most search results.
Here are the top 10 method keywords, ordered by the number of articles in which they appear:
Rank | Keyword | No. of Articles | Share | Matches per Article | Matched Phrases |
---|---|---|---|---|---|
1 | equilibrium | 3000 | 59.2% | 16.7 | equilibrium |
2 | fixed effect | 2940 | 58% | 12 | fixed-effect, fixed effect |
3 | IV | 1870 | 36.9% | 6.5 | instrumental variable, ` instrument ` |
4 | panel data | 1570 | 31% | 2.5 | panel data |
5 | time series | 1280 | 25.2% | 3.1 | time series |
6 | nonparametric | 1140 | 22.4% | 3.7 | nonparametric, non-parametric |
7 | field experiment | 1080 | 21.2% | 4.3 | field experiment |
8 | natural experiment | 1010 | 19.9% | 2 | natural experiment |
9 | DID | 1010 | 19.8% | 4.7 | difference-in-difference, DID, DiD, DD, difference in difference, differences-in-difference |
10 | bootstrap | 1000 | 19.7% | 4.9 | bootstrap |
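To make the table columns concrete, here is a small Python sketch of how such figures could be derived from per-article keyword counts. It assumes that "Share" is the fraction of all searched articles with at least one match and that "Matches per Article" averages only over articles with at least one match. The function `keyword_stats` and the toy counts are hypothetical.

```python
def keyword_stats(counts_per_article, keyword):
    """Return (number of articles, share, matches per article) for `keyword`.

    counts_per_article: one dict per article, mapping keyword -> occurrences.
    """
    # Keep only articles that mention the keyword at least once.
    hits = [c[keyword] for c in counts_per_article if c.get(keyword, 0) > 0]
    n_articles = len(hits)
    share = n_articles / len(counts_per_article)
    # Average over matching articles only, not over all articles.
    matches_per_article = sum(hits) / n_articles if n_articles else 0.0
    return n_articles, share, matches_per_article


# Made-up counts for four articles, for illustration only.
toy_counts = [
    {"equilibrium": 20, "IV": 3},
    {"equilibrium": 10},
    {"IV": 5},
    {},
]
stats = keyword_stats(toy_counts, "equilibrium")
```

In the toy data, `equilibrium` appears in 2 of 4 articles with 30 matches in total, so `stats` is `(2, 0.5, 15.0)`.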
While it is unclear how many economic processes are actually in some form of equilibrium, economists just love this expression. It appears at least once in roughly 60% of the (mostly empirical) articles. On average, `equilibrium` is mentioned more than 16 times in the articles that mention it at least once.
Close behind is the keyword `fixed effect`. Well, I guess many regressions just add some fixed effects as control variables.
Ranked third is `IV`, which matches `instrumental variable` or just ` instrument ` (with leading and trailing spaces). Even though the phrase ` instrument ` may sometimes be used in different contexts, the third rank reflects that economists really like the instrumental variable technique to identify causal effects.
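The effect of the surrounding spaces can be illustrated with a short Python sketch. It assumes plain case-insensitive substring matching, which may differ from what the app does, and the function name `count_iv_mentions` is hypothetical.

```python
def count_iv_mentions(text):
    """Count matches of the two phrases mapped to the IV keyword."""
    lowered = text.lower()
    # " instrument " with both spaces matches the standalone word only,
    # so "instruments" and "instrumental" do not trigger this phrase.
    return lowered.count("instrumental variable") + lowered.count(" instrument ")


example = (
    "We instrument for price using an instrumental variable. "
    "Monetary policy instruments differ across countries."
)
n_matches = count_iv_mentions(example)
```

Here the first sentence yields two matches, while `instruments` in the second sentence yields none. One consequence of this simple rule is that the word directly followed by punctuation, as in "a valid instrument.", would not be counted either.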
We then see that `panel data` seems a bit more popular than `time series` and that more than 20% of articles at least mention something `nonparametric`. Next, we have a tight race between `field experiment` and `natural experiment`, which are both mentioned in around 20% of articles. In the same ballpark, and likely with considerable overlap, are articles that mention `DID`, i.e. difference-in-differences as a method of causal identification. And finally, still more than 1000 articles refer to bootstrapping.
There are many more keywords than these top 10, e.g. covering areas like machine learning, the potential outcomes framework for causal identification, or macro-econometrics. Best to search for yourself…