30 Day Chart Challenge- Endangered Species

Louise E. Sinks

15 hours ago

[This article was first published on Louise E. Sinks, and kindly contributed to R-bloggers].


It is Day 4 of the #30DayChartChallenge. More info can be found at the challenge’s GitHub page. Today’s theme is history. But this is a subtheme of “comparisons”, so I’d like to avoid doing a simple time series.

I decided to look at the endangered species list the US Fish and Wildlife Service maintains. They have a bunch of data spread over multiple tables. I decided to look at the 5 year review data. A 5 year review is the assessment that decides whether a species remains listed or is delisted. The dataset also contains the year the species was first listed. So I’d like to compare how many species have been listed vs. delisted.

The key to the different listing types is found here.

library(tidyverse)
library(gt)
library(skimr)
library(waffle)

Today, I was going to load the data directly from the website. I’ve been downloading it and reading it in from a local folder, but I thought it would be nice to download directly. However, the data uses a “blob:” URL, which is not downloadable directly. There is a way around this, but then you have to process some JSON data. I’ll come back to this later, but for now, I’m just going to use a csv.

endangered_df <- read_csv("five_year.csv", show_col_types = FALSE)

endangered_df_sub <- endangered_df %>%
  select(name = `Common Name`, 
         status = `ESA Listing Status`, 
         date = `Listing Date`,
         rec = `5YSR Recommendation`)

Let’s see what kind of categories we have.

endangered_df_sub <- endangered_df_sub %>%
  mutate(status = factor(status), rec = factor(rec))

Skim this bad boy.

skim(endangered_df_sub) %>% gt()

skim_type	skim_variable	complete_rate	character.min	character.max	character.empty	character.n_unique	character.whitespace	factor.ordered	factor.n_unique	factor.top_counts	numeric.mean	numeric.sd	numeric.p0	numeric.p25	numeric.p50	numeric.p75	numeric.p100	numeric.hist
character	name	1	3	51	0	1159	0	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA
factor	status	1	NA	NA	NA	NA	NA	FALSE	8	E: 1173, T: 316, DM: 35, DNS: 3	NA	NA	NA	NA	NA	NA	NA	NA
factor	rec	1	NA	NA	NA	NA	NA	FALSE	7	No : 1389, Del: 49, Dow: 40, Del: 27	NA	NA	NA	NA	NA	NA	NA	NA
numeric	date	1	NA	NA	NA	NA	NA	NA	NA	NA	1993.5	12.4735	1967	1987	1993	1999	2017	▂▃▇▂▃

rec

summary(endangered_df_sub$rec)

                    Delist: The listed entity does not meet the statutory definition of a species 
                                                                                                8 
Delist: The species does not meet the definition of an endangered species or a threatened species 
                                                                                               49 
                                                                   Delist: The species is extinct 
                                                                                               27 
                                                                                    Downlist to T 
                                                                                               40 
                                                                              No change in Status 
                                                                                             1389 
                                                                        Revision of listed entity 
                                                                                                2 
                                                                                      Uplist to E 
                                                                                               18

The recommendations don’t always match the current status. I’m assuming the recommendations will be enacted/adopted eventually, so I am using them as the correct current status.

We have 7 levels in recommendations. We need to consolidate them. I’m going to combine “Delist: The listed entity does not meet the statutory definition of a species” and “Delist: The species does not meet the definition of an endangered species or a threatened species” into a level called delisted. The delisting because the species is extinct will be made into a level called extinct later.

endangered_df_sub <- endangered_df_sub %>%
  mutate(condensed = fct_collapse(rec, delisted = c("Delist: The listed entity does not meet the statutory definition of a species",
    "Delist: The species does not meet the definition of an endangered species or a threatened species")
  ))
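
As a small, self-contained illustration of what fct_collapse() does in the step above, the sketch below applies the same idea to a made-up factor. The vector toy_rec and its shortened level names are invented for the example and are not taken from the dataset.

```r
library(forcats)

# Toy recommendation factor (made-up values, not the real data)
toy_rec <- factor(c(
  "Delist: not a species",
  "Delist: recovered",
  "No change in Status",
  "Delist: recovered"
))

# Merge the two "Delist" levels into one level called "delisted";
# levels that are not mentioned are left untouched
toy_condensed <- fct_collapse(
  toy_rec,
  delisted = c("Delist: not a species", "Delist: recovered")
)

levels(toy_condensed)
table(toy_condensed)
```

The number of observations does not change; only the labels attached to them do.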

I’m going to count both “Downlist to T” (threatened) and “Uplist to E” (endangered) as endangered. I don’t know the original listing level, so it doesn’t make too much difference to me.

endangered_df_sub <- endangered_df_sub %>%
  mutate(condensed = fct_collapse(condensed, endangered = c("Downlist to T",
    "Uplist to E")  ))

Now, I’m pulling in the status for the entries that have “No change in Status” as the recommendation. I’m using a case_when and listing every combination. I could get this done in fewer lines if I used or statements (E or T is endangered), but I left it more granular in case I wanted to come back and change the levels. Maybe later I will care about the difference between threatened and endangered and want to break them out separately.

endangered_df_sub <- endangered_df_sub %>%
  mutate(condensed = case_when(
    condensed == "No change in Status" & status == "E" ~ "endangered",
    condensed == "No change in Status" & status == "T" ~ "endangered",
    condensed == "No change in Status" & status == "RT" ~ "delisted",
    condensed == "No change in Status" & status == "D3A" ~ "extinct",
    condensed == "No change in Status" & status == "DM" ~ "delisted",
    condensed == "No change in Status" & status == "DP" ~ "delisted",
    condensed == "No change in Status" & status == "DR" ~ "delisted",
    condensed == "No change in Status" & status == "DNS" ~ "delisted",
    # as.character() keeps every branch the same type (condensed is a factor)
    condensed != "No change in Status" ~ as.character(condensed))
    )
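
The shorter version mentioned above could look roughly like the sketch below, which groups the status codes with %in% instead of listing each one on its own line. It runs on a made-up tibble (toy_df) and assumes the same code-to-category mapping as the block above; the names toy_df and no_change are invented for the example.

```r
library(dplyr)
library(tibble)

# Made-up rows for illustration only
toy_df <- tribble(
  ~condensed,             ~status,
  "No change in Status",  "E",
  "No change in Status",  "T",
  "No change in Status",  "DM",
  "No change in Status",  "D3A",
  "delisted",             "E"
)

no_change <- "No change in Status"

toy_df <- toy_df %>%
  mutate(condensed = case_when(
    condensed == no_change & status %in% c("E", "T") ~ "endangered",
    condensed == no_change & status == "D3A" ~ "extinct",
    condensed == no_change &
      status %in% c("RT", "DM", "DP", "DR", "DNS") ~ "delisted",
    # Any other recommendation is kept as it is
    condensed != no_change ~ as.character(condensed)
  ))

toy_df
```

The trade-off is the one described above: fewer lines, but splitting threatened from endangered later means breaking the grouped conditions apart again.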

Now I’m going to group my extincts.

endangered_df_sub <- endangered_df_sub %>%
  mutate(condensed = 
           fct_collapse(condensed, extinct = 
                          c("Delist: The species is extinct", "extinct")))

I’m not sure what “Revision of listed entity” means. I’m going to see if there are comments back in the full dataset.

endangered_df %>% 
  filter(`5YSR Recommendation` == "Revision of listed entity") %>% gt()

Scientific Name	Common Name	Where Listed	ESA Listing Status	Lead Region	Listing Date	Most Recently Completed 5YSR	5YSR Recommendation	Notice of In Progress 5YSR	Notice Date of In Progress 5YSR	Group
Rangifer tarandus ssp. caribou	Caribou DPS, Southern Mountain	<div>Southern Mountain DPS</div>	E	1	1983	2019-10-02	Revision of listed entity	No Five Year Review In Progress	NA	Mammals
Cereus eriophorus var. fragrans	Prickly-apple, fragrant	<div></div>	E	4	1985	2021-10-19	Revision of listed entity	No Five Year Review In Progress	NA	Flowering Plants

I’m not seeing any explanation. There is not an entry in the code key either.

Okay, now for a visualization. This actually seems perfect for a waffle. I’ve had bad luck with the waffle package, but I know how to make it output something now. So, I will try waffling again. I did try a different package (ggwaffle) that also doesn’t work. It does let you use a dataframe, but it also doesn’t handle large numbers well. It does let you downsample the data if the numbers are too large, but I’d rather just process the data myself to make it waffle.

So, first I need to summarize the data to get the counts per class.

progress <- endangered_df_sub %>%
  count(condensed)

progress %>% 
  gt() %>%
  cols_label(condensed = "Status", n = "Number of species") %>%
  opt_stylize(style = 6, color = "blue", add_row_striping = TRUE) %>%
  tab_header(title = "Progress of Endangered/Threatened species")

Progress of Endangered/Threatened species
Status	Number of species
extinct	27
delisted	64
endangered	1440
Revision of listed entity	2

Now let’s change to percentages for optimal waffling.

num_species <- nrow(endangered_df_sub)
progress_percent <- progress %>%
  mutate(n = ( n/num_species) * 100)

progress_percent <- progress_percent %>%
  mutate(n = round(n,1))

gt(progress_percent) %>%
  cols_label(condensed = "Status", n = "% of species") %>%
  opt_stylize(style = 6, color = "blue", add_row_striping = TRUE) %>%
  tab_header(title = "Progress of Endangered/Threatened species")

Progress of Endangered/Threatened species
Status	% of species
extinct	1.8
delisted	4.2
endangered	93.9
Revision of listed entity	0.1

#Values below 1 won't show in a waffle graph anyway, so remove them.
progress_percent <- progress_percent %>%
  filter(n >= 1)

The waffle package won’t work with dataframes for me, so make it a vector.

progress_vec <- deframe(progress_percent)
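
For readers who have not used deframe() before: it comes from the tibble package and turns a two-column data frame into a named vector, taking the names from the first column and the values from the second. A toy example with made-up numbers (toy_tbl and toy_vec are invented names):

```r
library(tibble)

# Made-up two-column tibble: first column becomes names, second the values
toy_tbl <- tibble(
  category = c("a", "b", "c"),
  n = c(10, 20, 70)
)

toy_vec <- deframe(toy_tbl)
toy_vec
```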

waffle::waffle(progress_vec, colors = c("black", "darkgreen", "darkred"),
               title = "How has the US done with our Endangered species?",
               xlab = "1 square = 1%")
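
One thing to keep in mind is that the rounded percentages in the table above are not whole numbers, and they add up to 99.9 rather than 100 once the smallest category is dropped. If whole squares that exactly fill a 100-square grid are wanted, one option (not what was done above) is largest-remainder rounding. A minimal sketch in base R; the helper name round_to_total is made up for this example:

```r
# Hypothetical helper: round values to whole numbers that sum to `total`
round_to_total <- function(x, total = 100) {
  scaled <- x / sum(x) * total            # rescale so the values sum to `total`
  base <- floor(scaled)                   # whole part of each value
  leftover <- round(total - sum(base))    # squares still to hand out
  # Give the leftover squares to the largest fractional parts
  extra <- order(scaled - base, decreasing = TRUE)[seq_len(leftover)]
  base[extra] <- base[extra] + 1
  base
}

# Percentages from the table above
pct <- c(extinct = 1.8, delisted = 4.2, endangered = 93.9)
squares <- round_to_total(pct)
squares
sum(squares)
```

The names on the vector are preserved, so the result could be passed along in place of progress_vec.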


Citation

BibTeX citation:

@online{e.sinks2023,
  author = {Louise E. Sinks},
  title = {30 {Day} {Chart} {Challenge-} {Endangered} {Species}},
  date = {2023-04-04},
  url = {https://lsinks.github.io/posts/2023-04-04-chart-challenge-4/day4},
  langid = {en}
}

For attribution, please cite this work as:

Louise E. Sinks. 2023. “30 Day Chart Challenge- Endangered Species.” April 4, 2023. https://lsinks.github.io/posts/2023-04-04-chart-challenge-4/day4.
