Journalism: A Highly Dangerous Job Around the World
Motivation
What comes to mind when you think about the most dangerous jobs in the world? The other day I was talking to a friend about high-risk professions. She told me journalism was a risky one. At first I laughed, since she's a journalist. However, after she gave me some supporting evidence, I decided to do a little research. Eventually I found the Committee to Protect Journalists (CPJ), which maintains a good dataset recording injured, imprisoned and killed journalists and media workers around the globe. I realized that many journalists are killed every year while covering everything from business and sports to revolutions, wars, political upheavals, elections, corruption and human rights violations.
Data
The CPJ began documenting the deaths of media workers in 1992, as a way to draw attention to and recognize the vital role these individuals play in newsgathering. CPJ defines journalists as people who cover news or comment on public affairs through any medium, including print, photographs, radio, television, and online. The database covers cases involving staff journalists, freelancers, stringers, bloggers, and citizen journalists. The clean spreadsheet I'm using in this analysis can be accessed from my Github data repo. A more up-to-date version can be downloaded here.
Results
Reported deaths over time
The number of reported deaths was 55 in 1992 and 74 in 2017. The highest yearly count was recorded in 2007.
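The yearly figures above come from a simple count per year. A minimal sketch of that tabulation, assuming the data has a `Year` column (the column name is a guess, and the toy rows are made up purely for illustration, not taken from the CPJ data):

```r
library(dplyr)

# Toy stand-in for the CPJ data; the `Year` column name is an assumption
toy <- data.frame(Year = c(1992, 1992, 2007, 2007, 2007, 2017))

# One row per year with the number of reported deaths
deaths_by_year <- toy %>%
  dplyr::filter(!is.na(Year)) %>%
  dplyr::count(Year, name = "deaths")

# Year(s) with the highest count
peak <- deaths_by_year %>% dplyr::filter(deaths == max(deaths))
```

The resulting `deaths_by_year` table can be passed to `ggplot()` with `geom_line()` or `geom_bar(stat = "identity")` to draw the trend.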

The sex of journalists
Only a small fraction of the journalists killed are female (7.1%). As we don't know the female/male balance in the profession across countries, this statistic is difficult to evaluate. It could be that male journalists are more likely to be sent to risky zones.
jornalists %>% dplyr::filter(!is.na(Sex))%>%
    dplyr::count(Sex) %>%
    dplyr::mutate(Freq = n / sum(n, na.rm=TRUE)) %>%
ggplot(aes(x = Sex, y=Freq)) + geom_bar(stat = "identity") + 
theme_flex()
The nationality
jornalists %>% dplyr::filter(!is.na(Nationality)) %>%
  dplyr::count(Nationality) %>%
  dplyr::mutate(Freq = n / sum(n, na.rm=TRUE)) %>%
  dplyr::arrange(desc(n))
## # A tibble: 95 x 3
##    Nationality     n   Freq
##    <chr>       <int>  <dbl>
##  1 Iraq          161 0.140 
##  2 Syria         105 0.0910
##  3 Philippines    80 0.0693
##  4 Pakistan       59 0.0511
##  5 Algeria        58 0.0503
##  6 Russia         54 0.0468
##  7 Somalia        53 0.0459
##  8 Colombia       49 0.0425
##  9 India          46 0.0399
## 10 Mexico         42 0.0364
## # ... with 85 more rows
The organization
As we can see from the table below, about 11% (0.112) of the reported deaths refer to people working as freelance journalists. Al-Arabiya television comes second, but very far behind the freelancers.
# Count organizations before recoding, for comparison
tab1 = jornalists %>% dplyr::count(Organization)

# Collapse the spelling variants of "freelance" into a single label
jornalists$Organization <- ifelse(grepl(pattern = "freelance", jornalists$Organization, perl = FALSE), "Freelance", jornalists$Organization)
jornalists$Organization <- ifelse(grepl(pattern = "Freelancer", jornalists$Organization, perl = FALSE), "Freelance", jornalists$Organization)

# dplyr-style alternative (not run):
# jornalists <- jornalists %>% dplyr::mutate(Organization = ifelse(grepl("freelan", Organization), "Freelance", Organization))

# Count organizations again after recoding
tab2 = jornalists %>% dplyr::count(Organization)
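The two `grepl()` calls above could also be folded into one case-insensitive match. A small sketch on a made-up vector (the labels are illustrative, not taken from the dataset); note that `ignore.case = TRUE` is slightly broader than the original patterns, so it is a variation rather than an exact equivalent:

```r
# Made-up organization labels, including NA, to show the behaviour
org <- c("freelance", "Freelancer", "Reuters", "BBC, freelance", NA)

# ignore.case = TRUE catches both "freelance" and "Freelancer";
# grepl() returns FALSE for NA, so missing values stay NA
org_clean <- ifelse(grepl("freelanc", org, ignore.case = TRUE), "Freelance", org)
```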
jornalists %>% dplyr::filter(!is.na(Organization)) %>%
  dplyr::count(Organization) %>%
  dplyr::mutate(Freq = n / sum(n, na.rm=TRUE)) %>%
  dplyr::arrange(desc(n))
## # A tibble: 1,271 x 3
##    Organization                  n    Freq
##    <chr>                     <int>   <dbl>
##  1 Freelance                   211 0.112  
##  2 Al-Arabiya                   15 0.00797
##  3 Reuters                      14 0.00744
##  4 Al-Shaabiya                  12 0.00638
##  5 Al-Iraqiya                   11 0.00584
##  6 Baghdad TV                    9 0.00478
##  7 Al-Jazeera                    8 0.00425
##  8 Al-Sharqiya                   8 0.00425
##  9 Algerian State Television     8 0.00425
## 10 BBC                           8 0.00425
## # ... with 1,261 more rows
Medium
What is the most frequent medium of the killed journalists?
jornalists$Medium <- ifelse(grepl(pattern = "Print,Internet", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists$Medium <- ifelse(grepl(pattern = "Television,Internet", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists$Medium <- ifelse(grepl(pattern = "Radio,Television", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists$Medium <- ifelse(grepl(pattern = "Print,Television", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists$Medium <- ifelse(grepl(pattern = "Print,Radio", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists$Medium <- ifelse(grepl(pattern = "Radio,Internet", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists$Medium <- ifelse(grepl(pattern = "Internet,Television", jornalists$Medium, perl = FALSE), "Several", jornalists$Medium)
jornalists %>% dplyr::filter(!is.na(Medium)) %>%
  dplyr::count(Medium) %>%
     dplyr::mutate(Freq = n / sum(n, na.rm=TRUE)) %>%
  dplyr::arrange(desc(n)) %>%
ggplot(aes(x = reorder(Medium, -n), y=n))  + geom_bar(stat = "identity") + 
labs(y = "Number of deaths ", x = "Medium Type", title= "Reported deaths by medium type") +
theme_flex()
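The seven recoding lines above all do the same thing with a different pattern, so they could be collapsed into a single regular expression with alternation. A sketch on a made-up vector (the values are for illustration only):

```r
# Made-up medium labels, including NA, for illustration
medium <- c("Print", "Print,Internet", "Radio,Television", "Television", NA)

# The seven combinations recoded above, joined into one alternation pattern
combos <- c("Print,Internet", "Television,Internet", "Radio,Television",
            "Print,Television", "Print,Radio", "Radio,Internet",
            "Internet,Television")
pattern <- paste(combos, collapse = "|")

# Any value containing one of the combinations becomes "Several";
# single-medium values and NA are left unchanged
medium_clean <- ifelse(grepl(pattern, medium), "Several", medium)
```

A shorter rule such as `grepl(",", medium)` would flag every multi-medium value, but it would also recode combinations that are not in the list above, so it changes the result rather than just the notation.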
Job
The Most Dangerous Job in Journalism Is Just Being a Reporter in
jornalists %>% dplyr::filter(!is.na(Job)) %>%
  dplyr::count(Job) %>%
    dplyr::mutate(Freq = n / sum(n, na.rm=TRUE)) %>%
ggplot(aes(x = Job, y=n))  + geom_bar(stat = "identity") + 
theme_flex()
The type of coverage/episode
jornalists %>% dplyr::filter(!is.na(Coverage)) %>%
 dplyr::count(Coverage) %>%
    dplyr::mutate(freq = n / sum(n, na.rm=TRUE)) %>%
ggplot(aes(x = Coverage, y=n)) + geom_bar(stat = "identity") + 
theme_flex()
The type of death
CPJ applies strict journalistic standards when investigating a death. One important aspect is determining whether a death was work-related or not. A case is considered "confirmed" only if CPJ is reasonably certain that a journalist was murdered in direct reprisal for his or her work; was killed in crossfire during combat situations; or was killed while carrying out a dangerous assignment such as coverage of a street protest. Journalists who are killed in accidents such as car or plane crashes are therefore not included in the dataset. However, when the motive is unclear but it is possible that a journalist was killed because of his or her work, CPJ classifies the case as "unconfirmed" and the investigation can continue.
jornalists %>% dplyr::filter(!is.na(`Type of Death`)) %>%
  dplyr::count(`Type of Death`) %>%
    dplyr::mutate(freq = n / sum(n, na.rm=TRUE)) %>% 
  dplyr::arrange(desc(n)) %>%
    ungroup() %>%
ggplot(aes(x =  reorder(`Type of Death`,-n), y=n)) + geom_bar(stat = "identity") + 
labs(y = "Number of deaths ", x = "Type of Death", title= "Reported deaths by type") +
theme_flex()
The country that kills the most
jornalists %>% dplyr::filter(!is.na(`Country Killed`)) %>%
  dplyr::count(`Country Killed`) %>%
    dplyr::mutate(freq = n / sum(n, na.rm=TRUE)) %>% 
  dplyr::arrange(desc(n))
## # A tibble: 108 x 3
##    `Country Killed`     n   freq
##    <chr>            <int>  <dbl>
##  1 Iraq               278 0.148 
##  2 Philippines        138 0.0733
##  3 Syria              131 0.0696
##  4 Mexico              99 0.0526
##  5 Pakistan            89 0.0473
##  6 Colombia            85 0.0452
##  7 Russia              82 0.0436
##  8 India               75 0.0399
##  9 Somalia             71 0.0377
## 10 Algeria             61 0.0324
## # ... with 98 more rows
The source of fire
jornalists %>% dplyr::filter(!is.na(`Source of Fire`)) %>%
  dplyr::count(`Source of Fire`) %>%
    dplyr::mutate(freq = n / sum(n, na.rm=TRUE)) %>% 
  dplyr::arrange(desc(n))
## # A tibble: 23 x 3
##    `Source of Fire`                         n    freq
##    <chr>                                <int>   <dbl>
##  1 Political Group                        412 0.322  
##  2 Military Officials                     231 0.180  
##  3 Unknown Fire                           210 0.164  
##  4 Government Officials                   189 0.148  
##  5 Criminal Group                         115 0.0898 
##  6 Paramilitary Group                      58 0.0453 
##  7 Local Residents                         26 0.0203 
##  8 Mob Violence                            14 0.0109 
##  9 Military Officials, Political Group      5 0.00391
## 10 Criminal Group, Government Officials     4 0.00312
## # ... with 13 more rows
For additional information, including the list of the journalists killed in 2012, visit:
In 2012 alone, 103 journalists were killed around the globe. Motives were confirmed for 70 of them. The deadliest countries for journalists in 2012 were Syria (28 deaths), Somalia (12 deaths), Pakistan (7 deaths), and Brazil (4 deaths). The motives were confirmed in these cases.
The ways journalists are killed range from crossfire or combat to murder. Impunity is a shocking 100% for murder cases. More detail in the chart below:
