Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Motivation
The United States has held the title of the world’s biggest economy for a long time. However, China’s economy has been growing at a pace that threatens that leadership. The Gross Domestic Product (GDP) of the US was worth 18.57 trillion dollars in 2016, while China’s GDP was worth 11.2 trillion dollars in the same period.
Recently, the Trump Administration placed tariffs on Chinese products such as flat-screen televisions and medical devices. China responded by placing tariffs on products such as soybeans and pork. Exports and imports of these products can have a direct impact on GDP.
This project is designed to analyze and visualize commodity exports and imports around the world, with a special focus on China and the United States.
Data
The dataset is from the United Nations Statistics Division and covers import and export trade values in USD for 5,000 commodities across most countries on Earth over the last 30 years. The file is 1.25 GB, so a preprocessing step was necessary to reduce its size.
My first attempt was to save the data as an R image, which reduced the file size to less than 100 MB. However, loading the data from that file took too long to be practical. The next logical attempt was to migrate to a SQL database. However, after creating the table, the file size was still greater than 1 GB and required additional manipulation. The following steps were taken:
- Drop unused columns
- Normalize the database
- Tune the database
The first step is straightforward, since there is no need to store values that are never used. The second step was necessary because two text columns (commodity and category) held values that were repeated throughout the dataset. Because text usually takes more storage space than an integer, two new domain tables were built to store the text values, and the main table references them through the unique identifiers created for each value. For the last step, primary and foreign keys were created to improve query performance. The figure below shows the difference between the original and final database version:
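The normalization step described above can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made for the example, not the actual schema or data used in the project.

```python
import sqlite3

# Hypothetical raw rows: (country, year, commodity, category, flow, trade_usd).
# All values are made up for illustration only.
raw_rows = [
    ("China", 2016, "Soya beans", "oil_seeds", "Import", 100.0),
    ("China", 2016, "Swine meat", "meat", "Import", 50.0),
    ("USA", 2016, "Soya beans", "oil_seeds", "Export", 80.0),
    ("USA", 2016, "Swine meat", "meat", "Export", 40.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Domain tables: each distinct text value is stored exactly once.
CREATE TABLE category  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE commodity (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
-- Main table: references the domain tables by integer id.
CREATE TABLE trade (
    id           INTEGER PRIMARY KEY,
    country      TEXT    NOT NULL,
    year         INTEGER NOT NULL,
    commodity_id INTEGER NOT NULL REFERENCES commodity(id),
    category_id  INTEGER NOT NULL REFERENCES category(id),
    flow         TEXT    NOT NULL,
    trade_usd    REAL    NOT NULL
);
-- Assumed index for the kind of per-country, per-year lookups the app makes.
CREATE INDEX idx_trade_country_year ON trade (country, year);
""")


def lookup_id(table, name):
    """Insert name into a domain table if it is missing and return its id."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]


for country, year, commodity, category, flow, usd in raw_rows:
    conn.execute(
        "INSERT INTO trade (country, year, commodity_id, category_id, flow, trade_usd) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (country, year, lookup_id("commodity", commodity),
         lookup_id("category", category), flow, usd),
    )
conn.commit()

n_trade = conn.execute("SELECT COUNT(*) FROM trade").fetchone()[0]
n_commodity = conn.execute("SELECT COUNT(*) FROM commodity").fetchone()[0]
n_category = conn.execute("SELECT COUNT(*) FROM category").fetchone()[0]

# The original text values are recovered with a join when needed.
soy_rows = conn.execute(
    "SELECT t.country, t.flow, t.trade_usd FROM trade t "
    "JOIN commodity c ON c.id = t.commodity_id "
    "WHERE c.name = ? ORDER BY t.country",
    ("Soya beans",),
).fetchall()
```

With this layout, the four trade rows share only two commodity names and two category names, so each repeated string is stored once and the main table carries small integers instead.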
Application
This application has multiple tabs, each offering a different way to compare commodities.
The Bar Graph tab lets us choose a country and see which commodities contribute most to its total exports and imports. For China, the total export trade amount was more than US$ 750 billion. The top 10 categories, and the share of the total that those ten categories account for, are highlighted below.
The Map section, in contrast to the first tab, lets you compare countries' trade values for a specific commodity. The map below was generated by calculating the balance (exports minus imports) with all commodities taken into consideration. We can see that China and the US are polar opposites in terms of trade values. China shows a large positive trade balance, while the US shows a large negative one.
The last visualization option is a bubble chart. The graph lets a user see the behavior of two commodities over time by selecting more than one country and the flow (Export or Import). Some insights can be extracted from the graphs below. We see that within ten years China has exceeded the US in exports. Also, in 2016, China was still leading overall commodity exports. One curious detail shown in this graph is that in 2009 all the countries exported a lower trade value. This was probably a result of the global financial crisis of 2008-2009, which likely reduced the quantity of products exported or the value of each commodity.
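The calculation behind a view like this can be sketched as below: sum the trade value per category, rank the categories, and express each as a percentage of the grand total. The function name and the sample rows are hypothetical and the numbers are made up; this is an illustration of the idea, not the app's code.

```python
from collections import defaultdict

# Hypothetical export rows for one country: (category, trade_usd).
# Values are made up for illustration only.
rows = [
    ("electronics", 300.0),
    ("machinery", 200.0),
    ("textiles", 100.0),
    ("electronics", 100.0),
    ("toys", 50.0),
    ("furniture", 50.0),
]


def top_categories(rows, n=10):
    """Return ([(category, total_usd, percent_of_total)], grand_total) for the top n."""
    totals = defaultdict(float)
    for category, usd in rows:
        totals[category] += usd
    grand_total = sum(totals.values())
    if grand_total == 0:
        return [], 0.0
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
    return [(cat, usd, 100.0 * usd / grand_total) for cat, usd in ranked], grand_total


top, total = top_categories(rows, n=2)
for category, usd, share in top:
    print(f"{category}: {usd:.0f} USD ({share:.1f}% of total)")
```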
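As a rough sketch of the balance calculation, the snippet below adds export values and subtracts import values per country. The flow labels "Export" and "Import" follow the text; the function name and the sample figures are assumptions for illustration, and any other flow type is simply ignored here.

```python
from collections import defaultdict

# Hypothetical rows: (country, flow, trade_usd). Values are made up.
rows = [
    ("China", "Export", 500.0),
    ("China", "Import", 300.0),
    ("USA", "Export", 400.0),
    ("USA", "Import", 650.0),
]


def trade_balance(rows):
    """Return {country: exports minus imports}, ignoring any other flow type."""
    balance = defaultdict(float)
    for country, flow, usd in rows:
        if flow == "Export":
            balance[country] += usd
        elif flow == "Import":
            balance[country] -= usd
    return dict(balance)


balances = trade_balance(rows)
```

A positive value means the country exports more than it imports, which is the quantity the map colors encode.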
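One way to make a dip like the one in 2009 explicit is to compute the year-over-year percentage change of a country's total exports. The sketch below assumes a yearly series with no missing years; the function name is hypothetical and the numbers are invented for illustration, not taken from the dataset.

```python
def yoy_change(series):
    """Map each year to its percent change versus the previous year in the series."""
    years = sorted(series)
    return {
        year: 100.0 * (series[year] - series[prev]) / series[prev]
        for prev, year in zip(years, years[1:])
        if series[prev] != 0
    }


# Hypothetical yearly export totals (arbitrary units), made up for illustration.
exports_by_year = {2007: 100.0, 2008: 120.0, 2009: 90.0, 2010: 108.0}
changes = yoy_change(exports_by_year)
drop_years = [year for year, pct in changes.items() if pct < 0]
```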
Conclusion
We can clearly see that China is winning the trade war in commodities. China has a positive trade balance, in contrast to the US, and a positive balance contributes to the gross domestic product calculation. However, trade balance is only one component of an estimate of a country’s economy.
The app is generic, so the same analysis can be done for other countries as well.
References:
https://www.nytimes.com/2018/04/03/us/politics/white-house-chinese-imports-tariffs.html
https://www.nytimes.com/2018/04/05/business/trump-trade-war-china.html