Venture Capital Deals in 2016 – An Overview (1/2)

Salvino

5 years ago

[This article was first published on English – R-blog, and kindly contributed to R-bloggers].

Over the year 2016, I manually gathered press clippings announcing Venture Capital (VC) deals from various public online sources and newsletters, each time I bumped into something that caught my attention. In early January, I put the data together, cleaned it and made it R-usable as a csv dataset of 1,720 different deals. The dataset covers VC deals that took place in 50 different countries, for a total value of approx. 22.3b US$ over 10 different funding rounds (from angel to series F). However, the dataset should not be considered a representative sample (in the statistical sense) that accurately reflects the universe of 2016 VC deals. With that caveat clearly stated, I nonetheless found it interesting to dig a bit deeper into the data, at least to get some flavor of VC deals during last year. As expected, this flavor reflects what captured my attention more than anything else.

1. Investments

1.1. Overview

A quick visualisation of the data shows 3 important points:

More than 40% of all deals in the dataset have a value lower than 7.5m US$. Actually, based on my experience in the investment industry, I am personally convinced that the value of the vast majority of deals is lower than 0.5m US$, but smaller deals perhaps do not attract the media's attention and would have to be found through one of the many ad hoc monitoring services available online for a (high) fee rather than through my own cherry-picking approach.
In all industries but pharma, there are many more small deals than big ones. In pharma and, to a lesser extent, in healthcare and biotech, the distribution of deals is more uniform, from small to very big deals. This is a known phenomenon: bringing innovation to life in pharma, healthcare and biotech usually takes more resources (time and funding), especially when heavy capital expenditure is needed.
Despite several outliers, the value of deals increases from one investment round to the next, until series D. From series D onwards, the median value does not vary much, although the variability (interquartile range) of series D is much higher than that of series E and F.

ANOVA test of value of deal ~ industry:

F Value = 2.9164321
Pr(>F) = 0.0031213

ANOVA test of value of deal ~ funding round:

F Value = 62.9453751
Pr(>F) ≈ 0 (rounded)
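
For readers who want to reproduce this kind of test, here is a minimal sketch of a one-way ANOVA in R. It is not the code behind the figures above: the data are simulated, and the data frame `deals` and its columns `round` and `value` are assumed names, not the actual dataset.

```r
# Simulated stand-in for the real dataset (column names are assumptions).
set.seed(42)
deals <- data.frame(
  round = factor(rep(c("seed", "series_a", "series_b"), each = 20)),
  value = c(rnorm(20, mean = 1,  sd = 0.5),   # deal values in million US$
            rnorm(20, mean = 5,  sd = 2),
            rnorm(20, mean = 15, sd = 5))
)

# One-way ANOVA: does the mean deal value differ across funding rounds?
fit <- aov(value ~ round, data = deals)
anova_table <- summary(fit)[[1]]

# The first row of the table holds the statistics for the 'round' factor.
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

cat("F value =", f_value, "\n")
cat("Pr(>F)  =", format.pval(p_value), "\n")
```

The same call with `value ~ industry` would test the industry effect. A p-value that prints as 0 is a rounding artefact; `format.pval()` reports very small values as being below a threshold instead.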

1.2. Rankings

The rankings of countries by number of deals, by total value of deals and, as a consequence, by average value per deal are quite different.

a. Countries Ranked by Number of Deals

Country	Number of VC deals
usa	920
india	140
uk	118
italy	66
germany	56
france	46
china	43
israel	41
canada	38
australia	31

b. Countries Ranked by Total Value of Deals

Country	Total value of VC deals (US$)
usa	14,026,768,270
china	1,824,000,000
uk	1,113,320,110
india	809,793,510
germany	632,764,700
canada	552,980,000
israel	508,100,000
france	499,215,320
united arab emirates	362,700,000
italy	228,518,450

c. Countries Ranked by Average Value of Deals

Country	Average value of VC deals (US$)
united arab emirates	72,540,000
china	42,418,605
iran	22,400,000
luxembourg	22,166,667
indonesia	18,756,000
brazil	17,640,000
ireland	16,789,076
japan	16,423,077
usa	15,499,191
norway	15,000,000

At this point, it is quite easy to verify that the difference in rankings is caused, among other things, by the fact that the number of deals in each funding round varies from country to country, while each funding round has a different mean deal value.
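
The three rankings, and the reason they diverge, can be illustrated with a small sketch. The data below are invented for the purpose, and the column names `country`, `round` and `value` are assumptions, not those of the actual dataset.

```r
# Invented example: country "a" has many small early-stage deals,
# "b" and "c" have a few large late-stage deals (values in million US$).
deals <- data.frame(
  country = c("a", "a", "a", "a", "b", "b", "c"),
  round   = c("seed", "seed", "seed", "series_a", "series_b", "series_c", "series_d"),
  value   = c(1, 1, 2, 5, 40, 50, 80),
  stringsAsFactors = FALSE
)

# Number of deals, total value and average value per country.
n_deals <- tapply(deals$value, deals$country, length)
total   <- tapply(deals$value, deals$country, sum)

ranking <- data.frame(
  country = names(n_deals),
  n_deals = as.vector(n_deals),
  total   = as.vector(total),
  average = as.vector(total / n_deals),
  stringsAsFactors = FALSE
)

# One ranking per criterion, highest first.
top_by_count   <- ranking$country[order(-ranking$n_deals)]
top_by_total   <- ranking$country[order(-ranking$total)]
top_by_average <- ranking$country[order(-ranking$average)]

# Cross-tabulating country and funding round shows the different round mix.
round_mix <- table(deals$country, deals$round)
print(ranking)
print(round_mix)
```

In this toy example each criterion puts a different country on top, simply because the countries differ in how their deals are spread over the funding rounds.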

1.3. Dominant Industry

I was also curious to see where investments go (in terms of industry) in each country. Rather than the absolute number of deals, the total amount of investment in each industry seems a better indicator. I am pretty convinced that this map would be (very) different for any other year. But I think the main caveat with this approach is the reliability of the deal allocation per industry in the dataset: many deals are ambivalent and could very well be classified under several different industries. Let's take the example of a VC deal to support a new digital imaging system for agricultural purposes, to be mounted on commercially available drones. Under which industry would it fall? Agriculture? ICT? AR (augmented reality)? Any other category? There is no right or wrong answer here (as long as consistency is ensured throughout the data collection phase), but this has a strong impact on the visualisation and on the conclusions that can or cannot be drawn from it.

A similar approach could be adopted to spot, for example, the dominant funding round in each country present in the dataset.
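
As a rough sketch of this approach, the dominant industry per country can be obtained by summing deal values per country and industry and keeping the largest total. The data and the column names (`country`, `industry`, `value`) are hypothetical.

```r
# Hypothetical example data (values in million US$).
deals <- data.frame(
  country  = c("x", "x", "x", "y", "y"),
  industry = c("ict", "ict", "pharma", "fintech", "ict"),
  value    = c(2, 3, 10, 8, 1),
  stringsAsFactors = FALSE
)

# Total invested amount for each country/industry pair.
totals <- aggregate(value ~ country + industry, data = deals, FUN = sum)

# For each country, keep the industry with the largest total amount.
dominant <- do.call(
  rbind,
  lapply(split(totals, totals$country), function(d) d[which.max(d$value), ])
)
print(dominant)
```

In country "x", ict has more deals but pharma has the larger total, which is why the choice of indicator matters. Replacing `industry` with a funding round column would give the dominant round per country in the same way.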

2. Investors

Out of the 1,720 press clippings in the dataset, 1,433 provide info on the investors participating in the deals, for a total count of 2,804 unique investors. Many deals are closed by several investors (1,074 deals have more than 1 investor), and many investors are involved in more than a single deal.

Investor	Number of VC deals
andreessen-horowitz	3
high-tech-gruenderfonds	3
target-partners	3
accel	2
accomplice	2
atomico	2
bessemer-venture-partners	2
glilot-capital-partners	2
global-founders	2
hub21	2
insight-venture-partners	2
jafco-japan	2
key-capital	2
lightspeed-venture-partners	2
omidyar-network	2
principia-sgr	2
sequoia-capital	2
sigma-prime-ventures	2
true-ventures	2
verlinvest	2
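
Counts of this kind can be derived along the following lines. This is only a sketch: it assumes a hypothetical `investors` column holding semicolon-separated investor names, with `NA` where the clipping does not name any investor. The real dataset may be organised differently.

```r
# Hypothetical example data.
deals <- data.frame(
  deal_id   = 1:4,
  investors = c("accel;atomico", "accel", NA, "atomico;true-ventures;accel"),
  stringsAsFactors = FALSE
)

# Keep only the deals for which investors are known, then split the names.
known <- !is.na(deals$investors)
investor_list <- strsplit(deals$investors[known], ";", fixed = TRUE)

n_investors      <- lengths(investor_list)          # investors per deal
n_multi_investor <- sum(n_investors > 1)            # deals with more than 1 investor
n_unique         <- length(unique(unlist(investor_list)))

# Number of deals each investor takes part in, most active first.
deals_per_investor <- sort(table(unlist(investor_list)), decreasing = TRUE)
print(deals_per_investor)
```

The vector `n_investors` is also what an ANOVA of the number of investors per deal against funding round or industry would use as its response.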

Overall, the number of investors per deal seems to depend on the funding round; however, the tests give no evidence that it depends on the industry.

ANOVA test of number_of_investors per deal ~ funding round:

F Value = 7.1950899
Pr(>F) ≈ 0 (rounded)

ANOVA test of number_of_investors per deal ~ industry:

F Value = 0.7514098
Pr(>F) = 0.6459718

3. More to come…

My original objective in saving press clippings over a year, rather than just facts & figures, was to run some text mining and analysis. More on that in the next post.

The dataset and complete R code will be made available for download at the end of the second part of this case study.
