Why do I have a data science blog? 7 benefits of sharing your code
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
My blog statsandr.com was launched in December 2019. Although 9 months of writing is a very short period compared to others, I can already say that it’s been an incredible and very enriching adventure!
With 45 articles published and topics ranging from descriptive statistics, probability, inferential statistics to R Markdown and data visualization, I have seen many benefits of sharing my code through a technical blog. In this article, I highlight 7 of them (in no particular order) with the hope that it will give ideas and incentives to some of you.
In the end of this article, I also mention a couple of modern solutions needed for starting your own blog.
Note that in these 9 months, I did not make a living from my blog and this is not my goal as I would need to make it a priority. However, I have received enough positive feedback from readers, and more importantly, I have learned enough to continue writing.
#1 Learn by writing
I really enjoy learning new stuff in many different domains. I learn (and I am still learning) a lot by teaching statistics to students from diverse backgrounds as part of my teaching assistant position at university.
Before launching this blog, I believed that I understood a statistical concept as soon as I was able to teach it to my students. If I was not able to explain it in a clear and understandable way, it meant that I needed to study it more thoroughly because I actually did not fully understand it.
This is often referred as the Feynman technique. This method of learning is based on the fact that in order to fully master a topic you need to be able to explain it back to someone in simple terms.
Throughout this blog, I actually realized that in order to learn and fully understand something new, one must:
- be able to clearly communicate it and teach it in simple terms,
- but also be able to write it down in a precise and concise manner
So although this blog was first launched to share statistical concepts I am most familiar with (hoping that it would be useful to some people), I now also use it to learn by writing. I think that this additional way of learning is actually as powerful as teaching because writing allows me to consolidate my understanding.
Obviously, I am learning mainly about statistics and its applications in R as they are the main topics of the blog. However, I never thought that I would also learn so much about:
- web development and SEO/analytics (which are increasingly important skills nowadays)
- project management (as you build something from scratch and wish to develop it)
- writing (a skill I still need to improve as a non-native English speaker)1
- communicating results (as any data scientist will tell you, results without proper communication are useless and writing a blog is a great practice)
- marketing/public relation/brand management (think about social networking and how to deal with all sorts of questions from readers)
- etc.
Maintaining a blog teaches me essential skills that are usually taught in full-time jobs. With a blog, you are responsible of everything from the content to readers’ inquiries, similar to an employee in charge of a project and who has to deal and communicate with end users.
Don’t get me wrong, apart from exceptions, I do not think that a blog can completely replace a job in terms of personal development, but it definitely helps to learn a broad range of important skills. Reading books and completing data science online courses are other examples to gain skills in addition to a job, but a blog tends to be more diverse and applied, making it more beneficial (in my opinion).
Another way I am learning is by doing research about the topic I am writing about.
An example of it is my post on outliers detection in R. As part of any descriptive analysis, I am used to check for potential outliers. Since I was familiar with this topic, I decided to write about it. However, I found the post not complete enough so I did some further research. It turned out that there were in fact several statistical tests that I did not know about. I wrote about these tests and I now include these new techniques whenever I check for potential outliers.
Last but not least, I often receive emails from readers asking to write about a topic of their choice. It sometimes happens that I never heard about their suggested topic, so I do some research out of curiosity. Even if I still do not write about it because I am not familiar enough with it, at least I am aware that it exists and I know more or less what this is about.
If the sole benefit of learning did not convince you to start your blog, see other benefits I have experienced in the following sections.
#2 Get feedback
Get feedback, suggestions and constructive critics from more experienced users. Remarks greatly help in correcting typos and bugs present in my code, and they also help in improving my R skills.
You will be amazed to see that, although you spent countless hours to check your code, there will always be someone who will spot a typo that you missed. Valuable feedback from people all over the world has, for instance, definitely improved the quality and completeness of my R Shiny applications that I use on a daily basis.
A blog can therefore be seen as a powerful peer-review method of your understanding of a concept, code or R practices. It is also better to make mistakes when working on a toy example and correct them, than to make mistakes on a real project at your workplace.
#3 Personal note to remind my future self
How many times you searched for a piece of code in the folders of your computer, to finally look for the solution on Google because you could not remember for which project you wrote that piece of code? It happened to me every day.
With blog posts organized by topics, it now takes me much less time (and less frustration!) to find code snippets that I wrote several months ago. This also allows me to keep my code and R practices up-to-date, as I only have to edit them in one place.
An example of this is my article about graphics in R with {ggplot2}
. I prefer plots with the {ggplot2}
package over plots available by default in R base, but I cannot remember all layers and their arguments. Now every time I struggle with a plot using this package, I simply revisit the corresponding article to find the solution. Same goes for many of my articles, every time I have forgotten the code or a nuance around how it works.
You could store your code in different files (R Markdown documents or R scripts for example, as I used to do in the past), but blog posts have the advantages that:
- code snippets are highlighted, and
- by making it available to the world, you are forced to keep it tidy, complete and up-to-date.
#4 Contribute to the open source community
I have learned so much and I keep learning everyday about R thanks to great resources made available for free by developers and scientists who believe in open source and free materials.
Making all my code and articles freely available through a blog is in some sense, my way of:
- “paying back” people who helped me to learn and thanks to whom I stand where I am now, and
- pay in advance the many more from who I will learn in the future.
If you also believe in sharing your knowledge or expertise, having a blog is definitely a great way to contribute to the community. Your contribution does not necessarily have to be huge, as long as it adds something, someone will use it. Remember that everyone started as a beginner, and even world experts are beginners in other domains. And you will see that as you keep sharing your knowledge with others, some people will appreciate it because they can learn from it.
As far as I am concerned, if my small contribution is useful for some people in having a better understanding of statistics or in learning R, my goal will be reached.
#5 Stay humble, stay curious
Since launching my blog, I discovered plenty of other data science blogs of high quality. The more I see new things about different topics and the more I see people doing incredible stuff, the more I realize that I actually do not know much. This reminds me to stay humble.
In addition to that, by sharing my code with scientists from diverse backgrounds and coming from all over the world, it allows me to consider other perspectives and practice open-mindedness, which in turn helps me to stay curious.
Being humble and curious is, in my opinion, a good starting point to keep learning. I would not want to lose my curiosity, otherwise I may lose my appetite for learning.
For your information, R-bloggers and R Weekly are two great blog aggregators focusing on R, and the Towards Data Science publication on Medium is full of high quality blog posts covering many topics related to data science. Subscribe to these if you are interested in discovering more technical blogs.
#6 Learn to be less perfectionist and to prioritize
No matter how much time you spend on your blog, remember that everything cannot always be perfect. As a perfectionist, I cannot deny that I would love everything to run perfectly in all aspects of my life.
At the time of writing my first articles, my tendency for perfectionism was sometimes so prevalent that it was a real weakness: I could spend several minutes thinking whether a comma between two words was needed or not (which would of course make no difference at all). However, at some point, improving something takes so much time and energy that the added value is not as large as the added value you could create by devoting your time in producing something new.
This gets even worse as you keep learning every day. As a consequence, an article which seemed to be perfect in the past may no longer be viewed as perfect today, so you want to change or add a small detail in all your previous articles. I am not saying you should not edit a blog post, it is even recommended! Nonetheless, I advise you to edit it only if it really adds a significant value to it. Otherwise, I believe it is best to spend that time and effort in creating something else.
With a blog you will thus gradually learn to put your effort and time (which are limited resources and seem to be scarcer with the years) where it is most productive. In other words, you will learn to prioritize.
#7 Build connections and professional relationships
By sharing your practices in a specific field, you sometimes come across people who actually work on the same topic than you or have similar research interests. This was the case, among others, with:
- My collection of top R resources on Coronavirus. It started with only 5 resources, then as I was gradually adding more and more resources, Rees Morrison who was collecting blog posts on the same topic contacted me and I added his blog posts to the original collection.
- The presentation of a R package developed by Renan Xavier Cortes which could be used to download free Springer books during Covid-19 quarantine.
- My article on the analysis of the spread of COVID-19 in Belgium, which led, in collaboration with three other researchers, to a seminar and the creation of graphs representing COVID-19 hospital admissions in Belgium (which were later used by a Belgian news television channel).
These collaborations extended my professional network and allowed me to build connections with new researchers from different universities and from different countries. I would never have known these people without first sharing my analyses on a specific topic.
I am sure that a blog can lead to numerous collaborations, making the whole journey even more interesting and enriching. And who knows, the next chapter of your professional life may be shaped by one person with whom you have worked in the past.
Even if your blog does not lead to any collaboration, it is still a great tool for self-promotion. Doing some research, practicing, and writing about a topic help to become (and to be seen as) an expert in the field. Moreover, your blog will become your portfolio when applying for jobs or projects. Recruiters will definitely appreciate to see what you are capable of.
Whether your blog is at the origin of some fruitful collaborations or a showcase of your work, it is a valuable asset when applying for your dream job.
How to start your own blog?
I have seen an acceleration in my learning curve, a deeper understanding of statistics and R, and a large improvement in my communication skills since starting my blog. I never realized how useful maintaining a blog could be as a learning technique. This comes with the added benefit of being able to store information and code for myself and for others to also use, which is essential for getting feedback and connecting with other researchers.
If these 7 benefits have convinced you, the good news is that nowadays it is cheaper and easier than ever to start your own blog.
For non-technical blog I recommend Medium or WordPress. For technical blogs, there are plenty of static site generators to choose from but I highly recommend using Hugo and the {blogdown}
package (I created my blog after reading this book). You can then host it on GitHub then publish it using Netlify.2
Thanks for reading. I hope this article answered questions such as “Why do you have a blog?” or “What is the purpose of it?”, and who knows, gave you the motivation to start your own blog. If you are still undecided, I truly recommend to do it and remember that the hardest part is to start. If you are patient and passionate about a topic, you simply have to start and the rest will follow.
As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.
Even native English speakers can improve their writing skills with a blog.↩︎
Note that I am not a programmer nor a computer scientist so I do not have extensive knowledge in creating websites. There must be many more options available but I find these options the most optimal choices given my computer skills and my goals.↩︎
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.