What is an open source software distribution?

[This article was first published on Stringfest Analytics, and kindly contributed to R-bloggers.]

I remember when I first got into Python and heard about Anaconda. It wasn’t really clear to me if Anaconda was a “dialect” of Python code, a really awesome suite of tools for working with Python, or something else.

What is a distribution, anyway?

Turns out that Anaconda is first and foremost a distribution of Python… and to understand what that means, you need a basic understanding of software, and of open source software in particular.

Open source is a license to redistribute

Open source software is software that anyone is free to use, repurpose, modify and redistribute (see where this is going?).

In software development more generally, a “distribution” is how software gets from the developer to the end user. Some distribution channels, such as Office 365, are proprietary. You couldn’t just make copies of Excel, for example, add some features, and give it away or resell it. But because Python is open source, this practice is permissible.

The “official” Python code base is available from the Python Software Foundation at python.org. While you could download it from there, it’s common, especially in the data community, to get it from Anaconda instead. Essentially, this for-profit company redistributes the Python code along with various features and services; hence, you’ll often hear it referred to as the Anaconda “distribution” of Python. You can download the free Individual Edition on Anaconda’s website.
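If you are not sure which kind of Python you already have, a rough check is possible from within Python itself. The sketch below is a heuristic, not an official test: it assumes that conda-based distributions such as Anaconda keep a `conda-meta` folder inside the installation directory, and the function names `looks_like_conda` and `describe_python` are made up for this illustration.

```python
import os
import sys


def looks_like_conda(prefix: str) -> bool:
    """Heuristic (an assumption, not an official API): conda-managed
    installations keep package metadata in a 'conda-meta' folder."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def describe_python(prefix: str = sys.prefix) -> str:
    """Summarize the Python installation rooted at `prefix`."""
    if looks_like_conda(prefix):
        kind = "a conda-based distribution such as Anaconda"
    else:
        kind = "a build from another source (for example, python.org)"
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return f"Python {version} at {prefix} looks like {kind}"


if __name__ == "__main__":
    print(describe_python())
```

Either way, the language is the same Python; only the packaging around it differs.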

Why Anaconda?

[Figure: The Jupyter Notebook application comes installed with Anaconda.]

Curious why so many people prefer to download the Anaconda distribution? First, many of the most common packages for working with data in Python come pre-installed with Anaconda. If you’re not familiar with the concept of packages, check out this post.
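To see the pre-installed packages in practice, you can ask Python which ones it can find. This is a minimal sketch: the shortlist of package names is just an illustrative pick of popular data packages, and `installed_packages` is a hypothetical helper, not part of Anaconda.

```python
from importlib.util import find_spec

# Illustrative shortlist (an assumption): a few popular data packages.
DATA_PACKAGES = ["numpy", "pandas", "matplotlib", "scipy"]


def installed_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in installed_packages(DATA_PACKAGES).items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

On a fresh python.org install you would likely see these reported as missing until you install them yourself, whereas a full Anaconda install should find them right away.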

Anaconda also comes with many popular applications for working with Python code, such as the Jupyter Notebook. To learn more about Jupyter and Python, check out this post.

Distribution received… now get coding!

Legendary computer scientist Alan Kay once said “perspective is worth 80 IQ points.” The ability to put Anaconda into perspective with Python and open source in general will make your development choices that much clearer.

But nothing beats learning to code through trial and error. If you’re coming into Python from Excel, check out my book Advancing into Analytics: From Excel to Python and R.
