How to become a data scientist in 30 days?
Late one evening, I was scrolling through Reddit and came across a news article titled “Why Bill Gates wants us all to get vaccinated?”. The news site looked legitimate. Halfway through the article, I had noticed quite a few grammatical errors. Being a lurker, I switched to the comments and saw a few of them mention that the article was AI generated. I spent a couple of minutes googling how to generate text using AI and found at least half a dozen websites. The first that caught my eye was DeepAI.
DeepAI
The text generation API is backed by a large-scale unsupervised language model that can generate paragraphs of text. This transformer-based language model, based on the GPT-2 model by OpenAI, intakes a sentence or partial sentence and predicts subsequent text from that input.
Recently, I started advertising my blog on Google. The most common search term that triggered my ad was “How to become a data scientist?”. I decided to test DeepAI by asking this unsupervised neural network “How one could become a data scientist in 30 days?”. The results it generated were pretty interesting. You do have the option of using the website to generate text, but why take the easy road when you can make it a little harder by using R and Python?
R Code
# install httr package
install.packages("httr")

# load httr package
library(httr)

# do a post function
r = httr::POST(
  url = "https://api.deepai.org/api/text-generator",
  body = list("text" = "How to become a data scientist in 30 days?"),
  add_headers(.headers = c("api-key" = "quickstart-QUdJIGlzIGNvbWluZy4uLi4K"))
)

# convert raw result to text
cat(content(r, "text"), "\n")
Python Code
import requests

r = requests.post(
    "https://api.deepai.org/api/text-generator",
    data={
        'text': 'How to become a data scientist in 30 days?',
    },
    headers={'api-key': 'quickstart-QUdJIGlzIGNvbWluZy4uLi4K'}
)
print(r.json())
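The Python call above prints the whole JSON response. As a rough sketch, the generated text could be pulled out on its own. This assumes the decoded response is a JSON object that holds the text under an "output" key, which this post does not show; the helper name extract_generated_text and the sample response are made up for illustration.

```python
import json


def extract_generated_text(payload: dict) -> str:
    """Pull the generated text out of a decoded JSON response.

    Assumption (not confirmed in this post): the text sits under an
    "output" key. Anything else is treated as an error.
    """
    text = payload.get("output")
    if not isinstance(text, str):
        raise ValueError(f"no generated text in response: {payload!r}")
    return text.strip()


# Hypothetical response body, shaped the way the sketch assumes.
sample = json.loads('{"id": "example", "output": "  Some generated text.\\n"}')
print(extract_generated_text(sample))
```

With the request from the Python code above, this would be called as extract_generated_text(r.json()), provided the assumption about the "output" key holds.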
Below is what DeepAI generated:
I was introduced to the concept of statistical data science. Then when I was 16 at the beginning of my senior year at Stanford, I was an undergraduate. It’s a great experience. I did theoretical data science a few years ago — a lot. I also spent a lot of time working with my colleague John Roberts and his team, for example, to do the software development for computing, or for things like this. I’d like to go to another series of data science areas. My colleagues, I had a big day at Stanford when they told me that they thought I was one of the few scientists in their field – we would be in that particular field, or, it seemed, at least it was out in this field. So I spent a lot of time with John Roberts and me as we were doing a paper. But in the late 70s, I was still getting really into statistics, and he had the idea of making our paper, a new form of math. His paper is very important.
So for the most part, you got data science at Stanford, you had computer programs at Princeton, or the National Academy of Sciences, or the NASA/ESA center. You really didn’t have to learn math by computer, to have that knowledge. There was little information for about mathematical data science there. How did the first data science team come across that?
We were working really hard on getting a complete research plan. We were looking to learn in 20 days, about the basic tasks — how to build a machine, how to make a database that can give you great results. In this case, we were really looking to learn the basic problem of building a data machine to solve this task because I was studying a very abstract software framework. I hadn’t really taken the time to get much education in computer science. So I did an internship in a lab where I was reading books in the lab and writing a paper. But in the meantime, I was really getting really motivated with my training to make paper paper. We had a team of people working as an engineer and I wanted to learn how to build a relational database so that I could have the help of people like John Roberts and his team.
Now, there were a couple of other types of projects that the team had that they came across. In 2003, I had made a paper that had been published in the journal PLOS One, from a person with a little bit of a computer science background. I was an editor of its paper called the “Cogent Bayes Analysis of Data Structure” [which] has shown the effect of a given object having a complex structure that has different levels of complexity than a typical human-made database … I had had the idea that what was being used for a computation is actually a complex computer. The problem with that paper was, it has a large size, but it had very small size. But basically, I was reading the paper on the difficulty of constructing a complex structure that was going to be very difficult to make, and I think I might have made a paper about that. The problem is, I thought to myself that the paper you presented would be pretty easy to make.
When I wrote that paper, was it ever just about finding stuff?
It’s not always easy. To be able to think like this is something you never really want to make. In the most simple, I mean, you are just looking at things from the inside out. It takes really long enough to build a database and then it might take a year or so to make it a bit of work. It requires a lot of resources.
Where did you see this pattern in data science as the next step for you, but what do you see when you do it?
The pattern was kind of like a pattern in science. It’s always something we’re gonna have to look at, and then the data that we’re collecting it becomes something that we’re going to look at and think about, then we’ll get really good at what we’re doing and then we’ll make the data more interesting, and then we’re like, oh, I’m going to start making a paper and it’s going to take too long to go back and figure out how to do that. But, ultimately, what we’re doing is not making a paper that can actually understand things without a lot of research because a lot of its parts might be just things that are really important for a paper object or whatever, and we’re really digging into the literature or just finding stuff. That was the real challenge.
As you’ve mentioned, the main goal of Data Science is to take complex data into an understanding system that, for example, could be something like an evolutionary programming model, how you can create a data network that allows you to define what we could do.
AI-Writer
AI-Writer is an online platform where you can ask the AI to write an article for you. It also generates citations and sources for your article. Below is the article written by AI-Writer.
Becoming a good data scientist takes a lot of work, energy and time, but don’t think the whole world would be a data scientist if it were easy and fast. Even if there is no right way to learn data science, a learning period of a few months will be one of your lifelong investments. Sources: 3
Even Harvard Business School has found it the sexiest job of the 21st century, with an average salary of more than $100,000 a year and an annual salary of more than $200,000. Sources: 3
No wonder so many people are interested in becoming computer scientists. According to the US Census Bureau, the expected growth rate for data science jobs over the next decade includes more than 1.5 million new jobs for those who work as similar data scientists. As the word “data scientist” has become increasingly popular recently, many companies looking for mathematicians, statisticians, and business analysts will be posting jobs with the title “data scientist” to attract more attention. Sources: 5
You don’t have to be the world’s best programmer or spend a fortune to work in big data analytics and artificial intelligence, but you do need to know the basics. You can start right away by working with a data science program at a company like Google, Microsoft, Apple, Facebook, Amazon or any other company. Sources: 5, 7
This means that virtually anyone can improve their employability and career prospects by learning the basic theoretical and practical skills required for data science. Finally, it should be noted that data science is likely to offer a number of benefits that can help your current career if you have the opportunity to become a full-time or freelance data scientist. Demand for data analysis skills in their day-to-day work is forecast to exceed the requirements of traditionally qualified data scientists in the coming years. Sources: 0, 7
If you have skills in data analysis, you will almost certainly be able to add value in some way, but what exactly you can do depends heavily on what your job is. Although they can affect the outcome of your business and improve your own productivity, they are unlikely to help you earn more from your current role. Sources: 0
There is currently a skills gap in data science, where the majority of data scientists have less than five years of experience. Some companies are looking for experienced professionals with 10 years or more, and in these cases, data analysts will continue their training and hone their skills as data scientists. After 10 + years of work, a data scientist can obtain a doctorate, take on the role of Director of Data Science and undergo further training. His or her title may not change, but in this case the data analyst will continue his or her education or hone his or her skills before becoming a data scientist. Sources: 1
If you are interested in a career related to data, there are two ways to become a data scientist: to become a data analyst or director of data science. Sources: 1
In this way, the candidate will be able to limit the chances of failure and understand the questions and skills that recruiters expect from a data scientist. Data science and analytics professionals are in high demand and enjoy high pay and benefits, as well as access to a wide range of opportunities. A 2017 study by IBM, Quant Crunch, found that employers are looking for in-demand skills such as analytics, machine learning, and artificial intelligence. Sources: 1, 6
The road to learning data science is rocky, but regardless of whether you put in enough hard work, this period is suitable for anyone who knows the pros and cons of data science. Sources: 6
If becoming a data scientist sounds fascinating, the Flatiron School offers several ways to start your career in big data. For candidates looking for a way to crack an interview, we recommend taking a look at the Landing School’s data science education program, as it is a great example of how the interview process works in the field of data science. The Flat Iron School’s “Data Science in 30 Days” program turns you into a data scientist in just 15 weeks. Sources: 2, 6
For those who want to hone their skills but may not be quite ready to change their career, this could be the right option for you. Sources: 2
If you’re not sure if you can commit to a full bootcamp when it comes to data science, the Online Data Science Bootcamp Prep will cover the basics. Most of the people who attended NYC science academy bootcamps have STEM backgrounds. Sources: 2, 4
Business data is increasingly digitalized, and computers with higher computing power are faster and cheaper. The Internet of Things leads data scientists to sift through billions of pieces of data, including emails, social media posts, and other data. Soon, almost every type of company will be looking for a data scientist, and their business data will increasingly be digitized.
Cited Sources
- https://www.dataquest.io/blog/career-guide-data-science-options/ 0
- https://www.mastersindatascience.org/careers/data-analyst-vs-data-scientist/ 1
- https://flatironschool.com/blog/data-analyst-vs-data-scientist/ 2
- https://data-flair.training/blogs/top-data-science-skills/ 3
- https://uncubed.com/daily/become-data-scientist/ 4
- https://fourpillarfreedom.com/how-to-become-a-data-scientist/ 5
- https://analyticsindiamag.com/can-someone-become-a-data-scientist-in-6-months/ 6
- https://www.forbes.com/sites/bernardmarr/2020/02/24/the-9-best-free-online-data-science-courses-in-2020/ 7
These articles are not the greatest. But once an article is generated, you could spend about an hour brushing it up and publish it. At a time when fake news is being spread everywhere and these tools are getting better every day, it becomes quite challenging to tell what is “true” from what is “fake”. At some point, we will need government legislation and a set of laws on building ethical AI.
If you enjoyed this article, do check out my other recent post. Also, feel free to comment below and share it.
The post How to become a data scientist in 30 days? appeared first on Hi! I am Nagdev.