Recent updates to the Team Data Science Process
It's been over a year since we first introduced the Team Data Science Process (TDSP). The data, technology and practices behind data science continue to evolve, and the TDSP has evolved in parallel. Over the past year, several new facets have been added, including:
- The IDEAR (Interactive Data Exploration, Analysis and Reporting) framework, an open source extension to R and Python designed to standardize the process of data exploration and reporting;
- Guidance for use of Spark 2.0, including an end-to-end Spark v2.0 walkthrough;
- Guidance for use of in-database Python with SQL Server, including an end-to-end in-database Python tutorial;
- Instantiation of TDSP projects and templates within the new Azure Machine Learning Workbench.
For an example of applying the TDSP to effective data science projects, check out Buck Woody's 10-part series walking through every stage of a typical data science project.
As the practice of data science changes, the TDSP continues to evolve. The TDSP is an open project hosted on GitHub, and your contributions are welcome.
Cortana Intelligence and Machine Learning Blog: The Microsoft Team Data Science Process (TDSP) – Recent Updates