The Dream 8 Challenges
The 8th iteration of the DREAM Challenges is underway.
DREAM is something like the Kaggle of computational biology with an open science bent. Participating teams apply machine learning and statistical modeling methods to biological problems, competing to achieve the best predictive accuracy.
This year’s three challenges focus on reverse engineering cancer, toxicology and the kinetics of the cell.
- HPN-DREAM Breast Cancer Network Inference Challenge: Infer the signaling networks in breast cancer cell lines
- NIEHS-NCATS-UNC DREAM Toxicogenetics Challenge: Predict individual response to environmental and pharmaceutical chemicals
- The Whole-Cell Parameter Estimation DREAM Challenge: Infer the kinetic parameters underlying biological processes in whole cell models
Sage Bionetworks (my employer) has teamed up with DREAM to offer our Synapse platform as an integral part of the challenges. Synapse aims to be a platform for hosting data analysis projects, much as GitHub is a platform for software development projects.
My own part in Synapse is on the Python client and a bit on the R client. I expect to get totally pummeled by bug reports once participation in the challenges really gets going.
Open data, collaborative innovation and reproducible science
The goal of Synapse is to enable scientists to combine data, source code, provenance, prose and figures to tell a story with data. The emphasis is on open data and collaboration, but full access control and governance for restricted access is built in.
In contrast to Kaggle, the DREAM Challenges are run in the spirit of open science. Winning models become part of the scientific record rather than the intellectual property of the organizers. Sharing code and building on other contestants’ models is encouraged, in hopes of forming networks of collaborative innovation.
Aside from lively competition, these challenges are a great way to compare the virtues of various methods on a standardized problem. Synapse is aiming to become an environment for hosting standard open data sets and documenting reproducible methods for deriving models from them.
Winning methods will be presented at the RECOMB/ISCB Conference in Toronto this fall.
So, if you want to sharpen your data science chops on some important open biological problems, check out the DREAM8 challenges.
More on DREAM, Sage Bionetworks, and Challenges
- May the Best Model Win
- Wisdom of crowds for robust gene network inference
- Improving Breast Cancer Survival Analysis through Competition-Based Multidimensional Modeling
- Metcalfe’s law and the biology information commons
- Synapse – a Kaggle for molecular medicine?