Validation? Why Bother?
By Kate Hanley.
I may be biased, but I strongly believe that R is one of the best tools out there for performing statistical analyses and related tasks. It is driven by academia, so cutting-edge methods are available quickly and painlessly, and many of the world’s foremost statistics and data mining experts are using and developing for it on a regular basis. It was designed by statisticians, for statisticians, and it’s very good at what it does. After being one of the statistics community’s best kept secrets for many years, R is finally at the forefront. It’s free, it’s cool, it’s sexy … and it makes people in regulated industries very nervous. Why? Because it’s open source.
Generally, I’ve always thought of the open source movement as a Good Thing. You have a new idea you’d like to incorporate in your favourite software? Go ahead, write it. You want to change the source code on your local installation to suit you? You can do that too. But where the nervousness starts to creep in is when you’re making life-and-death decisions based on code that someone wrote for free in their spare time. One of R’s key advantages suddenly becomes, for some industries, its biggest drawback.
Let’s be honest here – R comes with no guarantee that it does what it says it does. The “base” installation of R and recommended packages are maintained by the Core Development Team (a group of talented academics and programmers), so you can be reasonably sure that at least the core and recommended packages are well put together. But what about that package that someone wrote as part of their undergraduate dissertation and put on CRAN because they “might as well”? How do you know it works? Has it been well written? Should we be relying on it?
If you work in a regulated industry, such as finance or pharmaceuticals, these are big questions. You want to use R when making clinical decisions, or decisions that could cost millions if they go wrong. Mango have been working with clients in the pharma and finance industries for over a decade, and many have come to us to talk about these concerns. Unfortunately, without performing any kind of formal testing, all anyone could say was that R is widely used and has a massive user base both testing and writing code on a daily basis. It was a calculated risk: R could save you a lot of money, boost productivity and open up many techniques that are not available in commercial statistical software, but you had to weigh that up against the chance that something might go wrong.
Over the past year, Mango have been working hard to improve our process for providing validated installations of R. Mango’s ValidR is a validated version of R which is available as either an off-the-shelf product or a fully customisable system. Taking a test-based approach to reduce risk, Mango have assessed each package, identified its key requirements, and written tests to ensure that the package performs as expected. ValidR is delivered as a single installer, which installs R, all required packages, and the Mango Validation Engine. The installation process ensures that the software has installed correctly, behaves as expected and meets our minimum performance requirements each time it is installed. We’ve designed the ValidR product to comply fully with three sets of requirements: the FDA guidelines[i] for software validation, the guidance for regulatory compliance and validation issues[ii] created by the R Foundation for Statistical Computing, and Mango’s own ISO 9001 quality requirements. You can rest assured that it meets very high standards.
Over the past months, we’ve been delighted with the interest that ValidR has been generating, with many large pharmaceutical companies signing up for ValidR version 1.0. We’re looking forward to many more ValidR releases, and hope that the product will help those working in regulated industries sleep a little easier.
– See more at: http://www.mango-solutions.com/wp/2014/05/validation-why-bother/