The Projects chapter of my Empirical software engineering book has been added to the draft pdf (download here).
This material turned out to be harder to bring together than I had expected.
Building software projects is a bit like making sausages in that you don’t want to know the details, or in this case those involved are not overly keen to reveal the data.
There are lots of papers on requirements, but remarkably little data (Soo Ling Lim’s work being the main exception).
There are lots of papers on effort prediction, but they tend to rehash the same data and the quality of research is poor (e.g., tweaking equations to get a better fit, with no explanation of why the tweaks might have any connection to reality). I had not realised that Norden did all the heavy lifting on what is sometimes called the Putnam model; Putnam was essentially an evangelist. The Parr curve is a better model (sorry, no pdf), but lacked an evangelist.
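For readers who have not met it, the curve behind the Norden/Putnam model is usually written as a Rayleigh curve: staffing ramps up, peaks, and then tails off. Below is a minimal Python sketch of that textbook form, not code or data from the chapter; the function names and the `total_effort` and `peak_time` parameters are my own labels for illustration.

```python
import math


def rayleigh_staffing(t, total_effort, peak_time):
    """Staffing rate (effort per unit time) at time t.

    Textbook Norden-Rayleigh form: m(t) = 2*K*a*t*exp(-a*t^2),
    where K is total effort and a = 1 / (2 * peak_time^2), so that
    staffing peaks at t = peak_time. Parameter names are illustrative.
    """
    a = 1.0 / (2.0 * peak_time ** 2)
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)


def rayleigh_cumulative_effort(t, total_effort, peak_time):
    """Effort spent up to time t: E(t) = K * (1 - exp(-a*t^2))."""
    a = 1.0 / (2.0 * peak_time ** 2)
    return total_effort * (1.0 - math.exp(-a * t * t))


if __name__ == "__main__":
    # Hypothetical project: 100 person-months in total, staffing peak at month 12.
    for month in range(0, 37, 6):
        rate = rayleigh_staffing(month, 100.0, 12.0)
        spent = rayleigh_cumulative_effort(month, 100.0, 12.0)
        print(f"month {month:2d}: staffing {rate:5.2f}, cumulative effort {spent:6.2f}")
```

One property of this form is that only about 39% of total effort (1 - e^-0.5) has been spent by the time staffing peaks, whatever values are chosen for the two parameters.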
Accurate estimates are unrealistic: there is lots of variation between different people and development groups, the client keeps changing the requirements, and developer turnover is high.
I did turn up a few interesting data-sets and Rome came to the rescue in places.
I have been promised more data and am optimistic some will arrive.
As always, if you know of any interesting software engineering data, please tell me.
I’m looking to rerun the workshop on analyzing software engineering data. If anybody has a venue in central London that holds 30 or so people, has a projector, and is willing to make it available at no charge for a series of free workshops over several Saturdays, please get in touch.
Reliability chapter next.