
How to simplify your code by using data flows

[This article was first published on gtdir, and kindly contributed to R-bloggers.]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

How can one effectively develop and manage code in large complex data analysis projects?

In the past, I routinely relied on naming conventions for my R scripts: each script carried a prefix that determined the order in which the scripts had to be run. I used this convention for several years, until I came across a massive data analysis task. I needed to process data generated by a trading algorithm that managed a portfolio of hundreds of stocks. The initial solution seemed clear: write R scripts that manage other R scripts. So I persisted. However, some tasks still had to be run manually, such as launching a sequence of R instances that would process data in parallel. Finally, it became clear that script naming conventions and special folder structures were not an optimal solution, because an even more complicated challenge lay ahead: connecting the algorithm to the market. The workflow was no longer hierarchical; its structure could only be conveyed by a graph with loops.
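To make the prefix convention concrete, here is a minimal sketch in R of what such a runner might look like. It is only an illustration under stated assumptions, not the scripts actually used in the project: the function name `run_in_prefix_order`, the file names, and the temporary demo directory are all hypothetical.

```r
# Hypothetical helper: run every script in `dir` whose name starts with a
# numeric prefix (e.g. "01_load.R", "02_transform.R"), in ascending order.
run_in_prefix_order <- function(dir, envir = new.env()) {
  scripts <- list.files(dir, pattern = "^[0-9]+_.*\\.R$", full.names = TRUE)
  # Sort by the numeric value of the prefix, so "10_" runs after "2_".
  prefix <- as.integer(sub("_.*$", "", basename(scripts)))
  scripts <- scripts[order(prefix)]
  for (script in scripts) {
    message("Running ", basename(script))
    # Evaluate each script in the shared environment so that later
    # scripts can see objects created by earlier ones.
    source(script, local = envir)
  }
  invisible(basename(scripts))
}

# Demonstration with throwaway scripts in a temporary directory
# (file names and contents are made up for this example).
demo_dir <- file.path(tempdir(), "prefix_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("x <- 1:10",       file.path(demo_dir, "01_load.R"))
writeLines("y <- x * 2",      file.path(demo_dir, "02_transform.R"))
writeLines("total <- sum(y)", file.path(demo_dir, "10_report.R"))

env <- new.env()
ran <- run_in_prefix_order(demo_dir, envir = env)
```

A runner like this can only express a single linear order. It has no way to describe branches that run in parallel or steps that feed back into earlier ones, which is the limitation that a graph-based workflow addresses.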

That is how the pxWorks platform came into being (www.pxworks.io; the screenshot is below). The platform is open source, and its code is published under the AGPLv3 license.

Some of the features of this platform are as follows:

More details and technical specifications of the platform can be found in the forum on the website of the project.

I am currently looking for collaborators and feedback to help me improve the software and make it useful to as many people as possible. Let me know if you need any new features. Participating is easy: fork the code on GitHub and start extending the code base, or report any issues you run into.

I am also developing a production-stage algorithmic trading system using this platform, so leave a comment on the forum at www.pxworks.io if you would be interested in that trading code being open sourced.

To leave a comment for the author, please follow the link and comment on their blog: gtdir.
