My slides and paper submissions for Prob Prog 2021

Bob Carpenter

5 months ago

[This article was first published on R – Statistical Modeling, Causal Inference, and Social Science, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Prob Prog 2021 just ended. Prob Prog is the big (250 registered attendees, and as many as 180 actually online at one point) probabilistic programming conference. It’s a very broadly scoped conference.

The online version this year went very smoothly. It ran a different schedule every day to accommodate different time zones. So I wound up missing the Thursday talks other than the posters because of the early start. There was a nice amount of space between sessions to hang out in the break rooms and chat.

Given that there’s no publication for this conference, I thought I’d share my slides here. The talks should go up on YouTube at some point.

Slides: What do we need from a PPL to support Bayesian workflow?

There was a lot of nice discussion around bits of workflow we don’t really discuss in the paper or book: how to manage file names for multiple models, how to share work among distributed teammates, how to put models into production and keep them updated for new data. In my talk, I brought up issues others have to deal with like privacy or intellectual property concerns.

My main focus was on modularity. After talking to a bunch of people after my talk, I still don’t think we have any reasonable methodology as a field to test out components of a probabilistic program that are between the level of a density we can unit test and a full model we can subject to our whole battery of workflow tests. How would we go about just testing a custom GP prior or spatio-temporal model component? There’s not even a way to represent such a module in Stan, which was the motivation for Maria Gorinova‘s work on SlicStan. Ryan Bernstein (a Stan developer and Gelman student) is also working on IDE-like tools that provide a new language for expressing a range of models.

Then Eli Bingham (of Pyro fame) dropped the big question: is there any hope we could use something like these PPLs to develop a scalable, global climate model? Turns out that we don’t even know how they vet the incredibly complicated components of these models. Just the soil carbon models are more complicated than most of the PK/PD models we fit and they’re one of the simplest parts of these models.

I submitted two abstracts this year and then they invited me to do a plenary session and I decided to focus on the first.

Paper submission 1: What do we need from a probabilistic programming language to support Bayesian workflow?

Paper submission 2: Lambdas, tuples, ragged arrays, and complex numbers in Stan

P.S. Andrew: have you considered just choosing another theme at random? It’s hard to imagine it’d be harder to read than this one.

To leave a comment for the author, please follow the link and comment on their blog: R – Statistical Modeling, Causal Inference, and Social Science.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.