rOpenSci News Digest, January 2023
Dear rOpenSci friends, it’s time for our monthly news roundup!
You can read this post on our blog. Now let’s dive into the activity at and around rOpenSci!
rOpenSci HQ
{targets} in Action Community Call
Tuesday, 31 January 2023 20:00 UTC / Tuesday, 31 January 2023 15:00 EST / Wednesday, 1st February 07:00 AEDT.
The {targets} package is a pipeline tool for statistics and data science in R. With {targets}, you can maintain a reproducible workflow without repeating yourself. {targets} learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data.
On this call Will, Eric and Joel will share their experience putting {targets} into action. Eric will present “Using {targets} with HPC”, Joel will talk about “Using {targets} for bioinformatics pipelines”, and Will will then demonstrate “Debugging {targets} pipelines”.
More info on the event page.
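The workflow described above can be sketched as a small pipeline definition. This is a minimal illustration under stated assumptions, not material from the community call: the input file `data.csv` and the helper `summarise_data()` are hypothetical, while `tar_target()`, `tar_make()` and `tar_read()` are functions from the {targets} package.

```r
# _targets.R: the pipeline definition file that {targets} reads
library(targets)

# Hypothetical helper, for illustration only: means of the numeric columns
summarise_data <- function(data) {
  colMeans(data[vapply(data, is.numeric, logical(1))])
}

# The pipeline is a list of targets. Dependencies are inferred from the
# symbols each command mentions (raw_data uses raw_file, and so on).
list(
  # Assumed input file; format = "file" asks {targets} to track its contents
  tar_target(raw_file, "data.csv", format = "file"),
  tar_target(raw_data, read.csv(raw_file)),
  tar_target(summary_stats, summarise_data(raw_data))
)
```

Running `targets::tar_make()` in the project directory builds the three targets. A second call should skip any target whose code and upstream data have not changed, and `targets::tar_read(summary_stats)` retrieves a stored result.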
Coworking
Join us for social coworking & office hours monthly on first Tuesdays! Hosted by Steffi LaZerte and various community hosts. Everyone welcome. No RSVP needed. Consult our Events page to find your local time and how to join.
- Tuesday, Feb 7th, 9:00 Americas Pacific / 17:00 UTC “Setting up Continuous Integration” Hosted by community host Hugo Gruson and Steffi LaZerte
    - Do some reading to learn about Continuous Integration;
    - Set up Continuous Integration on one (or more) of your projects;
    - Talk to Hugo and discuss how Continuous Integration can simplify your development process and how to get set up.
- Tuesday, Mar 7th, 9:00 Australian Western / 01:00 UTC “Checking data with naniar, visdat, assertr, and skimr” Hosted by community host Nick Tierney and Steffi LaZerte
    - Explore documentation, use cases and tutorials on these packages, or check your data!
    - Talk to Nick and discuss how to use these packages in your workflow.
And remember, you can always cowork independently on work related to R, work on packages that tend to be neglected, or work on whatever you need to get done!
Code of Conduct annual review and transparency report
Find our annual review of the rOpenSci Code of Conduct, reporting process, and internal guidelines for handling reports and enforcement as well as our transparency report. We thank our two independent community members Megan Carter (until June 2022) and Kara Woo.
Software 📦
New packages
The following two packages recently became a part of our software suite:
- hoardr, developed by Tamás Stirling together with Scott Chamberlain: Suite of tools for managing cached files, targeting use in other R packages. Uses rappdirs for cross-platform paths. Provides utilities to manage cache directories, including targeting files by path or by key; cached directories can be compressed and uncompressed easily to save disk space. It is available on CRAN.
- phruta, developed by Cristian Román Palacios: The phruta R package is designed to simplify the basic phylogenetic pipeline. Specifically, all code is run within the same program and data from intermediate steps are saved in independent folders. Furthermore, all code is run within the same environment, which increases the reproducibility of your analysis. phruta retrieves gene sequences, combines newly downloaded and local gene sequences, and performs sequence alignments. It has been reviewed by Anna Krystalli, Rayna Harris, and Frederick Boehm.
Discover more packages, read more about Software Peer Review.
New versions
The following fifteen packages have had an update since the last newsletter: datefixR (v1.4.0), GSODR (v3.1.7), ijtiff (v2.3.0), jagstargets (1.1.0), nasapower (v4.0.9), opentripplanner (0.5.0), ReLTER (2.0.0), rgbif (v3.7.5), rtweet (v1.1.0), skimr (v2.1.5), stantargets (0.1.0), tarchetypes (0.7.4), targets (0.14.2), tidytags (v1.1.1), and writexl (v1.4.2).
Software Peer Review
There are twelve recently closed and active submissions and two submissions on hold. Issues are at different stages:
- One at ‘6/approved’:
    - phruta, Phylogenetic Reconstruction and Time-dating. Submitted by Cristian Román Palacios.
- Two at ‘5/awaiting-reviewer(s)-response’:
    - stochLAB, Stochastic Collision Risk Model. Submitted by Grant. (Stats).
    - rb3, Download and Parse Public Data Released by B3 Exchange. Submitted by Marcelo S. Perlin.
- Three at ‘4/review(s)-in-awaiting-changes’:
    - octolog, Better Github Action Logging. Submitted by Jacob Wujciak-Jens.
    - tsbox, Class-Agnostic Time Series. Submitted by Christoph Sax. (Stats).
    - healthdatacsv, Access data in the healthdata.gov catalog. Submitted by iecastro.
- Four at ‘3/reviewer(s)-assigned’:
    - waywiser, Ergonomic Methods for Assessing Spatial Models. Submitted by Michael Mahoney. (Stats).
    - openalexR, Getting Bibliographic Records from OpenAlex Database Using DSL. Submitted by Trang Le.
    - dfms, Dynamic Factor Models. Submitted by Sebastian Krantz.
    - wmm, World Magnetic Model. Submitted by Will Frierson.
- One at ‘2/seeking-reviewer(s)’:
    - bssm, Bayesian Inference of Non-Linear and Non-Gaussian State Space. Submitted by Jouni Helske. (Stats).
- One at ‘1/editor-checks’:
    - ohun, Optimizing Acoustic Signal Detection. Submitted by Marcelo Araya-Salas.
Find out more about Software Peer Review and how to get involved.
On the blog
- rOpenSci Code of Conduct Annual Review by Yanina Bellini Saibene, Mark Padgham, Kara Woo, and Megan Carter. Updates for version 2.4 of rOpenSci’s Code of Conduct.
- rOpenSci 2022 Code of Conduct Transparency Report by Yanina Bellini Saibene, Mark Padgham, and Kara Woo. rOpenSci 2022 Code of Conduct Transparency Report.
- Expanding our Community through Multilingual Publishing by Yanina Bellini Saibene, Pao Corrales, Elio Campitelli, and Maëlle Salmon. We are translating rOpenSci’s materials on best practices for software development, code review, and contribution to open source projects into Spanish! We are also developing guidelines and tools for translating open source resources to reach a wider audience. Learn about the project in this blog post.
- Agrandando nuestra comunidad con publicaciones multi-idioma by Yanina Bellini Saibene, Pao Corrales, Elio Campitelli, and Maëlle Salmon. The Spanish-language version of “Expanding our Community through Multilingual Publishing”: rOpenSci’s materials on best practices for software development, code review, and contribution to open source projects are being translated into Spanish, alongside new guidelines and tools for translating open source resources to reach a wider audience.
Tech Notes
- curl 5.0.0: massive concurrent downloads and HTTP/2 by Jeroen Ooms. A new major version of the curl package has been released to CRAN. This release brings both major internal improvements and new user-facing functionality, in particular with respect to concurrent downloads.
Call for maintainers
Calls for maintainers
- RSelenium, R Bindings for ‘Selenium WebDriver’. Provides a set of R bindings for the ‘Selenium 2.0 WebDriver’ using the ‘JsonWireProtocol’. ‘Selenium 2.0 WebDriver’ allows driving a web browser natively as a user would, either locally or on a remote machine using the Selenium server; it marks a leap forward in terms of web browser automation. Issue for volunteering.
- elastic, General Purpose Interface to ‘Elasticsearch’. Connect to ‘Elasticsearch’, a ‘NoSQL’ database built on the ‘Java’ Virtual Machine. Interacts with the ‘Elasticsearch’ ‘HTTP’ API, including functions for setting connection details to ‘Elasticsearch’ instances, loading bulk data, and searching for documents with both ‘HTTP’ query variables and ‘JSON’ based body requests. Issue for volunteering.
- citesdb, a high-performance database of shipment-level CITES trade data. Provides convenient access to over 40 years and 20 million records of endangered wildlife trade data from the Convention on International Trade in Endangered Species of Wild Fauna and Flora, stored on a local on-disk, out-of-memory ‘DuckDB’ database for bulk analysis. Issue for volunteering.
Call for comaintainers
Refer to our recent blog post to identify packages where help is especially welcome!
Package development corner
Some useful tips for R package developers. 👀
Bad code? Good code?
Ever feel code shame when looking at older, weaker code of yours? In case you missed it, this 2015 blog post by David Robinson underlines how important it is not to code-shame anyone, lest they lose the courage to keep coding and improving: “A Million Lines of Bad Code”. And remember testthat always believes in you. 😉
Write an R Package from R Markdown?
If you ever dreamed of writing an R package from R Markdown, check out the fusen package by Sébastien Rochette or the litr package by Jacob Bien and Patrick Vossler.
Example of help guidance
The targets manual (by targets maintainer Will Landau) has an interesting chapter on how to ask for help that might inspire other contributing guides! Note the explanation of “out of office” periods.
External libraries and the rOpenSci build system
Packages needing external system libraries should specify those libraries in the “SystemRequirements” field of the “DESCRIPTION” file. Most package installation systems will parse “SystemRequirements” entries using the rules provided at rstudio/r-system-requirements. Any libraries not covered by the “rules” sub-folder of that repository will generally not be automatically installed. It is nevertheless still useful to list all external dependencies, to at least aid manual installation. Our rOpenSci build system includes libraries listed in the r-universe-org/base-image. That list of libraries can easily be extended, so please contact us, or submit a pull request, if you’d like our system to include any additional system libraries.
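As a rough illustration of the field described above, a “DESCRIPTION” fragment might look like the following. The package name, title and version are hypothetical, and the fragment omits other fields a complete “DESCRIPTION” needs. “SystemRequirements” is free text, so the wording below is one common style rather than a required syntax.

```
Package: examplepkg
Title: Hypothetical Package Linking to External Libraries
Version: 0.0.1
SystemRequirements: libcurl: libcurl-devel (rpm) or
    libcurl4-openssl-dev (deb)
```

Naming both the library and its distribution-specific packages gives automated rule matching something to recognise and tells people installing by hand what to look for.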
Last words
Thanks for reading! If you want to get involved with rOpenSci, check out our Contributing Guide that can help direct you to the right place, whether you want to make code contributions, non-code contributions, or contribute in other ways like sharing use cases.
If you haven’t subscribed to our newsletter yet, you can do so via a form. Until it’s time for our next newsletter, you can keep in touch with us via our website and Twitter account.