How well prepared are we to rapidly analyse a new influenza pandemic? A brief perspective on analysis conducted for UK government advisory groups during COVID-19

[This article was first published on Epiverse-TRACE developer space, and kindly contributed to R-bloggers.]

With multiple reports of influenza H5N1 cases that have no clear animal exposure, it is useful to consider what kinds of analysis would be required, and how easily this could be performed. As a starting point, this post reflects on some of the real-time work my colleagues and I contributed to inform the UK response to COVID-19.

Reflections on COVID-19 Contributions

During COVID-19, academic participants like myself in the UK contributed analysis to the SPI-M-O advisory group, which focused on epidemiology and modelling. This was a subgroup of the Scientific Advisory Group for Emergencies.

These contributions typically fell into two main categories:

  • Reports in response to specific questions from the Secretariat (e.g. exploring implications of policy options).
  • Reports or preliminary results detailing broader epidemiological insights about COVID-19 my colleagues and I thought were noteworthy (e.g. unusual patterns with novel variants).

This post focuses on the analysis reports that I made a major contribution to as a member of SPI-M-O in the first 18 months after COVID was identified as a threat, i.e. between Jan 2020 and July 2021 (a full list has previously been published). Narrative reports or analysis that did not involve substantial analytics or modelling (e.g. just direct plots of data) are not included here. I also include some major early pieces of analysis that did not form SPI-M-O reports, but were published and informed subsequent analysis and modelling.

If another pandemic were to hit, how easily and quickly could we do these analyses again? To document where we currently are, it is important to understand where potential gaps and bottlenecks are. For each report, I therefore review three main criteria:

  1. Code availability: is the original analysis code public? (With link if relevant, or context if not.)
  2. Package availability: is the analysis code or underlying method currently packaged up for easy reuse?
  3. Task readiness time: roughly how long would it take to get the code or package into a rough state where it could re-run an equivalent analysis using the characteristics of a future transmissible H5N1 influenza? And how long to do so while also following robust best practice Epiverse-TRACE development principles, so others can easily build on the analysis? Would it take minutes (i.e. possible to run immediately), hours, days or weeks to get the basic functionality working?

Criteria (1) and (2) are either marked as available (✅), not available (❌), or partially available (⏳). Time taken is also divided into ‘rough’ (rapid, imperfect code to deliver a task) and ‘robust’ (i.e. best practice development for future re-use) to reflect wider discussions about what constitutes ‘good enough’ work in a pandemic. I also suggest some areas for potential further development, or links to ongoing development that will enable easier completion of tasks in future. The focus of the post tends to be packages that sit on the CMMID, Epiverse-TRACE, Epiforecasts or my GitHub repositories, because these were most directly related to the tasks being discussed, but some wider packages are also signposted.

This is not an exhaustive list of work performed by myself and colleagues in the Centre for Mathematical Modelling of Infectious Diseases at LSHTM; there is a larger CMMID repository of real-time work, as well as a large volume of published papers in academic journals and reports on the gov.uk website. However, I hope this initial post can provide a useful summary of tasks that were performed, and a framework for evaluating wider efforts required for a pandemic.

The overall effectiveness of the UK response, and which areas could be strengthened, are topics currently being examined by the COVID Inquiry, and will not be covered here. If readers are interested, there are some broader reflections from members of the UK modelling community, and recommendations for improvements, in Sherratt et al, 2024, Wellcome Open Research.

28 Jan 2020: Early estimation of transmissibility and control

Code ✅ | Package ⏳ | Days (rough), Weeks (robust)

This analysis in early 2020 focused on estimation of transmissibility and the subsequent effect of lockdown control measures, and brought together reported cases in China, exported cases identified internationally, and infections detected on evacuation flights. A stochastic SEIR model was fitted with sequential Monte Carlo to estimate how the reproduction number $R_t$ varied over time, to distinguish between an epidemic that ended because of control vs immunity. Later published as Kucharski et al, Lancet Inf Dis, 2020.

The code used to generate estimates for the probability of a large outbreak is now available in the Epiverse-TRACE {superspreading} package.
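
As a rough illustration of what a large-outbreak probability calculation involves, the sketch below uses the textbook branching process result rather than the original analysis code or the {superspreading} implementation. It assumes a negative binomial offspring distribution with mean `r0` and dispersion `k`, and independent introductions; the function names and parameter values are hypothetical.

```python
def extinction_probability(r0, k, tol=1e-12, max_iter=100_000):
    """Probability that a chain started by a single case dies out.

    Iterates q <- G(q) from q = 0, where G is the probability generating
    function of a negative binomial offspring distribution with mean r0
    and dispersion k. This converges to the smallest root of q = G(q).
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 + r0 * (1.0 - q) / k) ** (-k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def probability_large_outbreak(r0, k, introductions=1):
    """Probability that at least one of several independent introductions takes off."""
    return 1.0 - extinction_probability(r0, k) ** introductions


# Illustrative values only (not estimates for any pathogen).
for n_introductions in (1, 5, 10):
    print(n_introductions, probability_large_outbreak(r0=2.0, k=0.5, introductions=n_introductions))
```

For a fixed mean, a smaller `k` (more superspreading) lowers the chance that any single introduction takes off, which is why both quantities matter when interpreting early case counts.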

Suggested or ongoing development: There are now much more efficient methods for performing the main real-time inference analysis, particularly the {dust} toolkit in combination with {mcstate}. As future tools will have to be flexible enough to be applicable in a wide range of modelling scenarios with different data sources, the main ongoing task is to ensure these are well documented and have been tested with relevant examples that can serve as templates for future work. The {seir} package is collating a library of simple model fitting implementations, and a branch has implemented estimation of a fixed reproduction number in an SEIR model using early exported COVID cases. The next step would be to implement an example with a time-varying reproduction number.

2 Feb 2020: Early analysis of contact tracing effectiveness

Code ✅ | Package ✅ | Hours (rough), Days (robust)

Another early analysis was a paper for SPI-M-O: Feasibility of controlling 2019-nCoV outbreaks by isolation of cases and contacts, later published as Hellewell et al, Lancet Global Health (2020). It used a branching process model to explore how transmission parameters (e.g. $R_0$ and % presymptomatic transmission) and control parameters (e.g. % contacts traced) could influence the risk of a large outbreak.

Suggested or ongoing development: There are several issues on the {ringbp} package repo that, once complete, would allow for faster implementation for new pathogens, especially in combination with {epiparameter}. The {epi.branch.sim} package, which is based on the Hellewell et al paper, also offers an arguably more developed package for plug-and-play analysis in the meantime. The {epichains} package also allows for estimation of simpler branching processes (i.e. without targeted control like contact tracing).
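
To make the branching process idea concrete, here is a deliberately simplified sketch. It is not the {ringbp} model: it ignores incubation periods, delays to isolation and presymptomatic transmission, and simply assumes that each traced contact is isolated before transmitting. All function names, the outbreak cap and the parameter values are hypothetical.

```python
import math
import random


def draw_poisson(rng, mean):
    """Poisson draw via Knuth's multiplication method (adequate for modest means)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def draw_secondary_cases(rng, r0, k):
    """Negative binomial draw (mean r0, dispersion k) as a gamma-Poisson mixture."""
    return draw_poisson(rng, rng.gammavariate(k, r0 / k))


def outbreak_is_controlled(rng, r0, k, prob_traced,
                           initial_cases=5, cap=500, max_generations=30):
    """Simulate one outbreak; True if it dies out before reaching `cap` cases."""
    active = total = initial_cases
    for _ in range(max_generations):
        if active == 0:
            return True
        if total >= cap:
            return False
        untraced = 0
        for _ in range(active):
            for _ in range(draw_secondary_cases(rng, r0, k)):
                total += 1
                # Assumption: a traced contact is isolated before transmitting.
                if rng.random() >= prob_traced:
                    untraced += 1
        active = untraced
    return False


def probability_controlled(r0, k, prob_traced, n_sims=500, seed=1):
    """Share of simulated outbreaks that die out for a given tracing coverage."""
    rng = random.Random(seed)
    controlled = sum(outbreak_is_controlled(rng, r0, k, prob_traced) for _ in range(n_sims))
    return controlled / n_sims


for coverage in (0.0, 0.4, 0.8):
    print(coverage, probability_controlled(r0=2.5, k=0.16, prob_traced=coverage))
```

A packaged implementation would replace the single `prob_traced` shortcut with explicit delay distributions, which is where pathogen-specific parameters would enter.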

3 Mar 2020: Early estimation of severity

Code ✅ | Package ✅ | Hours (rough), Days (robust)

This analysis estimated the infection and case fatality ratio by age for COVID-19 using age-adjusted data from the outbreak on the Diamond Princess cruise ship. Later published as Russell et al, Eurosurveillance, 2020.

The methods used for this analysis are now implemented as the Epiverse-TRACE {cfr} package. There is also a ‘how to’ example for age-specific CFR estimation.
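
A common ingredient of real-time severity estimation is correcting for cases whose outcome is not yet known. The sketch below illustrates that general idea and should not be read as the {cfr} implementation; it assumes a known discrete delay distribution from case report to death, and the function name and toy numbers are hypothetical.

```python
def delay_adjusted_cfr(daily_cases, daily_deaths, delay_pmf):
    """Return (naive_cfr, adjusted_cfr).

    delay_pmf[d] is the assumed probability that a fatal case dies d days
    after being reported. The adjusted estimate divides deaths by the
    expected number of cases whose outcome would already be known.
    """
    n_days = len(daily_cases)
    expected_known_outcomes = 0.0
    for day, cases in enumerate(daily_cases):
        days_observed = n_days - 1 - day
        prob_outcome_known = sum(delay_pmf[: days_observed + 1])
        expected_known_outcomes += cases * prob_outcome_known
    total_deaths = sum(daily_deaths)
    naive = total_deaths / sum(daily_cases)
    adjusted = total_deaths / expected_known_outcomes
    return naive, adjusted


# Toy data: a fatal outcome occurs 1 or 2 days after report with equal probability.
naive, adjusted = delay_adjusted_cfr(
    daily_cases=[10, 10, 10, 10],
    daily_deaths=[0, 0, 1, 1],
    delay_pmf=[0.0, 0.5, 0.5],
)
print(naive, adjusted)
```

The naive ratio is biased downwards while an outbreak is growing, because recent cases are counted in the denominator before their outcomes can appear in the numerator.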

Suggested or ongoing development: There are several GitHub issues open that aim to strengthen {cfr}, especially for edge cases (e.g. very high CFR) or uncertainty when estimating underascertainment.

11 Mar 2020: pre-COVID social mixing patterns

Data ✅ | Package ❌ | Hours (rough), Days (robust)

Another early paper for SPI-M-O focused on social mixing patterns in the UK from the 2017/18 BBC public science project: Some results from the BBC project on contact rates by context and age, later expanded into Klepac et al, MedRxiv, 2020. The underlying contact matrices were made available alongside the paper.

Suggested or ongoing development: It would be useful to incorporate data into {socialmixr} to enable easy re-use in R. There is also an issue to add eigenvector calculation, alongside the calculation already implemented in the {finalsize} package, to illustrate which age groups drive early epidemic growth. There is also an incoming training episode on contact matrices.
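
The eigenvector calculation mentioned above can be sketched with a simple power iteration. This is an illustration rather than the {socialmixr} or {finalsize} code, and it assumes that `matrix[i][j]` is the mean number of infections in age group `i` caused by one case in age group `j`; the three-group matrix is made up.

```python
def dominant_eigenpair(matrix, n_iter=1000, tol=1e-12):
    """Power iteration for a non-negative square matrix.

    Returns the dominant eigenvalue and its eigenvector, normalised to sum
    to one, so entries can be read as the share of cases in each group
    during early exponential growth.
    """
    size = len(matrix)
    vector = [1.0 / size] * size
    eigenvalue = 0.0
    for _ in range(n_iter):
        product = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
        new_eigenvalue = sum(product)  # valid because `vector` sums to one
        new_vector = [value / new_eigenvalue for value in product]
        converged = max(abs(a - b) for a, b in zip(new_vector, vector)) < tol
        vector, eigenvalue = new_vector, new_eigenvalue
        if converged:
            break
    return eigenvalue, vector


# Hypothetical next-generation matrix for three age groups (children, adults, older adults).
example_matrix = [
    [1.2, 0.4, 0.1],
    [0.5, 0.9, 0.3],
    [0.1, 0.3, 0.4],
]
growth_factor, case_shares = dominant_eigenpair(example_matrix)
print(growth_factor, case_shares)
```

The dominant eigenvalue plays the role of the reproduction number for the structured population, while the eigenvector shows which groups would account for most early infections.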

3 Mar 2020 onwards: Early population-level scenarios

Code ✅ | Package ✅ | Hours (rough), Days (robust)

A collection of population-level scenario modelling reports generated between February and April 2020 was later published as a summary paper, Davies et al, Lancet Public Health, 2020. The original model, known as covidm, had a code base that would be reused for multiple epidemic waves, including novel variants. However, many of the basic scenarios can now be explored using the Epiverse-TRACE {epidemics} package. In particular, {epidemics} can simulate scenarios with multiple overlapping interventions targeting different age groups, and reflect uncertainty in the reproduction number. A version of this package has already been used to project future outbreak scenarios in Gaza.

Some further reflections on specific pieces of COVID analysis for SPI-M-O in 2020 are listed below:

Suggested or ongoing development: There is work in progress with {epidemics} to allow a more flexible and editable {odin} back end, in case different features are required in future. This will have the advantage of combining the plug-and-play ability to rapidly define age-specific contact structure, demography, parameter uncertainty and overlapping interventions using {epidemics} syntax with a fast and adaptable {odin} simulation model.
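
For readers unfamiliar with scenario modelling, the following is a minimal sketch of the kind of comparison involved. It is a single-population deterministic SEIR model with one time-limited reduction in contacts, not covidm or {epidemics}: there is no age structure, no parameter uncertainty and no overlapping interventions. All names and parameter values are hypothetical.

```python
def simulate_seir(r0, latent_days, infectious_days, population, initial_infectious,
                  n_days, contact_reduction=0.0, intervention_window=(0, 0),
                  steps_per_day=10):
    """Deterministic SEIR model solved with Euler steps.

    Contacts are scaled by (1 - contact_reduction) on days inside
    intervention_window = (start_day, end_day), end exclusive.
    Returns the number infectious at the start of each day.
    """
    dt = 1.0 / steps_per_day
    sigma = 1.0 / latent_days
    gamma = 1.0 / infectious_days
    s = float(population - initial_infectious)
    e, i = 0.0, float(initial_infectious)
    start, end = intervention_window
    daily_infectious = []
    for day in range(n_days):
        scale = 1.0 - contact_reduction if start <= day < end else 1.0
        beta = scale * r0 * gamma
        daily_infectious.append(i)
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
    return daily_infectious


baseline = simulate_seir(2.0, 2.0, 3.0, 1_000_000, 10, n_days=200)
with_intervention = simulate_seir(2.0, 2.0, 3.0, 1_000_000, 10, n_days=200,
                                  contact_reduction=0.5, intervention_window=(40, 200))
print(max(baseline), max(with_intervention))
```

Note that a short intervention that ends while most people are still susceptible mainly delays the peak in a model like this, which is one reason scenario reports compared several timings and durations.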

Detailed contact tracing analysis

Code ✅ | Package ✅ | Days (rough), Weeks (robust)

This collection of individual-level testing and contact tracing modelling reports, which made use of the BBC social mixing data, was later published as a summary paper, Kucharski et al, Lancet Inf Dis, 2020. The original model was in R, and was later converted into a Python library as part of the Royal Society Delve initiative, feeding into follow-up analysis. This was a rapidly developed bespoke model with multiple types of contact (e.g. home, school, work, other) and an approximated transmission dynamic rather than full simulation (i.e. if 50% of infectious contacts are traced half-way through their likely infectious period, it would cut the reproduction number by 25%).
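
The arithmetic behind the 50% traced, half-way example can be written out as below. This is a sketch of the approximation only, assuming infectiousness is spread evenly over the infectious period so that isolating half-way through averts half of onward transmission; the function names are hypothetical.

```python
def relative_reduction_in_transmission(prop_contacts_traced, prop_infectious_period_averted):
    """Relative reduction = share of contacts traced x share of their transmission averted."""
    return prop_contacts_traced * prop_infectious_period_averted


def reproduction_number_after_tracing(r_before, prop_contacts_traced,
                                      prop_infectious_period_averted):
    """Reproduction number once the relative reduction has been applied."""
    reduction = relative_reduction_in_transmission(
        prop_contacts_traced, prop_infectious_period_averted
    )
    return r_before * (1.0 - reduction)


# 50% of infectious contacts traced, half-way through their infectious period.
print(relative_reduction_in_transmission(0.5, 0.5))
print(reproduction_number_after_tracing(2.0, 0.5, 0.5))
```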

Suggested or ongoing development: For a future epidemic, it may be more useful to merge these concepts into two types of tool, building on successful outputs for COVID: 1) a ‘ready reckoner’ method that shows very intuitively how contact changes influence the overall reproduction number (this could make use of functionality in {finalsize}), and 2) a more comprehensive model of isolation and quarantine, like the prospective {epinetwork} package.

Some further reflections on specific analysis reports for SPI-M-O are below:

3 June 2020: Analysis of superspreading

Code ❌ | Package ✅ | Minutes (robust)

This paper for SPI-M-O, Analysis of SARS-CoV-2 transmission clusters and superspreading events, provided different metrics to summarise the superspreading features of SARS-CoV-2. These functions are now in the {superspreading} package, with examples in the Epiverse-TRACE training.

10 June 2020: Analysis of forwards and backwards tracing

Code ✅ | Package ❌ | Hours (rough), Days (robust)

This paper for SPI-M-O, Branching process modelling of effectiveness of forward and backward tracing for SARS-CoV-2 control was later published as Endo et al, Wellcome Open Res, 2020.

Suggested or ongoing development: Although the underlying model isn’t in a package, the analysis is featured in the Epiverse-TRACE training. The core insight is also a relatively simple equation, i.e. that backward tracing would be expected to identify $R(1 + 1/k)$ additional cases, where $R$ is the reproduction number and $k$ is the dispersion parameter, so it should be added to relevant training and/or vignettes.
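
The backward tracing result can be checked numerically. The sketch below assumes a negative binomial offspring distribution with mean `r` and dispersion `k`: the infector of a randomly chosen case is sampled in proportion to how many people they infected, so their expected number of other secondary cases is `r * (1 + 1/k)`. The function names and values are hypothetical, and this is not code from Endo et al.

```python
import math


def extra_cases_from_backward_tracing(r, k):
    """Expected number of additional cases linked to the infector of an index case."""
    return r * (1.0 + 1.0 / k)


def negative_binomial_pmf(x, r, k):
    """Probability of x secondary cases, for mean r and dispersion k."""
    log_pmf = (
        math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + r))
        + x * math.log(r / (k + r))
    )
    return math.exp(log_pmf)


def size_biased_check(r, k, max_cases=5000):
    """Same quantity from first principles: E[X^2] / E[X] - 1, by direct summation."""
    second_moment = sum(x * x * negative_binomial_pmf(x, r, k) for x in range(max_cases))
    return second_moment / r - 1.0


# Illustrative values only.
print(extra_cases_from_backward_tracing(r=1.2, k=0.2))
print(size_biased_check(r=1.2, k=0.2))
```

With strong overdispersion (small `k`), the number of cases found by tracing backwards is several times larger than the `r` expected from tracing forwards, which is the practical argument for backward tracing.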

14 Oct 2020: Testing and contact tracing in a real-world network

Code ✅ | Package ✅ | Days (rough), Weeks (robust)

This paper for SPI-M-O, Modelling effectiveness of TTI and physical distancing in controlling SARS-CoV-2 in high and low prevalence communities, based on UK contact network data, built on Firth et al, Nature Med, 2020. This used BBC contact network data from Haslemere to investigate interventions in clustered networks, leading to a package {covidhm} that built on {ringbp}. This package was subsequently also used for analysis of outbreak dynamics on Singapore test cruises.

Suggested or ongoing development: Once {ringbp} is stable, there is scope to expand to include the above functionality with the placeholder {epinetwork} package, which is a fork of the more specific {covidhm} implementation.

Oct-Dec 2020: Strategies for PCR and lateral flow testing

The below papers for SPI-M-O used data on PCR and lateral flow performance to investigate different testing strategies.

21 Oct. Modelling frequent testing using PCR and lateral flow based on detection probabilities estimated from regular testing of health care workers. This paper used testing data from UCLH to infer the probability of test positivity post infection. It would later be published as Hellewell et al, BMC Medicine, 2021.

Code ✅ | Package ⏳ | Days (rough), Weeks (robust)

Suggested or ongoing development: This analysis focused on test positivity as an outcome, tailored to data available from the UCLH study (PCR + paired serology), but a more detailed framework could use Ct data as well, such as the codebase for LEGACY Ct modelling and the {epikinetics} package currently in progress for antibody kinetics (which could also be adapted to other biological timescales).

2 Dec. Estimating detection of infection among household gathering attendees based on one-off pre-gathering lateral flow tests. This paper used posteriors from the above analysis to explore different testing scenarios for family gatherings.

Code ❌ | Package ❌ | Minutes (rough), Days (robust)

Suggested or ongoing development: The underlying equations in this analysis are quite simple (i.e. no more than a few lines of code), but could form a useful helper package. James Hay also built a somewhat related Shiny app (code here) linked to an accompanying paper about the intuition behind testing performance.
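
To give a sense of how short such a calculation can be, here is one possible version. It is an assumption about the general form rather than the equations in the SPI-M-O paper: attendees are treated as independent, each infected with probability `prevalence`, and a single `test_sensitivity` stands in for the time-varying detection probabilities estimated from the UCLH data. All names and values are hypothetical.

```python
def prob_undetected_infection_at_gathering(n_attendees, prevalence, test_sensitivity):
    """Probability that at least one infected attendee tests negative and attends."""
    prob_attendee_missed = prevalence * (1.0 - test_sensitivity)
    return 1.0 - (1.0 - prob_attendee_missed) ** n_attendees


# Illustrative values only: 10 attendees, 2% prevalence, with and without testing.
print(prob_undetected_infection_at_gathering(10, prevalence=0.02, test_sensitivity=0.0))
print(prob_undetected_infection_at_gathering(10, prevalence=0.02, test_sensitivity=0.5))
```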

9 Mar 2021: Potential for herd immunity against the Alpha variant

Code ✅ | Package ❌ | Minutes (rough), Hours (robust)

This paper for SPI-M-O looked at the potential for vaccination-induced herd immunity against SARS-CoV-2, based on $R_0$ and vaccine effectiveness. It would be later published as Hodgson et al, Eurosurveillance, 2021.

Suggested or ongoing development: The basic calculation was relatively simple (a herd immunity threshold of $1 - 1/R_0$, which holds regardless of age mixing assumptions, as long as the correct $R_0$ has been derived for the population of interest). However, {finalsize} has the functionality required to estimate this (or the effective reproduction number) for a given population and immunity structure (e.g. from prior infection).
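
The standard herd immunity calculation can be sketched in a few lines. This assumes the textbook threshold of `1 - 1/r0` and treats vaccine effectiveness as protection against infection, so the required coverage is the threshold divided by effectiveness; it ignores prior infection and waning. The function names and values are hypothetical, not taken from Hodgson et al.

```python
def herd_immunity_threshold(r0):
    """Share of the population that must be immune for the reproduction number to fall below 1."""
    return max(0.0, 1.0 - 1.0 / r0)


def required_vaccine_coverage(r0, vaccine_effectiveness):
    """Coverage needed to reach the threshold; values above 1 mean it is not achievable."""
    return herd_immunity_threshold(r0) / vaccine_effectiveness


# Illustrative values only.
for r0, effectiveness in [(3.0, 0.9), (4.0, 0.9), (6.0, 0.7)]:
    coverage = required_vaccine_coverage(r0, effectiveness)
    print(r0, effectiveness, coverage, "achievable" if coverage <= 1.0 else "not achievable")
```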

May-Jun 2021: Transmission dynamics of the Delta variant

Code ✅ | Package ❌ | Days (rough), Weeks (robust)

This collection of reports for SPI-M-O/SAGE analysed the transmission dynamics of the B.1.617.2 (Delta) variant in the UK, untangling imported infections from community transmission. The real-time model was coded in R, with fitting via MLE and then MCMC (as parameter space increased), with a Stan prototype of the model also developed by Sam Abbott.

Suggested or ongoing development: The current code base is tailored to COVID-19, but the broader issue of distinguishing external importations (which may be known to some extent, e.g. based on timing of travel ban) from domestic human-to-human transmission also comes up for infections like avian influenza and mpox. If cases can be disaggregated into imported or domestic origin, then {EpiEstim} can calculate domestic transmission based on these data (Thompson et al, Epidemics, 2019). However, estimation packages based on renewal processes, like {EpiEstim} and {EpiNow2}, are not structured to infer such dynamics if the exact number of importations is unknown, so in future it may be useful to have a framework that builds on the Golding et al two component approach developed for the Australian COVID response (accompanying code here).
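
When import status is known, the renewal-equation logic can be sketched as follows. This is a simplified point estimate in the spirit of the approach described above, not the {EpiEstim} implementation: imported cases contribute to infection pressure but are not counted as locally infected, the generation interval distribution is assumed known, and there is no uncertainty quantification. All names and the toy data are hypothetical.

```python
def local_reproduction_number(local_cases, imported_cases, generation_pmf, window):
    """Ratio of locally infected cases to total infection pressure over a window.

    generation_pmf[s - 1] is the probability of a generation interval of s days.
    window = (first_day, last_day), both inclusive, as indices into the case lists.
    """
    total_cases = [local + imported for local, imported in zip(local_cases, imported_cases)]

    def infection_pressure(day):
        return sum(
            weight * total_cases[day - lag]
            for lag, weight in enumerate(generation_pmf, start=1)
            if day - lag >= 0
        )

    first_day, last_day = window
    days = range(first_day, last_day + 1)
    return sum(local_cases[d] for d in days) / sum(infection_pressure(d) for d in days)


# Toy data: two days of importations followed by local transmission.
imported = [4, 4, 0, 0, 0, 0]
local = [0, 3, 6, 7, 10, 13]
generation_pmf = [0.5, 0.5]
print(local_reproduction_number(local, imported, generation_pmf, window=(1, 5)))
# Treating every case as locally infected gives a different (higher) answer.
all_cases = [l + m for l, m in zip(local, imported)]
print(local_reproduction_number(all_cases, [0] * 6, generation_pmf, window=(1, 5)))
```

The difficulty highlighted above is that this only works when the split between `imported` and `local` is observed; if it is unknown, the split itself has to be inferred.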

1 Jun 2021: Analysis of social contact data during reopening

Code ✅ | Package ⏳ | Hours (rough), Weeks (robust)

Two papers for SPI-M-O/SAGE looked at social contact dynamics during reopening:

CoMix data are now available as part of the {socialmixr} package, with code for the secondary case distribution released with the Chapman et al pre-print.

Suggested or ongoing development: The methods used for ‘first principles’ reconstruction of the secondary case distribution could be included in a future social mixing analysis package, but would benefit from a larger database of viral load trajectories for different pathogens (currently available for SARS-CoV-2, but in theory estimable for a range of acute infectious diseases).

Concluding thoughts

Code was generally made available alongside public reports by LSHTM during COVID-19, except for very simple calculations. Several key functions are now available in R packages developed since the emergence of COVID-19, turning tasks that would have taken days into tasks that require only hours. However, some bottlenecks remain before these methods could be applied easily to H5N1: some customised tasks for pandemic influenza would currently take hours when, with some further refinement, they could take minutes.

Citation

BibTeX citation:
@online{kucharski2024,
  author = {Kucharski, Adam},
  title = {How Well Prepared Are We to Rapidly Analyse a New Influenza
    Pandemic? {A} Brief Perspective on Analysis Conducted for {UK}
    Government Advisory Groups During {COVID-19}},
  date = {2024-12-23},
  url = {https://epiverse-trace.github.io/posts/covid-analysis/},
  langid = {en}
}
For attribution, please cite this work as:
Kucharski, Adam. 2024. “How Well Prepared Are We to Rapidly Analyse a New Influenza Pandemic? A Brief Perspective on Analysis Conducted for UK Government Advisory Groups During COVID-19.” December 23, 2024. https://epiverse-trace.github.io/posts/covid-analysis/.