Influenza pandemics are potentially the most serious natural catastrophes that affect the human population. New findings published in PLOS Computational Biology suggest that, given timely and accurate data and sophisticated numerical models, the likely impact of a new pandemic can be assessed quickly and key decisions can be made about potential mitigation strategies.

Novel strains of influenza emerge periodically and can pose major challenges for health planners. The 1918 Spanish flu, for example, was responsible for the deaths of some 50 million people. A 2015 report by the UK’s Cabinet Office, the National Risk Register of Civil Emergencies, identified pandemic influenza as the highest-priority natural hazard risk. When faced with a newly emerging strain of virus, policy makers would like to know: (1) How many people are going to be affected? (2) How severe will it be? (3) What mitigation strategies can be implemented successfully?

In a new study, a team of international researchers representing commercial, academic, and government institutions addressed these questions by analyzing a unique set of data from active-duty personnel of the U.S. military and developing a sophisticated mathematical model. The team focused on the 2009 pandemic (known as the “swine flu”). They created incidence profiles from detailed records of visits to military clinics at all major military installations. They also developed a tailor-made model to ingest these data and jointly estimate both the transmissibility of the pandemic and its severity.

Transmissibility is generally estimated through the parameter known as the basic reproductive number, R0, which is the average number of secondary cases generated by one typical infectious individual in an otherwise susceptible population. In other words, it is the number of people that an infectious person is likely to infect. For influenza, this is typically between 1.5 and 3. Severity can be estimated in a number of ways. For this study, the authors estimated the fraction of those infected who actually presented themselves to a clinic (pC).
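To make the roles of R0 and pC concrete, the following is a minimal sketch of a textbook SIR (susceptible, infectious, recovered) model. It is not the authors' model. The function name, the population size, the three-day infectious period, and the simple Euler time-stepping are all assumptions chosen for illustration. R0 sets how fast infections spread, and pC scales infections down to the clinic visits that a surveillance system would actually record.

```python
def simulate_sir(r0, p_clinic, population=100_000, infectious_days=3.0,
                 initial_infected=10, days=365, steps_per_day=10):
    """Return (daily_infections, daily_clinic_visits) from a simple SIR model.

    All parameter defaults are illustrative assumptions, not values
    taken from the study.
    """
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r0 * gamma               # transmission rate per day, since R0 = beta / gamma
    dt = 1.0 / steps_per_day        # Euler time step in days

    susceptible = float(population - initial_infected)
    infectious = float(initial_infected)

    daily_infections = []
    for _ in range(days):
        new_today = 0.0
        for _ in range(steps_per_day):
            new_infections = beta * susceptible * infectious / population * dt
            new_recoveries = gamma * infectious * dt
            susceptible -= new_infections
            infectious += new_infections - new_recoveries
            new_today += new_infections
        daily_infections.append(new_today)

    # Only a fraction pC of infected people present themselves to a clinic.
    daily_clinic_visits = [p_clinic * n for n in daily_infections]
    return daily_infections, daily_clinic_visits


if __name__ == "__main__":
    infections, visits = simulate_sir(r0=2.0, p_clinic=0.1)
    print(f"total infected:      {sum(infections):,.0f}")
    print(f"total clinic visits: {sum(visits):,.0f}")
```

In this sketch the clinic-visit curve has the same shape as the infection curve and differs only in scale. That is why transmissibility and severity have to be estimated jointly when clinic visits are the only data available.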

The authors were able to demonstrate that timely data from early-infected military bases could inform the model and produce robust predictions for the later large-scale outbreak across the USA. Additionally, the model estimated the fraction of those with more serious conditions. While the 2009 pandemic was, in retrospect, a mild pandemic (R0 was estimated to be 1.35, and pC was estimated to be 7%), it served as an ideal test bed for developing a general predictive tool that can be applied in the early stages of a future pandemic.

To test the benefit of this approach, the authors simulated a future moderate pandemic strain with pC approximately 10 times that of 2009. The results showed that, even before the peak had passed in the first affected population, both R0 and pC could be well estimated and predicted for all populations. Additionally, the authors were able to show what the effects of mitigation strategies would be in terms of the total number of people infected and the severity of their infection.
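As a rough illustration of how a mitigation strategy changes the outcome, the sketch below uses the standard final-size relation for a simple epidemic, z = 1 - exp(-R0 z), where z is the fraction of the population ever infected. This is a textbook approximation that assumes a fully susceptible, evenly mixing population. It is not the method used in the study. The R0 of 1.8, the 25% reduction in transmission, the population of one million, and the function names are hypothetical; only the pC of 0.7 follows the article (10 times the 2009 estimate of 7%).

```python
import math


def final_attack_rate(r0, tol=1e-12, max_iter=10_000):
    """Solve z = 1 - exp(-r0 * z) for the fraction ever infected.

    Returns 0.0 when r0 <= 1, because no large outbreak occurs.
    """
    if r0 <= 1.0:
        return 0.0
    z = 0.5  # any positive starting guess converges to the positive root
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def outbreak_summary(r0, p_clinic, population, r0_reduction=0.0):
    """Summarize an outbreak; mitigation is modeled as a fractional cut in R0 (an assumption)."""
    effective_r0 = r0 * (1.0 - r0_reduction)
    infected = final_attack_rate(effective_r0) * population
    return {
        "effective_r0": effective_r0,
        "infected": infected,
        "clinic_visits": p_clinic * infected,
    }


if __name__ == "__main__":
    # Hypothetical moderate strain: R0 = 1.8 (assumed), pC = 0.7 (10 x 7%).
    baseline = outbreak_summary(1.8, 0.7, 1_000_000)
    mitigated = outbreak_summary(1.8, 0.7, 1_000_000, r0_reduction=0.25)
    for label, result in (("baseline", baseline), ("mitigated", mitigated)):
        print(f"{label}: infected={result['infected']:,.0f}, "
              f"clinic visits={result['clinic_visits']:,.0f}")
```

Treating mitigation as a single cut in R0 is a deliberate simplification. Real interventions such as vaccination or school closures act at different times and on different groups, which a tailored model would need to represent.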

The study highlighted the importance of: (1) assessing future novel respiratory pathogens in a two-dimensional space (R0 and severity); (2) obtaining timely and accurate incidence data; and (3) using models specifically honed for the particular data being assessed.

The investigation was based on a large database of clinical visit data. Such databases are currently being constructed in many different countries. The fact that the analysis described in this paper could be conducted on any such database strengthens the case for generating these information resources in real time and making them available to the best data science tools.