Records of nearly 16,000 confirmed COVID-19 cases in the UK were lost because Public Health England reportedly configured Microsoft’s Excel to import CSV files from testing labs into the legacy XLS file format.
The BBC reports that Public Health England (PHE) developers botched the import of CSV files received from commercial firms that were contracted to analyze swab tests to detect who has the virus.
PHE had set up a system to automatically import the test labs’ CSV files into Excel spreadsheet templates, which would be then uploaded to a central system and shared with the NHS Test and Trace Team and other agencies.
SEE: Tableau business analytics platform: A cheat sheet (free PDF download) (TechRepublic)
But because PHE developers opted for Excel’s XLS (.xls) file format, which dates back to 2007, each CSV import was limited to about 65,000 rows of data.
Had PHE used the newer XLSX (.xlsx) format – the XML (Extensible Markup Language)-based format – Excel could have handled files with one million rows of data.
The error has hampered the UK’s contact-tracing program at a time when the country is undergoing a second wave of coronavirus infections.
By using the XLS format, PHE’s Excel template could only handle about 1,400 test results, each of which contain several rows of data. Anything beyond that limit failed to import to the Excel templates, resulting in the loss of an estimated 15,841 confirmed cases between 25 September and 2 October.
As the BBC notes, using XLSX would have allowed PHE to import 16 times the number of cases and avoid the error until the UK reached far higher levels of testing.
While nearly 16,000 files were lost for a few days, the Excel blunder potentially affects many more people who were in contact with the people confirmed to have COVID-19. The delay also risks creating another rise in infections that could have been avoided.
The Guardian reports that UK contact tracers are now racing to reach 50,000 people who should be self-isolating.
“All those individuals with positive results that were not entered into the system have contacts who remained an infection risk to others over this period and so we can expect that they will have already contributed extra infections, which we shall see over the coming week or so,” Rowland Kao, a professor of veterinary epidemiology and data science at Edinburgh University, told The Guardian.
PHE revealed the spreadsheet blunder on Sunday, three days after PHE discovered the issue.
“A technical issue was identified overnight on Friday, 2 October in the data-load process that transfers COVID-19 positive lab results into reporting dashboards,” said PHE interim chief executive Michael Brodie.
“After rapid investigation, we have identified that 15,841 cases between 25 September and 2 October were not included in the reported daily COVID-19 cases. The majority of these cases occurred in most recent days.”
On October 4, PHE announced there were 12,594 daily cases, nearly 5,000 more cases than it had recorded before discovering the spreadsheet glitch.