13 August 2020
Co-location, co-location, co-location: using hospital data to predict healthcare-acquired infection
In this blog Jeff Rewley discusses a method of predicting healthcare-acquired infection using hospital data, and how it may apply to contact tracing for COVID-19. The method counts the number of hours a patient spends in the same hospital ward as a patient suspected of infection, and specifically predicts a variety of infections with high accuracy.
Dr. Rewley is an Advanced Fellow for Health Services Research at the Philadelphia VA Medical Center and the Perelman School of Medicine. He completed his PhD in Social Epidemiology at the NIH and Oxford University, where the work discussed below was completed.

The COVID-19 pandemic has highlighted the importance of timely identification of likely-infected persons. Earlier detection of infection can lead to faster treatment, more effective quarantine, and improved contact tracing to limit the spread of the disease. 

COVID-19 has also highlighted that physical tests alone are not sufficient to identify all infected persons and perform contact tracing. Mobile phone apps that perform 'digital contact tracing' are being developed to fill this gap, but they raise serious ethical and privacy concerns. Questions such as how data are transmitted between devices, what personal health information (PHI) is transmitted, and how data are secured all add to the complexity of such an app and of the ethics discussions surrounding it.

Fortunately, a setting exists where each individual is assigned a particular space, and individuals are regularly tested for infection: the hospital. In our recent JHI paper, my colleagues in the US and UK and I used the existing infrastructure of Hospital Administrative Data (HAD) and Electronic Medical Record (EMR) data to examine whether the amount of time a patient spends in a ward with other patients clinically suspected of infection can be used to predict subsequent healthcare-associated infection.

The HAD included information on a patient’s assigned space in the hospital (such as current hospital ward), and the EMR included results of infectious disease tests. Combining these data sources allowed us to monitor patients for prolonged spatial proximity to a patient suspected of infection (based on being assigned to the same ward or ward bay). Because every patient’s assigned space was recorded throughout their stay as part of the hospital’s record-keeping (although specific moment-to-moment movements were not tracked), many of the data privacy worries attached to new mobile apps were not relevant. The hospital therefore served as an ideal proving ground to test a form of digital contact tracing and its ability to identify potential infection early and accurately. The approach was highly predictive of infection even though the data did not pinpoint moment-to-moment movement, which suggests that digital contact tracing apps may be able to 'fuzzify' real-time data (add statistical noise to it) to safeguard personal data while still remaining highly predictive.
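As a rough illustration of how ward assignments and infection suspicion times could be combined, here is a minimal Python sketch. It is not the code used in the study: the record layouts, the names `stays`, `suspected_from` and `colocation_hours`, and the choice to sum hours across every suspected ward-mate are all assumptions made for the example, and the data are invented.

```python
from datetime import datetime

def overlap_hours(a_start, a_end, b_start, b_end):
    """Hours shared by two time intervals (0.0 if they do not overlap)."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max((end - start).total_seconds() / 3600.0, 0.0)

def colocation_hours(patient_id, stays, suspected_from):
    """Total hours a patient shared a ward with suspected-infected patients.

    stays: list of (patient_id, ward, start, end) tuples (hypothetical HAD layout).
    suspected_from: dict mapping patient_id to the time infection was first
        suspected (hypothetical EMR-derived field).
    """
    total = 0.0
    for pid, ward, start, end in stays:
        if pid != patient_id:
            continue
        for other, other_ward, o_start, o_end in stays:
            if other == patient_id or other_ward != ward:
                continue
            if other not in suspected_from:
                continue
            # Assumption: only time after suspicion began counts as exposure.
            exposed_start = max(o_start, suspected_from[other])
            total += overlap_hours(start, end, exposed_start, o_end)
    return total

# Invented toy records, for illustration only.
stays = [
    ("A", "W1", datetime(2020, 1, 1, 0), datetime(2020, 1, 3, 0)),
    ("B", "W1", datetime(2020, 1, 1, 12), datetime(2020, 1, 2, 12)),
    ("C", "W2", datetime(2020, 1, 1, 0), datetime(2020, 1, 3, 0)),
]
suspected_from = {
    "B": datetime(2020, 1, 1, 18),
    "C": datetime(2020, 1, 1, 0),
}

print(colocation_hours("A", stays, suspected_from))  # 18.0
```

Patient A shares ward W1 with patient B, but only the 18 hours after B became a suspected case count towards A's exposure. Patient C is suspected but sits in a different ward, so contributes nothing. Note that this sketch works only from ward-level intervals, which mirrors the point above: no moment-to-moment location data is needed.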

By counting the number of hours a patient was in the same hospital ward as other patients suspected of infection, we were able to specifically predict subsequent infection for a variety of pathogens. This approach has many advantages over traditional contact tracing. One advantage is that the HAD is an always-on system, which reduces worries about patients' recall bias when listing potential contacts. We were also able to parse contact time more finely than contact tracing often has the luxury to do: rather than comparing ‘any’ vs. ‘no’ contact, we were able to test the predictive utility across the whole continuum of contact time. We found that it often took a substantial amount of contact time (more than one day) before maximum predictive ability, accounting for both sensitivity and specificity, was reached.
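To make the idea of testing "the whole continuum of contact time" concrete, the sketch below sweeps every observed contact-hour value as a candidate cutoff and reports the one that best balances sensitivity and specificity. Using Youden's J (sensitivity + specificity - 1) as the balancing criterion is an assumption for this example, as are the function name `best_cutoff` and the toy data; the paper should be consulted for the measure actually used.

```python
def best_cutoff(hours, infected):
    """Return (cutoff, J, sensitivity, specificity) maximising Youden's J.

    hours: contact hours per patient.
    infected: matching booleans, True if the patient later tested positive.
    A patient is predicted positive when hours >= cutoff.
    """
    positives = sum(1 for y in infected if y)
    negatives = len(infected) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one infected and one uninfected patient")

    best = None
    for cutoff in sorted(set(hours)):
        # True positives and true negatives at this cutoff.
        tp = sum(1 for h, y in zip(hours, infected) if y and h >= cutoff)
        tn = sum(1 for h, y in zip(hours, infected) if not y and h < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best

# Invented toy data, for illustration only.
hours = [0, 2, 5, 30, 40, 50]
infected = [False, False, False, True, False, True]

cutoff, j, sens, spec = best_cutoff(hours, infected)
print(cutoff, j, sens, spec)  # 30 0.75 1.0 0.75
```

In this made-up example the best cutoff is 30 hours, well above the "any contact" threshold, which is the kind of pattern described above: a simple any-versus-none split would flag nearly everyone and sacrifice specificity.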

In addition to the strength of the prediction, the research also indicated benefits based on when we identified infected patients. The digital contact tracing identified infected persons up to an average of a day before they were actually tested for the infection. If done prospectively, this could mean that infected patients are tested and quarantined earlier. 

The results of our study therefore have implications for stemming hospital outbreaks, and can inform the longer-term design of digital contact tracing apps, which work from many of the same fundamentals. Although the research was retrospective, we limited the data used for prediction to what would have been available in real time. Hospitals should therefore be able to implement this approach to identify, or aid in identifying, infected patients.

With respect to the current COVID-19 pandemic, our research shows the benefits of digital contact tracing. For hospitals, because HAD infrastructure is largely in place, it can be readily deployed for predicting COVID-19 without large overhead costs. Outside of hospitals, the prediction algorithm can be applied within digital contact tracing apps to improve prediction or to address ethical issues surrounding data collection and transfer. By monitoring the co-location of patients and not requiring surveys or other traditional contact tracing approaches, this method allows for real-time evaluation of contact, and for reporting those who were predicted to be infected. 

The healthcare environment is a microcosm, and the results obtained there can therefore be extended to help identify infections outside the hospital as well.