P. Elbers, P. Thoral, T. Dam, L. Fleuren on behalf of the Dutch ICU Data Warehouse Collaborators
Laboratory for Critical Care Computational Intelligence, Department of Intensive Care Medicine, Amsterdam Medical Data Science, Amsterdam UMC, Vrije Universiteit Amsterdam and University of Amsterdam, Amsterdam, the Netherlands
P. Elbers – firstname.lastname@example.org
Sharing is caring: how COVID-19 led to large scale collaboration for icudata.nl
For many intensivists worldwide, the pandemic will have created long-lasting memories. Some of them grim, such as massively overwhelmed ICUs, the dehumanising appearance of seemingly interchangeable and mostly proned patients, and healthcare professionals hidden behind personal protective equipment. Some of them energising, including surprising patient recoveries as well as the unprecedented praise and recognition for intensivists and intensive care medicine from both the media and society at large.
For the Laboratory for Critical Care Computational Intelligence at Amsterdam UMC, the pandemic proved to be nothing short of a rollercoaster ride. And despite the many challenges imposed upon our profession by the pandemic, this ride was largely fuelled by excitement, in particular the rapidly expanding enthusiasm for large-scale data sharing and collaboration between Dutch ICUs.
Our story started long before the COVID-19 crisis. Intensive care medicine is a natural habitat for data science, as large amounts of data are routinely collected during intensive care treatment, such as those from devices for monitoring and life support. Our laboratory was created with the primary aim of uniting clinical and data science expertise to use these data to improve the care and treatment of future critically ill patients. We do so by developing and validating models, integrating these into clinical decision support tools for use at the bedside, and evaluating their effect on outcomes relevant to critically ill patients.
Three of the most prominent results of this philosophy are AmsterdamUMCdb, the first freely available European ICU database under the European Society of Intensive Care Medicine / Society of Critical Care Medicine joint data sharing initiative, bedside decision support for personalised antibiotic dosing[3,4] and bedside decision support for preventing untimely patient discharge from intensive care units (ICUs). Because of these contributions to the field, our lab had the infrastructure and knowledge base to readily facilitate large-scale data sharing when the pandemic hit the Netherlands. Specifically, our expertise could facilitate sharing of high-frequency device data and most other clinical information from the electronic health record (EHR), with the goal of generating insights from ICU patient data as the pandemic was unfolding. We expected these data to reflect the large variation in COVID-19 related clinical practice that resulted from the limited and rapidly evolving evidence base, and possibly the large variation in patient characteristics and outcomes between centres. Advanced statistics and machine learning may leverage these variations to determine optimal management for individual patients.
Right from the start of the project we experienced unprecedented momentum. Data protection officers immediately offered help in providing a legal framework for responsible data sharing. Within days, our medical ethics committee approved our protocols. Data sharing agreements were drafted to ensure equal access to the data for all participating ICUs. All hospitals in the Netherlands with an ICU were approached, and documentation was reviewed locally before permission to participate was granted. With the full support of the Dutch Society for Intensive Care (NVIC) and its research network RCCnet, 66 out of 81 ICUs confirmed their participation within weeks.
Template Structured Query Language (SQL) queries were developed to automatically extract EHR data for each of the major EHR systems used in the Netherlands: MetaVision, ChipSoft and Epic. Collected data cover the entire ICU stay and include demographics, data from devices for vital signs monitoring and life support, data on administered medication, laboratory results and data entered by the treatment team including clinical observations. All data were pseudonymised in the delivering hospitals.
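To illustrate what pseudonymisation in the delivering hospital could look like, here is a minimal Python sketch. It is not the method used by the collaborators, which the text does not describe. It assumes a keyed hash (HMAC-SHA256) with a secret that never leaves the hospital; the function name `pseudonymise`, the record fields and the key are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient identifier with a keyed hash (hypothetical scheme).

    The same identifier always yields the same pseudonym, so records of one
    patient stay linked, but the mapping cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability


# Invented example records; field names are assumptions for illustration.
records = [
    {"patient_id": "12345", "parameter": "heart_rate", "value": 88},
    {"patient_id": "12345", "parameter": "heart_rate", "value": 92},
    {"patient_id": "67890", "parameter": "heart_rate", "value": 75},
]

key = b"hospital-held-secret"  # assumption: kept locally, never shared
pseudonymised = [
    {**row, "patient_id": pseudonymise(row["patient_id"], key)} for row in records
]
```

A keyed hash is chosen here because an unkeyed hash of a short numeric identifier could be reversed by simply trying all candidates.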
An extraction, transformation and load process was designed to combine raw data from the different EHR systems. All parameters from the collaborating ICUs were manually reviewed and mapped to a common parameter ontology. Subsequently, a software data pipeline converted all parameter units as needed, filtered out data entry errors, calculated derived parameters and merged data into the data warehouse. Data quality control was a continuous process with internal validity checks from the providing hospitals and validation checkpoints throughout the software pipeline. Covering entire ICU admissions with highly granular data, the Dutch Data Warehouse is the largest COVID-19 dataset to date. It is a natural habitat for advanced statistics and machine learning, effectively extending the opportunities for high-frequency big data analysis provided by large ICU datasets such as MIMIC and AmsterdamUMCdb to the COVID-19 domain. So far, data from 23 ICUs on 1633 patients treated between March and October 2020 have been processed and added to the Dutch Data Warehouse, now containing over 120 million data points mapped to a common ontology of 875 parameter names.
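The pipeline steps described above (mapping to a common ontology, unit conversion, filtering of data entry errors and calculation of derived parameters) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the collaborators' actual pipeline: the hospital labels, local parameter names, plausibility ranges and the `MAPPINGS` table are invented. The PaO2/FiO2 ratio serves as an example of a derived parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    common_name: str  # name in the common ontology
    factor: float     # multiply the local value to obtain the common unit
    low: float        # plausibility bounds in the common unit
    high: float


# Hypothetical mapping table; in practice this results from manual review.
MAPPINGS = {
    ("hospital_a", "pO2 (kPa)"): Mapping("pao2_mmhg", 7.50062, 10.0, 700.0),
    ("hospital_b", "PaO2"): Mapping("pao2_mmhg", 1.0, 10.0, 700.0),
    ("hospital_a", "FiO2 (%)"): Mapping("fio2_fraction", 0.01, 0.21, 1.0),
    ("hospital_b", "FiO2"): Mapping("fio2_fraction", 1.0, 0.21, 1.0),
}


def transform(hospital: str, rows: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Map local parameters to the ontology, convert units, drop implausible values."""
    cleaned = []
    for local_name, value in rows:
        mapping = MAPPINGS.get((hospital, local_name))
        if mapping is None:
            continue  # unmapped parameters would be queued for manual review
        converted = value * mapping.factor
        if not (mapping.low <= converted <= mapping.high):
            continue  # treated as a data entry error
        cleaned.append((mapping.common_name, converted))
    return cleaned


def pf_ratio(pao2_mmhg: float, fio2_fraction: float) -> float:
    """Derived parameter: PaO2/FiO2 ratio in mmHg."""
    return pao2_mmhg / fio2_fraction


# Invented raw rows; the FiO2 of 400% is a deliberate entry error.
raw = [("pO2 (kPa)", 10.0), ("FiO2 (%)", 40.0), ("FiO2 (%)", 400.0)]
cleaned = dict(transform("hospital_a", raw))
pf = pf_ratio(cleaned["pao2_mmhg"], cleaned["fio2_fraction"])
```

Keeping the mapping in a table keyed by hospital and local name reflects the observation that ICUs store the same parameter differently, even when they use the same EHR system.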
Of course, data sharing among ICUs is far from new. In fact, intensivists in the Netherlands have been doing so for at least 25 years in the context of the National Intensive Care Evaluation (NICE). Their important contribution to the field should be applauded. However, their efforts are primarily focussed on benchmarking and quality improvement at the level of the ICU. The primary focus of the data warehouse is to improve the quality of care, treatment and outcome at the level of the individual patient, by leveraging high-frequency information during the entire ICU treatment.
While the first insights from the COVID-19 data are now being published and disseminated through webinars, we discovered that combining these data is a much slower process than we initially anticipated. This is partly due to the extensive data processing and quality control steps described above, but most importantly to the fact that every Dutch ICU stores its data differently, even when using the same type of EHR system.
Therefore, we are very excited that the NVIC, strongly supported by RCCnet, has now initiated a similar collaboration between Dutch ICUs for large-scale data sharing on all critically ill patients. Again, this should help us understand the timing and combination of treatments that may lead to better outcomes in a specific ICU patient. This collaboration is coordinated by our laboratory at Amsterdam UMC. There are strong ties with the NICE foundation, known for its experience in analysing data for ICU benchmarking. Machine learning partner Pacmed will use its expertise to combine and analyse the data. Zorgverzekeraars Nederland, uniting all major health insurance companies, will support the initiative with 2 million euros for the next five years.
This new collaboration between Dutch ICUs is called icudata.nl and is expected to give rise to the Dutch ICU Data Warehouse (figure 1). We are thrilled that it was immediately received with great enthusiasm among a large majority of ICUs of all sizes, including those from university medical centres, large teaching hospitals as well as smaller community hospitals. Their enthusiasm could make the Dutch Data Warehouse the largest of its kind worldwide.
It is our intention to extend the transparent collaboration fuelled by the COVID-19 pandemic to icudata.nl. We want to ensure equal access to the data for all participating ICUs. Access to the data for bona fide research and quality purposes, aimed at improving the care and treatment of future critically ill patients, should be as easy and unbureaucratic as possible. At the same time, a governance structure involving the participating hospitals, NVIC, NICE, patient representatives and other stakeholders is currently being designed. This structure should again ensure that data will be shared responsibly and in full compliance with all relevant laws and regulations. Participating ICUs will receive frequent updates, and progress may also be monitored through www.icudata.nl. This is our story of how a severe crisis can give rise to a unique opportunity. Let's keep up the momentum: join us if you have not done so already!
All authors declare no conflict of interest. No funding or financial support was received.
- The Laboratory for Critical Care Computational Intelligence. https://icudata.nl/indexlccci.html.
- Thoral P. AmsterdamUMCdb, the first freely accessible European ICU database under the ESICM/SCCM Joint Data Sharing Initiative [Internet]. Available from: https://www.
- Roggeveen LF, Fleuren LM, Guo T, et al. Right Dose Right Now: bedside data-driven personalized antibiotic dosing in severe sepsis and septic shock—rationale and design of a multicenter randomized controlled superiority trial. Trials. BioMed Central.
- Roggeveen LF, Guo T, Driessen RH, et al. Right Dose, Right Now: Development of AutoKinetics for Real Time Model Informed Precision Antibiotic Dosing Decision Support at the Bedside of Critically Ill Patients. Front Pharmacol. 2020;11:646.
- Thoral PJ, Fornasa M, de Bruin DP, et al. Developing a Machine Learning prediction model for bedside decision support by predicting readmission or death following discharge from the Intensive Care unit. researchsquare.com; 2020; Available from:
- CovidPredict [Internet]. Available from: https://covidpredict.org/
- The Dutch ICU Data Warehouse. https://icudata.nl/