What is a data lake?
A data lake is a storage approach in which massive amounts of raw data are kept in their native format or with very light processing. The data are extracted directly from the CHUM’s source systems and deposited in the lake. Unlike a data warehouse, a data lake provides additional flexibility because it can hold and process more than one type of data.
CITADEL’s data lake allows for the integration of data from different clinical systems (laboratories, clinical records, imaging, vital signs, etc.).
How does CITADEL’s data lake work?
CITADEL moves clinical and administrative data, research data, and data from the Centre hospitalier de l’Université de Montréal’s (CHUM’s) different computer systems into its data lake.
In addition to storage, CITADEL’s mandate includes data extraction and analysis services tailored to the projects submitted. For example, a user may submit a request for access to some of these data. The requested data are then extracted from the lake, processed into a relevant format and deposited in a secure space at the CHUM reserved for the project, where they can be analyzed as needed.
Users can access these data provided they meet strict, predetermined ethical, legal and regulatory criteria, and they must observe CITADEL’s management framework (refer to the management framework).
Once the data have been extracted, what’s the next step?
Once the data are extracted, research teams can analyze them in a secure space to answer their research questions. If further analyses are required, CITADEL also offers a consultation and analysis service provided by a team specializing in health data analysis.
What are CITADEL’s objectives?
What data can be made available through CITADEL’s data lake?
Aggregated data, for example:
Requests for access to data sets, for example:
Access to the data is restricted to individuals entitled to it (through the required regulatory, ethical and legal approvals). In addition, the data are depersonalized or de-identified from the outset. Further restrictions may also be implemented depending on the nature of the request.
Where are the data kept?
The data are kept in the CHUM’s secure enclosure.
Who has access to the data?
Data can be accessed by individuals with the required regulatory and legal approvals. Access by the members of a research team must be endorsed by the researcher responsible for the project at the institutional level.
How is access to data made possible?
Access to data via CITADEL is made possible through a robust governance structure and strict regulatory monitoring. The CITADEL management framework describes the legal and regulatory framework underlying data access, governance, and the terms of data access.
How is data confidentiality ensured?
Data access is restricted to individuals entitled to it (through the required regulatory, ethical and legal approvals). In addition, the data are depersonalized or de-identified from the outset. Further restrictions may also be implemented depending on the nature of the request.
What services are offered by CITADEL in terms of statistics?
CITADEL’s team of experts in biostatistics can help you with different aspects of your research work:
Are CITADEL’s services free?
The services offered by CITADEL follow the fee schedule of the CRCHUM’s core scientific facilities.
What are the costs?
The costs of a project depend on its nature and complexity. During the initial assessment of a project, a quote is provided to the user. The project begins once CITADEL and the user agree on the costs involved. If CITADEL’s mandate changes (increases or decreases) during the course of the project, the user will be notified, and no expense will be incurred without the express authorization of the person in charge of the project.