CC BY-NC-ND 4.0 · Yearb Med Inform 2023; 32(01): 104-110
DOI: 10.1055/s-0043-1768721
Section 2: Cancer Informatics
Survey

Clinical Informatics Approaches to Facilitate Cancer Data Sharing

Sanjay Aneja (1, 2, 3), Arman Avesta (1, 2), Hua Xu (3), Lucila Ohno-Machado (3)

1   Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT, USA
2   Center for Outcomes Research and Evaluation at Yale, New Haven, CT, USA
3   Department of Bioinformatics and Data Science, Yale School of Medicine, New Haven, CT, USA

Summary

Objectives: Despite growing enthusiasm for the use of clinical informatics to improve cancer outcomes, data availability remains a persistent bottleneck to progress. The difficulty of combining data that contain protected health information often limits our ability to aggregate larger, more representative datasets for analysis. With the rise of machine learning techniques that require increasing amounts of clinical data, these barriers have grown. Here, we review recent efforts within clinical informatics to address the safe sharing of cancer data.

Methods: We carried out a narrative review of clinical informatics studies on sharing protected health data in cancer research, published from 2018 to 2022, with a focus on decentralized analytics, homomorphic encryption, and common data models.

Results: We identified clinical informatics studies that investigated cancer data sharing, concentrating on decentralized analytics, homomorphic encryption, and common data models. Decentralized analytics has been prototyped across genomic, imaging, and clinical data, with the most advances in diagnostic image analysis. Homomorphic encryption was most often applied to genomic data and less often to imaging and clinical data. Common data models primarily involve clinical data from the electronic health record. Although each method is supported by a robust body of research, few studies demonstrate wide-scale implementation.

Conclusions: Decentralized analytics, homomorphic encryption, and common data models represent promising solutions to improve cancer data sharing. Results thus far have been limited to smaller settings. Future studies should evaluate the scalability and efficacy of these methods across clinical settings with varying resources and expertise.



Publication History

Article published online:
06 July 2023

© 2023. IMIA and Thieme. This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 