DOI: 10.3414/ME11-01-0001
Multidimensional Point Transform for Public Health Practice
Publication History
received: 11 January 2011
accepted: 10 May 2011
Publication Date: 20 January 2018 (online)
Summary
Background: With increases in spatial information and enabling technologies, location-privacy concerns have been on the rise. A commonly proposed solution in public health involves random perturbation; however, consideration of individual dimensions (attributes) has been weak.
Objectives: The current study proposes a multidimensional point transform (MPT) that integrates the spatial dimension with other dimensions of interest to comprehensively anonymise data.
Methods: The MPT relies on the availability of a base population, a patient dataset that is a subset of it, and shared dimensions of interest. Perturbation distance and anonymity thresholds are defined, as are allowable dimensional perturbations. A preliminary implementation is presented using sex, age and location as the three dimensions of interest, with a maximum perturbation distance of 1 kilometre and an anonymity threshold of 20%. A synthesised New York county population is used for testing, with 1000 iterations for each of five patient dataset sizes (25, 50, 100, 200 and 400).
Results: The MPT consistently yielded a mean perturbation distance of 46 metres with no sex or age perturbation required. Displacement of the spatial mean decreased with patient dataset size and averaged 5.6 metres overall.
Conclusions: The MPT presents a flexible, customisable and adaptive algorithm for perturbing datasets for public health, allowing tweaking and optimisation of the trade-offs for different datasets and purposes. It is not, however, a substitute for secure and ethical conduct, and a public health framework for the appropriate disclosure, use and dissemination of data containing personally identifiable information is required. The MPT presents an important component of such a framework.
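Illustrative sketch: the Methods and Results above can be made concrete with a small Python example. This is a minimal sketch under stated assumptions, not the published MPT implementation. The summary does not specify how the anonymity threshold is applied, so the sketch assumes that a 20% threshold means at least 1/0.20 = 5 matching base-population members must be available, that the replacement point is drawn uniformly from those matches, and that age is relaxed one year at a time only when too few exact matches exist. The names `Person`, `perturb`, `mean_displacement`, `spatial_mean_shift` and the parameter `max_age_tol` are hypothetical.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    x: float   # easting in metres (projected coordinates assumed)
    y: float   # northing in metres
    sex: str
    age: int


def distance(a, b):
    """Planar distance in metres between two people."""
    return math.hypot(a.x - b.x, a.y - b.y)


def perturb(patient, population, max_dist_m=1000.0, threshold=0.20,
            max_age_tol=5, rng=random):
    """Return a base-population member to stand in for `patient`, or None.

    Assumption: the threshold is read as a maximum re-identification
    probability, so at least ceil(1 / threshold) candidates are required.
    """
    min_candidates = math.ceil(1.0 / threshold - 1e-9)
    # Try exact age first, then widen the allowed age difference.
    for age_tol in range(max_age_tol + 1):
        pool = [p for p in population
                if p.sex == patient.sex
                and abs(p.age - patient.age) <= age_tol
                and distance(p, patient) <= max_dist_m]
        if len(pool) >= min_candidates:
            return rng.choice(pool)
    return None  # no acceptable perturbation; caller decides what to do


def mean_displacement(originals, masked):
    """Mean perturbation distance over paired original/masked points."""
    total = sum(distance(o, m) for o, m in zip(originals, masked))
    return total / len(originals)


def spatial_mean_shift(originals, masked):
    """Distance between the spatial means of the two point sets."""
    n = len(originals)
    ox = sum(p.x for p in originals) / n
    oy = sum(p.y for p in originals) / n
    mx = sum(p.x for p in masked) / n
    my = sum(p.y for p in masked) / n
    return math.hypot(mx - ox, my - oy)
```

The two helper functions correspond to the two quantities reported in the Results: the mean perturbation distance and the displacement of the spatial mean. Returning `None` when no pool is large enough is one possible design choice; a real pipeline might instead suppress the record or relax another dimension.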