DOI: 10.1055/a-2148-6414
POINT: Pipeline for Offline Conversion and Integration of Geocodes and Neighborhood Data
Funding This work was funded by the National Institute on Aging (grant no.: R01AG062499).
Abstract
Objectives Geocoding, the process of converting addresses into precise geographic coordinates, allows researchers and health systems to obtain neighborhood-level estimates of social determinants of health. This information supports opportunities to personalize care and interventions for individual patients based on the environments where they live. We developed an integrated offline geocoding pipeline that streamlines the process of obtaining address-based variables and can be incorporated into existing data processing pipelines.
Methods POINT is a web-based, containerized application for geocoding addresses that can be deployed offline and made available to multiple users across an organization. Our application supports use through both a graphical user interface and an application programming interface to query geographic variables by census tract without exposing sensitive patient data. We evaluated our application's performance using two datasets: one consisting of 1 million nationally representative addresses sampled from Open Addresses, and the other consisting of 3,096 previously geocoded patient addresses.
Results A total of 99.4 and 99.8% of addresses in the Open Addresses and patient addresses datasets, respectively, were geocoded successfully. Census tract assignment was concordant with the reference in more than 90% of addresses for both datasets. Among successful geocodes, median (interquartile range) distances from reference coordinates were 52.5 (26.5–119.4) and 14.5 (10.9–24.6) m for the two datasets, respectively.
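The three evaluation measures reported above (geocoding success rate, census tract concordance, and median and interquartile range of distance from the reference coordinates) could be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: the record layout, the function names `haversine_m` and `summarize`, the use of great-circle (haversine) distance, the "inclusive" quartile method, and the choice to compute tract concordance among successfully geocoded addresses only are all assumptions made for illustration.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize(records):
    """Summarize geocoding quality.

    Each record is a dict (hypothetical layout) with keys:
      lat, lon, tract             -> geocoder output (lat is None on failure)
      ref_lat, ref_lon, ref_tract -> reference values
    Requires at least two successfully geocoded records.
    """
    matched = [r for r in records if r["lat"] is not None]
    distances = [
        haversine_m(r["lat"], r["lon"], r["ref_lat"], r["ref_lon"])
        for r in matched
    ]
    # Quartiles of the distance distribution (assumed "inclusive" method).
    q1, median, q3 = statistics.quantiles(distances, n=4, method="inclusive")
    concordant = sum(r["tract"] == r["ref_tract"] for r in matched)
    return {
        "match_rate": len(matched) / len(records),
        # Assumption: denominator is successfully geocoded addresses.
        "tract_concordance": concordant / len(matched),
        "median_m": median,
        "iqr_m": (q1, q3),
    }
```

Comparing tract identifiers as strings avoids losing leading zeros in state codes, which would otherwise produce spurious mismatches.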
Conclusion POINT successfully geocodes more addresses and yields similar accuracy to existing solutions, including the U.S. Census Bureau's official geocoder. Addresses are considered protected health information and cannot be shared with common online geocoding services. POINT is an offline solution that enables scalability to multiple users and integrates downstream mapping to neighborhood-level variables with a pipeline that allows users to incorporate additional datasets as they become available. As health systems and researchers continue to explore and improve health equity, it is essential to quickly and accurately obtain neighborhood variables in a Health Insurance Portability and Accountability Act (HIPAA)-compliant way.
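The downstream mapping from a geocoded census tract to neighborhood-level variables, with room to add datasets over time, can be illustrated with a small in-memory lookup keyed by the 11-digit tract GEOID (2-digit state, 3-digit county, 6-digit tract). This is a sketch under stated assumptions, not POINT's implementation: the class `TractVariableStore`, the helper `normalize_geoid`, the dataset name `demo_index`, and every value shown are hypothetical placeholders. Because only the tract identifier is used as the key, no address leaves the caller in this pattern.

```python
from typing import Any, Dict, Mapping


def normalize_geoid(geoid: Any) -> str:
    """Return an 11-digit tract GEOID string, restoring one lost leading zero."""
    s = str(geoid).strip()
    if not s.isdigit() or len(s) not in (10, 11):
        raise ValueError(f"not a census tract GEOID: {geoid!r}")
    return s.zfill(11)


class TractVariableStore:
    """Hypothetical registry of tract-level datasets (illustration only)."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Dict[str, Dict[str, Any]]] = {}

    def add_dataset(self, name: str, rows: Mapping[Any, Mapping[str, Any]]) -> None:
        """Register a dataset given as {tract GEOID: {variable: value}}."""
        self._datasets[name] = {
            normalize_geoid(g): dict(v) for g, v in rows.items()
        }

    def lookup(self, geoid: Any) -> Dict[str, Any]:
        """Collect every registered variable for one tract."""
        key = normalize_geoid(geoid)
        out: Dict[str, Any] = {"tract_geoid": key}
        for name, table in self._datasets.items():
            for var, value in table.get(key, {}).items():
                out[f"{name}.{var}"] = value
        return out


# Usage with a made-up tract identifier and a placeholder value.
store = TractVariableStore()
store.add_dataset("demo_index", {"01999000100": {"score": 0.5}})
result = store.lookup(1999000100)  # integer input that lost its leading zero
```

Prefixing each variable with its dataset name keeps columns from different sources distinct when a new dataset is registered later.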
Protection of Human and Animal Subjects
The study was performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects and was reviewed by the VUMC Institutional Review Board.
Publication History
Received: 02 May 2023
Accepted: 03 August 2023
Accepted Manuscript online: 04 August 2023
Article published online: 18 October 2023
© 2023. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany