Data Privacy Compliant Validation of Health Insurance Claims Data: the IDOMENEO Approach

Christian-Alexander Behrendt; Thea Schwaneberg; Sandra Hischke; Tobias Müller; Tom Petersen; Ursula Marschall; Sebastian Debus; Levente Kriston

doi:10.1055/a-0883-5098

CC BY-NC-ND 4.0 · Gesundheitswesen 2020; 82(S 02): S94-S100
DOI: 10.1055/a-0883-5098

Review Article

Authors

Christian-Alexander Behrendt¹, Thea Schwaneberg¹, Sandra Hischke¹,², Tobias Müller³, Tom Petersen³, Ursula Marschall⁴, Sebastian Debus¹, Levente Kriston²

¹ Department of Vascular Medicine, Work Group GermanVasc, University Medical Center Hamburg-Eppendorf, Hamburg
² Department of Medical Psychology, University Medical Center Hamburg-Eppendorf, Hamburg
³ Department of Informatics, University of Hamburg, Hamburg
⁴ BARMER, Healthcare Research, Wuppertal

Funding: The IDOMENEO study is funded by the German Federal Joint Committee (Gemeinsamer Bundesausschuss, G-BA) (01VSF16008), and the GermanVasc registry is co-funded by the German Stifterverband as well as by the CORONA foundation (S199/10061/2015).

Correspondence

Dr. Christian-Alexander Behrendt, MD

Department of Vascular Medicine, Work Group GermanVasc,

University Medical Center Hamburg-Eppendorf

Martinistraße 52

20246 Hamburg

Email: behrendt@hamburg.de

Publication History

Publication Date:
23 May 2019 (online)

Abstract

Recently, health insurance claims have regained the attention of the scientific community as a source of real-world evidence in health care research and quality improvement. To date, very few studies are available which investigate the validity of health insurance claims; these may be affected by bias from several sources, such as possible upcoding of co-morbidities and complications for reimbursement advantages. The IDOMENEO study investigates the inpatient treatment of peripheral arterial disease (PAD) comprehensively using various data sources with a consortium involving experts from health care research and data privacy, a large health insurance fund, biostatisticians, jurists, and computer scientists. Prospective registry data were collected from 30–40 vascular centres in Germany using the GermanVasc registry. In addition, health insurance claims data were prospectively collected from BARMER, the second largest health insurance fund in Germany. The consortium is currently developing a data privacy compliant method of health insurance claims data validation, the methodological foundations of which are described here.

Key words

Peripheral Arterial Disease - Health Insurance Claims Data - Registries - Routine Data - Quality Improvement - Reliability and Validity

Background

Due to the emerging digital revolution and the implementation of diagnosis-related groups (DRG) in the United States healthcare financing system, hospital data that were originally collected for statistical and administrative purposes are becoming accessible and sufficiently structured for broad research purposes [1]. This development is accompanied by an ongoing controversy regarding the validity of administrative health data.

Recently, health insurance claims have regained the attention of the scientific community as a source of real-world evidence in health care research, quality improvement, and the further development of so-called pragmatic trials [2]. To date, very few studies are available which investigate the validity of health insurance claims, which may suffer from several sources of bias, for example possible upcoding of co-morbidities and complications for reimbursement advantages. Since national reimbursement and classification systems differ, the transferability of validation studies from one country to another remains unknown [3] [4]. Various ways to validate data exist, but a direct comparison of data from a source with unknown validity with data from another source with known high validity remains the method of choice. Under most circumstances, the European Union General Data Protection Regulation (GDPR) and ethical considerations complicate the direct cross-linking of registries with claims for validation purposes [5] [6]. Thus, the development and testing of alternative methodological approaches for the validation of claims data is mandatory. The IDOMENEO study investigates the inpatient treatment of peripheral arterial disease (PAD) comprehensively using various data sources with a consortium involving experts from healthcare research and data privacy, a large health insurance fund, biostatisticians, jurists, and computer scientists [7] [8]. The consortium is currently developing a data privacy compliant method of health insurance claims data validation, the methodological foundations of which are described here.

Data Sources

The IDOMENEO study

The IDOMENEO study aims to prospectively collect both primary data and health insurance claims data in the field of PAD, a widespread disease with more than 1 million affected inhabitants in Germany undergoing more than 300,000 invasive revascularisation procedures per year [9]. The target population of the IDOMENEO study comprises patients hospitalised with symptomatic PAD who are treated with catheter-based endovascular revascularisations, open-surgical endarterectomy, or bypass surgery. Information on patients’ co-morbidities and outcomes in claims data will be contrasted to prospectively collected registry data to answer the question how claims data can be utilised for research and quality improvement in vascular medical care [7] [8] [10] [11].

Registry data

Registry data will be obtained through the GermanVasc registry, a prospective non-randomized multicentre registry including invasive revascularisations performed in 10,000 patients treated for symptomatic PAD at 30 to 40 German hospitals with 3 follow-up measures within 12 months. Automated completeness and plausibility checks and independent site visit monitoring will be performed to assure high internal validity [11] [12].

Health insurance claims data

Parallel to the registry, health insurance claims data routinely collected by the BARMER health insurance fund will be obtained. BARMER is Germany’s second largest insurance provider, documenting the medical care provided to approximately 9.4 million German citizens (13.2% of Germany’s population). Data from patients with symptomatic PAD, identified by the International Classification of Diseases (ICD-10-GM: I70.0, I70.20–24, and I70.9 up to 2014 and I70.0, I70.20–25, I70.29, and I70.9 from 2015), who were treated by invasive revascularisation, identified by the Operationen- und Prozedurenschlüssel (OPS; the German version of the International Classification of Procedures in Medicine), in all German hospitals will be included. We expect to include data on around 10,000 to 20,000 patients.

Validation Methods

Validation approaches

We present 2 main approaches to contrast claims with registry data for validation purposes without individual cross-linking ([Fig. 1]). The first approach (model-based validation) compares models fitted to both data sets, while the second approach (stratification-based validation) contrasts descriptive estimates for comparable subgroups from the 2 data sets.

Fig. 1 Illustration of the IDOMENEO approaches to validate health insurance claims data (BARMER) with prospectively collected and quality assured registry data (GermanVasc).

Model-based validation

Basic principles

The central presumption of the first approach is that validity is not a feature of the data but rather of the interpretation and consequences of the analyses that are performed on the data [13]. Therefore, this approach validates not the health insurance claims data themselves but rather the models that are fitted to the data ([Fig. 1]). The essence of the approach is fitting the same statistical model to the claims and the registry data and using global and local indicators of model fit to compare the results. In order to account for the hierarchical structure of the data (patients clustered within hospitals), multilevel models of various complexity are an ideal choice for data analysis [14] [15].

Random intercept model without covariates

The random intercept model in the registry (R) can be defined as

$$y_{R,ij} = \beta_{R,0j} + r_{R,ij},$$

where $y_{R,ij}$ is the observed outcome of patient $i$ in hospital $j$, $\beta_{R,0j}$ is the mean outcome in hospital $j$, and the residuals are normally distributed as

$$r_{R,ij} \sim N(0, \sigma^2_{R,r})$$

with a mean of zero and a variance of $\sigma^2_{R,r}$.

The distribution of the hospital means (random intercept) can be defined as

$$\beta_{R,0j} = \gamma_{R,00} + u_{R,0j},$$

where $\gamma_{R,00}$ is the grand mean of the outcome, and the hospital-specific deviation from this mean is normally distributed as

$$u_{R,0j} \sim N(0, \sigma^2_{R,u})$$

with a mean of zero and a variance of $\sigma^2_{R,u}$.

Correspondingly, the same model fitted in the claims data (C) can be written as

$$y_{C,ij} = \beta_{C,0j} + r_{C,ij},$$

$$r_{C,ij} \sim N(0, \sigma^2_{C,r}),$$

$$\beta_{C,0j} = \gamma_{C,00} + u_{C,0j},$$

$$u_{C,0j} \sim N(0, \sigma^2_{C,u}).$$

Parameters $\gamma_{R,00}$ and $\gamma_{C,00}$ describe the mean level of an outcome in the population (e. g., amputation-free survival time or prevalence of major adverse limb events), which is frequently targeted in clinical and epidemiological research. Comparing $\gamma_{R,00}$ with $\gamma_{C,00}$ reveals whether the level of the outcome in the target population is estimated similarly across the data sources and thus whether claims data can be used to answer corresponding research questions. Comparing $\sigma^2_{R,r}$ with $\sigma^2_{C,r}$ shows whether the amount of within-hospital variation in the outcome is estimated similarly, and contrasting $\sigma^2_{R,u}$ with $\sigma^2_{C,u}$ informs about the comparability of the estimated variation across hospitals. A statistically non-significant formal comparison of these parameters between the data sources, with a reasonably narrow confidence interval of the estimated difference, is necessary to support the validity of the claims data for investigating the grand mean of the analysed outcome.
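The comparison of grand means and variance components can be illustrated with a minimal sketch. It assumes a balanced design (the same number of patients in every hospital) and uses simple ANOVA (method-of-moments) estimators as a stand-in for a maximum likelihood fit of the multilevel model. The data are simulated, the two samples are treated as independent, and all function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate(grand_mean, sd_hospital, sd_patient, n_hospitals, n_patients, seed):
    """Simulate a balanced two-level data set (patients within hospitals)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_hospitals):
        hospital_mean = grand_mean + rng.gauss(0.0, sd_hospital)  # gamma_00 + u_0j
        data.append([hospital_mean + rng.gauss(0.0, sd_patient)   # + r_ij
                     for _ in range(n_patients)])
    return data


def fit_random_intercept(data):
    """ANOVA estimates of gamma_00, sigma^2_r and sigma^2_u (balanced data only)."""
    n_hospitals = len(data)
    n_patients = len(data[0])
    assert all(len(h) == n_patients for h in data), "balanced design assumed"
    hospital_means = [statistics.fmean(h) for h in data]
    grand_mean = statistics.fmean(hospital_means)
    # Mean square within hospitals estimates sigma^2_r.
    msw = statistics.fmean([statistics.variance(h) for h in data])
    # Mean square between hospitals estimates n * sigma^2_u + sigma^2_r.
    msb = n_patients * statistics.variance(hospital_means)
    var_u = max(0.0, (msb - msw) / n_patients)
    se_grand_mean = math.sqrt(msb / (n_hospitals * n_patients))
    return {"gamma00": grand_mean, "var_r": msw, "var_u": var_u,
            "se_gamma00": se_grand_mean}


def compare_grand_means(fit_r, fit_c, z=1.96):
    """Difference of grand means (claims minus registry) with an approximate 95% CI."""
    diff = fit_c["gamma00"] - fit_r["gamma00"]
    se = math.sqrt(fit_r["se_gamma00"] ** 2 + fit_c["se_gamma00"] ** 2)
    return diff, (diff - z * se, diff + z * se)


# Hypothetical example: both sources generated from the same population model.
registry_fit = fit_random_intercept(simulate(50.0, 3.0, 10.0, 40, 250, seed=1))
claims_fit = fit_random_intercept(simulate(50.0, 3.0, 10.0, 40, 250, seed=2))
difference, interval = compare_grand_means(registry_fit, claims_fit)
```

A confidence interval that is narrow and includes zero would be in line with the criterion described above. With unbalanced real data, a dedicated mixed-model routine would be the more appropriate choice.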

For quality assessment, investigating the association of $u_{R,0j}$ with $u_{C,0j}$ reveals whether the ranking of the hospitals regarding the outcome is similar using the 2 data sources. A high rank correlation (e. g., 0.80 or higher) would support the validity of using claims data for ranking hospitals with regard to the outcome. Note that this ranking comparison requires hospitals to be identifiable in both data sources.
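As a small illustration of the ranking comparison, the following sketch computes Spearman's rank correlation between hospital-specific deviations estimated from the 2 sources. The hospital identifiers and deviation values are invented for the example, and Spearman's coefficient is only one possible choice of rank correlation.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical hospital-level deviations u_0j, keyed by a shared hospital ID.
registry_u = {"H01": -1.2, "H02": 0.4, "H03": 2.1, "H04": -0.3, "H05": 0.9}
claims_u = {"H01": -0.8, "H02": 0.1, "H03": 1.7, "H04": 0.3, "H05": 1.0}

# Only hospitals identifiable in both sources can be compared.
common = sorted(set(registry_u) & set(claims_u))
rho = spearman([registry_u[h] for h in common], [claims_u[h] for h in common])
supports_ranking = rho >= 0.80  # threshold taken from the text above
```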

Random intercept model with covariates

Adding patient-level covariates to the random intercept model results in the registry model

$$y_{R,ij} = \beta_{R,0j} + \beta_{R,1} \times x_{R,1} + r_{R,ij},$$

$$r_{R,ij} \sim N(0, \sigma^2_{R,r}),$$

$$\beta_{R,0j} = \gamma_{R,00} + u_{R,0j},$$

$$u_{R,0j} \sim N(0, \sigma^2_{R,u}),$$

and the claims data model

$$y_{C,ij} = \beta_{C,0j} + \beta_{C,1} \times x_{C,1} + r_{C,ij},$$

$$r_{C,ij} \sim N(0, \sigma^2_{C,r}),$$

$$\beta_{C,0j} = \gamma_{C,00} + u_{C,0j},$$

$$u_{C,0j} \sim N(0, \sigma^2_{C,u}).$$

Compared to the random intercept model without covariates, the interpretation of the parameters $\gamma_{R,00}$ and $\gamma_{C,00}$, $\sigma^2_{R,r}$ and $\sigma^2_{C,r}$, as well as $\sigma^2_{R,u}$ and $\sigma^2_{C,u}$ changes, as they are now controlled (adjusted) for the effect of the patient-level variable $x_{R,1}$ and $x_{C,1}$, respectively. Thus, $\gamma_{R,00}$ and $\gamma_{C,00}$ describe grand means given a fixed value of the covariate, and variance related to the covariate is removed from $\sigma^2_{R,r}$ and $\sigma^2_{C,r}$. The hospital-level variance parameters $\sigma^2_{R,u}$ and $\sigma^2_{C,u}$, as well as the deviations $u_{R,0j}$ and $u_{C,0j}$, should be interpreted in this model as variation between hospitals that is not attributable to differences in the covariate.

Particularly when several patient-level covariates are included, comparing estimates of the models based on registry and claims data reveals the trustworthiness of claims data in answering prognostic and predictive research questions (i. e., associations between covariates and outcomes) as well as in case-mix adjusted description, comparison, and ranking of hospitals for benchmarking and quality improvement.
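To illustrate the comparison of a covariate effect across the 2 sources, the sketch below estimates $\beta_1$ by the within-hospital estimator (outcome and covariate are centred on their hospital means before regression). This is a deliberately simple stand-in for fitting the full multilevel model, the data are simulated, the two samples are treated as independent, and all names and values are hypothetical.

```python
import math
import random


def within_slope(hospitals):
    """Within-hospital estimate of beta_1 and its standard error.

    hospitals: list of hospitals, each a list of (x, y) patient pairs.
    """
    centred = []
    for patients in hospitals:
        mean_x = sum(x for x, _ in patients) / len(patients)
        mean_y = sum(y for _, y in patients) / len(patients)
        centred.extend((x - mean_x, y - mean_y) for x, y in patients)
    sxx = sum(dx * dx for dx, _ in centred)
    sxy = sum(dx * dy for dx, dy in centred)
    slope = sxy / sxx
    rss = sum((dy - slope * dx) ** 2 for dx, dy in centred)
    # One degree of freedom per hospital mean plus one for the slope.
    dof = len(centred) - len(hospitals) - 1
    return slope, math.sqrt(rss / dof / sxx)


def simulate(slope, n_hospitals, n_patients, seed):
    """Simulate patients with one covariate x (e.g. age) nested in hospitals."""
    rng = random.Random(seed)
    hospitals = []
    for _ in range(n_hospitals):
        intercept = 20.0 + rng.gauss(0.0, 3.0)  # beta_0j = gamma_00 + u_0j
        patients = []
        for _ in range(n_patients):
            x = rng.gauss(70.0, 10.0)
            patients.append((x, intercept + slope * x + rng.gauss(0.0, 5.0)))
        hospitals.append(patients)
    return hospitals


b_registry, se_registry = within_slope(simulate(0.5, 30, 100, seed=1))
b_claims, se_claims = within_slope(simulate(0.5, 30, 100, seed=2))

# Approximate 95% confidence interval of the difference between the slopes.
diff = b_claims - b_registry
se_diff = math.sqrt(se_registry ** 2 + se_claims ** 2)
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
```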

Model appraisal

In addition to comparing single estimates (local parameters) from models fitted to claims and registry data, classification of models as a whole is possible. Based on the literature on the cross-group invariance of measurement [16], the following levels of validity (i. e., correspondence between model estimates from claims and registry data) can be outlined:

Configural validity of a model from claims data is given when it is possible to include the same covariates ($x_1$, $x_2$, etc.) in the claims data model as in the reference registry data-based model.
Metric validity is given when the estimated regression coefficients ($\beta_1$, $\beta_2$, etc.) from the claims data display the same direction as the corresponding coefficients from the registry data with overlapping 95% confidence intervals.
Scalar validity requires that the intercepts ($\gamma$) are comparable between the models from claims and registry data.
Strict validity is given when the variance parameters ($\sigma^2$) estimated from the claims data are identical to those estimated from the registry data.

Particularly in complex models, it is possible that some parts of the model show strong (scalar or strict) while others weak (configural or metric) validity, resulting in partial validity.
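The levels outlined above can be read as a nested checklist. The following sketch is one possible operationalisation, not a procedure prescribed by the approach: "comparable" intercepts are taken to mean overlapping 95% confidence intervals, and "identical" variances are taken to mean a relative difference within a tolerance `rel_tol`. Both thresholds, the data structure, and all example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    value: float
    lower: float  # lower bound of the 95% confidence interval
    upper: float  # upper bound of the 95% confidence interval


def overlap(a, b):
    """True if the two confidence intervals share at least one point."""
    return a.lower <= b.upper and b.lower <= a.upper


def appraise(registry, claims, rel_tol=0.10):
    """Return the highest validity level reached by the claims model."""
    # Configural: the same covariates can be included in both models.
    if set(registry["coefficients"]) != set(claims["coefficients"]):
        return "none"
    # Metric: same direction and overlapping confidence intervals.
    for name, reg in registry["coefficients"].items():
        cla = claims["coefficients"][name]
        if reg.value * cla.value <= 0 or not overlap(reg, cla):
            return "configural"
    # Scalar: comparable intercepts (assumption: overlapping intervals).
    if not overlap(registry["intercept"], claims["intercept"]):
        return "metric"
    # Strict: variance parameters agree (assumption: within rel_tol).
    for name, reg_var in registry["variances"].items():
        if abs(claims["variances"][name] - reg_var) > rel_tol * reg_var:
            return "scalar"
    return "strict"


# Hypothetical model summaries.
registry_model = {
    "coefficients": {"age": Estimate(0.50, 0.40, 0.60),
                     "diabetes": Estimate(1.20, 0.80, 1.60)},
    "intercept": Estimate(10.0, 9.0, 11.0),
    "variances": {"residual": 100.0, "hospital": 9.0},
}
claims_model = {
    "coefficients": {"age": Estimate(0.45, 0.36, 0.54),
                     "diabetes": Estimate(1.00, 0.65, 1.35)},
    "intercept": Estimate(10.6, 9.7, 11.5),
    "variances": {"residual": 104.0, "hospital": 12.0},
}
level = appraise(registry_model, claims_model)
```

In this invented example the hospital-level variances differ by more than the tolerance, so the claims model reaches scalar but not strict validity.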

Extensions

Multilevel models are flexible tools that suit a broad range of modelling situations. If the outcome is not quantitative and/or its distribution is not normal, generalised linear mixed models can be applied. If necessary, further data levels can be added (e. g., treatment episodes nested within patients). Furthermore, complex associations can be investigated by using random slope models (where the effect of covariates may vary across hospitals), adding hospital-level covariates (e. g., number of beds) that may explain variation in the intercept and/or the slope of the outcome, and modelling (cross-level) interaction and nonlinear effects. Further extensions include using latent (not directly observed) variables and performing path analyses for investigating even more complex research questions.

Stratification-based validation

Basic principles

The central presumption of the second approach is that results of descriptive or sophisticated methods in healthcare research usually focus on subgroups with comparable co-morbidities or treatments rather than on single individuals. If a specific subgroup (e. g., females ≥80 years of age undergoing open surgery) in the registry data is comparable to a corresponding subgroup in the claims data in terms of relevant co-morbidities, the measured outcomes will likely be comparable. The stratification-based validation of registry and health insurance claims data can be interpreted as a successive approximation to an individual cross-linking. In order to respect the patients’ privacy, the method ought to ensure the principles of k-anonymity of the data (group wise linking) [17]. An underlying assumption is the existence of comparable subgroups in both data sources that can be matched in terms of relevant descriptive estimates (e. g., mean and standard deviation).

Principles of k-anonymity for group wise linking in small subgroups

For this study, the patients did not give their informed consent for an individual cross-linking of their personal data collected by the GermanVasc registry to corresponding health insurance claims data. In order to still be able to process the patient data, the method ought to ensure k-anonymity [17]. This notion of privacy expresses that each record is indistinguishable from at least k-1 other records with respect to its quasi-identifying attributes. Several approaches to achieve k-anonymity exist. For example, attributes that take different values for different patients but are not of interest for the research question at hand can be removed to eliminate quasi-identifiers from a record. A generalisation mechanism can be used to group values, for example, the age of the patients into the bands ≤60, 61 to 70, and ≥71. It has been proven that finding an optimal k-anonymisation is nondeterministic polynomial time (NP)-hard [18] and that even an optimal k-anonymity technique cannot prevent certain attacks [19]. For example, if all records that share the same quasi-identifier values also share the same value of a sensitive attribute, then generalising the quasi-identifiers does not prevent an attacker from inferring that value for all matching records. This homogeneity attack can be countered with a technique called l-diversity. Data are said to have the l-diversity property if the sensitive attribute takes at least l distinct values within each group of indistinguishable records [20]. However, not every possible value is equally frequent or equally informative. For example, if a disease occurs very often in a given set of patients, then the positive indicator carries less information than the negative indicator. To defend against an attacker exploiting such knowledge, the notion of t-closeness has been established. If the distance between the distribution of values in each group and the distribution in the whole table is no more than t, the data set is said to possess the t-closeness property [21].
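The three privacy notions can be checked on a toy table. The sketch below is illustrative only: the records are invented, the age bands follow the example in the text, and the distance used for the t-closeness check is the total variation distance between categorical distributions, chosen here for simplicity rather than taken from the study.

```python
from collections import Counter, defaultdict


def generalise_age(age):
    """Map an exact age to the bands used in the text."""
    if age <= 60:
        return "<=60"
    if age <= 70:
        return "61-70"
    return ">=71"


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        classes[tuple(record[q] for q in quasi_identifiers)].append(record)
    return classes


def is_k_anonymous(records, quasi_identifiers, k):
    """Every equivalence class contains at least k records."""
    groups = equivalence_classes(records, quasi_identifiers).values()
    return all(len(group) >= k for group in groups)


def is_l_diverse(records, quasi_identifiers, sensitive, ell):
    """Every class contains at least ell distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers).values()
    return all(len({r[sensitive] for r in group}) >= ell for group in groups)


def distribution(records, sensitive):
    counts = Counter(r[sensitive] for r in records)
    return {value: count / len(records) for value, count in counts.items()}


def max_distance_to_table(records, quasi_identifiers, sensitive):
    """Largest total variation distance between a class and the whole table."""
    overall = distribution(records, sensitive)
    worst = 0.0
    for group in equivalence_classes(records, quasi_identifiers).values():
        local = distribution(group, sensitive)
        dist = 0.5 * sum(abs(local.get(v, 0.0) - p) for v, p in overall.items())
        worst = max(worst, dist)
    return worst


# Invented records; "diabetes" plays the role of the sensitive attribute.
records = [
    {"age": 58, "sex": "f", "diabetes": "yes"},
    {"age": 52, "sex": "f", "diabetes": "no"},
    {"age": 66, "sex": "m", "diabetes": "yes"},
    {"age": 69, "sex": "m", "diabetes": "yes"},
    {"age": 75, "sex": "f", "diabetes": "no"},
    {"age": 81, "sex": "f", "diabetes": "yes"},
]
generalised = [{**r, "age_group": generalise_age(r["age"])} for r in records]

raw_ok = is_k_anonymous(records, ("age", "sex"), k=2)
generalised_ok = is_k_anonymous(generalised, ("age_group", "sex"), k=2)
diverse = is_l_diverse(generalised, ("age_group", "sex"), "diabetes", ell=2)
t_needed = max_distance_to_table(generalised, ("age_group", "sex"), "diabetes")
```

In this toy table, generalising age achieves 2-anonymity, but the group of men aged 61 to 70 is homogeneous with respect to diabetes, so 2-diversity fails; this is the homogeneity attack described above.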

Group wise linking and comparison for internal validity

In the prospectively collected registry data (R), we have a pre-defined set of $p$ variables that were agreed upon by consensus beforehand [10] [22], and we have $q$ variables in the health insurance claims data (C), where coded information such as International Classification of Diseases (ICD) codes can be transformed into several dummy variables (e. g., diabetes yes/no). A set of $m$ variables is present in both data sets, with $m < p$ and $m < q$ (intersection variables, e. g., age, gender, diabetes, death). For further variables (e. g., treatment costs), we cannot match data. Within the $m$ variables present in both data sets, we identify $n$ relevant subgroups ($n > m$) in the registry data, $R_1, R_2, \ldots, R_n$ (e. g., $R_1$: females undergoing bypass surgery in the registry data), and $n$ corresponding subgroups in the health insurance claims data, $C_1, C_2, \ldots, C_n$ (e. g., $C_1$: females undergoing bypass surgery in the BARMER cohort). Obviously, subgroups $R_i$ and $C_i$ ($i = 1, 2, \ldots, n$) are identical neither in their crude sample size nor in their composition, because not every female patient in the registry is insured by BARMER and not every vascular centre in Germany is recruiting for the registry, but we assume that $R_i$ and $C_i$ are more similar to each other than to other subgroups. We link the subgroups group wise ($R_1$-$C_1$, $R_2$-$C_2$, ..., $R_n$-$C_n$) in order to preserve k-anonymity. We suppose that if these linked subgroups are highly similar regarding their co-morbidity rates as measured by Elixhauser co-morbidity groups [23] [24], their outcomes will be more similar to each other than to those of other subgroups (variance between linked groups smaller than or as small as variance within groups). If we consider registry data the gold standard, increasing similarity of the linked subgroups in the 2 different data sets suggests increasing internal validity.
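Group wise linking can be sketched as follows: each source is aggregated separately into subgroup summaries, subgroups below a minimum size `k` are suppressed, and only the aggregated summaries are paired. The records, the stratifying variables, the co-morbidity indicator, and the minimum size of 5 are all hypothetical choices for the example.

```python
from collections import defaultdict


def subgroup_summary(records, strata, indicator, k):
    """Return {stratum: (n, rate)}; strata with fewer than k patients are dropped."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[s] for s in strata)].append(record[indicator])
    return {
        key: (len(flags), sum(flags) / len(flags))
        for key, flags in groups.items()
        if len(flags) >= k
    }


def link_groupwise(registry_summary, claims_summary):
    """Pair R_i with C_i on the stratum key and compare their rates."""
    linked = {}
    for key in sorted(registry_summary.keys() & claims_summary.keys()):
        n_r, rate_r = registry_summary[key]
        n_c, rate_c = claims_summary[key]
        linked[key] = {"n_registry": n_r, "n_claims": n_c,
                       "abs_diff": abs(rate_r - rate_c)}
    return linked


def make(sex, procedure, flags):
    """Helper that builds invented patient records for the example."""
    return [{"sex": sex, "procedure": procedure, "diabetes": f} for f in flags]


registry = (make("f", "bypass", [1, 1, 0, 0, 1])
            + make("m", "endovascular", [1, 0, 0, 0, 1, 0])
            + make("f", "endovascular", [1, 0]))  # too small, will be suppressed
claims = (make("f", "bypass", [1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
          + make("m", "endovascular", [1, 0, 0, 0, 1, 0, 0, 0])
          + make("m", "bypass", [0, 1, 0, 0, 1, 0, 0]))  # no registry counterpart

strata = ("sex", "procedure")
linked = link_groupwise(
    subgroup_summary(registry, strata, "diabetes", k=5),
    subgroup_summary(claims, strata, "diabetes", k=5),
)
```

Only subgroups that are large enough in both sources are compared. Smaller absolute differences in co-morbidity rates between linked subgroups would indicate higher similarity in the sense described above.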

Discussion

Adding to the methodological foundations of the validation of health insurance claims data may contribute to a more efficient utilisation of this valuable resource for health services research and quality improvement. In times of increased workload in medical care, physicians and nurses often decline additional documentation requirements, which emphasises the need to utilise data already collected for reimbursement or management of the medical care provided to patients. If health insurance claims data contain valid information regarding co-morbidities and quality indicators, research and quality improvement registries may be run with less effort by adding complementary information from these resources.

Recently, 2 comprehensive reforms of the European Union regulatory framework with major impact on real-world evidence have been implemented. On the one hand, the GDPR aims to modernise data protection and privacy in times of big data techniques and significantly strengthens the informed consent of data subjects [5] [25]. On the other hand, a new Medical Device Regulation promotes the utilisation of available real-world data for market access and surveillance of medical devices, which affects a wide spectrum of multidisciplinary vascular medicine. This extensive development of Union law fuels an ongoing controversy regarding real-world data complementing evidence from randomised controlled trials (RCT) [26] [27]. Despite their potential, there are important risks associated with the use of real-world data [3]. Most importantly, the value of registry-based and claims-based research and quality improvement depends on its validity [28]. This leads to the question of whether data from health insurance claims can be considered valid for research or quality improvement purposes. To the best of our knowledge, there is no commonly accepted international standard on how to validate these data. This is mainly due to differences in coding classifications and reimbursement systems. Furthermore, it is caused by the paucity of appropriate data sources to compare claims data with. Even the validity of data from well-designed prospective registries remains unknown until it has been validated itself. The question arises what data can be considered as real-world, and there is no simple answer. Several research groups, such as the working group for the collection and utilisation of secondary data (AGENS) of the German society for social medicine and prevention (DGSMP) and the German society for epidemiology (DGEPI), are currently evaluating suitable methods to validate health insurance claims data.

Although matching patient files or registry data to claims data is usually recognised as the gold standard, several aspects limit this approach. Firstly, to ensure the lawfulness of any processing of personal data in terms of record linkage, explicit informed consent by the data subject is necessary unless another legal justification exists. This, however, requires a great deal of effort and cost and potentially introduces risks to individual privacy [5] [17] [25]. Secondly, the reference data (e. g., registry data) must have enough internal and external validity to be suitable for validation purposes. In particular, the completeness of data and possible missing data remain an important problem of registries, limiting their scientific value. Against this backdrop, the completeness of follow-up visits in registries remains a critical issue. To counter these challenges, the data quality assurance of the GermanVasc registry data is implemented by various measures including random-sample and risk-based independent site visits [11]. Strongly related to the aspect of internal validation, the question arises whether registry data and health insurance claims data cover the same target population. There is probably a significant selection bias that should be further illuminated to weigh the value of both data sources for research and quality improvement. The items defined by the data dictionary of a registry probably differ from the ICD codes used to identify PAD patients in health insurance claims. It is well known that PAD patients differ in their risk profiles among the different stages of disease and from other patient populations. By comparing these risk profiles and clusters of co-morbidities, the approach aims to examine this aspect. Thirdly, it seems impossible to examine the validity of all different aspects of health insurance claims data. These rapidly growing data sources involve not only inhomogeneous data collected during hospitalisations but also data on medication prescriptions, outpatient treatments, and others. Thus, it will never be justified to state the validity of health insurance claims data in general. Validation projects can only demonstrate sufficient validity of data concerning a specific context including the defined target population. Lastly, this study is limited to the data from the second largest health insurance company in Germany, covering approximately 13% of Germany’s population. In Germany, approximately 73 million inhabitants (87%) are insured by 110 statutory health insurance companies and an additional 9 million inhabitants (11%) are insured by 50 private insurance companies (data for 2017/2018). However, standardisation methods can help to generalise results from single insurance providers to the German target population.

Although PAD is widespread and causes a significant burden in modern healthcare systems, approximately 50% of all recommendations in practice guidelines are based on expert consensus due to a lack of high-quality studies. Developing data privacy compliant validation methods can help to complement the insufficient knowledge base, especially in fields of medical care where evidence from RCTs remains uncommon. There is good reason to believe that these sources of real-world data will be of increasing importance in the future, and there is already a strong interest in using health insurance claims data in the further development of pragmatic trials [2]. Cross-disciplinary consortia involving healthcare researchers, statisticians, computer scientists, and jurists can help to develop suitable and feasible methods and technical infrastructures following the privacy by design principles. The IDOMENEO approach aims to contribute to this endeavour by providing insights on how and to what extent claims data can be utilised for research and quality improvement in vascular medical care.

Conclusions

The utilisation of health insurance claims data in health care research and quality improvement will increase in the future, emphasising the need to validate these data. The European Union data protection regulations complicate direct cross-linking of personal data without legal justification or informed consent. The IDOMENEO study aims to prospectively collect registry and claims data and to develop methods for a data privacy compliant validation.

Ethics

The GermanVasc registry trial complies with the Declaration of Helsinki (2013). The primary ethics approval was granted by the Hamburg Medical Chamber Ethics Committee (PV5691, January 2018) and the approval was confirmed by the local ethics committees. An insurance contract was concluded for the 10,000 patients included in this registry trial. All technical and conceptual measures, and the access to the anonymised BARMER health insurance claims data, are in accordance with European Union and German regulations.

Conflicts of interest

The authors declare no conflicts of interest.

References
1 Quinn K. After the revolution: DRGs at age 30. Ann Intern Med 2014; 160: 426-429 doi:10.7326/M13-2115

2 Choudhry NK. Randomized, Controlled Trials in Health Insurance Systems. N Engl J Med 2017; 377: 957-964 doi:10.1056/NEJMra1510058

3 Behrendt CA, Debus ES, Mani K. et al. The Strengths and Limitations of Claims Based Research in Countries With Fee for Service Reimbursement. Eur J Vasc Endovasc Surg 2018;

4 Behrendt CA, Heidemann F, Riess HC. et al. Registry and health insurance claims data in vascular research and quality improvement. Vasa 2017; 46: 11-15. doi:10.1024/0301-1526/a000589

5 Behrendt CA, Joassart Ir A, Debus ES. et al. The Challenge of Data Privacy Compliant Registry Based Research. Eur J Vasc Endovasc Surg 2018; 55: 601-602. doi:10.1016/j.ejvs.2018.02.018

6 Behrendt C-A, Pridöhl H, Schaar K et al. Klinische Register im 21. Jahrhundert – Ein Spagat zwischen Datenschutz und Machbarkeit? Chirurg 2017

7 Behrendt CA, Riess H, Harter M. et al. Guideline recommendations and quality indicators for invasive treatment of peripheral arterial disease in Germany: The IDOMENEO study for quality improvement and research in vascular medicine. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2018; 61: 218-223. doi:10.1007/s00103-017-2676-9

8 Behrendt CA, Härter M, Kriston L. et al. IDOMENEO – Ist die Versorgungsrealität in der Gefäßmedizin Leitlinien- und Versorgungsgerecht?. Gefässchirurgie 2017; 22: 41-47. doi:10.1007/s00772-016-0234-7

9 Statistisches Bundesamt (DeStatis). Krankenhausdiagnosestatistik. Gesundheitsberichterstattung des Bundes; 2014. URL: https://www.gbe-bund.de/
10 Riess HC, Debus ES, Schwaneberg T. et al. Indicators of outcome quality in peripheral arterial disease revascularisations – a Delphi expert consensus. Vasa 2018;

11 Debus ES, Kriston L, Schwaneberg T. et al. Rationale and methods of the IDOMENEO health outcomes of the peripheral arterial disease revascularisation study in the GermanVasc registry. Vasa 2018; doi:10.1024/0301-1526/a000730

12 Behrendt CA, Tsilimparis N, Diener H. et al. Einführung des GermanVasc. Gefässchirurgie 2014; 19: 403-411 doi:10.1007/s00772-014-1351-9

13 Messick S. Standards of Validity and the Validity of Standards in Performance Assessment. Educational Measurement: Issues and Practice 1995; 14: 5-8 doi:10.1111/j.1745-3992.1995.tb00881.x
14 Farin E. Die Anwendung Hierarchischer Linearer Modelle für Einrichtungsvergleiche in der Qualitätssicherung und Rehabilitationsforschung. Die Rehabilitation 2005; 44: 157-164 doi:10.1055/s-2004-834785

15 Wirtz M. Die Mehrebenenanalyse als Verfahren zur Analyse rehabilitationswissenschaftlicher Forschungsfragen. Die Rehabilitation 2018;

16 Steenkamp JBEM, Baumgartner H. Assessing Measurement Invariance in Cross-National Consumer Research. Journal of Consumer Research 1998; 25: 78-107 doi:10.1086/209528

17 Sweeney L. k-ANONYMITY: A MODEL FOR PROTECTING PRIVACY. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 2002; 10: 557-570. doi:10.1142/s0218488502001648

18 Meyerson A, Williams R. On the complexity of optimal K-anonymity. Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems – PODS '04 2004

19 Aggarwal CC, Yu PS. A General Survey of Privacy-Preserving Data Mining Models and Algorithms. Privacy-Preserving Data Mining 2008; 11-52 doi:10.1007/978-0-387-70992-5_2

20 Machanavajjhala A, Kifer D, Gehrke J. et al. L-diversity. ACM Transactions on Knowledge Discovery from Data 2007; 1: 3-es doi:10.1145/1217299.1217302

21 Li N, Li T, Venkatasubramanian S. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. 2007 IEEE 23rd International Conference on Data Engineering 2007: 106-115.

22 Behrendt CA, Bertges D, Eldrup N. et al. International Consortium of Vascular Registries Consensus Recommendations for Peripheral Revascularisation Registry Data Collection. Eur J Vasc Endovasc Surg 2018; 56: 217-237. doi:10.1016/j.ejvs.2018.04.006

23 Elixhauser A, Steiner C, Harris DR. et al. Comorbidity measures for use with administrative data. Medical care 1998; 36: 8-27

24 Quan H, Sundararajan V, Halfon P. et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical care 2005; 43: 1130-1139

25 Behrendt CA, Pridöhl H, Schaar K. et al. Clinical registers in the twenty-first century: Balancing act between data protection and feasibility? Chirurg 2017; 88: 944-949. doi:10.1007/s00104-017-0542-9

26 Björck M, Mani K. Publication of Vascular Surgical Registry Data: Strengths and Limitations. Eur J Vasc Endovasc Surg 2017; 54: 788 doi:10.1016/j.ejvs.2017.09.013

27 Bergqvist D, Björck M, Säwe J. et al. Randomized trials or population-based registries. Eur J Vasc Endovasc Surg 2007; 34: 253-256. doi:10.1016/j.ejvs.2007.06.014

28 Venermo M, Mani K, Kolh P. The quality of a registry based study depends on the quality of the data – without validation, it is questionable. Eur J Vasc Endovasc Surg 2017; 53: 611-612 doi:10.1016/j.ejvs.2017.03.017

