Generating Evidence From Computerized Healthcare Utilization Databases

Originally published 2015;65:490–498

Preliminary Remarks

Randomized clinical trials (RCTs) are regarded as the highest level of therapeutic evidence because they are based on random allocation of participants to ≥2 treatment groups, which yields groups with superimposable baseline demographic and clinical characteristics and allows the results to be safely attributed to the treatment strategies under study. Also, new treatments are compared with placebo or current interventions, which offers information on their absolute or added value. Finally, end points of documented clinical relevance are selected, so that the results have an immediate bearing on patients’ health.

However, RCTs also have limitations that can make their results of uncertain and limited application to daily life medicine.1–4 One, in RCTs, treatments are delivered according to preselected plans that make management more rigid than that adopted in real life. Two, treatments are delivered in a highly controlled environment by operators with specific competence, which results in a much lower chance of mistreatment or errors. Three, for several reasons (cost, progressive patient dropout, changes of patient residence, job instability of investigators, etc), RCTs usually last only a few years, extrapolation being required to apply their results to daily life patients with a much longer life expectancy. Four, to make data scientifically interpretable, in RCTs, vulnerable patients are usually avoided and patients are recruited based on restricted eligibility criteria, which do not reflect the demographic and clinical heterogeneity of the individuals to whom the trial results are eventually applied. Finally, in RCTs, high motivation and close follow-up make patients highly compliant with treatment, at variance with clinical practice, in which low and variable treatment adherence is common, with unmeasured but probably substantial modifications of the original trial results.5–12

Recognition of the above limitations has favored the design and conduct of trials that may more appropriately reflect clinical practice. One example is the expansion of the end points by which the efficacy of the intervention is determined to events (eg, revascularization procedures) that are diagnostically more open to errors or bias, but nevertheless of large prevalence and decision relevance in real life.13 Furthermore, to ensure a better generalizability of the results, so-called pragmatic trials are more and more frequently performed, with the aim of selecting patients more similar to those to whom the trial results are applied.14 However, it is widely thought that this does not substantially reduce the gap between the artificial environment where trial data are collected and real-life medicine.15 This has been instrumental in the growth of interest in observational studies, which could complement the results of clinical trials with information on how strong and persistent the effects of healthcare interventions are in real-life conditions. These studies can make use of 3 different data sources, ie, in-field collected data, disease registries, and electronically stored databases generated for administrative purposes by health providers or by general practitioners or specialists. In this review, we will discuss the approach based on computerized healthcare databases with the aim of highlighting their strengths, weaknesses, and potential. We will offer examples of the importance of the information provided by these databases on treatment of cardiovascular disease, with particular reference to hypertension and to the large healthcare database that has for several years covered the entire population (10 million) of the Lombardy region in Italy.

Definition of a Healthcare Database

A healthcare database can be defined as an electronic system designed to store, on an ongoing basis, disease-related data (eg, drug prescriptions, hospital diagnoses, outpatient visits, and so forth) from a well-defined dynamic population, eg, that covered by a public or private healthcare delivery system or attended by a network of general practitioners or specialists. In other terms, disease-related data of interest from a specific target population need to be available in the definition of a healthcare database.

Types, Sources, and Characteristics of Large Healthcare Data

Databases collecting healthcare data can be classified into 2 broad categories: (1) those collecting information for administrative purposes (ie, administrative or healthcare utilization [HCU] databases) and (2) those generated by medical records (MRs) to allow physicians to track information on their patients over time (ie, MR databases).16 A description of HCU and MR databases, in comparison with conventional sources of healthcare information, is provided in Table 1.17–21

Table 1. Relative Advantages (+) and Disadvantages (−) of Major Data Sources in Health Research

Desired Traits for Health Research | Prospective Data Collection: Controlled Randomized Trials | Prospective Data Collection: Longitudinal Observational Studies | Analysis of Existing Databases: Healthcare Utilization Databases | Analysis of Existing Databases: Medical Record Research Databases
Less expensive++++
Promptness of data availability++++++
Patient awareness/level of intrusion++++++
Data applicability to multiple conditions/diseases++++++
Size of collected data+++++
Patient heterogeneity representativeness++++++
Real life clinical practice representativeness+++++++
Quality/extent of clinical information+++++
Absence of confounding by indication/group comparability+++
Accessibility by health services investigators+
Accuracy of disease coding+++++
Upcoding of diseases or services*+++++++++
Control of collected information by investigators++++++
Example | Scandinavian Simvastatin Survival Study18 | The Framingham Heart Study19 | Medicaid database20 | General Practice Research Database21

Reprinted from Gandhi et al.17

*Done to maximize reimbursement.

HCU databases were initially created to supply payments to providers of health services within public or private healthcare delivery systems.22 Their management requires electronically stored data on patients’ demography, healthcare procedures, and services representing a cost for Health Authorities, such as drug dispensations, hospital admissions and diagnoses, surgical and other interventions in and outside hospitals, outpatient visits by general practitioners and specialists, laboratory examinations, vaccinations, etc. HCU databases are widespread in the United States, where they are funded by the government (Medicare, Medicaid, and Veterans Administration), large health insurance companies (eg, United Health), or health maintenance organizations (eg, Kaiser Permanente).16 More recently, HCU databases have spread to several countries of the European Union, with the advantages, compared with the United States, that data are usually collected from more stable populations and that health coverage is extended to all diseases and involves virtually all individuals. Despite these advantages, HCU databases are not as popular in Europe as in the United States, mainly because of the difficulty posed by strict privacy regulations, which has to date favored the use of MR databases.

The most important MR database is probably the UK General Practice Research Database, a large computerized database of anonymized longitudinal patient records from hundreds of general practitioners and about 3 million patients (ie, ≈5% of the United Kingdom population).2123 Similar databases, however, are available in other European countries. In the Netherlands, the Integrated Primary Care Information database involves ≈150 general practitioners and up to half a million patients.24 In Italy, the Health Search MR database involves ≈900 general practitioners and >1.2 million patients.25 In Sweden, a database that includes >75 primary care centers covering an area with ≈800 000 individuals is used for investigating several fields of healthcare research, including hypertension.26,27 In general, the clinical information provided by MR databases is much more extensive than that of HCU databases, the data including lifestyle habits, risk factors, blood pressure values, and patients’ clinical history. MR databases, however, suffer from the fact that physicians usually provide information only on the diagnoses and care for which they are directly responsible, which means that clinical data are often partial and that the patient’s overall clinical status is unlikely to be available in a comprehensive database format. Data quality may also be a problem, and the selection of participating physicians may make MR-generated data of uncertain representativeness of the more general healthcare standard.

Strengths of HCU Databases

HCU databases have important advantages. One, information can be obtained at low cost, over long time frames, and quickly because data are stored in an electronic format. Two, available data include many different healthcare services (drug prescriptions, medical visits, laboratory and instrumental examinations, hospitalizations, etc), which can be electronically linked by a unique patient code so that for each individual the assigned care, its changes over time, and its effect on ≥1 outcomes can be tracked. Three, because of the large size of the covered population (often up to millions of patients), data can reliably and promptly identify trends in the use of healthcare interventions, drugs, and devices, which may make it possible to verify the consequences of medical recommendations or changes in healthcare regulations. Four, because they are recorded independently of patient agreement, data are immune from the bias related to the selection of patients by their willingness to participate, an often forgotten limitation that is impossible to exclude in conventional research studies.28 Finally, and most importantly, HCU databases guarantee that the information reflects the state of clinical practice in the general population, this being particularly the case where healthcare is assured by a system covering the whole citizenship. The above advantages are listed in Table 2.

Table 2. Major Advantages of Using Healthcare Utilization Databases

Data Characteristics | Advantage
Data are available in electronic format | Low cost of investigation
Data include all healthcare services supplied to delivery system beneficiaries | A comprehensive healthcare history of each beneficiary of healthcare system may be available
Patients and doctors are not involved in data collection | Findings are free from bias generated by awareness of being under observation
Data cover large populations | Outcomes that rarely occur may be investigated
Data collected from unselected populations | Information reflects the state of clinical practice in the general population (particularly where healthcare is assured to the whole citizenship)

Research Applications of HCU Databases

HCU databases are particularly suitable for investigating the following areas.

Profiles of Drug Use

HCU data offer accurate information on the prevalence, incidence, and duration of drug use, which is essential for health system planning and for assessing the appropriateness of prescribing. Drug use measures (including the number of current users, the number of new or incident users, and the duration of use) can be directly obtained, and comparisons can be made between individual drugs or drug classes and between drug use at different times or in different geographical areas. This provides a comprehensive picture of the therapeutic habits of a population and of their temporal modifications.

Of special interest is the possibility offered by HCU data to study persistence and adherence with chronic therapy,29 ie, (1) the overall duration of uninterrupted drug therapy, often measured as the cumulative proportion of patients who do not experience any episode of treatment discontinuation during follow-up (persistence) and (2) the duration of drug use, often measured by the ratio between the cumulative number of days in which the medication is available and the days of the overall follow-up (adherence). Because patients are unaware of being observed, the HCU data are free from the so-called Hawthorne effect, namely the behavioral distortion that occurs when human beings know they are under observation.30 Furthermore, the adherence/persistence patterns can be determined over prolonged follow-ups and their relationship to events can be assessed by linking the results to those derived from hospitalization or other sources. Using this approach, in the HCU database from Lombardy, we showed that about one third of newly treated hypertensive patients discontinued treatment after the first prescription31 and that only slightly <50% of the remaining patients did not experience an episode of prolonged (≥3 months) treatment discontinuation during the follow-up.10 This also occurred for antidiabetic and lipid-lowering drugs, the rate of prescription coverage being in all instances related to the rate of hospitalization for cardiovascular morbid events.31 HCU data have also been shown to help clarify the factors related to, and possibly responsible for, treatment discontinuation.
In the Lombardy HCU database, discontinuation of antihypertensive treatment was found to be closely dependent on the type of drug or treatment strategy (mono or combination treatment) prescribed.10,32 It was also found to be markedly affected by patients’ demography, cotreatments, and type and severity of diseases of cardiovascular or noncardiovascular nature, and even adversely influenced by unexpected variables, such as residence in metropolitan areas and density of the population where the patient lived.33 It should be emphasized that knowledge of the factors involved in low adherence to treatment is preliminary to and fundamental for any action that aims at reducing the extent of this phenomenon in real life. For hypertension, dyslipidemia, and diabetes mellitus, this would be the most crucial means to improve cardiovascular prevention strategies because for all these diseases nonadherence is largely responsible for the strikingly low rate of their therapeutic control. This low control is the primary reason for the maintenance of cardiovascular disease as the main cause of death worldwide.34
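The two measures defined above can be sketched from dispensing records as follows. This is a minimal illustration under stated assumptions, not the method used in the Lombardy studies: each dispensation is assumed to carry a date and a number of days supplied, a "prolonged" discontinuation is approximated as a run of 90 uncovered days, overlapping supplies are not carried forward, and all function names are hypothetical.

```python
from datetime import date, timedelta

def covered_days(dispensations, start, end):
    """Calendar days in [start, end] on which medication is available.

    `dispensations` is a list of (dispensing_date, days_supplied) pairs;
    this record layout is an assumption made for illustration.
    """
    days = set()
    for dispensed_on, days_supplied in dispensations:
        for offset in range(days_supplied):
            day = dispensed_on + timedelta(days=offset)
            if start <= day <= end:
                days.add(day)
    return days

def adherence(dispensations, start, end):
    """Ratio of days with medication available to days of follow-up."""
    follow_up_days = (end - start).days + 1
    return len(covered_days(dispensations, start, end)) / follow_up_days

def is_persistent(dispensations, start, end, max_gap=90):
    """True if no run of uncovered days reaches `max_gap` (assumed 90 days)."""
    covered = covered_days(dispensations, start, end)
    gap = 0
    day = start
    while day <= end:
        gap = 0 if day in covered else gap + 1
        if gap >= max_gap:
            return False
        day += timedelta(days=1)
    return True

# Hypothetical patient: three 30-day dispensations in a 365-day follow-up.
records = [(date(2021, 1, 1), 30), (date(2021, 2, 1), 30), (date(2021, 7, 1), 30)]
start, end = date(2021, 1, 1), date(2021, 12, 31)
print(adherence(records, start, end))      # 90 covered days out of 365
print(is_persistent(records, start, end))  # False: long gap between March and July
```

Real analyses usually need further rules (for example, how to treat early refills or days spent in hospital), which is one reason why the approximation criteria should be varied in sensitivity analyses.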

Postmarketing Studies on Treatment Effectiveness

As pointed out by Cochrane ≈40 years ago,35 RCTs that test drug efficacy for regulatory purposes aim at assessing the extent to which an intervention does more good than harm under ideal circumstances (ie, whether it can work). However, once its efficacy is established, a medical intervention has to be and is applied to people and in circumstances that can be different from the original ones, making the assessment of its effect in real-world practice (ie, whether it not only can but also does work) crucially important. HCU databases can provide information on the effectiveness of treatment: (1) in patients often excluded from premarketing studies or clinical trials, eg, frail elders, patients with comorbidities, women, adolescents or younger patients, sometimes with the first demonstration of unanticipated beneficial effects; (2) over times much longer than those compatible with RCTs or with numbers far greater than those available in in-field observational studies; and (3) under the low and variable adherence to the prescribed treatment regimen that is typical of the real-life population. In this context, HCU data from Lombardy showed that in the real-world setting, patients who persisted with antihypertensive drug therapy had a 37% reduction in the risk of coronary events and a 36% reduction in the risk of cerebrovascular events.36 Likewise, good adherence with statin prescriptions was associated with a 19% reduction in the risk of ischemic heart disease37 and a 25% reduction in the risk of dementia.38 This shows that the protective effect of antihypertensive and lipid-lowering drugs is not lost when these treatments are implemented in real life, but also that in real life the degree of protection is highly variable, factors such as adherence to the treatment regimen playing a major role.

Safety Concerns

The longitudinal nature, large size (up to several million individuals), and quick availability of HCU databases make them extremely attractive also for safety studies, with perhaps an unsurpassed advantage in the case of treatment-dependent rare events or long-term effects of drugs and interventions. The advantage is made even greater by the fact that benefits and harms of a given medicament or therapeutic strategy can be assessed on a comparable scale, thereby providing regulatory agencies and caregivers with material for a balanced decision on the net beneficial effect of a therapeutic approach when applied to the general population.1 An example is offered by the examination, via the HCU database, of the long-term benefit and harm of statin administration in the Lombardy population, the benefit consisting of the reduction of coronary events and the harm of the reported greater risk of new onset type 2 diabetes mellitus. Over a 6-year follow-up, patients with high adherence to the prescribed statin therapy showed a greater risk of developing diabetes mellitus than patients with low adherence.39 They also showed, however, a clear-cut reduction of the risk of coronary events,37 the protective effect exceeding the diabetes mellitus risk, even though the latter was greater (+32%) than that detected in randomized trials.40 Indeed, the long-term balance may be even more in favor of the beneficial effect of statins than that calculated in the Lombardy HCU database because in RCTs statins have also been shown to reduce the risk of stroke.41 Furthermore, as mentioned above, patients with a high level of adherence to lipid-lowering treatment also showed a significant reduction in the incidence of dementia,38 a condition associated with a marked increase in the direct and indirect costs to be covered by health service systems.


Cost–Effectiveness Analyses

Cost–effectiveness analyses are a major challenge for cardiovascular disease, particularly when applied to conditions such as hypertension, dyslipidemias, and diabetes mellitus in which both the risk of events and the need for treatment are spread over the lifetime. They are also of key importance for healthcare decisions because, for a given level of benefit, resource allocation to the less costly alternative is necessary,42 especially at times of financial crisis. This is also an area in which HCU databases may offer an important contribution because (1) calculation of benefit can be based on a real-life long-term event incidence and (2) direct costs can comprehensively include all or most healthcare items (drugs, medical visits, hospitalizations, etc), also based on real-life data.43 An example of the results that can be obtained is offered by the calculation of the cost–effectiveness of healthcare strategies that would increase the adherence to antihypertensive or lipid-lowering treatments in the Lombardy population.44,45 From the cost and event incidence data provided by the database itself, it was possible to conclude that in either case the increase in cost associated with an increased adherence was, at any adherence level, largely outweighed by the saving inherent in the event reduction, supporting the need for health authorities to engage in an effective adherence implementation policy. Methodological improvements of the model are desirable to raise the potential of HCU data in this direction and thus make them the basis for increasingly evidence-based decisions by healthcare providers.

Linkage With Other Databases

HCU databases can be linked with MR data or primary care center databases (as done by the Swedish Board of Health and Welfare26,27), as well as with other cohorts that are richer in clinical information. This helps reduce their main limitation (the small amount of clinical data, see below), while providing these other databases with the chance of a longitudinal follow-up on outcomes they would otherwise be deprived of.46

Limitations of HCU Databases and Their Possible Correction

The Figure shows that interpretation of HCU databases needs to consider several sources of bias. The 3 main possible biases are discussed in detail below.


Figure. Observational studies based on Healthcare Utilization Databases that investigate the relationship between therapeutic regimen and outcome are affected by 3 sources of uncertainty, ie, exposure misclassification, outcome misclassification, and confounding.

Prescription Versus Consumption

The validity of studies performed with HCU data is based on the assumption that drug prescriptions correspond to drug consumption. There is, however, no guarantee that this is always the case, and indeed it is likely that in many patients the prescribed drugs are not consumed. This implies that in the real world, discontinuation of and adherence to treatment may be even worse than the quantification obtained via the HCU databases. In other words, the HCU data, disappointing as they appear, can only err on the side of optimism; the possibility that they provide an erroneously unfavorable view of the real-life situation is highly unlikely.

It should also be mentioned that this type of bias, ie, that an unknown fraction of patients to whom the drug of interest is prescribed does not consume the drug, means the introduction in the database of false-positive drug users. If the medication does not affect the outcome of interest, this does not produce any distortion of the results. On the contrary, if the drug has an effect, this type of bias tends to mask (namely to drag toward the null) the estimated drug–outcome association.
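The pull toward the null can be illustrated with simple arithmetic. The sketch below assumes that patients who do not consume the prescribed drug carry the same risk as unexposed patients, that the comparison group is truly unexposed, and that there is no confounding; the true risk ratio of 0.70 and the nonconsumer fractions are hypothetical values chosen only for illustration.

```python
def observed_risk_ratio(true_rr, nonconsumer_fraction):
    """Risk ratio seen when a fraction of recorded users never took the drug.

    The risk in the group labelled as users is a weighted mix of the exposed
    risk (true_rr times the baseline risk) and the baseline risk itself, so
    the baseline risk cancels out of the ratio.
    """
    return (1 - nonconsumer_fraction) * true_rr + nonconsumer_fraction

# Hypothetical protective drug (true risk ratio 0.70) with growing
# fractions of false-positive users among patients with a prescription.
for fraction in (0.0, 0.1, 0.3, 0.5):
    print(fraction, round(observed_risk_ratio(0.70, fraction), 3))
```

With no real drug effect (true risk ratio of 1) the observed ratio stays at 1 whatever the nonconsumer fraction, which matches the statement that no distortion arises when the medication does not affect the outcome.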

Other Types of Drug Exposure Misclassification

Drug prescription data may be misclassified because the prescription is incomplete, its reading is erroneous, or mistakes in drug coding are made. However, when filling drug prescriptions, pharmacists have little room for interpretation, and reimbursement by Health Authorities is made on the basis of detailed, complete, and accurate claims that are submitted electronically.47–49 Pharmacy dispensing information is, therefore, expected to provide highly accurate data, also because filing an incorrect report about dispensed drugs has legal consequences.50

In spite of these reassuring considerations, in several databases, incompleteness of information may generate exposure misclassification. Incompleteness often consists of missing data on (1) free sample drugs, (2) nonreimbursed over-the-counter medications,51 (3) drugs dispensed during hospitalizations,52 and (4) calculations of days covered by a given dispensation. For instance, when comparing 2 antihypertensive medications, if 1 of the 2 is (1) more often delivered by the pharmacist over-the-counter (eg, because of its lower cost), (2) dispensed also in the hospital setting, or (3) prescribed at a higher than the defined dose (eg, because this is indicated for more severe patients), a misclassification is generated, more person-time being classified as unexposed when it is in fact exposed (or vice versa) in one comparison group than in the other. Thus, some forms of approximation are unavoidable for most HCU-based studies.53,54 The effect of these approximations on the validity of the results should always be explored by sensitivity analyses (ie, by evaluating the robustness of the findings through modifications of the approximation criteria).55,56

Outcome Misclassification

To generate reliable data, an HCU database must guarantee that the selected outcome is assessed with sufficient sensitivity and specificity. For example, if the database misses a noticeable fraction of patients experiencing an acute myocardial infarction or a stroke, sensitivity is limited, whereas if the database erroneously attributes to a myocardial infarction or a stroke a noticeable number of hospitalizations generated by other diseases, the limitation involves specificity. It is important to mention that for proper assessment of the relationship of a given treatment with outcome, a limited specificity is more damaging than a limited sensitivity, a 100% specificity of the outcome assessment allowing an unbiased estimate of the measured association, irrespective of the sensitivity value.57 It should also be mentioned that, based on the available literature, in HCU data diagnostic misclassification is not as common as it might at first sight appear.58 A recent comprehensive study has shown that in the HCU databases, sensitivity was only small or moderate, whereas specificity was usually ≥95% (Table 3).58,59 In general, a high specificity may be expected when diagnoses are based on hospital records, whereas performance is likely to be worse when diagnoses are based on out-of-hospital identification of diseases (eg, by general practitioners or specialists).
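The different weight of sensitivity and specificity can be checked with a short calculation. The sketch below assumes that misclassification is nondifferential (the same sensitivity and specificity in treated and untreated patients) and that the association is measured as a risk ratio; the true risks of 2% and 4% are hypothetical.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Proportion of patients recorded as having the outcome."""
    true_positives = sensitivity * true_risk
    false_positives = (1 - specificity) * (1 - true_risk)
    return true_positives + false_positives

def observed_rr(risk_treated, risk_untreated, sensitivity, specificity):
    """Risk ratio computed from the recorded (misclassified) outcomes."""
    return (observed_risk(risk_treated, sensitivity, specificity)
            / observed_risk(risk_untreated, sensitivity, specificity))

# Hypothetical true risks: 2% in treated, 4% in untreated (true ratio 0.5).
print(observed_rr(0.02, 0.04, sensitivity=0.25, specificity=1.00))  # ratio preserved
print(observed_rr(0.02, 0.04, sensitivity=1.00, specificity=0.95))  # pulled toward 1
```

With perfect specificity, the sensitivity multiplies both risks and cancels out of the ratio; with imperfect specificity, the same number of false positives is added to both groups, which compresses the ratio toward 1 even when no true case is missed.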

Table 3. Sensitivity and Specificity of Diagnoses Made at Discharge From Hospital or Outpatient Care

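Diagnosis | Outpatient Care Billing Diagnoses: Sensitivity, % | Outpatient Care Billing Diagnoses: Specificity, % | Hospital Discharge Diagnoses: Sensitivity, % | Hospital Discharge Diagnoses: Specificity, %
Diabetes mellitus | 62.6 | 97.2 | 88 | 99.4
Renal failure | 18.6 | 99.0 | 88 | 99.4
Chronic liver disease | 27.6 | 99.8 | 100 | 100
Any cancer | 44.8 | 95.0 | 91 | 100
Peptic ulcer disease | 27.6 | 94.6 | 92 | 100
Congestive heart failure | 41.5 | 96.1 | 85 | 99
Acute myocardial infarction | 25.4 | 96.8 | 94 | 100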
Stevens–Johnson syndrome95

Reprinted from Schneeweiss and Avorn.58 COPD indicates chronic obstructive pulmonary disease.

Despite these considerations, diagnostic validation is usually recommended not only for out-of-hospital but also for hospital-derived data. This may make use of findings from published validation studies performed in other populations although some60 thought that validation data generated from the in-study database are always required. In this context, however, a major difficulty is represented by the privacy regulations that in several countries prevent individual MRs from being thoroughly reviewed. Thus, diagnostic validation via sensitivity analyses that verify the robustness of the findings by applying credible sensitivity and specificity ranges to the source of diagnostic information is often the only practical option.
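A sensitivity analysis of this kind can be sketched as follows. The example back-corrects the recorded outcome proportions over a grid of candidate sensitivity and specificity values and reports the range of corrected risk ratios. It assumes nondifferential misclassification and uses the standard algebraic correction for a misclassified proportion; the recorded proportions and the candidate ranges are hypothetical, and the function names are not taken from any cited study.

```python
def corrected_risk(recorded_risk, sensitivity, specificity):
    """Invert recorded = se * true + (1 - sp) * (1 - true) to recover the true risk."""
    return (recorded_risk - (1 - specificity)) / (sensitivity + specificity - 1)

def corrected_rr_range(recorded_treated, recorded_untreated,
                       sensitivities, specificities):
    """Smallest and largest corrected risk ratio over the candidate grid."""
    estimates = []
    for se in sensitivities:
        for sp in specificities:
            treated = corrected_risk(recorded_treated, se, sp)
            untreated = corrected_risk(recorded_untreated, se, sp)
            # Skip combinations that are incompatible with the recorded data.
            if treated > 0 and untreated > 0:
                estimates.append(treated / untreated)
    if not estimates:
        raise ValueError("no candidate values are compatible with the data")
    return min(estimates), max(estimates)

# Hypothetical recorded outcome proportions in treated and untreated patients.
low, high = corrected_rr_range(0.0258, 0.0416,
                               sensitivities=(0.6, 0.8, 1.0),
                               specificities=(0.985, 0.99, 1.0))
print(round(low, 3), round(high, 3))
```

If the conclusion holds across the whole range, the finding can be considered robust to plausible diagnostic error; if it changes sign or loses relevance within the range, validation in the in-study database becomes more important.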

Another potential limitation of HCU databases is that certain diseases cannot be reliably identified from hospital records because they rarely lead to hospitalization. However, the issue can be approached by identifying the addition of their therapeutic drugs to the medical prescription, provided that the drugs are only used for the disease under study. A pertinent example is the effect of statins on the onset of type 2 diabetes mellitus because (1) except perhaps in its late stage, diabetes mellitus is only occasionally the cause of hospital admission and (2) antidiabetic drugs are not used for any other disease. The use of antidiabetic medications as a proxy for diabetes mellitus onset has allowed interesting observations to be made on the influence of cardiovascular treatments on the development of diabetes mellitus in the general population.39,61 It should be emphasized, however, that this approach is likely to have a low sensitivity because in several cases appearance of the disease may not be associated with specific treatment initiation. Specificity, on the other hand, may be optimized by using strict criteria for the disease identification (eg, by diagnosing new onset diabetes mellitus only in patients with several consecutive prescriptions of antidiabetic drugs). This strengthens the above-mentioned need, for the HCU database approach, to always check the validity of the results by sensitivity analysis.


Confounding

There is no question that the most important concern about HCU databases is confounding. This is particularly the case for that common and often insidious and uncorrectable form of confounding, endemic to pharmacoepidemiology and healthcare research studies, that is known as confounding by indication. This occurs when a given medication is associated with an increased likelihood of an outcome, for which it is thought to be responsible, whereas the explanation lies in its greater use in patients more prone to develop the outcome.

Strategies to adjust for such a bias vary depending on the amount of information included in the database. If all or most factors representing potential confounders are available, propensity-score methods simultaneously accounting for a large number of patient characteristics can be used with an effective reduction of confounding biases.62 However, because HCU databases generally have a limited amount of clinical information, methods accounting for unmeasured confounders should be adopted to avoid or at least limit the effect of bias. One such method, called user-only design, is to restrict the cohort analysis to patients receiving the in-study medicaments, thereby comparing only individuals in whom the need to use them was initially established.63

Other methods aim at ensuring a better comparability of different therapeutic strategies by other types of restriction of the original cohort. In this respect, a pivotal study of statin treatment and 1-year mortality showed a reduced influence of confounding by using progressively stricter criteria for cohort inclusion.64 Starting from a mortality reduction of 68% among the entire cohort of statin users, the restriction of the cohort to incident users only,65 the selection of a comparison group similar to the intervention group and the exclusion of patients with contraindications and low adherence to treatment led to a final estimate of a 28% reduction, a figure similar to that obtained from the pooled estimate by RCTs.64 In general, it is thought that a set of 3 restrictions can be adopted in comparative effectiveness research with no major impairment of data generalizability.28

Restricting the investigated cohort only to patients who experience the in-study outcome (case-only design) is another possibility, its use in pharmacoepidemiology and healthcare research via methods such as the case-crossover66 and the self-controlled case series design67 being regarded as attractive (particularly in patients with transient exposure to or acute outcomes from drug use) because of the ability to control for time-invariant confounders. For antihypertensive therapy, an example has recently been provided by a study that compared the risk of discontinuing antihypertensive therapy in patients receiving generic versus brand-name drugs.68 Given the possibility that physicians would prescribe one or the other drug type according to the severity of the clinical profile, only patients discontinuing treatment who experienced both generic and brand-name drug exposures during a suitable follow-up were analyzed. The results showed that treatment discontinuation did not differ in the 2 conditions, a conclusion drawn from a within-patient comparison that could be reasonably assumed to be exempt from the above-mentioned possible between-patient imbalance.

Finally, analytic techniques such as sensitivity analysis,69 instrumental variable methods,70 or propensity score calibration71 are increasingly applied to HCU databases to account for residual confounding. Although a description of their rationale and functioning is beyond the scope of the present review, an example of the advantages of these methods may help the reader to approach this complex area. In a study on the Lombardy HCU database, it was observed that, compared with hypertensive patients starting treatment with a fixed-dose combination of 2 antihypertensive drugs, those on a free (or liberal) combination exhibited a significant 15% increase in the risk of coronary or cerebrovascular morbid events.72 Although a medical explanation of this finding could be considered, the possibility that it originated from confounding (ie, a more frequent use of free combinations in patients with more severe hypertension or a worse cardiovascular risk profile, in the belief that giving drugs separately makes titration to full effect easier and blood pressure reduction more balanced) prevented any reliable interpretation of the results. When sources of selective prescribing are omitted (eg, because these data are not available from the HCU database used for investigating the issue), biased estimates of the drug–outcome association are systematically generated. However, biased estimates may be externally adjusted if information on physicians’ prescriptive behavior and on the strength of the confounder–outcome association (ie, to what extent patients’ clinical characteristics affect prescribing and cardiovascular risk, respectively) is available.
Information on physicians’ prescriptive behavior may be obtained from MR data on a sample of the population covered by the HCU database, whereas data on the confounder–outcome association may be derived from epidemiological or intervention studies (ideally in the same or in a similar population). The 2 sets of data are then combined to estimate the bias factor (ie, the residual bias that would result from failure to adjust for these confounders). In addition, a Monte Carlo sampling procedure may be used to deal with the random uncertainty of externally adjusted estimates,73 the results being tested against an unmeasured confounder selected among those known to be common in the population under study and to have a substantial effect on the outcome of interest. In the above-mentioned case, these corrections abolished the increased cardiovascular risk seen in patients starting treatment with a free compared with a fixed-dose drug combination (odds ratio, 1.00; 95% confidence interval, 0.89–1.12).
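
A minimal sketch of external adjustment with Monte Carlo sampling is given below, assuming a single binary unmeasured confounder and the standard bias-factor formula for a relative risk (used here as an approximation for the odds ratio when the outcome is uncommon). Only the observed estimate of 1.15 comes from the text above; the confounder counts, validation sample sizes, confounder–outcome relative risk, and standard errors are hypothetical, and the function names are illustrative rather than those used in the cited studies.72,73

```python
import math
import random


def bias_factor(p_conf_exposed, p_conf_unexposed, rr_conf_outcome):
    """Bias produced by one unmeasured binary confounder.

    p_conf_exposed / p_conf_unexposed: confounder prevalence among exposed
    (eg, free combination) and unexposed (eg, fixed-dose) patients.
    rr_conf_outcome: relative risk linking the confounder to the outcome.
    """
    excess = rr_conf_outcome - 1.0
    return (p_conf_exposed * excess + 1.0) / (p_conf_unexposed * excess + 1.0)


def externally_adjusted(rr_observed, p_conf_exposed, p_conf_unexposed,
                        rr_conf_outcome):
    """Observed estimate divided by the bias factor."""
    return rr_observed / bias_factor(p_conf_exposed, p_conf_unexposed,
                                     rr_conf_outcome)


def monte_carlo_adjusted(rr_observed, se_log_rr,
                         conf_exposed, n_exposed,
                         conf_unexposed, n_unexposed,
                         rr_conf_outcome, se_log_rr_conf,
                         n_draws=5000, seed=1):
    """Median and 95% simulation interval of the adjusted estimate.

    Assumed sampling model (an illustration, not the authors' procedure):
    log-normal draws for both relative risks and Beta draws for the
    confounder prevalences seen in a validation sample.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        rr_obs = math.exp(rng.gauss(math.log(rr_observed), se_log_rr))
        p1 = rng.betavariate(conf_exposed + 1, n_exposed - conf_exposed + 1)
        p0 = rng.betavariate(conf_unexposed + 1,
                             n_unexposed - conf_unexposed + 1)
        rr_cd = math.exp(rng.gauss(math.log(rr_conf_outcome), se_log_rr_conf))
        draws.append(externally_adjusted(rr_obs, p1, p0, rr_cd))
    draws.sort()
    return (draws[n_draws // 2],
            draws[int(0.025 * n_draws)],
            draws[int(0.975 * n_draws) - 1])


# Hypothetical validation data: confounder present in 120/300 exposed and
# 90/300 unexposed patients; confounder-outcome relative risk of 1.8.
median, low, high = monte_carlo_adjusted(1.15, 0.05, 120, 300, 90, 300,
                                         1.8, 0.10)
print(f"adjusted estimate {median:.2f} ({low:.2f} to {high:.2f})")
```

When the confounder is equally common in the two treatment groups the bias factor equals 1 and the estimate is unchanged; the more the prevalences differ and the stronger the confounder–outcome association, the larger the correction.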

Challenges and Concluding Recommendations

HCU databases are attractive because they provide huge amounts of data. Their representativeness of the population under investigation and their capacity to offer longitudinal information over extended follow-ups at the individual patient level make this approach useful for clinical observational research, especially on drug use and outcomes. Its applications also include studies of physician prescribing and patient compliance, as well as studies focusing on the safety, effectiveness, and cost–effectiveness of therapeutic strategies. HCU data cannot substitute for RCTs, but they may fill gaps where RCTs will probably not be done and generate real-life–based hypotheses to be tested by RCTs.

However, quantity has little merit in the absence of a sufficient level of quality, and, as far as HCU databases are concerned, several examples have been given in which debatable eligibility criteria, incomplete or incorrect linkage variables, errors in the quantification of care exposure, inaccurate identification of outcomes and, above all, confounding resulted in incorrect conclusions.74,75 This may cause damage because HCU databases offer great opportunities but also carry, as emphasized in a recent editorial,76 the responsibility that their conclusions may lead to decisions involving the health care of a large number of beneficiaries. Proper use of HCU databases must not overlook the challenges that are still inherent to this approach. Although technical progress in most instances allows even huge and rapidly growing amounts of data to be collected, stored, and quickly made available for analysis, some concern remains about privacy and security regulations. Furthermore, completeness and quality of data remain an issue for some HCU databases, and the great disparity between the data included in different databases (because they are usually collected under different jurisdictions for financial rather than for clinical or research purposes)77 represents a barrier to their pooling and to comparisons between different regions or countries. Finally, if data are analyzed uncritically, ie, without consideration for possible misclassification and confounding, skepticism is generated and the credibility of the approach is undermined. Fortunately, data collection and analysis have undergone substantial progress in the past decade,15,58,78–80 so that poor-quality or incomplete databases can no longer be justified on technical grounds. Further important improvements are expected in the near future because the preservation of privacy and security is being addressed by new authentication approaches and policies that better safeguard patient-identifiable data.
Furthermore, reimbursement policies are making increasing use of automatic devices (eg, individual magnetic cards) that generate data virtually free from errors or omissions. All this justifies the opinion of the authors of this review that HCU data will be increasingly recognized as a powerful tool in the research area to which they belong, as long as they are (1) used with appropriate attention to their potential limits, (2) based on transparent protocols81 and recognized quality standards,82 and (3) analyzed by proper statistical methods that allow the pitfalls inherent to this approach to be corrected, at least in part.


This article was sent to David A. Calhoun, Guest Editor, for review by expert referees, editorial decision, and final disposition.

Correspondence to Giovanni Corrao, Dipartimento di Statistica e Metodi Quantitativi, Sezione di Biostatistica, Epidemiologia e Sanità Pubblica, Università degli Studi di Milano-Bicocca, Via Bicocca degli Arcimboldi, 8, Edificio U7, 20126 Milano, Italy.


  • 1. Dieppe P, Bartlett C, Davey P, Doyal L, Ebrahim S. Balancing benefits and harms: the example of non-steroidal anti-inflammatory drugs. BMJ. 2004; 329:31–34. doi: 10.1136/bmj.329.7456.31.
  • 2. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000; 342:1887–1892. doi: 10.1056/NEJM200006223422507.
  • 3. Concato J. Observational versus experimental studies: what’s the evidence for a hierarchy? NeuroRx. 2004; 1:341–347. doi: 10.1602/neurorx.1.3.341.
  • 4. Avorn J. In defense of pharmacoepidemiology–embracing the yin and yang of drug research. N Engl J Med. 2007; 357:2219–2221. doi: 10.1056/NEJMp0706892.
  • 5. Fitz-Simon N, Bennett K, Feely J. A review of studies of adherence with antihypertensive drugs using prescription databases. Ther Clin Risk Manag. 2005; 1:93–106.
  • 6. Mazzaglia G, Mantovani LG, Sturkenboom MC, Filippi A, Trifirò G, Cricelli C, Brignoli O, Caputi AP. Patterns of persistence with antihypertensive medications in newly diagnosed hypertensive patients in Italy: a retrospective cohort study in primary care. J Hypertens. 2005; 23:2093–2100.
  • 7. Van Wijk BL, Klungel OH, Heerdink ER, de Boer A. Rate and determinants of 10-year persistence with antihypertensive drugs. J Hypertens. 2005; 23:2101–2107.
  • 8. Burke TA, Sturkenboom MC, Lu SE, Wentworth CE, Lin Y, Rhoads GG. Discontinuation of antihypertensive drugs among newly diagnosed hypertensive patients in UK general practice. J Hypertens. 2006; 24:1193–1200. doi: 10.1097/01.hjh.0000226211.95936.f5.
  • 9. Elliott WJ, Plauschinat CA, Skrepnek GH, Gause D. Persistence, adherence, and risk of discontinuation associated with commonly prescribed antihypertensive drug monotherapies. J Am Board Fam Med. 2007; 20:72–80. doi: 10.3122/jabfm.2007.01.060094.
  • 10. Corrao G, Zambon A, Parodi A, Poluzzi E, Baldi I, Merlino L, Cesana G, Mancia G. Discontinuation of and changes in drug therapy for hypertension among newly-treated patients: a population-based study in Italy. J Hypertens. 2008; 26:819–824. doi: 10.1097/HJH.0b013e3282f4edd7.
  • 11. Vrijens B, Vincze G, Kristanto P, Urquhart J, Burnier M. Adherence to prescribed antihypertensive drug treatments: longitudinal study of electronically compiled dosing histories. BMJ. 2008; 336:1114–1117. doi: 10.1136/bmj.39553.670231.25.
  • 12. Perreault S, Dragomir A, Blais L, Bérard A, Lalonde L, White M. Impact of adherence to statins on chronic heart failure in primary prevention. Br J Clin Pharmacol. 2008; 66:706–716. doi: 10.1111/j.1365-2125.2008.03269.x.
  • 13. Hlatky MA, Ray RM, Burwen DR, et al. Use of Medicare data to identify coronary heart disease outcomes in the Women’s Health Initiative. Circ Cardiovasc Qual Outcomes. 2014; 7:157–162. doi: 10.1161/CIRCOUTCOMES.113.000373.
  • 14. National Heart, Lung, and Blood Institute Working Group on Future Directions in Hypertension Treatment Trials. Major clinical trials of hypertension: what should be done next? Hypertension. 2005; 46:1–6.
  • 15. Sox HC, Goodman SN. The methods of comparative effectiveness research. Annu Rev Public Health. 2012; 33:425–445. doi: 10.1146/annurev-publhealth-031811-124610.
  • 16. Strom BL. Pharmacoepidemiology. 4th ed. Chichester: John Wiley & Sons Ltd; 2005.
  • 17. Gandhi SK, Salmon W, Kong SX, Zhao SZ. Administrative databases and outcomes assessment: an overview of issues and potential utility. J Managed Care Pharm. 1999; 5:215–222.
  • 18. Scandinavian Simvastatin Survival Study Group. Randomized trial of cholesterol lowering in 4444 patients with coronary heart disease: the Scandinavian simvastatin survival study (4S). Lancet. 1994; 344:1383–1389.
  • 19. Wilson PW, D’Agostino RB, Sullivan L, Parise H, Kannel WB. Overweight and obesity as determinants of cardiovascular risk: the Framingham experience. Arch Intern Med. 2002; 162:1867–1872.
  • 20. Hennessy S, Carson JL, Ray WA, Strom BL. Medicaid databases. In: Strom BL, ed. Pharmacoepidemiology. 4th ed. Chichester: John Wiley & Sons; 2005:281–94.
  • 21. Gelfand JM, Margolis DJ, Dattani H. The UK General Practice Research Database. In: Strom BL, ed. Pharmacoepidemiology. 4th ed. New York: John Wiley; 2005:337–46.
  • 22. Suissa S, Garbe E. Primer: administrative health databases in observational studies of drug effects–advantages and disadvantages. Nature Clin Pract Rheumatol. 2007; 3:725–732.
  • 23. Walley T, Mantgani A. The UK General Practice Research Database. Lancet. 1997; 350:1097–1099. doi: 10.1016/S0140-6736(97)04248-7.
  • 24. van der Lei J, Duisterhout JS, Westerhof HP, van der Does E, Cromme PV, Boon WM, van Bemmel JH. The introduction of computer-based patient records in The Netherlands. Ann Intern Med. 1993; 119:1036–1041.
  • 25. Filippi A, Vanuzzo D, Bignamini AA, Sessa E, Brignoli O, Mazzaglia G. Computerized general practice databases provide quick and cost-effective information on the prevalence of angina pectoris. Ital Heart J. 2005; 6:49–51.
  • 26. Kjeldsen SE, Stålhammar J, Hasvold P, Bodegard J, Olsson U, Russell D. Effects of losartan vs candesartan in reducing cardiovascular events in the primary treatment of hypertension. J Hum Hypertens. 2010; 24:263–273. doi: 10.1038/jhh.2009.77.
  • 27. Hasvold LP, Bodegård J, Thuresson M, Stålhammar J, Hammar N, Sundström J, Russell D, Kjeldsen SE. Diabetes and CVD risk during angiotensin-converting enzyme inhibitor or angiotensin II receptor blocker treatment in hypertension: a study of 15,990 patients. J Hum Hypertens. 2014; 28:663–669. doi: 10.1038/jhh.2014.43.
  • 28. Schneeweiss S. Developments in post-marketing comparative effectiveness research. Clin Pharmacol Ther. 2007; 82:143–156. doi: 10.1038/sj.clpt.6100249.
  • 29. Heckbert SR, Kooperberg C, Safford MM, Psaty BM, Hsia J, McTiernan A, Gaziano JM, Frishman WH, Curb JD. Comparison of self-report, hospital discharge codes, and adjudication of cardiovascular events in the Women’s Health Initiative. Am J Epidemiol. 2004; 160:1152–1158. doi: 10.1093/aje/.
  • 30. Andrade SE, Kahler KH, Frech F, Chan KA. Methods for evaluation of medication adherence and persistence using automated databases. Pharmacoepidemiol Drug Saf. 2006; 15:565–574; discussion 575. doi: 10.1002/pds.1230.
  • 31. McCarney R, Warner J, Iliffe S, van Haselen R, Griffin M, Fisher P. The Hawthorne Effect: a randomised, controlled trial. BMC Med Res Methodol. 2007; 7:30. doi: 10.1186/1471-2288-7-30.
  • 32. Corrao G, Zambon A, Parodi A, Merlino L, Mancia G. Incidence of cardiovascular events in Italian patients with early discontinuations of antihypertensive, lipid-lowering, and antidiabetic treatments. Am J Hypertens. 2012; 25:549–555. doi: 10.1038/ajh.2011.261.
  • 33. Corrao G, Parodi A, Zambon A, Heiman F, Filippi A, Cricelli C, Merlino L, Mancia G. Reduced discontinuation of antihypertensive treatment by two-drug combination as first step. Evidence from daily life practice. J Hypertens. 2010; 28:1584–1590. doi: 10.1097/HJH.0b013e328339f9fa.
  • 34. Mancia G, Zambon A, Soranna D, Merlino L, Corrao G. Factors involved in the discontinuation of antihypertensive drug therapy: an analysis from real life data. J Hypertens. 2014; 32:1708–1715; discussion 1716. doi: 10.1097/HJH.0000000000000222.
  • 35. Mancia G, Fagard R, Narkiewicz K, et al; Task Force Members. 2013 ESH/ESC Guidelines for the management of arterial hypertension: the Task Force for the management of arterial hypertension of the European Society of Hypertension (ESH) and of the European Society of Cardiology (ESC). J Hypertens. 2013; 31:1281–1357. doi: 10.1097/
  • 36. Cochrane AL. Effectiveness and Efficiency: Random Reflections on Health Services. London: Nuffield Provincial Hospitals Trust; 1972.
  • 37. Corrao G, Parodi A, Nicotra F, Zambon A, Merlino L, Cesana G, Mancia G. Better compliance to antihypertensive medications reduces cardiovascular risk. J Hypertens. 2011; 29:610–618. doi: 10.1097/HJH.0b013e328342ca97.
  • 38. Corrao G, Conti V, Merlino L, Catapano AL, Mancia G. Results of a retrospective database analysis of adherence to statin therapy and risk of nonfatal ischemic heart disease in daily clinical practice in Italy. Clin Ther. 2010; 32:300–310. doi: 10.1016/j.clinthera.2010.02.004.
  • 39. Corrao G, Ibrahim B, Nicotra F, Zambon A, Merlino L, Pasini TS, Catapano AL, Mancia G. Long-term use of statins reduces the risk of hospitalization for dementia. Atherosclerosis. 2013; 230:171–176. doi: 10.1016/j.atherosclerosis.2013.07.009.
  • 40. Corrao G, Ibrahim B, Nicotra F, Soranna D, Merlino L, Catapano AL, Tragni E, Casula M, Grassi G, Mancia G. Statins and the risk of diabetes: evidence from a large population-based cohort study. Diabetes Care. 2014; 37:2225–2232. doi: 10.2337/dc13-2215.
  • 41. Lv HL, Jin DM, Liu M, Liu YM, Wang JF, Geng DF. Long-term efficacy and safety of statin treatment beyond six years: a meta-analysis of randomized controlled trials with extended follow-up. Pharmacol Res. 2014; 81:64–73. doi: 10.1016/j.phrs.2014.02.006.
  • 42. Wang W, Zhang B. Statins for the prevention of stroke: a meta-analysis of randomized controlled trials. PLoS One. 2014; 9:e92388. doi: 10.1371/journal.pone.0092388.
  • 43. Garber A, Weinstein M, Torrance G, Kamlet M. Theoretical foundations of cost-effectiveness analysis. In: Gold M, Siegel J, Russell L, Weinstein M, eds. Cost-Effectiveness in Health and Medicine. New York, NY: Oxford University Press; 1996:25–53.
  • 44. Weinstein MC, Stason WB. Foundations of cost-effectiveness analysis for health and medical practices. N Engl J Med. 1977; 296:716–721. doi: 10.1056/NEJM197703312961304.
  • 45. Corrao G, Scotti L, Zambon A, Baio G, Nicotra F, Conti V, Capri S, Tragni E, Merlino L, Catapano AL, Mancia G. Cost-effectiveness of enhancing adherence to therapy with statins in the setting of primary cardiovascular prevention. Evidence from an empirical approach based on administrative databases. Atherosclerosis. 2011; 217:479–485. doi: 10.1016/j.atherosclerosis.2011.04.014.
  • 46. Scotti L, Baio G, Merlino L, Cesana G, Mancia G, Corrao G. Cost-effectiveness of enhancing adherence to therapy with blood pressure-lowering drugs in the setting of primary cardiovascular prevention. Value Health. 2013; 16:318–324. doi: 10.1016/j.jval.2012.11.008.
  • 47. Stergachis AS. Record linkage studies for postmarketing drug surveillance: data quality and validity considerations. Drug Intell Clin Pharm. 1988; 22:157–161.
  • 48. Levy AR, O’Brien BJ, Sellors C, Grootendorst P, Willison D. Coding accuracy of administrative drug claims in the Ontario Drug Benefit database. Can J Clin Pharmacol. 2003; 10:67–71.
  • 49. McKenzie DA, Semradek J, McFarland BH, Mullooly JP, McCamant LE. The validity of medicaid pharmacy claims for estimating drug use among elderly nursing home residents: the Oregon experience. J Clin Epidemiol. 2000; 53:1248–1257.
  • 50. Strom BL. Overview of automated databases in pharmacoepidemiology. In: Strom BL, ed. Pharmacoepidemiology. 4th ed. New York: Wiley; 2005:219–222.
  • 51. Yood MU, Campbell UB, Rothman KJ, Jick SS, Lang J, Wells KE, Jick H, Johnson CC. Using prescription claims data for drugs available over-the-counter (OTC). Pharmacoepidemiol Drug Saf. 2007; 16:961–968. doi: 10.1002/pds.1454.
  • 52. Suissa S. Immeasurable time bias in observational studies of drug effects on mortality. Am J Epidemiol. 2008; 168:329–335. doi: 10.1093/aje/kwn135.
  • 53. WHO Collaborating Centre for Drug Statistics Methodology. ATC Index With DDD. Oslo, Norway: WHO; 2003.
  • 54. Dormuth C, Schneeweiss S. Rapid monitoring of drug discontinuation rates in response to restrictions in drug reimbursement. Pharmacoepidemiol Drug Saf. 2004; 13:S310–S311.
  • 55. Chu H, Wang Z, Cole SR, Greenland S. Sensitivity analysis of misclassification: a graphical and a Bayesian approach. Ann Epidemiol. 2006; 16:834–841. doi: 10.1016/j.annepidem.2006.04.001.
  • 56. Fox MP, Lash TL, Greenland S. A method to automate probabilistic sensitivity analyses of misclassified binary variables. Int J Epidemiol. 2005; 34:1370–1376. doi: 10.1093/ije/dyi184.
  • 57. Kelsey JL, Whittemore AS, Evans AS, Thompson WD. Methods in Observational Epidemiology. 2nd ed. New York: Oxford University Press; 1996.
  • 58. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005; 58:323–337. doi: 10.1016/j.jclinepi.2004.10.012.
  • 59. Wilchesky M, Tamblyn RM, Huang A. Validation of diagnostic codes within medical services claims. J Clin Epidemiol. 2004; 57:131–141. doi: 10.1016/S0895-4356(03)00246-4.
  • 60. Jick H, Jick SS, Derby LE. Validation of information recorded on general practitioner based computerised data resource in the United Kingdom. BMJ. 1991; 302:766–768.
  • 61. Currie O, Mangin D, Williman J, McKinnon-Gee B, Bridgford P. The comparative risk of new-onset diabetes after prescription of drugs for cardiovascular risk prevention in primary care: a national cohort study. BMJ Open. 2013; 3:e003475. doi: 10.1136/bmjopen-2013-003475.
  • 62. Borah BJ, Moriarty JP, Crown WH, Doshi JA. Applications of propensity score methods in observational comparative effectiveness and safety research: where have we come and where should we go? J Comp Eff Res. 2014; 3:63–78. doi: 10.2217/cer.13.89.
  • 63. Corrao G, Ghirardi A, Segafredo G, Zambon A, Della Vedova G, Lapi F, Cipriani F, Caputi A, Vaccheri A, Gregori D, Gesuita R, Vestri A, Staniscia T, Mazzaglia G, Di Bari M; BEST Investigators. User-only design to assess drug effectiveness in clinical practice: application to bisphosphonates and secondary prevention of fractures. Pharmacoepidemiol Drug Saf. 2014; 23:859–867. doi: 10.1002/pds.3650.
  • 64. Schneeweiss S, Patrick AR, Stürmer T, Brookhart MA, Avorn J, Maclure M, Rothman KJ, Glynn RJ. Increasing levels of restriction in pharmacoepidemiologic database studies of elderly and comparison with randomized trial results. Med Care. 2007; 45(10 suppl 2):S131–S142. doi: 10.1097/MLR.0b013e318070c08e.
  • 65. Ray WA. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol. 2003; 158:915–920.
  • 66. Maclure M, Fireman B, Nelson JC, Hua W, Shoaibi A, Paredes A, Madigan D. When should case-only designs be used for safety monitoring of medical products? Pharmacoepidemiol Drug Saf. 2012; 21(suppl 1):50–61. doi: 10.1002/pds.2330.
  • 67. Farrington CP. Relative incidence estimation from case series for vaccine safety evaluation. Biometrics. 1995; 51:228–235.
  • 68. Corrao G, Soranna D, La Vecchia C, Catapano A, Agabiti-Rosei E, Gensini G, Merlino L, Mancia G. Medication persistence and the use of generic and brand-name blood pressure-lowering agents. J Hypertens. 2014; 32:1146–1153; discussion 1153. doi: 10.1097/HJH.0000000000000130.
  • 69. Schneeweiss S. Sensitivity analysis and external adjustment for unmeasured confounders in epidemiologic database studies of therapeutics. Pharmacoepidemiol Drug Saf. 2006; 15:291–303. doi: 10.1002/pds.1200.
  • 70. Brookhart MA, Wang PS, Solomon DH, Schneeweiss S. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology. 2006; 17:268–275. doi: 10.1097/01.ede.0000193606.58671.c5.
  • 71. Stürmer T, Schneeweiss S, Avorn J, Glynn RJ. Adjusting effect estimates for unmeasured confounding with validation data using propensity score calibration. Am J Epidemiol. 2005; 162:279–289. doi: 10.1093/aje/kwi192.
  • 72. Corrao G, Nicotra F, Parodi A, Zambon A, Heiman F, Merlino L, Fortino I, Cesana G, Mancia G. Cardiovascular protection by initial and subsequent combination of antihypertensive drugs in daily life practice. Hypertension. 2011; 58:566–572. doi: 10.1161/HYPERTENSIONAHA.111.177592.
  • 73. Corrao G, Nicotra F, Parodi A, Zambon A, Soranna D, Heiman F, Merlino L, Mancia G. External adjustment for unmeasured confounders improved drug-outcome association estimates based on health care utilization data. J Clin Epidemiol. 2012; 65:1190–1199. doi: 10.1016/j.jclinepi.2012.03.014.
  • 74. Weiss NS. The new world of data linkages in clinical epidemiology: are we being brave or foolhardy? Epidemiology. 2011; 22:292–294. doi: 10.1097/EDE.0b013e318210aca5.
  • 75. Ray WA. Improving automated database studies. Epidemiology. 2011; 22:302–304. doi: 10.1097/EDE.0b013e31820f31e1.
  • 76. Coloma PM, Schuemie MJ, Trifirò G, Gini R, Herings R, Hippisley-Cox J, Mazzaglia G, Giaquinto C, Corrao G, Pedersen L, van der Lei J, Sturkenboom M; EU-ADR Consortium. Combining electronic healthcare databases in Europe to allow for large-scale drug safety monitoring: the EU-ADR Project. Pharmacoepidemiol Drug Saf. 2011; 20:1–11. doi: 10.1002/pds.2053.
  • 77. Hernán MA. With great data comes great responsibility: publishing comparative effectiveness research in epidemiology. Epidemiology. 2011; 22:290–291. doi: 10.1097/EDE.0b013e3182114039.
  • 78. Berger ML, Mamdani M, Atkins D, Johnson ML. Good research practices for comparative effectiveness research: defining, reporting and interpreting nonrandomized studies of treatment effects using secondary data sources: the ISPOR Good Research Practices for Retrospective Database Analysis Task Force Report–Part I. Value Health. 2009; 12:1044–1052. doi: 10.1111/j.1524-4733.2009.00600.x.
  • 79. Cox E, Martin BC, Van Staa T, Garbe E, Siebert U, Johnson ML. Good research practices for comparative effectiveness research: approaches to mitigate bias and confounding in the design of nonrandomized studies of treatment effects using secondary data sources: the International Society for Pharmacoeconomics and Outcomes Research Good Research Practices for Retrospective Database Analysis Task Force Report–Part II. Value Health. 2009; 12:1053–1061. doi: 10.1111/j.1524-4733.2009.00601.x.
  • 80. Johnson ML, Crown W, Martin BC, Dormuth CR, Siebert U. Good research practices for comparative effectiveness research: analytic methods to improve causal inference from nonrandomized studies of treatment effects using secondary data sources: the ISPOR Good Research Practices for Retrospective Database Analysis Task Force Report–Part III. Value Health. 2009; 12:1062–1073. doi: 10.1111/j.1524-4733.2009.00602.x.
  • 81. Velentgas P, Dreyer NA, Nourjah P, Smith SR, Torchia MM, eds. Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide. Rockville, MD: Agency for Healthcare Research and Quality (US), AHRQ Methods for Effective Health Care; 2013.
  • 82. Dreyer NA, Schneeweiss S, McNeil BJ, Berger ML, Walker AM, Ollendorf DA, Gliklich RE; GRACE Initiative. GRACE principles: recognizing high-quality observational studies of comparative effectiveness. Am J Manag Care. 2010; 16:467–471.

