FDA guidance on real-world data: how it affects clinical trial design and impacts patients.

By: Karen Ooms

Joint Chief Operating Officer, Quanticate

The FDA’s new draft guidance regarding the integration of real-world data (RWD) and real-world evidence (RWE) into clinical research, product approvals, and post-approval monitoring of drugs seeks to further clarify the agency’s expectations on the topic.

With a focus on clarifying the status, role and requirements of non-interventional study designs, it also lays out certain key considerations for integrating RWD into approval processes in a manner that aligns with agency expectations for sponsor and investigator conduct. The goal is to reduce the risk of bias in data source collection and analysis.

Importantly, the draft guidance also facilitates the agency’s goal of encouraging the use of RWE to support approvals of new indications for drugs already approved under existing indications or to satisfy post-approval study requirements.

This article analyses how the guidance will affect clinical trial design and data capture and analysis, and the broader impact it will have on patients.

The 21st Century Cures Act facilitated the use of RWE to accelerate product development and bring products to market more efficiently. Innovators are always looking for ways to reduce costs or improve data analysis, and increasing the volume of data can improve understanding of drug effectiveness.

While incredibly valuable, RWE opens the door to biases that randomized controlled trials (RCTs) largely avoid. In non-interventional studies, these biases must be accounted for by choosing the right study design and appropriate analytical methods.

Data source, data integrity and data standards

Examples of RWD sources include administrative claims data, electronic health records (EHRs), patient registries on specific treatments or diseases, patient-generated data from questionnaires or surveys, and data from two or more linked sources. To confirm the quality of a data source, the main thing to consider is whether the data are representative of the population, which means they should provide enough information about the exposure and the outcome. However, systematic bias may still be possible. According to Marc Berger, Pfizer's former head of RWD and analytics, one way to minimize bias is to use multiple datasets. He believes one should never trust a single study.

Data integrity is defined as the completeness, accuracy and consistency of the data. According to FDA, it plays an important role in ensuring the safety, efficacy and quality of investigated drugs, and violations of data integrity can trigger regulatory action. Statisticians have developed methods to ensure data integrity, including approaches to questionnaire design, data collection and subsequent data release, and even machine-learning algorithms. It is important, however, to engage statisticians at the right time so that data integrity is properly assessed.

Once quality and integrity are confirmed, the data needs to be submitted to the agency, and this is where data standards must be respected. The data must be presented in a clear and functional format that allows the agency, as well as any reviewer, to reproduce results and assess the reliability and relevance of the data. FDA has therefore drafted guidance on how sponsors should submit RWD-based data and map it to the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM). FDA advises sponsors to contact the appropriate review division and give a detailed description of their data transformation approach. Three steps are required to ensure that data has the right standards and format: abstraction, transformation and harmonization. Data should be simplified to extract the necessary components (abstraction), then converted into the proper format (transformation), and finally expressed in standard terminologies used consistently throughout the data (harmonization).
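The three steps can be pictured with a minimal Python sketch. Everything in it is assumed for illustration: the raw field names, the study identifier and the small term dictionary are hypothetical, and the SDTM-style variable names (USUBJID, AETERM, AESTDTC, AEDECOD) are used only to show the shape of a mapping, not a validated submission dataset.

```python
from datetime import datetime

# Hypothetical raw EHR extract; field names are illustrative only.
raw_records = [
    {"patient": "001", "event": "heart attack", "onset": "03/15/2023", "note": "free-text note"},
    {"patient": "002", "event": "MI", "onset": "04/02/2023", "note": "free-text note"},
]

# Hypothetical local-term dictionary; a real study would rely on a
# controlled terminology rather than a hand-written mapping.
TERM_MAP = {"heart attack": "Myocardial infarction", "mi": "Myocardial infarction"}


def abstract(record):
    """Abstraction: keep only the components needed for the analysis."""
    return {key: record[key] for key in ("patient", "event", "onset")}


def transform(record, study_id="STUDY01"):
    """Transformation: rename fields and convert dates to ISO 8601."""
    onset = datetime.strptime(record["onset"], "%m/%d/%Y").date().isoformat()
    return {
        "USUBJID": f"{study_id}-{record['patient']}",
        "AETERM": record["event"],
        "AESTDTC": onset,
    }


def harmonize(record):
    """Harmonization: attach one standard term to each reported term."""
    decoded = TERM_MAP.get(record["AETERM"].strip().lower())
    return {**record, "AEDECOD": decoded}


rows = [harmonize(transform(abstract(r))) for r in raw_records]
for row in rows:
    print(row)
```

Keeping the three steps as separate functions makes it easier to describe each one to a review division and to rerun a single step when a mapping changes.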

Trial design

FDA states that RWE can be generated through several study designs, such as RCTs, pragmatic trials, or observational studies, whether retrospective or prospective. It also notes that evidence from traditional clinical trials would not count as RWE, unlike evidence from most pragmatic and observational studies. Neil Pearce discusses the designs applicable to epidemiological studies, among which case-control, cohort and self-controlled case series are the best known for RWD. However, choosing the best study design depends on the research question of interest, the type of outcome or exposure, how frequent they are, and knowledge of the potential sources of bias.

For instance, cohort studies are mostly used when sponsors have observational data, unless the researcher and the research question dictate otherwise. Case-control studies can be a source of bias because cases and controls are selected from the population according to outcome status and their prior exposure is then compared between the two groups. Self-controlled methods, in which each case serves as its own control, are useful when outcomes and/or treatments are rare. When choosing the trial design, sponsors should pay attention to the inclusion/exclusion criteria, the necessary washout period and possible comorbidities, and should carry out a thorough covariate assessment.

Analytical methods

It is crucial to use appropriate statistical methods when working with RWD in order to use the data properly and answer research and regulatory questions. For instance, there have been several developments in statistical methods that emulate RCTs when one is dealing with observational data. According to FDA guidelines, the type of study design depends on the type of intervention, the population of interest, the quality of the data, the sample size (especially for rare outcomes) and the nature of the disease. Investigators should also take proper action to minimize bias where blinding is not feasible, especially when outcomes are rare. It is equally important to account for unmeasured confounders, especially in non-randomized studies, and to determine the strategies for sensitivity analysis in advance.
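As one illustration of a sensitivity analysis for unmeasured confounding, the sketch below computes the E-value proposed by VanderWeele and Ding. The article does not name this technique; it is chosen here only as a simple example. The E-value is the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both exposure and outcome to fully explain away an observed risk ratio. The function name and the example risk ratio are assumptions.

```python
import math


def e_value(risk_ratio):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    Protective estimates (RR < 1) are inverted first, so the result
    is always at least 1.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical observed risk ratio from a non-randomized comparison.
observed_rr = 2.0
print(round(e_value(observed_rr), 2))  # 3.41
```

A larger E-value means a stronger unmeasured confounder would be needed to overturn the finding; an E-value close to 1 signals a fragile result.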

Source data verification for RWD

In the late 1990s, FDA issued a guideline describing the most effective way to check the accuracy of data submitted to the agency: reviewing individual records and other related documents and comparing them with the reports the investigator provided to the sponsor. Source data verification (SDV) is the quality control process that ensures the collected data are accurate, complete and verified, and reliable enough for the study to be reconstructed and reproduced. From a statistical point of view, SDV by statistical sampling (SDVSS) is a method that allows sponsors to perform SDV automatically by drawing a proper sample from the data and verifying it against the original data. The following are to be verified during the SDV process:
  • The source data is accurate, legible, complete, consistent, enduring and available;
  • The principal investigator (PI) has carefully reviewed and signed off on the inclusion and exclusion criteria, any intervention and any other safety information;
  • All inclusion and exclusion criteria are met;
  • There were no deviations from the protocol;
  • Serious Adverse Events (SAE) were reported as planned; and
  • If an EHR was used, system details such as the system name and validation status are provided.
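The sampling idea behind SDVSS can be sketched in a few lines of Python. This is a simplified illustration, not a validated procedure: the record layout, the field names, the sample size and the two injected errors are all hypothetical, and a real plan would justify the sample size statistically.

```python
import random


def sdv_sample_check(reported, source, fields, sample_size, seed=0):
    """Draw a simple random sample of record IDs and compare the
    reported values with the source values field by field."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    ids = sorted(reported)
    sampled = rng.sample(ids, min(sample_size, len(ids)))
    discrepancies = []
    for record_id in sampled:
        for field in fields:
            if reported[record_id].get(field) != source.get(record_id, {}).get(field):
                discrepancies.append((record_id, field))
    n_checked = len(sampled) * len(fields)
    error_rate = len(discrepancies) / n_checked if n_checked else 0.0
    return sampled, discrepancies, error_rate


# Hypothetical source documents and the values reported to the sponsor.
source = {i: {"age": 40 + i % 30, "sae": False} for i in range(1, 201)}
reported = {i: dict(values) for i, values in source.items()}
reported[17]["age"] = 99     # injected transcription error
reported[120]["sae"] = True  # injected reporting error

sampled, discrepancies, error_rate = sdv_sample_check(
    reported, source, fields=["age", "sae"], sample_size=50
)
print(len(sampled), discrepancies, error_rate)
```

Whether the injected errors are caught depends on which records fall into the sample, which is why the sample size and the acceptable error rate need to be agreed with statisticians before verification starts.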

Conclusion

To sum up, when using RWD and RWE:
  • Sponsors need to make sure the data is reliable, is relevant to the medical question under investigation, and includes enough information about the outcome and interventions;
  • When using medical claims data, make sure they are coded properly according to the International Classification of Diseases (ICD);
  • Make sure laboratory test results have undergone a consistency assessment and are complete;
  • Make sure the data is capable of answering the research question and actually represents the population;
  • Be aware of the possible challenges, for instance the unstandardized format of EHR data, the coarse nature of some medical claims data, the diversity of populations in RWD, which brings a large number of confounders, and the low validity of some claims data; and
  • Be aware of potential biases.

In general, to realize the full potential of RWD, sponsors are advised to:
  1. Make sure the data is compatible with international data standards such as ICD.
  2. Perform quality assurance (QA) at the initial steps and wherever needed.
  3. Pick the most appropriate method of detailed data entry.
  4. Have a proper strategy for unstructured data, for instance applying natural language processing (NLP) tools.
  5. When using EHRs, make sure patients' interests are protected.
  6. Use the benefits of population diversity in RWD, something that is missing from most RCTs, by capturing important information from data sources of interest such as EHRs.
