Data Collection and Validation
Data validation efforts associated with the adjustments for changes in review activities involved comparing the PDUFA IV Workload Adjuster data with the data provided by CDER and CBER for the following seven data elements: 1) Labeling Supplements; 2) Annual Reports; 3) NDA/BLA Meetings Scheduled; 4) NDA/BLA Applications; 5) SPAs; 6) IND Meetings Scheduled; and 7) IND Applications. The data provided by CDER and CBER included submission counts for NDAs/BLAs and INDs for PDUFA years 2002 through 2008, as well as activity factor counts for the five review activities among the elements above (Labeling Supplements, Annual Reports, NDA/BLA Meetings Scheduled, SPAs, and IND Meetings Scheduled) for the same period. The CDER and CBER time reporting data for FY 2006 through FY 2008 was deemed out of scope for this evaluation due to its sensitive nature.
Data requests were submitted to CDER and CBER to obtain the counts associated with the seven data elements. Each center obtained the data from the systems in which the information is stored. The data validation efforts for each column of the PDUFA IV Workload Adjuster are summarized in Figure 3 below.
Column | Data Element | Data Validation |
---|---|---|
1 | Submission Counts: - Labeling Supplements - Annual Reports - NDA/BLA Meetings Scheduled - NDA/BLA Applications - SPAs - IND Meetings Scheduled - IND Applications | Validated against data received from CDER and CBER |
2a | Submission Counts: - Labeling Supplements - Annual Reports - NDA/BLA Meetings Scheduled - NDA/BLA Applications - SPAs - IND Meetings Scheduled - IND Applications | Validated against data received from CDER and CBER |
2b | Activity Factor Data: - Labeling Supplements - Annual Reports - NDA/BLA Meetings Scheduled - SPAs - IND Meetings Scheduled | Validated against data received from CDER and CBER |
2c | Derived from Columns 2a and 2b | Not Applicable |
3 | Derived from Columns 1 and 2c | Not Applicable |
4 | Weighting Factors: - NDAs/BLAs - INDs | Validated against weighting factors obtained from the Standard Cost Model. The Standard Cost Model is beyond the scope of this project; therefore, no validation was performed on the underlying Standard Cost Model data |
5 | Derived from Columns 3 and 4 | Not Applicable |
Figure 3 – Summary of Data Validation Efforts for the PDUFA IV Workload Adjuster
The results of the data validation for each of the seven data elements are shown in the tables below (Figures 4 through 10), followed by an explanation of any variances found.
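The column arithmetic behind Figures 4 through 10 is simple enough to state directly. The following minimal Python sketch (our own illustration; the function and variable names are ours, not FDA's) reproduces the 2002 Labeling Supplements row of Figure 4:

```python
# Column arithmetic used in Figures 4-10:
#   total [c] = CDER [a] + CBER [b]
#   variance [e] = total [c] - adjuster value [d]
#   percentage difference [f] = variance [e] / total [c]
def validate_row(cder: int, cber: int, adjuster: int):
    total = cder + cber            # [c] = [a] + [b]
    variance = total - adjuster    # [e] = [c] - [d]
    pct_diff = variance / total    # [f] = [e] / [c]
    return total, variance, pct_diff

# 2002 row of Figure 4 (Labeling Supplements): CDER 761, CBER 74, adjuster 815
total, variance, pct = validate_row(761, 74, 815)
print(total, variance, f"{pct:.2%}")   # -> 835 20 2.40%
```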
Labeling Supplements
PDUFA Year | CDER Data [a] | CBER Data [b] | Total CDER & CBER [a] + [b] = [c] | PDUFA IV Workload Adjuster [d] | Variance [c] - [d] = [e] | Percentage Difference [e] / [c] = [f] |
---|---|---|---|---|---|---|
2002 | 761 | 74 | 835 | 815 | 20 | 2.40% |
2003 | 773 | 93 | 866 | 836 | 30 | 3.46% |
2004 | 1,011 | 70 | 1,081 | 1,040 | 41 | 3.79% |
2005 | 776 | 50 | 826 | 787 | 39 | 4.72% |
2006 | 885 | 49 | 934 | 902 | 32 | 3.43% |
2007 | 1,024 | 53 | 1,077 | 1,037 | 40 | 3.71% |
2008 | 912 | 48 | 960 | 914 | 46 | 4.79% |
Figure 4 – Data Validation Results for Labeling Supplements
Annual Reports
PDUFA Year | CDER Data [a] | CBER Data [b] | Total CDER & CBER [a] + [b] = [c] | PDUFA IV Workload Adjuster [d] | Variance [c] - [d] = [e] | Percentage Difference [e] / [c] = [f] |
---|---|---|---|---|---|---|
2002 | 2,672 | 226 | 2,898 | 2,897 | 1 | 0.03% |
2003 | 2,651 | 275 | 2,926 | 2,925 | 1 | 0.03% |
2004 | 2,575 | 187 | 2,762 | 2,761 | 1 | 0.04% |
2005 | 2,651 | 158 | 2,809 | 2,808 | 1 | 0.04% |
2006 | 2,581 | 191 | 2,772 | 2,771 | 1 | 0.04% |
2007 | 2,676 | 194 | 2,870 | 2,868 | 2 | 0.07% |
2008 | 2,672 | 222 | 2,894 | 2,891 | 3 | 0.10% |
Figure 5 – Data Validation Results for Annual Reports
NDA/BLA Meetings Scheduled
PDUFA Year | CDER Data [a] | CBER Data [b] | Total CDER & CBER [a] + [b] = [c] | PDUFA IV Workload Adjuster [d] | Variance [c] - [d] = [e] | Percentage Difference [e] / [c] = [f] |
---|---|---|---|---|---|---|
2002 | 342 | 74 | 416 | 416 | 0 | 0.00% |
2003 | 412 | 80 | 492 | 492 | 0 | 0.00% |
2004 | 366 | 55 | 421 | 399 | 22 | 5.23% |
2005 | 328 | 42 | 370 | 346 | 24 | 6.49% |
2006 | 328 | 58 | 386 | 372 | 14 | 3.63% |
2007 | 285 | 44 | 329 | 316 | 13 | 3.95% |
2008 | 238 | 62 | 300 | 296 | 4 | 1.33% |
Figure 6 – Data Validation Results for NDA/BLA Meetings Scheduled
NDA/BLA Applications
PDUFA Year | CDER Data [a] | CBER Data [b] | Total CDER & CBER [a] + [b] = [c] | PDUFA IV Workload Adjuster [d] | Variance [c] - [d] = [e] | Percentage Difference [e] / [c] = [f] |
---|---|---|---|---|---|---|
2002 | 90 | 9 | 99 | 99 | 0 | 0.00% |
2003 | 113 | 8 | 121 | 115 | 6 | 4.96% |
2004 | 129 | 6 | 135 | 138 | -3 | -2.22% |
2005 | 105 | 9 | 114 | 117 | -3 | -2.63% |
2006 | 128 | 4 | 132 | 133 | -1 | -0.76% |
2007 | 109 | 15 | 124 | 116 | 8 | 6.45% |
2008 | 141 | 4 | 145 | 138 | 7 | 4.83% |
Figure 7 – Data Validation Results for NDA/BLA Applications
SPAs
Figure 8 – Data Validation Results for SPAs
IND Meetings Scheduled
PDUFA Year | CDER Data [a] | CBER Data [b] | Total CDER & CBER [a] + [b] = [c] | PDUFA IV Workload Adjuster [d] | Variance [c] - [d] = [e] | Percentage Difference [e] / [c] = [f] |
---|---|---|---|---|---|---|
2002 | 758 | 283 | 1,041 | 1,040 | 1 | 0.10% |
2003 | 1,016 | 316 | 1,332 | 1,338 | -6 | -0.45% |
2004 | 1,338 | 196 | 1,534 | 1,537 | -3 | -0.20% |
2005 | 1,649 | 196 | 1,845 | 1,819 | 26 | 1.41% |
2006 | 1,733 | 200 | 1,933 | 1,942 | -9 | -0.47% |
2007 | 1,670 | 191 | 1,861 | 1,861 | 0 | 0.00% |
2008 | 1,533 | 210 | 1,743 | 1,767 | -24 | -1.38% |
Figure 9 – Data Validation Results for IND Meetings Scheduled
IND Applications (Active INDs)
PDUFA Year | CDER Data [a] | CBER Data [b] | Total CDER & CBER [a] + [b] = [c] | PDUFA IV Workload Adjuster [d] | Variance [c] - [d] = [e] | Percentage Difference [e] / [c] = [f] |
---|---|---|---|---|---|---|
2002 | 3,659 | 1,320 | 4,979 | 4,982 | -3 | -0.06% |
2003 | 3,764 | 1,352 | 5,116 | 5,123 | -7 | -0.14% |
2004 | 4,766 | 888 | 5,654 | 5,661 | -7 | -0.12% |
2005 | 5,047 | 843 | 5,890 | 5,900 | -10 | -0.17% |
2006 | 5,385 | 829 | 6,214 | 6,252 | -38 | -0.61% |
2007 | 5,168 | 824 | 5,992 | 5,843 | 149 | 2.49% |
2008 | 5,521 | 864 | 6,385 | 5,832 | 553 | 8.66% |
Figure 10 – Data Validation Results for IND Applications
According to the FDA, the discrepancies between the values used in the PDUFA IV Workload Adjuster and the values provided to us for verification purposes arose because the source databases were queried at different points in time. Because the source data is dynamic and constantly updated, the same query can return different results on different dates: the properties and attributes of a given application may be adjusted as necessary during the course of a review, and any updates to that application are then made within the system accordingly. Examples of factors affecting submission counts and activity factor counts include:
- Data Entry Lag: It may take a few days from the receipt of an application to its entry into the database. If the database is queried during this window, the results may understate the total submission count for that application type. For example, if three new submissions are received on June 30th (the cut-off date for the PDUFA year used in the Workload Adjuster), they may not be captured by a query run immediately after that date. FDA analysts try to mitigate this risk by deferring the database query until the last possible moment.
- Data Update Lag: Individual application submissions include many fields that must be accurately flagged, and some critical flags cannot be assigned until substantial review of the submission package has been performed, a process that can take up to two months (for example, an NDA must be categorized as a New Molecular Entity (NME) or a non-NME, and flagged as to whether or not it includes clinical data). Even after the flags are determined, competing priorities may delay the update of the submission record in the system. If the database is queried while these submissions are still being processed, the query results may not reflect the final number of submissions until all entries have been updated. It is important to note that the data update lag affects the NDA/BLA sub-type counts but does not have a significant impact on the overall count.
- FDA Data Reporting Method: Under the current methodology for collecting and reporting the data used in the PDUFA Workload Adjuster, the data is collected for each 12-month period ending on June 30th and recorded in mid-July. To keep earlier years comparable with later years, the amounts from earlier years are never restated; they remain fixed at the values initially reported for each year. As a result, re-querying these values for purposes of our evaluation produced figures that generally differed slightly from the values initially recorded for the associated year (illustrated in the sketch following this list).
- Human Entry Errors: Errors may be introduced when a user enters or updates an application in the system. Multiple levels of quality assurance are performed on a frequent basis, but resource constraints make this difficult at times.
- New Data Systems: During the past few years, FDA’s CDER has been moving toward a new consolidated database for submission types, the Document Archiving, Reporting, and Regulatory Tracking System (DARRTS). CDER IND applications were moved to DARRTS in 2007, and CDER NDA applications are planned for migration in 2009. The new database relies more heavily on automated updates driven by clearly defined business rules, reducing the need for manual data entry (which helps minimize data entry and update lags as well as human entry errors). It also includes a robust trace-back feature that documents changes made to the database. However, moving to a new database does introduce several issues that may affect the stability of the data:
- When data are migrated from one system to another, there is a possibility that data may be lost or altered during the move. FDA exercises extensive quality assurance measures to maintain data integrity during the migration.
- Any new database structure also requires the development of new queries that return the same results as those obtained from the older database. FDA has experienced challenges building comparable queries using the new Business Objects query interface. Training is being provided to support the development of queries capable of navigating the intricacies of the data being queried. Until all data migration issues have been addressed, some variability in the query results is to be expected.
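To make the reporting-method point above concrete, the toy Python sketch below (our own construction, using the 2008 Labeling Supplements values from Figure 4) shows why the frozen mid-July snapshot and a later re-query of the continually updated source can legitimately disagree:

```python
# Toy illustration of the fixed mid-July snapshot vs. a later live re-query.
# Numbers are the 2008 Labeling Supplements values from Figure 4.
snapshot = {}   # values recorded in mid-July; never restated afterwards

def record(year: int, live_count: int) -> None:
    snapshot.setdefault(year, live_count)   # first write wins: no restatement

record(2008, 914)   # mid-July 2008: the Workload Adjuster records 914
record(2008, 960)   # a later re-query of the updated source returns 960; snapshot unchanged

variance = 960 - snapshot[2008]
print(snapshot[2008], variance)   # -> 914 46 (the 2008 variance reported in Figure 4)
```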
We performed a simulation analysis of the adjustments for changes in review activities in the PDUFA IV Workload Adjuster, using the data received in support of our validation effort (see the 2nd, 3rd, and 4th columns of Figures 4-10) as new inputs. Based on the results of the simulations, we believe that the data variances do not have a significant impact on the adjustments for changes in review activities. However, we believe that FDA could implement additional procedures to reduce these data variances (see the observations section below).
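The sketch below illustrates the shape of that simulation under stated assumptions: the weighting factors are hypothetical placeholders (the actual Standard Cost Model weights were outside the scope of this evaluation), and only two of the seven data elements are shown, using the 2008 values from Figures 7 and 10.

```python
# Hedged sketch of the simulation approach: recompute a weighted workload total
# with validated counts in place of the adjuster's recorded counts.
# WEIGHTS are hypothetical placeholders, NOT the Standard Cost Model values.
WEIGHTS = {"nda_bla_applications": 10.0, "active_inds": 1.0}

def weighted_total(counts: dict[str, int]) -> float:
    return sum(WEIGHTS[name] * count for name, count in counts.items())

# 2008 values: adjuster-recorded vs. validated (Figures 7 and 10)
recorded  = {"nda_bla_applications": 138, "active_inds": 5832}
validated = {"nda_bla_applications": 145, "active_inds": 6385}

impact = (weighted_total(validated) - weighted_total(recorded)) / weighted_total(recorded)
print(f"relative change in weighted total: {impact:.2%}")
```

The actual simulations used all seven data elements and the full adjuster methodology; this fragment shows only how the validated counts were substituted for the recorded ones.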