Estimating GFR Using Cystatin Alone & in Combination with Creatinine:
analysis of 3,418 with CKD
"Cystatin C is being considered as a potential replacement for serum creatinine as a filtration marker....The aim of this study is to develop and compare the accuracy of a GFR-estimating equation that includes only cystatin C level with equations that include cystatin C, creatinine, or both levels adjusted for age, sex, and race.....The key findings are that in populations with CKD, cystatin C level alone provides GFR estimates that are more accurate than serum creatinine level alone and nearly as accurate as serum creatinine level, age, sex, and race, thus providing an alternative estimate of GFR that is not linked to serum creatinine level and muscle mass....An equation including serum cystatin C level in combination with serum creatinine level, age, sex, and race provides the most accurate estimates.....GFR-estimating equations using cystatin C level have the promise to provide more accurate estimates of GFR than equations using serum creatinine level. Implementation of these equations in routine clinical practice requires standardization of the cystatin C assay, further investigation of factors other than GFR that influence cystatin C level, and availability of widespread and cost-effective assays for additional markers. At the present time, the equations developed here may provide more accurate estimates in people for whom estimates based on serum creatinine level are likely to be inaccurate because of conditions affecting muscle mass or diet or to estimate change in GFR over time in people with changing muscle mass or diet."
Estimating GFR Using Serum Cystatin C Alone and in Combination With Serum Creatinine: A Pooled Analysis of 3,418 Individuals With CKD
American Journal of Kidney Diseases March 2008
Lesley A. Stevens, MD, MS, Josef Coresh, MD, PhD, MPH, Christopher H. Schmid, PhD, Harold I. Feldman, MD, MSCE, Marc Froissart, MD, PhD, John Kusek, PhD, Jerome Rossert, MD, PhD, Frederick Van Lente, PhD, Robert D. Bruce III, BA, Yaping (Lucy) Zhang, MD, Tom Greene, PhD, Andrew S. Levey, MD
Refers to article:
Cystatin C: Research Priorities Targeted to Clinical Decision Making
Michael G. Shlipak
American Journal of Kidney Diseases
March 2008 (Vol. 51, Issue 3, Pages 358-361)
Background
Serum cystatin C was proposed as a potential replacement for serum creatinine in glomerular filtration rate (GFR) estimation. We report the development and evaluation of GFR-estimating equations using serum cystatin C alone and serum cystatin C, serum creatinine, or both with demographic variables.
Study Design
Test of diagnostic accuracy.
Setting & Participants
Participants screened for 3 chronic kidney disease (CKD) studies in the United States (n = 2,980) and a clinical population in Paris, France (n = 438).
Reference Test
Measured GFR (mGFR).
Index Test
Estimated GFR using the 4 new equations based on serum cystatin C alone, serum cystatin C, serum creatinine, or both with age, sex, and race. New equations were developed by using linear regression with log GFR as the outcome in two thirds of data from US studies. Internal validation was performed in the remaining one third of data from US CKD studies; external validation was performed in the Paris study.
Measurements
GFR was measured by using urinary clearance of iodine-125-iothalamate in the US studies and chromium-51-EDTA in the Paris study. Serum cystatin C was measured by using the Dade-Behring assay; standardized serum creatinine values were used.
Results
Mean mGFR, serum creatinine, and serum cystatin C values were 48 mL/min/1.73 m2 (5th to 95th percentile, 15 to 95), 2.1 mg/dL, and 1.8 mg/L, respectively. For the new equations, coefficients for age, sex, and race were significant in the equation with serum cystatin C, but 2- to 4-fold smaller than in the equation with serum creatinine. Measures of performance in new equations were consistent across the development and internal and external validation data sets. Percentages of estimated GFR within 30% of mGFR for equations based on serum cystatin C alone, serum cystatin C, serum creatinine, or both levels with age, sex, and race were 81%, 83%, 85%, and 89%, respectively. The equation using serum cystatin C level alone yields estimates with small biases in age, sex, and race subgroups, which are improved in equations including these variables.
Limitations
Study population composed mainly of patients with CKD.
Conclusions
Serum cystatin C level alone provides GFR estimates that are nearly as accurate as serum creatinine level adjusted for age, sex, and race, thus providing an alternative GFR estimate that is not linked to muscle mass. An equation including serum cystatin C level in combination with serum creatinine level, age, sex, and race provides the most accurate estimates.
Glomerular filtration rate (GFR) is an important indicator of kidney function, critical for the detection, evaluation, and management of chronic kidney disease (CKD). GFR cannot be practically measured for routine clinical or research purposes, and serum creatinine therefore is often used to estimate GFR. Several factors affect serum creatinine level other than GFR, including its generation from muscle metabolism. GFR-estimating equations, such as the Modification of Diet in Renal Disease (MDRD) Study equation, include age, sex, and race to account for average differences in muscle mass in subgroups; however, the magnitude of the association of muscle mass with age, sex, and race varies among populations, compromising the generalizability of the equations. Furthermore, incorporation of age, sex, and race in the estimating equation does not account for variation in creatinine generation caused by diet or other clinical conditions, such as illnesses complicated by malnutrition, inflammation, or deconditioning, that also affect muscle mass. These other causes of creatinine generation lead to imprecision in the estimates.1, 2
Cystatin C is being considered as a potential replacement for serum creatinine as a filtration marker. Most studies have shown that serum cystatin C levels more closely correlated with GFR than serum creatinine; however, the few studies that compared serum cystatin C with estimates based on serum creatinine, age, sex, and race showed them to be comparable.3, 4, 5 No study examined whether the addition of age, sex, race, and creatinine level to an equation based on cystatin C level would improve GFR estimates based on cystatin level alone.
The aim of this study is to develop and compare the accuracy of a GFR-estimating equation that includes only cystatin C level with equations that include cystatin C, creatinine, or both levels adjusted for age, sex, and race.
In this study, we pooled data from 3,418 patients with CKD in 3 research studies and 1 clinical population to develop and compare GFR-estimating equations using serum creatinine, cystatin C, or both levels. Strengths of the study include the large study population, calibration of creatinine assays in each study to standardized values, measurement of cystatin C in a single laboratory, multiple-period urinary clearances of validated filtration markers for measurement of GFR, and use of a separate external validation data set. The key findings are that in populations with CKD, cystatin C level alone provides GFR estimates that are more accurate than serum creatinine level alone and nearly as accurate as serum creatinine level, age, sex, and race, thus providing an alternative estimate of GFR that is not linked to serum creatinine level and muscle mass. Nevertheless, the addition of age, sex, and race to cystatin C level reduced bias in some subgroups defined by these variables, and an equation that used both serum creatinine and cystatin C levels with age, sex, and race was better than equations that used only 1 of these markers.
Cystatin C is an endogenous 13-kDa protein filtered by glomeruli and reabsorbed and catabolized by tubular epithelial cells with only small amounts excreted in urine and reported to be generated at a relatively constant rate irrespective of muscle mass. Thus, it was anticipated that cystatin C level would provide a better estimate of GFR than estimating equations based on serum creatinine level. We found a weaker association of age, sex, and race with cystatin C level than with serum creatinine level, consistent with this a priori hypothesis, as well as with most published cystatin C-based estimating equations that do not include terms for age, sex, or race.4, 5, 21, 22, 23
Both markers provide independent information for the estimation of GFR. The combination of both markers in an equation with age, sex, and race provided the most accurate estimate in our data set and was equivalent to the average of estimates from the equation that used cystatin C level alone and the MDRD Study equation. More work is needed to determine optimal use of the combination or sequential use of filtration markers to provide even more accurate GFR estimates.
Several findings in our study suggest that cystatin C levels are affected by factors other than GFR. First, variation in cystatin C levels alone among subgroups defined by age, sex, and race was observed. For the same level of eGFR, serum cystatin levels were 9% lower for women than men, 6% higher for blacks than whites, and 9% lower for 40-year-olds compared with 20-year-olds. These differences may reflect differences in the generation of cystatin C among these groups. Accordingly, there was a small improvement in performance of the equation with the addition of these variables. Second, cystatin C-based equations slightly overestimated mGFR at eGFR greater than 90 mL/min/1.73 m2 (>1.5 mL/s/1.73 m2), whereas the creatinine-based equations remained unbiased. Finally, precision was lower for the cystatin C-based estimating equations that did not include creatinine level. Previous studies also showed preliminary evidence for non-GFR determinants of cystatin C level, including nonrenal elimination and differences in generation among individuals related to such factors as inflammation, steroid use, and thyroid disease, and some studies showed an effect of body composition.4, 22, 24, 25, 26, 27, 28 Variation in these factors among individuals likely accounts for the lower precision of the cystatin C-based estimates. The absence of urinary excretion made it difficult to rigorously evaluate cystatin C level as a filtration marker and examine its non-GFR determinants. The specific nature and magnitude of these factors is not known and requires further study.
The performance of equations using serum creatinine and cystatin C levels is known to vary among populations. For example, the MDRD Study equation performs well in patients with CKD, but is less accurate in potential kidney donors, young people with type 1 diabetes, and patients with substantially reduced muscle mass.2, 13, 29 Rule et al4 showed that a cystatin C-based estimating equation performed differently among patients with native kidney disease, kidney transplant recipients, and potential kidney donors. Therefore, we cannot draw conclusions about the relative performance of serum cystatin C or creatinine levels in other populations. This has important implications for use of these equations in screening populations for CKD because the equations were not tested in populations without known CKD.
Serum cystatin C level was shown to have a stronger association with mortality and cardiovascular disease than serum creatinine level, particularly in studies of older adults.30, 31, 32, 33 In the Cardiovascular Health Study (CHS), participants with eGFR greater than 60 mL/min/1.73 m2 (>1.0 mL/s/1.73 m2; based on the MDRD Study equation) and cystatin C level greater than 1.0 mg/L (>75 nmol/L) had a worse outcome than participants with a cystatin C level less than 1.0 mg/L (<75 nmol/L). A cystatin C level of 1.0 mg/L (75 nmol/L) corresponds to an eGFR of 77 mL/min/1.73 m2 (1.3 mL/s/1.73 m2) according to the cystatin equation developed here.31 The similarity of the performance of equations based on serum creatinine and cystatin C levels at this level of eGFR in the present study suggests that the stronger association of cystatin C level with adverse outcomes in the CHS may be caused by factors other than GFR that affect serum cystatin C or creatinine level.34 An alternative explanation is that the study populations in this report differ from the CHS. CHS participants were recruited from Medicare eligibility lists and therefore reflect the general population of older adults, including the frail elderly, whereas studies included here mostly included patients with CKD who were not frail. Therefore, factors that may confound the relationship between serum creatinine or cystatin C level with GFR may not be reflected in the current estimating equations. Studies of GFR measurements in older adults with and without CKD and across the range of health and functional status are required to determine whether cystatin C level is a better filtration marker in this population.
Previous studies showed wide variability in eGFRs for the same level of cystatin C.35 For example, by using the equations published by Rule et al4 and Grubb et al,3 cystatin C levels equivalent to an eGFR of 60 mL/min/1.73 m2 (1.0 mL/s/1.73 m2) would be 1.09 mg/L (81.6 nmol/L) and 1.40 mg/L (105 nmol/L), respectively, compared with 1.23 mg/L (92.1 nmol/L) derived from our equation that uses cystatin C level alone. These differences may be related to the variation among populations discussed or differences among assays or GFR measurement methods. The high level of variation in the cystatin C assay is beginning to be recognized, and standardization and calibration of clinical laboratories will be important to obtain accurate GFR estimation using cystatin C level, as shown for creatinine level.36
There are several limitations to this analysis. First, as discussed, the study population was composed mainly of patients with CKD. However, the intent of the current work is to compare estimating equations based on cystatin C level with the best available equation based on serum creatinine level. To maximize this comparison, we did not test all possible transformations of cystatin C or other variables in addition to age, sex, and race (eg, diabetes) or their interactions.
Second, only 1 external validation data set was used; therefore, our findings may not be applicable to other populations. In particular, the small bias observed in the external validation data set could be related to differences between exogenous filtration markers used for clearance measurements, iothalamate for the development data set, and EDTA for the validation data set. Previous studies showed both filtration markers to provide values similar to inulin clearance; however, a direct comparison cannot be performed because the 2 radioactive tracers can interfere with one another in the counting, EDTA is not approved for use in the United States, and iothalamate is not available in Europe.10, 11, 12 In addition, differences between markers are likely to be manifested by bias. Because bias is expected when applying equations in a new data set, the presence of a small bias in the external validation data set suggests that the difference between filtration markers for clearance measurements is not likely to be an important factor in the result. Use of only 1 external validation data set also means we were unable to test whether the coefficient for black race differs between blacks in the United States and Europe.
Third, equation coefficients were derived from a pooled analysis of individual studies, rather than from a representative population, with some studies representing a substantial portion of certain demographic groups (eg, blacks in AASK). Therefore, it is possible that findings observed within a demographic group may reflect study differences and not characteristics of that group. Study participants likely were selected on the basis of previous serum creatinine values, and given the collinearity between creatinine and cystatin C levels, this may lead to bias in coefficients in all equations. However, all studies were of CKD populations and previous studies suggested that differences among subgroups based on demographic characteristics were minimal for populations with native kidney disease.2, 13 Pooling across studies probably is preferable to using a single study in the absence of data from large representative samples.
Fourth, equations were not compared with respect to classification of patients with mGFR less than versus greater or equal to 60 mL/min/1.73 m2. Our study population had a mean GFR well less than 60 mL/min/1.73 m2; thus, these analyses would not be very sensitive to differences in accuracy of the equations. In addition, these comparisons would not take into account error in mGFR. Finally, these equations were not tested for assessment of change in GFR over time.
GFR-estimating equations using cystatin C level have the promise to provide more accurate estimates of GFR than equations using serum creatinine level. Implementation of these equations in routine clinical practice requires standardization of the cystatin C assay, further investigation of factors other than GFR that influence cystatin C level, and availability of widespread and cost-effective assays for additional markers. At the present time, the equations developed here may provide more accurate estimates in people for whom estimates based on serum creatinine level are likely to be inaccurate because of conditions affecting muscle mass or diet or to estimate change in GFR over time in people with changing muscle mass or diet.
Table 1 lists clinical characteristics by study for the development and internal and external validation data sets, as well as for the overall population. All patients were considered to have CKD. Mean GFR was 48 mL/min/1.73 m2 (5th to 95th percentile, 15 to 95; [0.8 mL/s/1.73 m2; 5th to 95th percentile, 0.3 to 1.6]). Mean ± SD serum concentrations of creatinine and cystatin C were 2.1 ± 1.1 mg/dL (186 ± 97 μmol/L) and 1.8 ± 0.8 mg/L (135 ± 60 nmol/L), respectively. Correlation between them was 0.85.
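The paired conventional and SI values in this report can be reproduced with simple multiplicative factors. The sketch below is illustrative only: the factor 88.4 for creatinine is the commonly used conversion, whereas the factor 74.9 for cystatin C is an assumption inferred from the paired values quoted in this text (for example, 1.23 mg/L and 92.1 nmol/L), not a value stated by the authors. The function names are hypothetical.

```python
# Approximate unit conversions used to cross-check the paired values in the text.
# CYSTATIN_MGL_TO_NMOLL is inferred from the text's own value pairs (assumption).
CREATININE_MGDL_TO_UMOLL = 88.4   # mg/dL -> umol/L
CYSTATIN_MGL_TO_NMOLL = 74.9      # mg/L  -> nmol/L (inferred, approximate)
SECONDS_PER_MINUTE = 60.0


def creatinine_to_umol_per_l(mg_per_dl):
    """Convert serum creatinine from mg/dL to umol/L."""
    return mg_per_dl * CREATININE_MGDL_TO_UMOLL


def cystatin_to_nmol_per_l(mg_per_l):
    """Convert serum cystatin C from mg/L to nmol/L."""
    return mg_per_l * CYSTATIN_MGL_TO_NMOLL


def gfr_to_ml_per_s(ml_per_min):
    """Convert GFR from mL/min/1.73 m2 to mL/s/1.73 m2."""
    return ml_per_min / SECONDS_PER_MINUTE


if __name__ == "__main__":
    # Mean values reported for the pooled population
    print(round(creatinine_to_umol_per_l(2.1)))   # creatinine, umol/L
    print(round(cystatin_to_nmol_per_l(1.8)))     # cystatin C, nmol/L
    print(round(gfr_to_ml_per_s(48), 1))          # GFR, mL/s/1.73 m2
```

Rounded to the precision used in the text, these conversions agree with the reported means of 186 μmol/L, 135 nmol/L, and 0.8 mL/s/1.73 m2.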
Table 2 lists coefficients and performance of the 6 equations in the development data set. Exploration of a quadratic form of log (cystatin C) did not yield better results than the simple logarithmic transformation and is not reported here. Coefficients for variables in the equation including creatinine level, age, sex, and race are similar to those in the MDRD Study equation. When included in the equation with cystatin C level, age, sex, and race variables were significant at P less than 0.001; however, their estimated coefficients were 2- to 4-fold smaller in magnitude than in the equation with serum creatinine level. When serum cystatin C and creatinine were included in the same equation, their coefficients were similar to each other and approximately 2-fold smaller than when each was included in a separate equation.
The model fit was better in the equation that included only cystatin C level than in the equation using only creatinine level. The addition of age, sex, and race as predictor variables substantially improved the fit of the model for creatinine (P < 0.001) and slightly improved the model for cystatin C (P < 0.001). The best-fitting equation included both creatinine and cystatin levels with age, sex, and race. We elected to carry forward the equations that used cystatin C level alone; cystatin C level with age, sex, and race; creatinine level, age, sex, and race; and, finally, cystatin C and creatinine levels with age, sex, and race for further evaluation in the internal and external validation data sets.
Performance in Validation Data Sets
Table 3 lists the performance in the internal and external validation data sets of 4 new equations, the MDRD Study equation, and the average of the cystatin C level alone and MDRD Study equations. In the internal validation data set, RMSE of all equations was similar to that in the development data set, with serum creatinine-based equations showing more stability between the development and internal validation data sets than cystatin C-based equations.
In the external validation data set, equations based on cystatin C level overestimated mGFR, whereas equations based on serum creatinine level underestimated mGFR. The equation incorporating both cystatin C and creatinine levels was unbiased, as was the average of GFR estimates from the equation based on cystatin C level alone and the MDRD Study equation. Precision (interquartile range) was better in models using creatinine level compared with models not using creatinine level when expressed on the percentage scale, but was similar across all models when expressed on the raw scale. P30 was higher for the equation based on creatinine level, age, sex, and race, as well as for the equation based on cystatin C level, age, sex, and race, compared with the equation based on cystatin C level alone. Accuracy was highest for the equation based on all variables and for the average of the cystatin C-only equation and the MDRD Study equation. Given the similarity in performance of equations across the 3 data sets, data were combined and final coefficients were determined. Table 4 lists the 3 cystatin C-based equations fit by using the combined data sets.
Performance by eGFR Level and Age, Sex, and Race
Figure 1 shows the performance of the 4 main equations by eGFR level in the combined data set. All equations show minimal bias and similar precision up to eGFRs of 90 mL/min/1.73 m2 (1.5 mL/s/1.73 m2). There is greater overestimation at the higher range of eGFRs using equations based on cystatin C level compared with equations based on creatinine level.
Table 5 lists performance of the equations within subgroups. There are greater differences in bias, precision, and accuracy among young, middle-age, and old subgroups; between men and women; and between blacks and nonblacks in equations that use cystatin C level without adjustment for demographic variables. The addition of age, sex, and race reduces bias in some subgroups, particularly individuals of older age, females, and blacks.
Using the equation based on cystatin C level alone, serum levels corresponding to eGFRs of 45, 60, 75, and 90 mL/min/1.73 m2 (0.75, 1.0, 1.3, and 1.5 mL/s/1.73 m2) are 1.57, 1.23, 1.02, and 0.88 mg/L (118, 92.1, 76.4, and 65.9 nmol/L), respectively. In the equation that includes cystatin C level, age, sex, and race, serum cystatin C levels for an eGFR of 60 mL/min/1.73 m2 (1.0 mL/s/1.73 m2) were 1.21 mg/L (89.9 nmol/L) for a 60-year-old white man, 1.12 mg/L (83.9 nmol/L) for a 60-year-old white woman, 1.27 mg/L (95.1 nmol/L) for a 60-year-old black man, and 1.17 mg/L (87.6 nmol/L) for a 60-year-old black woman.
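Because the models are linear on the log scale, the four eGFR and cystatin C pairs quoted above should lie close to a power law of the form eGFR = a × (cystatin C)^b. The sketch below back-calculates approximate values of a and b from those rounded pairs by least squares on the log scale. This is an illustration only: the recovered numbers are approximations from rounded values, not the published coefficients from Table 4, and the function names are hypothetical.

```python
import math

# (eGFR in mL/min/1.73 m2, serum cystatin C in mg/L) pairs quoted in the text
PAIRS = [(45, 1.57), (60, 1.23), (75, 1.02), (90, 0.88)]


def fit_power_law(pairs):
    """Least-squares fit of log(eGFR) = log(a) + b * log(cystatin C)."""
    xs = [math.log(cys) for _, cys in pairs]
    ys = [math.log(egfr) for egfr, _ in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                          # slope on the log-log scale
    a = math.exp(mean_y - b * mean_x)      # eGFR at cystatin C = 1.0 mg/L
    return a, b


def egfr_from_cystatin(cys_mg_per_l, a, b):
    """Evaluate the fitted power law at a given cystatin C level."""
    return a * cys_mg_per_l ** b


if __name__ == "__main__":
    a, b = fit_power_law(PAIRS)
    print(f"a = {a:.1f}, b = {b:.2f}")
    print(f"eGFR at 1.0 mg/L = {egfr_from_cystatin(1.0, a, b):.0f}")
```

The fitted intercept is the eGFR at a cystatin C level of 1.0 mg/L, which can be compared with the value of 77 mL/min/1.73 m2 cited in the discussion of the Cardiovascular Health Study.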
Sources of Data
CKD Epidemiology Collaboration (CKD-EPI) is a research group formed to develop and validate improved estimating equations for GFR by pooling data from research studies and clinical populations (hereafter referred to as "studies"). (See Appendix text and Appendix Fig 1 for description of data sets included in CKD-EPI.) The present analysis is based on pooled individual-level patient data from the MDRD Study, African American Study of Kidney Disease (AASK), Collaborative Study Group (CSG), and a clinical population in Paris, France. Each study was described previously.1, 6, 7, 8, 9, 10 The first 3 were used for model development and internal validation, and the fourth study was used for external validation.
GFR was measured as 4-period urinary clearance of iodine-125 (125I)-iothalamate in the MDRD Study, AASK, and CSG, and as 5 periods of urinary clearance of chromium-51 (51Cr)-EDTA in Paris. Comparisons of 125I-iothalamate and 51Cr-EDTA clearances to urinary clearance of inulin, the reference standard for GFR measurements, showed high correlation.10, 11, 12 Serum creatinine assays were calibrated to standardized serum creatinine values at the Cleveland Clinic Research Laboratory (CCRL; Cleveland, OH). Results of the calibration procedure for the MDRD Study, AASK, and CSG were described previously.13, 14 Calibration of the Paris data set was performed similarly by using 215 frozen specimens measured at CCRL. Samples for all 4 studies had been frozen at −70°C until 2005 to 2006, when serum cystatin C was measured at the CCRL by using a particle-enhanced immunonephelometric assay (N Latex Cystatin C; Dade Behring, Deerfield, IL). With a range of 0.23 to 7.25 mg/L (17.2 to 543.0 nmol/L), this assay currently is the most precise automated assay across the clinical concentration range.15 Interassay coefficients of variation for the assay were 5.05% and 4.87% at mean concentrations of 0.97 and 1.90 mg/L (72.7 and 142.3 nmol/L), respectively. Serum cystatin C was reported as robust to multiple freeze-thaw cycles.16
Model Development and Evaluation
We developed new models in the development data set (n = 1,935), assessed the stability of new models in the internal validation data set (n = 1,045), and compared model performance in the external validation data set (n = 438; Appendix Fig 1). To improve the precision and generalizability of the final estimating equations, we used data from the total study population (n = 3,418) to revise the regression coefficients in new models.
Models were developed by using least squares linear regression. We restricted variables in model development to serum cystatin C level, serum creatinine level, age, sex, and race. As in the MDRD Study equation, serum creatinine, serum cystatin C, and GFR values were log transformed to capture the multiplicative relationship between GFR and level of the filtration marker and equalize variance across the range of GFRs. Age also was log transformed to be consistent with the form used in the MDRD Study equation. Race was defined as "black" or "other" and assigned as in the individual studies. GFR was adjusted for body surface area as milliliters per minute per 1.73 m2.17 In equations that included age, sex, and race, all variables were initially included, but were maintained only for P less than 0.001. For optimal comparisons between equations based on serum creatinine and cystatin C levels, a model with serum creatinine level, age, sex, and race was developed (equivalent to "refitting" coefficients from the MDRD Study equation). GFR was also estimated by using the MDRD Study equation and as the average of estimates from the model that used cystatin C level alone and the MDRD Study equation. For these computations, we used the MDRD Study equation reexpressed for use with serum creatinine (Scr) values standardized to isotope dilution mass spectroscopy (GFR = 175 × (standardized Scr)^−1.154 × age^−0.203 × 1.212 [if black] × 0.742 [if female]).1
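The reexpressed MDRD Study equation quoted above is simple enough to sketch in code. The example below is a minimal illustration, not a clinical tool: the function and parameter names are hypothetical, and the "average of estimates" is assumed here to be an arithmetic mean. The cystatin C-only estimate is passed in as a number because its coefficients (Table 4) are not reproduced in this text.

```python
def mdrd_egfr(scr_mg_dl, age_years, black=False, female=False):
    """MDRD Study equation for standardized serum creatinine.

    Returns estimated GFR in mL/min/1.73 m2, using the coefficients
    quoted in the text (175, -1.154, -0.203, 1.212, 0.742).
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if black:
        egfr *= 1.212
    if female:
        egfr *= 0.742
    return egfr


def average_estimate(egfr_cystatin_alone, egfr_mdrd):
    """Arithmetic mean of two GFR estimates (the averaging rule is an assumption)."""
    return (egfr_cystatin_alone + egfr_mdrd) / 2.0


if __name__ == "__main__":
    # Hypothetical patient: 60-year-old white man, standardized Scr 1.4 mg/dL
    mdrd = mdrd_egfr(1.4, 60)
    # Hypothetical cystatin C-only estimate supplied by the caller
    combined = average_estimate(55.0, mdrd)
    print(f"MDRD eGFR: {mdrd:.1f}, averaged estimate: {combined:.1f}")
```

Because the equation is multiplicative, each demographic term scales the estimate by a fixed factor regardless of the creatinine level, which is what the log transformation described above captures.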
Measured GFR (mGFR) and estimated GFR (eGFR) were compared for each patient graphically by plotting mGFR and the difference (mGFR − eGFR) against eGFR. Bias was expressed as the difference (mGFR − eGFR) and percentage of difference ([mGFR − eGFR]/mGFR × 100), with positive values indicating a lower eGFR than mGFR (underestimation). Precision was expressed as interquartile range for differences and root mean square error (RMSE) calculated on the logarithmic scale. Accuracy was expressed as percentage of eGFR within 30% of mGFR (P30). A difference in P30 values between equations reflects a difference in the magnitude of outliers. Accuracy of GFR estimates to within 30% of mGFR has been used as a benchmark for evaluation of GFR estimates for use in clinical practice.18, 19 Percentage of difference, RMSE, and P30 account for the expectation that the absolute magnitudes of errors increase proportionately to the level of GFR; however, these measures overemphasize errors at the lower GFR range. Bias was not shown for the development data set because it is close to zero for equations evaluated in the population in which they were developed. Confidence intervals were computed by using bootstrap methods (2,000 bootstraps) for difference and percentage of difference, interquartile ranges, and RMSE and by using the binomial approximation to estimate SEs for accuracy (P30).20 We selected RMSE as the primary measure of model fit in the development phase and as the primary measure of model validation in the internal and external validation phases. We evaluated performance for all patients and for subgroups defined by age, sex, and race. Differences among equations in their performance were determined by examination of nonoverlapping confidence intervals.
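The performance measures described above can be sketched as follows. This is a minimal illustration under stated assumptions: the quartile method is the Python default rather than whatever the authors used, the bootstrap interval is a simple percentile interval, and all function names are hypothetical. Bias can be summarized from the per-patient differences with a median or mean; the text does not say which was used.

```python
import math
import random
import statistics


def differences(mgfr, egfr):
    """Per-patient difference mGFR - eGFR (positive means underestimation)."""
    return [m - e for m, e in zip(mgfr, egfr)]


def percent_differences(mgfr, egfr):
    """Per-patient difference as a percentage of mGFR."""
    return [(m - e) / m * 100.0 for m, e in zip(mgfr, egfr)]


def iqr(values):
    """Interquartile range (quartile method is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def rmse_log(mgfr, egfr):
    """Root mean square error of log(mGFR) - log(eGFR)."""
    squared = [(math.log(m) - math.log(e)) ** 2 for m, e in zip(mgfr, egfr)]
    return math.sqrt(sum(squared) / len(squared))


def p30(mgfr, egfr):
    """Percentage of estimates within 30% of measured GFR."""
    within = sum(1 for m, e in zip(mgfr, egfr) if abs(e - m) <= 0.30 * m)
    return 100.0 * within / len(mgfr)


def p30_standard_error(p30_percent, n):
    """Binomial-approximation standard error of P30, in percentage points."""
    p = p30_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


def bootstrap_ci(mgfr, egfr, stat, n_boot=2000, seed=0):
    """Percentile bootstrap 95% interval for stat(mgfr, egfr) (assumed method)."""
    rng = random.Random(seed)
    pairs = list(zip(mgfr, egfr))
    results = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]  # resample patients
        ms, es = zip(*sample)
        results.append(stat(ms, es))
    results.sort()
    return results[int(0.025 * n_boot)], results[int(0.975 * n_boot) - 1]
```

A typical call would be `bootstrap_ci(mgfr, egfr, rmse_log)`, where `mgfr` and `egfr` are equal-length lists of measured and estimated values for the same patients.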
Analyses were computed using R (version 2; Free Software Foundation Inc, Boston, MA) and SAS software (version 9.1; SAS Institute, Cary, NC). Smoothed estimates of the mean in the figures were created by using the lowess function in R.
Role of the Funding Source
CKD-EPI is funded by grants from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) as part of a cooperative agreement in which the NIDDK has substantial involvement in the design of the study and collection, analysis, and interpretation of the data. The NIDDK was not required to approve publication of the finished manuscript. The institutional review boards of all participating institutions approved the study.