tenofovir & abacavir - Effects of nucleoside reverse transcriptase inhibitor backbone on the efficacy of first-line boosted highly active antiretroviral therapy based on protease inhibitors: meta-regression analysis of 12 clinical trials in 5168 patients
HIV Medicine Early View July 2009
A Hill 1 and W Sawyer 2
1 Pharmacology Research Laboratories, University of Liverpool, Liverpool, UK and 2 Department of Statistics, MetaVirology Ltd, London, UK
Correspondence: Andrew Hill, Pharmacology Research Laboratories, University of Liverpool, 70 Pembroke Place, Liverpool L69 3GF, UK. Tel: +44 7834 364 608; fax: +44 208 675 1716; e-mail: firstname.lastname@example.org
from Jules: I am sending you 2 recently published studies on this question: the one below, which is a meta-analysis, and a review from the CHIC cohort in the UK. The two reports find different outcomes, with CHIC finding no difference between TDF & ABC, and this study finding: For the trials of lopinavir/ritonavir (LPV/r), atazanavir/ritonavir (ATV/r) and fosamprenavir/ritonavir (fAPV/r) using either TDF/FTC or ABC/3TC, the HIV RNA responses were significantly lower when ABC/3TC was used, relative to TDF/FTC, for all patients (P=0.0015) and for patients with baseline viral load <100 000 copies/mL (70.1% vs. 80.6%, P=0.0161), and was borderline for those with viral load >100 000 copies/mL (67.5% vs. 71.5%, P=0.0523).
"Possible reasons for the apparent difference in efficacy between TDF/FTC and ABC/3TC are unclear. The lack of human leucocyte antigen (HLA) testing in the trials included could have increased anxiety over potential ABC hypersensitivity reactions, leading to discontinuation from trials of ABC-based HAART. There may be a true difference in potency between the two nucleoside analogue combinations. TDF and FTC have longer half-lives than ABC and 3TC, but it is unknown whether this could translate into differences in clinical efficacy. Longer-term follow-up in the ACTG 5202 trial, and other clinical trials of these nucleoside analogues, could improve our understanding. There is a very low risk of developing resistance to PIs after virological failure of first-line boosted PI-based HAART."
Tenofovir/emtricitabine (TDF/FTC) and abacavir/lamivudine (ABC/3TC) are widely used with ritonavir (RTV)-boosted protease inhibitors (PIs) as first-line highly active antiretroviral therapy (HAART), but there is conflicting evidence on their relative efficacy. The ACTG 5202 and BICOMBO trials suggested higher efficacy for TDF/FTC, whereas the HEAT trial showed no efficacy difference between the nucleoside reverse transcriptase inhibitor (NRTI) backbones.
A systematic MEDLINE search identified 21 treatment arms in 12 clinical trials of 5168 antiretroviral-naïve patients, where TDF/FTC (n=3399) or ABC/3TC (n=1769) was used with an RTV-boosted PI. For each NRTI backbone and RTV-boosted PI, the percentages of patients with viral load <50 HIV-1 RNA copies/mL at week 48 by standardized Intent to Treat, Time to Loss of Virological Response (ITT TLOVR) analysis were combined using inverse-variance weighting. The effects of baseline HIV RNA, CD4 cell count and choice of NRTI backbone were examined using a weighted analysis of covariance.
Across all the trials, HIV RNA suppression rates were significantly higher for those with baseline viral load below 100 000 copies/mL (77.2%) vs. above 100 000 copies/mL (70.9%) (P=0.0005). For the trials of lopinavir/ritonavir (LPV/r), atazanavir/ritonavir (ATV/r) and fosamprenavir/ritonavir (fAPV/r) using either TDF/FTC or ABC/3TC, the HIV RNA responses were significantly lower when ABC/3TC was used, relative to TDF/FTC, for all patients (P=0.0015) and for patients with baseline viral load <100 000 copies/mL (70.1% vs. 80.6%, P=0.0161); the difference was borderline for those with viral load >100 000 copies/mL (67.5% vs. 71.5%, P=0.0523).
This systematic meta-regression analysis suggests higher efficacy for first-line use of a TDF/FTC NRTI backbone with boosted PIs, relative to use of ABC/3TC. However, this effect may be confounded by differences between the trials in terms of baseline characteristics, patient management or adherence.
Combinations of two nucleoside reverse transcriptase inhibitors (NRTIs) with either a boosted protease inhibitor (PI) or a non-nucleoside reverse transcriptase inhibitor (NNRTI) are recommended as first-line highly active antiretroviral therapy (HAART). The two NRTI combinations used most widely in Europe and North America are tenofovir/emtricitabine (TDF/FTC) and abacavir/lamivudine (ABC/3TC). However, three large clinical trials comparing TDF/FTC with ABC/3TC have reached conflicting conclusions on the relative efficacy of these two NRTI backbones. The ACTG 5202 trial compared first-line use of TDF/FTC with ABC/3TC, each used with either efavirenz (EFV) or atazanavir/ritonavir (ATV/r), in a 2 x 2 factorial design. There were 797 patients in ACTG 5202 with viral load above 100 000 HIV-1 RNA copies/mL at baseline. In this stratum, patients taking ABC/3TC showed higher rates of virological failure than those taking TDF/FTC (P=0.0003). The Data Safety Monitoring Board recommended that patients taking ABC/3TC with high baseline HIV RNA levels should be switched to TDF/FTC. The BICOMBO trial also compared these two NRTI combinations, in patients with viral load <50 copies/mL at baseline. This trial showed a trend towards higher efficacy in the TDF/FTC arm, but this trend was not statistically significant. However, the HEAT trial, comparing first-line TDF/FTC and ABC/3TC in combination with lopinavir/ritonavir (LPV/r), showed no difference in efficacy between the NRTI backbones at week 48. Summary results are shown in Table 1.
I'm not sure what the 399 & 398 represent in this table.
The purpose of this systematic review was to analyse the 48-week HIV RNA efficacy data from first-line clinical trials of RTV-boosted PIs by NRTI backbone, using a standardized efficacy endpoint: the percentage of patients with viral loads below 50 copies/mL using the Food and Drug Administration (FDA) Time to Loss of Virological Response (TLOVR) algorithm.
Patients and methods
A systematic MEDLINE review was conducted for prospective clinical trials of HAART regimens containing RTV-boosted HIV PIs in antiretroviral-naïve, HIV-infected individuals published between 1 January 2000 and 1 March 2008. This search used the generic names of each PI, followed by 'clinical trial' and 'naïve'.
This search was extended further by a review of the proceedings and abstract books of the following international scientific conferences, organized during the above-mentioned index period: the Conference on Retroviruses and Opportunistic Infections (CROI), the Interscience Conference on Antimicrobial Agents and Chemotherapy (ICAAC), the European Conference on Clinical Aspects and Treatment of HIV Infection (EACS), the International AIDS Conference (also known as the World AIDS Conference), the International AIDS Society (IAS) Conference on HIV Pathogenesis and Treatment, the International Conference on Drug Therapy in HIV Infection (ICDT) and the Annual Meeting of the Infectious Diseases Society of America (IDSA).
Finally, the latest US FDA-approved package inserts for each PI currently licensed for the treatment of HIV infection in treatment-naïve, HIV-infected adults were examined and listed trials were reviewed.
Trials derived from this systematic review of public-domain data and conference presentations were included in this analysis if they met all of the following eligibility criteria:
· They had to include at least 50 chronically infected treatment-naïve, HIV-infected individuals aged 16 years or above at any stage of HIV infection.
· The minimum duration of follow-up reported for these trials at the moment of inclusion in the systematic review had to be 48 weeks.
· Efficacy data had to be reported for the 48-week timepoint using the FDA-endorsed TLOVR algorithm for the virological response (% of patients with a plasma viral load <50 copies/mL).
· They had to evaluate, in at least one treatment arm, HAART regimens comprising an RTV-boosted PI (a PI co-administered with ≤200 mg/day of RTV) and a fixed combination of two NRTIs: either ABC or TDF in combination with 3TC or FTC.
Information on each trial was abstracted on the following trial characteristics and results: (1) trial design, treatment regimens compared in the trial and the daily dosages of their components; (2) baseline characteristics [number of individuals enrolled, percentage male, percentage of White participants, percentage of patients with confirmed AIDS diagnosis (CDC category C), age, log10 plasma HIV RNA and CD4 cell counts]; and (3) response rates [percentages of patients with plasma viral load <50 copies/mL at 48 weeks in the Intent to Treat (ITT) TLOVR populations]. Abstractions were performed by one reviewer and were confirmed by a second; any discrepancies were reconciled by conference with the study team.
Baseline log10 viral load and CD4 cell count were reported as a mixture of means and medians. For log10 HIV RNA, in the five study arms where both means and medians were used, the difference between them was small (<0.12 in all cases and in no consistent direction); it was considered safe to mix them freely, using medians as means when needed. In the nine cases where baseline CD4 cell count was reported using both means and medians, a linear relationship was observed between them that was significant using a regression method (P=0.0035, mean=59.5+0.811 x median). This relationship was used to estimate means from medians when the mean was not available, thus standardizing these data prior to the main analysis.
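The median-to-mean conversion described above can be sketched in a few lines of Python. This is only an illustration of the reported regression line (mean = 59.5 + 0.811 x median); the function name and the example input are hypothetical and are not taken from the paper, whose analyses were run in SAS.

```python
def estimate_mean_cd4(median_cd4: float) -> float:
    """Estimate a mean baseline CD4 count (cells/uL) from a reported median.

    Applies the linear relationship reported in the text:
    mean = 59.5 + 0.811 x median.
    """
    intercept = 59.5  # reported intercept
    slope = 0.811     # reported slope
    return intercept + slope * median_cd4


# Hypothetical input: a trial arm that reports only a median of 200 cells/uL.
estimated = estimate_mean_cd4(200.0)
print(f"Estimated mean CD4: {estimated:.1f} cells/uL")
```

Because the slope is below 1 and the intercept is positive, the estimated mean exceeds the median for medians below about 315 cells/uL, which covers the range of baseline CD4 counts in these trials.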
HIV RNA suppression below 50 copies/mL at week 48 was summarized using the FDA's TLOVR algorithm. This is a detailed definition of an intent-to-treat, switch-equals-failure analysis, in which patients are classified as responders after two consecutive viral load levels below 50 copies/mL, and then as virological failures after two consecutive viral load levels above 50 copies/mL. Patients who discontinue randomized treatment, either because of adverse events or for other reasons, are also classified as virological failures. A limitation of this endpoint is that it can include a high percentage of patients who discontinue clinical trials because of adverse events or other reasons.
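The classification rule described above can be illustrated with a simplified Python sketch. It is not the full FDA TLOVR algorithm, which also handles timing and other events; it only encodes the three rules stated in the text. The function name, the input layout (one viral load per consecutive visit up to week 48) and the treatment of a value of exactly 50 copies/mL as "not suppressed" are assumptions made for illustration.

```python
from typing import Sequence

THRESHOLD = 50  # copies/mL


def tlovr_responder(viral_loads: Sequence[float], discontinued: bool) -> bool:
    """Return True if the patient counts as a responder at week 48.

    Simplified rules, following the description in the text:
    1. Discontinuation of randomized treatment for any reason is failure.
    2. Response must be confirmed by two consecutive values below 50.
    3. After confirmed response, two consecutive values at or above 50
       count as virological failure (rebound).
    """
    if discontinued:
        return False

    # Find the visit at which response is confirmed.
    confirmed_at = None
    for i in range(len(viral_loads) - 1):
        if viral_loads[i] < THRESHOLD and viral_loads[i + 1] < THRESHOLD:
            confirmed_at = i + 1
            break
    if confirmed_at is None:
        return False

    # Look for a confirmed rebound after the response was confirmed.
    later = list(viral_loads[confirmed_at + 1:])
    for first, second in zip(later, later[1:]):
        if first >= THRESHOLD and second >= THRESHOLD:
            return False
    return True


# Hypothetical patients: a single unconfirmed blip does not count as failure.
print(tlovr_responder([12000, 300, 40, 40, 75, 40], discontinued=False))
print(tlovr_responder([12000, 300, 40, 40, 75, 90], discontinued=False))
```

The sketch makes the limitation noted above concrete: a patient with full suppression who stops treatment for an adverse event is returned as a failure, exactly like a patient with confirmed rebound.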
The primary efficacy endpoint was HIV RNA suppression below 50 copies/mL at week 48 for all the trials except REDUCE, SHARE and SOLO, where the endpoint was suppression below 400 copies/mL. In the HEAT trial, patients needed to show HIV RNA suppression below 50 copies/mL by week 24; failure was classified as a rebound above 200 copies/mL at subsequent visits. Analysis of the percentage of patients with HIV RNA suppression below 50 copies/mL at week 48, using the TLOVR algorithm, was available for all the trials.
In the estimation of average effects across groups of studies, inverse-variance weights were used. Analysis of covariance of study arm results (meta-regression) was used to assess differences in efficacy by the NRTI backbone used. The models included terms for baseline mean HIV RNA level, CD4 cell count and age, the proportion of men and White people in the study, and the boosted PI used. Because the number of studies was not large, only those terms that were significant were retained in the models. The exclusion of terms was confirmed by adding them singly to the final model to ensure that they remained non-significant. All analyses used the Generalized Linear Models (PROC GLM) procedure in SAS version 9.1 (SAS Institute Inc., Cary, NC).
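As an illustration of inverse-variance weighting, the Python sketch below pools response rates across treatment arms. It is a minimal sketch under stated assumptions, not the authors' SAS analysis: it assumes the binomial variance p(1-p)/n for each arm, a normal-approximation 95% confidence interval, and no covariate adjustment. The function name and the arm counts in the example are hypothetical.

```python
import math
from typing import List, Tuple


def pooled_proportion(arms: List[Tuple[int, int]]) -> Tuple[float, float, float]:
    """Pool arm-level response rates using inverse-variance weights.

    arms: list of (responders, patients) pairs, one per treatment arm.
    Returns (pooled proportion, lower 95% limit, upper 95% limit).
    Assumes every arm has a response rate strictly between 0 and 1,
    otherwise the binomial variance is zero and the weight is undefined.
    """
    weights = []
    proportions = []
    for responders, patients in arms:
        p = responders / patients
        variance = p * (1.0 - p) / patients  # assumed binomial variance
        weights.append(1.0 / variance)
        proportions.append(p)

    total_weight = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, proportions)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, pooled - 1.96 * standard_error, pooled + 1.96 * standard_error


# Hypothetical arms (responders, patients); not data from the trials.
estimate, low, high = pooled_proportion([(70, 100), (80, 100)])
print(f"Pooled response: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Arms with larger sample sizes or response rates further from 50% have smaller variances and therefore pull the pooled estimate more strongly towards their own value.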
The search identified 12 clinical trials including first-line treatment with an RTV-boosted PI and either TDF/FTC or ABC/3TC. Table 2 shows summary baseline data for these trials; the percentage of patients with viral load <50 copies/mL at week 48 by the FDA TLOVR algorithm is also shown for each treatment arm.
Six trials (GEMINI, ARTEMIS, KLEAN, ALERT [10,11], CASTLE and BI1182.33) were direct head-to-head comparisons of RTV-boosted PIs. The SOLO trial compared RTV-boosted PIs with unboosted PIs. The Abbott 418 and Abbott 730 trials were comparisons of twice-daily vs. once-daily lopinavir/ritonavir (LPV/r). The HEAT trial evaluated LPV/r with TDF/FTC and ABC/3TC. The REDUCE trial compared two boosting doses of RTV in combination with fosamprenavir once daily. Finally, the SHARE trial was an open-label evaluation of fosamprenavir/RTV once daily. For three of the boosted PIs [atazanavir/ritonavir (ATV/r), LPV/r and fosamprenavir/ritonavir (fAPV/r)] data were available on first-line use with both TDF/FTC and ABC/3TC. For darunavir/ritonavir (DRV/r) and ritonavir-boosted saquinavir (SQV/r), the only first-line data available were for use with TDF/FTC (Table 2).
Data on first-line use of tipranavir/RTV in the BI1182.33 trial were not included because this treatment arm was discontinued for safety reasons. In the ACTG 5142 trial (which included first-line treatment with two NRTIs+LPV/r), a 'switch included' analysis was performed, which included HIV RNA responses after patients had started second-line treatments. This measure of efficacy tends to generate higher percentages with viral load <50 copies/mL, compared with the FDA TLOVR algorithm [5,6]. Therefore, it was not possible to include the efficacy estimates from this trial in the meta-regression analysis. Data from the Abbott 863 and BMS-089 trials of first-line LPV/r and ATV/r [20,21] were not included, because the NRTI backbone used was stavudine (d4T)/3TC, which is no longer recommended for first-line use. Finally, data from the ARIES trial of ABC/3TC/ATV/r were not included because only 36-week data were available.
For the 12 trials included in the meta-regression analysis, the baseline mean HIV RNA level ranged from 4.7 to 5.2 log10 copies/mL. The baseline mean CD4 count ranged from 153 to 229 cells/uL. The mean baseline age was from 34 to 40 years, with the majority of patients male. The percentage of White patients in each trial ranged from 36% to 78% (Table 2).
As in most meta-analyses, the main comparisons made were between groups and cross-study. All results should be interpreted with the caveat that there was a wide range of baseline patient characteristics and not all trials were conducted identically. While statistical models to account for baseline variables and the use of the ITT TLOVR endpoint may help to reduce the impact of any baseline imbalance, this is not guaranteed.
Table 2 and Fig. 1 show the percentage of patients with HIV RNA suppression below 50 copies/mL at week 48 by the NRTI and PI used. For the 10 trials that used LPV/r, ATV/r or fAPV/r with either ABC/3TC or TDF/FTC the HIV RNA responses were significantly lower when ABC/3TC was used [68.8%; 95% confidence interval (CI) 65.8-71.8%] compared with when TDF/FTC was used (76.1%; 95% CI 73.2-79.0%). This association was statistically significant in a multivariate analysis adjusting for baseline HIV RNA and the PI used (P=0.0015). Baseline CD4 cell count, age, gender and race were not included in the final model. There was no significant correlation between the median baseline CD4 cell count and rates of HIV RNA suppression across the trials.
Table 3 and Fig. 2 show HIV RNA suppression below 50 copies/mL at week 48 by baseline HIV RNA level. Across all the trials, HIV RNA suppression rates were significantly higher for those with baseline HIV RNA below 100 000 copies/mL (77.2%, 95% CI 74.6-79.8%) than for those above 100 000 copies/mL (70.9%, 95% CI 68.1-73.7%) (P=0.0005, adjusting for PI and backbone used).
In the 10 trials using TDF/FTC or ABC/3TC with LPV/r, ATV/r and fAPV/r, the HIV RNA responses were significantly lower when ABC/3TC was used relative to TDF/FTC for the lower baseline HIV RNA level (70.1% vs. 80.6%; P=0.0161; difference=10.5%; 95% CI 3.3-17.7%), and the difference was borderline significant for the higher HIV RNA level (67.5% vs. 71.5%; P=0.0523; difference=4.0%; 95% CI -0.2% to 8.2%), after adjusting for the PI used. Published results giving summaries of age, gender and race by baseline HIV RNA were not available and so terms for these were not included in the models. There were some NRTI/PI combinations with small sample sizes, making comparison by baseline HIV RNA level unreliable (Fig. 2). For example, there were only 53 patients who used the combination of TDF/FTC with fAPV/r: 24 patients with HIV RNA above 100 000 copies/mL and 29 with HIV RNA below 100 000 copies/mL. In contrast, for the combination of TDF/FTC with LPV/r, data were available from 1798 patients, making the 95% CIs around the efficacy estimates narrower.
Table 1 shows summary results from three randomized trials of TDF/FTC vs. ABC/3TC. ACTG 5202 compared TDF/FTC vs. ABC/3TC for treatment-naïve patients who were also taking either EFV or ATV/r (primary endpoint virological failure). Only results from patients with HIV RNA>100 000 copies/mL are available. HEAT compared TDF/FTC/LPV/r vs. ABC/3TC/LPV/r in treatment-naïve patients (primary endpoint viral load <50 copies/mL at week 48). BICOMBO compared TDF/FTC with ABC/3TC for patients with HIV RNA <50 copies/mL at baseline (primary endpoint viral load <200 copies/mL at week 48).
The HEAT trial was also analysed at week 96, and showed non-inferiority of ABC/3TC/LPV/r vs. TDF/FTC/LPV/r for the endpoint of HIV RNA suppression below 50 copies/mL. Only 97/281 (35%) of endpoints in the primary intent-to-treat analysis of the HEAT trial were virological failures. The number of true virological failures was the same in the ABC/3TC/LPV/r and TDF/FTC/LPV/r treatment arms (49 and 48, respectively).
Table 4 shows summary 48-week results of randomized trials of four PIs with LPV/r. The comparator PI was DRV/r 800/100 mg qd in ARTEMIS, ATV/r 300/100 mg qd in CASTLE, fAPV/r 700/100 mg bid in KLEAN and SQV/r 1000/100 mg bid in GEMINI. All trials showed non-inferior efficacy for the new PI vs. LPV/r. At week 48, ARTEMIS showed a significant efficacy benefit for DRV/r over LPV/r for patients with baseline viral load >100 000 copies/mL (79% vs. 67%, P<0.05). In multivariate analysis of the 12 trials, the effect of NRTI backbone on efficacy remained statistically significant after adjustment for the PI used.
This systematic review of first-line clinical trials of two NRTIs and boosted PIs, with standardized viral load <50 copies/mL efficacy data using the FDA TLOVR algorithm, suggests higher efficacy for first-line use of a TDF/FTC NRTI backbone, relative to use of ABC/3TC. This apparent difference in efficacy was seen for patients with baseline HIV RNA levels below and above 100 000 copies/mL. A similar difference in efficacy was shown in a systematic review of first-line trials of TDF/FTC/EFV vs. ABC/3TC/EFV . Across the trials included in this systematic review, there were significantly lower HIV RNA suppression rates for patients with viral load >100 000 copies/mL at baseline. However, this evidence does not have the strength of randomized comparative trials. With the TLOVR efficacy endpoint, the majority of treatment failures are discontinuations for non-virological reasons.
There is conflicting evidence on the relative efficacy of TDF/FTC vs. ABC/3TC from three direct head-to-head randomized trials. Two trials suggest higher efficacy for TDF/FTC, and one trial shows non-inferior efficacy for ABC/3TC. In the ACTG 5202 trial, the primary efficacy endpoint was HIV RNA levels above 1000 copies/mL at weeks 16-24, or HIV RNA above 200 copies/mL after week 24. Patients who switched treatment without virological failure were still counted as successes in this analysis. In contrast, the FDA TLOVR algorithm defines failure as either two consecutive HIV RNA levels above 50 copies/mL by week 48 or discontinuation of treatment for any reason, such as adverse events, pregnancy or loss to follow-up. As a result, only a minority of endpoints in the FDA TLOVR algorithm are virological failures in studies of first-line HAART. Consequently, apparent differences in treatment efficacy between trials, analysed by the TLOVR algorithm, could be influenced by differences in trial procedures to manage adverse events or maintain adherence.
This meta-regression analysis needs to be repeated, including only virological endpoints. The published reports of these trials do not consistently classify the failures by TLOVR into virological vs. non-virological endpoints. 'Observed data' or 'On treatment' analyses have not been standardized across HIV clinical trials. Some of these analyses include all virological failures that occur during the trial, even if patients then discontinue, while others include only observed data at fixed timepoints - for example the patients still in the trial at week 48. This type of observed data analysis can overestimate efficacy, owing to a 'survivor effect' whereby patients who continue treatment are most likely to be benefiting from it. In future, it would be better to count the cumulative number of virological failures, even if patients then discontinue from the trials.
Four PIs (ATV/r, DRV/r, fAPV/r and SQV/r) have shown non-inferior efficacy vs. LPV/r at week 48. In the ARTEMIS trial, DRV/r showed superiority over LPV/r at week 96, both for the ITT analysis and for only virological endpoints. In the CASTLE trial, the apparent superiority of ATV/r over LPV/r at week 96 is the result of higher discontinuation rates in the LPV/r arm, with no difference seen for virological endpoints. The meta-analysis was stratified for the boosted PI used with either TDF/FTC or ABC/3TC.
The key limitation of the trials included in this meta-regression analysis is the lack of testing for HLA-B*5701; such testing could lead to lower rates of ABC hypersensitivity. A new clinical trial, including this testing at screening, is in progress. At week 36, the ARIES trial has shown higher rates of efficacy for ABC/3TC-based HAART (80%) than the trials of ABC/3TC that are included in this meta-analysis. In addition, the multivariate analysis adjusts for baseline differences between the trials using only the overall means, which may not account for sub-sets of patients with extreme values (for example very low CD4 cell counts). There may be other differences between the trials - in country selection, adherence, patient management - that could explain the difference in efficacy between the NRTIs, but could not be adjusted for in the multivariate analysis.
More efficacy data were available at week 96 for clinical trials of TDF/FTC-based HAART [24-26] compared with ABC/3TC-based HAART [27,28], but the efficacy results remained similar between weeks 48 and 96 for these four trials with data available. A recent survey suggested that the efficacy of HAART including ABC/3TC was similar for patients with baseline HIV RNA above and below 100 000 copies/mL . However, this survey did not include data from two large trials of ABC/3TC-based HAART - SOLO and ARIES [14,23] - that both show lower rates of efficacy for patients with baseline HIV RNA levels above 100 000 copies/mL. The results of this meta-regression analysis also show significantly lower response rates for patients with baseline HIV RNA levels above 100 000 copies/mL for both TDF/FTC and ABC/3TC. High baseline HIV RNA could be a marker for more advanced disease, leading to less tolerance of drug-related adverse events or poor adherence.
Possible reasons for the apparent difference in efficacy between TDF/FTC and ABC/3TC are unclear. The lack of human leucocyte antigen (HLA) testing in the trials included could have increased anxiety over potential ABC hypersensitivity reactions, leading to discontinuation from trials of ABC-based HAART. There may be a true difference in potency between the two nucleoside analogue combinations. TDF and FTC have longer half-lives than ABC and 3TC, but it is unknown whether this could translate into differences in clinical efficacy. Longer-term follow-up in the ACTG 5202 trial, and other clinical trials of these nucleoside analogues, could improve our understanding. There is a very low risk of developing resistance to PIs after virological failure of first-line boosted PI-based HAART.
New clinical trials that incorporate testing for HLA-B*5701 could lessen the potential for the ABC hypersensitivity reaction, and this may lower discontinuation rates. However, ABC has also been associated with an increased risk of cardiovascular events in the DAD cohort study, and it is unclear whether HLA testing would lessen this risk. The long-term risks of osteoporosis and renal failure on TDF are unknown, but follow-up in clinical trials suggests that these risks may be small [32,33].
In summary, this meta-regression analysis of 48-week efficacy data from 12 clinical trials in 5168 patients suggests higher rates of HIV RNA suppression below 50 copies/mL for first-line boosted PI-based HAART using TDF/FTC as an NRTI backbone, compared with the use of ABC/3TC. However, the reasons for this apparent difference in efficacy are unclear. It could be influenced by differences in discontinuation rates for adverse events as well as differences in rates of true virological failure.