Ultra-Deep Pyrosequencing of Hepatitis B Virus Quasispecies from Nucleoside and Nucleotide Reverse-Transcriptase Inhibitor (NRTI)-Treated Patients and NRTI-Naive Patients
The Journal of Infectious Diseases March 16 2009;199:000-000
Severine Margeridon-Thermet (1), Nancy S. Shulman (1), Aijaz Ahmed (1), Rajin Shahriar (1), Tommy Liu (1), Chunlin Wang (1), Susan P. Holmes (2), Farbod Babrzadeh (3), Baback Gharizadeh (3), Bozena Hanczaruk (4), Birgitte B. Simen (4), Michael Egholm (4), and Robert W. Shafer (1)
Departments of (1) Medicine and (2) Statistics, Stanford University, and (3) Stanford Genome Technology Center, Stanford, California; (4) 454 Life Sciences, a Roche Company, Branford, Connecticut
The dynamics of emerging nucleoside and nucleotide reverse-transcriptase inhibitor (NRTI) resistance in hepatitis B virus (HBV) are not well understood because standard dideoxynucleotide direct polymerase chain reaction (PCR) sequencing assays detect drug-resistance mutations only after they have become dominant. To obtain insight into NRTI resistance, we used a new sequencing technology to characterize the spectrum of low-prevalence NRTI-resistance mutations in HBV obtained from 20 plasma samples from 11 NRTI-treated patients and 17 plasma samples from 17 NRTI-naive patients, by using standard direct PCR sequencing and ultra-deep pyrosequencing (UDPS). UDPS detected drug-resistance mutations that were not detected by PCR in 10 samples from 5 NRTI-treated patients, including the lamivudine-resistance mutation V173L (in 5 samples), the entecavir-resistance mutations T184S (in 2 samples) and S202G (in 1 sample), the adefovir-resistance mutation N236T (in 1 sample), and the lamivudine and adefovir-resistance mutations V173L, L180M, A181T, and M204V (in 1 sample). G-to-A hypermutation mediated by the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like family of cytidine deaminases was estimated to be present in 0.6% of reverse-transcriptase genes. Genotype A coinfection was detected by UDPS in each of 3 patients in whom genotype G virus was detected by direct PCR sequencing. UDPS detected low-prevalence HBV variants with NRTI-resistance mutations, G-to-A hypermutation, and low-level dual genotype infection with a sensitivity not previously possible.
Hepatitis B virus (HBV) is a circular, partially double-stranded DNA virus, which contains a reverse-transcriptase (RT) enzyme and replicates via an RNA intermediate. HBV RT has a high mutation rate and is responsible for generating a quasispecies of innumerable related virus variants that provides the breeding ground for HBV drug resistance and, possibly, for vaccine-escape mutations.
Current methods of characterizing HBV sequence variability include direct polymerase chain reaction (PCR) sequencing, clonal sequencing, and point mutation assays. Direct PCR sequencing, also referred to as "population-based" sequencing, detects (on average) mutations present in >20% of the circulating virus population. The sequencing of multiple virus clones has a higher sensitivity for detecting low-prevalence HBV mutations, but it is costly and labor intensive. Point mutation assays such as the reverse hybridization LiPA DR (Innogenetics), which can detect resistant variants that make up as little as 5% of the virus population, have a higher sensitivity than direct PCR sequencing, but these assays detect only single mutations and are subject to a greater number of false-positive and false-negative results than is sequencing [2, 3].
The advent of new sequencing technologies has made it possible to generate much more sequence data than could be generated with standard dideoxynucleotide sequencing. One of these new technologies achieves this throughput by massively parallelizing PCR amplification and pyrosequencing on a picoliter scale (454 Life Sciences). The use of this technology to sequence multiple genetic variants in a heterogeneous pool of amplified DNA molecules, such as those from a virus quasispecies, is called ultra-deep pyrosequencing (UDPS). We used UDPS to characterize genetic variation in HBV RT genes isolated from virus in plasma samples obtained from patients who had been treated with nucleoside and nucleotide reverse-transcriptase inhibitors (NRTIs) and NRTI-naive patients.
Rationale for UDPS of HBV quasispecies.
HBV has a high replication rate, with estimated production of as many as 10^12 viruses/day in NRTI-naive patients. During replication, HBV mutations (caused by both cellular RNA polymerases, which synthesize viral pregenomic RNA, and HBV RT, which converts viral pregenomic RNA to negative-strand DNA) arise at a frequency of 1 per 10^4-10^5 nucleotides. The combination of HBV's high rate of replication and its high rate of mutation is responsible for the virus existing in patients as a quasispecies of innumerable related genomic variants.
Despite the high HBV mutation rate, the emergence of NRTI resistance occurs more slowly for HBV than for HIV-1 and HCV, 2 other viruses that exist as quasispecies. Although 3TC resistance in HIV-1-infected patients emerges within 2-3 weeks of monotherapy, only one-half of HBV-infected patients develop detectable resistance after 2-3 years of therapy. The slow emergence of NRTI resistance in HBV may result from the combination of incomplete inhibition of virus replication by some NRTIs, slow turnover of covalently closed circular DNA in chronically infected hepatocytes, and the constraints on HBV evolution imposed by its overlapping reading frames and the host immune response. We hypothesized that a method for detecting low-prevalence genetic variants would be particularly useful for HBV because emerging mutations may be obscured by the ongoing production of nonmutated and mutated variants for a prolonged period.
The sensitivity of UDPS for the detection of minority HIV-1 variants depends on the number of virus templates that can be successfully extracted and amplified from a plasma sample; its specificity depends on the number of errors generated during PCR amplification and pyrosequencing . In this study, sensitivity was not limited by virus recovery because plasma HBV levels and the number of extracted amplifiable viral genomic templates were high. Sensitivity was limited primarily by the mismatch error rate of 0.1% because, on the basis of the distribution of errors in the control samples, we could be highly confident only of those minority variants present at levels >1%-2%. A sensitivity about 10-fold higher can be achieved if PCR amplification is performed with the high-fidelity enzyme Pfu rather than the Taq-Pwo enzyme blend that we used (R.W.S., unpublished data, and ). Unfortunately, high-fidelity enzymes often have low processivity, particularly when used to amplify heterogeneous viral nucleic acids . Nonetheless, the expanded perspective on the HBV quasispecies that we obtained by UDPS made it possible to detect clinically relevant NRTI-resistance mutations, coinfection with >1 HBV genotype, and low levels of hypermutated variants that were not detected by direct PCR sequencing.
In 10 plasma samples from 5 of the 11 NRTI-treated patients, UDPS detected additional NRTI-resistance mutations not detected by standard sequencing. The accessory 3TC-resistance mutation V173L was detected in 5 samples in which the primary 3TC-resistance mutations rtL180M and rtM204V/I were detected by direct PCR sequencing, a finding that does not influence HBV cross-resistance . However, in 5 other samples, UDPS detected cross-resistance mutations, including the ADV-resistance mutation rtN236T, the ETV-resistance mutations rtT184S and rtS202G, and the 3TC-ADV resistance mutations rtV173L, rtL180M, rtA181T, and rtM204V.
Of 17 NRTI-naive patients, 2 had NRTI-resistance mutations detectable only by UDPS, including rtM204I at a level of 1.3% in one patient and rtA181T and rtM204I at levels of 1.0% in another patient. Clonal sequencing confirmed the presence of these mutations, but 2 of 3 clones with these mutations had stop codons and G-to-A hypermutation.
The clinical significance of low-prevalence NRTI-resistance mutations in both NRTI-treated and NRTI-naive patients cannot be ascertained from this study because it was predominantly cross-sectional. Moreover, among the patients for whom follow-up data were available, treatment with a TDF-containing regimen led to virological suppression in 5 of 6 patients. Therefore, retrospective studies of the association between low-prevalence mutations and subsequent clinical response are necessary before UDPS can be considered for routine clinical use.
Eight plasma samples from 3 patients showed coinfection with genotype G and A viruses. Genotype G is characterized by a 36-bp insertion in the core gene and 2 precore stop codons . The most intriguing aspect of this genotype is that it appears to always be accompanied by genotype A coinfection [21, 22]. Coinfection with >1 non-G genotype has been reported, but the frequency of this phenomenon is not known [33, 34]. In this study, genotype A coinfection was evident by UDPS in 6 of 8 samples that contained genotype G and was confirmed by clonal sequencing in the 1 genotype G sample submitted for clonal sequencing. UDPS is useful for detecting HBV coinfections and for determining whether, in coinfected persons, there are differences in the treatment responsiveness of viruses belonging to different genotypes.
G-to-A hypermutation of viral genomes results from an innate antiviral defense mechanism mediated by the APOBEC family of cytidine deaminases. These enzymes are capable of causing extensive C-to-U deamination of negative-stranded viral RNA, which results in lethal G-to-A hypermutation. In vitro studies have suggested that APOBEC3C is the cytidine deaminase most likely to edit HBV genomes [36, 37]. G-to-A hypermutation has been observed in a small proportion of genomes recovered from 2 of 4 HBV-infected patients in one study and 5 of 18 patients in another study. Unrecognized G-to-A hypermutation may be responsible for overestimation of the proportion of NRTI-resistance mutations caused by G-to-A changes (A181T, A194T, and M204I). Point mutation assays, in particular, may be subject to false-positive predictions of resistance because they lack information on the nucleotide sequence that surrounds the target mutations.
S protein mutations.
The overlap between the RT and S-protein coding regions has raised the concern that treatment-selected RT mutations may disrupt S protein conformational epitopes responsible for HBV vaccine responsiveness [14-16]. We found few of the commonly reported putative vaccine-escape mutations. The most striking S protein-related finding was the presence in 17 samples of 40 stop codons at 13 positions. Mutations at 3 of the 13 S protein positions were associated with RT amino acid mutations, whereas the other mutations were usually caused by silent RT mutations . Although a large variety of defective HBV genomes have been reported , the finding here of 40 stop codons in the S protein and none in the overlapping RT protein suggests that S protein stop codons result from as yet unexplained evolutionary selection pressures.
HBV replication in vivo is characterized by a high turnover and mutation rate. HBV population genetics in vivo, however, are poorly understood because they are influenced by the slow cellular turnover of hepatocytes and the constrained evolution caused by HBV's overlapping reading frames and, possibly, by the host immune response. UDPS has increased sensitivity for detecting low-prevalence variants, including minority drug-resistance mutations, as well as G-to-A hypermutation and dual genotype infection at a level of sensitivity not previously possible. The expanded perspective on emerging and latent HBV drug resistance provided by UDPS may make it possible to improve the strategic use of HBV drugs to treat this lifelong infection.
Patients and samples.
We performed direct PCR sequencing and UDPS on 17 plasma samples from 17 NRTI-naive, HBV-infected patients and 20 plasma samples from 11 NRTI-treated, HBV-infected patients. Six of the NRTI-treated patients were HIV-1 infected; 2-3 samples obtained at different times were available for these patients. The median plasma HBV DNA level was 9.9 x 10^5 IU/mL (range, 2.8 x 10^4 to 3.8 x 10^7 IU/mL) for the samples from NRTI-naive patients and 3.8 x 10^6 IU/mL (range, 1.1 x 10^4 to 2.5 x 10^8 IU/mL) for the samples from NRTI-treated patients.
Samples from NRTI-naive patients: direct PCR sequencing and UDPS.
Of the 17 NRTI-naive patients, 11 were infected with genotype B HBV strains, 3 with genotype A strains, and 3 with genotype C strains. Direct PCR sequencing detected a mean of 5.9 mutations/sample (range, 0-13 mutations/sample) (table 2). UDPS detected a mean of 4.6 additional mutations/sample (range, 0-14 additional mutations/sample) that were not detected by direct PCR sequencing. Among the NRTI-resistance mutations, M204I was detected in 1.3% of the sequence reads in the sample from subject A7; A181T and M204I were present in 1.0% of the sequence reads in the sample from subject E6. For the sample from subject A7, M204I was found in 1 of 64 clones sequenced by dideoxynucleotide sequencing; however, that clone had 2 stop codons and G-to-A hypermutation. For the sample from subject E6, A181T and M204I were each found in 1 of 80 molecular clones; the clone with M204I had G-to-A hypermutation and 2 stop codons.
Samples from NRTI-treated patients: direct PCR sequencing and UDPS.
Of the 11 NRTI-treated patients, 5 had received lamivudine (3TC) alone; 2 had received 3TC and adefovir (ADV); 2 had received 3TC, ADV, and entecavir (ETV); 1 had received ADV alone; and 1 had received 3TC, tenofovir (TDF), and emtricitabine (FTC). Four patients had had a sample obtained while they were not taking NRTIs for a period of 3-19 months. Three patients were infected with genotype G HBV strains, 3 with genotype A strains, 3 with genotype B strains, 1 with a genotype C strain, and 1 with a genotype D strain.
Direct PCR sequencing detected a mean of 9.9 mutations/sample (range, 0-23 mutations/sample), including NRTI-resistance mutations, in 16 samples obtained from 9 of 11 patients (table 3). The 4 samples without NRTI-resistance mutations were collected from patients who were not receiving NRTI therapy at the time the sample was obtained. The direct PCR sequences were dominated by 3TC-resistance mutations: L180M and M204V/I in 9 samples; V173L, L180M, and M204V in 5 samples; L180M, A181V (a mutation associated with ADV and 3TC resistance), and M204V in 1 sample; and V173L alone in 1 sample.
UDPS identified additional NRTI-resistance mutations in 10 of 20 samples from 5 of 11 NRTI-treated patients. The most common additional mutation was V173L, which was found in 5 samples for which the direct PCR sequence contained L180M and M204V/I. The ETV-resistance mutation T184S was detected at a prevalence of 2%-3% in 2 samples from a 3TC-treated patient for whom the HBV direct PCR sequence contained L180M and M204V. The 3TC-resistance mutation L80V and the ETV-resistance mutation S202G were detected at prevalences of 1.5% and 9.9%, respectively, in a patient who had received 3TC, ADV, and ETV. V173L/M, L180M, A181T, and M204V were detected in a patient who had received 3TC for 52 months but had not been receiving therapy for 27 months at the time the sample was obtained. The ADV-resistance mutation N236T was detected at a prevalence of 6.6% in a patient who had received 3TC and ADV but had not been receiving therapy for 19 months at the time the sample was obtained. Clonal sequencing performed on this sample detected N236T in 2 of 89 clones.
Of the 6 HIV-1-infected patients, 5 subsequently attained plasma HBV DNA levels below the limit of quantification (about 60 IU/mL), including 3 patients who received TDF and FTC (subjects 1284, 4089, and 26278), 1 patient who changed to TDF and 3TC (subject 1329), and 1 patient who changed to TDF and ETV (subject 7774). One HIV-1-infected patient treated with TDF and FTC (subject 16375) experienced a decrease in plasma HBV levels from 2 million IU/mL to a persistently detectable level of viremia that has ranged from 300 to 2000 IU/mL.
Coinfections with genotypes G and A.
Eight samples from 3 patients were found to have genotype G sequences by direct PCR sequencing. Five of these samples from 2 patients contributed more than half of the low-prevalence mutations observed among the NRTI-treated patients (mean, 22.2 additional mutations/sequence). The vast majority of these low-prevalence mutations were consensus genotype A variants, a result that is consistent with reports that genotype G viruses usually occur in combination with genotype A viruses [21, 22]. Limiting dilution clonal sequencing of viruses from 1 genotype G sample (subject 1284-1) demonstrated genotype A viruses in 4 of 36 clones.
G-to-A hypermutation was present in sequence reads for 0.49% (range, 0-2.5%) and 0.66% (range, 0-3.2%) of NRTI-naive and NRTI-treated patients, respectively (p=.5). Among the 6 patients from whom multiple samples were obtained, the percentage of hypermutated sequence reads was somewhat more similar within samples from a single patient than among samples from different patients, which suggests that the proportion of viruses with G-to-A hypermutation may be a property of the infecting virus or host environment (p=.06, by the Fligner-Killeen test of homogeneity of variances). Of the hypermutated sequence reads, 37% had >1 stop codons in RT, compared with <1% in the nonhypermutated sequence reads.
Table 4 shows the influence of G-to-A hypermutation at the 3 codons at which a G-to-A change is responsible for a drug-resistance mutation (A181T, A194T, and M204I). Without filtering out sequence reads that met the criteria for G-to-A hypermutation, 8 of these mutations in samples from NRTI-naive patients and 6 in samples from NRTI-treated patients were detected as minor variants at a prevalence of >1.0%. After hypermutated sequence reads were excluded, 3 mutations in samples from NRTI-naive patients and 2 in samples from NRTI-treated patients had a prevalence of >1.0%.
S protein mutations.
The direct PCR and UDPS RT sequences were translated in a +1 reading frame to identify mutations of possible significance in the overlapping S protein. Table 5 shows 13 stop codons present in 10 samples from NRTI-naive patients and 27 stop codons present in 7 samples from NRTI-treated patients. Three of the 13 residues with stop codons were always associated with RT amino acid mutations, of which 2 (rtA181T and rtV191I) have previously been reported [23, 24].
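The following Python sketch illustrates how a nucleotide change can be silent in the RT frame yet introduce a stop codon in the overlapping S frame. It is not the authors' pipeline: the 12-nucleotide sequences are invented for illustration, the helper name `translate` is hypothetical, and "+1 reading frame" is assumed here to mean a frame shifted by one nucleotide relative to the RT frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, offset=0):
    """Translate complete codons of seq, starting at nucleotide index `offset`."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Toy sequences (assumption, not HBV data): a single G-to-A change at index 5.
wild_type = "ACTCTGGCTAAA"
mutant = "ACTCTAGCTAAA"

rt_wt, rt_mut = translate(wild_type), translate(mutant)      # RT frame
s_wt, s_mut = translate(wild_type, 1), translate(mutant, 1)  # shifted frame

print("RT frame:", rt_wt, "->", rt_mut)
print("S frame: ", s_wt, "->", s_mut)
```

In this toy case the RT translation is unchanged (CTG and CTA both encode leucine), whereas the shifted frame changes TGG (tryptophan) to TAG (stop).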
Of the mutations reported to be associated with decreased hepatitis B surface antigen immunologic reactivity [14-16], only sM133L and sP120T were detected in this study. sM133L was present as part of a mixture with wild type in the direct PCR sequence for 2 NRTI-naive patients and as a minor variant detectable only by UDPS for 2 NRTI-naive patients and 1 NRTI-treated patient. sP120T was detected as a minority variant in 1 NRTI-treated patient.
Patients and samples.
Plasma samples that contained HBV were obtained from individuals attending the Stanford University HIV and Hepatitis Clinics. The samples from individuals at the HIV Clinic were obtained from 1998 through 2007. The samples from individuals at the Hepatitis Clinic were obtained in 2007. Samples were selected for UDPS if they showed plasma HBV DNA levels >10,000 IU/mL [7, 8] and if the complete nucleoside treatment history of the person from whom the sample was obtained was known.
Direct PCR and clonal sequencing.
HBV DNA was extracted from 400 μL of plasma by using the QIAamp Ultrasens Virus Minikit. PCR was performed in a 50-μL reaction mixture that contained 1 μL extracted DNA, 1 X Expand High FidelityPLUS Reaction Buffer, 2 mmol/L MgCl2, 0.2 mmol/L dNTPs, 0.5 mmol/L primers Pol1 and Pol2 (table 1), and 2.5 U of Expand High FidelityPLUS DNA polymerase (Roche Applied Sciences), a mixture of Taq and the proofreading enzyme Pwo. Standard sequencing of PCR products was performed with primers Pol1, Pol2, Pol3, and Pol4 (table 1) and the BigDye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems). Two approaches were used to confirm the authenticity of low-abundance (minor) variants detected by UDPS. For 4 samples, multiple molecular clones were sequenced. For 1 sample, limiting dilution clonal sequencing was performed.
Primer pairs F1-R1, F2-R2, F3-R3, and F4-R4 were used to amplify 4 overlapping, about 400-bp HBV RT fragments (table 1). Each primer consisted of a 5' 19-nucleotide UDPS adaptor, 1 of 4 patient-specific barcodes (AAT, CGT, ACT, and TTA), and a 3' HBV-specific sequence. PCR reactions were performed in a 50-μL reaction mixture that contained 1 μL extracted DNA, 1 X Expand High FidelityPLUS Reaction Buffer, 2 mmol/L MgCl2, 0.2 mmol/L dNTPs, 0.5 mmol/L forward primer, 0.5 mmol/L reverse primer, and 2.5 U Expand High FidelityPLUS DNA polymerase. Limiting dilution PCR analysis confirmed for all samples that >300 amplifiable virus templates were submitted for UDPS.
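To illustrate why the number of amplifiable templates matters for sensitivity, the sketch below computes the chance that a pool of templates contains at least one copy of a minority variant. This is a simple binomial sampling model assumed for illustration (independent templates, no amplification bias); it is not a calculation reported in the study, and the function name is hypothetical.

```python
def prob_variant_sampled(n_templates, variant_fraction):
    """Probability that at least 1 of n independent templates carries the variant."""
    return 1.0 - (1.0 - variant_fraction) ** n_templates


# 300 templates is the lower bound quoted in the text; 1% and 2% are the
# prevalence cutoffs used elsewhere in the study.
for fraction in (0.01, 0.02):
    p = prob_variant_sampled(300, fraction)
    print(f"variant at {fraction:.0%}: P(sampled at least once) = {p:.3f}")
```

Under this model, a variant present at 1% of the virus population would be captured at least once in roughly 95% of 300-template pools, so template number is a real but modest constraint at the cutoffs used.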
PCR products were purified by using AMPure beads (Agencourt Biosciences), quantified using Quant-iT Picogreen dsDNA reagent (Invitrogen), and pooled at equimolar concentrations. Clonal amplification on beads (i.e., emulsion PCR) was performed using reagents that enabled sequencing in both the forward and reverse directions (emPCR kits II and III; 454 Life Sciences). Beads containing DNA were isolated and counted on a Multisizer 3 Coulter Counter (Beckman Coulter). UDPS was performed on a Genome Sequencer FLX (454 Life Sciences), and each sample pool was loaded in 1 region of a 70 mm x 75 mm PicoTiter plate (454 Life Sciences) fitted with a 4-lane gasket. Three PicoTiter plates were used to sequence 40 samples, including 37 clinical samples and 3 plasmid control samples.
UDPS generated a median of 15,710 sequence reads of >200 bp/sample. The median sequence read length was 229 bp. Because the sequence reads encompassed the entire RT sequence of 1032 bp plus 196 surrounding base pairs, the median coverage was >2900 sequence reads per nucleotide. Sequence reads were aligned to a consensus genotype sequence of HBV nucleotides by using a variant of the Smith-Waterman algorithm, which took into account the Phred-equivalent quality scores for optimal positioning of insertions and deletions. For all samples, the consensus sequence generated from the UDPS sequence reads matched the sequence generated by direct PCR sequencing.
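The coverage figure can be checked with a short calculation. The sketch below multiplies the median read count by the median read length and divides by the sequenced region; combining two medians in this way is an approximation assumed here, and the function name is hypothetical.

```python
# Values quoted in the text.
median_reads_per_sample = 15_710
median_read_length_bp = 229
rt_length_bp = 1032
flanking_bp = 196


def mean_coverage(n_reads, read_length_bp, target_length_bp):
    """Approximate reads per nucleotide: total sequenced bases / target length."""
    return n_reads * read_length_bp / target_length_bp


target_bp = rt_length_bp + flanking_bp  # 1228 bp sequenced region
coverage = mean_coverage(median_reads_per_sample, median_read_length_bp, target_bp)
print(f"approximate coverage: {coverage:.0f} reads per nucleotide")
```

The result is a little over 2900 reads per nucleotide, in line with the stated median coverage.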
HBV genotypes and mutations.
The HBV genotypes of direct PCR and clonal sequences were determined using the National Center for Biotechnology Information viral genotyping resource and the HBV STAR program. A set of 1204 curated HBV sequences obtained from the Hepatitis Virus Database server was used to create genotype-specific consensus sequences for RT and S protein. Mutations were defined as amino acid differences from the genotype-specific consensus sequence. Established NRTI-resistance mutations included the following RT mutations: rtL80V/I, rtI169T, rtV173L, rtL180M, rtA181T/V, rtT184S/A/I/L/F/G, rtA194T, rtS202G/I, rtM204V/I/S, rtN236T, and rtM250V [3, 13]. Mutations at 10 S envelope protein positions were considered possible vaccine-escape mutations, including sP120T, sI/T126N/A, sQ129H, sM133L, sK141E, sP142S, sD144A, sG145R, sF158Y, and sF161Y [14-16].
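The definition of a mutation as an amino acid difference from the genotype-specific consensus can be expressed as a simple position-by-position comparison. The sketch below is illustrative only: the function name, the `start` parameter, and the 4-residue toy fragment are assumptions, and the sequences are taken to be already aligned and of equal length.

```python
def call_mutations(consensus, sample, start=1, prefix="rt"):
    """List amino acid differences from the consensus, e.g. 'rtM204V'.

    `start` is the protein position of the first residue in the fragment.
    """
    mutations = []
    for position, (ref_aa, obs_aa) in enumerate(zip(consensus, sample), start=start):
        if obs_aa != ref_aa:
            mutations.append(f"{prefix}{ref_aa}{position}{obs_aa}")
    return mutations


# Toy aligned fragment (assumption), numbered from RT position 203.
consensus_fragment = "YMDD"
sample_fragment = "YVDD"
print(call_mutations(consensus_fragment, sample_fragment, start=203))
```

Passing `prefix="s"` with S protein sequences would produce names in the style used for the vaccine-escape positions.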
Statistical analyses performed to identify authentic minority variants.
To distinguish authentic minority variants from technical artifacts, we estimated the technical error rate and then identified a threshold above which mutations detected by UDPS were unlikely to have resulted from a technical artifact. The technical error rate was estimated by PCR amplification and UDPS of an HBV plasmid RT clone. The mean error rate was estimated by comparing each UDPS sequence read to the plasmid control sequence in each of the 3 UDPS sequencing runs. The overall plasmid sequence error rate was 0.27%; single base-pair insertions and deletions comprised 56% of errors, yielding a mismatch error rate of 0.12%. Mismatch errors approximated a Poisson distribution and were rarely observed in >1.0% of nucleotides. Over the 3 runs, a mean of 4.3 nucleotides per sequence had a mismatch error rate >1.0%, and a mean of 1.3 nucleotides per sequence had a mismatch error rate >2.0%. We used this empirically observed distribution of mismatch errors to distinguish sequence errors from authentic minor variants by excluding as possible technical errors all mutations present in <2.0% of sequence reads unless they occurred at positions for which an association with NRTI resistance has been established. For mutations that occurred at such positions, an exclusionary cutoff of <1.0% was used because of the a priori interest in mutations at these positions.
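The arithmetic and the two-tier cutoff described above can be summarized in a few lines. This is a sketch under stated assumptions rather than the authors' code: the function name is hypothetical, the set of resistance positions is taken from the list of established NRTI-resistance mutations given in this article, and prevalences exactly at a cutoff are treated as retained because the text excludes only values below it.

```python
# Error rates quoted in the text (percent of nucleotides).
OVERALL_ERROR_RATE_PCT = 0.27
INDEL_FRACTION = 0.56

# Mismatch errors are what remain after removing insertions and deletions.
mismatch_rate_pct = OVERALL_ERROR_RATE_PCT * (1 - INDEL_FRACTION)

# RT positions with an established NRTI-resistance association (from the text).
RESISTANCE_POSITIONS = {80, 169, 173, 180, 181, 184, 194, 202, 204, 236, 250}


def passes_cutoff(rt_position, prevalence_pct):
    """Keep a minority variant only if it reaches the applicable cutoff."""
    cutoff_pct = 1.0 if rt_position in RESISTANCE_POSITIONS else 2.0
    return prevalence_pct >= cutoff_pct


print(f"mismatch error rate: {mismatch_rate_pct:.2f}%")
print(passes_cutoff(204, 1.3))  # resistance position, 1.0% cutoff applies
print(passes_cutoff(100, 1.3))  # other position, 2.0% cutoff applies
```

A variant at 1.3% is therefore retained at position 204 but excluded at a position with no established resistance association.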
We performed a second analysis to exclude sequence reads that were suspected to have resulted from G-to-A hypermutation mediated by the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) family of cytidine deaminases, a phenomenon that has recently been reported to occur at low levels in a high proportion of plasma samples from HBV-infected patients [17-19]. We calculated the ratio of G-to-A differences between each UDPS sequence and the direct PCR sequence divided by all non-G-to-A differences (G→A/non-G→A). Sequence reads with 4 G-to-A differences from the consensus direct PCR sequence and a G→A/non-G→A ratio >1.0 were considered to be hypermutated. This ratio is about 6 times higher than would be expected by chance, on the basis of a published HBV nucleotide substitution model .
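A minimal sketch of this filter is shown below. Several details are assumptions made for illustration and are not stated in the text: the count criterion is read as "at least 4" G-to-A differences, alignment gaps are skipped, and a read with no non-G-to-A differences is treated as having an infinite ratio. The function names and toy sequences are hypothetical.

```python
MIN_G_TO_A = 4   # assumption: interpreted as "at least 4" G-to-A differences
MIN_RATIO = 1.0  # G-to-A / non-G-to-A ratio must exceed this value


def count_differences(reference, read):
    """Count G-to-A and all other mismatches between two aligned sequences."""
    g_to_a = other = 0
    for ref_base, read_base in zip(reference, read):
        if ref_base == read_base or "-" in (ref_base, read_base):
            continue  # identical bases and gap positions are ignored
        if ref_base == "G" and read_base == "A":
            g_to_a += 1
        else:
            other += 1
    return g_to_a, other


def is_hypermutated(reference, read):
    """Apply the count and ratio criteria to one aligned read."""
    g_to_a, other = count_differences(reference, read)
    if g_to_a < MIN_G_TO_A:
        return False
    ratio = float("inf") if other == 0 else g_to_a / other
    return ratio > MIN_RATIO


# Toy aligned sequences (assumption, not HBV data).
reference = "GGGGGGCCTT"
print(is_hypermutated(reference, "AAAAGGCCTT"))  # 4 G-to-A, 0 other
print(is_hypermutated(reference, "AAAGGGCCTT"))  # only 3 G-to-A
print(is_hypermutated(reference, "AAAAGGTTAA"))  # 4 G-to-A, 4 other, ratio 1.0
```

Reads flagged in this way would be set aside before estimating the prevalence of resistance mutations that arise through G-to-A changes.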