BioProject: PRJNA737147 (Datasets)
Protocol approval: 85/2012 (UP)
Ethics approval: S137/2012 (UP)
Funding: National Research Foundation (Grant: 82831)
Poliomyelitis Research Foundation (Grant: 12/41 MSc)
Collaborators: Dr S.M. Bowyer, Prof S.H. Mayaphi
Degree: Master of Science (M.Sc.)
Hepatitis B virus (HBV) is a DNA virus that was first identified in the 1960s. It is one of at least 18 species within the family Hepadnaviridae, which has five genera: two that are well characterised (Orthohepadnavirus and Avihepadnavirus) and three (Metahepadnavirus, Herpetohepadnavirus, and Parahepadnavirus) that have only been recognised recently, with new members still being discovered. These viruses infect the hepatocytes of the liver in the five most common classes of vertebrates: mammals, birds, fish, reptiles, and amphibians. HBV belongs to the genus Orthohepadnavirus, whose members infect the liver of mammals and include related viruses of many non-human primates as well as those infecting squirrels, woodchucks, bats, and equines. Diagnostic tests to determine HBV infection and monitor disease progression measure four markers in serum samples: HBV DNA, HBsAg (s-antigen), HBeAg (e-antigen), and serum transaminase (ALT). These markers vary in titre and may all but disappear, depending on the type (acute vs. chronic) and phase of the infection. The partially double-stranded DNA genome consists of a minus strand, which spans the full genome, and a plus strand spanning approximately two thirds of the genome. Upon infection of liver cells, the genome is converted to covalently closed circular DNA (cccDNA), which serves as the template for transcription of the viral RNAs. Viral replication takes place via an RNA intermediate and a reverse transcriptase, which lacks proofreading and is known to have a high error rate, causing a nucleotide exchange rate of 2.1 × 10⁻⁵ to 4.5 × 10⁻⁵ per annum. The family Hepadnaviridae and the subfamily Spumaretrovirinae of the family Retroviridae are the only animal viruses with a DNA genome known to replicate by reverse transcription of a viral RNA intermediate.
RNA viruses such as the hepatitis C virus and influenza virus, and reverse transcriptase dependent viruses such as HBV and the human immunodeficiency virus (HIV), show a high degree of intra-host variability. This is likely due to the high replication capacity, low fidelity, and lack of proofreading activity of the viral polymerases. Thus, much as with RNA viruses and retroviruses, an intra-host virus population, referred to as the viral quasispecies, arises during HBV infections. This quasispecies consists of major, intermediate, and minor variants, which occur at frequencies of >20%, 5-20%, and <5%, respectively. These variants can then be transmitted to subsequent hosts, leading to exponential changes at the DNA level between hosts. Studies on the nucleic acid sequence of HBV DNA have led to further classification into eight different genotypes or genetic subtypes, denoted A to H, based on pairwise differences of >8% and <14%, with further subclassification into subgenotypes, denoted numerically, based on pairwise differences of >4% and <8%. This classification system has become the standard reference nomenclature for distinguishing different strains of HBV. To date, HBV A1 remains the dominant and endemic subgenotype among South Africans, and those strains for which full genome sequences are available usually cluster together in a clade loosely termed “African A1”. Moreover, those infected with subgenotype A1 have markedly lower levels of HBV DNA in both the HBeAg and anti-HBeAg positive phases. Beyond the unique subgenotype circulating in South Africa, another common feature of infection has been observed: HBeAg is lost very early in the infection of South Africans, so that only 5% of those infected are HBeAg positive in adulthood.
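The two sets of thresholds described above can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to the same length, skips gap and ambiguous positions, and assigns boundary values (for example exactly 8% or exactly 20%) to one side by choice, since the text leaves them undefined. The function names are hypothetical and do not belong to any tool used in this project.

```python
def pairwise_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguity codes (an assumption of this sketch).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def classify_pair(diff_percent: float) -> str:
    """Apply the >8% (genotype) and >4% (subgenotype) cut-offs."""
    if diff_percent > 8.0:
        return "different genotype"
    if diff_percent > 4.0:
        return "different subgenotype"
    return "same subgenotype"


def classify_variant(frequency_percent: float) -> str:
    """Label a quasispecies variant by its intra-host frequency."""
    if frequency_percent > 20.0:
        return "major"
    if frequency_percent >= 5.0:
        return "intermediate"
    return "minor"
```

For two full genomes that differ at 10% of comparable positions, classify_pair would report a different genotype; a variant present in 3% of the intra-host population would be labelled minor.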
A recent study in the research laboratory of the Department of Medical Virology, University of Pretoria, which examined the impact of HIV on HBV infection in South Africa, reported a threefold greater prevalence in the HIV-positive cohort than in the HIV-negative controls. Subsequent genotyping and phylogenetic analyses identified several isolates clustering together with atypical outliers in a subclade of subgenotype A1. Since the core primers were used for specimens that did not amplify with the surface primers, most have low to very low viral loads, and data are available for both regions for only one specimen. Nonetheless, a similar clade of outliers clustered together for sequences based on the surface region. Protein analysis showed interesting changes in the core specimens at the nucleotide level, and a BLAST search showed only 95% similarity between the best match and the study outliers.
The hepatitis B virus is a blood-borne virus and roughly 50 to 200 times more infectious than HIV. HBV infections have become a public health problem worldwide, with approximately 2 billion people carrying markers of past infection and an estimated 240 million patients currently chronically infected. Infection with this virus leads to a spectrum of liver diseases, from subclinical, acute, and fulminant hepatitis to an asymptomatic “carrier” state and/or chronic hepatitis, which may lead to cirrhosis and hepatocellular carcinoma (HCC). Clearly, further studies on the different genotypes and subgenotypes of HBV, and how these relate to clinical consequences in Africa, are warranted. However, as selection acts on the entire viral population and progresses over the course of infection, the dynamics of viral evolution cannot be understood from the fittest strain alone. Therefore, quasispecies reconstructions present a unique avenue to gain insights into the emergence of novel or atypical strains.
To characterise the full genome of unique, atypical laboratory specimens as well as rare or unusual genotypes of the hepatitis B virus identified in previous studies of urban cohorts recruited from a secondary referral hospital in Pretoria, South Africa. Characterisation will be done to assess:
Intra-host variation of quasispecies.
Inter-host variation of transmitted strains.
To establish a full genome PCR assay, using optimised methods and primers, for typical African subtype A1 specimens from patients with a high viral load.
Use this assay to perform full genome PCR and sequencing on unusual African specimens identified in previous studies by PCR and sequencing of the core and surface regions.
Reconstruct the viral meta-populations, known as quasispecies, to assess the extent of intra-host variation.
Map known and unknown changes (from this study) onto a linear template reference genome annotated according to X02763 to assess the extent of inter-host variation.
Perform phylogenetic analysis of sequences to determine the prevalence of genotypes in the African cohort and for detailed characterisation of these specimens.
Further analyse this sequence data, and epitope regions, for known and unknown variation, possibly due to seroconversion, recombination, or drug resistance.
HBV DNA will be extracted from plasma samples on the MagNA Pure LC instrument for automated DNA extraction with the MagNA Pure total NA extraction kit according to the manufacturer’s instructions. Should only small sample volumes with a relatively low viral load be available, the QIAamp MinElute Virus Spin kit will be used according to the manufacturer’s instructions for manual column-based extraction. Amplification of the full-length HBV genome will be performed using long-range PCR per the method first described by Günther et al. (1995), with some modification, by means of the Expand high-fidelity PCR assay. Success of amplification will be monitored by TBE-agarose electrophoresis. Samples will initially be subjected to a PCR with primers that selectively amplify all known genotypes excluding subgenotype A1, to screen for other genotypes that may occur as a co-infection with the typical HBV A1. PCR amplicons will be purified with the DNA Clean & Concentrator™-25 kit according to the manufacturer’s instructions. Purified amplicons and relevant controls will be sent to Inqaba Biotec (Pretoria) for next generation sequencing on the MiSeq sequencer (Illumina), at a depth of at least 10 %. The FASTQ sequence files will be analysed for quality and assembled to a reference in Geneious (www.geneious.com). Read quality will be assessed for each file to determine quality score distributions, read length distribution, overrepresented sequences, and k-mers. The viral quasispecies will be reconstructed for each specimen where sufficient read depth or coverage is available, using three previously validated Java executable algorithms: QuRe V.0.99971, QuasiRecomb V.1.2, and k-GEM V.0.3.1. Each algorithm is executed from the command-line interface with the Java Development Kit V.1.7.0-25.
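As an illustration of the read-quality step, the following Python sketch summarises read counts, read length distribution, and mean Phred quality from FASTQ records. It is not the Geneious workflow used in this project. It assumes plain four-line FASTQ records, Phred+33 quality encoding (the default offset parameter), and no blank lines inside records; the function names and the two example reads are invented for illustration.

```python
from collections import Counter
from statistics import mean

# Two invented reads, used only to demonstrate the functions below.
EXAMPLE = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACG", "+", "!!!",
]


def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


def summarise_reads(records, offset=33):
    """Return read count, length distribution, and mean per-read quality."""
    length_counts = Counter()
    read_means = []
    for _read_id, seq, qual in records:
        length_counts[len(seq)] += 1
        if qual:
            # Convert each quality character to its Phred score.
            read_means.append(mean(ord(ch) - offset for ch in qual))
    return {
        "reads": sum(length_counts.values()),
        "length_distribution": dict(length_counts),
        "mean_quality": mean(read_means) if read_means else None,
    }


summary = summarise_reads(parse_fastq(EXAMPLE))
```

For a real file, the same functions could be applied to an open file handle, for example summarise_reads(parse_fastq(open(path))), where path is a placeholder for the FASTQ file of interest.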
Phylogenetic analysis will be done by means of the maximum likelihood method, as implemented in PhyML V.3.1, and variant calling of SNPs will be done in Geneious for read assemblies and in VarScan for reconstructed quasispecies.
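To show the idea behind tabulating SNPs from reconstructed quasispecies, the sketch below sums the estimated frequency of every haplotype that carries a given substitution relative to a reference. This is a simplified stand-in, not the algorithm implemented in VarScan or Geneious. It assumes haplotypes are already aligned to the reference at equal length (no insertions or deletions), that frequencies are given as fractions, and that positions are 1-based relative to whichever reference is supplied. The function name and the toy sequences are hypothetical.

```python
from collections import defaultdict


def call_snps(reference, haplotypes):
    """Return sorted (position, ref_base, alt_base, frequency) tuples.

    haplotypes is a list of (sequence, frequency) pairs, where frequency
    is the estimated share of the intra-host population (0 to 1).
    """
    reference = reference.upper()
    totals = defaultdict(float)
    for seq, freq in haplotypes:
        if len(seq) != len(reference):
            raise ValueError("haplotype must be aligned to the reference")
        for pos, (ref_base, alt_base) in enumerate(zip(reference, seq.upper()), start=1):
            # Only unambiguous substitutions are counted in this sketch.
            if ref_base in "ACGT" and alt_base in "ACGT" and ref_base != alt_base:
                totals[(pos, ref_base, alt_base)] += freq
    return sorted((pos, ref, alt, f) for (pos, ref, alt), f in totals.items())


# Toy example: three invented haplotypes against an invented reference.
snps = call_snps(
    "ACGTAC",
    [("ACGTAC", 0.70), ("ACATAC", 0.25), ("ACATAT", 0.05)],
)
```

In this toy case the G to A change at position 3 is carried by two haplotypes with a combined frequency of about 0.30, while the C to T change at position 6 occurs in a single haplotype at 0.05.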
Le Clercq, L.-S., Bowyer, S.M., Mayaphi, S.H. (2023). Intra-host quasispecies reconstructions resemble inter-host variability of transmitted chronic hepatitis B virus strains. bioRxiv. https://doi.org/10.1101/2023.05.15.540814
Le Clercq, L.-S., Bowyer, S.M., Mayaphi, S.H. (2014). Full Genome Amplification and Quasispecies Reconstruction of the Hepatitis B Virus from Next Generation Sequencing Data. In Poster Presented at: Faculty Day 2014 (Faculty of Health Sciences). University of Pretoria, Pretoria. URI: http://hdl.handle.net/2263/91879 DOI: https://doi.org/10.13140/RG.2.2.24117.04322
Le Clercq, L.-S., Bowyer, S.M., Mayaphi, S.H. (2021). Full genome PCR amplification of all African Hepatitis B Virus genotypes. protocols.io. https://dx.doi.org/10.17504/protocols.io.bvykn7uw
Le Clercq, L.S. (2014). Molecular Characterization of Full Genome Hepatitis B Virus Sequences from an Urban Hospital cohort in Pretoria, South Africa. Master's dissertation, University of Pretoria. URI: http://hdl.handle.net/2263/43142 DOI: https://doi.org/10.13140/RG.2.2.33619.71204 ISBN: 978-0-7961-1314-6 (Print), 978-0-7961-1315-3 (Electronic)