Analysis of genomic-length HBV sequences to determine genotype and subgenotype reference sequences

Anna L McNaughton, Peter A Revill, Margaret Littlejohn, Philippa C Matthews*, M Azim Ansari*

*Corresponding author for this work

Research output: Contribution to journal, Article (Academic Journal), peer-review



Hepatitis B virus (HBV) is a diverse, partially double-stranded DNA virus, with nine genotypes (A-I) and a putative tenth genotype (J) characterized thus far. Given the broadening interest in HBV sequencing, there is an increasing need for a consistent, unified approach to HBV genotype and subgenotype classification. We set out to generate an updated resource of reference sequences that reflects the diversity of all genomic-length HBV sequences available in public databases. We collated and aligned these sequences and used maximum-likelihood phylogenetic analysis to identify genotype clusters. Within each genotype, we examined the phylogenetic support for currently defined subgenotypes, identified well-supported clades and derived reference sequences for them. Based on the phylogenies generated, we present a comprehensive set of HBV reference sequences at the genotype and subgenotype level. All of the generated data, including the alignments, phylogenies and chosen reference sequences, are available online as a simple open-access resource.

Original language: English
Pages (from-to): 271-283
Number of pages: 14
Journal: Journal of General Virology
Issue number: 3
Publication status: Published - 5 Mar 2020


  • HBV
  • reference sequences
  • whole genome
  • phylogenetics

