SUPERFAMILY 1.75 including a domain-centric gene ontology method

David A de Lima Morais, Hai Fang, Owen J L Rackham, Derek Wilson, Ralph Pethica, Cyrus Chothia, Julian Gough

Research output: Contribution to journal › Article (Academic Journal) › peer-review

110 Citations (Scopus)

Abstract

The SUPERFAMILY resource provides protein domain assignments at the Structural Classification of Proteins (SCOP) superfamily level for over 1400 completely sequenced genomes, over 120 metagenomes and other gene collections such as UniProt. All models and assignments are available to browse and download at http://supfam.org. A new hidden Markov model library based on SCOP 1.75 has been created and a previously ignored class of SCOP, coiled coils, is now included. Our scoring component now uses HMMER3, which is orders of magnitude faster and produces superior results. A cloud-based pipeline was implemented and is publicly available on the Amazon Web Services Elastic Compute Cloud. The SUPERFAMILY reference tree of life has been improved, allowing the user to highlight a chosen superfamily, family or domain architecture on the tree of life. The most significant advance in SUPERFAMILY is that it now contains a domain-based gene ontology (GO) at the superfamily and family levels. A new methodology was developed to ensure high-quality GO annotation. The new methodology is general purpose and has been used to produce domain-based phenotypic ontologies in addition to GO.
Original language: English
Pages (from-to): D427-34
Journal: Nucleic Acids Research
Volume: 39
Issue number: Database issue
Publication status: Published - Jan 2011

Keywords

  • Sequence Analysis, Protein
  • Phylogeny
  • Phenotype
  • Software
  • Genes
  • Databases, Protein
  • Proteins
  • Protein Structure, Tertiary
