Serine Codon-Usage Bias in Deep Phylogenomics: Pancrustacean Relationships as a Case Study

Omar Rota-Stabelli, Nicolas Lartillot, Herve Philippe, Davide Pisani*

*Corresponding author for this work

Research output: Contribution to journal, Article (Academic Journal)

82 Citations (Scopus)

Abstract

Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the 2-fold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different to the corresponding, but poorly supported, amino acid one. We show that the discrepancy between the nucleotide-based and the amino acid-based trees is caused by substitutions within synonymous codon families (especially those of serine, TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not with that derived from the amino acids. We suggest that a parallel, partially compositionally driven, synonymous codon-usage bias affects the nucleotide topology. As substitutions between serine codon families can proceed through threonine or cysteine intermediates, amino acid data sets might also be affected by the serine codon-usage bias. We suggest that a Dayhoff recoding strategy would partially ameliorate the effects of such bias. Although amino acids provide an alternative hypothesis of pancrustacean relationships, neither the nucleotide nor the amino acid version of this data set seems to carry enough genuine phylogenetic information to robustly resolve the relationships within this group, which should still be considered unresolved.
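To make the two quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes the standard genetic code and the commonly used six-class Dayhoff grouping (AGPST, C, DENQ, FWY, HKR, ILMV). The function names `serine_family_fractions` and `dayhoff_recode`, the class labels 1 to 6, and the `-` placeholder for unrecognised residues are all hypothetical choices made for illustration.

```python
from collections import Counter

# The two disjoint serine codon families in the standard genetic code.
SER_TCN = {"TCT", "TCC", "TCA", "TCG"}
SER_AGY = {"AGT", "AGC"}

# Six-class Dayhoff grouping; the numeric labels are arbitrary (assumption).
DAYHOFF6 = {
    "AGPST": "1", "C": "2", "DENQ": "3",
    "FWY": "4", "HKR": "5", "ILMV": "6",
}
AA_TO_CLASS = {aa: label for group, label in DAYHOFF6.items() for aa in group}


def serine_family_fractions(cds):
    """Return the share of serine codons falling in the TCN and AGY families.

    `cds` is an in-frame coding sequence; trailing bases that do not
    complete a codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter()
    for codon in codons:
        if codon in SER_TCN:
            counts["TCN"] += 1
        elif codon in SER_AGY:
            counts["AGY"] += 1
    total = counts["TCN"] + counts["AGY"]
    if total == 0:
        return {"TCN": 0.0, "AGY": 0.0}
    return {"TCN": counts["TCN"] / total, "AGY": counts["AGY"] / total}


def dayhoff_recode(protein, unknown="-"):
    """Map each amino acid to its Dayhoff class label.

    Residues outside the 20 standard amino acids become `unknown`.
    """
    return "".join(AA_TO_CLASS.get(aa, unknown) for aa in protein.upper())


if __name__ == "__main__":
    print(serine_family_fractions("TCTAGCTCAATG"))
    print(dayhoff_recode("MSTC"))
```

Under this grouping serine and threonine share a class while cysteine sits alone, so a TCN to AGY change that passes through threonine is hidden by the recoding whereas one that passes through cysteine is not. That is consistent with the abstract's wording that recoding would only partially ameliorate the bias.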

Original language: English
Pages (from-to): 121-133
Number of pages: 13
Journal: Systematic Biology
Volume: 62
Issue number: 1
Publication status: Published - Jan 2013

Keywords

  • phylogenomics
  • ARTHROPOD RELATIONSHIPS
  • AMINO-ACID REPLACEMENT
  • serine
  • Codon-usage bias
  • 21-states CAT model
  • SISTER GROUP
  • SUBSTITUTION MODELS
  • nucleotide composition bias
  • COMPOSITIONAL HETEROGENEITY
  • Pancrustacea
  • MIXED MODELS
  • PROTEIN-CODING SEQUENCES
  • NUCLEOTIDE COMPOSITION
  • MITOCHONDRIAL GENOMES
  • BAYESIAN PHYLOGENETIC INFERENCE
  • 23-states CAT model