Data from: Cyanobacteria and the Great Oxidation Event: evidence from genes and fossils

  • Bettina E. Schirrmeister (Contributor)
  • Muriel Gugger (Contributor)
  • Philip C J Donoghue (Contributor)



NOTE: PLEASE ALSO SEE THE CORRIGENDUM TO THE ORIGINAL ARTICLE, PUBLISHED AT

Cyanobacteria are among the most ancient of evolutionary lineages, oxygenic photosynthesizers that may have originated before 3.0 Ga, as evidenced by free oxygen levels. Throughout the Precambrian, cyanobacteria were one of the most important drivers of biological innovations, strongly impacting early Earth's environments. At the end of the Archean Eon, they were responsible for the rapid oxygenation of Earth's atmosphere during an episode referred to as the Great Oxidation Event (GOE). However, little is known about the origin and diversity of early cyanobacterial taxa, due to: (1) the scarceness of Precambrian fossil deposits; (2) limited characteristics for the identification of taxa; and (3) the poor preservation of ancient microfossils. Previous studies based on 16S rRNA have suggested that the origin of multicellularity within cyanobacteria might have been associated with the GOE. However, single-gene analyses have limitations, particularly for deep branches. We reconstructed the evolutionary history of cyanobacteria using genome scale data and re-evaluated the Precambrian fossil record to get more precise calibrations for a relaxed clock analysis. For the phylogenomic reconstructions, we identified 756 conserved gene sequences in 65 cyanobacterial taxa, of which eight genomes have been sequenced in this study. Character state reconstructions based on maximum likelihood and Bayesian phylogenetic inference confirm previous findings, of an ancient multicellular cyanobacterial lineage ancestral to the majority of modern cyanobacteria. Relaxed clock analyses provide firm support for an origin of cyanobacteria in the Archean and a transition to multicellularity before the GOE. It is likely that multicellularity had a greater impact on cyanobacterial fitness and thus abundance, than previously assumed. Multicellularity, as a major evolutionary innovation, forming a novel unit for selection to act upon, may have served to overcome evolutionary constraints and enabled diversification of the variety of morphotypes seen in cyanobacteria today.

  • Chroococcidiopsis sp. PCC 8201: Protein sequences of Chroococcidiopsis sp. PCC 8201. Annotated with Prokka.
  • Geitlerinema sp. PCC 8501: Protein sequences of Geitlerinema sp. PCC 8501. Annotated with Prokka.
  • Leptolyngbya sp. PCC 73110: Protein sequences of Leptolyngbya sp. PCC 73110. Annotated with Prokka.
  • Chroococcidiopsis sp. PCC 7434: Protein sequences of Chroococcidiopsis sp. PCC 7434. Annotated with Prokka.
  • Pseudanabaena sp. PCC 7704: Protein sequences of Pseudanabaena sp. PCC 7704. Annotated with Prokka.
  • Limnothrix redekei PCC 9416: Protein sequences of Limnothrix redekei PCC 9416. Annotated with Prokka.
  • Symploca sp. PCC 8002: Protein sequences of Symploca sp. PCC 8002. Annotated with Prokka.
  • Synechocystis sp. PCC 9635: Protein sequences of Synechocystis sp. PCC 9635. Annotated with Prokka.
  • Chroococcidiopsis sp. PCC 8201: Genome sequence of Chroococcidiopsis sp. PCC 8201
  • Geitlerinema sp. PCC 8501: Genome sequence of Geitlerinema sp. PCC 8501
  • Leptolyngbya sp. PCC 73110: Genome sequence of Leptolyngbya sp. PCC 73110
  • Chroococcidiopsis sp. PCC 7434: Genome sequence of Chroococcidiopsis sp. PCC 7434
  • Pseudanabaena sp. PCC 7704: Genome sequence of Pseudanabaena sp. PCC 7704
  • Limnothrix redekei PCC 9416: Genome sequence of Limnothrix redekei PCC 9416
  • Symploca sp. PCC 8002: Genome sequence of Symploca sp. PCC 8002
  • Synechocystis sp. PCC 9635: Genome sequence of Synechocystis sp. PCC 9635
  • APPENDIX S1. Names of genes used in the phylogenomic analyses: Highly conserved proteins, which have been identified via similarity searches. In total 756 proteins are present in all 65 cyanobacterial taxa used in this study. Concatenated gene alignments were used for the phylogenomic tree reconstruction.
  • APPENDIX S2. Multiple sequence alignment: Multiple sequence alignment used for tree reconstruction presented in phylip format.
  • APPENDIX S3. Trees presented in this study: (a) Phylogenomic Maximum Likelihood tree. (b) Ribosomal Maximum Likelihood tree. (c) Trees with divergence dates.
Date made available: 17 Sept 2015
