Digging deeper - A new data mining workflow for improved processing and interpretation of high resolution GC-Q-TOF MS data in archaeological research

Research output: Contribution to journal › Article

  • Ansgar Korf
  • Simon S Hammann
  • Robin Schmid
  • Matti Froning
  • Heiko Hayen
  • Lucy J E Cramp
Original language: English
Journal: Scientific Reports
Issue number: 767
Date: Accepted/In press (current) - 16 Dec 2019
Date: Published - 31 Jan 2021


Gas chromatography-mass spectrometry profiling is the most established method for the analysis of organic residues, particularly lipids, from archaeological contexts. This technique allows the decryption of hidden chemical information associated with archaeological artefacts, such as ceramic pottery fragments. The molecular and isotopic compositions of such residues can be used to reconstruct past resource use, and hence to address major questions relating to patterns of subsistence, diet and ritual practices in the past. A targeted data analysis approach, based on previous findings reported in the literature, is common, but it depends greatly on the investigator’s prior knowledge of specific compound classes and their mass spectrometric behaviour, and it risks missing unknown, potentially diagnostic compounds. Organic residues from post-prehistoric archaeological samples often lead to highly complex chromatograms, which makes manual chromatogram inspection tedious and time-consuming, especially for large data sets. This significantly limits the scale and interpretative scope of such projects. Therefore, we have developed a non-target data mining workflow that extracts a higher number of known and unknown compounds from the raw data, reducing investigator bias and vastly shortening overall analysis time. The workflow covers all steps from raw data handling through feature selection and compound identification to statistical interpretation.

    Research areas

  • Analytical chemistry, Archaeology, Data mining, Mass spectrometry



  • Full-text PDF (final published version)

    Rights statement: This is the final published version of the article (version of record). It first appeared online via Springer Nature. Please refer to any applicable terms of use of the publisher.

    Final published version, 1.56 MB, PDF document

    Licence: CC BY

