TY - JOUR
T1 - Ten simple rules for writing Dockerfiles for reproducible data science
AU - Nüst, Daniel
AU - Sochat, Vanessa
AU - Marwick, Ben
AU - Eglen, Stephen J
AU - Head, Tim
AU - Hirst, Tony
AU - Evans, Benjamin D
PY - 2020/11/10
Y1 - 2020/11/10
N2 - Computational science has been greatly improved by the use of containers for packaging software and data dependencies. In a scholarly context, the main drivers for using these containers are transparency and support of reproducibility; in turn, a workflow's reproducibility can be greatly affected by the choices that are made with respect to building containers. In many cases, the build process for the container's image is created from instructions provided in a Dockerfile format. In support of this approach, we present a set of rules to help researchers write understandable Dockerfiles for typical data science workflows. By following the rules in this article, researchers can create containers suitable for sharing with fellow scientists, for including in scholarly communication such as education or scientific papers, and for effective and sustainable personal workflows.
U2 - 10.1371/journal.pcbi.1008316
DO - 10.1371/journal.pcbi.1008316
M3 - Article (Academic Journal)
C2 - 33170857
SN - 1553-734X
VL - 16
SP - e1008316
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 11
ER -