Automating Archives: The challenges of investigating large-scale web archival infrastructures

Jessica Ogden, Shawn Walker, Ed Summers

Research output: Contribution to conference › Conference Paper › peer-review


In this paper we reflect on the motivations and methodological challenges of investigating the world’s largest web archive, the Internet Archive’s Wayback Machine (IAWM). Using a mixed methods approach, we report on a pilot project centred around documenting the inner workings of ‘Save Page Now’ (SPN), an Internet Archive tool that allows users to initiate the Wayback Machine’s creation of ‘snapshots’ of web resources and their storage in perpetuity. By improving our understanding of SPN and its role in shaping the Wayback Machine, this work both reveals the myriad ways in which the tool is being used and highlights the challenges of designing and operationalising a study of the mostly obscured and dynamic processes that support this information infrastructure at scale. In doing so, the paper invites further discussion surrounding investigations of the hidden processes that enable large-scale and ‘semi-public’ digital infrastructures. The project is currently in the early data analysis and write-up phase.
Original language: English
Publication status: Accepted/In press - 7 Oct 2019
Event: Mapping the "How" of Collaborative Action: ACM Conference on Computer-Supported Cooperative Work and Social Computing 2019 - Austin, United States
Duration: 10 Nov 2019 → 10 Nov 2019


Workshop: Mapping the "How" of Collaborative Action
Abbreviated title: CSCW'19
Country/Territory: United States

