Who, What and Where: Myth-Busting the Digital Economy with Web Data

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)


To make appropriate decisions, policy makers require accurate, up-to-date statistics. To
date, however, most statistics have notoriously struggled to capture the digital economy.
This thesis contributes to a multifaceted understanding of the digital economy by proposing
novel data and methodological pipelines to complement traditional statistical surveys. This
will in turn allow us to characterise and demystify the social aspects of the digital economy,
with a focus on demographic and spatial inequalities.
Firstly, we showcase the potential of web data for gaining an atomistic understanding of the
economy. We use web data and methods from Natural Language Processing, specifically
Hierarchical Dirichlet Processes, to identify digital economic activities and to characterise
the demographic features of digital entrepreneurs. This allows us to uncover the persistence
of occupational segregation by race and gender in the digital economy.
Secondly, we improve the understanding of the digital economy by proposing a research
pipeline able to granularly classify digital economic activities. This pipeline applies
contextualised weak supervision to web text data. We subsequently reflect on temporally
informed extensions of our pipeline using web archival data. We underscore the power of
our proposed methodology for research in economic complexity and evolutionary economic
geography.
Thirdly, we integrate the methods and data from the previous chapter to investigate the
spatial features of the digital economy through Inhomogeneous Poisson Point Processes.
This study merges the findings from the previous chapters on the identification of digital
economic activities with econometrics-driven experiments to understand whether digital
economy firms exhibit traditional clustering patterns. Moreover, we uncover whether such
patterns are due to endogenous or exogenous factors, empirically contributing to the
age-old discussion over the death of space.
Overall, this thesis finds evidence that web data enables researchers and policy makers to
gain an improved understanding of the digital economy, while at the same time contributing
to debunk myths around the equalizing powers of digital technologies for participation in
the economy. We conclude by highlighting the challenges of incorporating such pipelines
into practice and include suggestions for further work in this direction.
Date of Award: 19 Mar 2024
Original language: English
Awarding Institution
  • University of Bristol
Supervisors: Emmanouil Tranos & Levi J Wolf
