Big data experiments with the archived Web: Methodological reflections on studying the development of a nation’s Web

Niels Brügger, Janne Nielsen, Ditte Laursen

Research output: Contribution to journal; Journal article; Research; peer-review

Abstract

This article explores how archived Web sources can be used for historical studies of an entire national Web domain and its development over time. It presents the methodological challenges of large-scale studies using Web archive content and discusses the limitations and potential of this new type of study of Web history. It uses the entire Danish Web domain .dk from 2006 to 2015, as it has been preserved in the Danish national Web archive, as a case to exemplify how ‘a nation’ can be delimited on the Web and how an analytical design for this type of big data analysis using archived Web can be developed. This includes considering the characteristics of the archived Web as a historical source for academic studies as well as the specific characteristics of the data sources used. Our findings reveal some of the ways in which a nation’s digital landscape can be mapped by examining Web site sizes and hyperlinks, and we focus on discussing how these results shed light on the methodological challenges, reflections and choices that are an integral part of large-scale Web archive studies. The study demonstrates that hardware and software as well as human competences from various disciplines make it possible to perform large-scale historical studies of one of the biggest media sources of today, the World Wide Web.
Original language: English
Journal: First Monday (Chicago)
Volume: 25
Issue number: 3
ISSN: 1396-0466
Publication status: Published - 2020
