Looking Back, Looking Forward: New Strategies for Coverage of a National Web Sphere

Translated title of the contribution: Se tilbage, se fremad: Nye strategier for dækning af den nationale web-sfære

Publication: Contribution to conference › Conference abstract for conference › Research › peer-reviewed



National web spheres have now been harvested for at least a decade, during which the internet has changed rapidly in both content and use. This has challenged not only the harvesting techniques but also the question of where to look for web material relevant to a national web sphere.
This presentation starts with a historical overview of changes in collection strategies at the Danish web archive and ends with a description of the latest implementations made in Denmark to cover web material outside the national top-level domain.
During the last decade, an increasing number of national web pages have moved to generic top-level domains such as .com or .org (Mjøs 2012). This challenge has grown as the use of foreign web hosting services, blogs and social media such as Twitter and Facebook has exploded, with hosts geographically located outside the country’s borders.
The challenge is far bigger than anticipated: a study last year indicated that different methods found different web material. The study used one approach based on Internet Archive data and one approach based on out-links from a national web archive (Zierau 2015). The conclusion is that more methods should be used to find data and include them in a web archive.
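The observation that different methods surface different material can be illustrated with a simple set comparison. The following is only a sketch: the function name, the dictionary keys and the example domains are hypothetical and are not taken from the study.

```python
def compare_candidate_sets(from_external_archive, from_outlinks):
    """Compare candidate domains found by two discovery methods.

    Both arguments are iterables of domain names. The names and the
    result keys are illustrative assumptions, not part of the study.
    """
    a = set(from_external_archive)
    b = set(from_outlinks)
    return {
        "only_external": a - b,   # found only via the external archive data
        "only_outlinks": b - a,   # found only via out-links from the national archive
        "both": a & b,            # found by both methods
        "combined": a | b,        # what a multi-method strategy would cover
    }


# Hypothetical example input
result = compare_candidate_sets(
    ["example.com", "example.org"],
    ["example.org", "example.net"],
)
```

If the two methods overlap only partly, as in this made-up input, the combined set is larger than either method's result alone, which is the argument for using more than one method.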
This presentation will include a description of an operational setup to meet this challenge (to be implemented in 2016). The setup is designed to deal with different (present and future) sources. A source can be any set of URLs to be investigated, derived data (e.g. text extracts) to be investigated, or known national URLs that need preparation before ingestion into a web archive. The setup output will be seeds, domains and sub-domains that can be fed into national bulk and selective harvests for the national web sphere.
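As a rough illustration of the kind of output described above, the sketch below turns a list of candidate URLs into seeds, domains and sub-domains. It is not the Danish setup itself. The function names, the `.dk` filter and the "last two labels" rule for finding a domain are all assumptions made for this example; a real setup would more likely rely on a public suffix list.

```python
from urllib.parse import urlsplit

NATIONAL_TLD = ".dk"  # assumption: national ccTLD, used only for illustration


def registered_domain(host: str) -> str:
    """Naive guess at the domain: the last two labels of the host name.

    This is a simplification and is wrong for suffixes such as .co.uk.
    """
    labels = host.split(".")
    return ".".join(labels[-2:])


def prepare_candidates(urls):
    """Derive seeds, domains and sub-domains from candidate URLs (hypothetical helper)."""
    seeds, domains, subdomains = set(), set(), set()
    for url in urls:
        parts = urlsplit(url.strip())
        host = (parts.hostname or "").lower()
        # Skip malformed entries and hosts already under the national TLD.
        if not host or host.endswith(NATIONAL_TLD):
            continue
        seeds.add(f"{parts.scheme}://{host}{parts.path or '/'}")
        domain = registered_domain(host)
        domains.add(domain)
        if host != domain:
            subdomains.add(host)
    return sorted(seeds), sorted(domains), sorted(subdomains)


# Hypothetical input: one sub-domain URL, one bare domain,
# one URL under the national TLD, and one malformed entry.
seeds, domains, subdomains = prepare_candidates([
    "http://blog.example.com/dk/post",
    "https://example.org",
    "http://www.kb.dk/",
    "not a url",
])
```

The three lists are kept separate because a bulk harvest would typically take whole domains, while a selective harvest can start from individual seed URLs.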
[book] Mjøs, O. J. (2012). Music, social media and global mobility: MySpace, Facebook, YouTube. Routledge Advances in Internationalizing Media Studies.
[presentation/abstract] Zierau, E. (2015). Identifying National Parts of the Internet Outside a Country’s Top Level Domain. Presented at IIPC GA 2015, Stanford, California, USA.
Publication date: 14 Apr 2016
Number of pages: 1
Status: Published - 14 Apr 2016
Event: IIPC Web Archiving Conference 2016 - Radisson Blu Saga Hotel, Reykjavík, Iceland
Duration: 13 Apr 2016 to 15 Apr 2016


Conference: IIPC Web Archiving Conference 2016
Location: Radisson Blu Saga Hotel