Socio-cultural challenges in collections digital infrastructures

Marco Humbel, Julianne Nyhan, Nina Pearlman, Andreas Vlachidis, JD Hill, Andrew Flinn

Journal of Documentation

ISSN: 0022-0418

Article publication date: 3 September 2024

Issue publication date: 2 January 2025

Abstract

Purpose

This paper aims to explore the accelerations and constraints libraries, archives, museums and heritage organisations (“collections-holding organisations”) face in their role as collection data providers for digital infrastructures. To date, digital infrastructures operate within the cultural heritage domain typically as data aggregation platforms, such as Europeana or Art UK.

Design/methodology/approach

Semi-structured interviews with 18 individuals in 8 UK collections-holding organisations and 2 international aggregators.

Findings

Discussions about digital infrastructure development often lay great emphasis on questions and problems that are technical and legal in nature. As important as technical and legal matters are, more latent, yet potent challenges exist too. Though less discussed in the literature, collections-holding organisations' capacity to participate in digital infrastructures is dependent on a complex interplay of funding allocation across the sector, divergent traditions of collection description and disciplinary idiosyncrasies. Accordingly, we call for better social-cultural and trans-sectoral (collections-holding organisations, universities and technological providers) understandings of collection data infrastructure development.

Research limitations/implications

The authors recommend developing more understanding of the social-cultural aspects (e.g. disciplinary conventions) and their impact on collection data dissemination. More studies on the impact and opportunities of unified collections for different audiences and collections-holding organisations themselves are required too.

Practical implications

Sustainable financial investment across the heritage sector is required to address the discrepancies between different organisation types in their capacity to deliver collection data. Smaller organisations play a vital role in diversifying the (digital) historical canon, but they often struggle to digitise collections and bring catalogues online in the first place. In addition, investment in existing infrastructures for collection data dissemination and unification is necessary, instead of creating new platforms, with various levels of uptake and longevity. Ongoing investments in collections curation and high-quality cataloguing are prerequisites for a sustainable heritage sector and collection data infrastructures. Investments in the sustainability of infrastructures are not a replacement for research and vice versa.

Social implications

The authors recommend establishing networks where collections-holding organisations, technology providers and users can communicate their experiences and needs in an ongoing way and influence policy.

Originality/value

To date, the research focus on developing collection data infrastructures has tended to be on the drive to adopt specific technological solutions and copyright licensing practices. This paper offers a critical and holistic analysis of the dispersed experience of collections-holding organisations in their role as data providers for digital infrastructures. The paper contributes to the emerging understanding of the latent factors that make infrastructural endeavours in the heritage sector complex undertakings.

Citation

Humbel, M., Nyhan, J., Pearlman, N., Vlachidis, A., Hill, J. and Flinn, A. (2025), "Socio-cultural challenges in collections digital infrastructures", Journal of Documentation, Vol. 81 No. 1, pp. 56-85. https://doi.org/10.1108/JD-12-2023-0263

Publisher: Emerald Publishing Limited

1. Introduction

This paper explores the accelerations and constraints libraries, archives, museums and heritage organisations (“collections-holding organisations”) [1] face in their role as collection data providers for digital infrastructures. To date, such infrastructures operate within the cultural heritage domain typically as data aggregation platforms of national, international or domain-specific scope. Examples include Japan Search, Cultura Italia, Europeana and Art UK (Paltrinieri, 2021). We address the following research questions:

(1) What are collections-holding organisations' experiences of participating in digital infrastructure projects?
(2) Which factors enable and impede collections-holding organisations from unifying siloed collections?
(3) Which risks and opportunities do collections-holding organisations perceive in large-scale infrastructure investments, such as the UK's Towards a National Collection (TaNC) programme?

After two decades of investments in mass digitisation projects, collections' metadata, digital images of objects and machine-readable full-text documents are available at an unprecedented scale. Scholars thus speak about “collections as data – a conceptual orientation to collections that renders them as ordered information, stored digitally that are inherently amenable to computation” (Padilla et al., 2019, p. 7). Accordingly, funders' attention is moving towards finding the most “suitable infrastructure components and methods” for accessing collection data at scale and, crucially, reaping their potential (Ahnert et al., 2023, pp. 23–24). The literature shows that this potential is being realised in different ways by different stakeholders (McGillivray et al., 2020, pp. 11–13). Researchers are asking questions latent to data, or they are using digital collections to develop critical understandings of the capabilities and impact of technology on collections (Van Strien et al., 2020; Beelen et al., 2023). For collections-holding organisations, aggregated mass digitisation opens opportunities to research collection histories and biases in new ways (MacDonald, 2023). In the creative sector, digital collections are used for artistic interventions, inspiration and content production (Terras et al., 2021). There is also a trajectory within governmental policies to justify mass digitisation investments as a means for boosting the digital economy. For the European Commission, heritage collections are, for instance, “[…] an important contributor to the European economy, fostering innovation, creativity and economic growth […] and the reuse of such [mass digitised] content can generate new jobs not only in the cultural heritage sector but also in other cultural and creative sectors, including for instance the video game and film industries” (European Commission, 2021: para. 4).
To bolster research and the commercial potential of digital collections there is a strong push towards Open Access to cultural heritage from governmental and non-governmental bodies [2]. A prime example is the OpenGLAM initiative, which advocates for Open Access to cultural heritage and calls on collections-holding organisations to adopt open licenses for metadata and digital surrogates of objects [3]. The OpenGLAM initiative received financial support from the European Commission, the Open Knowledge Foundation and Creative Commons (OpenGLAM Initiative, 2023). Similarly, Open Access has increasingly become a condition for receiving funding for digitisation, as in the case of the UK National Lottery Heritage Fund, for instance [4]. Yet in tandem, some UK governmental policies call on collections-holding organisations to attract external funding for mass digitisation by collaborating with commercial enterprises in so-called private-public partnerships. This practice places digital collections either behind paywalls entirely or under restrictive licensing conditions and creates a certain cognitive dissonance for collections-holding organisations, who are implicitly called on to pursue two apparently contradictory paths simultaneously (Ahnert et al., 2023, pp. 24–25).

For those who seek pathways to unlocking the perceived latent creative, economic and cultural value of the mass digitisation activities undertaken in the past decades, digital infrastructures for unifying digitised collections have taken on a decisive significance. Accordingly, significant investments in digital infrastructures are being made by players in the governmental, private, heritage and academic sectors. The Australian Research Data Commons invested part of an 8.9 million Australian Dollar programme (2021–2023) in the Trove aggregation platform to improve data-driven access to collections for humanities and social science researchers (ARDC, 2023a; ARDC, 2023b). In 2022, a consortium led by Europeana was awarded a multi-million Euro service contract by the European Commission for deploying the so-called “common European data space for cultural heritage”. The aims are to expand the existing Europeana aggregation infrastructure and its interoperability with the wider planned data space ecosystem [5], and to invest in high-quality digitisation and data enrichment, capacity building and training, and public engagement (European Commission, 2022). Europeana and the data space are also expected to become important dissemination platforms for adjacent digital infrastructures, such as the Horizon Europe-funded “European Collaborative Cloud for Cultural Heritage” – a planned cloud infrastructure aimed at heritage professionals for data sharing and leveraging digital research tools (European Commission et al., 2022, pp. 8; 24; European Commission, 2024). Within these projects and schemes, aggregations are imagined as spaces to bring together two decades of mass-digitised collections and to facilitate cross-search. Yet they are more significant than this: aggregations promise to satisfy the needs and expectations of communities of interest that cut across disciplines, sectors and society.
Thus, aggregation infrastructures are also spaces where uses and boundaries converge and even collide. Given the gaps and omissions in mass digitisation efforts (Zaagsma, 2023; Mak, 2014) aggregation infrastructures will also shape who and what is remembered, and how digital collections can and cannot be used in future.

The UK context illustrates the scale of investment being put into stimulating access to mass digitised collections through digital infrastructures. At the time of writing in 2023, at least two UK Research and Innovation (UKRI) funded schemes are underway, with overlapping and potentially competing ambitions. The five-year (2019–2024) £18.9m Arts and Humanities Research Council (AHRC) TaNC programme seeks to “[…] take the first steps towards creating a unified virtual ‘national collection’ by dissolving barriers between different collections – opening UK heritage to the world” (TaNC, 2023). The stakes in the programme's deliverables are high. After all, AHRC claims the programme to be “[…] the largest investment of its kind to be undertaken to date, anywhere in the world […]” [6]. Through TaNC, the UK aims to secure a pioneering position in interdisciplinary research by “set[ting] global standard[s]” for the creation of digital collections (UKRI, 2021). But TaNC is not the only major investment seeking to unify collections at scale in the UK. The non-profit Open Data Institute (ODI) gained £8m from UKRI's Innovate UK for setting up a research and development programme “[…] to support innovation, improve data infrastructure and encourage ethical data sharing” (ODI, 2023). As part of this programme, the ODI commissioned the UK Collections Trust, from 2020 to 2021, to research how museum collections could be “joined-up” and access provided “seamlessly” (Himmelsbach, 2021). Art UK, a major aggregation platform with artworks from over 54,000 artists, takes a leading role in the project together with the University of Leicester. Their endeavours will result in the Museum Data Service (MDS), which is described as a “real world digital infrastructure” that “[…] will act as a data repository for tens of millions of raw object records drawn from UK museums and other public collections” (Art UK, 2022).
The research presented in this article was conducted as part of “The Sloane Lab: Looking back to build future shared collections,” which is one of TaNC's five Discovery Projects. In this paper, however, we address issues about connecting and using collections at scale which go beyond the immediate focus of TaNC.

Given the fundamental role collections-holding organisations ought to have in these digital infrastructures, as those who curate collections and bring them into machine-processable forms in the first place, surprisingly little research is concerned with their perspectives and capabilities to do so. A number of surveys benchmark the progress and extent of digitisation and data dissemination in the heritage sector (Nauta et al., 2017; McCarthy and Wallace, 2018; Gosling et al., 2022; Estermann, 2018). Some research has focussed specifically on small and activist-led collections-holding organisations' perspectives on digital infrastructure developments (Caswell and Jules, 2017; Gosling et al., 2022, pp. 29–32; Humbel, 2023). A recent study identified the technical requirements of institutions with collections from the natural sciences in the context of the planned Swiss Virtual Natural History aggregator (Petrus et al., 2023). To date, the experiences of collections-holding organisations in their role as data providers for collection data infrastructure development are not well known and are undertheorised. Moreover, information pertinent to such an endeavour is dispersed across an abundance of case-study reports and grey literature (see Section 2).

To move towards a critical and holistic understanding of collections-holding organisations' perspectives on digital infrastructure projects, the Sloane Lab conducted a series of semi-structured interviews with curators, archivists, librarians and IT specialists who are responsible for collection management, digitisation and collections aggregation. Digital infrastructure projects have a history of being stalled (see Section 5). In this paper, we demonstrate that if mass digitisation projects' communities of interest want to better understand the potentials, limitations and implications of collection data aggregation, then it is vital to listen to the people in collections-holding organisations who are placed to facilitate digital infrastructure development. Doing so is indispensable to move towards an “infrastructural inversion” of heritage aggregation projects, “[…] recognizing the depths of interdependence of technical networks and standards, on the one hand, and the real work of politics and knowledge production on the other” (Bowker and Star, 1999, p. 34).

The paper is organised as follows: Section 2 unpacks aspects of the current discourse on digital infrastructure development in the heritage sector that remain underdeveloped. Section 3 describes the methodology; Section 4 is the analysis of interview consultations. Section 5 responds to the research questions by discussing the findings of the interviews. This paper concludes in Section 6 with a set of recommendations for future digital heritage infrastructure development and research.

2. Literature review

In the cultural heritage domain, the term “aggregator” refers to a system where data of participating collections-holding organisations are centrally collected, structured into a common format (e.g. the Europeana Data Model – EDM) and published on an online platform (Europeana, 2010, p. 2). In principle, “aggregation” does not signpost a particular technological approach beyond the centralised unification of collections. An aggregator may crawl data through bots or import data through a protocol such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) [7]. Collections-holding organisations may also proactively push data to aggregators through an Application Programming Interface (API) (Collections Trust, 2019, pp. 8–9) [8]. Decentralised data aggregation models can also be implemented, involving agencies serving as data aggregators at either the national or domain level. Examples are Cultura Italia or the UK legacy aggregator Culture Grid (see Section 5), which bundle digital collections on a national level and push them to their own platform and Europeana [9]. The Sloane Lab is an aggregator too. Data of contemporary and historical catalogue records are converted into the Resource Description Framework (RDF) format and ingested in a central Knowledge Base (Nyhan et al., 2023, p. 24).
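
To make the harvesting route concrete, the following is a minimal Python sketch of how an aggregator might request and read records over OAI-PMH. It is illustrative only and does not describe how Europeana, Culture Grid or the Sloane Lab is implemented: the endpoint `https://example.org/oai`, the sample response, the helper names (`list_records_url`, `parse_records`) and the flat target record are all assumptions, and the flat record merely stands in for a common format such as EDM. Only the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces are taken from the protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a (hypothetical) provider endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Invented sample response; a real harvest would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:herbarium-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Pressed fern specimen</dc:title>
          <dc:creator>Unknown collector</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text, provider):
    """Map each harvested record to a flat dict (a stand-in for a common format)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "provider": provider,
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "creator": rec.findtext(".//dc:creator", default="", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE, provider="example-herbarium"):
        print(record)
```

The sketch deliberately ignores resumption tokens, error responses and richer metadata formats. It also makes visible where the non-technical questions discussed in this paper enter: the choice of which source fields map to which target fields is a descriptive convention, not a protocol feature.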

It is well known that a lack of sustainable funding, expertise, equipment, personnel and sector-wide strategies for coordinating digitisation projects impacts collections-holding organisations' capacity to digitise (Pandey and Kumar, 2020, pp. 29–30) and, by extension, to participate in aggregation infrastructures. However, the question of how collections can be unified at scale is mostly discussed as a question that can be solved by finding the appropriate technology and legal policy. The research agenda of TaNC's Foundation Projects had, for instance, a strong emphasis on the potential of technologies and standards such as the International Image Interoperability Framework (IIIF), Persistent Identifiers (PIDs) and Linked Open Data (LOD) for creating aggregated collections (Padfield et al., 2022; Winters et al., 2022; Kotarski et al., 2022), and on means for enhancing collections discovery through Artificial Intelligence (AI) or geo-spatial information (Rees et al., 2022; Angelova, 2021). Another strand of TaNC looked at the implications of restrictive licensing schemes for reusing collections and copyright management at scale (McNeill, 2022; Wallace, 2022). TaNC's research emphasis [10] is symptomatic of the core challenges mass digitisation and subsequent aggregation projects keep grappling with: the harmonisation of intellectual property licensing practices as well as developing interoperable technical tools, standards and systems (Thylstrup, 2018, pp. 66–67) (see Figure 1).

However, examples like TaNC's Provisional Semantics project suggest that research on “structures, practices, policies, collections and cultures” opens crucial research arenas for enriching infrastructure development beyond the legal and technical sphere (Pringle et al., 2022, p. 10). This can extend to contesting institutionalised racism and injustices embedded in collections, the project argues. Indeed, even in the context of the seemingly more mundane and practical concerns of how to unify collections at scale, Science and Technology Studies has established that sociocultural factors “[…] often are greater determinants of interoperability than is technology, per se […]” (Borgman, 2015, p. 47). In this context, Edwards et al. observe how infrastructures tend to be conceptualised as “laundry lists” which aim to facilitate interdisciplinary information management and work first and foremost through technological means (2007, p. 5). The planned MDS is, for instance, anticipated to work as follows:

The technical solution offers a sustainable, low-cost mechanism to aggregate, manage, and index collection metadata from UK collections by “harvesting” this “raw” or “native” data and storing it in a cloud-based Data Repository. Initially, this will not include images. MDS will provide tools for public users and authorised partners to find, select and make use of the data in third-party systems and services, subject to rights and licensing restrictions. There are five key parts to the process: HARVEST, STORE & MANAGE, INDEX, DISCOVER, USE [capital letters in the original] (Art UK, 2022).
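
Read as a data flow, the five parts named in this description can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the MDS design: the record fields (`id`, `title`), the in-memory repository, the word-level index and all function names are hypothetical and chosen only to show how the stages hand data to one another.

```python
from collections import defaultdict


def harvest(sources):
    """HARVEST: collect raw records from each provider, tagging their origin."""
    return [dict(record, provider=name)
            for name, records in sources.items()
            for record in records]


def store(raw_records):
    """STORE & MANAGE: keep records keyed by a provider-scoped identifier."""
    return {f"{r['provider']}:{r['id']}": r for r in raw_records}


def index(repository):
    """INDEX: build an inverted index from lower-cased title words to record keys."""
    inverted = defaultdict(set)
    for key, record in repository.items():
        for word in record.get("title", "").lower().split():
            inverted[word].add(key)
    return inverted


def discover(inverted, repository, term):
    """DISCOVER: return stored records whose title contains the search term."""
    return [repository[key] for key in sorted(inverted.get(term.lower(), set()))]


if __name__ == "__main__":
    # Invented sample data from two hypothetical providers.
    sources = {
        "museum_a": [{"id": "1", "title": "Bronze mirror"}],
        "archive_b": [{"id": "1", "title": "Mirror catalogue"},
                      {"id": "2", "title": "Estate map"}],
    }
    repository = store(harvest(sources))
    inverted = index(repository)
    # USE: a third-party service consumes the discovered records.
    for hit in discover(inverted, repository, "mirror"):
        print(hit["provider"], hit["title"])
```

Even in this toy form the pipeline presupposes that every provider supplies a `title` in a comparable way. That presupposition, rather than the code, is where differing traditions of collection description come into play.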

Finding “technical solutions” for infrastructure development is important. What gets lost with this focus are the challenges of crossing temporal, institutional and disciplinary boundaries (Bowker et al., 2009, pp. 100–01). In the academic and grey literature on collections unification, the sociocultural factors that can limit collections-holding organisations' ability to exchange and aggregate data are present, but rarely the focal point of discussion. At best, non-technical challenges are mentioned incidentally, for example, with a brief reference to different levels of collections description, disciplinary or local variations of metadata standards, working habits, staff knowledge, available resources for cataloguing, collection management systems' support of standards or the formalising of partnership agreements (Darcovich et al., 2019, p. 9; Kostelic, 2017, pp. 68–69; Maron and Feinberg, 2018, pp. 688–90; Moulaison Sandy and Freeland, 2016, pp. 48–49; Renshaw and Liew, 2021, pp. 708–10). In other words, collections' interoperability is shaped by the histories of collections' arrangement and organisation (Sloan and Nyhan, 2021, pp. 209–14). And yet the focus of the discourse on aggregation projects tends to be on which tools and licensing policies enable the unification of collections. This can obscure the labour, agency, conventions and practices of the people who are responsible, behind the scenes, for collection interpretation, management and care (Sherratt, 2015, pp. 7–8).

Yet the need to shift emphasis in infrastructure development is starting to be recognised. Interviews with TaNC's Foundation Project Investigators brought up that: “Most of the current calls for digital projects seem to push for developing tools that are sometimes not linked to actual needs; there needs to be more reflection around these tools and the practices behind them, and tools [sic] development should be an integral part of research and curatorial practices” (Paltrinieri, 2023, p. 23). What, then, could a move towards a more holistic research agenda for infrastructure development look like? And what can be the gain from this shift of focus?

Star and Ruhleder argue that infrastructure development entails a complex interplay between the social and technological, the local and the global. Crucially, functional infrastructures emerge from the bottom up. Global technology and standards need to be able to accommodate local practices and conventions. The standing question is thus not “what” an infrastructure is (e.g. an aggregator), but “when” something becomes an infrastructure (1996, pp. 112–14). Ideally, an infrastructure is characterised by being “embedded” in social and technological arrangements, not built on top of them. By extension, a working infrastructure becomes “transparent” and visible only when it breaks. Infrastructures also have some important caveats that need to be considered in their development and deployment. Conventions are hallmarks of infrastructures and familiarisation; they urge new members to become part of existing communities of practice. These existing conventions and practices are often inherited and may have strengths, such as allowing connections with other infrastructures. But inherited conventions may also be weaknesses that lock infrastructure participants in certain practices and at worst may force the adaptation of practices favoured by more powerful participants (Star and Ruhleder, 1996, p. 113). Thus, for infrastructure development, sociocultural questions need to be considered alongside technical ones. Cognizance of this is important for infrastructural actors and agents who wish to be more responsive towards the local and situated needs of an infrastructure's stakeholders, rather than setting boundaries in advance (Edwards et al., 2007, p. 7). Understanding infrastructures as more than technologies and standards allows us to push forward in developing nuanced understandings of the interests, limitations and potentials that converge in them and that must in turn support the interests of numerous diverse stakeholders.

While outside the scope of this paper, such examinations call on us “[…] to reflect critically on what kind of social values and ways of thinking and working are embedded in planned infrastructures […]” and crucially how “[…] to contribute to the reconfiguration of the global representation of digital knowledge” (Pawlicka-Deger, 2021, p. 540). Recent examples of such approaches include Tzouganatou's analysis of APIs as accelerators and barriers for participatory memory practices (2021) and the study of the craft and precarity of maintenance work for collection databases (Thomer and Rayburn, 2023).

3. Methodology

The Sloane Lab determined semi-structured interviews as the most appropriate data collection method. This decision was based on our recognition of the limited availability of qualitative studies focussing on perspectives of collections-holding organisations and their role as data providers for aggregation infrastructures. Semi-structured interviews have the potential to contextualise and extend information that is otherwise either not available to the public at all or dispersed across the grey literature (Hauswedell et al., 2020, p. 140). The format of semi-structured interviews provides a framework for a focussed conversation between interviewer and interviewee. But in contrast to more structured interview approaches, interviewees have in the semi-structured format the possibility to gain a more active role in the process of “knowledge production” by having the space to expand on issues they identify as relevant (Brinkmann, 2018, pp. 579–80).

We used a “key knowledgeable” sampling strategy to identify interviewees. The approach is a purposeful sampling strategy “[…] to create a group of cases that provide information-rich data-gathering and analysis possibilities” on “highly specialized” subject areas (Patton, 2015, pp. 405; 408–09), such as digital collection aggregations and infrastructures. In total, we identified and contacted 18 collections-holding organisations and aggregators based in the United Kingdom, the European Union and Australia. Of these 18 organisations, 10 agreed to participate in an interview. Table 1 presents a pseudonymised overview of the interviewed organisations.

The interview questionnaire [14] was derived from a review of the literature (partly formalised in Section 2) and through consultation with Sloane Lab team members with extended experience in qualitative data collection (Humbel, 2022; Nyhan and Flinn, 2016). The questionnaire covered the following thematic areas that are less explored in the literature:

(1) Rationale for participating in unified collections and infrastructures
(2) Experience with former collaborations
(3) Technical capabilities and data interoperability
(4) Legal aspects
(5) Anticipated use and benefits of unified collections and infrastructures

Our research was approved by the Research Ethics Committee of University College London (ethics ID: 22509/001). All interviews were facilitated by the paper's first author and were held online via Microsoft Teams in June and July 2022. Interviewees received the questionnaire, an information sheet and consent form prior to the interview. All interviewees consented to participating in the interview and to its recording and transcription. For the recording, we used an H4n Pro recording device. The interview recordings were transcribed manually. Subsequently, the full transcriptions were returned to interviewees who were asked to inform us within four weeks about clarifications, amendments or whether they wished to withdraw from transcripts. This approach does not imply that interviewees agree with our findings or the arguments we present in this paper.

For our data analysis, we applied what some call the “Miles, Huberman and Saldaña approach” for qualitative data analysis (Punch, 2014, p. 173). The approach offers a robust framework for qualitative data analysis without being bound “to any one particular genre of qualitative research” (Miles et al., 2020, p. 6). This framework consists of three simultaneous streams of activities: “data condensation”, “data display” and “drawing and verifying conclusions” (Miles et al., 2020, pp. 8–10). We used the “first-cycle” coding methods of descriptive and In Vivo coding (verbatim codes derived from responses) to explore the content of each interview. The descriptive coding served for the initial familiarisation with the data, and In Vivo coding for gaining a deeper understanding of the interviewees' perspectives. Following In Vivo coding's ethos of honouring the interviewees' voice, we use direct quotes in Section 4 relatively frequently (Saldaña, 2016, pp. 102–10). In a second step, we grouped the codes into thematic categories. This process of data condensation was accompanied by a reflective reading of the subject-specific literature and narrating the findings deriving from the codes into analytic memos. Data display in mind maps supported the process of data condensation. We verified emerging observations by checking how patterns repeat or deviate across the interviews (Miles et al., 2020, pp. 79–91; 296–303). Table 2 exemplifies first cycle codes and thematic units.

4. Data

4.1 Motivation for aggregated collections

Most of our interviewees acknowledge the potential of aggregators to enhance collection discoverability and to reach people “[…] who would never think of coming to this type of [herbaria] collection” (CHO4), for instance. Most interviewees participate in a wide range of aggregation infrastructures. These platforms are either domain-specific (e.g. the Global Biodiversity Information Facility – GBIF, Art UK, Early English Books Online) or dedicated to a specific geographic scope (e.g. Scran, Archives Hub or Europeana). Some interviewees understand the concept of aggregation broadly and mentioned platforms developed outside the cultural heritage domain for sharing bespoke dataset types, such as Flickr Commons for images or Hugging Face for machine learning models. Our interviewees' core user groups are still expected to use the institutional catalogue. There was scepticism about whether aggregations are of relevance for their core audiences beyond offering enhanced collections discoverability (CHO2; CHO3; CHO6; CHO8). This is particularly the case for the community archives domain. CHO1 observes that reaching out to new audiences and promoting collections through aggregations are potential benefits for community archives. However, because community archives' main target group tends to be “their local community”, it is, according to our interviewee, questionable whether aggregations are of relevance to community archives and whether they can withstand a cost-benefit analysis. Some community archives may also not be aware that aggregations could be a way of collection dissemination (CHO1).

Some interviewees recognise the potential of aggregation for facilitating high-level analysis, creative use of historically dispersed collections or mapping phenomena like biodiversity. Such applications of collection data are, however, seen as an area of interest for research-oriented audiences with technical skill sets, such as the Digital Humanities domain. Meta-analysis of historic collections is an area of interest mirrored in the ambitions of AP1 who describe the opportunities of aggregated collection data in “[…] better illuminate[ing] past relationships that have been obscured through institutional records”. For AP2, in contrast, the potential of collection data extends beyond the research domain, into education, tourism, and information services like maps and infotainment systems. AP1's experiences with their project partners suggest that there could be an interest in aggregated collections for heritage organisations themselves, beyond serving just their audiences. Not discussed directly in interviews, this could be realised in terms of organisations leveraging data-driven insights on questions relating to collection policy, such as the extent of cultural diversity reflected within collections, for example.

Some of our interviewees illuminated how, for smaller collections-holding organisations, aggregations are the sole means of allowing their collection to be searched online (CHO1; CHO5). Indeed, infrastructures like the HOPE aggregator (Heritage of the People's Europe), hosted by the International Association for Labour History Institutions (IALHI), were explicitly built with the intention of supporting organisations with limited capacity for making collections available online (Siebinga et al., 2012, p. 5). The same applies to regional initiatives. Ulf Preuss describes, in the context of a regional aggregator for the German Digital Library, a “cooperative approach” where larger organisations support smaller partners in the provision of expertise and infrastructure, as well as in skill building. Large project partners also took over administrative tasks and funding acquisition (2016, p. 66; 10–11). The benefits of aggregation infrastructures for large collections-holding organisations can go beyond improving collection discoverability.

CHO2 explains how motivations for participating in aggregation infrastructures include facilitating ways of data access and display which go beyond the possibilities of their in-house catalogue. Information enrichment through other collections or other third-party resources, such as authority files, via linked data is seen as a benefit too (CHO4; CHO6). Reports commissioned by TaNC and Collections Trust suggest that services provided by an aggregator, like data enrichment, long-term digital preservation or copyright clearance, could act as participation incentives for organisations (Gosling et al., 2022, pp. 634–35; Collections Trust, 2019, p. 650; Wallace, 2022, pp. 655–58; McNeill, 2022, pp. 624–26). Some aggregators like PHOTOCONSORTIUM or the Digital Repository of Ireland (DRI) advertise such offers [15]. In response to the question of whether such additional services are part of aggregators' responsibilities, interviewees expressed a need for aggregators to lower the threshold for data submission as much as possible. Everything else beyond facilitating data unification and dissemination would for most interviewees depend on the aggregator's remit, however, and may be better served by other bodies in the heritage sector. One explanation for the reluctance to see aggregation infrastructures as something more than data dissemination platforms relates to the sustainability of past initiatives.

4.2 Aggregation fatigue

As the range of aggregation initiatives listed in Section 4.1 suggests, feeding into these infrastructures is for some interviewees a common practice (CHO10; CHO7). For most collections-holding organisations we spoke with, participating in aggregations is, however, seldom part of a larger, sustained strategy and often takes place on an ad-hoc basis. It is a “nice-to-have” rather than a “must-have” (CHO5). Contributing to aggregators is not necessarily part of ongoing digitisation programmes. The competitive environment for gaining funding for digitisation and participating in infrastructure projects was identified as an important factor that led to the development of siloed collections (CHO2). Interviewees expressed that if making collection data available in digital infrastructures is something collections-holding organisations ought to be doing, then there needs to be a more sector-wide strategic approach with appropriate funding structures (CHO3).

Within the UK context, the funding landscape for digital projects is indeed patchy and TaNC is a public investment unprecedented since the late noughties (Terras, 2011, pp. 614–15). In the last few years, some of the biggest investments in digitisation projects have come from commercial private partnerships, favouring collection types that would correspond more intentionally with market needs. Market logics can mean that commercial companies have less incentive to create sustainable infrastructures and to offer content with non-restrictive licenses (Ahnert et al., 2023, pp. 24–25; Hauswedell et al., 2020). Among interviewees, there is strong frustration with public investments made in projects that were partially or fully stalled after funding ran out (CHO1; CHO3; CHO5; CHO7). The motivation for participating in something new that has no commitment to long-term sustainability is low: “[…] we don't want to invest in an aggregator that is going to be static in another year […]” (CHO3). Some interviewees also note the potential negative effect of unsustainable infrastructures on their communities of interest: “if there's finite funding, it can be quite damaging for the institution when that funding ends and the community feels abandoned” (CHO5). TaNC is designed to be entirely explorative, with a limited number of project funding calls and no commitment to ensure long-term sustainability. Some interviewees are sceptical as to whether TaNC can fulfil its ambitions with such an approach: “[…] one will come up against the buffers of internal sort of politics, the need for funding, and the competition between organisations in terms of creating an entire national collection” (CHO4). Given the abundance of existing initiatives for sharing collection data, our interviewees questioned the relevance of creating a new aggregation infrastructure.
Interviewees expressed concerns that different projects seem to be “reinventing the wheels” (CHO3) and do not communicate well among each other: “There's presumably parallel spending of money on similar things, similar mistakes, and equally there's no sharing of experiences in most instances. […] the money spent on these things and the efforts on these things aren't actually then becoming available. So, yes, it seems like a wasteful exercise […]” (AP1). Interviewees note that there is generally an “[…] underestimation about the complexity of trying to bring national collections together” (CHO3). In the next two sections, we unpack why unifying collections at scale is challenging from a technical and legal perspective.

4.3 Technical challenges

It is to be expected that for small organisations with few financial and personnel resources the ambitions for collections as data are, without dedicated support, out of scope. Our interviewee from the community archive domain observes a dissonance between aggregations and the practical reality of many community archives to “get their catalogues online in the first place” (CHO1). Throughout our interview consultations, we observe however that collections-holding organisations' asymmetric capabilities for publishing collection data do not only play out between major and small organisations but also between different collection types within the organisations themselves.

Our interviewees from CHO5, an organisation for historic environment conservation, report how data from spatial data information systems can be exported straightforwardly through programming routines. Within the same organisation, the data export for the archival collection would in contrast require a significant amount of manual intervention, as the data is kept within a legacy SQL (Structured Query Language) database system with limited processing capacity. With 1.5 million collection records and over 1 million images stored in the database, any data export at scale requires support from the organisation's IT department (CHO5). In the case of CHO6 the state of digitisation varies across the library and archive collection vs the herbarium and living plant collection. The herbaria's digitisation programme has run since 2004, resulting in approximately three million plant specimen records and 600,000 images. For the living plant collection, metadata and images are complete and updated on an ongoing basis. Herbarium and living plant collection data are also regularly pushed to GBIF. In contrast, the focus of the library and archive was to date not on digitisation but on metadata creation. While the library made collection information available in union catalogues, the capacity for participating in aggregations was facilitated only through the provision of external IT support, as the in-house IT-service did not have the capacity for this kind of work (CHO6). These are not exceptional experiences. At the time of the interview, some of our other interviewees had just recently updated and harmonised their Digital Asset Management Systems (DAMS) across the organisation (CHO2; CHO8). Our interviewee from CHO2 made clear that such work needs careful planning and always needs to be weighed up against the costs for collection conservation and management. 
Legacy or “homegrown” systems are also common within the community archive domain and limit community archives' potential for integration in infrastructures (CHO1).

Given the huge discrepancies in technical capacity across and within collections-holding organisations, aggregators need to be able to accommodate a wide range of different data export formats and standards. Metadata mapping is for aggregators a core activity, and data import often needs to be customised towards the capabilities of individual data providers (Brugman et al., 2016, p. 1278; Butigan et al., 2020, p. 61). Both aggregation projects we interviewed reported how essential it is to keep the technical threshold for content providers as low as possible by taking the lead on tasks such as data massaging, cleaning and mapping to facilitate the participation of organisations of different sizes and types. But reducing the challenges of collection data unification at scale to technological and capacity considerations alone obscures some of the more subtle, yet potent challenges. As one of our interviewees from the research aggregation project puts it, collection unification at scale “[…] becomes a story about the inevitable differences between the conventions of recording digital information about archives, libraries, objects and art.” Crucially, this is a “technical and ontological exercise” (AP1). Our interviewee from CHO4, an organisation based in the field of natural sciences, observes how pushing data to GBIF is for the organisation a straightforward process as the data and records are relatively homogenous and confined to domain-specific taxonomical descriptors. At least for specialist audiences, the common search terms and query types can be anticipated. Difficulties emerged when attempting to bring herbaria collections together with other domains such as archaeology or art (CHO4). Similarly, CHO7 report how in their experience standards for data sharing are much more established within the natural sciences than in the humanities.
Finding common denominators for cultural collections presents a challenge as the humanities disciplines are traditionally more focussed on interpretative work (Borgman, 2007, pp. 217–19) [16].
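The metadata mapping described above can be illustrated with a minimal Python sketch. It is not drawn from the workflows of the aggregators we interviewed: the two provider types, all field names and the three-field target schema are hypothetical assumptions chosen only to show how a crosswalk leaves gaps visible and sets aside unmapped fields for manual review.

```python
# Hypothetical crosswalks: provider field name -> aggregator target field.
# All field names are illustrative assumptions, not a real schema.
CROSSWALKS = {
    "museum": {"object_name": "title", "maker": "creator", "acc_no": "identifier"},
    "archive": {"unit_title": "title", "name_of_creator": "creator", "ref_code": "identifier"},
}

# Hypothetical minimal target schema of the aggregator.
TARGET_FIELDS = ("identifier", "title", "creator")


def map_record(record, provider_type):
    """Map one provider record onto the target fields.

    Missing values stay None rather than being guessed, and fields without
    a crosswalk entry are returned separately for manual review.
    """
    crosswalk = CROSSWALKS[provider_type]
    mapped = {field: None for field in TARGET_FIELDS}
    unmapped = {}
    for source_field, value in record.items():
        target = crosswalk.get(source_field)
        if target is None:
            unmapped[source_field] = value
        else:
            # Light cleaning only: trim surrounding whitespace on strings.
            mapped[target] = value.strip() if isinstance(value, str) else value
    return mapped, unmapped


# Invented example records for the two hypothetical provider types.
museum_record = {
    "object_name": " Herbarium sheet ",
    "maker": "Unknown",
    "acc_no": "M.1",
    "location": "Store 3",
}
archive_record = {"unit_title": "Estate papers", "ref_code": "A/12"}

mapped_museum, leftover_museum = map_record(museum_record, "museum")
mapped_archive, leftover_archive = map_record(archive_record, "archive")
```

In this sketch the archive record has no creator, so the target field remains empty, while the museum's internal location field has no counterpart in the target schema and is set aside. Even this toy case suggests why mapping is an ontological as well as a technical exercise.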

4.4 Legal and ethical responsibilities

The challenges collections-holding organisations face when identifying collection copyright status, adopting online licensing schemes and signposting public domain works as Open Access are known to be grounded in the complexity of copyright legislation, contractual agreements, requirements to create revenue or because sometimes copyright holders cannot be identified or contacted anymore (orphan works) (Martinez and Terras, 2019; Peters and Kalshoven, 2016; Wallace, 2022, pp. 81–83). In practice, collections-holding organisations navigate these challenges by assessing the risk of copyright infringement, the consequences of releasing sensitive and personal information or damaging donor relationships (Wallace, 2020b, pp. 8–9; Deazley, 2017, pp. 4–5; Stobo, 2016, pp. 281–85). All these themes were raised by our interviewees. Our analysis suggests that in addition to these issues, collections-holding organisations' disciplinary practices, cultures and collection affordances play out in these risk assessments and ultimately privilege certain collection types being made available online.

Art collections belong to the most celebrated collection types for Open Access frameworks, and the practices of major institutions like the Rijksmuseum, the Danish Statens Museum for Kunst or the Metropolitan Museum of Art significantly raised expectations with respect to open licensing (Halperin, 2017; McCarthy and Wallace, 2020; Pekel, 2013; Sanderhoff, 2014). One of our interviewees observes how major art collections lend themselves to Open Access frameworks because their creators tend to be identifiable and well-known (CHO8).[17] For certain object types the risk of copyright infringement is low. Interviewees exemplified this in the context of herbaria collection where plant specimens qualify as non-human-made works (CHO4; CHO6). However, other collection types present significant copyright assessment challenges. Copyright assessment is sometimes impossible due to the mass of records an archive needs to process (CHO5). The impact of collection types on legal assessments can be illustrated further. While for CHO6's herbaria collection, the legal assessment is relatively straightforward, the case for their archival collection is not, and more uncertainty characterises their online access policy for this part of the collection. In the cases of CHO5 and CHO6, interviewees report how in the past collections were commonly deposited without specifications relating to intellectual property or access permissions. But even in the case where depositor agreements exist there is a dissonance between legacy agreements and the changed expectations of access, as donor agreements often predate the Internet and unification of collections at scale:

I think, from a legal point of view, we would have to be very cautious because we didn't state we were going to do that aggregation work, so therefore we have to return to the owners and work through that and get a new signature on a piece of paper (CHO5).

Depending on the organisation and collection type, legal considerations blur with responsibilities which go beyond the sole assessment of copyright or collection ownership. CHO4 reports on withholding geographic data about certain herbaria collections that fall under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES), as this information could be used to deplete species. Archival collections typically consist of photographs, diaries, letters or administrative records which are prone to contain information that falls under data protection legislation (CHO6). Even if data protection does not apply anymore, interviewees report that they still need to consider that some collection information could be associated with living descendants and family members (CHO6; CHO7; CHO8).

Interviewees emphasise that the legal assessment of collection access protocols is insufficient where intra-cultural injustice needs to be addressed. Digitisation and online dissemination can entrench the violent appropriation of heritage from people who were dispossessed through colonialism, imperialism, the enslavement of human beings and other iniquitous acts and processes (Pavis and Wallace, 2019, pp. 6–7; Wallace, 2021, p. 15; Ortolja-Baird and Nyhan, 2022). Crucially, cultural injustices are not only about images but also about descriptive information, potentially including metadata, as our interviewee from AP1 points out. They estimate that at least a decade will be necessary for building meaningful relationships with affected groups to address cultural injustice within their collection data aggregation project. Important steps forward are in this context the CARE (Collective Benefit, Authority to Control, Responsibility and Ethics) principles, which seek to “[…] reposition[s] Indigenous Peoples, nations, and communities from being subjects of data that perpetuate unequal power distributions to self-determining users of data for development and wellbeing” (Carroll et al., 2020, p. 2).

4.5 Centralised vs decentralised data sharing

Throughout our conversations, we observe scepticism towards the concept of unifying siloed collections in a centralised manner because “[…] one size isn't going to fit all for the legal, technical, organisational […] conditions” of participating organisations (CHO5). For many of our interviewees, the issue of “dissolve[ing] barriers between different collections” (TaNC, 2023) is not resolved by creating a digital national collection. To tackle the technical and legal idiosyncrasies described in Sections 4.3 and 4.4, they call instead for investment in the development and enhancement of resources which allow for cross-disciplinary information management, such as vocabularies and authority files (CHO1; CHO4; CHO6; CHO7; CHO9). Closely related is the observation by some of our interviewees that the concept of aggregation no longer represents the state-of-the-art, which presently tends towards the linking of collections without an intermediary platform (AP2; CHO4; CHO5; CHO6). One interviewee from CHO3 argued, for instance: “It feels to me like it's counterintuitive to still be bringing huge amounts of digital stuff together in one place just so people can search it […] it should be like: you're in one place and it points you off into all these other related resources”. Linked Data, “[…] a set of best practices for the publication of structured data on the web” (Van Hooland and Verborgh, 2014, p. 3), presents an alternative to the data aggregation model. The TaNC Foundation Project “Heritage Connector” leveraged, for instance, the Linked Open Data (LOD) database Wikidata for data enrichment and linking between siloed institutional collections catalogues (Dutia and Stack, 2021, pp. 2–3).[18] However, LOD is not a common approach within the sector. The few interviewees who mention experience with LOD projects describe them as experimental in character.
Indeed, scholars found that LOD applications within the cultural heritage sector tend to remain in “the prototype stage” (Winters et al., 2022, p. 10) and are mostly facilitated by big institutions in explorative manners (Davis and Heravi, 2021, p. 12).

Rather than disseminating collections via an aggregator, some interviewees experiment with publishing their collection as datasets for download (CHO2; CHO3). The advantage of this approach is that people can use the material in different ways, including integrating collections in existing aggregations and platforms such as Wikipedia or the Internet Archive, without the organisation actively needing to invest resources in “managing those relationships” with third-party aggregators (CHO2). CHO8 see similar potential in offering access to collection data via an API. However, their experience suggests that the technical tool of an API does not resolve the fundamental issues that impede collections-holding organisations from sharing their collection data. Policies and workflows need to be in place to ensure data is shared in accordance with copyright and data protection legislation. In addition, their servers would not currently be capable of handling a mass of queries (CHO8). The statements by CHO8 echo the findings by Winters et al. who point out that the current design of the APIs used by collections-holding organisations does not support large-scale use (2022, pp. 14–16). Our interviewees were also clear that dissemination of data sets alone serves only a very small cohort of potential users with advanced technical skills or the appropriate equipment to deal with large data sets: “having terabytes of data […] isn't the most accessible form for people who don't have a computer that can open terabytes worth of files” (CHO2). Similarly, CHO4 observes how for the general public the dissemination of data alone is of little use and for them “a lot of investment into the interpretation of the objects” is necessary.
In terms of the limited technical capacities and skills of smaller organisations, purely data-driven approaches for sharing collections favour large institutions with appropriate resources, bolstering existing privilege in data contexts and further marginalising less-well-resourced players.

5. Discussion

In this section, we respond to the research questions, by discussing the findings from the interviews with reference to the wider literature. In this article, we asked:

(1) What are collections-holding organisations' experiences of participating in digital infrastructure projects?
(2) Which factors enable and impede collections-holding organisations from unifying siloed collections?
(3) Which risks and opportunities do collections-holding organisations perceive in large-scale infrastructure investments, such as the UK's Towards a National Collection (TaNC) programme?

Regarding collections-holding organisations' experiences of participating in digital infrastructure projects, a complex picture emerged from the interviews. Interviewees acknowledge broadly positive experiences with digital infrastructure projects. Interviews contain discussion on the merits of these infrastructures and the new opportunities they can open with respect to data enrichment, collection exploration, display and making collections available to wider audiences. Existing infrastructures are also important platforms for some smaller organisations to make their collections available online in the first place. The potential of analysing collections at scale through their unification is acknowledged by interviewees, though mostly seen as an interest confined to a specialised audience, a position echoed in the external literature (Hauswedell et al., 2020, pp. 155–56). The potential benefits of unification are understood by interviewees to pertain to audiences. Research shows, however, that usage profiles of digital infrastructures' audiences tend not to be well understood and articulated (Bettivia and Stainforth, 2023, p. 709; Bailey-Ross, 2021, p. 25). While there could be a potential for institutions themselves to explore their own histories and collections through virtual unification, this avenue does not seem to be a core interest for the organisations we interviewed. Overall, the value and opportunities of aggregated collections need to become clearer (Poole, 2015). Those who seek to fund the provision of aggregators have been known to communicate their rationale in language that emphasises novelty and accordingly scarcity and is furthermore peppered with references to the technological state of the art. TaNC is a case in point:

By seizing the opportunity presented by new digital technology, it will allow researchers to formulate radically new research questions, increase visitor numbers, dramatically expand and diversify virtual access to our heritage, and bring clear economic, social and health benefits to communities across the UK (TaNC, 2023).

Notwithstanding what is characterised above as a broadly positive attitude to aggregators, interviewees' responses often seem to question whether collections-holding organisations experience such infrastructures as novel, scarce and technologically cutting-edge. In fact, to date, collections-holding organisations have been able to choose from an abundance of aggregators or other solutions to share collection data outside of their own catalogues. In the UK context, we find many examples which aspired to unify collections at scale. Since 1995, Scran has been bringing together digitised audio-visual material “[…] from over 300 [UK] museums, galleries, and archives including the V&A [The Victoria & Albert Museum], National Galleries of Scotland, Glasgow Museums and The Scotsman to name a few” (Historic Environment Scotland, 2023a, b). Another example is the UK Archives Hub – a discovery service that aggregates the collections of over 350 archival organisations – and which emerged in the early noughties out of “[t]he vision of a National Archives Gateway” (Hill, 2002, p. 239). With Culture Grid, the UK had until relatively recently another infrastructure in place for “[…] joining and opening up UK collections for more use by more people in more ways” (Culture Grid, 2010). While formally still operating as “the de facto national aggregator for UK museums” and pushing content to Europeana, Culture Grid closed down for new data providers in 2015 (Gosling, 2019). TaNC's aspiration of unifying collections on a national level is not new. Strong frustration across collections-holding organisations surfaced in interviews with respect to infrastructure projects with similar objectives and uncertain longevity [19].

Interviewees also questioned the presentation of aggregators as technologically state-of-the-art. Rather than established aggregation models, interviewees expressed more interest in decentralised approaches to collection unification. What such an approach could practically look like remains unclear. LOD approaches have not had a wide uptake in the sector despite having had some presence there for over a decade. The distribution of data sets alone is likely to serve a specialised technical audience only. Lightweight interfaces and toolboxes for collection data exploration could offer ways forward (Winters et al., 2022, p. 16) [20] but would require a certain level of technical expertise from the users.

The European Commission's ambition with respect to the “common European data space for cultural heritage” points towards decentralised approaches too. The creation of a “data space for cultural heritage” was proposed by the European Commission in 2021 (2021, p. 5) and is situated within the European data strategy to develop “a genuine single market for data, open to data from across the world” that is subject to EU legislation, norms and governance (European Commission, 2020, pp. 4–5). Accordingly, the EU-funded “Data Spaces Support Centre” defines the data space as: “[a]n infrastructure that enables data transactions between different data ecosystem parties based on the governance framework of that data space. Data space [sic] should be generic enough to support the implementation of multiple use cases” (Poikola et al., 2023, p. 5). Given these elusive definitions, commentators point out that there is a lack of consensus as to what the concept of the “data space for cultural heritage” ought to be in practice (Dobreva et al., 2022, p. 493). Ambiguous also is the role of Europeana (an aggregator of aggregators, after all) which is expected to become in a rather “recursive” manner “[…] the core of the emerging Common European Data Space for cultural heritage” (Keller, 2021). Despite advances in decentralised technologies such as Linked Data, there is to date no robust framework for a coordinated approach for facilitating the interlinking of collection data in a decentralised manner.

Interviewees noted serious risks for small collections-holding organisations in infrastructure programmes. Current funding models, and indeed strategy and foresight policies surrounding aggregation infrastructures, are not only deemed inadequate but, from the perspective of smaller organisations (who are the most commonly occurring heritage entity within the wider sector),[21] structurally flawed. Participation in aggregation infrastructures often takes place in an ad-hoc manner, not as part of long-term strategies. Our interviewees identify the absence of a sector-wide strategy and the competitive funding environment as key drivers of the fragmented digital collections landscape. Infrastructure funding within the heritage sector is commonly organised through competitive funding calls [22]. It is known that the nature of major funding schemes presents significant hurdles for smaller collections-holding organisations, such as community archives. Funding applications are labour-intensive and require a certain skill set and domain expertise as well as good connections within professional networks. Funding schemes can come with the prerequisite to collaborate in partnerships that may compromise the local practices of smaller heritage bodies (Jules, 2019, pp. 7–8). By definition, the nature of major funding schemes reinforces the dominance of a handful of well-resourced major institutions in the digital canon as a whole (Kizhner et al., 2020, pp. 12–14; Gosling et al., 2022, p. 6; Wallace, 2020c, p. 5).

Regarding factors that enable and impede collections-holding organisations to unify what are otherwise siloed collections, the tension between innovation and sustainability arises. The limited capabilities of even relatively well-funded collections-holding organisations to disseminate collection data sit very much at odds with the scope of current innovation-driven infrastructure investments. Interviewees observed that there is commonly an underestimation of the complexity entailed in building large-scale infrastructure projects and that the capabilities of many organisations to deliver data are often limited. Responses from our interview consultations suggest that legacy systems with limited capacities for data management and export are a common phenomenon across the sector. While it is unsurprising to find that small collections-holding organisations can struggle to make collection data available, we observe throughout our interview consultations that the asymmetries in technical capacities play out not only between major and small organisations but also within organisations themselves. It is worth noting that the issue of legacy technology impedes not only collections-holding organisations in making data available but aggregators and other infrastructure providers too. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) – first published in 1999 – is widely considered to be an outdated technology, proposed to be replaced by alternatives such as the International Image Interoperability Framework (IIIF) (Van de Sompel and Nelson, 2015; Freire et al., 2020). However, OAI-PMH is still widely used in the heritage sector, including by Europeana, the Digital Public Library of America or sub-repositories of the Digital Research Infrastructure for the Arts and Humanities (DARIAH-EU), such as ROSSIO (Silva et al., 2022, p. 3). The technology is difficult to replace because it is considered to be the “de facto standard” now (Butigan et al., 2020, pp. 63–64).
Technological hypes often suggest a radical change by adopting a particular technology and obscure how moving from one technology to another is rarely clear-cut. Instead, the “old,” “new” and sometimes even alternatives co-exist. In addition, the choice of one technology over another is in many cases less rational than commonly assumed. Instead, economic, cultural and social factors, in addition to pure chance, pragmatism or ideology, have a far greater impact on which technologies become established (Edgerton, 2006, pp. 1–3; 8–11).[23]

The dynamics shaping technological choice and its ongoing impact force us to think about the interplay between innovation and sustainability. Contrary to the rhetoric on innovation surrounding infrastructure investments, there is a strong risk that infrastructures become static once they are set up rather than spaces where research and development happens (thus the upkeep of technologies like OAI-PMH). Unsurprisingly, a common lack of funding for sustaining infrastructures reinforces this tendency (Rockwell, 2010, pp. 620–21). Sustained infrastructure investment is not a replacement for research and innovation and vice versa. Equally, collections-holding organisations need more support for addressing limited technical capacities (e.g. updating legacy databases and skill development) to be able to deliver collection data in the first place.

Divergent disciplinary traditions across the so-called G-L-A-M sector tend to be obscured when speaking of infrastructures which aim to cater for all types of collections-holding organisations. Larger organisations, for example, commonly have specialised departments with bespoke collection management systems for different domains (e.g. for humanities and natural science collections) or different collection types: library, archive or museum collections. The unification of collections at scale is not simply a matter of finding the appropriate technical tool. It is shaped and hampered by the inherited norms, practices and cultures of information management which developed with the formation of Western scientific disciplines from the 17th and 18th centuries onwards (Edwards et al., 2007, pp. 5–7). Museum collection description emerged out of the need to register and manage objects internally, not for external information retrieval. For this reason, information management via cataloguing standards and vocabularies gained a less prominent role within the museum domain than in libraries, for instance. Consequently, the linking of museum collections internally or externally with other collections is hampered (Zoller and DeMarsh, 2013, pp. 58–64). Another distinction is that libraries and museums usually describe collections on an item level, whereas archives describe on a collection or folder level. This is because the archives' focus is to contextualise the relationships between administrative or individual activities of record creation (Timms, 2009, p. 74). The insufficiency of bringing collections together on an object level only, as many aggregation projects do, is recognised. It can, to some extent, be mitigated by allowing for data ingest on different levels, such as institution, collection, catalogue record and digital surrogate (Collections Trust, 2019, p. 22).
However, querying such aggregates will inevitably be skewed towards organisation types which record information on an item level as their content will surpass others in quantity and detail. Accordingly, aggregates of digital collections do not present ontological surrogates of their physical collections' counterparts. Aggregators are data assemblies, where the information management practices from library, archive, museum studies and other disciplines play out in ways that are different to the in-house digital and physical catalogues.

Differences in collection types and their description also play out with respect to collections-holding organisations' ability to comply with copyright licensing standards. Copyright clearance requires the identification of the creator of a piece of work. But depending on the cataloguing practice, for example the taxonomical classification of biological specimens, the identification of authorship may not be of central concern. “Name of creator” in the archival standard ISAD(G) [24] refers, for example, to “[…] the name of the organization(s) or the individual(s) responsible for the creation, accumulation and maintenance of the records in the unit of description”, which can be, but does not necessarily have to be, the collection's copyright holder (ICA, 2000, pp. 10, 18). UK Archives Hub no longer requires “creator” as a mandatory metadata field for data submission because, for many archival organisations, it is not possible to identify the creators of records or collections (Stevenson, 2019, p. 98). A case study presented by the German National Library demonstrates how, for copyright assessment of books, each work needs to be physically checked because illustrations or photographs may have different copyright holders than the book's author. Overall, the report concludes that the costs of rights clearance for one work (between €1.39 and €27) “[…] often exceeds the economic value of the work themselves” (Peters and Kalshoven, 2016, pp. 1–5). If the economic value of Open Access collections is to be reaped, as promoted by some policymakers, measures need to be put in place to lower the threshold for copyright clearance through legislative frameworks and funding support.

Increasingly, Open Access to cultural heritage as a blanket approach for stimulating access to and use of collection data is turning out to be insufficient. Scholars point out how the violent appropriation of collections from their communities of origin renders those collections inappropriate for digitisation and Open Access frameworks (Pavis and Wallace, 2019, pp. 6–7). Other collection types prone to replicating trauma and harm through collection data dissemination include, among others, medical records and LGBTQ+ and Feminist publications (Wernimont, 2021; Cowan and Rault, 2018, pp. 124–25). The CARE principles are intended to affirm Indigenous data governance and sovereignty. While Carroll et al. see potential in the wider application of the CARE principles, the authors also caution against the principles' co-option in non-Indigenous environments until the principles mature fully (Carroll et al., 2021, p. 5). Important contributions to addressing data inequities are also made in the field of Critical Archival Scholarship. The Feminist Ethics of Care, for instance, call on collections-holding organisations to take up “affective responsibilities” towards the lived experiences of the most vulnerable collections stakeholders (depending on the context: creators, subjects, users and the organisation's wider community) (Caswell and Cifor, 2019). Accordingly, the Ethics of Care are consolidating as a framework for addressing digital collections' uncertain ethical afterlives (Ziegler, 2020; Agostinho, 2020, pp. 69–70). Recent research, building on Social Movement Archives' trajectory of informing critical archival theory and praxis, argues for a move towards social-justice based frameworks for collections as data (Humbel, 2023, pp. 181–86).

Finally, notwithstanding the potential of aggregators, interviewees still expect their core audience to use the institutional catalogue for collection access and call for investments in improving the quality of catalogues, such as authority files. Collection curation and cataloguing are prime examples of infrastructural – and thus invisible and typically feminised – work (Shirazi, 2018; Caswell, 2016, pp. 10–13; Star and Strauss, 1999). The common touting of digitisation and digital infrastructures as the solutions for unlocking the past misses the point that significant amounts of collections are not even digitally searchable (Zaagsma, 2023, p. 845). Catalogue data are consolidating as distinct sources for data-driven research (Baker et al., 2022; Bagnall and Sherratt, 2021; Havens et al., 2023), and the importance of collections curation and cataloguing is now recognised even outside the immediate heritage and Digital Humanities domain. In the summer of 2023, The Guardian's chief culture writer Charlotte Higgins highlighted the fundamental role of professional cataloguing work for institutional transparency and accountability in the context of the stolen, missing and damaged objects from the British Museum (2023) [25]. Cataloguing cannot be thought of as an addition to existing curators' job descriptions but requires specialised roles with distinct skill sets for information management, standardisation and access provision (Zoller and DeMarsh, 2013, p. 64). Ongoing investments in collections curation and high-quality cataloguing are prerequisites for a sustainable heritage sector and collection data infrastructure development.

6. Conclusion

This article highlights institutional perspectives and experiences in relation to large-scale, third-party-funded infrastructural investments which aspire to aggregate collections at scale. Through an analysis of a series of interview consultations with experts in the aggregation scene, and a critical analysis of the dispersed academic and grey literature, we contribute to the emerging understanding of the latent factors that make infrastructural endeavours in the heritage sector complex undertakings. To date, the research focus has tended to be on the drive to adopt specific technological solutions and copyright licensing practices. In addition, projects like TaNC and MDS seek to build new platforms that unify collections at scale. In this article, we argue that the question of how to unify collections at scale cannot be reduced to finding the appropriate technology or licensing framework, or to building new aggregators. Instead, collections-holding organisations' capacity to participate in digital infrastructures is dependent on a complex interplay of resource allocation across the sector and within organisations, divergent traditions and requirements of collection description, and different disciplinary histories and aims. By conceptualising infrastructures as a “machine to be built” there is a risk of “[…] downplay[ing] the importance of social, institutional, organizational, legal, cultural, and other non-technical problems developers always face” (Edwards et al., 2007, p. 7). In this article, we went beyond the developers' views by bringing together the perspectives of curators, archivists, librarians and IT specialists. Through a critical analysis of their responses and of the scholarly and grey literature, including the stated rationales of aggregators and third-party funders, we exposed levels of potential and as yet unresolved difficulty in collection data infrastructure development that are not apparent when the issue is seen from the perspective of one actor only.

Accordingly, we call for better social-cultural and trans-sectoral (collections-holding organisations, universities and technological providers) understandings of collection data infrastructure development. Attention to collections-holding organisations' local affordances and the differences in disciplinary conventions is likewise salient. For funders, policymakers, researchers and practitioners of infrastructure projects we recommend:

(1)
Sustainable financial investment across the heritage sector to address the discrepancies between different organisation types in their capacity to deliver collection data. Smaller organisations play a vital role in diversifying the (digital) historical canon, but they often struggle to digitise collections and bring catalogues online in the first place. Ongoing investments in collections curation and high-quality cataloguing are prerequisites for a sustainable heritage sector and collection data infrastructures.
(2)
More studies on the impact and opportunities of unified collections for different audiences and collections-holding organisations themselves.
(3)
Investments in existing infrastructures for collection data dissemination and unification, instead of creating new platforms. This may include bringing existing systems together.
(4)
Recognition that investments in the sustainability of infrastructures are not a replacement for research, and vice versa.
(5)
Establishing networks where collections-holding organisations, technology providers and users can communicate their experiences and needs in an ongoing way and influence policy.

Figures

Figure 1

Thematic scope of TaNC's Foundation Projects and commissioned reports autumn 2022 [11]

Table 1

Overview of the interviewed organisations

Pseudonym	Abbreviation	Organisation description	Location	Number of interviewees
Aggregation project 1	AP1	Collection research aggregation project	Australia and UK	2 [12]
Aggregation project 2	AP2	Europeana aggregator	European Union	1
Collections-holding organisation 1	CHO1	Community Archive [13]	UK	1
Collections-holding organisation 2	CHO2	National Organisation (library)	UK	1
Collections-holding organisation 3	CHO3	National Organisation (library)	UK	2
Collections-holding organisation 4	CHO4	University Collection	UK	1
Collections-holding organisation 5	CHO5	National Organisation (historic environment)	UK	2
Collections-holding organisation 6	CHO6	National Organisation (plants and biodiversity conservation)	UK	3
Collections-holding organisation 7	CHO7	National Organisation (museum)	UK	2
Collections-holding organisation 8	CHO8	University Collection	UK	3

Source(s): Table by authors

Table 2

Example first cycle codes and thematic units

Example thematic codes	Example in vivo codes	Thematic units
Aggregation rationale; Aggregation audience; Aggregation potentials	“many forms of aggregation and sharing collections”; “make our data as findable as possible”	Motivation for aggregated collections
Sustainability; Funding structures; Stalled projects	“[aggregation] very important nice-to-have”; “reinventing the wheels”	Aggregation fatigue
Discrepancy in technical capability; Legacy systems; Aggregation prerequisites: vocabularies	“[aggregation] technical and ontological exercise”; “no automation”	Technical challenges
Copyright and collection type; Data protection; Donor relationships	“it's not just about copyright”	Legal and ethical responsibilities
Authority files; Linking collections; Data set dissemination	“the concept of aggregation is outdated”	Centralised vs decentralised data sharing

Source(s): Table by authors

Notes

1.

GLAMs (Galleries, Libraries, Archives and Museums) is a common acronym used to refer to cultural organisations which curate and provide access to collections. Not all organisations we consulted are covered by this acronym, however. Some with herbaria collections might be better designated as scientific research institutions for instance (see Table 1). We decided to use “collections-holding organisations” as a short-hand to refer to the organisations covered by our research.

2.

Open Access to cultural heritage is defined as “[…] a policy or practice that allows reuse and redistribution of materials for any purpose, including commercial” (Wallace, 2020a, p. 4).

3.

For the OpenGLAM principles see: OpenGLAM Initiative (2024). OpenGLAM Principles [WWW Document]. URL https://openglam.org/principles/ (accessed 18 June 2024).

4.

See: The National Lottery Heritage Fund (2020). Advice: Understanding our licence requirement [WWW Document]. URL https://www.heritagefund.org.uk/stories/advice-understanding-our-licence-requirement (accessed 27 June 2023).

5.

Other planned European data spaces include for instance the domains of health, industrial manufacturing, finance and agriculture (European Commission, 2020, pp. 22–23).

6.

The claim highlights the weight the UK – as a single country – puts on digital heritage infrastructure investments. Compare for instance the ICT PSP digital libraries programme, in which the European Commission made €30 million funding available (European Commission, 2010, p. 13).

7.

“OAI-PMH […] is a client/server architecture protocol specification that facilitates the diffusion of metadata, e.g.: Resource description of the resource (title, author, date of publishing, publisher, etc.) [and] Resource location on the Internet (indicated by the URL)” (IFLA, 2023).

8.

A close relative of the “aggregation approach” is the federative union catalogues developed in the library domain (Collections Trust, 2019, p. 8). The focus in this paper is however on the aggregation model because these systems are typically used to bring collections together across archives, libraries, museums and heritage organisations.

9.

For a list of Europeana aggregators, see [WWW Document]. URL https://pro.europeana.eu/page/aggregators (accessed 14.12.23)

10.

As the bar chart indicates, TaNC also commissioned research on engaging audiences or the affordances of born-digital material. The point is that research specifically addressing the question of collection unification at scale typically focuses on technical outputs and copyright.

11.

Categorisation based on thematic analysis of reports' content. All TaNC reports are available via Zenodo: [WWW Document]. URL https://zenodo.org/communities/tanc?q=&l=list&p=1&s=10&sort=bestmatch (accessed: 17.10.23)

12.

The interview had to be facilitated in two separate sessions due to the unforeseeable unavailability of one of the interviewees.

13.

We use the term “Community Archive” “[…] to characterise a non-traditional archival collection specifically tied to a particular group, often one that may be undocumented or under-documented by traditional archival institutions” (Bastian and Flinn, 2018, p. xx).

14.

The questionnaires were developed by Marco Humbel, Nina Pearlman, JD Hill, Andreas Vlachidis, Daniele Metilli and Julianne Nyhan and are available in the appendix.

15.

See: [WWW Document]. URL https://www.photoconsortium.net/services/ and https://dri.ie/dri-membership/ (accessed 8.6.23).

16.

For an exemplary case study see for instance the challenges of identifying the boundaries of object descriptions in historical museum catalogues (Ortolja-Baird et al., 2019, pp. 21–28).

17.

On the impact of copyright on smaller art collections see the ongoing work by UCL Art Museum: [WWW Document]. URL https://www.ucl.ac.uk/culture/projects/ucl-rightsholder-clearance-project (accessed 2.12.23).

18.

Fundamental principles of Linked Data include Uniform Resource Identifiers (URI) to anchor machine-readable data on the web and the Resource Description Framework (RDF) to describe resources and the relationships among them, using triple statements (subject – predicate – object). If Linked Data is published under an open license, it becomes Linked Open Data (LOD) (Berners-Lee, 2010).

19.

For examples of stalled UK infrastructure see: Gosling, K., 2019. Nothing new except what has been forgotten [WWW Document]. URL https://collectionstrust.org.uk/blog/nothing-new-except-what-has-been-forgotten/ (accessed 3.6.23). In addition, consider the fate of the JISC-AHRC funded national research data infrastructure “Arts and Humanities Data Service” (AHDS). Established in 1995, the service ceased to receive funding in 2008. Ultimately, the service was decommissioned in 2017 (Greenstein, 1996; AHDS History, 2007; King's Digital Lab, 2024).

20.

For an example of this approach, see Tim Sherratt. (2021). GLAM Workbench (version v1.0.0). Zenodo. [WWW Document]. URL https://doi.org/10.5281/zenodo.5603060 (accessed 2.12.23).

21.

According to Candlin et al., in 2017 large and huge museums together formed, for instance, 15.36% of all UK museums, whereas small museums formed 56% (2020, p. 26). For a discussion on the underrepresentation of the majority of small archives, libraries and museums in the UK's digital canon see Gosling et al. (2022, p. 33).

22.

See for instance DIGITAL-2022-CULTURAL-02 Data for cultural heritage (deployment): [WWW Document]. URL https://ec.europa.eu/info/funding-tenders/opportunities/docs/2021-2027/digital/wp-call/2022/call-fiche_digital-2022-cultural-02_en.pdf (accessed: 21.07.23).

23.

See for instance the famous VHS vs Betamax case (Ponte and Camussone, 2013).

24.

The International Standard Archival Description (General) “[…] provides guidelines for creating descriptions of archival materials, establishing a model based on the principle of respect des fonds within a multi-level description” (SAA, 2023).

25.

See the British Museum's press release on the case: [WWW Document]. URL https://www.britishmuseum.org/sites/default/files/2023-08/Announcement_regarding_missing_stolen_and_damaged_items.pdf (accessed: 17.11.23).

Appendix

The supplementary material for this article can be found online.

References

Agostinho, D. (2020), “Care”, in Thylstrup, N.B., Agostinho, D., Ring, A., D'Ignazio, C. and Veel, K. (Eds), Uncertain Archives: Critical Keywords for Big Data, The MIT Press, Cambridge, MA, pp. 67-75.

AHDS History (2007), “AHDS history – depositing data with the AHDS”, available at: https://web.archive.org/web/20130413012920/http://www.ahds.ac.uk/history/depositing/index.html (accessed 7 July 2024).

Ahnert, R., Griffin, E., Ridge, M. and Tolfo, G. (2023), “Collaborative historical research in the age of big data: lessons from an interdisciplinary project”, in Cambridge Elements: Elements in Historical Theory and Practice, Cambridge University Press, Cambridge, doi: 10.1017/9781009175548, available at: https://www.cambridge.org/core/product/identifier/9781009175548/type/element (accessed 4 May 2023).

Angelova, L. (2021), “Deep discoveries: a towards a national collection foundation project final report”, Interim Report Foundation Projects Towards A National Collection, TaNC, London, doi: 10.23636/1214, available at: https://www.nationalcollection.org.uk/sites/default/files/2021-10/Deep%20Discoveries%20Final%20Report%20.pdf (accessed 1 December 2021).

ARDC (2023a), “Trove enhancements”, available at: https://ardc.edu.au/project/trove-researcher-platform-for-advanced-research/ (accessed 1 September 2023).

ARDC (2023b), “HASS and indigenous research data commons”, available at: https://ardc.edu.au/program/hass-rdc-indigenous-research-capability/ (accessed 1 September 2023).

Art UK (2022), “Collections trust and the University of Leicester”, Museum Data Service, available at: https://artuk.org/about/museum-data-service

Bagnall, K. and Sherratt, T. (2021), “Missing links: data stories from the archive of British Settler colonial citizenship”, Journal of World History, Vol. 32 No. 2, pp. 281-300, doi: 10.1353/jwh.2021.0025.

Bailey-Ross, C. (2021), “Online user research literature review: UK Gallery, Library, Archive and Museum (GLAM) digital collection”, Zenodo, doi: 10.5281/ZENODO.5779826, available at: https://zenodo.org/record/5779826 (accessed 9 June 2023).

Baker, J., Salway, A. and Roman, C. (2022), “Detecting and characterising transmission from legacy collection catalogues”, Digital Humanities Quarterly, Vol. 16 No. 2, available at: http://www.digitalhumanities.org/dhq/vol/16/2/000615/000615.html (accessed 15 November 2023).

Bastian, J.A. and Flinn, A. (2018), “Introduction”, in Bastian, J.A. and Flinn, A. (Eds), Community Archives, Community Spaces: Heritage, Memory and Identity, Facet, London, pp. xix-xxiv, doi: 10.29085/9781783303526, available at: https://www.cambridge.org/core/product/identifier/9781783303526/type/book (accessed 29 April 2021).

Beelen, K., Lawrence, J., Wilson, D.C.S. and Beavan, D. (2023), “Bias and representativeness in digitized newspaper collections: introducing the environmental scan”, Digital Scholarship in the Humanities, Vol. 38 No. 1, pp. 1-22, doi: 10.1093/llc/fqac037.

Berners-Lee, T. (2010), “Is your linked open data 5 star?”, Linked Data, available at: https://www.w3.org/DesignIssues/LinkedData.html (accessed 7 November 2023).

Bettivia, R.S. and Stainforth, E. (2023), “Negotiating digital public spaces: context, purpose and audiences”, Journal of Documentation, Vol. 79 No. 3, pp. 703-717, doi: 10.1108/JD-04-2022-0079.

Borgman, C.L. (2007), “Disciplines, documents, and data”, in Scholarship in the Digital Age, The MIT Press, pp. 179-226, doi: 10.7551/mitpress/7434.003.0011.

Borgman, C.L. (2015), Big Data, Little Data, No Data: Scholarship in the Networked World, MIT Press, Cambridge, MA.

Bowker, G.C. and Star, S.L. (1999), Sorting Things Out: Classification and Its Consequences, Inside Technology, The MIT Press, Cambridge, MA, available at: http://cognet.mit.edu/book/sorting-things-out (accessed 15 July 2020).

Bowker, G.C., Baker, K., Millerand, F. and Ribes, D. (2009), “Toward information infrastructure studies: ways of knowing in a networked environment”, in Hunsinger, J., Klastrup, L. and Allen, M. (Eds), International Handbook of Internet Research, Springer Netherlands, Dordrecht, pp. 97-117, doi: 10.1007/978-1-4020-9789-8_5 (accessed 11 November 2021).

Brinkmann, S. (2018), “The interview”, in Denzin, N.K. and Lincoln, Y.S. (Eds), The SAGE Handbook of Qualitative Research, 5th ed., SAGE, Los Angeles, pp. 576-599, (e-book).

Brugman, H., Reynaert, M., Van Der Sijs, N., Van Stipriaan, R., Sang, E.T.K. and Van Den Bosch, A. (2016), “Nederlab: towards a single portal and research environment for diachronic Dutch text corpora”, in Calzolari, N., Choukri, K., Declerck, T., Goggi, S., Grobelnik, M., Maegaard, B., Mariani, J. (Eds), et al., Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portoroz, pp. 1277-1281.

Butigan, T., Cassidy, K., Jeszke, N., Kondratjev, P., Nowak, A., Parkoła, T., Truyen, F., Žogla, A., Davies, R., Felmayer, L., Scholz, H. (2020), Landscape of National Aggregation in Europe and Establishment of Emerging National Aggregators, Europeana Common Culture, available at: https://pro.europeana.eu/files/Europeana_Professional/Projectpartner/EuropeanaCommonCultureProjectFiles/MS3%20Landscape%20of%20national%20aggregation%20in%20Europe%20Report.pdf (accessed 12 January 2022).

Candlin, F., Larkin, J., Ballatore, A. and Poulovassilis, A. (2020), Mapping Museums 1960-2020: A Report on the Data, University of London, London: Birkbeck, available at: https://museweb.dcs.bbk.ac.uk/static/pdf/MappingMuseumsReportMarch2020.pdf (accessed 28 February 2022).

Carroll, S.R., Garba, I., Figueroa-Rodríguez, O.L., Holbrook, J., Lovett, R., Materechera, S., Parsons, M., Raseroka, K., Rodriguez-Lonebear, D., Rowe, R., Sara, R., Walker, J.D., Anderson, J. and Hudson, M. (2020), “The CARE principles for indigenous data governance”, Data Science Journal, Vol. 19, p. 43, doi: 10.5334/dsj-2020-043.

Carroll, S.R., Herczog, E., Hudson, M., Russell, K. and Stall, S. (2021), “Operationalizing the CARE and FAIR principles for indigenous data futures”, Scientific Data, Vol. 8 No. 1, p. 108, doi: 10.1038/s41597-021-00892-0.

Caswell, M. (2016), “‘The archive’ is not an archives: acknowledging the intellectual contributions of archival studies”, Reconstruction, Vol. 16 No. 1, pp. 1-21.

Caswell, M. and Cifor, M. (2019), “Neither a beginning nor an end: applying an ethics of care to digital archival collections”, in Lewi, H., Smith, W., Lehn, D. vom and Cooke, S. (Eds), The Routledge International Handbook of New Digital Practices in Galleries, Libraries, Archives, Museums and Heritage Sites, 1st ed., Routledge, London, pp. 159-168.

Caswell, M. and Jules, B. (2017), “Integrating community archives into a national digital platform: challenges, opportunities, and recommendations”, A White Paper Reporting on the 2016-2017 “Diversifying the Digital Historical Record” Forums, available at: https://escholarship.org/uc/item/8r10h3tw (accessed 7 December 2021).

Collections Trust (2019), “Mapping digitised collections in England: final report”, Department for Digital, Culture, Media and Sport, available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/974903/Mapping_digitised_collections_in_England_-_scoping_report_V2.pdf (accessed 12 January 2021).

Cowan, T.L. and Rault, J. (2018), “Onlining queer acts: digital research ethics and caring for risky archives”, Women and Performance: A Journal of Feminist Theory, Vol. 28 No. 2, pp. 121-142, doi: 10.1080/0740770X.2018.1473985.

Culture Grid (2010), “About”, culture grid, available at: http://www.culturegrid.org.uk/about/ (accessed 17 February 2023).

Darcovich, J., Flynn, K. and Li, M. (2019), “Born of collaboration: the evolution of metadata standards in an aggregated environment”, VRA Bulletin, Vol. 45 No. 2, 5.

Davis, E. and Heravi, B. (2021), “Linked data and cultural heritage: a systematic review of participation, collaboration, and motivation”, Journal on Computing and Cultural Heritage, Vol. 14 No. 2, pp. 1-18, doi: 10.1145/3429458.

Deazley, R. (2017), “Copyright and digital cultural heritage: introduction”, available at: https://copyrightcortex.org/files/copyright101/1-CDCH-Introduction.pdf (accessed 2 June 2021).

Dobreva, M., Stefanov, K. and Ivanova, K. (2022), “Data spaces for cultural heritage: insights from GLAM innovation labs”, in Tseng, Y.-H., Katsurai, M. and Nguyen, H.N. (Eds), From Born-physical to Born-Virtual: Augmenting Intelligence in Digital Libraries, Lecture Notes in Computer Science, Springer International Publishing, Cham, Vol. 13636, pp. 492-500, doi: 10.1007/978-3-031-21756-2_41 (accessed 23 December 2022).

Dutia, K. and Stack, J. (2021), “Heritage connector: a machine learning framework for building linked open data from museum collections”, Applied AI Letters, Vol. 2 No. 2, doi: 10.1002/ail2.23 (accessed 27 September 2021).

Edgerton, D. (2006), The Shock of the Old: Technology and Global History since 1900, Profile Books, London.

Edwards, P.N., Jackson, S.J., Bowker, G.C. and Knobel, C.P. (2007), “Understanding infrastructure: dynamics, tensions, and design: report of a workshop on ‘history & theory of infrastructure: lessons for new scientific cyberinfrastructures’”, available at: https://deepblue.lib.umich.edu/bitstream/handle/2027.42/49353/UnderstandingInfrastr?sequence=3 (accessed 11 November 2021).

Estermann, B. (2018), “Development paths towards open government – an empirical analysis among heritage institutions”, Government Information Quarterly, Vol. 35 No. 4, pp. 599-612, doi: 10.1016/j.giq.2018.10.005.

European Commission (2010), “Competitiveness and innovation framework programme (CIP) ICT policy support programme: ICT PSP work programme 2010”, available at: https://web.archive.org/web/20190607220227/http://ec.europa.eu/cip/files/docs/ict_psp_wp2010_en.pdf (accessed 22 November 2021).

European Commission (2020), “Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions: a European strategy for data”, available at: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52020DC0066 (accessed 8 November 2023).

European Commission (2021), “Commission recommendation (EU) 2021/1970 of 10 November 2021 on a common european data space for cultural heritage”, available at: https://eur-lex.europa.eu/eli/reco/2021/1970/oj (accessed 23 December 2022).

European Commission (2022), “The deployment of a common European data space for cultural heritage”, available at: https://digital-strategy.ec.europa.eu/en/news/deployment-common-european-data-space-cultural-heritage (accessed 9 May 2023).

European Commission, Directorate-General for Research and Innovation, Brunet, P., De Luca, L., Hyvönen, E., Joffres, A., Plassmeyer, P., Pronk, M., Scopigno, R. and Sonkoly, G. (2022), Report on a European Collaborative Cloud for Cultural Heritage – Ex – Ante Impact Assessment, Publications Office of the European Union, doi: 10.2777/64014, available at: https://op.europa.eu/en/publication-detail/-/publication/90f1ee85-ca88-11ec-b6f4-01aa75ed71a1/language-en# (accessed 21 June 2024).

European Commission (2024), “The cultural heritage cloud – European commission”, available at: https://research-and-innovation.ec.europa.eu/research-area/social-sciences-and-humanities/cultural-heritage-and-cultural-and-creative-industries-ccis/cultural-heritage-cloud_en (accessed 21 June 2024).

Europeana (2010), “Europeana aggregators' handbook”, available at: https://admin.biodiversitylibrary.org/wiki-archive/mainSpace/files/Aggregators%20Handbook_Europeana.pdf (accessed 26 November 2021).

Freire, N., Robson, G., Howard, J.B., Manguinhas, H. and Isaac, A. (2020), “Cultural heritage metadata aggregation using web technologies: IIIF, Sitemaps and Schema.org”, International Journal on Digital Libraries, Vol. 21 No. 1, pp. 19-30, doi: 10.1007/s00799-018-0259-5.

Gosling, K. (2019), “Nothing new except what has been forgotten”, available at: https://collectionstrust.org.uk/blog/nothing-new-except-what-has-been-forgotten/ (accessed 6 March 2023).

Gosling, K., McKenna, G. and Cooper, A. (2022), “Digital collections audit”, Zenodo, doi: 10.5281/ZENODO.6379581, available at: https://zenodo.org/record/6379581 (accessed 23 March 2022).

Greenstein, D. (1996), “Serving the arts and humanities”, Ariadne, No. 4, available at: http://www.ariadne.ac.uk/issue/4/ahds/ (accessed 3 July 2024).

Halperin, J.R. (2017), “New York's Metropolitan museum of Art releases 375,000 digital works for remix and re-use online via CC0 – creative commons”, available at: https://creativecommons.org/2017/02/07/met-announcement/ (accessed 31 March 2023).

Hauswedell, T., Nyhan, J., Beals, M.H., Terras, M. and Bell, E. (2020), “Of global reach yet of situated contexts: an examination of the implicit and explicit selection criteria that shape digital archives of historical newspapers”, Archival Science, Vol. 20 No. 2, pp. 139-165, doi: 10.1007/s10502-020-09332-1.

Havens, L., Alex, B., Hosker, R., Bach, B. and Terras, M. (2023), Collaboration Across the Archival and Computational Sciences to Address Legacies of Gender Bias in Descriptive Metadata, Digital Humanities 2023, Graz, pp. 267–68, available at: https://www.pure.ed.ac.uk/ws/portalfiles/portal/364769281/DH2023_BookOfAbstracts_Havens.pdf (accessed 1 December 2023).

Higgins, C. (2023), “Politicians, not curators, are to blame for the British Museum's woes”, The Guardian, available at: https://www.theguardian.com/commentisfree/2023/sep/01/british-museum-curators-thefts-funding (accessed 15 November 2023).

Hill, A. (2002), “Bringing archives online through the archives Hub”, Journal of the Society of Archivists, Vol. 23 No. 2, pp. 239-248, doi: 10.1080/0037981022000006408.

Himmelsbach, E. (2021), “Collections trust – tapping the potential of museum collection data”, available at: https://theodi.org/article/collection-trust-tapping-the-potential-of-museum-collection-data/ (accessed 6 March 2023).

Historic Environment Scotland (2023a), “Welcome to Scran SCRAN”, available at: https://www.scran.ac.uk/ (accessed 27 February 2023).

Historic Environment Scotland (2023b), “About SCRAN”, available at: https://www.scran.ac.uk/info/aboutscran.php (accessed 27 February 2023).

Humbel, M. (2022), “Participatory action research for a digital humanities research project: investigating open GLAM in the context of social movement archives”, Proceedings of the Digital Humanities 2022, Tokyo, pp. 249-252, available at: https://dh2022.dhii.asia/dh2022bookofabsts.pdf.

Humbel, M. (2023), The Digitisation and Open Access Politics of Social Movement Archives, University College London, London, available at: https://discovery.ucl.ac.uk/id/eprint/10163973/ (accessed 22 May 2023).

ICA (2000), “ISAD(G), General international standard archival description”, available at: https://www.ica.org/sites/default/files/CBPS_2000_Guidelines_ISAD%28G%29_Second-edition_EN.pdf (accessed 31 March 2022).

IFLA (2023), “OAI-PMH”, available at: https://www.ifla.org/references/best-practice-for-national-bibliographic-agencies-in-a-digital-age/service-delivery/system-interfaces-and-search-protocols/oai-pmh/ (accessed 9 November 2023).

Jules, B. (2019), “Architecting sustainable futures: exploring funding models in community-based archives”, Shift, available at: https://shiftdesign.org/content/uploads/2019/02/ArchitectingSustainableFutures-2019-report.pdf (accessed 27 April 2021).

Keller, P. (2021), “Five things I know about data spaces”, available at: https://openfuture.eu/blog/five-things-i-know-about-data-spaces/ (accessed 23 December 2022).

King's Digital Lab (2024), “Arts and Humanities Data Service: enabling digital resources for the arts and humanities”, available at: https://ahds.ac.uk/ (accessed 3 July 2024).

Kizhner, I., Terras, M., Rumyantsev, M., Khokhlova, V., Demeshkova, E., Rudov, I. and Afanasieva, J. (2020), “Digital cultural colonialism: measuring bias in aggregated digitized content held in Google Arts and Culture”, in Digital Scholarship in the Humanities, doi: 10.1093/llc/fqaa055, available at: http://fdslive.oup.com/www.oup.com/pdf/production_in_progress.pdf (accessed 19 January 2021).

Kostelic, C. (2017), “Applying the levels of conceptual interoperability model to a digital library ecosystem – a case study”, available at: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85038813338&partnerID=40&md5=28ad429f49055a6ff36e4c6d85a87f66, October, pp. 62-72.

Kotarski, R., Kirby, J., Madden, F., Mitchell, L., Padfield, J., Page, R., Palmer, R. and Woodburn, M. (2022), “Persistent identifiers as IRO infrastructure: a Towards a National Collection foundation project final report”, Zenodo, doi: 10.5281/ZENODO.6359926, available at: https://zenodo.org/record/6359926 (accessed 9 June 2023).

MacDonald, I. (2023), “Counting when, who and how”, Journal of the History of Collections, Vol. 35 No. 2, pp. 305-320, doi: 10.1093/jhc/fhac034.

Mak, B. (2014), “Archaeology of a digitization”, Journal of the Association for Information Science and Technology, Vol. 65 No. 8, pp. 1515-1526, doi: 10.1002/asi.23061.

Maron, D. and Feinberg, M. (2018), “What does it mean to adopt a metadata standard? A case study of Omeka and the Dublin Core”, Journal of Documentation, Vol. 74 No. 4, pp. 674-691, doi: 10.1108/JD-06-2017-0095.

Martinez, M. and Terras, M. (2019), “‘Not adopted’: the UK orphan works licensing scheme and how the crisis of copyright in the cultural heritage sector restricts access to digital content”, Open Library of Humanities, Vol. 5 No. 1, doi: 10.16995/olh.335/, available at: https://www.research.ed.ac.uk/portal/files/94014521/MartinezEtal2019NotAdopted.pdf (accessed 19 May 2021).

McCarthy, D. and Wallace, A. (2018), “Survey of GLAM open access policy and practice”, available at: https://docs.google.com/spreadsheets/d/1WPS-KJptUJ-o8SXtg00llcxq0IKJu8eO6Ege_GrLaNc/edit#gid=1216556120 (accessed 27 February 2022).

McCarthy, D. and Wallace, A. (2020), “Open access to collections is a no-brainer – it's a clear-cut extension of any museum's mission”, Apollo Magazine, available at: https://www.apollo-magazine.com/open-access-images-museum-mission-open-glam/ (accessed 5 March 2021).

McGillivray, B., Alex, B., Ames, S., Armstrong, G., Beavan, D., Ciula, A., Colavizza, G., Cummings, J., De Roure, D., Farquhar, A., Hengchen, S., Lang, A., Loxley, J., Goudarouli, E., Nanni, F., Nini, A., Nyhan, J., Osborne, N., Poibeau, T., Ridge, M., Ranade, S., Smithies, J., Terras, M., Vlachidis, A. and Willcox, P. (2020), The Challenges and Prospects of the Intersection of Humanities and Data Science: A White Paper from The Alan Turing Institute, Figshare, available at: https://figshare.com/articles/online_resource/The_challenges_and_prospects_of_the_intersection_of_humanities_and_data_science_A_White_Paper_from_The_Alan_Turing_Institute/12732164/5 (accessed 7 April 2021).

McNeill, A. (2022), “Art UK: opening up access to the nation's art”, Zenodo, doi: 10.5281/ZENODO.6334193, available at: https://zenodo.org/record/6334193 (accessed 23 March 2022).

Miles, M.B., Huberman, A.M. and Saldaña, J. (2020), Qualitative Data Analysis: A Methods Sourcebook, 4th ed., SAGE, Los Angeles.

Moulaison Sandy, H. and Freeland, C. (2016), “The importance of interoperability: lessons from the Digital Public Library of America”, The International Information and Library Review, Vol. 48 No. 1, pp. 45-50, doi: 10.1080/10572317.2016.1146041.

Nauta, J.G., Van Den Heuvel, W., Teunisse, S. and DEN Foundation (2017), Europeana DSI 2–Access to Digital Resources of European Heritage: D4.4.Report on ENUMERATE Core Survey 4, Europeana, available at: https://pro.europeana.eu/files/Europeana_Professional/Projects/Project_list/ENUMERATE/deliverables/DSI-2_Deliverable%20D4.4_Europeana_Report%20on%20ENUMERATE%20Core%20Survey%204.pdf (accessed 11 February 2022).

Nyhan, J. and Flinn, A. (2016), “Computation and the humanities: towards an oral history of digital humanities”, Springer Series on Cultural Computing, Springer International Publishing, Cham, doi: 10.1007/978-3-319-20170-2 (accessed 15 December 2023).

Nyhan, J., Vlachidis, A., Flinn, A., Pearlman, N., Carine, M. and Hill, J. (2023), “Second report – Sloane Lab: looking back to build future shared collections”, Discovery Projects, Zenodo, London, doi: 10.5281/ZENODO.7995409, available at: https://zenodo.org/record/7995409 (accessed 9 June 2023).

ODI (2023), “Shaping future services and promoting productivity with cutting-edge expertise”, available at: https://www.theodi.org/project/data-innovation-for-uk-research-and-development/ (accessed 5 June 2023).

OpenGLAM Initiative (2023), “What”, available at: https://openglam.org/what/ (accessed 7 June 2023).

Ortolja-Baird, A. and Nyhan, J. (2022), “Encoding the haunting of an object catalogue: on the potential of digital technologies to perpetuate or subvert the silence and bias of the early-modern archive”, Digital Scholarship in the Humanities, Vol. 37 No. 3, pp. 844-867, doi: 10.1093/llc/fqab065.

Ortolja-Baird, A., Pickering, V., Nyhan, J., Sloan, K. and Fleming, M. (2019), “Digital humanities in the memory institution: the challenges of encoding Sir Hans Sloane's early modern catalogues of his collections”, Open Library of Humanities, Vol. 5 No. 1, p. 44, doi: 10.16995/olh.409.

Padfield, J., Bolland, C., Fitzgerald, N., McLaughlin, A., Robson, G. and Terras, M. (2022), “Practical applications of IIIF as a building block towards a digital national collection”, Zenodo, doi: 10.5281/ZENODO.6884885, available at: https://zenodo.org/record/6884885 (accessed 9 June 2023).

Padilla, T., Allen, L., Frost, H., Potvin, S., Russey Roke, E. and Varner, S. (2019), “Final report – always already computational: collections as data”, Zenodo, doi: 10.5281/zenodo.3152935, available at: https://zenodo.org/record/3152935 (accessed 22 March 2021).

Paltrinieri, C. (2021), “International benchmarking review: a Towards a National Collection report”, (Towards a National Collection Directorate Report), doi: 10.5281/ZENODO.5793173, available at: https://zenodo.org/record/5793173 (accessed 16 February 2022).

Paltrinieri, C. (2023), “Consolidation report: insights from Towards a National Collection foundation projects”, Zenodo, doi: 10.5281/ZENODO.7674816, available at: https://zenodo.org/record/7674816 (accessed 9 June 2023).

Pandey, R. and Kumar, V. (2020), “Exploring the impediments to digitization and digital preservation of cultural heritage resources: a selective review”, Preservation, Digital Technology and Culture, Vol. 49 No. 1, pp. 26-37, doi: 10.1515/pdtc-2020-0006.

Patton, M.Q. (2015), Qualitative Research & Evaluation Methods: Integrating Theory and Practice, 4th ed., SAGE Publications, Thousand Oaks, (e-book).

Pavis, M. and Wallace, A. (2019), “Response to the 2018 Sarr-Savoy report: statement on intellectual property rights and open access relevant to the digitization and restitution of African cultural heritage and associated materials”, doi: 10.5281/zenodo.2620596, available at: https://zenodo.org/record/2620596 (accessed 6 February 2020).

Pawlicka-Deger, U. (2021), “Infrastructuring digital humanities: on relational infrastructure and global reconfiguration of the field”, Digital Scholarship in the Humanities, Vol. 37 No. 2, pp. 534-550, doi: 10.1093/llc/fqab086.

Pekel, J. (2013), “Case study: Rijksmuseum releases 111.000 high quality images to the public domain”, available at: https://openglam.org/2013/02/27/case-study-rijksmuseum-releases-111-000-high-quality-images-to-the-public-domain/ (accessed 31 March 2023).

Peters, R. and Kalshoven, L. (2016), “What rights clearance looks like for cultural heritage organisations – 10 case studies”, (Europeana Factsheet), available at: https://pro.europeana.eu/files/Europeana_Professional/IPR/160331rights_clearance_case_studies_public.pdf (accessed 20 May 2021).

Petrus, A., Wildi, T. and Müller, S. (2023), “Preproject ‘Swiss virtual natural history collection’”, Database, Vol. 2023, pp. 1-9, doi: 10.1093/database/baad072.

Poikola, A., Verdonck, B. and Joosten, R. (Eds) (2023), DSSC Glossary, Data Spaces Support Centre (DSSC), available at: https://dssc.eu/space/Glossary/55443460/DSSC+Glossary+%7C+Version+1.0+%7C+March+2023?attachment=/rest/api/content/55443460/child/attachment/att110362680/download&type=application/pdf&filename=DSSC-Data-Spaces-Glossary-v1.0.pdf (accessed 8 November 2023).

Ponte, D. and Camussone, P.F. (2013), “Neither heroes nor chaos: the victory of VHS against Betamax”, International Journal of Actor-Network Theory and Technological Innovation, Vol. 5 No. 1, pp. 40-54, doi: 10.4018/jantti.2013010103.

Poole, N. (2015), “Guest blog: aggregation & the Culture Grid”, Museums Computer Group, available at: https://museumscomputergroup.org.uk/culture-grid/ (accessed 27 February 2023).

Preuss, U. (2016), “Sustainable digitalization of cultural heritage – report on initiatives and projects in Brandenburg, Germany”, Sustainability, Vol. 8 No. 9, 891, doi: 10.3390/su8090891.

Pringle, E., Mavin, H., Greenhalgh, T., Dalal-Clayton, A., Rutherford, A., Bramwell, J., Blackford, K. and Balukiewicz, K. (2022), “Provisional semantics: addressing the challenges of representing multiple perspectives within an evolving digitised national collection”, Zenodo, doi: 10.5281/ZENODO.7081347, available at: https://zenodo.org/record/7081347 (accessed 9 June 2023).

Punch, K. (2014), Introduction to Social Research: Quantitative & Qualitative Approaches, 3rd ed., SAGE, Los Angeles, CA.

Rees, G., Gadd, S., Horgan, J., Hunt, A., Isaksen, L., Morris, V., Musson, A., Simon, R., Strachan, P. and Vitale, V. (2022), “Locating a national collection (LaNC)”, Zenodo, doi: 10.5281/ZENODO.7071654, available at: https://zenodo.org/record/7071654 (accessed 9 June 2023).

Renshaw, C. and Liew, C.L. (2021), “Descriptive standards and collection management software for documentary heritage management: attitudes and experiences of information professionals”, Global Knowledge, Memory and Communication, Vol. 70 Nos 8/9, pp. 697-713, doi: 10.1108/GKMC-08-2020-0129.

Rockwell, G. (2010), “As transparent as infrastructure: on the research of cyberinfrastructure in the humanities”, in McGann, J. (Ed.), Online Humanities Scholarship: the Shape of Things to Come, Rice University, Houston, pp. 613-630, available at: http://cnx.org/content/m34315/1.2/ (accessed 16 March 2020).

SAA (2023), “International standard archival description (general)”, [ISAD(G)], available at: https://www2.archivists.org/groups/standards-committee/international-standard-archival-description-general-isadg (accessed 9 November 2023).

Saldaña, J. (2016), The Coding Manual for Qualitative Researchers, 3rd ed., SAGE, Los Angeles.

Sanderhoff, M. (2014), “This belongs to you: on openness and sharing at Statens Museum for Kunst”, in Sanderhoff, M. (Ed.), Sharing Is Caring: Openness and Sharing in the Cultural Heritage Sector, Statens Museum for Kunst, Copenhagen, pp. 20-131.

Sherratt, T. (2015), “On seams and edges: dreams of aggregation, access and discovery in a broken world”, in ALIA Information Online, Zenodo, Sydney, doi: 10.5281/ZENODO.3556475, available at: https://zenodo.org/record/3556475 (accessed 28 February 2023).

Shirazi, R. (2018), “Reproducing the academy: librarians and the question of service in the digital humanities”, in Sayers, J. (Ed.), Making Things and Drawing Boundaries: Experiments in the Digital Humanities, University of Minnesota Press, pp. 86-94, doi: 10.5749/j.ctt1pwt6wq (accessed 26 March 2022).

Siebinga, S., Manghi, P., Mieldijk, M. and Van Der Werf, T. (2012), “HOPE: mission, technical vision and high-level design of the architecture”, available at: http://www.peoplesheritage.eu/pdf/D2_1_Grant250549_HOPE_V2-0.pdf (accessed 29 November 2021).

Silva, G. M. da, Glória, A.C., Salgueiro, Â.S., Almeida, B., Monteiro, D., Freitas, M. R. de and Freire, N. (2022), “ROSSIO infrastructure: a digital humanities platform to explore the Portuguese cultural heritage”, Information, Vol. 13 No. 2, p. 50, doi: 10.3390/info13020050.

Sloan, K. and Nyhan, J. (2021), “Enlightenment architectures: the reconstruction of Sir Hans Sloane's cabinets of ‘Miscellanies’”, Journal of the History of Collections, Vol. 33 No. 2, pp. 199-218, doi: 10.1093/jhc/fhaa034.

Star, S.L. and Ruhleder, K. (1996), “Steps toward an ecology of infrastructure: design and access for large information spaces”, Information Systems Research, Vol. 7 No. 1, pp. 111-134, doi: 10.1287/isre.7.1.111.

Star, S.L. and Strauss, A. (1999), “Layers of silence, arenas of voice: the ecology of visible and invisible work”, Computer Supported Cooperative Work, Vol. 8 Nos 1-2, pp. 9-30, doi: 10.1023/A:1008651105359.

Stevenson, J. (2019), “Data processing and sustainability with a large-scale aggregator: the UK Archives Hub”, in Depoortere, R., Gheldof, T., Styven, D. and Van Der Eycken, J. (Eds), Trust and Understanding: The Value of Metadata in a Digitally Joined-Up World, pp. 93-103.

Stobo, V. (2016), “Risky business: copyright and making collections available online”, in Wallace, A. and Deazley, R. (Eds.), Display at Your Own Risk: An Experimental Exhibition of Digital Cultural Heritage, CREATe, Glasgow, pp. 281–87, available at: https://displayatyourownrisk.org/wp-content/uploads/2016/04/Display-At-Your-Own-Risk-Publication.pdf (accessed 2 June 2021).

TaNC (2023), “About us”, available at: https://www.nationalcollection.org.uk/about (accessed 7 December 2023).

Terras, M. (2011), “The rise of digitization”, (Educational Futures Rethinking Theory and Practice), in Rikowski, R. (Ed.), Digitisation Perspectives, SensePublishers, Rotterdam, Vol. 46, pp. 3-20, doi: 10.1007/978-94-6091-299-3_1 (accessed 5 July 2024).

Terras, M., Coleman, S., Drost, S., Elsden, C., Helgason, I., Lechelt, S., Osborne, N., Panneels, I., Pegado, B., Schafer, B., Smyth, M., Thornton, P. and Speed, C. (2021), “The value of mass-digitised cultural heritage content in creative contexts”, Big Data and Society, Vol. 8 No. 1, pp. 1-15, doi: 10.1177/20539517211006165.

Thomer, A.K. and Rayburn, A.J. (2023), “‘A patchwork of data systems’: quilting as an analytic lens and stabilizing practice for knowledge infrastructures”, Science, Technology and Human Values, Vol. 20 No. 10, pp. 1-30, doi: 10.1177/01622439231175535.

Thylstrup, N.B. (2018), The Politics of Mass Digitization, The MIT Press, Cambridge, MA.

Timms, K. (2009), “New partnerships for old sibling rivals: the development of integrated access systems for the holdings of archives, libraries, and museums”, Archivaria, Vol. 68, pp. 67-95.

Tzouganatou, A. (2021), “On complexity of GLAMs' digital ecosystem: APIs as change makers for opening up knowledge”, in Rauterberg, M. (Ed.), Culture and Computing. Design Thinking and Cultural Computing, Springer, Cham, Vol. 12795, pp. 348-359, Lecture Notes in Computer Science, doi: 10.1007/978-3-030-77431-8_22 (accessed 13 June 2023).

UKRI (2021), “Artificial intelligence supports culture and heritage exploration”, available at: https://www.ukri.org/news/artificial-intelligence-supports-culture-and-heritage-exploration/ (accessed 5 June 2023).

Van de Sompel, H. and Nelson, M.L. (2015), “Reminiscing about 15 years of interoperability efforts”, D-Lib Magazine, Vol. 21 Nos 11/12, doi: 10.1045/november2015-vandesompel, available at: http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html (accessed 21 March 2022).

Van Hooland, S. and Verborgh, R. (2014), Linked Data for Libraries, Archives and Museums: How to Clean, Link and Publish Your Metadata, Facet Publishing, London.

Van Strien, D., Beelen, K., Ardanuy, M., Hosseini, K., McGillivray, B. and Colavizza, G. (2020), “Assessing the impact of OCR quality on downstream NLP tasks”, Proceedings of the 12th International Conference on Agents and Artificial Intelligence, Valletta, Malta, SCITEPRESS - Science and Technology Publications, pp. 484-496, doi: 10.5220/0009169004840496 (accessed 30 June 2022).

Wallace, A. (2020a), “Words mean things (A glossary)”, Open GLAM, doi: 10.21428/74d826b1.51566976, available at: https://openglam.pubpub.org/pub/the-glossary (accessed 27 October 2021).

Wallace, A. (2020b), “Copyright”, Critical Open GLAM: Towards [Appropriate] Open Access for Cultural Heritage, doi: 10.21428/74d826b1.556f5733, available at: https://openglam.pubpub.org/pub/background-copyright (accessed 3 November 2020).

Wallace, A. (2020c), “Introduction”, Critical Open GLAM: Towards [Appropriate] Open Access for Cultural Heritage, doi: 10.21428/74d826b1.be9df175, available at: https://openglam.pubpub.org/pub/introduction-to-critical-open-glam (accessed 29 October 2020).

Wallace, A. (2021), “Decolonization and Indigenization”, Critical Open GLAM: Towards [Appropriate] Open Access for Cultural Heritage, available at: https://openglam.pubpub.org/pub/decolonization (accessed 5 January 2023).

Wallace, A. (2022), “A culture of copyright: a scoping study on open access to digital cultural heritage collections in the UK”, Zenodo, doi: 10.5281/ZENODO.6242611, available at: https://zenodo.org/record/6242611 (accessed 23 March 2022).

Wernimont, J. (2021), “Listening, care, and collections as data”, Journal of Critical Digital Librarianship, Vol. 1 No. 1, pp. 23-42, doi: 10.31390/jcdl.1.1.04.

Winters, J., Stack, J., Dutia, K., Unwin, J., Lewis, R., Palmer, R. and Wolff, A. (2022), “Heritage connector: a Towards a National Collection foundation project final report”, Zenodo, doi: 10.5281/ZENODO.6022678, available at: https://zenodo.org/record/6022678 (accessed 9 June 2023).

Zaagsma, G. (2023), “Digital history and the politics of digitization”, Digital Scholarship in the Humanities, Vol. 38 No. 2, pp. 830-851, doi: 10.1093/llc/fqac050.

Ziegler, S.L. (2020), “Open data in cultural heritage institutions: can we be better than data brokers?”, Digital Humanities Quarterly, Vol. 14 No. 2, available at: http://www.digitalhumanities.org/dhq/vol/14/2/000462/000462.html (accessed 11 September 2020).

Zoller, G. and DeMarsh, K. (2013), “For the record: museum cataloging from a library and information science perspective”, Art Documentation: Journal of the Art Libraries Society of North America, Vol. 32 No. 1, pp. 54-70, doi: 10.1086/669989.

Acknowledgements

This research was part of “The Sloane Lab: Looking back to build future shared collections”, a collaborative project led by UCL and the Technische Universität Darmstadt with the Natural History Museum (London) and the British Museum. The project was supported by the Towards a National Collection programme, which is funded by the UK Arts and Humanities Research Council (AHRC project [AH/W003457/1]). We wish to thank all interviewees for their invaluable contributions. We also wish to thank the anonymous peer reviewers for their helpful feedback.

Corresponding author

Marco Humbel can be contacted at: marco.humbel.17@ucl.ac.uk
