Using Open Source Systems for Digital Libraries

Jon P. Knight (Pilkington Library, Loughborough University, Loughborough, UK)

Program: electronic library and information systems

ISSN: 0033-0337

Article publication date: 1 March 2005

Citation

Knight, J.P. (2005), "Using Open Source Systems for Digital Libraries", Program: electronic library and information systems, Vol. 39 No. 1, pp. 89-90. https://doi.org/10.1108/00330330510578903

Publisher: Emerald Group Publishing Limited

Copyright © 2005, Emerald Group Publishing Limited


For many institutions, the idea of a digital library is very attractive. Unfortunately, it is often a very expensive idea: not only does it require hardware and staff time to set up and manage, but many of the commercial software packages that are used have large price tags attached. This book looks at one avenue available to help reduce this cost: the use of Open Source Software (OSS). This software is distributed with the full source code required to build it, as opposed to most commercial closed source packages that are distributed in binary form only. OSS packages not only have much lower price tags (often zero!) but also allow digital library designers to modify the code for their own needs and even contribute these changes back to the community.

The book starts with a preface and introduction in which the author outlines his digital library experiences, describes the ideas behind OSS and looks at a variety of OSS licences. This is followed by a chapter that looks at the variety of digital objects, their life‐cycles, media and file formats and introduces the eXtensible Markup Language (XML). XML is heavily used in modern digital library and web‐based systems and is a recurring theme throughout the book. The next chapter looks at protocols such as HTTP and Z39.50 and the OSS packages that implement them.

A digital library is only as good as the objects it contains, so there is a chapter on the OSS authoring tools available. This includes graphics toolkits, XML editors and general‐purpose office suites. The importance of XML content in providing a basis for digital libraries is underlined by the next chapter on the application of XML transformations and filters. The author explains how these can be used to allow a single XML file to generate a number of different output files, thus allowing a master file to be used to service a number of different users with different needs.
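A minimal sketch of this single-source idea is given below. It is not taken from the book, which covers XML transformations and filters in general; here Python's standard `xml.etree.ElementTree` module stands in for a dedicated transformation tool. The record structure, its element names (`record`, `title`, `creator`, `date`) and the function names are all hypothetical, chosen only for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical master record; the element names are assumptions.
MASTER_XML = """<record>
  <title>Sample Report</title>
  <creator>A. Author</creator>
  <date>2005</date>
</record>"""


def to_html(record):
    """Render the record as an HTML fragment, e.g. for a web page."""
    title = html.escape(record.findtext("title", default=""))
    creator = html.escape(record.findtext("creator", default=""))
    date = html.escape(record.findtext("date", default=""))
    return f"<div><h1>{title}</h1><p>{creator} ({date})</p></div>"


def to_text(record):
    """Render the record as one plain-text line, e.g. for a printed list."""
    title = record.findtext("title", default="")
    creator = record.findtext("creator", default="")
    date = record.findtext("date", default="")
    return f"{title} / {creator}, {date}"


def render_all(xml_source):
    """Parse the master file once and produce every output format from it."""
    record = ET.fromstring(xml_source)
    return {"html": to_html(record), "text": to_text(record)}


outputs = render_all(MASTER_XML)
```

The master file is parsed once and each output is derived from the same parsed record, so a correction made to the master is picked up by every rendering the next time the outputs are generated.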

Databases form a large part of the software needs of a digital library, and so there are two chapters looking at two of the most popular types: relational databases and XML/object databases. These databases are widely used, both within and without the library community, and have a large user‐base. Software packages specific to digital libraries obviously have much smaller user groups, but they are equally important and thus have a chapter to themselves.

It is rare that a single software package can be used to handle all the needs of a full‐blown digital library, and so the next two chapters look at two different aspects of gluing systems together. The first of these covers the scripting languages Perl, Python and PHP, and looks at the regular expression mechanism that sits at the heart of all of them. The second chapter looks at how digital libraries can link with each other using either simple URLs (the so‐called “REST” loosely coupled option) or using the Web Services which are currently being heavily hyped for developing e‐commerce systems. It also notes how digital libraries can “hedge their bets” with OSS packages and support both options – a wise move considering that the jury is still definitely out on which one (if any) will be the “One True Way”.
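To give a flavour of the regular expression mechanism shared by these scripting languages, the short Python sketch below pulls the volume, issue and page range out of a citation string. It is an illustration only and not an example from the book; the pattern and the `parse_citation` helper are assumptions that fit this one citation style.

```python
import re

# Citation string in the style used by the journal (assumed input format).
CITATION = (
    'Knight, J.P. (2005), "Using Open Source Systems for Digital Libraries", '
    "Program: electronic library and information systems, "
    "Vol. 39 No. 1, pp. 89-90."
)

# Named groups capture each numeric field; the pattern is an assumption
# and would need extending to cope with other citation styles.
PATTERN = re.compile(
    r"Vol\.\s*(?P<volume>\d+)\s+No\.\s*(?P<issue>\d+),"
    r"\s*pp\.\s*(?P<first_page>\d+)-(?P<last_page>\d+)"
)


def parse_citation(text):
    """Return the captured fields as a dict of strings, or None if no match."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


parsed = parse_citation(CITATION)
```

Much the same pattern syntax is accepted by Perl and PHP, which is part of what makes these languages interchangeable as glue between packages.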

The last chapter looks at the maintenance aspects of a digital library (including preservation and archiving of digital content and systems) and how to build a user community around it. The building of digital library communities needs to be done in much the same way as a physical library attempts to cater for groups of users visiting the building and working together. The book then has an appendix that contains the definition of OSS from www.opensource.org and the “Notes”, which really double as the citations for each of the preceding chapters. The book is rounded off by a glossary, some suggested further reading/resources and finally a useful index.

Now at this point I must admit two things. First, I'm a great fan of OSS, having used many packages in my day‐to‐day work and also written the odd digital library related OSS system myself. So I'll admit to a certain bias in favour of OSS in digital libraries. Second, I had expected this book to be mostly a list of OSS packages that could be used (or bent into being usable) in digital libraries, which would date the book quickly.

On the second aspect, there are indeed suggestions for OSS packages that will prove useful in digital libraries. However, the book is far more than just a catalogue of OSS suites. Instead it gives a basic grounding in many of the underlying problems and concepts in digital libraries and looks at how a variety of existing packages can be applied to help solve them. This exposition on common issues means that the book will find a wider audience than simply techies looking to apply OSS packages to digital library projects. The LIS student and professional will also find it a useful introduction to digital libraries. Indeed, as the technical content is not terribly deep, the techies are not likely to be the primary audience, and the less technical audience may find it a useful introduction to the basic technical framework that underlies so many systems, both OSS and commercial.

One of the most striking things that one comes away with after reading this book is the importance of XML as an underlying and unifying mechanism in digital libraries. This almost seems more important than the particular OSS packages mentioned: the whole point of XML is that it does not tie library developers into a single software package for evermore. If this were the only thing that this book got across to library managers, planners and developers, its cover price would be well worth it. As it offers much more useful information, it has to be a recommended read for such LIS professionals.
