Friday, November 2, 2007

Analysis: DSpace for use in an Archives

For my software evaluation, I've chosen to assess whether DSpace could be used in an archives organization. My conclusion is somewhat wordy, but I thought some background information would be valuable. DSpace is a digital "archiving" repository (the quotes will be significant later) whose primary purposes are to preserve digital content and to make that content accessible, primarily over the internet.

Unfortunately, I've found a few snags with using DSpace in an archives specifically. Although it's often called a digital archives tool, it's actually meant for published works; it's only archival in the sense of "self-archiving." As a result, it's missing a number of features that would be important for an archives by the professional definition, such as the ability to restrict specific records. (The only proposed solution to this I've seen is to create separate public and private databases, which seems inconvenient, and which would prevent storing all documents from a series in the same database.) Similarly, it doesn't appear to support RAD notation out of the box, although this can be added with some work. DSpace's public presentation, at least, is designed to be customizable. More generally, because DSpace is open-source, even features that don't exist at all, such as restricting files, could be added. Doing so would add significantly to the cost of adoption, however, and it's difficult to estimate by how much.

Apart from that, its cost is unlikely to be significant. It is designed for Unix-like operating systems, and Linux, the most popular Unix-like, is very common as a server operating system; similarly, the other software on which DSpace is built, such as Apache, is also common and well-documented. An archives' IT staff is likely to already be supporting Apache on Linux or a BSD, which minimizes the cost of adding a new server with the same base software. The lack of commercial support for DSpace complicates this somewhat, however, since IT staff may need to be trained on it.

DSpace's preservation support is likely its most valuable feature; I won't go into it in detail here, but put simply, it can perform automatic migration of data formats provided that the data is in a standard format. This alone would make DSpace adoption worthwhile. Its other primary feature is its public presentation of data. Archives websites are something of a wild west at the moment, and public availability of archival holdings is spotty at best. DSpace could make it possible for archives to make the majority of their holdings publicly available. These strengths make it worthwhile to do a more thorough analysis of what it would cost to add the missing features described above; if that cost fits within the budget, adopting DSpace could be a major benefit to an archives.