The Archives Hub uses the Collections Information Integration Middleware or CIIM (pronounced “sim”) provided by Knowledge Integration.
This is a modular suite of software which sits between the archive descriptions and the website (or other end points). It uses Elasticsearch, a search engine based on Lucene, to index the Hub's large volume of complex, hierarchically structured descriptions.
The interface for the application was designed by Gooii. The static site is developed and maintained in-house.
We provide access to the Archives Hub through OAI-PMH and also through the Elasticsearch API.
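As a rough illustration of how an OAI-PMH harvest request is formed, the sketch below builds a `ListRecords` URL. The base URL and metadata prefix are hypothetical placeholders, not the Hub's actual endpoint:

```python
from urllib.parse import urlencode

# Hypothetical base URL for illustration only; the real endpoint may differ.
OAI_BASE = "https://archiveshub.example.org/oai"

def build_oai_request(verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    params = {"verb": verb, **arguments}
    return f"{OAI_BASE}?{urlencode(params)}"

# Harvest records updated since a given date.
# ("from" is a Python keyword, so it is passed via a dict.)
url = build_oai_request("ListRecords",
                        metadataPrefix="ead",
                        **{"from": "2024-01-01"})
print(url)
```

A harvester would then fetch this URL and page through results using the `resumptionToken` the protocol returns.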
We have an administration interface for the CIIM, which we use to manage the ingest and processing of descriptions.
We process descriptions in order to create a store of aggregated content that is structured and potentially re-usable. We use the General International Standard Archival Description, or ISAD(G), but we also recognise its shortcomings for the current online world. Index terms follow recognised rules or recognised sources (e.g. NCA Rules, UKAT).
The format used to ingest descriptions is Encoded Archival Description (EAD). The descriptions are stored in JSON. Descriptions may be at collection level or they may be multi-level, down to individual item. It is the responsibility of the Hub contributors to create and submit descriptions for inclusion on the Hub.
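The EAD-to-JSON step described above can be sketched as follows. The EAD fragment is heavily simplified (namespaces and most elements omitted), and the JSON field names are illustrative assumptions rather than the Hub's actual schema:

```python
import json
import xml.etree.ElementTree as ET

# A heavily simplified EAD fragment for illustration.
EAD_SAMPLE = """
<ead>
  <archdesc level="collection">
    <did>
      <unitid>GB 123 ABC</unitid>
      <unittitle>Papers of an Example Collector</unittitle>
    </did>
  </archdesc>
</ead>
"""

def ead_to_json(xml_text):
    """Convert the key fields of a simple EAD description to a JSON string."""
    root = ET.fromstring(xml_text)
    archdesc = root.find("archdesc")
    did = archdesc.find("did")
    record = {
        "level": archdesc.get("level"),          # collection, series, item...
        "identifier": did.findtext("unitid"),
        "title": did.findtext("unittitle"),
    }
    return json.dumps(record)

print(ead_to_json(EAD_SAMPLE))
```

Real EAD is far richer than this, with nested components (`<c>` or `<c01>`–`<c12>`) carrying the multi-level structure down to item level.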
The Hub Model
Our model is based upon entities and the relationships between them. Archive descriptions, themed collection descriptions, repositories, people and organisations are all entities that can be described in their own right, and they all relate to each other. This kind of model is extensible, allowing for new entities, such as place names, to be introduced.
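An entity-and-relationship model of this kind can be sketched minimally as below. The class and field names are assumptions for illustration, not the Hub's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    """Anything describable in its own right: a description, repository, person..."""
    id: str
    name: str
    related: list = field(default_factory=list)  # ids of related entities

    def relate(self, other):
        """Record a two-way relationship between two entities."""
        self.related.append(other.id)
        other.related.append(self.id)

# New entity types (e.g. places) are added simply by subclassing.
@dataclass
class Repository(Entity):
    pass

@dataclass
class ArchiveDescription(Entity):
    pass

repo = Repository(id="repo-1", name="Example Repository")
desc = ArchiveDescription(id="desc-1", name="Example Collection")
desc.relate(repo)
print(desc.related, repo.related)
```

The point of the design is that relationships are first-class: a person entity can link to every collection they created, and a new entity type slots in without reworking the rest of the model.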
The Data Flow
Descriptions go through a set of 'pipelines' designed to standardise them to Hub requirements. For example, pipelines create a standardised header-level identifier, add ISO-standard language and date information, correct links to external locations, and ensure consistency of repository names. We assess each contributor's data in order to set up the appropriate pipelines.
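A pipeline of this kind can be sketched as a sequence of functions, each taking a description and returning a corrected copy. The field names and normalisation rules here are illustrative assumptions:

```python
# Each step takes a description dict and returns a corrected copy.

def add_iso_language(desc):
    """Add an ISO 639 language code derived from the language name."""
    mapping = {"English": "eng", "French": "fre"}  # tiny illustrative table
    desc = dict(desc)
    desc["languageCode"] = mapping.get(desc.get("language"), "und")
    return desc

def normalise_repository(desc):
    """Trim stray whitespace from the repository name for consistency."""
    desc = dict(desc)
    desc["repository"] = desc["repository"].strip()
    return desc

def run_pipelines(desc, pipelines):
    """Apply each pipeline step to the description in turn."""
    for step in pipelines:
        desc = step(desc)
    return desc

raw = {"language": "English", "repository": "  Example Repository "}
processed = run_pipelines(raw, [add_iso_language, normalise_repository])
print(processed)
```

Because the steps are independent, a per-contributor configuration is just a different list of steps, which matches the idea of assessing each contributor's data before setting up their pipelines.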