I have just completed a major digitisation project: the complete run of the Journal of the Cork Historical and Archaeological Society from 1892 to 2007, 115 volumes in all. The journal proper consists of over 22,000 pages, with over 3,000 pages of additional material as well. A table of contents database also had to be created; it consists of 5,502 records, each of which is a bibliographical item linked to a PDF file. PHP in conjunction with MySQL is used to provide a simple search interface.
These figures conceal the processes involved in going from page to screen. My aim was to ensure that users could browse by volume (displaying the contents of each issue or volume together, as if one were looking at a table of contents) or search by title keyword or author name. To this end, the articles, and thus the PDFs, are numbered sequentially within each volume so that they appear in the order in which they were published. All files are searchable, as optical character recognition was run on each page image. The final file sizes were reduced significantly by examining how the images are stored at each stage of the workflow. Each PDF file opens with a ‘front page’ that gives the bibliographical details of the item together with links to the terms and conditions. Finally, every page is watermarked with information about the source of the file.
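To make the browse and search behaviour concrete, here is a minimal sketch of how a table of contents database with per-volume sequence numbers might work. It is an illustration only, not the code behind the site: the live interface uses PHP with MySQL, whereas this sketch substitutes Python's built-in sqlite3 module so that it runs on its own. The table name `toc`, its columns, the `vol001_001.pdf` filename pattern and the sample records are all assumptions invented for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical table of contents schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE toc (
               id       INTEGER PRIMARY KEY,
               volume   INTEGER NOT NULL,
               seq      INTEGER NOT NULL,  -- position within the volume
               author   TEXT NOT NULL,
               title    TEXT NOT NULL,
               pdf_file TEXT NOT NULL,
               UNIQUE (volume, seq)
           )"""
    )
    return conn


def pdf_name(volume, seq):
    """Assumed naming scheme: zero-padded so filenames sort in published order."""
    return f"vol{volume:03d}_{seq:03d}.pdf"


def add_item(conn, volume, author, title):
    """Insert an item with the next sequence number in its volume; return that number."""
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) FROM toc WHERE volume = ?", (volume,)
    ).fetchone()
    seq = last + 1
    conn.execute(
        "INSERT INTO toc (volume, seq, author, title, pdf_file) VALUES (?, ?, ?, ?, ?)",
        (volume, seq, author, title, pdf_name(volume, seq)),
    )
    return seq


def browse_volume(conn, volume):
    """List one volume's items in published order, like a printed contents page."""
    return conn.execute(
        "SELECT seq, author, title, pdf_file FROM toc WHERE volume = ? ORDER BY seq",
        (volume,),
    ).fetchall()


def search(conn, term):
    """Find items whose title or author contains the term (parameterised LIKE)."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT volume, seq, author, title, pdf_file FROM toc "
        "WHERE title LIKE ? OR author LIKE ? ORDER BY volume, seq",
        (pattern, pattern),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Placeholder records, not real journal entries.
    add_item(conn, 1, "Author A", "Placeholder notes on a castle")
    add_item(conn, 1, "Author B", "Placeholder survey of an abbey")
    add_item(conn, 2, "Author A", "Another placeholder article")
    for row in browse_volume(conn, 1):
        print(row)
    for row in search(conn, "castle"):
        print(row)
```

Storing the sequence number alongside the volume means a single `ORDER BY` reproduces the printed order when browsing, and the same number can be reused in the PDF filename so that the files sort correctly on disk as well.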
No journals were disbound in the process, so they are preserved for future use, although this decision carried an overhead in terms of time. A range of software was employed, most of which was open source or freely available.
The end result is that, thanks to the foresight of the Cork Historical and Archaeological Society, an important Irish journal is available online for all. In addition, I have donated the index database, which is now also searchable online. This makes the Digital JCHAS webpage a powerful tool when researching the history and archaeology of the south of Ireland, particularly Cork county. I have contributed more information about Digital JCHAS and the Indexes to the CHAS blog.