Decision making in text digitisation

TÓSZEGI Zsuzsanna

Könyvtári Figyelő (Library Review) vol. 16. (52.) 2006. no. 2. pp. 245–260.

Librarians involved in digitisation projects need to make many important decisions during the digitisation process. This study aims to support decision making by presenting the most important aspects to be considered when digitising printed texts. Digital texts can be born digital or digitised, and texts can be digitised at three different levels: reproductive, representative or interpretative. At the reproductive level, the digitised version of the text does not offer extended features compared to the original text. At the representative level, the digitised version offers additional functions, e.g. it can be searched. At the interpretative level, a higher-quality hypertext is produced, generally enhanced with cross-references. When planning a digitisation project, a number of questions need to be clarified. What is the purpose of digitisation (preservation, archiving or accessibility)? Which user group is it intended for (private use, institutional access, or the general public)? In the context of intellectual property rights, digitisation constitutes reproduction, so it needs to be authorised by the author or the copyright holder. The project can only start once these issues have been clarified. The study presents the most frequently applied scanning solutions. If the original document is available in multiple copies, separated pages can be scanned. Contact scanning is a rapid, cost-effective solution that achieves reasonable quality. When the original document needs a high degree of protection, special and more costly scanners must be used. The typographical, structural and semantic features of the work need to be examined thoroughly in order to decide on the format to be used (HTML, SGML or XML). As digitisation technologies develop rapidly, it must also be considered whether the digitised material will be used in the short or the long term.
The advantages, disadvantages and costs of the above technical options are summarised in a table.

