Technology

  • Restoring illegible manuscripts
  • Two techniques pioneered in the Electronic Beowulf project are essential for recovering and restoring large sections of text in Alfred’s Boethius. The first is digital ultraviolet imaging. Conventional ultraviolet photography requires long exposure times that can damage a manuscript, not to mention the photographer. We discovered while digitizing the Beowulf manuscript that a digital camera could safely record fluorescent effects in the seconds it took to digitize in normal bright light. The saved images carry an obscuring artifact of the process, a dark overlay on the image, which can be removed through a straightforward image-processing routine. The second technique employs fiber-optic backlighting and a digital camera to reveal and record letters and parts of letters hidden by the nineteenth-century restoration frames of the Cotton collection. The Electronic Beowulf won an award in 1994-1995 for Innovation in Information Technology for developing this process. Although it works with all Cottonian manuscripts whose binding frames cover parts of the text, fiber-optic backlighting is not as effective with Cotton Otho A. vi as it is with Beowulf. The bright backlighting reveals where text is covered, but ultraviolet cannot penetrate the paper frames to restore or enhance the covered text.

  • Integrating XML technologies
  • Still under revision by the Committee on Scholarly Editions (CSE), the Modern Language Association’s (MLA) “Guidelines for Electronic Scholarly Editions” promote the use of such encoding norms as Standard Generalized Markup Language (SGML), in particular its application to the markup of electronic texts in the humanities by the Text Encoding Initiative (TEI). Although now converted to XML, the TEI Guidelines for Electronic Text Encoding and Interchange (P4) are not designed for describing the physical features of a document, which is indispensable for encoding a damaged medieval manuscript in an image-based electronic edition. We take advantage of what TEI has accomplished, but our requirements go beyond what TEI has addressed. For example, the physical damage to the manuscript requires complex markup that leads to conflicting hierarchies of descriptive tags, in which elements overlap rather than nest. To accommodate our markup requirements we are therefore designing software to deal with these conflicts automatically. We encode in XML all textual components of Alfred’s Boethius, including the transcription, the edited text, the glossaries, and the apparatus, and will use Extensible Stylesheet Language Transformations (XSLT) to provide multiple viewing options. A native XML database with a powerful XPath processor will store the encoded documents and extensively linked images. Through a user-friendly interface, it will support comprehensive searches of all content, from the glossary to the manuscript transcript with its linked images to the full editorial apparatus.

  • Developing and customizing open source software
  • Commercial SGML/XML-aware software is not only prohibitively expensive, but also limited to specific purposes. Most textual scholars in the humanities must accordingly rely on open source software, non-proprietary software for which the source code is freely available. Open source software, however, also requires the support of applications programmers and interface designers to adapt it for changing purposes and to maintain it over time. To address these perennial barriers, we are designing software that is extensible and adaptable without the need for new programming. We are using a modular Java and XML software framework, including an edition production management system, a native XML database, configurable interfaces, and a suite of reusable editorial tools, customized to the needs of textual scholars in the humanities. This extensible toolkit will facilitate the efficient assembly of complex scholarly editions from high-resolution digital facsimiles and XML-encoded texts, apparatus, and ancillary materials. The Edition Production Toolkit (EPT), now in development as Edition Production Technology under the auspices of The ARCHway Project: Architecture for Research in Computing for Humanities through Research, Teaching, and Learning, will be refined as we build the electronic edition of Alfred’s Boethius. When completed, EPT will be freely available to scholars in the humanities for other projects.

     

    [Adaptable Search Window GUI through an XML file]

     

  • Documenting preservation metadata
  • Since 1996, when the Research Libraries Group (RLG) issued Preserving Digital Information: Report of the Task Force on Archiving of Digital Information, progress has been made toward establishing standards for ensuring long-term preservation of and access to digital resources. One of the new standards most widely embraced by the library and archival community is the Metadata Encoding and Transmission Standard (METS), an initiative of the Digital Library Federation. METS allows the creator of a digital library (in our case, a digital edition) to combine different types of metadata in one document, using the XML schema language. Following the endorsement of METS, the Electronic Boethius Project will also use the Metadata Object Description Schema (MODS) for descriptive metadata and NISO's Metadata for Images in XML Schema (MIX) for technical image metadata. For technical text metadata we expect to use "Schema textmd.xsd," the Schema for Technical Metadata for Text under development at NYU. The METS document for Electronic Boethius will also include a Behavior Section describing the computer programs required to run the edition and a Structural Map outlining the hierarchical structure of the edition. This project will continue to research emerging options for the descriptive, administrative, and structural metadata required for long-term preservation of the electronic edition.
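
As a rough illustration of how the pieces named above fit into one METS document, the following Python sketch builds an empty skeleton with a descriptive section for MODS, technical sections for image and text metadata, a Structural Map, and a Behavior Section. It is not the project's actual METS file. The function name, the IDs, the labels, and the five book divisions (borrowed from the five-book structure of Cotton Otho A. vi) are assumptions made for the example, and the MDTYPE values should be checked against the version of the METS schema in use.

```python
import xml.etree.ElementTree as ET

# Namespace published for METS by the Library of Congress.
METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified name of a METS element."""
    return "{%s}%s" % (METS_NS, tag)


def build_mets_skeleton(label, book_count=5):
    """Build an empty METS skeleton (hypothetical IDs and labels)."""
    root = ET.Element(q("mets"), {"LABEL": label})

    # Descriptive metadata: a wrapper where a MODS record would go.
    dmd = ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, q("mdWrap"), {"MDTYPE": "MODS"})

    # Administrative metadata: technical metadata for images and text.
    # The MDTYPE values are assumptions to verify against the schema.
    amd = ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})
    for tech_id, md_type in (("TECH-IMG", "NISOIMG"), ("TECH-TXT", "TEXTMD")):
        tech = ET.SubElement(amd, q("techMD"), {"ID": tech_id})
        ET.SubElement(tech, q("mdWrap"), {"MDTYPE": md_type})

    # Structural Map: one division per book of the edition.
    struct = ET.SubElement(root, q("structMap"), {"TYPE": "logical"})
    edition = ET.SubElement(struct, q("div"), {"TYPE": "edition", "LABEL": label})
    for n in range(1, book_count + 1):
        ET.SubElement(edition, q("div"), {"TYPE": "book", "ORDER": str(n)})

    # Behavior Section: placeholder for the programs that run the edition.
    ET.SubElement(root, q("behaviorSec"), {"ID": "BEH1"})
    return root


if __name__ == "__main__":
    mets = build_mets_skeleton("Electronic Boethius")
    ET.indent(mets)  # pretty-printing helper, available from Python 3.9
    print(ET.tostring(mets, encoding="unicode"))
```

A real METS document would also need a file section pointing to the image and text files, and the skeleton would have to be validated against the METS schema before use.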



Editing Methods and Tools

There is as yet no modern edition of the distinct version of Alfred's Boethius preserved in the earliest extant manuscript, British Library MS Cotton Otho A. vi. Following its Latin source, this mid-tenth-century manuscript, which was severely damaged by fire in 1731, presents the text in a five-book structure with alternating prose and verse. The other surviving version, the twelfth-century Oxford, Bodleian Library, Bodley 180 manuscript, is arranged in forty-two chapters, all in prose. Before the fire Francis Junius collated Cotton Otho A. vi with his own transcript of Bodley 180. In view of the great difference between the verse sections of Otho A. vi and their corresponding prose renditions in MS Bodley 180, which made a simple collation impossible, Junius providentially chose to copy all of the verse parts in their entirety. Today Junius's transcript, MS Junius 12, is the only source for those verse passages in Cotton Otho A. vi that the fire later destroyed.

In the standard modern edition of King Alfred’s Old English Version of Boethius “De Consolatione Philosophiae” (Oxford 1899), W. J. Sedgefield used as much of the prose of Cotton Otho A. vi as he was able to see through the fire damage, but he replaced the “meters” (as they are called) with the prose versions of Bodley 180 and also followed its forty-two chapter structure, instead of the five-book structure of the earlier manuscript. Other modern editions present the meters all by themselves, extracted from their original context. Founding the text on digitally enhanced ultraviolet images of the burnt Cotton manuscript, supplemented where necessary by Junius's collations and transcripts of the meters, the Electronic Boethius will thus constitute the first edition of the earliest manuscript.

  • To begin to create a feasible copy-text that actually represents the prose-and-verse version, the editor reconstructed a Cottonian text by removing the prose versions of the meters from Sedgefield's edition and replacing them with the original verse versions, edited by George Philip Krapp as The Meters of Boethius (ASPR 5, 1932). Both editions are available in the Dictionary of Old English Corpus in Electronic Form and were used for this purpose with permission of the editor of the DOEC.
  • Next, a preliminary collation against scanned preservation-quality microfilm of Cotton Otho A. vi furnished a copy-text more closely resembling the earliest manuscript by tagging in XML all folio and folio-line divisions, as well as the book divisions and the prose and verse boundaries of the manuscript. This XML encoding distinguishes what is lost from what survives, based on these professionally prepared photographs. The encoding also records how much of the restored text depends on the Sedgefield and Krapp editions, which in turn depend on the collations and verse transcripts of MS Junius 12.
  • From this lightly encoded copy-text a specially designed Glossary Tool generates an exhaustive wordlist with folio-line and edition-line locations, as well as book, prose, and verse boundaries, for building a complex glossary for both edition and manuscript. The tool, with templates for all parts of speech, guides the preparation of a fully searchable glossary with consistent, data-centric, XML tagging for use in the database.

    [Glossary Tool]

     

  • A complementary Tagger, now under production, similarly allows the editor and research assistants to provide pervasive, extremely complex, document-centric XML encoding for the transcript and edition. Although document-centric encoding, unlike the data-centric encoding of the glossary, is not conducive to broad template-driven markup, the Tagger is designed to help editors descriptively encode a document without concentrating on XML or otherwise encountering a forest of angle brackets. Even a partial encoding of the transcript for lines 9-11 of fol. 38v produces a number of conflicting hierarchies:

    [Some conflicting hierarchies on fol. 38 verso (UV)]

    The software we are developing will resolve all conflicting tags behind the scenes, so that humanities editors never create invalid or non-well-formed XML. The editor views the source of the descriptive markup, the digital images of the manuscript, in one window and tags a transcript of it in another window using clickable element buttons. The resulting tagged file is, like the glossary, fully searchable and open to any number of configurable views. Because the XML encoding automatically includes x/y coordinates for all tagged parts of an image, searching the text and the image can proceed in tandem.

  • To complete the formation of an adequate copy-text, bright-light and ultraviolet images are collated against the emerging copy-text. The encoding of these two sets of images gathers comprehensive information about the legibility of the manuscript in bright light and with the aid of ultraviolet fluorescence. To avoid the need for repetitive encoding of bright-light and ultraviolet images (and any other images that might be used), we have developed an OverLay tool, which allows the editor to superimpose any combination of images digitized, for instance, under daylight, fiber-optic backlighting, and ultraviolet, to disclose readings rendered obscure or illegible by fire damage.

    [OverLay Otho A. vi, fol. 1r, bright light and ultraviolet]

  • With a reconstructed copy-text of the version of Alfred's Boethius in Cotton Otho A. vi, the editors using the Tagger can proceed to the historical collation with Oxford, Bodleian Library, Bodley 180, and Oxford, Bodleian Library, Junius 12, to record variant readings and to facilitate comparison between poetic and prosaic versions of the same essential content. A series of additional collations with important editions are likewise encoded in XML to record their variant readings, editorial emendations and conjectural restorations. This encoding will enable comprehensive searches as well as the display of any or all of the texts in the different manuscripts and editions.
  • Other custom-made tools will provide methodical tagging for other specialized purposes. For example, the editor can easily prepare paleographical descriptions of the scribal letterforms by using templates for specific letters.

     

    [DucType Tool for tagging paleographical features]

    These individually tagged letterforms can provide students with easy access to paleographical illustrations, which are usually omitted in scholarly editions and sparsely included even in paleographical treatises.

  • The tagging of physical features of the manuscript is meant to be as comprehensive as possible. For example, the codicological markup includes tags for quires, sheets, leaves, folios, folio-lines, sectional divisions, margins, and frames; the paleographical markup covers scribes, letterforms, manuscript punctuation and spacing, abbreviations, additions, omissions, alterations, deletions, glosses, and uses of other languages; and the editorial markup records all conjectural restorations and emendations, while the “edition production technology” (EPT) ensures that any editorial changes to the text are updated in the glossary.
  • These specialized tools are all intended to function together, and ultimately to transform the editor's EPT workbench into a virtual reading room for all users of the electronic edition. An interface with Java applets will display image and text together, allow for precise linking between them, and access the database for apparatus, glossary, translation, and user-defined searches. The browser interface used for editing the text will also be included in the user’s version of the edition. Users of the completed edition will have access to the source code of these and other editorial tools, so that they can construct their own electronic editions and conduct their own research on any other collection of images.
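
The conflicting hierarchies that the Tagger has to resolve can be illustrated with a small Python sketch. It uses fragmentation, one common way of serializing overlapping markup, and is not necessarily the method the EPT software implements. A damaged span that crosses a folio-line boundary is split into one fragment per line, and the fragments share an identifier so that they can be rejoined later. The element names (transcript, line, damage), the frag attribute, and the sample text are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def encode_with_fragments(text, lines, spans):
    """Serialize overlapping ranges as well-formed XML by fragmentation.

    text  -- the plain transcript
    lines -- (start, end) character ranges of the folio lines; these form
             the primary hierarchy and must not overlap one another
    spans -- (start, end, tag, span_id) ranges of a second hierarchy that
             may cross line boundaries but must not overlap one another;
             tag is assumed to be a valid XML element name
    """
    parts = ["<transcript>"]
    for n, (l_start, l_end) in enumerate(lines, start=1):
        parts.append('<line n="%d">' % n)
        pos = l_start
        for s_start, s_end, tag, span_id in sorted(spans):
            # Clip the span to the current line.
            lo, hi = max(s_start, l_start), min(s_end, l_end)
            if lo >= hi:
                continue  # the span does not touch this line
            parts.append(escape(text[pos:lo]))
            # Every fragment of one span carries the same identifier.
            parts.append("<%s frag=%s>%s</%s>"
                         % (tag, quoteattr(span_id), escape(text[lo:hi]), tag))
            pos = hi
        parts.append(escape(text[pos:l_end]))
        parts.append("</line>")
    parts.append("</transcript>")
    return "".join(parts)


if __name__ == "__main__":
    sample = "first line second line"      # placeholder, not a manuscript reading
    folio_lines = [(0, 10), (11, 22)]      # two folio lines
    damage = [(6, 17, "damage", "d1")]     # one span crossing the line break
    encoded = encode_with_fragments(sample, folio_lines, damage)
    ET.fromstring(encoded)                 # raises ParseError if not well-formed
    print(encoded)
```

Fragmentation keeps the folio lines intact at the cost of splitting the damaged span. Milestone (empty) elements or stand-off markup are alternative designs that make the opposite trade-off.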



The Electronic Boethius. Last modified RCH