Technical preface

Document Contents

Technical responsibilities

Report on modular approach

Team members

Gabriel Bodard, Paul Spence

This is a brief, preliminary indication of the technical background to the EPAPP project and ALA website. A fuller technical report will be found on the Inscriptions of Aphrodisias site. (InsAph Technical Report)

Technical responsibilities

When this project began, the first task was to acquire the contents of the original 1989 printed volume in electronic format. Charlotte Roueché scanned the narrative and commentary material, and began to update it in MS Word documents. Once these were relatively stable, Paul Spence carried out extensive analysis with Charlotte Roueché and designed a special DTD for this material; John Lavagnino wrote Perl scripts that converted the Word documents to a single XML file. This file was then further edited by hand, both to correct errors and improve content, and to add deep encoding, keywords and internal linking. Some elements of this file were later turned into free-standing TEI documents, but the bulk is still in this home-grown markup system.

The 230 existing and 23 new inscriptions were a larger job. The EpiDoc DTD devised by Tom Elliott and colleagues in Chapel Hill was discussed and revised by the EPAPP technical team of Gabriel Bodard, John Lavagnino and Paul Spence, and then the first dozen inscriptions were marked up by hand by Gabriel Bodard. John Lavagnino used the online Pizza Chef to generate a new TEI EpiDoc DTD (which Tom Elliott now maintains), and then wrote a further Perl script to convert the MS Word documents containing the inscription description, commentary and history data into TEI XML format. The Greek text was entered in Beta Code in the first instance by Gabriel Bodard, and hand-coded in EpiDoc.

Paul Spence designed and implemented the application that generates a website from these various XML files. Ever since the first test output, which included some indices and XSLT checking pages, there has been an iterative process of correcting and improving the XML documents.

In the meantime, Gabriel Bodard compiled a large databank of digital images from Aphrodisias. Most of these images were scanned as high-quality TIFFs from black and white photographs taken by Charlotte Roueché and others; some digital images were received from New York University and other repositories, and more recently Charlotte Roueché has taken colour photographs with a new digital camera. These images are managed in a FileMaker Pro database designed with the help of Hafed Walda. Plans of the site were marked up in Adobe Illustrator by Thomas Roueché, and turned into the dynamic plans on the website by Martyn Jessop and Paul Vetch.

Once all the data was in place, in XML files, in a preliminary form, the project benefitted from the addition to the EPAPP technical team of Juan Luis Garcés and Zaneta Au, on the XML and XSLT fronts respectively. Zaneta Au moved the various elements of the publication into the xMod application, designed by Paul Spence and Paul Vetch (xMod website), and added many new project-specific features.

All Greek words and names in the EpiDoc inscriptions have been lemmatised by Gabriel Bodard and Juan Garcés, and now make up the bulk of the indices and search terms in the ALA site. Many more search terms were added to the markup by Charlotte Roueché. At a late stage Zaneta Au did a huge amount of work on the design and implementation of XSLT stylesheets for the publication, tables, and indices, with some help from Gabriel Bodard.
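
To illustrate how lemmatised words can feed an index, here is a minimal sketch in Python. It is not the project's own code (which used XSLT and Perl), and the sample markup is an assumption: the element names inscriptions and inscription, the id values, and the function name build_lemma_index are hypothetical. Only the idea of recording a lemma on each word element is taken from the description above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: two short "inscriptions" whose words carry a
# lemma attribute. The real EpiDoc files are far richer than this.
SAMPLE = """
<inscriptions>
  <inscription id="sample-1">
    <w lemma="θεός">θεοῦ</w>
    <w lemma="δῆμος">δῆμος</w>
  </inscription>
  <inscription id="sample-2">
    <w lemma="δῆμος">δήμου</w>
  </inscription>
</inscriptions>
"""


def build_lemma_index(xml_text):
    """Map each lemma to the list of inscription ids in which it occurs."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for inscription in root.iter("inscription"):
        ident = inscription.get("id")
        for word in inscription.iter("w"):
            lemma = word.get("lemma")
            # Record each inscription only once per lemma.
            if lemma and ident not in index[lemma]:
                index[lemma].append(ident)
    return dict(index)


if __name__ == "__main__":
    for lemma, ids in build_lemma_index(SAMPLE).items():
        print(lemma, ", ".join(ids))
```

Because the index is keyed on the lemma rather than the inflected form, different forms of the same word are gathered under a single index entry and search term.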

At all stages we have received technical advice from collaborators via email discussion lists and at international workshops organised by the EPAPP and InsAph projects. In particular, Tom Elliott and Hugh Cayless have advised on EpiDoc XML and helped in countless other ways. Users of the test website, and especially those who returned the user questionnaire, are also owed a vote of thanks for their contribution to the testing and evaluating phase of this project.

Report on modular approach

The exploratory nature of the project to create a digital version of Aphrodisias in Late Antiquity has been explained on this site's homepage. Digitising and integrating this rich and diverse set of materials, while attempting to ensure that they could reference, and be referenced by, other research in the field, presented us with many challenges familiar to humanities computing scholars:

Repurposing of material for different ends;
Integration of material with varied structures;
Integration of material with other repositories and reference works;
Complex linking between objects;
Conversion of data from proprietary formats (e.g. MSWord, Greek fonts) to portable formats (e.g. XML, Unicode);
Ensuring that the technical approaches are developed to suit the scholarly objectives, and that they are relatively easy for a person with little formal technical training to learn;
Ensuring, as a consequence, that the approaches taken work in a variety of environments, meet international standards and rely on free, open-source technologies wherever possible, and that new techniques and tools are documented;
Problems of character encoding: in particular, the encoding, presentation and processing of ancient Greek.
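
As an illustration of the character-encoding problem, the following Python sketch converts a small subset of Beta Code (the ASCII transcription in which the Greek text was first entered) into Unicode. It is a simplified example, not the conversion actually used by the project: the function name beta_to_unicode is hypothetical, only letters, capitals, the common diacritics and final sigma are handled, and Beta Code punctuation, brackets and editorial sigla are passed through unchanged.

```python
import unicodedata

# Beta Code letters mapped to lowercase Greek.
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}

# Beta Code diacritic signs mapped to Unicode combining characters.
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex
    "|": "\u0345",   # iota subscript
    "+": "\u0308",   # diaeresis
}


def beta_to_unicode(text):
    """Convert a simplified subset of Beta Code to precomposed Unicode."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == "*":
            # Capital letter: diacritics come between "*" and the letter.
            i += 1
            marks = []
            while i < n and text[i] in DIACRITICS:
                marks.append(DIACRITICS[text[i]])
                i += 1
            if i < n and text[i].lower() in LETTERS:
                out.append(LETTERS[text[i].lower()].upper())
                i += 1
            out.extend(marks)
        elif ch.lower() in LETTERS:
            letter = LETTERS[ch.lower()]
            nxt = text[i + 1] if i + 1 < n else ""
            # Sigma takes its final form when no letter or diacritic follows.
            follows = nxt.lower() in LETTERS or nxt in DIACRITICS or nxt == "*"
            if letter == "σ" and not follows:
                letter = "ς"
            out.append(letter)
            i += 1
        elif ch in DIACRITICS:
            out.append(DIACRITICS[ch])
            i += 1
        else:
            # Spaces and anything unrecognised are copied unchanged.
            out.append(ch)
            i += 1
    # Compose base letters and combining marks into precomposed characters.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    print(beta_to_unicode("*qeo/s"))
    print(beta_to_unicode("mh=nin"))
```

Emitting combining characters and leaving composition to Unicode normalisation avoids a hand-written table of every accented letter, and keeps the output portable across fonts and platforms.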

We are still digesting many of the lessons learnt during this exercise, and this report is merely a summary of some of the main points that will eventually be covered in the technical preface.

The adoption of XML, and in particular flavours of XML based on the TEI standard, is one of the main technical pillars of ALA.

We used:

TEI for presentational web pages;
EpiDoc TEI for the inscriptions;
A home-grown XML markup system for the digitised version of the commentary and reference material; this was, however, largely based on TEI, and will be converted to TEI very soon for archival purposes;¹
Home-grown XML for the look-up tables and parameters that are used in the publishing process.

The overall TEI XML publishing framework is based on an application developed by Paul Spence and Paul Vetch at the Centre for Computing in the Humanities, called 'xMod'.

This highly modular application aims to keep separate four fundamentally different kinds of activities:

The creation and mark-up of content, so that it can be archived in ways which will promote its longevity, independent of current technological fashions;
The presentation of data, re-purposed to give a particular 'view' of a document collection at a particular moment in time;
The visual design that is applied as a 'skin' to the presentation;
The underlying technologies and programming that bring all of the components together and drive the different processes.

At present, it produces static web pages (using XSLT and Perl), but we are examining 'dynamic' XML publishing solutions, including some that allow for semantic searching of XML markup.
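
The separation described above can be sketched in a few lines of Python. This is an illustrative toy, not xMod itself: xMod used XSLT and Perl, whereas here plain Python functions stand in for the stylesheets, and the names CONTENT, present, SKIN and publish, as well as the sample markup, are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape
from string import Template

# 1. Content: marked-up source that knows nothing about its presentation.
CONTENT = (
    "<doc><title>Sample page</title>"
    "<p>First paragraph.</p><p>Second paragraph.</p></doc>"
)


# 2. Presentation: one particular 'view' extracted from the content.
def present(xml_text):
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    paragraphs = [p.text or "" for p in root.findall("p")]
    body = "\n".join(f"<p>{escape(text)}</p>" for text in paragraphs)
    return {"title": escape(title), "body": body}


# 3. Skin: the visual wrapper, replaceable without touching 1 or 2.
SKIN = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n$body\n</body></html>"
)


# 4. Driver: the programming that ties the other components together.
def publish(xml_text, skin=SKIN):
    return skin.substitute(present(xml_text))


if __name__ == "__main__":
    print(publish(CONTENT))
```

Because each layer only consumes the output of the one before it, the skin can be redesigned, or a different view generated, without any change to the archived content.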

Team members

Technical Director

Harold Short

Research Associate, Greek texts and XML

Dr Gabriel Bodard

Technical Consultants

Martyn Jessop
Dr John Lavagnino
Paul Spence

Technical Project Officers

Zaneta Au
Dr Juan Garcés
Elliott Hall
Paul Vetch
Dr Hafed Walda

Contact Details


Centre for Computing in the Humanities

King's College London

Strand

London

WC2R 2LS

Thanks for advice and technical help with EpiDoc to:

Hugh Cayless
Tom Elliott

Footnotes

1. The reasons for developing our own XML markup structure, and an evaluation of the benefits/disadvantages of this approach, will form an important part of the final report.
