Gabriel Bodard, Paul Spence
This is a brief, preliminary indication of the technical background to the EPAPP project and ALA website. A fuller technical report will be found on the Inscriptions of Aphrodisias site. (InsAph Technical Report)
When this project began, the first task was to acquire the contents of the original 1989 printed volume in electronic format.
Charlotte Roueché scanned the narrative and commentary material, and began to update it in MS Word documents. Once these
were relatively stable, Paul Spence carried out extensive analysis with Charlotte Roueché and designed a special DTD for this
material; John Lavagnino wrote Perl scripts that converted the Word documents to a single XML file. This file was then further
edited by hand, both to correct errors and improve content, and to add deep encoding, keywords and internal linking. Some
elements of this file were later turned into free-standing TEI documents, but the bulk is still in this home-grown markup.
The 230 existing and 23 new inscriptions were a larger job. The EpiDoc DTD devised by Tom Elliott and colleagues in Chapel
Hill was discussed and revised by the EPAPP technical team of Gabriel Bodard, John Lavagnino and Paul Spence, and then the
first dozen inscriptions were marked up by hand by Gabriel Bodard. John Lavagnino used the online Pizza Chef to generate a
new TEI EpiDoc DTD (which Tom Elliott now maintains), and then wrote a further Perl script to convert the MS Word documents
containing the inscription description, commentary and history data into TEI XML format. The Greek text was entered in Beta
Code in the first instance by Gabriel Bodard, and then hand-coded in EpiDoc.
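As an illustration of the Beta Code step, here is a minimal Python sketch of converting Beta Code to precomposed Unicode Greek. It is not the project's own tooling; the function name `beta_to_unicode` and the tables are assumptions made for this example. It handles only the basic alphabet, the `*` capital marker, the common diacritics and final sigma.

```python
import re
import unicodedata

# Beta Code letters mapped to lowercase Greek (basic alphabet only).
LETTERS = dict(zip("abgdezhqiklmncoprstufxyw", "αβγδεζηθικλμνξοπρστυφχψω"))

# Beta Code diacritic signs mapped to Unicode combining characters.
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}

def beta_to_unicode(beta):
    """Convert a Beta Code string to NFC-normalised Unicode Greek."""
    out = []
    capital = False  # set by '*', cleared by the next letter
    pending = ""     # diacritics written between '*' and the capital letter
    for ch in beta.lower():
        if ch == "*":
            capital = True
        elif ch in DIACRITICS:
            if capital:
                pending += DIACRITICS[ch]
            else:
                out.append(DIACRITICS[ch])
        elif ch in LETTERS:
            letter = LETTERS[ch]
            if capital:
                letter = letter.upper() + pending
                capital, pending = False, ""
            out.append(letter)
        else:
            out.append(ch)  # spaces, punctuation and anything unrecognised
    text = unicodedata.normalize("NFC", "".join(out))
    # A sigma at the end of a word takes its final form.
    return re.sub(r"σ(?=\W|$)", "ς", text)

print(beta_to_unicode("*)afrodi/th"))
```

A production converter would also need to cover editorial sigla, punctuation and the less common Beta Code escapes, which this sketch passes through unchanged.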
Paul Spence designed and implemented the application that generates a website from these various XML files. Ever since the
first test output, which included some indices and XSLT checking pages, there has been an iterative process of correcting and
improving the XML documents.
In the meantime, Gabriel Bodard compiled a large databank of digital images from Aphrodisias. Most of these images
were scanned as high-quality TIFFs from black and white photographs taken by Charlotte Roueché and others; some digital images
were received from New York University and other repositories, and more recently Charlotte Roueché has taken colour photographs
with a new digital camera. These images are managed in a FileMaker Pro database designed with the help of Hafed Walda. Plans
of the site were marked up in Adobe Illustrator by Thomas Roueché, and turned into the dynamic plans on the website by Martyn
Jessop and Paul Vetch.
Once all the data was in place, in XML files, in a preliminary form, the project benefitted from the addition to the EPAPP
technical team of Juan Luis Garcés and Zaneta Au, on the XML and XSLT fronts respectively. Zaneta Au moved the various elements
of the publication into the xMod application, designed by Paul Spence and Paul Vetch (xMod website), and added many new project-specific features.
All Greek words and names in the EpiDoc inscriptions have been lemmatised by Gabriel Bodard and Juan Garcés, and now make
up the bulk of the indices and search terms in the ALA site. Many more search terms were added to the markup by Charlotte
Roueché. At a late stage Zaneta Au has done a huge amount of work on the design and implementation of XSLT stylesheets for
the publication, tables, and indices, with some help from Gabriel Bodard.
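The way lemmatised words can feed an index may be sketched as follows. This is a hedged Python illustration, not the project's XSLT: it assumes each word is wrapped in a `w` element carrying a `lemma` attribute, and the sample document, its `div` numbering and the function name `build_lemma_index` are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: two texts whose words carry a lemma attribute.
SAMPLE = """<texts>
  <div n="1"><ab><w lemma="θεός">θεῷ</w> <w lemma="μέγας">μεγάλῳ</w></ab></div>
  <div n="2"><ab><w lemma="θεός">θεοῦ</w></ab></div>
</texts>"""

def build_lemma_index(xml_text):
    """Map each lemma to a list of (text number, attested form) pairs."""
    index = defaultdict(list)
    root = ET.fromstring(xml_text)
    for div in root.iter("div"):
        for word in div.iter("w"):
            lemma = word.get("lemma")
            if lemma:  # skip words that have not been lemmatised
                index[lemma].append((div.get("n"), word.text))
    return dict(index)

for lemma, entries in sorted(build_lemma_index(SAMPLE).items()):
    refs = ", ".join(f"{n} ({form})" for n, form in entries)
    print(f"{lemma}: {refs}")
```

Keying the index on the lemma rather than the attested form lets a reader find every inflected occurrence of a word under one heading.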
At all stages we have received technical advice from collaborators via email discussion lists and at international workshops
organised by the EPAPP and InsAph projects. In particular, Tom Elliott and Hugh Cayless have advised on EpiDoc XML and helped
in countless other ways. Users of the test website, and especially those who returned the user questionnaire, are also owed
a vote of thanks for their contribution to the testing and evaluating phase of this project.
Report on modular approach
The exploratory nature of the project to create a digital version of Aphrodisias in Late Antiquity has been explained on this site's homepage. Digitising and integrating this rich and diverse set of materials, while attempting
to ensure that they could reference, and be referenced by, other research in the field, presented us with many challenges
familiar to humanities computing scholars:
- Repurposing of material for different ends;
- Integration of material with varied structures;
- Integration of material with other repositories and reference works;
- Complex linking between objects;
- Conversion of data from proprietary formats (e.g. MSWord, Greek fonts) to portable formats (e.g. XML, Unicode);
- Ensuring that the technical approaches are developed to suit the scholarly objectives, and that they are relatively easy for
a person with little formal technical training to learn;
- Hence ensuring that any approaches taken work in a variety of environments, meet international standards and rely on free
'open source' technologies wherever possible, and documenting new techniques and tools;
- Problems of character encoding: in particular the encoding, presentation and processing of ancient Greek.
We are still digesting many of the lessons learnt during this exercise, and this report is merely a summary of some of the
main points that will eventually be covered in the technical preface.
The adoption of XML, and in particular flavours of XML based on the TEI standard, is one of the main technical pillars of the project, which uses:
- TEI for presentational web pages;
- EpiDoc TEI for the inscriptions;
- A home-grown XML markup system for the digitised version of the commentary and reference material; however, this was largely
based on TEI, and will indeed be converted to TEI very soon for archival purposes;
- Home-grown XML for the look-up tables and parameters that are used in the publishing process.
The overall TEI XML publishing framework is based on an application developed by Paul Spence and Paul Vetch at the Centre
for Computing in the Humanities, called 'xMod'.
This highly modular application aims to keep separate four fundamentally different kinds of activities:
- The creation and mark-up of content, so that it can be archived in ways which will promote its longevity, independent of current technological fashions
- The presentation of data, re-purposed to give a particular 'view' of a document collection at a particular moment in time
- The visual design that is applied as a 'skin' to the presentation
- The underlying technologies and programming that bring all of the components together and drive the different processes
At present, it produces static web pages (using XSLT and Perl), but we are currently examining 'dynamic' XML publishing solutions,
including some that allow for semantic searching of XML markup.
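The four-way separation described above can be illustrated with a small Python sketch. It is not how xMod is implemented (xMod uses XSLT and Perl); the sample document, the `view` function, the `SKIN` template and the `publish` driver are all hypothetical names chosen for this example.

```python
import html
import xml.etree.ElementTree as ET
from string import Template

# 1. Content: archival XML, kept free of presentation (hypothetical sample).
CONTENT = """<inscription id="demo1">
  <title>Sample text</title>
  <text>Line one</text>
</inscription>"""

# 2. Presentation: one particular 'view' that selects and arranges content.
def view(doc):
    return {
        "title": html.escape(doc.findtext("title")),
        "body": "<p>%s</p>" % html.escape(doc.findtext("text")),
    }

# 3. Visual design: a 'skin' that knows nothing about the source markup.
SKIN = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)

# 4. Underlying technology: the driver that ties the other three together.
def publish(xml_text, view_fn, skin):
    doc = ET.fromstring(xml_text)
    return skin.substitute(view_fn(doc))

print(publish(CONTENT, view, SKIN))
```

Because each layer is passed to `publish` as an argument, a different view or skin can be swapped in without touching the archived content.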
Research Associate, Greek texts and XML
- Martyn Jessop
- Dr John Lavagnino
- Paul Spence
Technical Project Officers
- Zaneta Au
- Dr Juan Garcés
- Elliott Hall
- Paul Vetch
- Dr Hafed Walda
Centre for Computing in the Humanities
King's College London
Thanks for advice and technical help with EpiDoc to: