Corpora and Language
Eugenia Beu-Dachin, The Latin Language in the Inscriptions from Roman Dacia
The topic of my research is the linguistic study (phonetic, morphological, syntactical, lexical) of the approximately 4000 Latin inscriptions from the Roman province of Dacia. Its aim is to contribute to a deeper understanding of how the Latin language evolved in this eastern province of the multilingual Roman Empire. To make this work easier and more accurate, I started building a corpus of the inscriptions, using computational methods for encoding, concordancing, searching and statistical analysis, and for deriving electronic editions from the corpus.
The inscriptions are encoded in TEI XML, and XSL stylesheets are applied to render full-text editions at the different levels of encoding. I followed the standards set out in The Menota handbook (Guidelines for the electronic encoding of Medieval Nordic primary sources), ed. Odd Einar Haugen (see: http://gandalf.aksis.uib.no/menota/guidelines/index.html). The inscriptions are encoded on three levels: facsimile (the text is represented exactly as it appears on the ancient material), diplomatic (the version given by the editor of the text, with abbreviations expanded using special marks) and normalised (each word is represented in accordance with the grammatical rules, so that the text corresponds to a literary version).
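The three levels can be illustrated with a minimal Python sketch. It assumes a Menota-style word encoding in which each word holds parallel facsimile, diplomatic and normalised readings inside a choice element; the namespace URI for the Menota elements, the sample fragment, and the use of round brackets for expanded abbreviations are assumptions made for illustration, not taken from the actual corpus.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ME_NS = "http://www.menota.org/ns/1.0"  # assumed Menota namespace URI

# Invented sample fragment (not from the corpus): one word per <w>,
# with three parallel readings inside <choice>.
SAMPLE = f"""
<ab xmlns="{TEI_NS}" xmlns:me="{ME_NS}">
  <w><choice><me:facs>D</me:facs><me:dipl>D(is)</me:dipl><me:norm>Dis</me:norm></choice></w>
  <w><choice><me:facs>M</me:facs><me:dipl>M(anibus)</me:dipl><me:norm>Manibus</me:norm></choice></w>
  <w><choice><me:facs>VIXIT</me:facs><me:dipl>vixit</me:dipl><me:norm>vixit</me:norm></choice></w>
  <w><choice><me:facs>ANIS</me:facs><me:dipl>anis</me:dipl><me:norm>annis</me:norm></choice></w>
</ab>
"""


def render(xml_text, level):
    """Return the running text at one level: 'facs', 'dipl' or 'norm'."""
    root = ET.fromstring(xml_text)
    words = []
    for w in root.iter(f"{{{TEI_NS}}}w"):
        # Pick the reading for the requested level inside <choice>.
        node = w.find(f"{{{TEI_NS}}}choice/{{{ME_NS}}}{level}")
        if node is not None:
            words.append("".join(node.itertext()).strip())
    return " ".join(words)


if __name__ == "__main__":
    for level in ("facs", "dipl", "norm"):
        print(level, ":", render(SAMPLE, level))
```

In the project itself this rendering is done with XSL stylesheets; the function above only mirrors the idea of selecting one reading per word.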
Based on this corpus, a concordance of word forms will be compiled in order to study how the Latin of the inscriptions deviates from classical Latin and to analyse the peculiarities of the local Latin.
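One possible shape for such a concordance is sketched below in Python. It assumes that every token is available as a triple of inscription identifier, attested form and normalised form; the identifiers and forms in TOKENS are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def build_concordance(tokens):
    """Group attested forms under their normalised form, with locations."""
    conc = defaultdict(lambda: defaultdict(list))
    for insc_id, attested, normalised in tokens:
        conc[normalised.lower()][attested.lower()].append(insc_id)
    return conc


def deviations(conc):
    """Keep only attested forms that differ from the normalised form."""
    out = {}
    for norm, forms in conc.items():
        dev = {form: ids for form, ids in forms.items() if form != norm}
        if dev:
            out[norm] = dev
    return out


# Invented sample tokens: (inscription id, attested form, normalised form).
TOKENS = [
    ("insc-1", "vixit", "vixit"),
    ("insc-1", "anis", "annis"),
    ("insc-2", "vixit", "vixit"),
    ("insc-2", "annis", "annis"),
    ("insc-3", "anis", "annis"),
]

if __name__ == "__main__":
    for norm, forms in deviations(build_concordance(TOKENS)).items():
        for form, ids in forms.items():
            print(f"{norm}: attested as '{form}' in {', '.join(ids)}")
```

Grouping by normalised form keeps every spelling variant of a word together, so counts of deviating forms per word can be read directly from the result.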