HdtDep is a search engine for a treebank consisting of the first book of Herodotus' Histories. The treebank is encoded in an XML file based on A. D. Godley's Loeb edition (1920), available under a Creative Commons Attribution-ShareAlike 3.0 United States License on the Perseus Project website. All typos have been corrected.

The Greek characters are encoded in UTF-8. The XML file is structured in <chapter> and <sentence> nodes, which contain <word> nodes. All punctuation has been removed. Since Unicode encodes the same letter with different diacritics as distinct characters, all grave accents have been turned into acutes in order to improve searchability. Accents due to enclisis have been removed. All elided vowels have been restored. Moreover, all crasis forms have been resolved into uncontracted words, so that their syntactic relationships can be represented correctly.
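The grave-to-acute normalisation can be illustrated with a short Python sketch. This is not the project's own preprocessing code: the function name grave_to_acute is hypothetical, and the approach (Unicode canonical decomposition, swapping the combining accent, then recomposition) is only one plausible way to obtain the same result.

```python
import unicodedata

COMBINING_GRAVE = "\u0300"  # combining grave accent (varia)
COMBINING_ACUTE = "\u0301"  # combining acute accent (oxia/tonos)


def grave_to_acute(text: str) -> str:
    """Replace every grave accent with an acute (hypothetical helper)."""
    # Split precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Swap the combining grave for the combining acute.
    swapped = decomposed.replace(COMBINING_GRAVE, COMBINING_ACUTE)
    # Recompose into precomposed characters where possible.
    return unicodedata.normalize("NFC", swapped)
```

Note that NFC recomposition yields the "tonos" code points (for example U+03CC for omicron with acute) rather than the "oxia" code points of the Greek Extended block (U+1F79). A query string would therefore need to be normalised in the same way before being compared with text processed like this.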

The syntactic structure of the sentences has been described by applying an adapted version of Igor Mel'čuk's dependency theory (Mel'čuk 1988: Dependency Syntax: Theory and Practice, Albany; Mel'čuk 2009: 'Dependency in Natural Language', in A. Polguère & I. Mel'čuk, Dependency in Linguistic Description, Amsterdam - Philadelphia: 1-110; Vatri 2011: Syntactic dependencies in Classical Greek [submitted]). Each word is annotated with the element it depends on and its grammatical category/sub-category (see below). Nouns, adjectives and verbs are also annotated with the Attic lexical entry under which they appear in LSJ. The syntactic relationship types, whose interpretation is highly theory-dependent, have not been encoded.
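As a rough illustration of how such a file could be read, the Python sketch below parses a toy fragment. Only the element names <chapter>, <sentence> and <word> come from the description above; the attribute names (n, id, head, cat, lemma) and the three-word sentence are invented for the example and do not reproduce the actual treebank schema or data.

```python
import xml.etree.ElementTree as ET

# Invented toy fragment: attribute names and annotation are assumptions,
# not the real HdtDep schema.
SAMPLE = """\
<chapter n="1">
  <sentence n="1">
    <word id="1" head="2" cat="ART">ὁ</word>
    <word id="2" head="3" cat="NOU" lemma="ἄνθρωπος">ἄνθρωπος</word>
    <word id="3" head="0" cat="VER FVE" lemma="γράφω">γράφει</word>
  </sentence>
</chapter>
"""


def read_words(xml_text):
    """Return one dict per <word> node, in document order."""
    root = ET.fromstring(xml_text)
    words = []
    for sentence in root.iter("sentence"):
        for word in sentence.iter("word"):
            words.append({
                "id": int(word.get("id")),
                "head": int(word.get("head")),   # 0 = head of the sentence
                "cat": word.get("cat"),
                "lemma": word.get("lemma"),      # None when not annotated
                "form": word.text,
            })
    return words


words = read_words(SAMPLE)
# Words whose head is 0 are the heads of their sentences.
heads = [w["form"] for w in words if w["head"] == 0]
```

In this toy sentence the article depends on the noun and the noun on the finite verb, which is the head of the sentence.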

The project was presented at the 2011 Digital Classicist seminar in London. Further technical details on the project are available in the published presentation.

1. Search page

Add new element

The button adds a new element to the query. The elements will be searched in the exact order in which they appear in the list.

Show category labels

If the box is ticked, the grammatical categories of the searched elements will be shown in the search results. The abbreviations are listed in section 3 (Abbreviations).

ID

The figure identifying the element in the query.

Category

Select the grammatical category to which the element belongs. The list contains 6 special categories:

Morph. button

The Morph. button prompts users to select the desired morphological categories to be included in the query for each word. N.B.: all selected categories have to occur together for that word to be returned as a result (boolean AND). When one or more categories are selected, the text in the button will be displayed in red.
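The boolean AND behaviour can be pictured as a subset test. The sketch below is only an illustration: the function name and the feature labels ("aorist", "indicative", "active") are hypothetical and are not taken from the search engine.

```python
def matches_morphology(word_features, selected):
    """Return True only if every selected category is present on the word."""
    return set(selected) <= set(word_features)


# Hypothetical feature labels for a single word.
word = {"aorist", "indicative", "active"}
both_present = matches_morphology(word, {"aorist", "indicative"})
one_missing = matches_morphology(word, {"aorist", "optative"})
```

With no category selected the test is trivially true, which corresponds to placing no morphological restriction on the word.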

Dependency

Users can type the ID of the element the word must depend on. If left blank, dependency information for that word will be ignored. If the word must be the head of the sentence, the ID to be entered is 0. If the operator not is added before the ID (with a space), the search engine will return words that do not depend on the specified ID. The operator = placed before an ID specifies that the word must share its head with the element indicated by that ID, i.e. both must depend on the same word in the sentence; that shared head need not itself be entered in the query.
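The possible contents of the Dependency field can be summarised with a small Python parser. This is a sketch under assumptions, not the search engine's code: the names DependencyConstraint and parse_dependency and the constraint labels are hypothetical, and whether a space is allowed after = is an assumption (the sketch accepts both "=3" and "= 3").

```python
from typing import NamedTuple, Optional


class DependencyConstraint(NamedTuple):
    kind: str              # "any", "head", "on", "not_on" or "same_as"
    target: Optional[int]  # ID of the query element referred to, if any


def parse_dependency(field: str) -> DependencyConstraint:
    """Interpret the text typed into the Dependency field."""
    text = field.strip()
    if text == "":
        # Blank: dependency information is ignored.
        return DependencyConstraint("any", None)
    if text.startswith("not "):
        # "not 2": the word must not depend on element 2.
        return DependencyConstraint("not_on", int(text[4:]))
    if text.startswith("="):
        # "=3": the word must have the same head as element 3.
        return DependencyConstraint("same_as", int(text[1:]))
    if text == "0":
        # 0: the word must be the head of the sentence.
        return DependencyConstraint("head", None)
    # Plain ID: the word must depend on that element.
    return DependencyConstraint("on", int(text))
```

For instance, parse_dependency("not 2") yields a "not_on" constraint with target 2, while an empty field yields the "any" constraint.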

Contiguous

Specify whether the word must or must not immediately follow the previous element. If this is not relevant, the option Both must be selected (as it is by default).
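The combined effect of element order and the Contiguous option can be sketched as a check on word positions. The function name and the option labels "yes" and "no" are hypothetical (only Both is named above), and the sketch assumes that each element must occur later in the sentence than the previous one, since elements are searched in list order.

```python
def position_ok(prev_pos: int, pos: int, contiguous: str = "both") -> bool:
    """Check a word position against the previous element's position.

    contiguous: "yes"  = must immediately follow the previous element,
                "no"   = must not immediately follow it,
                "both" = either is acceptable (the default).
    """
    if pos <= prev_pos:
        # Elements are matched in the order in which they are listed.
        return False
    adjacent = pos == prev_pos + 1
    if contiguous == "yes":
        return adjacent
    if contiguous == "no":
        return not adjacent
    return True
```

For example, with the previous element at position 4, position 5 passes with "yes" and "both" but fails with "no", whereas position 7 passes with "no" and "both" but fails with "yes".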

2. Search results

The results are listed in the order in which they occur in the text. The sentence node and the chapter they belong to are also displayed. By clicking on the "sentence" link, the entire sentence will be loaded and displayed in a table. The first row contains the exact wording, the second row contains the grammatical category to which each word belongs (see the abbreviation list in section 3), and the third row shows the element each word depends on. On suitable browsers (not on Internet Explorer), a "View" link will appear, which will load an automatically generated graphic representation of the dependency structure (dependency arcs).
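The three-row table and the dependency arcs can be pictured with the following Python sketch. It is an illustration only: the function names are hypothetical, the toy sentence is invented rather than taken from the treebank, and words are assumed to be numbered from 1 within the sentence, with 0 marking the head of the sentence.

```python
def sentence_table(words):
    """Build the three rows: wording, categories, heads."""
    forms = [form for form, _, _ in words]
    cats = [cat for _, cat, _ in words]
    heads = [str(head) for _, _, head in words]
    return [forms, cats, heads]


def dependency_arcs(words):
    """Return (dependent, head) position pairs; the sentence head has no arc."""
    return [
        (position, head)
        for position, (_, _, head) in enumerate(words, start=1)
        if head != 0
    ]


# Invented toy sentence: (form, category, head position).
toy = [("ὁ", "ART", 2), ("ἄνθρωπος", "NOU", 3), ("γράφει", "VER FVE", 0)]
table = sentence_table(toy)
arcs = dependency_arcs(toy)
```

Each pair in arcs corresponds to one arc that a graphic representation would draw from a word to the word it depends on.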

3. Abbreviations

ADJ adjective
ADJ ANA anaphoric adjective
ADJ DEM demonstrative adjective
ADJ IND indefinite adjective
ADJ INT interrogative adjective
ADJ NOU substantivized adjective
ADJ NUM numeral (ordinal) adjective
ADJ POS possessive adjective
ADJ REL relative adjective
ADJ VER verbal adjective
ADV adverb
ADV INT interrogative adverb
ADV NOU substantivized adverb
ADV REL relative adverb
ART article
CON subordinating conjunction
CON PAR coordinating conjunction
ENDO parenthesis
ITJ interjection
NEG CON PAR negative coordinating conjunction
NEG PAR negative particle
NOU noun
NOU PPR proper name
NOU VOC vocative
PAR particle
PRE preposition
PRO pronoun
PRO ANA anaphoric pronoun
PRO DEM demonstrative pronoun
PRO IND indefinite pronoun
PRO INT interrogative pronoun
PRO NUM numeral (ordinal) pronoun
PRO PER personal pronoun
PRO POS possessive pronoun
PRO REL relative pronoun
PRO REL IND relative-indefinite pronoun
VER FVE finite verb
VER INF infinitive
VER INF NOU substantivized infinitive
VER PPL participle
VER PPL NOU substantivized participle