Diorisis Search 2.0
[ made in 2020 by Alessandro Vatri ]
Diorisis Search is an application designed to build and run complex linguistic queries on the Diorisis Ancient Greek Corpus (Vatri and McGillivray 2018) through an intuitive graphical interface.
Download Mac (OSX 10.10 or higher)
95Mb      82 downloads
Download Windows
100Mb      55 downloads
Download Linux (32 bit)
96Mb      4 downloads
Select and search any of 820 lemmatized Ancient Greek texts
Build complex search patterns through an intuitive interface
Search for forms according to their morphological features
Parse any Greek text and load it into the search engine
Save and reload your queries
Calculate basic frequency data for your results
View occurrences in context
Rank lemmas or word forms matching search patterns by frequency
View the analysis of each word and report errors
Export the results as Excel spreadsheets

Share your results with other users
Create a collection of sentences from your search results
Export sentence collections as Word files
Create parsing exercises
Mark parsing exercises automatically
Export marked scripts and return them to students

Functions

File and corpora

Diorisis Search is capable of:

Online services

Corpus management

Query builder

Searchable elements

Users can include the following types of linguistic items in their queries:

the exact form
A word whose form corresponds exactly to the user input (including the distinction between grave and acute accents), unless the option ignore diacritics is selected (this option may also be activated globally). Forms may be input either in Unicode (UTF-8) Polytonic Greek or in BetaCode.
a form containing the sequence
A word node whose form contains the sequence (the string) input by the user.
For instance, πι selects all forms that contain πι (e.g. πιστεύεις, Ἀσκληπιῷ, etc., but not ἐλπίζων or πίπτει, unless the option ignore diacritics is selected). Strings may be input either in Unicode (UTF-8) Polytonic Greek or in BetaCode.
a form of the lemma
A word whose lemma corresponds exactly to the one selected by the user from the list of lemmas occurring in the Diorisis Corpus (and not all lemmas in e.g. LSJ, some of which would not be found anyway!). Lemmas may be input either in Unicode (UTF-8) Polytonic Greek or in BetaCode with all diacritics (even if the option ignoring diacritics is active).
a form of the lemma that contains the sequence
A word whose lemma contains the sequence (the string) input by the user.
For instance, πι selects all forms of lemmas that contain πι (e.g. πιστεύω, ἐπιλαμβάνω, etc., but not ἐλπίς, unless the option ignore diacritics is selected). Strings may be input either in Unicode (UTF-8) Polytonic Greek or in BetaCode.
a word with the following morphological features
A word of which at least one possible morphological analysis corresponds to the combination of values input by the user.
the punctuation mark
A punctuation mark corresponding to the user input.
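
The difference between strict matching and the ignore diacritics option can be illustrated with a minimal Python sketch. It is not the application's actual code: the function names are hypothetical, and it assumes that forms are Unicode strings and that diacritics can be removed by dropping Unicode combining marks.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop accents, breathings, iota subscript, and other combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalise(text: str, ignore_diacritics: bool) -> str:
    # In strict mode NFC keeps an accented vowel such as ί as one code point,
    # so an unaccented ι in the pattern cannot match its base letter.
    if ignore_diacritics:
        return strip_diacritics(text)
    return unicodedata.normalize("NFC", text)


def is_exact_form(form: str, pattern: str, ignore_diacritics: bool = False) -> bool:
    """'the exact form': the whole form must equal the pattern."""
    return normalise(form, ignore_diacritics) == normalise(pattern, ignore_diacritics)


def contains_sequence(form: str, pattern: str, ignore_diacritics: bool = False) -> bool:
    """'a form containing the sequence': the pattern must occur inside the form."""
    return normalise(pattern, ignore_diacritics) in normalise(form, ignore_diacritics)


forms = ["πιστεύεις", "Ἀσκληπιῷ", "ἐλπίζων", "πίπτει"]
strict = [f for f in forms if contains_sequence(f, "πι")]
# strict is ["πιστεύεις", "Ἀσκληπιῷ"]
loose = [f for f in forms if contains_sequence(f, "πι", ignore_diacritics=True)]
# loose contains all four forms
```

The same two functions would apply to lemma-based searches if they were given the lemma instead of the form. BetaCode input is not covered by this sketch.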

If words are selected according to their form or lemma, users have the option to specify which morphological analyses should be possible for the word, if required.
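
The rule that a word matches if at least one of its possible analyses has the requested combination of values can be sketched as follows. The feature names and values ("case", "number", and so on) are hypothetical placeholders chosen for illustration, not the labels used by Diorisis Search or the corpus.

```python
from typing import Dict, List

# One possible analysis of a word: feature name -> value (hypothetical labels).
Analysis = Dict[str, str]


def has_matching_analysis(analyses: List[Analysis], required: Analysis) -> bool:
    """True if a single analysis carries every required feature value."""
    return any(
        all(analysis.get(feature) == value for feature, value in required.items())
        for analysis in analyses
    )


# An ambiguous form with two possible analyses (illustrative data only).
analyses = [
    {"pos": "noun", "case": "nominative", "number": "plural"},
    {"pos": "noun", "case": "vocative", "number": "plural"},
]

has_matching_analysis(analyses, {"case": "vocative"})                        # True
has_matching_analysis(analyses, {"case": "genitive"})                        # False
has_matching_analysis(analyses, {"case": "vocative", "number": "singular"})  # False
```

The last call returns False because the requested values must all hold within the same analysis, not across different ones.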

Search commands

The relationship between linguistic items in the linear order of the sentence can be specified in the following ways:

followed by
Requires that the first word or punctuation mark should be followed by a word or punctuation mark whose features the user will be prompted to specify.
If the first element in the query is a strong punctuation mark (full stop, middle dot, or question mark), the search will be extended to the immediately following sentence (e.g. queries can capture words that follow a question mark within the following sentence).
followed or preceded by
Requires that the first word or punctuation mark should be followed or preceded by a word or punctuation mark whose features the user will be prompted to specify.
This command is not available if the first element in the query is a strong punctuation mark (full stop, middle dot, or question mark).
followed by (ignore punctuation)
Requires that the first word or punctuation mark should be followed by a word whose features the user will be prompted to specify.
If the first element in the query is a strong punctuation mark (full stop, middle dot, or question mark), the search will be extended to the immediately following sentence (e.g. queries can capture words that follow a question mark within the following sentence).
NB: The scope of the search is defined by counting only the number of word nodes.
For instance, in the sequence [ ὦ ἄνδρες, ἐγὼ ], ἐγὼ counts as immediately following ἄνδρες (scope = 1).
If this command is used to add more than one element after the first, it will appear in the drop-down menu preceded by the word "and".
followed or preceded by (ignore punctuation)
Requires that the first word or punctuation mark should be followed or preceded by a word whose features the user will be prompted to specify.
NB: The scope of the search is defined by counting only the number of word nodes.
For instance, in the sequence [ ὦ ἄνδρες, ἐγὼ ], ἐγὼ counts as immediately following ἄνδρες (scope = 1).
When any of these commands is selected, the user will also be prompted to indicate the scope of the search (i.e. the required distance or range of distances of the target from the first element).
or
Specifies that the first element may be defined by an alternative set of features. This command is only available while specifying the first element of the search.
ignoring diacritics
Requests that all diacritical signs be ignored in all form- and lemma-based searches. This option may be activated selectively for individual elements.
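
How the distance between two elements changes when punctuation is ignored can be sketched in Python. This is only an illustration under stated assumptions: the Node type, its "word"/"punct" labels, and the function name are hypothetical, and a sentence is modelled as a flat list of nodes. A positive result means the target follows the first element, a negative one that it precedes it.

```python
from typing import List, NamedTuple


class Node(NamedTuple):
    text: str
    kind: str  # "word" or "punct" (hypothetical labels)


def signed_distance(sentence: List[Node], first: int, target: int,
                    ignore_punctuation: bool = False) -> int:
    """Count the nodes stepped over from index `first` to index `target`."""
    if target > first:
        stepped = sentence[first + 1 : target + 1]   # moving forwards
    else:
        stepped = sentence[target:first]             # moving backwards
    count = sum(
        1 for node in stepped
        if not (ignore_punctuation and node.kind == "punct")
    )
    return count if target > first else -count


sentence = [
    Node("ὦ", "word"),
    Node("ἄνδρες", "word"),
    Node(",", "punct"),
    Node("ἐγὼ", "word"),
]

signed_distance(sentence, 1, 3)                           # 2: the comma counts
signed_distance(sentence, 1, 3, ignore_punctuation=True)  # 1: words only
signed_distance(sentence, 3, 1, ignore_punctuation=True)  # -1: ἄνδρες precedes ἐγὼ
```

Under this model, followed by accepts only positive distances, while followed or preceded by accepts both signs. The extension of a search into the following sentence after a strong punctuation mark is not modelled here.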

All elements may be searched for negatively, that is, it is possible to search for elements that match any feature but those specified (e.g. anything but the exact form instead of the exact form).
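
The or command and negative searches can be thought of as ways of combining simple conditions. The following Python sketch shows one possible way to express this with predicate functions; all names are hypothetical and the sketch says nothing about how Diorisis Search implements these options internally.

```python
from typing import Callable

# A condition on a word form: returns True if the form satisfies it.
Predicate = Callable[[str], bool]


def exact_form(pattern: str) -> Predicate:
    """Condition: the form equals the pattern."""
    return lambda form: form == pattern


def any_of(*alternatives: Predicate) -> Predicate:
    """'or': the element may satisfy any one of the alternative conditions."""
    return lambda form: any(test(form) for test in alternatives)


def anything_but(condition: Predicate) -> Predicate:
    """Negative search: the element must NOT satisfy the condition."""
    return lambda form: not condition(form)


article = any_of(exact_form("ὁ"), exact_form("τὸν"))
not_aner = anything_but(exact_form("ἀνὴρ"))

article("τὸν")      # True
article("ἀνὴρ")     # False
not_aner("ἀνὴρ")    # False
not_aner("αὐτὸς")   # True
```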

The maximum scope of a search is one sentence. In the Diorisis Corpus, sentences are defined as sequences of words and punctuation marks delimited by a strong punctuation mark (full stop, middle dot, or question mark).
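
This definition of the sentence can be sketched as a simple segmentation routine. The sketch makes several assumptions that the text above does not confirm: nodes are plain strings, the closing punctuation mark is kept as the last node of its sentence, and the middle dot and the Greek question mark may appear under either of their two Unicode code points.

```python
from typing import List

# Full stop, middle dot (two encodings), Greek question mark (two encodings).
STRONG_PUNCTUATION = {".", "\u00b7", "\u0387", ";", "\u037e"}


def split_sentences(nodes: List[str]) -> List[List[str]]:
    """Group a flat sequence of nodes into sentences."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for node in nodes:
        current.append(node)
        if node in STRONG_PUNCTUATION:
            # A strong punctuation mark closes the current sentence.
            sentences.append(current)
            current = []
    if current:
        # Trailing nodes without a closing mark still form a sentence.
        sentences.append(current)
    return sentences


nodes = ["τί", "λέγεις", ";", "οὐδὲν", "λέγω", "."]
split_sentences(nodes)
# [["τί", "λέγεις", ";"], ["οὐδὲν", "λέγω", "."]]
```

Weak punctuation such as the comma does not appear in the set, so it never ends a sentence.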

Within the sentence, searches for individual elements to follow or precede the first one may be restricted to a specific range (scope) for each element. The following options are available:

within
The search engine will search for the specified element at a distance of between one and the specified number of elements from the first element.
For instance, a search for the form ἀνὴρ within 3 words after the form ὁ will capture sequences like ὁ ἀνὴρ, ὁ δ’ ἀνὴρ, ὁ αὐτὸς ἀνὴρ, or ὁ δ’ αὐτὸς ἀνὴρ.
With the commands followed by or followed or preceded by, the range is calculated counting the number of word and punct nodes.
between
The search engine will search for the specified element at a distance from the first element that falls between the specified lower and upper bounds (inclusive).
For instance, a search for the form ἀνὴρ between 2 and 3 words after the form ὁ will capture sequences like ὁ δ’ ἀνὴρ, ὁ αὐτὸς ἀνὴρ, or ὁ δ’ αὐτὸς ἀνὴρ, but not ὁ ἀνὴρ.
With the commands followed by or followed or preceded by, the range is calculated counting the number of word and punct nodes.
exactly
The search engine will search for the specified element at exactly the specified distance from the first element.
For instance, a search for the form ἀνὴρ exactly 2 words after the form ὁ will capture sequences like ὁ δ’ ἀνὴρ or ὁ αὐτὸς ἀνὴρ, but not ὁ ἀνὴρ or ὁ δ’ αὐτὸς ἀνὴρ.
With the commands followed by or followed or preceded by, the range is calculated counting the number of word and punct nodes.
With the commands followed by (ignore punctuation) or followed or preceded by (ignore punctuation), the range is calculated counting the number of word nodes only (and, as a consequence, it is wider).
in the same sentence
The search engine will search for the specified element anywhere in the same sentence as the first element.
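
The four scope options can be summarised as ranges of permitted distances. The Python sketch below is a hypothetical model, not the application's own code: the Scope class and the helper names are invented for illustration, and distances are computed by naive whitespace splitting of short phrases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    low: int = 1
    high: Optional[int] = None  # None means "up to the end of the sentence"

    def allows(self, distance: int) -> bool:
        """Check an unsigned distance against the permitted range."""
        d = abs(distance)
        return d >= self.low and (self.high is None or d <= self.high)


def within(n: int) -> Scope:
    return Scope(1, n)


def between(low: int, high: int) -> Scope:
    return Scope(low, high)


def exactly(n: int) -> Scope:
    return Scope(n, n)


def in_same_sentence() -> Scope:
    return Scope(1, None)


def matches(phrase: str, scope: Scope) -> bool:
    """Is ἀνὴρ at a permitted distance after ὁ in this phrase?"""
    words = phrase.split()
    return scope.allows(words.index("ἀνὴρ") - words.index("ὁ"))


phrases = ["ὁ ἀνὴρ", "ὁ δ’ ἀνὴρ", "ὁ αὐτὸς ἀνὴρ", "ὁ δ’ αὐτὸς ἀνὴρ"]

[p for p in phrases if matches(p, within(3))]      # all four phrases
[p for p in phrases if matches(p, between(2, 3))]  # all but "ὁ ἀνὴρ"
[p for p in phrases if matches(p, exactly(2))]     # "ὁ δ’ ἀνὴρ", "ὁ αὐτὸς ἀνὴρ"
```

Whether punctuation nodes contribute to the distance depends on the search command in use, as described above; the scope itself only constrains the resulting number.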

Results

Diorisis Search returns the following data:

Result sentences can be saved in the Saved Sentences workbook. Saved sentences are temporarily stored in memory, along with their exact reference, and may be viewed and exported as a Microsoft Word 2010+ docx document for use e.g. in handouts, exercises, or other teaching materials.

Teaching Tools

Diorisis Search can be used to create and mark parsing exercises.

User Reviews

Caio Borges Geraldes
App version: 1.0
Researcher (Linguistics)
Reviewed in Brasil on 11/05/20 19:33:32
The search engine is very good and has great flexibility, surely something I will be using in my research from now on. I have two suggestions that would make the engine more useful for my own research and maybe for other colleagues: 1. A minor improvement would be to include a native Linux version. It is possible to run it with Wine on Linux machines, but the loading time with Wine is a bit too long and might be affecting the query times. It might be quite simple to do so, but I am not so sure. 2. The major improvement I would like to see is to include not only the frequency counts of the query's result, but also the sentences themselves in the export result file. It would make the application way more effective for building research databases for further annotation. It seems to me that this addition can be easily implemented since the engine is already returning this data. Thank you very much for the project!

Changelog

Version 2.0
Version 2.0β
Version 1.02
Version 1.01
Version 1.0