User's manual

User roles

Each user has one of three roles in the system: reader, editor or administrator.

Reader – may view documents.
Editor – may edit documents (Section 6).
Administrator – has access to the configuration (Section 7).

Roles are assigned to users by the administrator.

Getting started

The system can be accessed from: IA tagger home

It requires Chrome, Opera or Mozilla Firefox. To start, log in. The default password is tagger; you can change it at any time.

login	<request login from administrator>
password	tagger

How to upload a text document

Select Documents. Click File/Wybierz dokument ("Choose document") to choose the text document to upload. In the Language window, select the language of the document. Click Submit to confirm the file and the language. The file name is then added to the list of documents uploaded so far (Documents list).

During the upload, the text is split into sentences and words. One line corresponds to one sentence. A string of characters between spaces is interpreted as a word.
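The splitting rule above can be illustrated with a minimal Python sketch. This is not the code used by IA tagger; the function name split_document is hypothetical, and skipping empty lines and repeated spaces is an assumption made here for convenience.

```python
def split_document(text: str) -> list[list[str]]:
    """Split raw text into sentences (one per line) and words (space-separated)."""
    sentences = []
    for line in text.splitlines():
        # A word is any run of characters between spaces; repeated spaces
        # would produce empty strings, which this sketch drops (assumption).
        words = [w for w in line.split(" ") if w]
        if words:  # assumption: empty lines do not become sentences
            sentences.append(words)
    return sentences


print(split_document("The cat sat.\nIt slept."))
# [['The', 'cat', 'sat.'], ['It', 'slept.']]
```

Note that punctuation stays attached to the neighbouring word ("sat."), because only spaces separate words.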

How to open a document

Select the Documents option and click the name of the document from the list (or the adjacent icon).

How to save a document

You don’t have to save the document. Each modification is automatically saved.

How to annotate a document

An opened document is split into sentences. Click next to the sentence you want to annotate. You can navigate between sentences using CTRL + arrow (up or down).

Adjusting the sentence splitting

You can adjust the automatic sentence splitting. Switch on Edit mode (just below the menu). You can split sentences with the scissors icon or merge them with the glue icon.

Editing the contents of a sentence

You can modify a sentence by adding or deleting words. Click any word in a sentence. You will see the following icons:

- insert a new word before the current one (available for all words that are not postpositions)
- add a new word after the current one (available only for the last word)
- remove the word (available only for words that do not have postpositions; to remove a word that has a postposition, you must first separate it from the postposition)
- mark the next word as a postposition, also Ctrl-Y (available only for words that do not have postpositions, are not postpositions themselves and do not end the line)
- separate the word from the postposition, also Ctrl-U (available only for postpositions or words that have postpositions)

Word breaking

Words can be split into a stem and a suffix. Click the word, select the position of the split with the mouse and press Ctrl-J. Confirm with Enter. To remove the split, press Ctrl-K and confirm with Enter.

Word annotation

Words can be annotated at six different levels (levels are set in the Configuration menu, Section 7). To annotate a word at a given level, click that level under the word.

Annotation levels

LEXEME = equivalent English gloss
GRAMMAR = grammatical role.
Type the first character(s) of the tag and you will see a list of all tags beginning with the typed characters. For example, typing ‘f’ displays three tags starting with ‘f’: F (feminine), FOC (focus), FUT (future); typing ‘pr’ displays six tags starting with ‘pr’. Click a tag in the list to annotate the word.
OTHER LEVELS
Annotation at other levels consists of selecting one or more of the suggestions offered by the system. To select or deselect a suggestion, click it or press its shortcut key (given in brackets). Moving to another edit box or pressing Enter saves the value of the current edit box.
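The prefix lookup described for the GRAMMAR level can be sketched in Python as follows. This is only an illustration: the function name matching_tags and the sample tag list are assumptions, not the system's actual tag set. The lookup ignores case, since typing ‘f’ finds F, FOC and FUT.

```python
def matching_tags(typed: str, tags: list[str]) -> list[str]:
    """Return the tags that start with the typed characters, ignoring case."""
    prefix = typed.casefold()
    return [tag for tag in tags if tag.casefold().startswith(prefix)]


# Sample tags taken from the examples in this manual (not the full tag set).
SAMPLE_TAGS = ["F", "FOC", "FUT", "INS", "LOC", "M", "SG"]

print(matching_tags("f", SAMPLE_TAGS))
# ['F', 'FOC', 'FUT']
```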

Sentence annotation

Sentences are annotated at two levels (levels are set up in the Configuration menu, Section 7): Add info (additional information) and English (English translation). At both levels sentences are annotated manually.

Tags suggested by the system

The suggestion cloud

IA tagger suggests tags for words that fulfil one of the following conditions:

The same word has already occurred in one of the documents and has been tagged.
The structure of the word allows the system to automatically deduce its tags.

Suggestions for tags appear in a cloud above the word. At most three suggestion lines may appear for one word. Each line suggests tags for a few levels. For instance, the suggestions for the word nagari look as follows:

You can accept a suggestion by clicking the ‘check’ symbol on the left side of the cloud. In the above example, the first two suggestions are generated from annotations that already exist in the system. By clicking the edit icon you can go to the word that was used to generate the suggestion. The number in the red border is the suggestion frequency score. It corresponds to the number of words in the system annotated with the tags of the suggestion.

The third suggestion in the example is marked with the letter R in a red border. This means that it was generated according to predefined rules. The rule used to generate this suggestion is "*|i". It catches words with the suffix "i".
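The following Python sketch shows one way a rule such as "*|i" could be matched against a word. The manual does not define the rule syntax, so the reading used here is an assumption: "*" stands for any non-empty stem and "|" marks the boundary between stem and suffix. The function name apply_suffix_rule is hypothetical.

```python
from typing import Optional


def apply_suffix_rule(rule: str, word: str) -> Optional[tuple[str, str]]:
    """Return (stem, suffix) if the word matches a '*|suffix' rule, else None."""
    stem_pattern, separator, suffix = rule.partition("|")
    if stem_pattern != "*" or not separator or not suffix:
        raise ValueError("this sketch only supports rules of the form '*|suffix'")
    # The word must end with the suffix and leave a non-empty stem (assumption).
    if word.endswith(suffix) and len(word) > len(suffix):
        return word[: -len(suffix)], suffix
    return None


print(apply_suffix_rule("*|i", "nagari"))
# ('nagar', 'i')
```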

Accepting the first suggestion (shortcut Ctrl + 1) would add the following tags to the annotation:

Level	Tags
LEXEME	town
GRAMMAR	LOC, M, SG
POS	NOUN

Tags overwritten by suggestions

Note that if the word for which the suggestions were generated has already been annotated, the suggested tags overwrite the existing annotations on the corresponding levels. Existing annotations on levels that are absent from the suggestion are left untouched. In the above example, let us suppose that the word nagari already has the following annotations:

Level	Tags
GRAMMAR	INS
SEM	REC

If we now apply the first suggestion, the tag INS on the level GRAMMAR will be overwritten by three tags: LOC, M and SG, while the tag REC on the level SEM will be left untouched.
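The overwrite behaviour can be modelled as a per-level dictionary update. The Python sketch below reproduces the nagari example. The function name apply_suggestion and the dictionary representation are assumptions made for illustration, not the system's internal data model.

```python
def apply_suggestion(existing: dict[str, list[str]],
                     suggestion: dict[str, list[str]]) -> dict[str, list[str]]:
    """Overwrite the levels present in the suggestion; keep all other levels."""
    merged = {level: list(tags) for level, tags in existing.items()}
    for level, tags in suggestion.items():
        merged[level] = list(tags)  # the whole level is replaced, not extended
    return merged


existing = {"GRAMMAR": ["INS"], "SEM": ["REC"]}
first_suggestion = {"LEXEME": ["town"], "GRAMMAR": ["LOC", "M", "SG"], "POS": ["NOUN"]}

print(apply_suggestion(existing, first_suggestion))
# {'GRAMMAR': ['LOC', 'M', 'SG'], 'SEM': ['REC'], 'LEXEME': ['town'], 'POS': ['NOUN']}
```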

System configuration

The configuration of the system can be set solely by the administrator; other users do not have access to the configuration options.

By clicking Configuration you can manage the following options:

Users

This option lets you manage system users. You can add a new user (+ Add user) or delete a user (Delete). You can assign or change the role of a user (Change Role). You can reset a user’s password to the default password (Reset password), e.g. when the user forgets the password.

Languages

This option lets you manage the languages of tagged documents. You can add a new language (+ Add language). The language will be added to the list of languages supported by the system. Once a new language is added, documents in this language can be opened and tagged.

You can delete a chosen language (Delete) or edit it (Edit). Editing a language lets you change its code (Code), a short identifier for the language, or its description (Description).

Word annotation levels

This option allows you to manage the levels of word annotation. For each level you manage the following features:

Name
Description
Strict Choice - if this option is selected (+), annotations on this level have the form of tags (e.g. POS), not text (e.g. LEXEME)
Multiple Choice - if this option is selected (+) in addition to the above, tags for the words are chosen from a long list by their names (e.g. GRAMMAR)

Not selecting any of these options means that the word on this level should be annotated with text (e.g. LEXEME).
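The level features above can be pictured as a small record with two flags. The Python sketch below is only an illustration: the class WordLevel, the function input_mode and the mode names are hypothetical, and treating Multiple Choice as meaningful only together with Strict Choice is an assumption based on the wording "in addition to the above".

```python
from dataclasses import dataclass


@dataclass
class WordLevel:
    name: str
    description: str = ""
    strict_choice: bool = False    # annotations are tags rather than free text
    multiple_choice: bool = False  # tags are picked from a long list by name


def input_mode(level: WordLevel) -> str:
    """Describe how a word is annotated on the given level."""
    if not level.strict_choice:
        return "text"      # e.g. LEXEME
    if level.multiple_choice:
        return "tag list"  # e.g. GRAMMAR
    return "tags"          # e.g. POS


levels = [
    WordLevel("LEXEME"),
    WordLevel("POS", strict_choice=True),
    WordLevel("GRAMMAR", strict_choice=True, multiple_choice=True),
]
for level in levels:
    print(level.name, "->", input_mode(level))
```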

Levels are set in a predefined order. For example, by default the POS level is the first level. You can change the order of the levels by using the arrows in the Order column. You can add a new level (+ Add word annotation level) or delete an existing level (Delete). You can edit an existing level (Edit), i.e. edit its name or description, or modify Strict Choice and Multiple Choice.

For each level you can add or delete tags. To do so, use the Edit tags option. For each tag you should define its value (Value) and description (Description). When tagging, the editor can annotate words only with the tag values defined for the given level.

You can change the order in which tags are suggested to the user. For example, for the POS level, the first suggestion on the list is the NOUN tag. You can move it to another position on the list by using the arrows.

Sentence annotation levels

This option allows you to manage the levels of sentence annotation, i.e. the levels on which the editor annotates whole sentences. Managing sentence annotation levels is similar to managing word annotation levels (except that no tags are configured).

Statistics

You can generate statistics of selected/tagged texts by selecting the Statistics option from the main menu. There are two types of statistics: word statistics and collocation statistics.

Word statistics

You can display word statistics by selecting a caption starting with a black dot, e.g.

Verb participles (PTCP on the level "Grammar", V on "SYNTAX")

The system displays the number of words for which the tags on the GRAMMAR level include PTCP and the tags on the SYNTAX level include V. In addition, the list of words meeting the criteria is displayed, along with all their tags.
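The word statistics query can be sketched in Python as a filter over annotated words. The data layout (a dictionary with a form and per-level tag lists), the function name word_statistics and the placeholder word forms are assumptions made for illustration only.

```python
def word_statistics(words: list[dict], criteria: dict[str, str]) -> tuple[int, list[dict]]:
    """Count and list the words whose tags include the required tag on every level."""
    matches = [
        word for word in words
        if all(tag in word["tags"].get(level, []) for level, tag in criteria.items())
    ]
    return len(matches), matches


# Placeholder word forms; only the tags matter for this illustration.
words = [
    {"form": "word1", "tags": {"GRAMMAR": ["PTCP", "M"], "SYNTAX": ["V"]}},
    {"form": "word2", "tags": {"GRAMMAR": ["PTCP"], "SYNTAX": ["A"]}},
    {"form": "word3", "tags": {"GRAMMAR": ["LOC"]}},
]

count, matched = word_statistics(words, {"GRAMMAR": "PTCP", "SYNTAX": "V"})
print(count, [word["form"] for word in matched])
# 1 ['word1']
```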

Collocation statistics

You can display collocation statistics by selecting a caption starting with an empty dot, e.g.

PTCP(PTCP) + A(INS)

The system displays information on pairs of words occurring in the same sentence, the first of which meets the criteria PTCP(PTCP) (tags for both GRAMMAR and POS include PTCP) and the second of which meets the criteria A(INS) (tags for SYNTAX include A and tags for GRAMMAR include INS). In addition, for each pair the system reports the distance between the words in the text (the number of words between them).
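The pair search and the distance measure can be sketched in Python as follows. The function names meets and collocations and the data layout are hypothetical. The sketch also assumes that the two words may appear in either order within the sentence, which the manual does not specify.

```python
def meets(word_tags: dict[str, list[str]], criteria: dict[str, str]) -> bool:
    """True if the word has the required tag on every level named in the criteria."""
    return all(tag in word_tags.get(level, []) for level, tag in criteria.items())


def collocations(sentence: list[dict[str, list[str]]],
                 first: dict[str, str],
                 second: dict[str, str]) -> list[tuple[int, int, int]]:
    """Return (index_a, index_b, distance) for each matching pair in one sentence."""
    pairs = []
    for i, word_a in enumerate(sentence):
        if not meets(word_a, first):
            continue
        for j, word_b in enumerate(sentence):
            if i != j and meets(word_b, second):
                # Distance is the number of words standing between the two.
                pairs.append((i, j, abs(i - j) - 1))
    return pairs


sentence = [
    {"GRAMMAR": ["PTCP"], "POS": ["PTCP"]},
    {"POS": ["NOUN"]},
    {"SYNTAX": ["A"], "GRAMMAR": ["INS"]},
]
print(collocations(sentence,
                   {"GRAMMAR": "PTCP", "POS": "PTCP"},
                   {"SYNTAX": "A", "GRAMMAR": "INS"}))
# [(0, 2, 1)]
```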

Filtering by documents

All types of statistics offer per-document filtering. At the top of the Statistics page you can find the list of all documents in the system with corresponding checkboxes. If a checkbox is checked, the words from that document are counted and shown in the statistics. If a checkbox is empty, all the words from the corresponding document are omitted from the statistics.

Clicking on the checkboxes refreshes the statistics automatically.