Some PLANS for the FORUM

Posted: January 26th, 2009, 2:19 am
by Jan
I would like to introduce some of the plans that we have for the Lakota Language Forum.

1) Lemmatizer

This odd-sounding word names a very useful tool for language learners and language users. To describe a lemmatizer we first have to explain the term lemma. The English words goes, going, went, and gone are all forms of one word, go. This base form is the citation form that you will find as an entry in a dictionary, whereas goes, going, went, and gone do not appear as separate entries in most dictionaries. The base form (e.g. go) is called a lemma.
Because English has very simple word morphology, each lemma has only a few forms (most verbs have four forms, a small number of irregular verbs have five, and nouns have two).
Lakota, on the other hand, is a highly inflectional language with complex morphology, which results in a large number of word forms for every lemma. Most stative and active verbs have at least 7 forms, but those with changeable A (ablaut) have 21. Transitive verbs have 30 forms, and 90 if they allow ablaut. These numbers are further multiplied by the number of non-personal prefixes and suffixes that each verb takes. Because there are so many of them, the forms of each Lakota verb cannot all be included in a dictionary. Moreover, some word forms look very different from the base form: iwíčhauŋkičupi, for example, is a form of the verb ičú. Such forms are difficult for beginning learners, who cannot associate them with the base forms until they become familiar with all the personal affixes.
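The form counts quoted above (7 and 21, 30 and 90) are consistent with ablaut multiplying the number of forms by three. A small sketch of that arithmetic follows. The factor of three is inferred from the numbers in this post, and `affix_combinations` is a hypothetical parameter standing in for the further multiplication by non-personal prefixes and suffixes.

```python
# Assumption: ablaut triples the number of forms, since 21 / 7 == 3 and 90 / 30 == 3.
ABLAUT_VARIANTS = 3


def form_count(base_forms, has_ablaut, affix_combinations=1):
    """Estimate how many word forms a verb has.

    base_forms: number of personal forms without ablaut (7 or 30 in the post).
    has_ablaut: True for verbs with changeable A.
    affix_combinations: hypothetical multiplier for non-personal affixes.
    """
    variants = ABLAUT_VARIANTS if has_ablaut else 1
    return base_forms * variants * affix_combinations


print(form_count(7, has_ablaut=False))   # 7
print(form_count(7, has_ablaut=True))    # 21
print(form_count(30, has_ablaut=True))   # 90
```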
This is where a lemmatizer can help. It is a software tool that can recognize any word form and associate it with the appropriate lemma, or base form. For instance, if you type in owíčhabluspe, the lemmatizer will tell you that it is a form of the verb oyúspA, and you can then look that word up in the dictionary.
The lemmatizer will also be able to generate all word forms of each verb. This has several uses, described below.
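To make the idea concrete, here is a minimal sketch of the lookup direction only (word form in, lemma out). It is not the planned tool: it uses a hand-written table holding just the two pairs mentioned in this thread, whereas a real lemmatizer would have to cover the full morphology. The names `FORM_TO_LEMMA`, `normalize` and `lemmatize` are made up for the illustration.

```python
import unicodedata

# Hypothetical form-to-lemma table; only the two pairs mentioned in this thread.
FORM_TO_LEMMA = {
    "owíčhabluspe": "oyúspA",
    "iwíčhauŋkičupi": "ičú",
}


def normalize(word):
    """Trim, lowercase and NFC-normalize a typed word.

    NFC makes a letter typed as base + combining accent compare equal
    to the same letter typed as one precomposed character.
    """
    return unicodedata.normalize("NFC", word.strip()).lower()


# Index keyed by the normalized form so lookups ignore typing differences.
INDEX = {normalize(form): lemma for form, lemma in FORM_TO_LEMMA.items()}


def lemmatize(word):
    """Return the lemma for a known word form, or None if it is not in the table."""
    return INDEX.get(normalize(word))


print(lemmatize("owíčhabluspe"))  # oyúspA
```

Normalizing before the lookup matters for Lakota text because the same accented letter can arrive from different keyboards as one character or as two.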

2) Find the paradigm for any verb

The list of word forms created by the lemmatizer will enable us to create a conjugation paradigm for each verb and a table with subject-object combinations for each transitive verb. So if you type máni in the lemmatizer you will see the following:

1s mawáni    1d maúŋni    1p maúŋnipi
2s mayáni                 2p mayánipi
3s máni                   3p mánipi
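As a rough sketch of how a list of generated forms could be laid out as a paradigm like the one above: the code below takes forms keyed by person and number (s = singular, d = dual, p = plural) and prints them as an aligned grid. The function name and the data layout are assumptions made for this example, and the spellings in `MANI_FORMS` follow my reading of the paradigm for máni, so treat them as illustrative.

```python
PERSONS = ("1", "2", "3")
NUMBERS = ("s", "d", "p")  # singular, dual, plural

# Forms of máni keyed by slot; spellings are illustrative (see the note above).
MANI_FORMS = {
    "1s": "mawáni", "1d": "maúŋni", "1p": "maúŋnipi",
    "2s": "mayáni", "2p": "mayánipi",
    "3s": "máni", "3p": "mánipi",
}


def render_paradigm(forms):
    """Return the paradigm as aligned text, one row per person.

    Slots that have no form (for example 2d and 3d) are left blank.
    """
    width = max(len(slot + " " + form) for slot, form in forms.items()) + 2
    rows = []
    for person in PERSONS:
        cells = []
        for number in NUMBERS:
            slot = person + number
            form = forms.get(slot)
            cell = slot + " " + form if form else ""
            cells.append(cell.ljust(width))
        rows.append("".join(cells).rstrip())
    return "\n".join(rows)


print(render_paradigm(MANI_FORMS))
```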

If you type ičú in the lemmatizer you will see the following:


[Attached image: verb-table.jpg]


3) Spellchecker

Having all possible word forms available will enable us to create a spellchecker. This will make it easier to type Lakota with consistent spelling, and it will be a great tool for learners and language users at all levels.
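A minimal sketch of how a spellchecker could use such a list of word forms: a word is accepted if it is in the list, and otherwise the closest known forms are suggested. This is only an illustration. The word list holds a handful of forms from this thread, and the suggestions come from Python's standard `difflib` module, which is a stand-in for whatever matching the real tool ends up using.

```python
import difflib
import unicodedata


def nfc(word):
    """NFC-normalize so accented letters compare equal however they were typed."""
    return unicodedata.normalize("NFC", word)


# Tiny illustrative word list; a real one would hold every generated form.
KNOWN_FORMS = {nfc(w) for w in ("máni", "mánipi", "maúŋni", "maúŋnipi", "owíčhabluspe")}


def check(word, known=KNOWN_FORMS, max_suggestions=3):
    """Return (is_correct, suggestions) for a typed word."""
    word = nfc(word)
    if word in known:
        return True, []
    # get_close_matches ranks candidates by string similarity, best first.
    suggestions = difflib.get_close_matches(
        word, sorted(known), n=max_suggestions, cutoff=0.6
    )
    return False, suggestions


print(check("máni"))  # (True, [])
print(check("mani"))  # not in the list, so similar known forms are suggested
```

A word typed without its stress mark, such as mani, is rejected and the accented form máni appears among the suggestions, which is the kind of consistent-spelling help described above.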

4) Online dictionary

The lemmatizer will also be an integral part of an online Lakota dictionary. In the future we are hoping to make the dictionary database available online with advanced search options. The online dictionary will be in multimedia format, so it will have not only words but also pictures for nouns and sounds for pronunciation of the words.

Re: Some PLANS for the FORUM

Posted: January 28th, 2009, 3:08 pm
by Jan
5) More enhancements of the Virtual Keyboard and the LLC Keyboard and Fonts bundle

We are also hoping to enhance the programming of the two keyboards so that adding stress marks to words is partly automated during typing. We do not yet know for sure if we can program this, but so far it is looking promising (mainly because Kostya Chmielnicky is a genius programmer).

Re: Some PLANS for the FORUM

Posted: March 31st, 2009, 1:08 pm
by Jan
This is just an update to inform you that we do continue working on the lemmatizer and the spellchecker. As usual with complex programming projects, it is taking longer than expected, but we are not giving up and we will eventually make those tools available to the forum members.

Also, forum members keep giving us feedback on the interactive textbook, and we are grateful for that. I just want to let everybody know that the interactive textbook is still in a preliminary state. We hope to make it more structured and sequenced in the future.