Update: 2012-11-25 04:30 AM +0630

Morphology : Inflection

morpho.htm

collection by U Kyaw Tun, M.S. (I.P.S.T., U.S.A.). Not for sale. No copyright. Free for everyone. Prepared for students of TIL Computing and Language Center, Yangon, MYANMAR .


Contents of this page

Morphology
History of morphological analysis
Aṣṭādhyāyī [ अष्टाध्यायी meaning "eight chapters"]
FUNDAMENTAL CONCEPTS
Lexemes vs. word forms
Inflexion vs word formation
Paradigms and morphosyntax
Allomorphy
Lexical morphology
Morpheme-based morphology
Lexeme-based morphology
Word-based morphology
Morphological typology

UKT Notes
• agglutinative language (some Tibeto-Burman) • morpheme • null morpheme • syntax

Noteworthy passages in this file: (always check with the original section from which they are taken.)
• In the 19th century, philologists devised a now classic classification of languages according to their morphology. According to this typology, some languages are isolating, and have little to no morphology; others are agglutinative, and their words tend to have lots of easily separable morphemes; while others yet are inflectional or fusional, because their inflectional morphemes are "fused" together. This leads to one bound morpheme conveying multiple pieces of information. The classic example of an isolating language is Chinese; the classic example of an agglutinative language is Turkish; both Latin and Greek are classic examples of fusional languages. -- From Wikipedia: http://en.wikipedia.org/wiki/Morphology 090801
• An agglutinative language is a language that uses agglutination extensively: most words are formed by joining morphemes together. -- Wikipedia: http://en.wikipedia.org/wiki/Agglutinative 090830.
[UKT: Is it possible that Burmese-Myanmar is agglutinative ? In the example, {nwa: kyaung: tha:} 'cow herd', we have the morphemes {nwa:} 'cattle', {kyaung:} 'to protect, look after, direct, herd', {tha:} 'son, person (male)' strung together in a certain order and not in any other.]
• ... in analytic languages [non-inflectional] context and syntax are more important than morphology.


Morphology

UKT: Before we begin we should at least know what a morpheme is.

by UKT based on Wikipedia: http://en.wikipedia.org/wiki/Morphology 090801

Morphology is the identification, analysis and description of the structure of words (words as units in the lexicon are the subject matter of lexicology). While words are generally accepted as being (with clitics) the smallest units of syntax, it is clear that in most (if not all) languages, words can be related to other words by rules. [UKT ¶ ]

For example, English speakers recognize that the words <dog>, <dogs>, and <dog catcher> are closely related. English speakers recognize these relations from their tacit knowledge of the rules of word formation in English. They infer intuitively that <dog> is to <dogs> as <cat> is to <cats>; similarly, <dog> is to <dog catcher> as <dish> is to <dishwasher>. [UKT ¶ ]

The rules understood by the speaker reflect specific patterns (or regularities) in the way words are formed from smaller units and how those smaller units interact in speech. In this way, morphology is the branch of linguistics that studies patterns of word formation within and across languages, and attempts to formulate rules that model the knowledge of the speakers of those languages.

UKT: Continued below.


History of morphological analysis

by UKT based on Wikipedia: http://en.wikipedia.org/wiki/Morphology 090801

The history of morphological analysis dates back to the ancient Indian linguist Pāṇini {pa-Ni.ni.}, who formulated the 3,959 rules of Sanskrit morphology in the text Aṣṭādhyāyī [ अष्टाध्यायी meaning "eight chapters"] by using a Constituency Grammar. The Greco-Roman grammatical tradition also engaged in morphological analysis. Studies in Arabic morphology, such as the Marāḥ al-arwāḥ of Aḥmad b. ‘Alī Mas‘ūd, date back to at least 1200 CE. ^[1]

The term morphology was coined by August Schleicher in 1859. ^[2]

UKT: Note that the period 1824-1852 covers the years when Britain was "swallowing" Burma piecemeal in the First and Second Anglo-Burmese wars. The European philologists, for their part, tried to understand the cultures and languages of the "natives" of India and Burma, whom the colonialists hoped to Europeanize. In the process the philologists were surprised to find that the Easterners were far more advanced than even the Greeks, whom the Europeans had thought the most advanced people of ancient times. They came upon the works of linguists such as Pāṇini {pa-Ni.ni.}, Yāska, and those before them.
The person mentioned by both Pāṇini and Yāska was Śākaṭāyana . The following is from: http://en.wikipedia.org/wiki/Sakatayana 090822

Śākaṭāyana is a Sanskrit grammarian of Iron Age India (fl. roughly 8th c. BCE). His work is referred to by scholars such as Yāska (around 7th c. BCE) and Pāṇini (circa 5th c. BCE), as well as other Sanskrit grammarians, but is lost to us today.

UKT: Gautama Buddha must have known about Sanskrit grammarians like Śākaṭāyana and Yāska, and possibly Pāṇini (if Pāṇini came before the Buddha, which we do not know for certain). The Buddha must have realised that by letting Sanskrit into the liturgy of Buddhism prescribed for his monks, he would be bringing the idea of the Hindu Creator into his religion. Since he (the Buddha) did not accept the concept of Creation, he condemned those monks who brought up the proposal:

'Bhikkhus, you are not allowed to express the Buddha's words in Sanskrit. Those who act contrarily will be considered as having committed the offence of Dukkata {doak~ka.Ta.}.' -- from: The Vinaya Pitakam, ed. by Hermann Oldenberg, Vol. II, The Cullavagga, London, 1880, p. 139. Quoted by Chi Hsien-lin, Language Problem of Primitive Buddhism, Journal of the Burma Research Society, XLIII, i, June 1960

He [Śākaṭāyana] claimed that all nouns are ultimately derived from some verbal root. This process is reflected in the Sanskrit grammar as the system of krit- pratyayas or verbal affixes.

In his The word and the world, the philosopher Bimal Krishna Matilal refers to this debate (which lasted several centuries) as an

interesting philosophical discussion between the nairuktas or etymologists and the pāṇinīyas or grammarians. According to the etymologists, all nouns (substantives) are derived from some verbal root or the other. Yāska in his Nirukta refers to this view (in fact defends it) and ascribes it to an earlier scholar Śākaṭāyana. This would require that all words are to be analysable into atomic elements, 'roots' or 'bases' and 'affixes' or 'inflections' — better known in Sanskrit as dhātu and pratyaya [...] Yāska reported the view of Gārgya who opposed Śākaṭāyana (both preceded Pāṇini who mentions them by name) and held that not all substantival words or nouns (nāma) were to be derived from roots, for certain nominal stems were 'atomic'. (p. 8-9)

His text may have been called the Lakṣaṇa Śāstra, in which he also describes the process of determining grammatical gender in animate and inanimate creation.

UKT: End of Wikipedia: http://en.wikipedia.org/wiki/Sakatayana 090822


FUNDAMENTAL CONCEPTS

Lexemes vs. word forms

UKT: Remember what we mean by "lexeme"?

lex·eme n. 1. The fundamental unit of the lexicon of a language. [The words] <find>, <found>, and <finding> are members of the English lexeme <find>. [ lex(icon) -eme ] -- AHTD
UKT: I've edited this entry from AHTD, inserting the <...> to suit the TIL bracket convention.

by UKT based on Wikipedia: http://en.wikipedia.org/wiki/Morphology 090801

The distinction between these two senses of "word" is arguably the most important one in morphology. The first sense of "word", the one in which dog and dogs are "the same word", is called a lexeme. The second sense is called word form. We thus say that:

<dog> and <dogs> - same lexeme, different forms : inflectional rules
<dog> and <dog catcher> - different lexemes (different kinds of entities) : word formation rules to form new lexemes

The form of a word that is chosen conventionally to represent the canonical form of a word is called a lemma, or citation form.
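The lexeme / word form distinction can be pictured as a lookup from word forms to the lemma of their lexeme. The following is only a minimal sketch: the table `LEXICON` and the helper names `lemma_of` and `same_lexeme` are hypothetical, and the few entries are hand-written illustrations, not a real dictionary.

```python
# Hypothetical toy lexicon: each lemma (citation form) maps to the
# set of word forms that belong to the same lexeme.
LEXICON = {
    "dog": {"dog", "dogs"},
    "find": {"find", "finds", "found", "finding"},
    "dog catcher": {"dog catcher", "dog catchers"},
}


def lemma_of(word_form):
    """Return the lemma of the lexeme containing word_form, or None if unknown."""
    for lemma, forms in LEXICON.items():
        if word_form in forms:
            return lemma
    return None


def same_lexeme(a, b):
    """True when both word forms are known and belong to the same lexeme."""
    lemma_a = lemma_of(a)
    return lemma_a is not None and lemma_a == lemma_of(b)


print(same_lexeme("dog", "dogs"))         # same lexeme, different forms
print(same_lexeme("dog", "dog catcher"))  # different lexemes
```

In this picture an inflectional rule moves between forms inside one entry, while a word formation rule creates a new entry.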

Prosodic word vs. morphological word

UKT: Not copied

Inflexion vs word formation

Given the notion of a lexeme, it is possible to distinguish two kinds of morphological rules. Some morphological rules relate to different forms of the same lexeme; while other rules relate to different lexemes. Rules of the first kind are called inflectional rules, while those of the second kind are called word formation. The English plural, as illustrated by <dog> and <dogs>, is an inflectional rule; compounds like <dog catcher> or <dishwasher> provide an example of a word formation rule. Informally, word formation rules form "new words" (that is, new lexemes), while inflection rules yield variant forms of the "same" word (lexeme).

There is a further distinction between two kinds of word formation: derivation and compounding. Compounding is a process of word formation that involves combining complete word forms into a single compound form; <dog catcher> is therefore a compound, because both <dog> and <catcher> are complete word forms in their own right before the compounding process has been applied, and are subsequently treated as one form. Derivation involves affixing bound (non-independent) forms to existing lexemes, whereby the addition of the affix derives a new lexeme. One example of derivation is clear in this case: the word <independent> is derived from the word <dependent> by prefixing it with the derivational prefix <in->, while <dependent> itself is derived from the verb <depend>.

The distinction between inflection and word formation is not at all clear cut. There are many examples where linguists fail to agree whether a given rule is inflection or word formation. The next section will attempt to clarify this distinction.

As noted above, word formation can combine two complete words, whereas inflection attaches an affix to a word, for instance a suffix to a verb, so that its form agrees with the subject of the sentence. For example, in the present indefinite we use ‘go’ with the subjects I/we/you/they and with plural nouns, whereas with third person singular pronouns (he/she/it) and singular nouns we use ‘goes’. The ‘-es’ is thus an inflectional marker that makes the verb agree with its subject. A further difference is that in word formation the resulting word may belong to a different grammatical category from its source word, whereas inflection never changes a word’s grammatical category.

UKT: More on word-formation in word-forma.htm

Paradigms and morphosyntax

A linguistic paradigm is the complete set of related word forms associated with a given lexeme. [UKT ¶ ]

The familiar examples of paradigms are the conjugations of verbs, and the declensions of nouns. [UKT ¶ ]

UKT: Since Burmese-Myanmar is non-inflectional whereas Pali-Myanmar is highly inflectional, when a Burmese speaker starts to study Pali, the conjugation of verbs and the declension of nouns present the greatest difficulty. However, since English is less inflectional than Pali, it may be used as an introduction to Pali-Myanmar. -- UKT 090902

Accordingly, the word forms of a lexeme may be arranged conveniently into tables, by classifying them according to shared inflectional categories such as tense, aspect, mood, number, gender or case. For example, the personal pronouns in English can be organized into tables, using the categories of person (1st., 2nd., 3rd.), number (singular vs. plural), gender (masculine, feminine, neuter), and case (subjective, objective, and possessive). See English personal pronouns for the details.
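As a rough illustration of a paradigm arranged as a table, the sketch below stores a fragment of the English personal pronouns under the categories named above. The names `PRONOUNS` and `pronoun` are hypothetical, and the possessive column is assumed to hold the determiner forms (my, our, ...) rather than the independent forms (mine, ours, ...).

```python
# Fragment of the English personal-pronoun paradigm.
# Key: (person, number, gender); gender is None where English does not mark it.
# Assumption: "possessive" lists the determiner forms (my, your, ...).
PRONOUNS = {
    (1, "singular", None): {"subjective": "I", "objective": "me", "possessive": "my"},
    (1, "plural", None): {"subjective": "we", "objective": "us", "possessive": "our"},
    (2, "singular", None): {"subjective": "you", "objective": "you", "possessive": "your"},
    (3, "singular", "masculine"): {"subjective": "he", "objective": "him", "possessive": "his"},
    (3, "singular", "feminine"): {"subjective": "she", "objective": "her", "possessive": "her"},
    (3, "singular", "neuter"): {"subjective": "it", "objective": "it", "possessive": "its"},
    (3, "plural", None): {"subjective": "they", "objective": "them", "possessive": "their"},
}


def pronoun(person, number, case, gender=None):
    """Look up one cell of the paradigm by its inflectional categories."""
    return PRONOUNS[(person, number, gender)][case]


print(pronoun(3, "singular", "objective", gender="feminine"))
print(pronoun(1, "plural", "subjective"))
```

Each cell of the table is one word form; the categories used as keys are exactly the ones the syntax of English refers to.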

The inflectional categories used to group word forms into paradigms cannot be chosen arbitrarily; they must be categories that are relevant to stating the syntactic rules of the language. For example, person and number are categories that can be used to define paradigms in English, because English has grammatical agreement rules that require the verb in a sentence to appear in an inflectional form that matches the person and number of the subject. In other words, the syntactic rules of English care about the difference between dog and dogs, because the choice between these two forms determines which form of the verb is to be used. In contrast, however, no syntactic rule of English cares about the difference between dog and dog catcher, or dependent and independent. The first two are just nouns, and the second two just adjectives, and they generally behave like any other noun or adjective behaves.

An important difference between inflection and word formation is that inflected word forms of lexemes are organized into paradigms, which are defined by the requirements of syntactic rules, whereas the rules of word formation are not restricted by any corresponding requirements of syntax. Inflection is therefore said to be relevant to syntax, and word formation is not. The part of morphology that covers the relationship between syntax and morphology is called morphosyntax, and it concerns itself with inflection and paradigms, but not with word formation or compounding.

In the exposition above, morphological rules are described as analogies between word forms: <dog> is to <dogs> as <cat> is to <cats>, and as <dish> is to <dishes>. In this case, the analogy applies both to the form of the words and to their meaning: in each pair, the first word means "one of X", while the second "two or more of X", and the difference is always the plural form <-s> affixed to the second word, signaling the key distinction between singular and plural entities.


Allomorphy

by UKT contd. from Wikipedia: http://en.wikipedia.org/wiki/Morphology 090801

One of the largest sources of complexity in morphology is that this one-to-one correspondence between meaning and form scarcely applies to every case in the language. [UKT ¶ ]

In English, we have word form pairs like <ox/oxen>, <goose/geese>, and <sheep/sheep>, where the difference between the singular and the plural is signaled in a way that departs from the regular pattern, or is not signaled at all. Even cases considered "regular", with the final <-s>, are not so simple; the <-s> in <dogs> is not pronounced the same way as the <-s> in <cats>, and in a plural like <dishes>, an "extra" vowel appears before the <-s>. These cases, where the same distinction is effected by alternative forms of a "word", are called allomorphy.

UKT: Compare the singular/plural forms, where the plural form ends in a fricative <-s>:

1. <cat/cats> : /kæt/ --> /kæts/ -- simple addition of plural marker, ending sound /s/

2. <dog/dogs> : /dɒg/ --> /dɒgz/ -- simple addition of plural marker, ending sound /z/

3. <dish/dishes> : /dɪʃ/ --> /dɪʃəz/ -- addition of plural marker not simple (a vowel is inserted), ending sound /z/
(UKT: See DJPD16-155 which does not add /ə/)

Phonological rules constrain which sounds can appear next to each other in a language, and morphological rules, when applied blindly, would often violate phonological rules, by resulting in sound sequences that are prohibited in the language in question. For example, to form the plural of <dish> by simply appending an <-s> to the end of the word would result in the form *[dɪʃs], which is not permitted by the phonotactics of English. In order to "rescue" the word, a vowel sound is inserted between the root and the plural marker, and [dɪʃəz] results. Similar rules apply to the pronunciation of the <-s> in <dogs> and <cats>: it depends on the quality (voiced vs. unvoiced) of the final preceding phoneme.
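The choice among the three pronunciations of the plural marker can be sketched as a small function over phonemic transcriptions. This is only an illustration under stated assumptions: the function name `plural` is hypothetical, the sets `SIBILANTS` and `VOICELESS` are a simplified inventory supplied here (the text itself only mentions the voiced/unvoiced contrast and the vowel inserted in <dishes>), and the inserted vowel is written /ə/ as in the paragraph above.

```python
# Assumed, simplified phoneme classes for English (not taken from the text).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # a vowel is inserted after these
VOICELESS = {"p", "t", "k", "f", "θ"}          # voiceless non-sibilants


def plural(stem):
    """Append the plural allomorph to a stem given as a list of phonemes."""
    last = stem[-1]
    if last in SIBILANTS:
        suffix = ["ə", "z"]   # *[dɪʃs] is not allowed, so a vowel "rescues" the form
    elif last in VOICELESS:
        suffix = ["s"]        # voiceless ending: /s/
    else:
        suffix = ["z"]        # voiced consonants and vowels: /z/
    return stem + suffix


for word in (["k", "æ", "t"], ["d", "ɒ", "g"], ["d", "ɪ", "ʃ"]):
    print("".join(word), "-->", "".join(plural(word)))
```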

UKT: Phonotactics:

When you pronounce a low back vowel such as {au} you can pronounce it in many ways:
¤ with rounded lips producing /ɒ/ or
¤ with lips unrounded producing /ɑ/

Moreover, if you are a Burmese-Myanmar you can still vary it by pitch-registers as:
¤ creak - {aau.}
¤ modal - {aau} the same as {au}
¤ emphatic - {aau:}
However, if you are a native English speaker, you would vary it by short and long vowels.

In other words, the way you pronounce a word depends on what set of vocal muscles you use to produce it, and that, I presume, depends not only on your ethnicity but also on your exposure to a spoken language. This second variation has been observed in native speakers from Yangon who were transferred to Tavoy, where a different dialect is spoken. A native Yangonian sounded Tavoyan after about six months.
I have observed the second variation in my case also. Thus, if I were pronouncing the word <father> I would pronounce the vowel differently whether I was being exposed to Canadian English (if I had been staying in Canada, say, more than six months or so), or English as being spoken in Myanmar (if I had been staying in Myanmar and speaking mostly Burmese all the time). -- UKT 090903

The following is from: http://en.wikipedia.org/wiki/Allomorph 090912

An allomorph is a linguistics term for a variant form of a morpheme. The concept occurs when a unit of meaning can vary in sound (phonologically) without changing meaning. It is used in linguistics to explain the comprehension of variations in sound for a specific morpheme.

Allomorphy in English suffixes

English has several morphemes that vary in sound but not in meaning. Examples include the past tense and the plural morphemes.

For example, in English, a past tense morpheme is <-ed>. It occurs in several allomorphs depending on its phonological environment, assimilating voicing of the previous segment or inserting a schwa, /ə/, when following an alveolar stop [UKT: dental stop for Burmese speaker?]:

• as /əd/ or /ɪd/ in verbs whose stem ends with the alveolar stops /t/ or /d/, such as <hunted> /hʌntəd/ or <banded> /bændəd/

• as /t/ in verbs whose stem ends with voiceless phonemes other than /t/, such as <fished> /fɪʃt/

• as /d/ in verbs whose stem ends with voiced phonemes other than /d/, such as <buzzed> /bʌzd/

Notice the "other than" restrictions above. This is a common fact about allomorphy: if the allomorphy conditions are ordered from most restrictive (in this case, after an alveolar stop) to least restrictive, then the first matching case usually "wins". Thus, the above conditions could be re-written as follows:

• as /əd/ or /ɪd/ when the stem ends with the alveolar stops /t/ or /d/

• as /t/ when the stem ends with voiceless phonemes

• as /d/ elsewhere [that is, when the stem ends with voiced phonemes]

The fact that the /t/ allomorph does not appear after stem-final /t/, despite the fact that the latter is voiceless, is then explained by the fact that /əd/ appears in that environment, together with the fact that the environments are ordered. Likewise, the fact that the /d/ allomorph does not appear after stem-final /d/ is because the earlier clause for the /əd/ allomorph takes priority; and the fact that the /d/ allomorph does not appear after stem-final voiceless phonemes is because the preceding clause for the /t/ takes priority.

Irregular past tense forms, such as <broke> or <was/ were>, can be seen as still more specific cases (since they are confined to certain lexical items, like the verb <break>), which therefore take priority over the general cases listed above.
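The ordering argument above can be sketched as a list of (condition, allomorph) pairs that is scanned from most restrictive to least restrictive, with listed irregular verbs checked before any of them. The names `RULES`, `IRREGULAR`, `past_allomorph` and `past_tense` are hypothetical, the `VOICELESS` set is a simplified inventory assumed for the example, and the irregular form is stored in ordinary spelling rather than in transcription.

```python
# Assumed, simplified set of voiceless phonemes. It deliberately includes /t/:
# the ordering of the rules, not the set, keeps /t/ from receiving the /t/ allomorph.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}

# Ordered from most restrictive to least restrictive; the first match wins.
RULES = [
    (lambda last: last in {"t", "d"}, "əd"),  # after an alveolar stop
    (lambda last: last in VOICELESS, "t"),    # after other voiceless phonemes
    (lambda last: True, "d"),                 # elsewhere
]

# Lexical exceptions are the most specific case of all (stored in spelling).
IRREGULAR = {"break": "broke"}


def past_allomorph(final_phoneme):
    """Return the first allomorph whose condition matches the stem-final phoneme."""
    for condition, allomorph in RULES:
        if condition(final_phoneme):
            return allomorph


def past_tense(lemma, stem):
    """stem is the phonemic transcription of the verb as a list of phonemes."""
    if lemma in IRREGULAR:
        return IRREGULAR[lemma]
    return "".join(stem) + past_allomorph(stem[-1])


print(past_tense("hunt", ["h", "ʌ", "n", "t"]))
print(past_tense("fish", ["f", "ɪ", "ʃ"]))
print(past_tense("buzz", ["b", "ʌ", "z"]))
print(past_tense("break", ["b", "r", "eɪ", "k"]))
```

Because the rules are tried in order, the later conditions need no "other than /t/" or "other than /d/" clauses.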

Stem allomorphy

Allomorphy can also exist in stems or roots, as in Classical Sanskrit [see inset]. There are three allomorphs of the stem: /vaːk/, /vaːt͡ʃ/ and /vaːɡ/. The allomorphs are conditioned by the particular case-marking suffixes.

The form of the stem /vaːk/, found in the nominative singular and locative plural, is the etymological form of the morpheme. Pre-Indic palatalization of velars resulted in the variant form /vaːt͡ʃ/, which was initially phonologically conditioned. This conditioning can still be seen in the Locative Singular form, where the /t͡ʃ/ is followed by the high front vowel /i/.

But subsequent merging of /e/ and /o/ into /a/ made the alternation unpredictable on phonetic grounds in the Genitive case (both Singular and Plural), as well as the Nominative Plural and Instrumental Singular. Hence, this allomorphy was no longer directly relatable to phonological processes.

Phonological conditioning also accounts for the /vaːɡ/ form found in the Instrumental Plural, where the /ɡ/ assimilates in voicing to the following /bʱ/.

History

The term [allomorphy] was originally used to describe variations in chemical structure. It was first applied to language (in writing) in 1948, by E.A. Nida in Language XXIV. ^[1]

UKT: End of http://en.wikipedia.org/wiki/Allomorph 090912


Lexical morphology

by UKT contd. from Wikipedia: http://en.wikipedia.org/wiki/Morphology 090801

Lexical morphology is the branch of morphology that deals with the lexicon, which, morphologically conceived, is the collection of lexemes in a language. As such, it concerns itself primarily with word formation: derivation and compounding.

Models

There are three principal approaches to morphology, which each try to capture the distinctions above in different ways. These are,

• Morpheme-based morphology, which makes use of an Item-and-Arrangement approach.

• Lexeme-based morphology, which normally makes use of an Item-and-Process approach.

• Word-based morphology, which normally makes use of a Word-and-Paradigm approach.

Note that while the associations indicated between the concepts in each item of that list are very strong, they are not absolute.

Morpheme-based morphology

In morpheme-based morphology, word forms are analyzed as arrangements of morphemes. A morpheme is defined as the minimal meaningful unit of a language. [UKT ¶ ]

In a word like <independently>, we say that the morphemes are <in->, <depend>, <-ent>, and <-ly>; <depend> is the root and the other morphemes are, in this case, derivational affixes. ^[5] [UKT ¶ ]

In a word like <dogs>, we say that <dog> is the root, and that <-s> is an inflectional morpheme. [UKT ¶ ]

In its simplest (and most naïve) form, this way of analyzing word forms, called Item-and-Arrangement, treats words as if they were made of morphemes put one after another like beads on a string. More modern and sophisticated approaches seek to maintain the idea of the morpheme while accommodating non-concatenative, analogical, and other processes that have proven problematic for Item-and-Arrangement theories and similar approaches.
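The "beads on a string" picture can be sketched as plain concatenation of labelled morphemes. The function name `arrange` and the role labels are hypothetical; the segmentations are the ones given in the text for <independently> and <dogs>.

```python
# Each word form is an ordered arrangement of (morpheme, role) items.
INDEPENDENTLY = [
    ("in-", "derivational prefix"),
    ("depend", "root"),
    ("-ent", "derivational suffix"),
    ("-ly", "derivational suffix"),
]
DOGS = [("dog", "root"), ("-s", "inflectional suffix")]


def arrange(morphemes):
    """String the morphemes together in order, dropping the boundary hyphens."""
    return "".join(form.strip("-") for form, _role in morphemes)


print(arrange(INDEPENDENTLY))
print(arrange(DOGS))
```

A plural such as <geese> has no suffix to string on, which is the kind of non-concatenative case that this naive arrangement cannot express.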

Morpheme-based morphology presumes three basic axioms (cf. Beard 1995 for an overview and references):

1. Baudouin’s SINGLE MORPHEME HYPOTHESIS:
   Roots and affixes have the same status in the theory, they are MORPHEMES.

2. Bloomfield’s SIGN BASE MORPHEME HYPOTHESIS:
   As morphemes, they are dualistic signs, since they have both (phonological) form and meaning.

3. Bloomfield’s LEXICAL MORPHEME HYPOTHESIS:
   The morphemes, affixes and roots alike, are stored in the lexicon.

Morpheme-based morphology comes in two flavours, one Bloomfieldian and one Hockettian. (cf. Bloomfield 1933 and Charles F. Hockett 1947). For Bloomfield, the morpheme was the minimal form with meaning, but it was not meaning itself. For Hockett, morphemes are meaning elements, not form elements. For him, there is a morpheme plural, with the allomorphs <-s>, <-en>, <-ren> etc. Within much morpheme-based morphological theory, these two views are mixed in unsystematic ways, so that a writer may talk about "the morpheme plural" and "the morpheme -s" in the same sentence, although these are different things.

Lexeme-based morphology

Lexeme-based morphology is (usually) an Item-and-Process approach. Instead of analyzing a word form as a set of morphemes arranged in sequence, a word form is said to be the result of applying rules that alter a word form or stem in order to produce a new one. An inflectional rule takes a stem, changes it as is required by the rule, and outputs a word form; a derivational rule takes a stem, changes it as per its own requirements, and outputs a derived stem; a compounding rule takes word forms, and similarly outputs a compound stem.
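In an Item-and-Process view, each rule is a function from stems or word forms to a new stem or word form. The sketch below is a simplified illustration working on spelling rather than sound; all function names and the table `LEXICAL_PLURALS` are hypothetical, and the spelling conditions in `add_s` are a rough assumption, not a full account of English orthography.

```python
def add_s(stem):
    """Regular plural: -es after a sibilant spelling, otherwise -s (simplified)."""
    if stem.endswith(("s", "sh", "ch", "x", "z")):
        return stem + "es"
    return stem + "s"


def change_vowel(stem):
    """Non-concatenative plural: the process alters the stem itself."""
    return stem.replace("oo", "ee", 1)


def no_change(stem):
    """Plural that is not signalled at all."""
    return stem


# Lexemes whose plural is formed by a special process (illustrative entries only).
LEXICAL_PLURALS = {"goose": change_vowel, "sheep": no_change, "ox": lambda s: s + "en"}


def inflect_plural(stem):
    """Inflectional rule: takes a stem and outputs a word form."""
    return LEXICAL_PLURALS.get(stem, add_s)(stem)


def derive_agent(verb_stem):
    """Derivational rule: takes a verb stem and outputs a derived noun stem."""
    return verb_stem + "er"


def compound(first, second, sep=" "):
    """Compounding rule: takes two word forms and outputs a compound stem."""
    return first + sep + second


dog_catcher = compound("dog", derive_agent("catch"))
print(dog_catcher, "-->", inflect_plural(dog_catcher))
print(compound("dish", derive_agent("wash"), sep=""))
print(inflect_plural("goose"), inflect_plural("ox"), inflect_plural("sheep"))
```

Because a rule is a process rather than a piece, a vowel change such as goose to geese is stated just as easily as a suffix.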

Word-based morphology

Word-based morphology is (usually) a Word-and-Paradigm approach. This theory takes paradigms as a central notion. Instead of stating rules to combine morphemes into word forms, or to generate word forms from stems, word-based morphology states generalizations that hold between the forms of inflectional paradigms. The major point behind this approach is that many such generalizations are hard to state with either of the other approaches. The examples are usually drawn from fusional languages, where a given "piece" of a word, which a morpheme-based theory would call an inflectional morpheme, corresponds to a combination of grammatical categories, for example, "third person plural." Morpheme-based theories usually have no problems with this situation, since one just says that a given morpheme has two categories. Item-and-Process theories, on the other hand, often break down in cases like these, because they all too often assume that there will be two separate rules here, one for third person, and the other for plural, but the distinction between them turns out to be artificial. Word-and-Paradigm approaches treat these as whole words that are related to each other by analogical rules. Words can be categorized based on the pattern they fit into. This applies both to existing words and to new ones. Application of a pattern different from the one that has been used historically can give rise to a new word, such as <older> replacing <elder> (where <older> follows the normal pattern of adjectival comparatives) and <cows> replacing <kine> (where <cows> fits the regular pattern of plural formation).

Morphological typology

In the 19th century, philologists devised a now classic classification of languages according to their morphology. According to this typology, some languages are isolating, and have little to no morphology; others are agglutinative, and their words tend to have lots of easily separable morphemes; while others yet are inflectional or fusional, because their inflectional morphemes are "fused" together. This leads to one bound morpheme conveying multiple pieces of information. The classic example of an isolating language is Chinese; the classic example of an agglutinative language is Turkish; both Latin and Greek are classic examples of fusional languages.

Considering the variability of the world's languages, it becomes clear that this classification is not at all clear cut: many languages do not neatly fit any one of these types, and some fit in more than one way. A continuum of morphological complexity may be adopted instead when considering languages.

The three models of morphology stem from attempts to analyze languages that more or less match different categories in this typology. The Item-and-Arrangement approach fits very naturally with agglutinative languages; while the Item-and-Process and Word-and-Paradigm approaches usually address fusional languages.

The reader should also note that the classical typology mostly applies to inflectional morphology. There is very little fusion going on with word formation. Languages may be classified as synthetic or analytic in their word formation, depending on the preferred way of expressing notions that are not inflectional: either by using word formation (synthetic), or by using syntactic phrases (analytic).

UKT: End of Wikipedia article


Morphological typology

From Wikipedia: http://en.wikipedia.org/wiki/Morphological_typology 090802

Morphological typology is a way of classifying the languages of the world (see linguistic typology) that groups languages according to their common morphological structures. First developed by brothers Friedrich von Schlegel and August von Schlegel, the field organizes languages on the basis of how those languages form words by combining morphemes. Two primary categories exist to distinguish all languages: analytic languages [non-inflectional] and synthetic languages, where each term refers to the opposite end of a continuous scale including all the world's languages.

Analytic languages [non-inflectional : Burmese]

Analytic languages show a low ratio of morphemes to words; in fact, the correspondence is nearly one-to-one. Sentences in analytic languages are composed of independent root morphemes. Grammatical relations between words are expressed by separate words where they might otherwise be expressed by affixes, which are present to a minimal degree in such languages. There is little to no morphological change in words: they tend to be uninflected. Grammatical categories are indicated by word order (for example, inversion of verb and subject for interrogative sentences) or by bringing in additional words (for example, a word for <some> or <many> instead of a plural inflection like English -s). Individual words carry a general meaning (root concept); nuances are expressed by other words. Finally, in analytic languages context and syntax are more important than morphology.
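The ratio of morphemes to words can be computed directly once a text has been segmented. The sketch below assumes a hand-segmented input in which morphemes inside a word are separated by hyphens; the function name `morphemes_per_word` and the sample segmentations are illustrative assumptions, not measurements of any language.

```python
def morphemes_per_word(segmented):
    """Average number of morphemes per word in a hyphen-segmented string."""
    words = segmented.split()
    morpheme_count = sum(len(word.split("-")) for word in words)
    return morpheme_count / len(words)


# Hand-segmented English examples built from words used in this page.
print(morphemes_per_word("the dog catch-er find-s dog-s"))  # 8 morphemes / 5 words
print(morphemes_per_word("in-depend-ent-ly"))                # 4 morphemes / 1 word
```

A value close to 1 corresponds to the analytic end of the scale; higher values point toward the synthetic end.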

Analytic languages include some of the major East Asian languages, such as Chinese and Vietnamese. Additionally, English is moderately analytic (probably one of the most analytic of Indo-European languages). However, it is traditionally analyzed as a synthetic language.

Synthetic languages [inflectional]

Synthetic languages form words by affixing a given number of dependent morphemes to a root morpheme. The morphemes may be distinguishable from the root, or they may not. They may be fused with it or among themselves (in that multiple pieces of grammatical information may potentially be packed into one morpheme). Word order is less important for these languages than it is for analytic languages, since individual words express the grammatical relations that would otherwise be indicated by syntax. In addition, there tends to be a high degree of concordance (agreement, or cross-reference between different parts of the sentence). Therefore, morphology in synthetic languages is more important than syntax. Most Indo-European languages are moderately synthetic.

There are two subtypes of synthesis, according to whether morphemes are clearly differentiable or not. These subtypes are agglutinative and fusional (or inflectional or flectional in older terminology).

Agglutinative languages

Agglutinative languages have words containing several morphemes that are always clearly differentiable from one another in that each morpheme represents only one grammatical meaning and the boundaries between those morphemes are easily demarcated; that is, the bound morphemes are affixes, and they may be individually identified. Agglutinative languages tend to have a high number of morphemes per word, and their morphology is highly regular.

Agglutinative languages include Korean, Hungarian, Turkish, Japanese and Luganda.

Fusional languages

Morphemes in fusional languages are not readily distinguishable from the root or among themselves. Several grammatical bits of meaning may be fused into one affix. Morphemes may also be expressed by internal phonological changes in the root (i.e. morphophonology), such as consonant gradation and vowel gradation, or by suprasegmental features such as stress or tone, which are of course inseparable from the root.

Most Indo-European languages are fusional to a varying degree. A remarkably high degree of fusionality is also found in certain Sami languages such as Skolt Sami.

Polysynthetic languages

In 1836, Wilhelm von Humboldt proposed a third category for classifying languages, a category that he labeled polysynthetic. (The term polysynthesis was first used in linguistics by Peter Stephen DuPonceau who borrowed it from chemistry.) These languages have a high morpheme-to-word ratio, a highly regular morphology, and a tendency for verb forms to include morphemes that refer to several arguments besides the subject (polypersonalism). Another feature of polysynthetic languages is commonly expressed as "the ability to form words that are equivalent to whole sentences in other languages". Of course, this is rather useless as a defining feature, since it is tautological ("other languages" can only be defined by opposition to polysynthetic ones, and vice versa).

Many Amerindian languages are polysynthetic. Inuktitut is one example: the word-phrase tavvakiqutiqarpiit roughly translates to "Do you have any tobacco for sale?".

Note that no clear division exists between synthetic languages and polysynthetic languages; the place of one language largely depends on its relation to other languages displaying similar characteristics on the same scale.

Morphological typology in reality

Each of the types above is an idealization; none exists in a pure state in reality. Although languages generally fit best into one category, all of them are mixed types. English is largely analytic: it is more analytic than Spanish, and much more analytic than Latin. Chinese is the usual model of analytic languages, but it does have some bound morphemes. Japanese is highly synthetic (agglutinative) in its verbs, but clearly analytic in its nouns. For these reasons, the scale above is continuous and relative, not absolute. It is difficult to classify a language as absolutely analytic or synthetic, as a language could be described as more synthetic than Chinese, but less synthetic than Korean.
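One rough way to place a language on such a continuous scale is to count morphemes per word. The following is only a minimal sketch: the function name morphemes_per_word, the hyphen-based segmentation convention, and the two hand-segmented sample phrases are assumptions made for illustration, not data from the article.

```python
def morphemes_per_word(segmented_text: str) -> float:
    """Average number of morphemes per word.

    Assumes words are separated by spaces and morphemes by hyphens.
    """
    words = segmented_text.split()
    if not words:
        raise ValueError("need at least one word")
    morpheme_count = sum(len(word.split("-")) for word in words)
    return morpheme_count / len(words)


# Hypothetical, hand-segmented samples (illustrative only).
english = "the cat-s walk-ed"   # 3 words, 5 morphemes
turkish = "ev-ler-im-de"        # 1 word, 4 morphemes ('in my houses')

english_ratio = morphemes_per_word(english)
turkish_ratio = morphemes_per_word(turkish)
```

A higher ratio points toward the synthetic end of the scale, but a figure computed from so small a sample says nothing reliable about a whole language.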

End of Wikipedia article


UKT notes

agglutinative language

From Wikipedia: http://en.wikipedia.org/wiki/Agglutinative_language 090830

An agglutinative language is a language that uses agglutination extensively: most words are formed by joining morphemes together. This term was introduced by Wilhelm von Humboldt in 1836 to classify languages from a morphological point of view. ^[1] It was derived from the Latin verb agglutinare, which means "to glue together." ^[2]

UKT: Is it possible that Burmese-Myanmar is agglutinative? In the example, {nwa: kyaung: tha:} 'cow herd', we have the morphemes {nwa:} 'cattle', {kyaung:} 'to protect, look after, direct', {tha:} 'son, person (male)' strung together in a certain order and not in any other.

An agglutinative language is a form of synthetic language where each affix typically represents one unit of meaning (such as <diminutive>, <past tense>, <plural>, etc.), and bound morphemes are expressed by affixes (and not by internal changes of the root of the word, or changes in stress or tone). Additionally, and most importantly, in an agglutinative language affixes do not become fused with others, and do not change form conditioned by others.

Synthetic languages that are not agglutinative are called fusional languages; they sometimes combine affixes by "squeezing" them together, often changing them drastically in the process, and joining several meanings in one affix (for example, in the Spanish word comí [I ate], the suffix -í carries the meanings of indicative mood, active voice, past tense, first person singular subject and perfect aspect).
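The contrast between the two types can be sketched as data. The Spanish analysis below follows the text above. The Turkish word evlerimde ('in my houses') is an example added here and is not part of the source article. The name max_meanings_per_affix is hypothetical.

```python
# Agglutinative: each affix carries exactly one meaning
# (Turkish "evlerimde", 'in my houses'; example added for illustration).
AGGLUTINATIVE = [
    ("ev", ["ROOT: house"]),
    ("-ler", ["plural"]),
    ("-im", ["first person singular possessor"]),
    ("-de", ["locative"]),
]

# Fusional: one affix carries several meanings at once
# (Spanish "comí", 'I ate', as described in the text).
FUSIONAL = [
    ("com-", ["ROOT: eat"]),
    ("-í", [
        "indicative mood",
        "active voice",
        "past tense",
        "first person singular subject",
        "perfect aspect",
    ]),
]


def max_meanings_per_affix(analysis):
    """Largest number of meanings packed into a single non-root morpheme."""
    return max(len(meanings) for _form, meanings in analysis[1:])
```

With these toy analyses the function gives 1 for the Turkish word and 5 for the Spanish one, which is the difference the paragraph describes.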

Agglutinative is sometimes used as a synonym for synthetic, although it technically is not. When used in this way, the word embraces fusional languages and inflected languages in general. The distinction between an agglutinative and a fusional language is often not sharp. Rather, one should think of these as two ends of a continuum, with various languages falling more toward one end or the other. In fact, a synthetic language may present agglutinative features in its open lexicon but not in its case system: for example, German, Dutch.

Agglutinative languages tend to have a high rate of affixes/morphemes per word, and to be very regular. For example, Japanese has only two irregular verbs (and not very irregular), Luganda has only one (or two, depending on how 'irregular' is defined), Turkish has only one and in Quechua all the verbs are regular. Georgian is an exception; not only is it highly agglutinative (there can be simultaneously up to 8 morphemes per word), but there is also a significant number of irregular verbs, varying in degrees of irregularity.

Examples of agglutinative languages

Examples of agglutinative languages include Basque, Blackfoot, Georgian, the Altaic languages (see Turkish and Tatar), Japanese (sometimes grouped with Altaic), Korean (sometimes grouped with Altaic), Malay and Indonesian, many Tibeto-Burman languages, the Dravidian languages, many Uralic languages (the largest are Hungarian, Finnish and Estonian), Inuktitut, the Bantu languages (see Luganda), the Northeast, Northwest and South Caucasian languages, and some Mesoamerican and native North American languages including Nahuatl, Huastec, and Salish. Quechua and Aymara are Native American languages of South America. Much of the ancient Near East also spoke such languages, such as Sumerian, Elamite, Hurrian, Urartian, Hattic, Gutian, Lullubi, Kassite, and among Indo-European languages, Persian.

Agglutination is a typological feature and does not imply a linguistic relation, but there are some families of agglutinative languages. For example, the Proto-Uralic language, the ancestor of Uralic languages, was agglutinative, and most descended languages inherit this feature. But since agglutination can arise in languages that previously had a non-agglutinative typology and it can be lost in languages that previously were agglutinative, agglutination as a typological trait cannot be used as evidence of genetic relationship to other agglutinative languages.

Many separate languages developed this property through convergent evolution. There seems to exist a preferred evolutionary direction from agglutinative synthetic languages to fusional synthetic languages, and then to non-synthetic languages, which in their turn evolve into isolating languages and from there again into agglutinative synthetic languages. However, this is just a trend, and in itself a combination of the trend observable in Grammaticalization theory and that of general linguistic attrition, especially word-final apocope and elision. This phenomenon is known as language drift.

Some real-world and fictional constructed languages, such as Esperanto, Newspeak, Klingon, Atlantean and Black Speech are presented as agglutinative.

UKT: End of Wikipedia article.


morpheme

mor·pheme n. Linguistics 1. A meaningful linguistic unit consisting of a word, such as <man>, or a word element, such as <-ed> in <walked>, that cannot be divided into smaller meaningful parts. [French morphème, blend of Greek morphē, form, and French phonème, phoneme; see phoneme] -- AHTD
¤ word <cat> ; word element <-s> in <cats>
¤ word <walk> ; word element <-ed> in <walked>

The following by UKT based on Wikipedia: http://en.wikipedia.org/wiki/Morpheme 090731

In morpheme-based morphology, a morpheme is the smallest linguistic unit that has semantic meaning.

Morphemes are composed of
• phonemes (the smallest linguistically distinctive sound units) in spoken languages
• graphemes (the smallest units of written language) in written languages

The concept morpheme differs from the concept word, as many morphemes cannot stand as words on their own. A morpheme is free if it can stand alone, or bound if it is used exclusively alongside a free morpheme. Its actual phonetic representation is the morph, with the different morphs representing the same morpheme being grouped as its allomorphs.

al·lo·morph ² n. 1. Any of the variant forms of a morpheme. For example, the phonetic /s/ of <cats> /kæts/, /z/ of <dogs> /dɒgz (US) dɔːgz/, and /ɪz/ of <horses> /hɔːsɪz/ and the /en/ of <oxen> /ˡɒk.s^ən (US) ˡɒːk-/ are allomorphs of the English plural morpheme. -- AHTD
Note: the IPA introduced above are mine -- UKT090911

Allomorphs are variants of a morpheme, e.g. the plural marker in English is sometimes realized as /-z/, /-s/ or /-ɨz/. -- http://en.wikipedia.org/wiki/Morpheme

English example:

The word "unbreakable" has three morphemes:

<un->, a bound morpheme - prefix
<break>, a free morpheme - [UKT: root]
<-able>, a bound morpheme - suffix
Both <un-> and <-able> are affixes: <un-> is the prefix, and <-able> the suffix.
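The analysis of "unbreakable" can be imitated by naive affix stripping. This is only a sketch under stated assumptions: the affix lists PREFIXES and SUFFIXES and the function name segment are hypothetical, and the approach will wrongly split words such as "uncle" that merely begin with the same letters.

```python
from typing import List, Tuple

# Hypothetical affix inventories; a real analyser would need far larger lists.
PREFIXES = ("un-",)
SUFFIXES = ("-able",)


def segment(word: str) -> List[Tuple[str, str]]:
    """Strip at most one known prefix and one known suffix from a word."""
    leading = []
    trailing = []
    for prefix in PREFIXES:
        bare = prefix.rstrip("-")
        if word.startswith(bare) and len(word) > len(bare):
            leading.append((prefix, "bound morpheme, prefix"))
            word = word[len(bare):]
            break
    for suffix in SUFFIXES:
        bare = suffix.lstrip("-")
        if word.endswith(bare) and len(word) > len(bare):
            trailing.append((suffix, "bound morpheme, suffix"))
            word = word[:-len(bare)]
            break
    # Whatever remains is treated as the free root morpheme.
    return leading + [(word, "free morpheme, root")] + trailing


analysis = segment("unbreakable")
```

Here analysis holds the three pieces un-, break and -able with their labels, in the same order as the list above.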

The morpheme plural-s has:

the morph voiceless "-s" : /s/, in <cats> /kæts/, but
the morph "-es" : /ɨz/, in <dishes> /dɪʃɨz/,
the morph voiced "-s" : /z/, in <dogs> /dɒgz/
These are allomorphs.
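The choice among these three allomorphs depends on the final sound of the stem, and can be sketched as a small function. The sound sets SIBILANTS and VOICELESS and the name plural_allomorph are assumptions made for this illustration. The sketch covers regular plurals only, so it says nothing about forms such as <oxen>.

```python
# Stem-final sounds, in IPA. These sets are an illustrative assumption.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(final_sound: str) -> str:
    """Pick the regular English plural allomorph from the stem's final sound."""
    if final_sound in SIBILANTS:
        return "ɨz"   # as in <dishes>
    if final_sound in VOICELESS:
        return "s"    # as in <cats>
    return "z"        # voiced sounds and vowels, as in <dogs>
```

For example, plural_allomorph("t") selects the voiceless morph for <cats>, while plural_allomorph("ʃ") selects the syllabic morph for <dishes>.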

Types of morphemes

• Free morphemes like <town> and <dog> can appear with other lexemes (as in <town hall> or <dog house>) or they can stand alone, which is why they are called "free".

• Bound morphemes like "un-" appear only together with other morphemes to form a lexeme. Bound morphemes in general tend to be prefixes and suffixes. Unproductive, non-affix morphemes that exist only in bound form are known as "cranberry" morphemes, from the "cran" in that very word.

• Derivational morphemes can be added to a word to create (derive) another word: the addition of "-ness" to "happy," for example, to give "happiness." They carry semantic information.

• Inflectional morphemes modify a word's tense, number, aspect, and so on, without deriving a new word or a word in a new grammatical category (for example, the morpheme "dog" combined with the plural marker morpheme "-s" becomes "dogs"). They carry grammatical information.

• Allomorphs are variants of a morpheme, e.g. the plural marker in English is sometimes realized as /-z/, /-s/ or /-ɨz/.

Other variants

• Null morpheme
• Root morpheme
• Word stem

Morphological analysis

In natural language processing for Japanese, Chinese and other languages, morphological analysis is the process of segmenting a given sentence into a sequence of morphemes. It is closely related to part-of-speech tagging, but word segmentation is required for these languages because word boundaries are not indicated by blank spaces. Well-known Japanese morphological analysers include Juman, ChaSen and MeCab.
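Segmentation without blank spaces can be illustrated with greedy longest-match lookup against a dictionary. This is a simplified sketch and does not show how the analysers named above work. The function name longest_match_segment and the tiny lexicon TOY_LEXICON are hypothetical.

```python
from typing import List, Set


def longest_match_segment(text: str, lexicon: Set[str]) -> List[str]:
    """Greedy left-to-right segmentation using the longest dictionary entry."""
    max_len = max(len(entry) for entry in lexicon)
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in lexicon:
                break
        else:
            # No entry matched: emit the single unknown character.
            candidate = text[i]
        pieces.append(candidate)
        i += len(candidate)
    return pieces


# Toy lexicon for the sentence 私は学生です ('I am a student').
TOY_LEXICON = {"私", "は", "学生", "学", "生", "です"}
segments = longest_match_segment("私は学生です", TOY_LEXICON)
```

Greedy matching prefers 学生 over the shorter entry 学, but it can go wrong when the longest match at one position blocks a better analysis later in the sentence.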

UKT: End of Wikipedia article


null morpheme

From Wikipedia: http://en.wikipedia.org/wiki/Null_morpheme 090911

In morpheme-based morphology, a null morpheme is a morpheme that is realized by a phonologically null affix (an empty string of phonological segments). In simpler terms, a null morpheme is an "invisible" affix. It's also called zero morpheme; the process of adding a null morpheme is called null affixation, null derivation or zero derivation. The concept was first used over two thousand years ago by Pāṇini in his Sanskrit grammar. Some linguists object to the notion of a null morpheme, arguing that it sets up an unverifiable distinction between a "null" or "zero" element, and nothing at all.

The null morpheme is represented as either the figure zero (0) or the empty set symbol (Ø).

Examples in English include hiatus and co-operation.

The existence of a null morpheme in a word can also be theorized by contrast with other forms of the same word showing alternate morphemes. For example, the singular number of English nouns is shown by a null morpheme that contrasts with the plural morpheme -s.

• cat = cat + -Ø = ROOT ("cat") + SINGULAR
• cats = cat + -s = ROOT ("cat") + PLURAL

In addition, there are some cases in English where a null morpheme indicates plurality in nouns that take on irregular plurals.

• sheep = sheep + -Ø = ROOT ("sheep") + SINGULAR
• sheep = sheep + -Ø = ROOT ("sheep") + PLURAL

Also, a null morpheme marks the present tense of verbs in all forms but the third person singular:

• (I) run = run + -Ø = ROOT ("run") + PRESENT: Non-3rd-SING
• (He) runs = run + -s = ROOT ("run") + PRESENT: 3rd-SING
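The analyses above can be reproduced with a lookup table in which the null morpheme is an explicit entry that adds nothing to the written form. This is a sketch only. The table layout and the names AFFIXES, IRREGULAR and analyse are assumptions made for illustration.

```python
NULL = "Ø"

# (word class, feature) -> affix; the null morpheme is stored explicitly.
AFFIXES = {
    ("noun", "SINGULAR"): NULL,
    ("noun", "PLURAL"): "s",
    ("verb", "PRESENT: Non-3rd-SING"): NULL,
    ("verb", "PRESENT: 3rd-SING"): "s",
}

# Roots whose plural is marked by the null morpheme instead of -s.
IRREGULAR = {("sheep", "PLURAL"): NULL}


def analyse(root, word_class, feature):
    """Return the surface form and a gloss in the notation used above."""
    affix = IRREGULAR.get((root, feature), AFFIXES[(word_class, feature)])
    # A null affix contributes no segments to the surface form.
    surface = root if affix == NULL else root + affix
    gloss = f'{root} + -{affix} = ROOT ("{root}") + {feature}'
    return surface, gloss
```

Calling analyse("sheep", "noun", "PLURAL") gives the surface form "sheep" with a gloss containing -Ø, while analyse("cat", "noun", "PLURAL") gives "cats".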

In the view of some linguists, it is also a null morpheme that turns some English adjectives into verbs such as to clean, to slow, and to warm. Null derivation, also known as conversion if the word class changes, is very common in analytic languages such as English.

In languages that show the above distinctions, it's quite common to employ null affixation to (not) mark singular number, present tense and third persons (English is unusual in its marking of the third person singular with a non-zero morpheme, by contrast with a null morpheme for others). It's also frequent to find null affixation for the least-marked cases (the nominative in nominative-accusative languages, and the absolutive in ergative-absolutive languages).

In most languages of the world these are the affixes that are realized as null morphemes. But in some cases roots may also be realized as these. For instance, the Russian word вы-Ø-ну-ть (vynut', 'to take out') consists of one prefix (вы-), one zero root (-Ø-), one suffix (-ну-), and one postfix (-ть) ^[1].

A basic radical element plus a null morpheme is not the same as an uninflected word, though usage may make those equal in practice.


syntax

From Wikipedia: http://en.wikipedia.org/wiki/Syntax 090902

In linguistics, syntax (from Ancient Greek συν- syn-, "together", and τάξις táxis, "arrangement") is the study of the principles and rules for constructing sentences in natural languages. [To be differentiated from "made-up" languages like Esperanto and "machine" language used in computers.] In addition to referring to the discipline, the term syntax is also used to refer directly to the rules and principles that govern the sentence structure of any individual language, as in "the syntax of Modern Irish."

UKT: Simple sentences in:
¤ Burmese-Myanmar - SOV ("Subject-Object-Verb" aka "Subject-Copula-Predicate"): Subject is aka (also known as) Agent.
¤ English-Latin - SVO (Subject-Verb-Object)

Modern research in syntax attempts to describe languages in terms of such rules. Many professionals in this discipline attempt to find general rules that apply to all natural languages. The term syntax is also sometimes used to refer to the rules governing the behavior of mathematical systems, such as logic, artificial formal languages, and computer programming languages.

Early history

Works on grammar were being written long before modern syntax came about; the Aṣṭādhyāyī of Pāṇini पाणिनि {pa-Ni.ni.} (fl. 4th century BC) is often cited as an example of a pre-modern work that approaches the sophistication of a modern syntactic theory. ^[1] In the West, the school of thought that came to be known as "traditional grammar" began with the work of Dionysius Thrax.

UKT: Dionysius Thrax (thrăks), fl. 100 B.C. 1. Greek grammarian who taught at Rhodes and Rome and wrote an influential synthesis of Greek grammar, the Art of Grammar. -- AHTD

For centuries, work in syntax was dominated by a framework known as grammaire générale, first expounded in 1660 by Antoine Arnauld in a book of the same title. This system took as its basic premise the assumption that language is a direct reflection of thought processes and therefore there is a single, most natural way to express a thought. That way, coincidentally, was exactly the way it was expressed in French.

However, in the 19th century, with the development of historical-comparative linguistics, linguists began to realize the sheer diversity of human language, and to question fundamental assumptions about the relationship between language and logic. It became apparent that there was no such thing as a most natural way to express a thought, and therefore logic could no longer be relied upon as a basis for studying the structure of language.

The Port-Royal grammar modeled the study of syntax upon that of logic (indeed, large parts of the Port-Royal Logic were copied or adapted from the Grammaire générale ^[2]). Syntactic categories were identified with logical ones, and all sentences were analyzed in terms of "Subject-Copula-Predicate". Initially, this view was adopted even by the early comparative linguists such as Franz Bopp [1791-1867].

The central role of syntax within theoretical linguistics became clear only in the 20th century, which could reasonably be called the "century of syntactic theory" as far as linguistics is concerned. For a detailed and critical survey of the history of syntax in the last two centuries, see the monumental work by Graffi (2001).

There are a number of modern theoretical approaches to the discipline of syntax. Many linguists (e.g. Noam Chomsky [1928- ]) see syntax as a branch of biology, since they conceive of syntax as the study of linguistic knowledge as embodied in the human mind. Others (e.g. Gerald Gazdar) take a more Platonistic view, since they regard syntax as the study of an abstract formal system. ^[3] Yet others (e.g. Joseph Greenberg) consider grammar a taxonomical device to reach broad generalizations across languages. Some of the major approaches to the discipline are listed below.

Generative Grammar

The hypothesis of generative grammar is that language is a structure of the human mind. The goal of generative grammar is to make a complete model of this inner language (known as i-language). This model could be used to describe all human language and to predict the grammaticality of any given utterance (that is, to predict whether the utterance would sound correct to native speakers of the language). This approach to language was pioneered by Noam Chomsky. Most generative theories (although not all of them) assume that syntax is based upon the constituent structure of sentences. Generative grammars are among the theories that focus primarily on the form of a sentence, rather than its communicative function.

Among the many generative theories of linguistics, the Chomskyan theories are:

• Transformational Grammar (TG) (Original theory of generative syntax laid out by Chomsky in Syntactic Structures in 1957 ^[4])
• Government and binding theory (GB) (revised theory in the tradition of TG developed mainly by Chomsky in the 1970s and 1980s). ^[5]
• The Minimalist Program (MP) (revised version of GB published by Chomsky in 1995) ^[6]

Other theories that find their origin in the generative paradigm are:

• Generative semantics (now largely out of date)
• Relational grammar (RG) (now largely out of date)
• Arc Pair grammar
• Generalized phrase structure grammar (GPSG; now largely out of date)
• Head-driven phrase structure grammar (HPSG)
• Lexical-functional grammar (LFG)

Categorial Grammar

Categorial grammar is an approach that attributes the syntactic structure not to rules of grammar, but to the properties of the syntactic categories themselves. For example, rather than asserting that sentences are constructed by a rule that combines a noun phrase (NP) and a verb phrase (VP) (e.g. the phrase structure rule S → NP VP), in categorial grammar, such principles are embedded in the category of the head word itself. So the syntactic category for an intransitive verb is a complex formula representing the fact that the verb acts as a functor which requires an NP as an input and produces a sentence level structure as an output. This complex category is notated as (NP\S) instead of V. NP\S is read as "a category that searches to the left (indicated by \) for an NP (the element on the left) and outputs a sentence (the element on the right)". The category of transitive verb is defined as an element that requires two NPs (its subject and its direct object) to form a sentence. This is notated as (NP/(NP\S)), which means "a category that searches to the right (indicated by /) for an NP (the object), and generates a function (equivalent to the VP) which is (NP\S), which in turn represents a function that searches to the left for an NP and produces a sentence".
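The way such categories combine can be sketched as function application. To avoid depending on any one slash convention, the sketch stores the search direction explicitly. The class names Atom and Functor and the function combine are hypothetical, and only plain application is modelled, not the full machinery of any particular categorial framework.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """A basic category such as NP or S."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A category that searches for `arg` in `direction` and yields `result`."""
    arg: "Cat"
    result: "Cat"
    direction: str  # "left" or "right"


Cat = Union[Atom, Functor]


def combine(left: Cat, right: Cat) -> Optional[Cat]:
    """Combine two adjacent categories by application, or return None."""
    # The left item searches to its right for its argument.
    if isinstance(left, Functor) and left.direction == "right" and left.arg == right:
        return left.result
    # The right item searches to its left for its argument.
    if isinstance(right, Functor) and right.direction == "left" and right.arg == left:
        return right.result
    return None


NP, S = Atom("NP"), Atom("S")
INTRANSITIVE = Functor(arg=NP, result=S, direction="left")              # NP\S
TRANSITIVE = Functor(arg=NP, result=INTRANSITIVE, direction="right")    # seeks object first

verb_phrase = combine(TRANSITIVE, NP)   # verb + object
sentence = combine(NP, verb_phrase)     # subject + verb phrase
```

The transitive verb first consumes its object to the right, leaving the intransitive category, which then consumes the subject to the left and yields S.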

Tree-adjoining grammar is a categorial grammar that adds in partial tree structures to the categories.

Dependency grammar

Dependency grammar is a different type of approach in which structure is determined by the relations (such as grammatical relations) between a word (a head) and its dependents, rather than being based in constituent structure. For example, syntactic structure is described in terms of whether a particular noun is the subject or agent of the verb, rather than describing the relations in terms of phrases.
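A dependency analysis can be stored as a set of head-dependent relations with no phrase nodes at all. The sketch below analyses the sentence "the dog chased a cat". The relation labels and the function names dependents_of and find_roots are assumptions made for illustration and do not follow any particular dependency theory.

```python
# (dependent, relation, head) triples for "the dog chased a cat".
# Relation labels are illustrative assumptions.
DEPENDENCIES = [
    ("dog", "subject", "chased"),
    ("cat", "object", "chased"),
    ("the", "determiner", "dog"),
    ("a", "determiner", "cat"),
]


def dependents_of(head, dependencies):
    """Map each relation borne to `head` onto the dependent word."""
    return {
        relation: dependent
        for dependent, relation, head_word in dependencies
        if head_word == head
    }


def find_roots(dependencies):
    """Words that act as heads but depend on nothing themselves."""
    heads = {head for _dep, _rel, head in dependencies}
    dependents = {dep for dep, _rel, _head in dependencies}
    return heads - dependents
```

Here the verb is the only root, and asking for its dependents returns the subject and object directly, without reference to noun phrases or verb phrases.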

Some dependency-based theories of syntax:

• Algebraic syntax
• Word grammar
• Operator Grammar

Stochastic/probabilistic grammars/network theories

Theoretical approaches to syntax that are based upon probability theory are known as stochastic grammars. One common implementation of such an approach makes use of a neural network or connectionism. Some theories based within this approach are:

• Optimality theory
• Stochastic context-free grammar
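In a stochastic context-free grammar each rewrite rule carries a probability, and the probability of a parse tree is the product of the probabilities of the rules used in it. The toy grammar below, including its probabilities and the names RULE_PROBABILITIES and tree_probability, is invented for illustration.

```python
# (left-hand side, right-hand side) -> probability.
# Probabilities for each left-hand side sum to 1; all values are invented.
RULE_PROBABILITIES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 0.5,
    ("NP", ("cats",)): 0.5,
    ("VP", ("V", "NP")): 0.4,
    ("VP", ("V",)): 0.6,
    ("V", ("chase",)): 1.0,
}


def tree_probability(tree):
    """Probability of a parse tree given as (label, [children]).

    A child is either another tree or a plain string (a word).
    """
    label, children = tree
    rhs = tuple(child if isinstance(child, str) else child[0] for child in children)
    probability = RULE_PROBABILITIES[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            probability *= tree_probability(child)
    return probability


# Parse of "dogs chase cats".
PARSE = ("S", [("NP", ["dogs"]), ("VP", [("V", ["chase"]), ("NP", ["cats"])])])
parse_probability = tree_probability(PARSE)
```

For this parse the product is 1.0 × 0.5 × 0.4 × 1.0 × 0.5, i.e. 0.1. Comparing such products across competing parses is how a grammar of this kind ranks analyses.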

Functionalist grammars

Functionalist theories, although focused upon form, are driven by explanation based upon the function of a sentence (i.e. its communicative function). Some typical functionalist theories include:

• Functional grammar (Dik)
• Prague Linguistic Circle
• Systemic functional grammar
• Cognitive grammar
• Construction grammar (CxG)
• Role and reference grammar (RRG)
• Emergent grammar

UKT: End of Wikipedia article


End of TIL page