Update: 2020-11-15 05:04 PM -0500

TIL

BEPS Grammar

BEPS01-1.htm
formerly: ch01t.htm

by U Kyaw Tun (UKT) (M.S., I.P.S.T., USA), Tun Institute of Learning (TIL). 
Based on Burmese Grammar and Grammatical Analysis by A. W. Lonsdale, Education Department, Burma, British Burma Press, Rangoon, 1899. In two parts:
Part 1. Orthoepy (pronunciation) and orthography (spelling) --  BG1899-1-indx.htm - update 2019Nov 
Part 2. Accidence (morphology) and syntax (sentence structure)  --  BG1899-2-indx.htm - update 121117
I plan to update Lonsdale's work and name it BEPS (Burmese, English, Pali, Sanskrit speeches in Myanmar, Latin, and Devanagari scripts) Grammar. But before I can do so, I must equip myself with Linguistics and Phonetics. This is my attempt to educate myself in these fields. Kindly remember that I am a chemist-cum-chemical engineer and had no knowledge of Linguistics and Phonetics. Moreover, I was very weak in traditional grammar - both English and Burmese - had almost no knowledge of Pali, and had absolutely no knowledge of the Sanskrit speech and Devanagari script. I am just a learner and bound to make fatal mistakes. I am doing this for my own satisfaction at this late age in life: I'm 85. I am sharing my work with my own students and the staff of TIL. Not for sale. No copyright. Free for everyone. Prepared for students and staff of TIL Research Station, Yangon, MYANMAR :  http://www.tuninst.net , www.romabama.blogspot.com

index.htm | Top
BG1899-1-indx

Contents of this page 

Speech segmentation
Introduction - Wiki: https://en.wikipedia.org/wiki/Speech_segmentation 111112
Lexical recognition - Wiki
Phonotactic cues - Wiki
Speech segmentation in infants and non-natives - Wiki

 

UKT notes 
As you go on reading the Wikipedia article on Speech Segmentation you will come across terms on which I want you to concentrate. These terms are important for me who is still a novice in the field. These terms are arranged in alphabetical order under my notes.
- Coarticulation - Wiki: https://en.wikipedia.org/wiki/Coarticulation 191113
- Grapheme - Wiki: https://en.wikipedia.org/wiki/Grapheme 191115
- Lexeme - Wiki: https://en.wikipedia.org/wiki/Lexeme 191116
- Morpheme - Wiki: https://en.wikipedia.org/wiki/Morpheme 191116
- Phonotactics - Wiki: https://en.wikipedia.org/wiki/Phonotactics 191112
- Semantics - Wiki: https://en.wikipedia.org/wiki/Semantics
- Word - Wiki: https://en.wikipedia.org/wiki/Word 191116

Contents of this page

Speech segmentation

UKT 191112: Though Bur-Myan differentiates {sa.ka.} and {sa}, I feel that I am not fully prepared to treat Burmese as a scientific language. Bur-Myan, unlike Pal-Myan, is free from inflexion.

UKT 191115: In BEPS study, I need one-to-one definitions, between Bur-Myan and Eng-Lat, for many grammatical terms. I cannot rely on the MLC Myanmar-English Dictionary, 2006, for it uses many Pali words, many times with multiple meanings. I have to come up with my own convention (which may change as my knowledge advances), given on the right:

I now quote A. W. Lonsdale, in his Burmese Grammar and Grammatical Analysis, Rangoon 1899:
¤ "The Burmese language is constructed on scientific principles, and there is no reason why its grammar should not be dealt with also from a scientific standpoint. But it may be safely said that Burmese grammar as a science has not received that attention it deserves.
¤ "With regard to the grammatical treatises by native writers, ... not content with merely borrowing the grammatical nomenclature of the Pali language, ... assimilate the grammatical principles of the uninflected Burmese to those of the inflected Pali; so that they produced, not Burmese grammars, but modified Pali grammars in Burmese dress."

Going back further, A. Judson wrote in 1883 in his Grammar of the Burmese Language:
"§2. The pure Burmese is monosyllabic, every word consisting of one syllable only; but the introduction of the Pali language, with the Boodhistic religion, has occasioned the incorporation of many polysyllabic words of Pali origin into the pure Burmese." 
On Grammatical Case:
"§57. The relations of nouns expressed in most languages by prepositions or inflections, are in the Burmese language expressed by particles affixed to the noun, without any inflection of the noun itself. "

Now that I've ascertained that Bur-Myan - which I presume to be the language of the Arigyi (Ari monks) - is quite different from most languages, it would be fruitless to use traditional methods of analysis. I therefore propose to rely on the shape of the glyphs - based on perfectly rounded circles - to bring out the meaning. I cannot rely on pronunciation when we are dealing with languages of different linguistic groups.

Since Bur-Myan is a segmental language, let's start with Segmentation. Whether it is "Speech segmentation" or "Script segmentation" is immaterial because Bur-Myan is a phonetic language. Pronounce it as it is spelled and you will be understood. The motto is:
   {ré:tau. a.mhûn} "what is written is correct",
   {hpût-tau. a.þän} "what is vocalized is transient sound"
- just the opposite of what modern Western phoneticians would like us to do. But first, let's see what these Westerners are doing by watching a series of videos in TIL HD-VIDEO and SD-VIDEO libraries (Phonetics section), by Dr. Jürgen Handke, Marburg Univ., Germany:
- PHO106-BasicSegOfSpeechConson<Ô> / Bkp<Ô> (link chk 191113)
- PHO107-BasicSegOfSpeechVowels1<Ô> / Bkp<Ô> (link chk 191113)
- PHO107-BasicSegOfSpeechVowels2<Ô> / Bkp<Ô> (link chk 191113)

From Wikipedia: https://en.wikipedia.org/wiki/Speech_segmentation 111112
The following are still linked to Wikipedia:
4. See also
5. References
6. External links

UKT notes
As you go on reading the Wikipedia article on Speech Segmentation you will come across terms on which I want you to concentrate. These terms are important for me who is still a novice in the field. They are arranged in alphabetical order under my notes.

Contents of this page

Introduction

Speech segmentation is the process of identifying the boundaries between words {waad}, syllables {sa.ka:þän-su.}, or phonemes {sa.ka:aim} in spoken natural languages. The term applies both to the mental processes used by humans, and to artificial processes of natural language processing.

UKT 191113: see my notes below for word  
"In linguistics, a word {waad} is the smallest element that can be uttered in isolation with objective or practical meaning. [citation needed] ...
"The term word may refer to a spoken word {sa.ka:} "speech", or to a written word {sa}, or sometimes to the abstract concept behind either. [citation needed] Spoken words {sa.ka:} are made up of units of sound called phonemes {sa.ka:aim}, and written words of symbols called graphemes {sa-aim}, such as the Letters of the English alphabet."

UKT 191113: The Wikipedia article on Phonemes is extensive because of which it is given in another file: BEPS1-2.htm
"A phoneme (/ˈfoʊniːm/) is a unit of sound that distinguishes one word from another in a particular language." -- Wikipedia

UKT 191115: see my notes below for Grapheme
"In linguistics, a grapheme is the smallest unit of a writing system of any given language. [1] An individual grapheme may or may not carry meaning by itself, and may or may not correspond to a single phoneme of the spoken language."

Speech segmentation is a subfield of general speech perception and an important subproblem of the technologically focused field of speech recognition, and cannot be adequately solved in isolation. As in most natural language processing problems, one must take into account context, grammar, and semantics, and even so the result is often a probabilistic division (statistically based on likelihood) rather than a categorical one. [UKT ¶]

UKT 191113: see my notes below for semantics

Though it seems that coarticulation — a phenomenon which may happen between adjacent words just as easily as within a single word — presents the main challenge in speech segmentation across languages, some other problems and strategies employed in solving those problems can be seen in the following sections.

UKT 191113: see my notes below for coarticulation
"Coarticulation in its general sense refers to a situation in which a conceptually isolated speech sound is influenced by, and becomes more like, a preceding or following speech sound."  -- Wikipedia
See a video by Jürgen Handke, in TIL HD-VIDEO and SD-VIDEO libraries in Phonetics section in Micro Lecture series
- LingMicroCoarticulation<Ô> / Bkp<Ô> (link chk 191126)

This problem overlaps to some extent with the problem of text segmentation that occurs in some languages which are traditionally written without inter-word spaces, like Chinese and Japanese, compared to writing systems which indicate speech segmentation between words by a word divider, such as the space. [UKT¶]

UKT 191114: The word space is what I understand as the white-space inserted between words in Eng-Lat. This white-space in written language {sa} is notably absent in Burmese (Bur-Myan), Pali (Pal-Myan), and Sanskrit (Skt-Dev). Yet in spoken language {sa.ka:} we indicate it by pauses - short or long - depending on how we want it to be noticed. I feel we sorely need white-space -- a view which my friend U (Dr.) Tun Tint of MLC shares with me. However, Sanskrit is slightly different because of its preference for horizontal conjuncts, which affects the pronunciation. Hindi written in Devanagari is worse, because of its attempts to get rid of the virama {a.þût} at the end of words.

However, even for those languages, text segmentation is often much easier than speech segmentation, because the written language usually has little interference between adjacent words, and often contains additional clues not present in speech (such as the use of Chinese characters for word stems in Japanese). Word Boundary Identification can be overcome by NLU (Natural Language Understanding) approaches such as Patom theory integrated with Role and Reference Grammar (RRG) for languages without spaces between words such as Japanese and Chinese.
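The dictionary-based text segmentation just described, for spaceless scripts like Chinese and Japanese, can be sketched as a greedy longest-match lookup. This is a minimal illustration, not any particular NLU system; the four-word lexicon is invented:

```python
def segment_longest_match(text, lexicon):
    """Greedy left-to-right segmentation: always take the longest
    dictionary word starting at the current position, else one character."""
    words = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in lexicon or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

lexicon = {"speech", "segmentation", "is", "hard"}
print(segment_longest_match("speechsegmentationishard", lexicon))
# ['speech', 'segmentation', 'is', 'hard']
```

Greedy matching fails on garden-path strings, where a longer dictionary word swallows the start of the next word; that is one reason practical segmenters use probabilistic search instead of this simple rule.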

Contents of this page

Lexical recognition

UKT 191116: Before we launch into "Lexical recognition", we should know what a Lexeme is.
See my note below for Lexeme .

In natural languages, the meaning of a complex spoken sentence can be understood by decomposing it into smaller lexical segments (roughly, the words of the language), associating a meaning to each segment, and combining those meanings according to the grammar rules of the language.

Though lexical recognition is not thought to be used by infants in their first year, due to their highly limited vocabularies, it is one of the major processes involved in speech segmentation for adults. [UKT ¶]

Three main models of lexical recognition exist in current research [UKT ¶] :

¤ first, whole-word access, which argues that words have a whole-word representation in the lexicon;

¤ second, decomposition, which argues that morphologically complex words are broken down into their morphemes (roots, stems, inflections, etc.) and then interpreted and;
UKT: see my notes below for morpheme

¤ third, the view that whole-word and decomposition models are both used, but that the whole-word model provides some computational advantages and is therefore dominant in lexical recognition. [1]

To give an example, in a whole-word model, the word "cats" might be stored and searched for by letter, first "c", then "ca", "cat", and finally "cats". The same word, in a decompositional model, would likely be stored under the root word "cat" and could be searched for after removing the "s" suffix. "Falling", similarly, would be stored as "fall" and suffixed with the "ing" inflection. [2]

UKT 191112: The above para is not relevant for Bur-Myan, because there is no singular-plural distinction and no inflexion in Bur-Myan.
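For Eng-Lat, the contrast between the whole-word and decompositional models quoted above can be sketched in a few lines; the lexicon, roots, and suffix list are invented toy data:

```python
# Whole-word model: every surface form is stored and matched directly.
WHOLE_WORD_LEXICON = {"cat", "cats", "fall", "falling"}

# Decompositional model: only roots are stored; suffixes are stripped first.
ROOTS = {"cat", "fall"}
SUFFIXES = ["s", "ing"]

def whole_word_lookup(word):
    return word in WHOLE_WORD_LEXICON

def decompose(word):
    """Return (root, suffix) if the word is a known root plus a known suffix."""
    if word in ROOTS:
        return word, ""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in ROOTS:
            return word[:-len(suffix)], suffix
    return None

print(whole_word_lookup("cats"))  # True
print(decompose("cats"))          # ('cat', 's')
print(decompose("falling"))       # ('fall', 'ing')
```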

Though proponents of the decompositional model recognize that a morpheme-by-morpheme analysis may require significantly more computation, they argue that the unpacking of morphological information is necessary for other processes (such as syntactic structure) which may occur parallel to lexical searches.

As a whole, research into systems of human lexical recognition is limited due to little experimental evidence that fully discriminates between the three main models. [1]

In any case, lexical recognition likely contributes significantly to speech segmentation through the contextual clues it provides, given that it is a heavily probabilistic system — based on the statistical likelihood of certain words or constituents occurring together. For example, one can imagine a situation where a person might say "I bought my dog at a ____ shop" and the missing word's vowel is pronounced as in "net", "sweat", or "pet". While the probability of "netshop" is extremely low, since "netshop" isn't currently a compound or phrase in English, and "sweatshop" also seems contextually improbable, "pet shop" is a good fit because it is a common phrase and is also related to the word "dog". [3]
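The probabilistic reasoning in the "pet shop" example can be mimicked with toy numbers; the counts below are invented for illustration, standing in for real corpus statistics:

```python
# Invented toy counts standing in for corpus statistics: how often each
# candidate word precedes "shop", and how related it is to "dog".
bigram_counts = {("pet", "shop"): 120, ("sweat", "shop"): 3, ("net", "shop"): 0}
relatedness_to_dog = {"pet": 5.0, "sweat": 1.0, "net": 1.0}

def score(first):
    # Combine phrase frequency with contextual relatedness.
    return bigram_counts[(first, "shop")] * relatedness_to_dog[first]

best = max(["net", "sweat", "pet"], key=score)
print(best)  # 'pet'
```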

Moreover, an utterance can have different meanings depending on how it is split into words. A popular example, often quoted in the field, is the phrase "How to wreck a nice beach", which sounds very similar to "How to recognize speech". [4] As this example shows, proper lexical segmentation depends on context and semantics which draws on the whole of human knowledge and experience, and would thus require advanced pattern recognition and artificial intelligence technologies to be implemented on a computer.

Lexical recognition is of particular value in the field of computer speech recognition, since the ability to build and search a network of semantically connected ideas would greatly increase the effectiveness of speech-recognition software. Statistical models can be used to segment and align recorded speech to words or phones. Applications include automatic lip-synch timing for cartoon animation, follow-the-bouncing-ball video sub-titling, and linguistic research. Automatic segmentation and alignment software is commercially available.

Contents of this page

Phonotactic cues

For most spoken languages, the boundaries between lexical units are difficult to identify; phonotactics are one answer to this issue. [UKT ¶]

UKT 191112: see a section below for phonotactics.
"Phonotactics defines permissible syllable structure, consonant clusters and vowel sequences by means of phonotactic constraints. "

UKT 191112: The first glaring -- but unrecognized -- difference between Alphabet-Letter languages like English (Eng-Lat) and Georgian, and Abugida-Akshara languages like Bur-Myan and Skt-Dev, is how the consonant is defined. Bur-Myan consonant is pronounceable - a syllable -- whereas Georgian consonant is mute. The hall-mark of an Abugida-Akshara language is the Virama {a.þût} which kills the inherent vowel of the Akshara.

{ta.} (Myanmar) + virama --> თ /t/ (Georgian)

One might expect that the inter-word spaces used by many written languages like English or Spanish would correspond to pauses in their spoken version, but that is true only in very slow speech, when the speaker deliberately inserts those pauses. In normal speech, one typically finds many consecutive words being said with no pauses between them, and often the final sounds of one word blend smoothly or fuse with the initial sounds of the next word.

The notion that speech is produced like writing, as a sequence of distinct vowels and consonants, may be a relic of alphabetic heritage for some language communities. In fact, the way vowels are produced depends on the surrounding consonants just as consonants are affected by surrounding vowels; this is called coarticulation. For example, in the word "kit", the [k] is farther forward than when we say 'caught'. But also, the vowel in "kick" is phonetically different from the vowel in "kit", though we normally do not hear this. [UKT ¶]

UKT 191112: see a section below for Coarticulation.
- https://en.wikipedia.org/wiki/Coarticulation 191113
"Coarticulation in its general sense refers to a situation in which a conceptually isolated speech sound is influenced by, and becomes more like, a preceding or following speech sound. There are two types of coarticulation: anticipatory coarticulation, when a feature or characteristic of a speech sound is anticipated (assumed) during the production of a preceding speech sound; and carryover or perseverative coarticulation, when the effects of a sound are seen during the production of sound(s) that follow. Many models have been developed to account for coarticulation. They include the look-ahead, articulatory syllable, time-locked, window, coproduction and articulatory phonology models. [1]"

In addition, there are language-specific changes which occur in casual speech which makes it quite different from spelling. For example, in English, the phrase "hit you" could often be more appropriately spelled "hitcha".

From a decompositional perspective, in many cases, phonotactics play a part in letting speakers know where to draw word boundaries. In English, the word "strawberry" is perceived by speakers as consisting (phonetically) of two parts: "straw" and "berry". Other interpretations such as "stra" and "wberry" are inhibited by English phonotactics, which does not allow the cluster "wb" word-initially. Other such examples are "day/dream" and "mile/stone" which are unlikely to be interpreted as "da/ydream" or "mil/estone" due to the phonotactic probability or improbability of certain clusters. The sentence "Five women left", which could be phonetically transcribed as [faɪvwɪmɘnlɛft], is marked since neither /vw/ in /faɪvwɪmɘn/ nor /nl/ in /wɪmɘnlɛft/ is allowed as a syllable onset or coda in English phonotactics. These phonotactic cues often allow speakers to easily distinguish the boundaries in words.
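The boundary-placing cue described in this paragraph can be sketched as a simple rule over clusters that English disallows inside a syllable. The cluster list below is a drastically simplified invention, and plain letters stand in for a phonetic transcription:

```python
# Letter pairs (toy list) that cannot occur inside a single English
# syllable, so a word boundary must fall between the two consonants.
ILLEGAL_MEDIAL_CLUSTERS = {"wb", "vw", "nl"}

def insert_boundaries(s):
    """Split a spaceless string at every illegal-medial cluster."""
    parts, start = [], 0
    for i in range(len(s) - 1):
        if s[i:i + 2] in ILLEGAL_MEDIAL_CLUSTERS:
            parts.append(s[start:i + 1])
            start = i + 1
    parts.append(s[start:])
    return parts

print(insert_boundaries("strawberry"))     # ['straw', 'berry']
print(insert_boundaries("faivwimenleft"))  # ['faiv', 'wimen', 'left']
```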

Vowel harmony in languages like Finnish can also serve to provide phonotactic cues. While the system does not allow front vowels and back vowels to exist together within one morpheme, compounds allow two morphemes to maintain their own vowel harmony while coexisting in a word. Therefore, in compounds such as "selkä/ongelma" ('back problem') where vowel harmony is distinct between two constituents in a compound, the boundary will be wherever the switch in harmony takes place — between the "ä" and the "o" in this case. [5] Still, there are instances where phonotactics may not aid in segmentation. Words with unclear clusters or uncontrasted vowel harmony as in "opinto/uudistus" ('student reform') do not offer phonotactic clues as to how they are segmented. [6] [full citation needed]

From the perspective of the whole-word model, however, these words are thought to be stored as full words, so the constituent parts wouldn't necessarily be relevant to lexical recognition.
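The Finnish vowel-harmony cue described above can be sketched as a scan for the first switch between front and back vowel classes (e and i, being neutral, give no cue):

```python
# Finnish harmony classes: ä ö y are front, a o u are back;
# e and i are neutral and are skipped.
FRONT = set("äöy")
BACK = set("aou")

def harmony_boundary(word):
    """Index of the first front/back switch, or None if harmony is unbroken."""
    last = None
    for i, ch in enumerate(word):
        cls = "front" if ch in FRONT else "back" if ch in BACK else None
        if cls and last and cls != last:
            return i
        last = cls or last
    return None

word = "selkäongelma"  # selkä ('back') + ongelma ('problem')
i = harmony_boundary(word)
print(word[:i], word[i:])  # selkä ongelma
```

On "opintouudistus", where all vowels are back or neutral, the function returns None, matching the article's point that such words offer no harmony cue.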

Contents of this page

Speech segmentation in infants and non-natives

Infants are one major focus of research in speech segmentation. Since infants have not yet acquired a lexicon capable of providing extensive contextual clues or probability-based word searches within their first year, as mentioned above, they must often rely primarily upon phonotactic and rhythmic cues (with prosody being the dominant cue), all of which are language-specific. Between 6 and 9 months, infants begin to lose the ability to discriminate between sounds not present in their native language and grow sensitive to the sound structure of their native language, with the word segmentation abilities appearing around 7.5 months.

Though much more research needs to be done on the exact processes that infants use to begin speech segmentation, current and past studies suggest that English-native infants approach stressed syllables as the beginning of words. At 7.5 months, infants appear to be able to segment bisyllabic words with strong-weak stress patterns, though weak-strong stress patterns are often misinterpreted, e.g. interpreting "guiTAR is" as "GUI TARis". It seems that infants also show some complexity in tracking frequency and probability of words, for instance, recognizing that although the syllables "the" and "dog" occur together frequently, "the" also commonly occurs with other syllables, which may lead to the analysis that "dog" is an individual word or concept instead of the interpretation "thedog". [7] [8]
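The statistical cue attributed to infants, tracking how predictably one syllable follows another, is usually modeled as a transitional probability P(next | current). A toy syllable stream (invented) makes the contrast visible:

```python
from collections import Counter

# Invented toy stream: the syllables "ba by" always occur together
# (one word), while "the" is followed by several different syllables.
stream = ["the", "ba", "by", "the", "dog", "the", "ba", "by"]

pairs = Counter(zip(stream, stream[1:]))
firsts = Counter(stream[:-1])  # count only syllables that have a successor

def tp(a, b):
    """Transitional probability P(b | a)."""
    return pairs[(a, b)] / firsts[a]

print(tp("ba", "by"))   # 1.0: high cohesion, likely word-internal
print(tp("the", "ba"))  # ~0.67: lower, suggesting a boundary after "the"
```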

Language learners are another set of individuals being researched within speech segmentation. In some ways, learning to segment speech may be more difficult for a second-language (L2) learner than for an infant, not only in the lack of familiarity with sound probabilities and restrictions but particularly in the overapplication of the native language's patterns. While some patterns may occur between languages, as in the syllabic segmentation of French and English, they may not work well with languages such as Japanese, which has a mora-based segmentation system. Further, phonotactic restrictions like the boundary-marking cluster /ld/ in German or Dutch are permitted (without necessarily marking boundaries) in English. Even the relationship between stress and vowel length, which may seem intuitive to speakers of English, may not exist in other languages, so second-language learners face an especially great challenge when learning a language and its segmentation cues. [9]

UKT: End of Wikipedia article

Contents of this page

UKT notes

Wikipedia articles added to this file to help me understand the main topic of Segmentation
- https://en.wikipedia.org/wiki/Speech_segmentation 111112

Coarticulation

UKT 191113: To enhance our knowledge through seeing and hearing, let's see a video on Coarticulation by Dr. Jürgen Handke, in TIL HD-VIDEO and SD-VIDEO libraries (Phonetics section, Micro Lecture series):
- LingMicroCoArticulat<Ô> / Bkp<Ô> (link chk 191113)

From Wikipedia:
- https://en.wikipedia.org/wiki/Coarticulation 191113

Coarticulation in its general sense refers to a situation in which a conceptually isolated speech sound is influenced by, and becomes more like, a preceding or following speech sound. There are two types of coarticulation: anticipatory coarticulation, when a feature or characteristic of a speech sound is anticipated (assumed) during the production of a preceding speech sound; and carryover or perseverative coarticulation, when the effects of a sound are seen during the production of sound(s) that follow. Many models have been developed to account for coarticulation. They include the look-ahead, articulatory syllable, time-locked, window, coproduction and articulatory phonology models.[1]

Coarticulation in phonetics refers to two different phenomena:

• the assimilation of the place of articulation of one speech sound to that of an adjacent speech sound. For example, while the sound /n/ of English normally has an alveolar place of articulation, in the word tenth it is pronounced with a dental place of articulation because the following sound, /θ/, is dental.

• the production of a co-articulated consonant, that is, a consonant with two simultaneous places of articulation. An example of such a sound is the voiceless labial-velar plosive /k͡p/ found in many West African languages.

The term coarticulation may also refer to the transition from one articulatory gesture to another.

UKT: End of Wikipedia article.

Go back coarticulation-note-b

Contents of this page

Grapheme

UKT 191115: In linguistics there is a class of words ending in suffix <-eme>. I need to know each as they crop up in my study.

From Wikipedia: https://en.wikipedia.org/wiki/Grapheme 191115

In linguistics, a grapheme is the smallest unit of a writing system of any given language. [1] An individual grapheme may or may not carry meaning by itself, and may or may not correspond to a single phoneme of the spoken language.

Graphemes include alphabetic letters, typographic ligatures, Chinese characters, numerical digits, punctuation marks, and other individual symbols. A grapheme can also be construed as a graphical sign that independently represents a portion of linguistic material. [2]

The word grapheme, coined in analogy with phoneme, is derived from Ancient Greek γράφω (gráphō), meaning 'write', and the suffix -eme by analogy with phoneme and other names of emic units. The study of graphemes is called graphemics.

The concept of graphemes is abstract and similar to the notion in computing of a character. By comparison, a specific shape that represents any particular grapheme in a specific typeface is called a glyph. For example, the grapheme corresponding to the abstract concept of "the Arabic numeral one" has two distinct glyphs (allographs) in Times New Roman and Helvetica fonts.

UKT: More in the Wikipedia article

Go back grapheme-note-b

Contents of this page

Lexeme

 

From Wikipedia: https://en.wikipedia.org/wiki/Lexeme 191116

A lexeme (/ˈlɛksiːm/) is a unit of lexical meaning that underlies a set of words that are related through inflection. It is a basic abstract unit of meaning, [1] a unit of morphological analysis in linguistics that roughly corresponds to a set of forms taken by a single root word. For example, in English, run, runs, ran and running are forms of the same lexeme, which can be represented as RUN. [note 1]

One form, the lemma (or citation form), is chosen by convention as the canonical form of a lexeme. The lemma is the form used in dictionaries as an entry's headword. Other forms of a lexeme are often listed later in the entry if they are uncommon or irregularly-inflected.

Description: The notion of the lexeme is central to morphology, [2] the basis for defining other concepts in that field. For example, the difference between inflection and derivation can be stated in terms of lexemes:

• Inflectional rules relate a lexeme to its forms.
• Derivational rules relate a lexeme to another lexeme.

A lexeme belongs to a particular syntactic category, has a certain meaning (semantic value) and, in inflecting languages, has a corresponding inflectional paradigm. That is, a lexeme in many languages will have many different forms. For example, the lexeme RUN has a present third person singular form runs, a present non-third-person singular form run (which also functions as the past participle and non-finite form), a past form ran, and a present participle running. (It does not include runner, runners, runnable etc.) [UKT ¶]

The use of the forms of a lexeme is governed by rules of grammar. In the case of English verbs such as RUN, they include subject-verb agreement and compound tense rules, which determine the form of a verb that can be used in a given sentence.
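The notion of a lexeme as a lemma plus its inflectional paradigm can be sketched as a small table; the entry below uses the RUN forms from the article (the paradigm labels are my own shorthand):

```python
# A lexeme as a lemma plus its inflectional paradigm.
LEXEMES = {
    "RUN": {"lemma": "run", "3sg": "runs", "past": "ran", "participle": "running"},
}

def lemmatize(form):
    """Map any stored inflected form back to its lexeme's lemma."""
    for paradigm in LEXEMES.values():
        if form in paradigm.values():
            return paradigm["lemma"]
    return None

print(lemmatize("ran"))      # 'run'
print(lemmatize("running"))  # 'run'
print(lemmatize("runner"))   # None: not part of the RUN paradigm
```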

In many formal theories of language, lexemes have subcategorization frames to account for the number and types of complements. They occur within sentences and other syntactic structures.

Decomposition:  A language's lexemes are often composed of smaller units with individual meaning called morphemes, according to root morpheme + derivational morphemes + suffix (not necessarily in that order), where:

• The root morpheme is the primary lexical unit of a word, which carries the most significant aspects of semantic content and cannot be reduced to smaller constituents. [3]

• The derivational morphemes carry only derivational information. [4]

• The suffix is composed of all inflectional morphemes, and carries only inflectional information. [5]

The compound root morpheme + derivational morphemes is often called the stem. [6] The decomposition stem + desinence can then be used to study inflection.

UKT: End of Wikipedia article.
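As a rough sketch of the root + derivational morphemes + inflectional suffix decomposition described above, with invented toy affix lists (real English morphology, with its spelling changes, is messier than this):

```python
# Toy data for illustration only; e.g. nationalize + ed -> nationalized
# drops an "e", which this sketch does not handle.
ROOTS = {"nation", "run"}
DERIVATIONAL = ["al", "ize"]
INFLECTIONAL = ["s", "ed", "ing"]

def analyze(word):
    """Strip one inflectional suffix, then derivational ones, down to a root."""
    inflection = ""
    for suf in INFLECTIONAL:
        if word.endswith(suf) and len(word) > len(suf):
            word, inflection = word[:-len(suf)], suf
            break
    derivational = []
    while word not in ROOTS:
        for d in DERIVATIONAL:
            if word.endswith(d):
                derivational.insert(0, d)
                word = word[:-len(d)]
                break
        else:
            return None  # cannot reduce to a known root
    return {"root": word, "derivational": derivational, "inflection": inflection}

print(analyze("nationalizes"))
# {'root': 'nation', 'derivational': ['al', 'ize'], 'inflection': 's'}
```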

Go back Lexeme-note-b

Contents of this page

Morpheme

UKT 191116. Morpheme is one of the terms I didn't know, but which I deem less important than Lexeme for BEPS.

From Wikipedia: https://en.wikipedia.org/wiki/Morpheme 191116

A morpheme is the smallest meaningful unit in a language. A morpheme is not identical to a word. The main difference between them is that a morpheme sometimes does not stand alone, but a word, by definition, always stands alone.  The linguistics field of study dedicated to morphemes is called morphology. [UKT ¶]

When a morpheme stands by itself, it is considered as a root because it has a meaning of its own (such as the morpheme cat). [UKT ¶]

When it depends on another morpheme to express an idea, it is an affix because it has a grammatical function (such as the –s in cats to indicate that it is plural). [1] Every word comprises one or more morphemes.

UKT: More in Wikipedia article.

Go back morpheme-note-b

Contents of this page

Phonotactics

UKT 191112: One thing leads to another, and we are now at Phonotactics - which concerns Eng-Lat, one of the constituents of BEPS. Above, I've quoted Wikipedia: "Phonotactics defines permissible syllable structure, consonant clusters and vowel sequences by means of phonotactic constraints."

UKT 191112: The first glaring -- but unrecognized -- difference between Alphabet-Letter languages like English (Eng-Lat) & Georgian, and Abugida-Akshara languages like Bur-Myan & Skt-Dev, is how the consonant is defined. Bur-Myan consonant is pronounceable - a syllable -- whereas Georgian consonant is mute. The hall-mark of an Abugida-Akshara language is the Virama {a.þût} which kills the inherent vowel of the Akshara.

{ta.} (Myanmar) + virama --> თ /t/ (Georgian)

From Wikipedia: https://en.wikipedia.org/wiki/Phonotactics 191112

Phonotactics (from Ancient Greek phōnḗ "voice, sound" and tacticós "having to do with arranging") [1] is a branch of phonology that deals with restrictions in a language on the permissible combinations of phonemes. Phonotactics defines permissible syllable structure, consonant clusters and vowel sequences by means of phonotactic constraints.

Phonotactic constraints are highly language specific. For example, in Japanese, consonant clusters like /st/ do not occur. Similarly, the clusters /kn/ and /ɡn/ are not permitted at the beginning of a word in Modern English but are in German and Dutch (in which the latter appears as /ɣn/) and were permitted in Old and Middle English. In contrast, in some Slavic languages /l/ and /r/ are used alongside vowels as syllable nuclei.

Syllables have the following internal segmental structure:

Onset (optional)
Rhyme (obligatory, comprises nucleus and coda):
  ¤ Nucleus (obligatory)
  ¤ Coda (optional):
UKT: The Coda is obligatory for Abugida-Akshara languages like Bur-Myan and Skt-Dev, where Virama {a.þût} is obligatory to kill the inherent vowel of the Consonant-Akshara. Pal-Myan uses vertical conjuncts to hide the Virama {a.þût}, which misleads the uninitiated into saying that Pal-Myan does not have the Virama {a.þût}. Also note that I use the word vowel in two senses: the inherent vowel of the Akshara, and the nuclear vowel of the syllable. Moreover, note also that syllables have the canonical structure CVÇ, and because of that the Onset-consonant is different from the Coda-consonant.

Both onset and coda may be empty, forming a vowel-only syllable, or alternatively, the nucleus can be occupied by a syllabic consonant. Phonotactics is known to affect second language vocabulary acquisition. [2]

UKT: More in Wikipedia article.
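The onset/nucleus/coda split described above can be sketched over ordinary spelling as a rough approximation, with vowel letters standing in for the nuclear vowel (a real parser would work over phonemes, not letters):

```python
VOWEL_LETTERS = set("aeiou")

def parse_syllable(syl):
    """Split a written syllable into onset, nucleus, and coda
    (orthographic approximation: vowel letters mark the nucleus)."""
    i = 0
    while i < len(syl) and syl[i] not in VOWEL_LETTERS:
        i += 1          # consonant letters before the first vowel: onset
    j = i
    while j < len(syl) and syl[j] in VOWEL_LETTERS:
        j += 1          # the vowel run: nucleus
    return {"onset": syl[:i], "nucleus": syl[i:j], "coda": syl[j:]}

print(parse_syllable("strengths"))  # {'onset': 'str', 'nucleus': 'e', 'coda': 'ngths'}
print(parse_syllable("a"))          # {'onset': '', 'nucleus': 'a', 'coda': ''}
```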

Go back phonotactics-note-b

Contents of this page

Semantics

 

From Wikipedia: https://en.wikipedia.org/wiki/Semantics

Semantics (from Ancient Greek: σημαντικός sēmantikós, "significant") [1] [a] is the linguistic and philosophical study of meaning in language, programming languages, formal logics, and semiotics. It is concerned with the relationship between signifiers — like words, phrases, signs, and symbols — and what they stand for in reality, their denotation.

In International scientific vocabulary semantics is also called semasiology. The word semantics was first used by Michel Bréal, a French philologist. [2] It denotes a range of ideas — from the popular to the highly technical. It is often used in ordinary language for denoting a problem of understanding that comes down to word selection or connotation. This problem of understanding has been the subject of many formal enquiries, over a long period of time, especially in the field of formal semantics. [UKT ¶]

In linguistics, it is the study of the interpretation of signs or symbols used by agents or communities within particular circumstances and contexts. [3] Within this view, sounds, facial expressions, body language, and proxemics have semantic (meaningful) content, and each comprises several branches of study. In written language, things like paragraph structure and punctuation bear semantic content; other forms of language bear other semantic content. [3]

The formal study of semantics intersects with many other fields of inquiry, including lexicology, syntax, pragmatics, etymology and others. Independently, semantics is also a well-defined field in its own right, often with synthetic properties. [4] In the philosophy of language, semantics and reference are closely connected. Further related fields include philology, communication, and semiotics. The formal study of semantics can therefore be manifold and complex.

Semantics contrasts with syntax {wa-kya.sæÑ:}, the study of the combinatorics of units of a language (without reference to their meaning), and pragmatics, the study of the relationships between the symbols of a language, their meaning, and the users of the language. [5] [UKT ¶]

Semantics as a field of study also has significant ties to various representational theories of meaning including truth theories of meaning, coherence theories of meaning, and correspondence theories of meaning. Each of these is related to the general philosophical study of reality and the representation of meaning. In the 1960s, psychosemantic studies became popular after Osgood's massive cross-cultural studies using his semantic differential (SD) method, which used thousands of nouns and bipolar adjective scales. A specific form of the SD, the Projective Semantics method [6], uses only the most common and neutral nouns that correspond to the 7 groups (factors) of adjective scales most consistently found in cross-cultural studies (Evaluation, Potency, Activity as found by Osgood, and Reality, Organization, Complexity, Limitation as found in other studies). In this method, seven groups of bipolar adjective scales correspond to seven types of nouns, so the method was thought to have object-scale symmetry (OSS) between the scales and the nouns evaluated on these scales. For example, the nouns corresponding to the listed 7 factors would be: Beauty, Power, Motion, Life, Work, Chaos, Law. Beauty was expected to be assessed unequivocally as "very good" on adjectives of Evaluation-related scales, Life as "very real" on Reality-related scales, etc. However, deviations in this symmetric and very basic matrix might show underlying biases of two types: scales-related bias and objects-related bias. This OSS design was meant to increase the sensitivity of the SD method to any semantic biases in responses of people within the same culture and educational background. [7] [8]
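The semantic-differential scoring described above can be sketched very simply: a concept is rated on bipolar adjective scales (commonly -3 to +3), and ratings are averaged within each factor. The scale names under each factor and the sample ratings below are invented for illustration; they are not Osgood's actual instrument.

```python
# Hedged sketch of scoring one concept on Osgood-style bipolar
# adjective scales, grouped into the Evaluation / Potency / Activity
# factors mentioned above. Scales and ratings are illustrative only.

FACTORS = {
    "Evaluation": ["good-bad", "pleasant-unpleasant"],
    "Potency":    ["strong-weak", "heavy-light"],
    "Activity":   ["active-passive", "fast-slow"],
}

def factor_scores(ratings):
    """Average the -3..+3 ratings within each factor."""
    return {
        factor: sum(ratings[s] for s in scales) / len(scales)
        for factor, scales in FACTORS.items()
    }

# A respondent's (invented) ratings for the concept "Beauty":
ratings_for_beauty = {
    "good-bad": 3, "pleasant-unpleasant": 2,
    "strong-weak": 1, "heavy-light": -1,
    "active-passive": 0, "fast-slow": 1,
}
print(factor_scores(ratings_for_beauty))
# {'Evaluation': 2.5, 'Potency': 0.0, 'Activity': 0.5}
```

A high Evaluation score for "Beauty", as in this toy run, is the kind of expected symmetry the OSS design checks; systematic deviations from it would indicate the scale-related or object-related biases mentioned above.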

UKT: More in Wikipedia article.

Go back semantics-note-b

Contents of this page

Word

- UKT 191116: MLC Myanmar-English Dictionary, 2006-274c2,
¤ - n. ¹. word. ². punctuation mark.
----- - part. numerical classifier for counting pieces of writing such as articles, verses, songs, etc. [Pali. {pa.da.}]
Because I need a one-to-one definition, I have to define:
¤ Word - ;  Punctuation mark - ; a numerical classifier, such as {naam poad}, is not really needed

From Wikipedia: https://en.wikipedia.org/wiki/Word 191116

In linguistics, a word is the smallest element that can be uttered in isolation with objective or practical meaning. [citation needed]

This contrasts deeply with a morpheme, which is the smallest unit of meaning but will not necessarily stand on its own. A word may consist of a single morpheme (for example: oh!, rock, red, quick, run, expect), or several (rocks, redness, quickly, running, unexpected), whereas a morpheme may not be able to stand on its own as a word (in the words just mentioned, these are -s, -ness, -ly, -ing, un-, -ed). [UKT ¶]

A complex word will typically include a root and one or more affixes (rock-s, red-ness, quick-ly, run-ning, un-expect-ed), or more than one root in a compound (black-board, sand-box). [UKT ¶]

Words can be put together to build larger elements of language, such as phrases (a red rock, put up with), clauses (I threw a rock), and sentences (He threw a rock too, but he missed).

The term word may refer to a spoken word {sa.ka:} or to a written word {sa}, or sometimes to the abstract concept behind either. [citation needed] Spoken words are made up of units of sound called phonemes, and written words of symbols called graphemes, such as the letters of the English alphabet.

UKT: More in the Wikipedia article.

Go back word-note-b

Contents of this page

End of TIL file