Update: 2016-01-31 05:34 PM -0500

Devanagari script

deva.htm

by U Kyaw Tun (UKT) M.S. (I.P.S.T., U.S.A.) Based on Unicode Consortium, Not for sale. No copyright. Free for everyone. Prepared for students and staff of TIL Research Station, Yangon, MYANMAR : http://www.tuninst.net , www.romabama.blogspot.com


Contents of this page

UKT 141607: I must acknowledge that this article, which I wrote some 10 years ago, is based on the Unicode Consortium's chapter at http://www.unicode.org/versions/Unicode4.0.0/ch09.pdf , my first source in my systematic study of the Akshara system of writing. I could not get anything from MLC sources (including many hours of discussion with my friend U Tun Tint of MLC), which are still stuck in the views they inherited from the Western philologists of the British colonial era.

A drawback of these pdf pages is that the regular page numbers are given only at the bottom of the page, which makes checking against the pdf pages difficult. Because of this drawback, I am giving the number of the pdf pages only.

Devanagari
Unicode encoding principles

Principles of script
Rendering Devanagari letters

Consonants : {byæÑ: ak~hka.ra}
Basic consonants and consonant clusters

Vowels : {þa.ra. ak-hka.ra}
Vowel letter - independent
Vowel sign - dependent on a consonant

Virama - {a.þût}
Consonant conjuncts & medials
Explicit virama
Explicit half-consonants : conjuncts

Rendering
Combining marks
Digits
Punctuation and symbols

UKT notes
• Abugida
• Asoka and Kublai Khan
• Character type : Skt-Dev and Bur-Myan compared
• Inherent vowel:
- applicable to a basic consonant such as {ka.}, but not to canonical CVÇ syllable
• Writing systems :
- Abugida and Alphabet are different Abugida may be called "Akshara writing system"


9.1 Devanagari
page: 219

UKT sometime in 2004, 140607:

Bur-Myan has many characters corresponding to Skt-Dev. At least in the following characters we find the similarity: virama ् ( {a.þût}), visarga ः ( {wic~sa.}), danda । ( {poad-hprût}), and double danda ॥ ( {poad-ma.}). The dot-above {þé:þé:ting}, the nasalization sign, and the dot-below {auc-mric}, the creak sign, are there, but the Skt-Dev names are confusing.

Since Pali, the speech, is written in scripts such as Devanagari, Latin and Myanmar, a comparison of the scripts can be made by the study of Pali. Note: the reader must make a distinction in the following words:
   ¤ Bur-Myan (Bama speech written in Myanmar script) : differentiate between speech and script
   ¤ Rom-Lat (speech of Rome written in Latin script) : differentiate between a place and script
   ¤ Myanmar is the word applied to the citizens of Myanmarpré who are ethnically and speech-wise different: differentiate between ethnicity and geo-political citizenship of a country. Myanmars include Bama, Shan, Karen, Mon etc. who are different "racially" (the politically correct word at present is "ethnically"). These ethnic groups speak different speeches (spoken languages) belonging to different linguistic groups such as Tib-Bur (Tibeto-Burman), Austro-Asiatic, etc. They all use the circularly-rounded script which is rightfully called Myanmar-script or abbreviated to Myanmar. Myanmar-script is based on circles, but others, such as the southern-Indic scripts also called "rounded", are not based on circles.
   ¤ Bama speech has many dialects, such as Dawei, Danu, Intha, Rakhine, Yaw
   ¤ Karen speech has at least two dialects:
   ¤ Mon speech has three dialects, Puthain (now extinct), Dala (also called Peguan - now extinct), and Moattama (the only current dialect, spoken in Myanmarpré and Thailand).

The Aksharas (Abugidas) are phonetic scripts derived from Asokan, the script used by Emperor Asoka, which has been studied in India for thousands of years, whereas IPA (International Phonetic Alphabet) has come into existence for only 2 to 3 centuries. Aksharas are arranged in matrices to give the POA (Place of Articulation) and the Manner of Articulation, and there is an (almost) one-to-one correspondence between script and speech. The Alphabets, on the other hand, are non-phonetic, and you get into trouble speaking according to the way the script is written.

(p.219, pdf 4/47)
The Devanagari script is used for writing classical Sanskrit and its modern historical derivative, Hindi. Extensions to the Sanskrit repertoire are used to write other related languages of India (such as Marathi) and of Nepal (Nepali). [UKT ¶]

In addition, the Devanagari script is used to write the following speeches ~~languages~~:

Awadhi,
Bagheli, Bhatneri, Bhili, Bihari, Braj Bhasha,
Chhattisgarhi,
Garhwali, Gondi (Betul, Chhindwara, and Mandla dialects),
Harauti, Ho,
Jaipuri,
Kachchhi, Kanauji, Konkani, Kului, Kumaoni, Kurku, Kurukh,
Marwari, Mundari,
Newari [- spoken by the blood relatives of Gautama Buddha],
Palpa, and
Santali.

UKT 140608: It is unfortunate that Unicode does not take the language-families into consideration. A different language-family means a different phonology, and though two speeches may be using the same script, Devanagari, the pronunciation of the vowels is radically different. However, because the same script is used, we can still get an idea of the meaning from the script. I am using this idea to compare Bur-Myan & Mon-Myan.

All other Indic scripts, as well as the Sinhala [Austro-Asiatic] script of Sri Lanka, the Tibetan [Tib-Bur] script, and the Southeast Asian scripts, are historically connected with the Devanagari script as descendants of the ancient Asokan ~~Brahmi~~ script. The entire family of scripts shares a large number of structural features.

The principles of the Indic scripts are covered in some detail in this introduction to the Devanagari script. The remaining introductions to the Indic scripts are abbreviated but highlight any differences from Devanagari where appropriate.


Unicode Standards

(pdf 4/47)
The Devanagari block of the Unicode Standard is based on ISCII-1988 (Indian Script Code for Information Interchange). The ISCII standard of 1988 differs from and is an update of earlier ISCII standards issued in 1983 and 1986.

The Unicode Standard encodes Devanagari characters in the same relative positions as those coded in positions A0-F4 (hexadecimal) in the ISCII-1988 standard. The same character code layout is followed for eight other Indic scripts in the Unicode Standard: Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, and Malayalam. This parallel code layout emphasizes the structural similarities of the Brahmi scripts and follows the stated intention of the Indian coding standards to enable one-to-one mappings between analogous coding positions in different scripts in the family. [UKT¶]
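The parallel layout can be checked with a short Python sketch. The block start values are taken from the Unicode code charts; the helper name `analogous_char` is only an illustration, not part of any standard or library.

```python
import unicodedata

# First code point of each of the nine ISCII-derived blocks (Unicode code charts).
BLOCK_STARTS = {
    "DEVANAGARI": 0x0900,
    "BENGALI": 0x0980,
    "GURMUKHI": 0x0A00,
    "GUJARATI": 0x0A80,
    "ORIYA": 0x0B00,
    "TAMIL": 0x0B80,
    "TELUGU": 0x0C00,
    "KANNADA": 0x0C80,
    "MALAYALAM": 0x0D00,
}


def analogous_char(ch, target_script):
    """Return the character at the same relative position in another block.

    Hypothetical helper: it only shifts the code point and does not check
    whether the target position is actually assigned.
    """
    offset = ord(ch) - BLOCK_STARTS["DEVANAGARI"]
    if not 0 <= offset < 0x80:
        raise ValueError("not a Devanagari-block character")
    return chr(BLOCK_STARTS[target_script] + offset)


# DEVANAGARI LETTER KA (U+0915) and its counterparts in the other blocks.
for script in BLOCK_STARTS:
    ch = analogous_char("\u0915", script)
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unassigned>"))
```

Not every position is filled in every script (Tamil, for instance, has no letter KHA), so a real converter would have to check that the target code point is assigned before using it.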

Sinhala, Tibetan, Thai, Lao, Khmer, Myanmar, and other scripts depart to a greater extent from the Devanagari structural pattern, so the Unicode Standard does not attempt to provide any direct mappings for these scripts to the Devanagari order.

UKT 140608: A glaring mistake of Unicode is to group Bur-Myan with Thai. Bur-Myan, the speech of the majority of Myanmar citizens, should be grouped with northern-Indic - with Devanagari. Among the many speeches in the world, Bur-Myan and a few others, such as the Shanghai dialect of Chinese, are pitch-register languages, and are not tonal. Thai, on the other hand, is tonal.

In November 1991, at the time The Unicode Standard, Version 1.0, was published, the Bureau of Indian Standards published a new version of ISCII in Indian Standard (IS)13194:1991. This new version partially modified the layout and repertoire of the ISCII-1988 standard. Because of these events, the Unicode Standard does not precisely follow the layout of the current version of ISCII. Nevertheless, the Unicode Standard remains a superset of the ISCII-1991 repertoire except for a number of new Vedic extension characters defined in IS 13194:1991 Annex G-Extended Character Set for Vedic. Modern, non-Vedic texts encoded with ISCII-1991 may be automatically converted to Unicode code points and back to their original encoding without loss of information.


Unicode encoding Principles

(pdf 4/47)
The writing systems that employ Devanagari and other Indic scripts constitute abugidas -- a cross between syllabic writing systems and alphabetic writing systems. [UKT ¶]

The effective unit of these writing systems is the orthographic syllable, consisting of a consonant and vowel (CV) core and, optionally, one or more preceding consonants, with a canonical structure of (((C)C)C)V . [UKT ¶]
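As a rough illustration of the (((C)C)C)V structure, the sketch below splits a Devanagari string into orthographic syllables with a regular expression. It is a simplification under stated assumptions: only the core consonants (U+0915 to U+0939), the vowel signs (U+093E to U+094C), the independent vowels (U+0905 to U+0914) and the virama are recognised, and signs such as anusvara, visarga and nukta are ignored. The function name is illustrative.

```python
import re

CONSONANT = "[\u0915-\u0939]"     # KA .. HA
VIRAMA = "\u094D"
VOWEL_SIGN = "[\u093E-\u094C]"    # dependent vowel signs AA .. AU
INDEP_VOWEL = "[\u0905-\u0914]"   # independent vowels A .. AU

# (((C)C)C)V: up to three dead consonants (consonant + virama), then one live
# consonant whose vowel is either an explicit sign or the unwritten inherent
# vowel. An independent vowel is a syllable whose consonant is null.
SYLLABLE = re.compile(
    f"(?:{CONSONANT}{VIRAMA}){{0,3}}{CONSONANT}{VOWEL_SIGN}?|{INDEP_VOWEL}"
)


def orthographic_syllables(text):
    """Return the orthographic syllables found in text (unmatched marks are skipped)."""
    return SYLLABLE.findall(text)


# TA + KA + VIRAMA + KA: the syllable TA followed by the cluster K.KA
print(orthographic_syllables("\u0924\u0915\u094D\u0915"))
# KA VIRAMA SSA, TA VIRAMA RA + sign I, YA
print(orthographic_syllables("\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"))
```

Note that the split follows the orthographic unit described by Unicode, in which the dead consonant is grouped with the consonant that follows it.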

UKT 140608: I have been paraphrasing the above as:

CVÇ - where C is the onset-consonant, V the peak or nuclear-vowel, and Ç the coda (killed) consonant.

The onset-consonant may be made up of :
C = 0, absence of onset-consonant
C = 1, only one basic consonant
C = 2, two basic consonants joined into a monosyllabic medial
up to C = 4. The basic requirement is that they must be monosyllabic.

V = 1 - all syllables must have a nuclear-vowel : it can be either short-duration (one eye-blink), or long (2 eye-blinks). There are no split vowels in Devanagari, but there are in Bengali and Myanmar. A split vowel reminds one of English vowels with "magic-E" such as <not> & <note>, where t is between o & e . IPA does not allow split-vowels, and Romabama follows the examples of IPA and Devanagari.

Ç = 0, 1 - since the coda-consonant has to be killed, only one basic consonant is allowed. No medials can be present. To meet this requirement in Skt-Myan, Romabama has to invent from {Sa.ha.hto:}: . And accept {sha.} as a basic consonant. Since {Sa.} - the dental fricative is already an invention to take care of Eng-Lat, and not - the palatal plosive-stop, U Tun Tint of MLC said it could be accepted by MLC. Note, I am using the same glyph for both, the only differentiation is in Romabama. I have to use different glyphs for the coda: {c} & {S}.

I have quoted this para from Unicode Consortium ever since my first writing in 2004.

(pdf 4/47)
The orthographic syllable need not correspond exactly with a phonological syllable, especially when a consonant cluster is involved, but the writing system is built on phonological principles and tends to correspond quite closely to pronunciation.

UKT 140610: The term "consonant cluster" needs to be explained. Clusters are mostly made up of just two consonants. There are two types, the vertical and the horizontal. In the vertical conjunct, one akshara is written above the other. Vertical conjuncts are common in Pal-Myan, but are also present in Bur-Myan. MLC MED2006-272 defines them as {paaHT hsing.}. An example of a vertical conjunct is:

Bur-Myan: {k~ka.}, and its equivalent the Skt-Dev: क्क «kka»
They are generally unpronounceable or mute.

The horizontal conjunct should be termed {paaHT twè:}. But the MLC is silent on that. In this type, to show that the two aksharas, lying side by side, are in combination, the way they are written is changed. They are unpronounceable. But most of us (including myself at one time, and a learned monk whose name I will not reveal but whom I know very well) think them to be pronounceable, and to be just variations of the basic consonants. My learned friend the monk even said "Tha'gyi {þ~þa.} has a much heavier pronunciation than ordinary tha'lé {þa.}":

Bur-Myan: {Ta.} + {HTa.} = {T~HTa.}
Bur-Myan: {þa.} + {þa.} = {þ~þa.} - commonly known as Tha'gyi or Big Tha .
Pal-Myan: {ña.} + {ña.} = {Ña.} - commonly known as Nya'gyi or Big Nya .

Caveat: There is a viram (virama) aka {a.þût} 'vowel killer' involved in the formation of the above. Because I am not showing the viram, I have used the equal sign instead of the arrow. We are now ahead of our discussion, but we will come back to it later. For the time being remember that {Ña.} is a basic consonant in Bur-Myan and can be killed. I have analyzed it to be the palatal fricative, with its place by the side of {ya.}, which was thought to be palatal but is actually the velar fricative.

The Pal-Myan {ñ~ña.} on killing breaks up into its components. That {Ña.} is a basic consonant in Bur-Myan is not generally known, nor that {Ñ} is a legitimate coda in Bur-Myan which must be taken into consideration in pronunciation and transcription. An example of such a word is {præÑ} 'country' which is pronounced as /pri/. Another is the name of the famous Pali grammarian monk praised by the Gautama Buddha himself, {shin kic~sæÑ:} - and NOT "Kaccayano". See A Pali grammar on the basis of Kaccayano, by Rev. F. Mason, 1868 - PEG-indx.htm . Because of his name, I claim that {shin kic~sæÑ:} was a native of Myanmarpré, probably from Tagaung, who had gone overland to become a disciple of the Buddha himself. Now, I will be branded a paralogist by everyone, or worse a Myanmar bigot !

A pronounceable conjunct is known as medial - implying that it has a pronunciation between the two members.

(pdf 5/47)
The orthographic syllable is built up of akshara ~~alphabetic~~ pieces, the actual characters ~~letters~~ of the Devanagari script. These pieces consist of three distinct character types: consonant letters, independent vowels, and dependent vowel-signs. In a text sequence, these characters are stored in logical (phonetic) order.
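The three character types, and the logical order in which they are stored, can be seen by listing the code points of a word. This is a minimal sketch: the code-point ranges cover only the core Devanagari set, and `character_type` is an illustrative name.

```python
import unicodedata


def character_type(ch):
    """Classify a character of the core Devanagari set (assumed ranges)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "consonant letter"
    if 0x0905 <= cp <= 0x0914:
        return "independent vowel"
    if 0x093E <= cp <= 0x094C:
        return "dependent vowel sign"
    if cp == 0x094D:
        return "virama"
    return "other"


# The word "Hindi": HA, sign I, NA, VIRAMA, DA, sign II.
word = "\u0939\u093F\u0928\u094D\u0926\u0940"
for ch in word:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch):28}  {character_type(ch)}")
```

In the stored text VOWEL SIGN I comes after HA, the consonant it is pronounced after, even though a rendering engine draws it to the left of that consonant.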


Principles of the Script

Rendering Devanagari Characters

(pdf 5/47)
Devanagari characters, like characters from many other scripts, can combine or change shape depending on their context. A character's appearance is affected by its ordering with respect to other characters, the font used to render the character, and the application or system environment. These variables can cause the appearance of Devanagari characters to differ from their nominal glyphs (used in the code charts).

Additionally, a few Devanagari characters cause a change in the order of the displayed characters. This reordering is not commonly seen in non-Indic scripts and occurs independently of any bidirectional character reordering that might be required.


Consonants

UKT 140609: Refer to BEPS Consonants - BHS-indx.htm / BEPS Consonants

Consonant Letters

UKT 140609:

A consonant glyph, aka Consonant-Akshara {byæÑ:ak~hka.ra}, represents a sound-syllable. The first consonant we have to learn to articulate is the velar tenuis {ka.} क «ka». It can be distinctly heard. It is a sound-syllable because of its inherent vowel which is likened to the English short a . Unfortunately, the Europeans can neither articulate nor hear this sound. They could only manage its "allophone" {hka.} ख «kha». To us, {hka.} ख «kha» is not an allophone. It is a distinct akshara by itself.

Then comes the second problem when the Western philologists learn to distinguish it from {ka.}. They heard an "aspiration" and call it kh which the Indians faithfully follow. To us, the sound is not "aspiration" but comes from deep within the throat, and so Romabama has changed the position of h to front as hk . Because of this Romabama represents all the column-2 consonants differently from the Indians. The pix on the right is from my first source in Phonetics, Online Phonetics Course offered by Department of Linguistics, University of Lausanne, Switzerland in China, http://www.unil.ch/ling/english/index.html which my wife Daw Than Than and I had studied in the early 2000s. The internet link is no longer working. See what we have studied, in old TIL format which is to be updated sometime later - UNIL-indx.htm .

Each consonant letter, {byæÑ:ak~hka.ra} usually shortened to {byæÑ:}, represents a single consonantal sound but also has the peculiarity of having an inherent vowel, generally the short vowel /a/ in Devanagari and the other Indic scripts. Thus U+0915 क DEVANAGARI LETTER KA represents not just /k/ but also /ka/. In the presence of a dependent vowel [vow-sign (vowel-sign), such as /i/], however, the inherent vowel associated with a consonant letter is overridden by the dependent vowel.

UKT 140609:
Bur-Myan: {ka.} /ka/ + /i./ --> {ki.} /ki./
Skt-Dev: क «ka» /ka/ + ि /i/ --> कि «ki» /ki/

The above are short vowels of one eye-blink duration. We have a similar case with long vowels of 2 eye-blinks. Unfortunately Eng-Lat does not differentiate visibly short vowels from long vowels and the IPA (designed for Western Alphabet) has to resort to suprasegmentals to represent our sounds. See suprasegmentals in https://www.langsci.ucl.ac.uk/ipa/supras.html 140609

(pdf 5/47)
Consonant letters may also be rendered as half-forms , which are presentation forms used to depict the onset ~~initial~~ consonant in consonant clusters. These half-forms [which look like "syllables"] do not have a nuclear ~~inherent~~ vowel. Their rendered forms in Devanagari often resemble the full syllable ~~consonant~~ but are missing the vertical stem, which marks a syllabic core. (The stem glyph is graphically and historically related to the sign denoting the ~~inherent~~ /a/ vowel.)

UKT 140609: The term half-form gave us considerable confusion until we realized that the position of the viram (short for virama) {a.þût} has to be included in our thinking, e.g.

{ka.} + {ka.} --> {ka.ka.} 'call of the crow' a disyllabic word
क «ka» + क «ka» --> कक «kaka»

{ka.} + viram + {ka.} --> {k~ka.} vertical conjunct
क «ka» + viram ् + क «ka» --> क्क «kka»

Note: {k~ka.} - the vertical conjunct cannot be pronounced in Bur-Myan. It is these conjuncts which are being called half-forms.

{ka.} + {ka.} + viram --> {kak} 'domino' a monosyllabic word
क «ka» + क «ka» + viram ् --> कक्
Both Bur-Myan and Skt-Dev allow a visible viram to be shown. However, Pal-Myan does not allow a visible {a.þût}.

{ta.} + {k~ka.} --> {tak~ka.} 'part of word for university' {tak~ka.þol}
- a disyllabic word which can be pronounced.
त «ta» + क्क «kka» --> तक्क «takka»
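The Skt-Dev combinations above can be reproduced with a toy transliterator. This is only a sketch: the two lookup tables hold just the few letters used in the examples, the output is a plain Latin rendering rather than Romabama or full IAST, and the function name is illustrative.

```python
# Toy tables: only the characters needed for the examples above.
CONSONANTS = {"\u0915": "k", "\u0924": "t"}      # KA, TA
VOWEL_SIGNS = {"\u093F": "i", "\u0940": "ī"}     # sign I, sign II
VIRAMA = "\u094D"
INHERENT_VOWEL = "a"


def transliterate(text):
    """Spell out consonants with their inherent, explicit, or killed vowel."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                # A dependent vowel sign overrides the inherent vowel.
                out.append(VOWEL_SIGNS[nxt])
                i += 1
            elif nxt == VIRAMA:
                # The viram kills the inherent vowel: nothing is added.
                i += 1
            else:
                out.append(INHERENT_VOWEL)
        else:
            out.append(ch)  # pass unknown characters through unchanged
        i += 1
    return "".join(out)


print(transliterate("\u0915\u0915"))              # KA KA
print(transliterate("\u0915\u094D\u0915"))        # KA VIRAMA KA
print(transliterate("\u0915\u0915\u094D"))        # KA KA VIRAMA
print(transliterate("\u0924\u0915\u094D\u0915"))  # TA KA VIRAMA KA
print(transliterate("\u0915\u093F"))              # KA + sign I
```

The only thing that changes between the four KA examples is where the viram sits in the stored sequence, which is the point made above about including its position in our thinking.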

Note: Westerners think that kk is a double consonant . I disagree with them because the first k belongs to the first syllable. It is a coda-consonant. The second k belongs to the second syllable. It is an onset-consonant. I have pointed out that the English word <success> is of this type.

<success> /sək'ses/ (DJPD16-515)
where the first c /k/ can very well be a palatal-plosive stop. It is the coda-consonant of the first syllable. The second c is dental-hissing-fricative. It is the onset-consonant of second syllable. There is no such thing as a double consonant . See my discussion on Pronouncing the double C :
http://www.antimoon.com/forum/t9999.htm 140609

(pdf 5/47)
Some Devanagari consonant glyphs ~~letters~~ have alternative presentation forms whose choice depends upon neighboring consonants. This variability is especially notable for {ra.} U+0930 र DEVANAGARI LETTER RA, which has numerous different forms, both as the initial element and as the final element of a consonant cluster. Only the nominal forms, rather than the contextual alternatives, are depicted in the code chart.

UKT 140610: The English <r> can have two pronunciations: /r/ in rhotic accents such as GA (General American), and /ɹ/ in non-rhotic or slightly rhotic accents such as RP (Received Pronunciation) commonly known as BBC-English.

It is also the case in Sanskrit words. There can be variations. There are two cases with which we are not familiar: repha and highly rhotic vowel. The repha form is similar to {king:si:} form (in our presentation of the {nga.}-killed). The highly rhotic vowel looks like our {ra.ric}-form. There is a third case similar to {ra.hswè:} of Mon-Myan.

The traditional Sanskrit/Devanagari character ~~alphabetic~~ encoding order for consonants follows articulatory phonetic principles, starting with velar consonants and moving forward to bilabial consonants, followed by liquids (approximants ?) and then fricatives. ISCII and the Unicode Standard both observe this traditional order.

UKT 140610: At one time, when my parents were young, every learned Myanmar Buddhist monk, nun, and elder was conversant in Pal-Myan, and some even in Sanskrit and Bengali. With every language, the script used was Myanmar with special aksharas added to take care of the uncommon pronunciations. However, at present few know even Sanskrit. That was the situation summed up by Bhamo Sayadaw U Kumara, when I approached him some years ago to introduce me to someone in Yangon to help me with my study of Skt-Dev. In the end, I had to go alone, and this pdf by Unicode Consortium was my first (non-human) teacher. At present I am doing my on-going work on A Practical Sanskrit Dictionary, by A. A. Macdonell 1893, MC-indx.htm (link chk 140610), by myself, with the help of my secretaries Daw KhinWutyi (whom I have taught Devanagari) and Daw Thuzar Myint.


Vowels

UKT 140609: Refer to BEPS Vowels - BHS-indx.htm / BEPS Vowels

Vowel Letter or Independent Vowel

UKT 140610: Independent vowels are what we call vow-let (vowel letters), such as {I.} इ , {I} ई , {U.} उ , {U} ऊ , and {É} ए . Their counterparts are Dependent vowels or vow-sign (vowel signs): {i.}-sign ि , {i}-sign ी , {u.}-sign ु , {u}-sign ू , and {é}-sign े . The vow-sign must have a consonant such as {ka.} क , or a "dummy" such as {a.}, to support it, e.g.

{ka.} + {i.}-sign --> {ki.}
क «ka» + ि «i»-sign --> कि «ki»

(pdf 5/47)
The independent vowels in Devanagari are characters ~~letters~~ that stand on their own. The writing system treats independent vowels as orthographic CV syllables in which the consonant is null. The independent vowel letters are used to write syllables that start with a vowel.


Dependent Vowel Signs (Matras)

The [vow-signs] dependent vowels serve as the common manner of writing the nuclear ~~non-inherent~~ vowel [of the CVÇ syllable] and are generally referred to as vowel signs, or as matras in Sanskrit [UKT: मात्रा «mātrā» ?]. [UKT ¶]

The vow-signs or dependent vowels do not stand alone; rather, they are visibly depicted in combination with a base consonant ~~letterform~~. A single consonant, or a consonant cluster, may have a dependent vowel applied to it to indicate the vowel quality of the syllable, when it is different from the inherent vowel. Explicit appearance of a dependent vowel in a syllable overrides the inherent vowel of a single consonant letter.

UKT 140610: I apply the term "inherent vowel" only to the basic aksharas - those given in the akshara-matrix (given in - BHS-indx.htm ), e.g.,

"the akshara {ka.} has the phoneme /k/ and inherent vowel /a/."

However, it is not the case with Unicode, and so take care to find out what it actually is whenever you see Unicode using "inherent". Moreover, Unicode has lumped the "base consonantal akshara" with "conjuncts" (which it calls "consonant clusters") together. Since the base consonant is always pronounceable, whereas most conjuncts are not, except those which are called medials , what Unicode has written above is inappropriate, and I have accordingly struck-through it.

The greatest variation among different Indic scripts is found in the way that the dependent vowels are applied to base letterforms. Devanagari has a collection of non-spacing dependent vowel signs that may appear above or below a consonant letter, as well as spacing dependent vowel signs that may occur to the right or to the left of a consonant letter or (p220end-p221begin) consonant cluster. [UKT ¶]

Other Indic scripts generally have one or more of these forms, but what is a non-spacing mark in one script may be a spacing mark in another. Also, some of the Indic scripts have single dependent vowels that are indicated by two or more glyph components -- and those glyph components may surround a consonant letter both to the left and right or may occur both above and below it.
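The spacing and non-spacing distinction is recorded in the Unicode character database as the General Category of each sign (`Mc` for spacing marks, `Mn` for non-spacing marks), which Python exposes through `unicodedata.category`. The sketch below lists the Devanagari vowel signs and then shows one two-part vowel from Bengali; `spacing_class` is an illustrative helper name.

```python
import unicodedata


def spacing_class(ch):
    """Describe a character by its General Category (Mn or Mc)."""
    category = unicodedata.category(ch)
    if category == "Mn":
        return "non-spacing"
    if category == "Mc":
        return "spacing"
    return "not a combining mark"


# Devanagari dependent vowel signs, U+093E .. U+094C.
for cp in range(0x093E, 0x094D):
    ch = chr(cp)
    print(f"U+{cp:04X}  {unicodedata.name(ch):32}  {spacing_class(ch)}")

# A Bengali vowel sign made of two components, one on each side of the
# consonant: its canonical decomposition lists the two parts.
bengali_o = "\u09CB"
parts = unicodedata.normalize("NFD", bengali_o)
print([unicodedata.name(p) for p in parts])
```

The category says nothing about the side on which a spacing sign is drawn; that information comes from the font and the rendering rules.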

The Devanagari script has only one character denoting a left-side dependent vowel sign: U+093F ि DEVANAGARI VOWEL SIGN I . Other Indic scripts either have no such vowel signs (Telugu and Kannada) or include as many as three of these signs (Bengali, Tamil, and Malayalam).

A one-to-one correspondence exists between the independent vowels and the dependent vowel signs. Independent vowels are sometimes represented by a sequence consisting of the independent form of the vowel /a/ followed by a dependent vowel sign. Figure 9-1 illustrates this relationship (see the notation formally described in the Rules for Rendering later in this section).

UKT 140610: Fig. 9-1: Pr-Sc 150% pdf . Unicode transliterations such as अ « A_n » may be ignored. Just use IAST transliteration such as अ «a». When I first wrote this text, my knowledge of Skt-Dev was nil, and I had to follow what Unicode has given.

The combination of the independent form of the default vowel /a/ (in the Devanagari script, U+0905 अ DEVANAGARI LETTER A ) with a dependent vowel sign may be viewed as an alternative spelling of the phonetic information normally represented by an isolated independent vowel form. However, these two representations should not be considered equivalent for the purposes of rendering. Higher-level text processes may choose to consider these alternative spellings equivalent in terms of information content, but such an equivalence is not stipulated by this standard.
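That the standard does not stipulate this equivalence can be confirmed with Unicode normalization: the independent vowel and the "A plus vowel sign" spelling stay distinct under every normalization form. A minimal check in Python, using the long vowel AA as the example:

```python
import unicodedata

independent_aa = "\u0906"        # DEVANAGARI LETTER AA
spelled_out = "\u0905\u093E"     # DEVANAGARI LETTER A + VOWEL SIGN AA

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    same = (unicodedata.normalize(form, independent_aa)
            == unicodedata.normalize(form, spelled_out))
    print(form, "equivalent" if same else "not equivalent")
```

A search or collation process that wants to treat the two spellings alike therefore has to add its own folding step; normalization alone will not do it.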


Virama (Halant) - {a.þût}

UKT 140610: I usually shorten "virama" to "viram". I rarely use "halant" because it is a Hindi term. The term in Bur-Myan is {a.þût} - meaning that it is the "vowel killer". Do not use the term "vowel suppressor" or "vowel omission sign" when {a.þût} in Bur-Myan simply means "to kill".

(pdf 6/47)
Devanagari employs a sign known in Sanskrit as the virama or vowel omission sign. In Hindi it is called hal or halant, and that term is used in referring to the virama or to a consonant with its vowel suppressed by the virama; the terms are used interchangeably in this section.

The virama sign, U+094D ् DEVANAGARI SIGN VIRAMA, nominally serves to cancel (or kill) the inherent vowel of the consonant to which it is applied. [UKT ¶]

UKT 140610: Note the "flag" {tän-hkwun} to indicate {a.þût} . It is shown above the akshara. However, in Skt-Dev and Hindi, the sign is shown below the akshara, ् . The terms used by Unicode in the following para, "live consonant" & "dead consonant", are not general terms, but I leave them as they are because of the information they provide.

Remember, the basis of the Akshara system of writing (the Abugida) is the syllable. And since the pronunciation or "life" of the syllable is due to the vowel, loss of the vowel means loss of life, and the syllable becomes dead. The viram aka {a.þût} is the hallmark of the Akshara system, and it is not found in Alphabetic systems.

The canonical structure of the syllable is CVÇ, superficially the same as in Alphabetic system. Yet the difference is there because the coda is always a killed consonant (Ç) whereas the onset (C) is always made up of a basic consonant such as {ka.}. In Bur-Myan, the onset can be a medial (pronounceable conjunct) such as {kya.}.

However, you will see below that in Skt-Dev, the onset is usually a conjunct because of which they have to employ a schwa /ə/ in the conjunct. Thus, {kya.} /kya./ becomes {k~ya.} /kə'ya./. Pal-Myan uses {kya.}, Skt-Myan {k~ya.}. Now we are ahead of the discussion and we will come back to it later.

When a consonant has lost its inherent vowel by the application of virama, it is known as a dead consonant; in contrast, a live consonant is one that retains its inherent vowel or is written with an explicit dependent vowel sign. [UKT ¶]

UKT 140610: Explicit viram {a.þût} is only allowed in Bur-Myan and in Skt-Dev ( ् ). It is not allowed in Pal-Myan even though the idea of the {a.þût} is there. Now, when someone tells you that "there is no {a.þût} in Pali" don't argue with him. Just keep in mind that he doesn't know his Aksharas!

In {k~ka.} , the top member is the killed consonant & the bottom the "live" consonant. On coupling with {ta.}:

{ta.} + {k~ka.} --> {tak~ka.}
{k~ka.} is mute or "dead"
{tak~ka.} is pronounceable or "live"

We will come across the explicit vowel sign Tagun {tän-hkwun} later.

In the Unicode Standard, a dead consonant is defined as a sequence consisting of a consonant letter followed by a virama. The default rendering for a dead consonant is to position the virama as a combining mark bound to the consonant letterform.

For example, if C_n denotes the nominal form of consonant C, and C_d denotes the dead consonant form, then a dead consonant is encoded as shown in Figure 9-2.

UKT: Image captured from 150% size of PDF page.
The equivalent of the above in Bur-Myan:

{ta.} + ( {a.þût}) --> {t}
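Unicode's definition of a dead consonant, a consonant letter followed by a virama, translates directly into a small check on the stored sequence. The sketch below assumes the core consonant range U+0915 to U+0939; the function name is illustrative.

```python
VIRAMA = "\u094D"


def label_consonants(text):
    """Label each consonant as dead (followed by virama) or live."""
    labels = []
    for i, ch in enumerate(text):
        if 0x0915 <= ord(ch) <= 0x0939:
            # Slicing avoids an index error when ch is the last character.
            dead = text[i + 1:i + 2] == VIRAMA
            labels.append((ch, "dead" if dead else "live"))
    return labels


# TA + VIRAMA alone, then TA + KA + VIRAMA + KA
print(label_consonants("\u0924\u094D"))
print(label_consonants("\u0924\u0915\u094D\u0915"))
```

In the second string the first KA is the dead member of the conjunct and the second KA is the live one, matching the top and bottom members of {k~ka.} described above.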


Consonant Conjuncts - Conjuncts & Medials

UKT 140610: Whereas disyllabic conjuncts are common in Skt-Dev, it is only in Bur-Myan among the BEPS languages that we find monosyllabic medials such as {kya.}. Because of this, almost no non-native could pronounce my Burmese name, which eventually made me change my name officially when I became a Canadian citizen.

Medials and Medial formers

- UKT 140610

When two consonants are "tied" together as in a vertical conjunct {k~ka.}, the top member is the killed-consonant even though the viram {a.þût} is not shown. Conjuncts are unpronounceable to a purist Bur-Myan speaker. But the Skt-Dev speakers just put in a schwa /ə/ and pronounce the conjunct as a disyllable.

The bottom member is what we call the "conjunct former". When we use an approximant , (see - BHS-indx.htm) such as {ya.} /ja./, {ra.} /ɹa./, and {wa.} /wa./, as conjunct former, the conjunct becomes pronounceable. It has become a medial and is pronounced as a monosyllable. Thus with {ka.} as the top member, we get {kya.} /kja./, {kra.} /kɹa./, and {kwa.} /kwa./.

In my attempt to come up with Skt-Myan from Skt-Dev, I have had to formulate a new word, {kRi.}, which could be mistaken for every-day Bur-Myan {kri.}. I use different hood-lengths to differentiate the two. {kRi.} कृ «kṛ» is not a conjunct, because it is formed from the highly rhotic Skt-Dev vowel, ऋ / ृ « ṛ», which is widely known as vocalic-R, because of which I had thought it to be a consonant.

क «ka» + ृ «ṛ» --> कृ «kṛ»
- in the name "Krishna" कृष्ण «kṛṣṇa» - the Hindu déva-god

Note the absence of English vowels <a, e, i, o, u>. The transliteration «ṛ» is a regular vowel.

In Bur-Myan {ha.} is also a medial former, but it is mostly used with nasals and plosive-stops modified with other approximants. In Dawè dialect of Bur-Myan {la.} is also a medial former.

(pdf 7/47)
The Indic scripts are noted for a large number of consonant conjunct forms that serve as orthographic abbreviations (ligatures) of two or more adjacent letterforms. This abbreviation takes place only in the context of a consonant cluster. An orthographic consonant cluster is defined as a sequence of characters that represents one or more dead consonants (denoted C_d) followed by a normal, live consonant letter (denoted C_l).

Under normal circumstances, a consonant cluster is depicted with a conjunct glyph if such a glyph is available in the current font(s). In the absence of a conjunct glyph, the one or more dead consonants that form part of the cluster are depicted using half-form glyphs. In the absence of half-form glyphs, the dead consonants are depicted using the nominal consonant forms combined with visible virama signs (see Figure 9-3).

Image captured from 150% size of PDF page. The formation of Bur-Myan conjuncts shown with the position of the viram {a.þût} are:

{ga.} + viram + {Da.} --> {g~Da.}
{ka.} + viram + {ka.} --> {k~ka.}
{ka.} + viram + {Sa.} --> {k~Sa.} :
   ordinarily written as <ksa> or pseudo-Kha
{ra.} + viram + {ka.} --> {r~ka.}
   in repha form shown similar to {king:si:}

The above conjuncts are all mute. Yet Skt-Dev speakers put in a schwa /ə/ to pronounce them. You can see examples of Skt-Dev words using these conjuncts in my edition of A Practical Sanskrit Dictionary by A. A. Macdonell, 1893. MC-indx.htm (link chk 140610)

A typical example is {k~Sa.} क्ष «ksa» which is used in the place of {hka.} ख «kha». On p077 column3 of Macdonell (with my addition of U Hoke Sein, Pal-Myan Dictionary), we see:

• क्षत्रिय ksatr-iya = क ् ष त ् र ि य
   Skt: - a. ruling; m. ruler; man, â, f. woman, of the military caste; n. sovereign power, dominion: - Mac077c3
   Pal: {hkût~ti.ya.} - mfn. one who has command over water and land, ruling class. m. a person of ruling class -- UHS-PMD0343

The spelling difference between Skt-Dev (Indo-European) and Pal-Myan (Tibeto-Burman) can be explained in terms of linguistic groups. Skt-Dev, being IE, is hissing-sibilant, whereas Pal-Myan is non-hissing thibilant.

(pdf 7/47)
A number of types of conjunct formations appear in these examples:
(1) a half-form of GA in its combination with the full form of DHA;
(2) a vertical conjunct K.KA; and
(3) a fully ligated conjunct K.SSA, in which the components are no longer distinct.
Note that in example (4) in Figure 9-3, the dead consonant RA_d is depicted with the nonspacing combining mark RA_sup (repha).

A well-designed Indic script font may contain hundreds of conjunct glyphs, but they are not encoded as Unicode characters because they are the result of ligation of distinct letters. Indic script rendering software must be able to map appropriate combinations of characters in context to the appropriate conjunct glyphs in fonts.
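The point that conjuncts are sequences of ordinary characters, not encoded characters of their own, can be sketched in Python. This is only an illustration using the standard unicodedata module. The names FIGURE_9_3 and describe are made up for this sketch, and the four sequences are my reading of the clusters in Figure 9-3.

```python
import unicodedata

VIRAMA = "\u094D"  # U094D DEVANAGARI SIGN VIRAMA

# The four clusters of Figure 9-3, written as plain character sequences.
# The labels are informal; only the code points matter.
FIGURE_9_3 = {
    "G.DHA": "\u0917" + VIRAMA + "\u0927",
    "K.KA": "\u0915" + VIRAMA + "\u0915",
    "K.SSA": "\u0915" + VIRAMA + "\u0937",
    "R.KA": "\u0930" + VIRAMA + "\u0915",
}

def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for label, cluster in FIGURE_9_3.items():
    print(label, cluster, describe(cluster))
```

Each cluster is three characters long in memory, whatever single glyph the font may choose to draw for it.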


Explicit Virama (Halant)

UKT 140610: I have been told again and again that "there is no {a.þût} in Pali". This statement, though apparently correct, is totally misleading in essence, because the idea of "killing the vowel" is in all Akshara ways of writing, which include Pali. It is only that the {a.þût}, indicated by a Tagun {tän-hkwun} 'flag', is not explicitly shown in Pali. Explicit viram {a.þût} is shown only in Bur-Myan and in Skt-Dev ( ् ).

(pdf 7/47)
Normally a virama character serves to create dead consonants that are, in turn, combined with subsequent consonants to form conjuncts. This behavior usually results in a virama sign not being depicted visually. [UKT ¶]

UKT 140610: From now on, the Unicode text is directed to font makers and to designers of rendering engines. Fonts such as Arial Unicode MS and Lucida Sans Unicode differ in what they support: though Arial Unicode MS is suitable for Skt-Dev, it is not suitable for Bangla-Bengali. For the latter, which has split vowels similar to {kau:}, I have to use Lucida Sans Unicode.

Lucida Sans Unicode: কো

Now, try displaying in Arial Unicode MS:

Arial Unicode MS: কো

The Bengali akshara ক «ka» stays in the middle only in Lucida. In Arial it precedes the split vowel, giving an unintelligible "word".

Occasionally, this default behavior is not desired when a dead consonant should be excluded from conjunct formation, in which case the virama sign is visibly rendered. To accomplish this goal, the Unicode Standard adopts the convention of placing the character U200C ‌ ZERO WIDTH NON-JOINER immediately after the encoded dead consonant that is to be excluded from conjunct formation. In this case, the virama sign is always depicted as appropriate for the consonant to which it is attached.

UKT: Characters such as U200C ZERO WIDTH NON-JOINER are presented in Code charts similar to the one on the right: where ZWNJ stands for ZERO WIDTH NON-JOINER. See U2000 General Punctuation (broken link 140615)


(pdf 8/47)
For example, in Figure 9-4, the use of ZERO WIDTH NON-JOINER prevents the default formation of the conjunct form क ्ष (K.SSA_n). (UKT: The previous character is obtained by inputting U0915 U094D U0937)

Image captured from 150% of PDF page.
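The ZERO WIDTH NON-JOINER convention can be written out as a short Python sketch. The helper name with_explicit_virama is made up for illustration. How the two strings actually look on screen depends on the font and rendering engine, so the code only shows what is stored in memory.

```python
ZWNJ = "\u200C"    # U200C ZERO WIDTH NON-JOINER
VIRAMA = "\u094D"  # U094D DEVANAGARI SIGN VIRAMA
KA, SSA = "\u0915", "\u0937"

def with_explicit_virama(dead, live):
    """Encode dead consonant + live consonant so that the virama stays visible."""
    return dead + VIRAMA + ZWNJ + live

# Default sequence: a renderer may draw the conjunct K.SSA.
default_form = KA + VIRAMA + SSA
# With ZWNJ: a renderer is expected to draw KA with a visible virama, then SSA.
explicit_form = with_explicit_virama(KA, SSA)

print([f"U{ord(ch):04X}" for ch in default_form])
print([f"U{ord(ch):04X}" for ch in explicit_form])
```

The only difference between the two sequences is the single invisible character placed immediately after the virama.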


Explicit Half-Consonants

(pdf 8/47)
When a dead consonant participates in forming a conjunct, the dead consonant form is often absorbed into the conjunct form, such that it is no longer distinctly visible. In other contexts, the dead consonant may remain visible as a half-consonant form. In general, a half-consonant form is distinguished from the nominal consonant form by the loss of its inherent vowel stem, a vertical stem appearing to the right side of the consonant form. In other cases, the vertical stem remains but some part of its right-side geometry is missing.

In certain cases, it is desirable to prevent a dead consonant from assuming full conjunct formation yet still not appear with an explicit virama. In these cases, the half-form of the consonant is used. To explicitly encode a half-consonant form, the Unicode Standard adopts the convention of placing the character U200D ZERO WIDTH JOINER immediately after the encoded dead consonant. The ZERO WIDTH JOINER denotes a nonvisible letter that presents linking or cursive joining behavior on either side (that is, to the previous or following letter). Therefore, in the present context, the ZERO WIDTH JOINER may be considered to present a context to which a preceding dead consonant may join so as to create the half-form of the consonant.

For example, if C_h denotes the half-form glyph of consonant C, then a half-consonant form is encoded as shown in Figure 9-5.

Image captured from 150% of PDF page

• In the absence of the ZERO WIDTH JOINER, this sequence would normally produce the full conjunct form क्ष (K.SSA_n). This encoding of half-consonant forms also applies in the absence of a base letterform. That is, this technique may also be used to encode independent half-forms, as shown in Figure 9-6.

Image captured from 150% size of PDF page.
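As a companion sketch, the half-consonant convention with ZERO WIDTH JOINER can be written the same way in Python. The helper name half_form is made up for illustration, and the two sequences are my reading of Figures 9-5 and 9-6.

```python
ZWJ = "\u200D"     # U200D ZERO WIDTH JOINER
VIRAMA = "\u094D"  # U094D DEVANAGARI SIGN VIRAMA
KA, SSA = "\u0915", "\u0937"

def half_form(consonant):
    """Encode the explicit half-form of a consonant: consonant + virama + ZWJ."""
    return consonant + VIRAMA + ZWJ

# Half-form of KA followed by a full SSA (as in Figure 9-5).
half_ka_then_ssa = half_form(KA) + SSA
# Independent half-form with no following base letter (as in Figure 9-6).
independent_half_ka = half_form(KA)

print([f"U{ord(ch):04X}" for ch in half_ka_then_ssa])
print([f"U{ord(ch):04X}" for ch in independent_half_ka])
```

Whether a half-form glyph actually appears still depends on the font having one. Otherwise a renderer has to fall back to another form.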


(pdf 9/47)
Consonant Forms. In summary, each consonant may be encoded such that it denotes a live consonant, a dead consonant that may be absorbed into a conjunct, or the half-form of a dead consonant (see Figure 9-7).

Image captured from 150% size of PDF page.


Rendering

(pdf 9/47)
Rules for Rendering. This section provides more formal and detailed rules for minimal rendering of Devanagari as part of a plain text sequence. It describes the mapping between Unicode characters and the glyphs in a Devanagari font. It also describes the combining and ordering of those glyphs.

These rules provide minimal requirements for legibly rendering interchanged Devanagari text. As with any script, a more complex procedure can add rendering characteristics, depending on the font and application.

It is important to emphasize that in a font that is capable of rendering Devanagari, the number of glyphs is greater than the number of Devanagari characters.

Notation. In the next set of rules, the following notation applies:

	C_n	Nominal glyph form of consonant C as it appears in the code charts.
	C_l	A live consonant, depicted identically to C_n.
	C_d	Glyph depicting the dead consonant form of consonant C.
	C_h	Glyph depicting the half-consonant form of consonant C.
	L_n	Nominal glyph form of a conjunct ligature consisting of two or more component consonants. A conjunct ligature composed of two consonants X and Y is also denoted X.Y_n.
	RA_sup	A nonspacing combining mark glyph form of U0930 र DEVANAGARI LETTER RA positioned above or attached to the upper part of a base glyph form. This form is also known as repha.
	RA_sub	A nonspacing combining mark glyph form of U0930 र DEVANAGARI LETTER RA positioned below or attached to the lower part of a base glyph form.
	V_vs	Glyph depicting the dependent vowel sign form of a vowel V.
	VIRAMA_n	The nominal glyph form of the nonspacing combining mark depicting U094D ् DEVANAGARI SIGN VIRAMA.

• A virama character is not always depicted; when it is depicted, it adopts this nonspacing mark form.


(pdf 10/47 begin)
Dead Consonant Rule. The following rule logically precedes the application of any other rule to form a dead consonant. Once formed, a dead consonant may be subject to other rules described next.

Rule 1. When a consonant C_n precedes a VIRAMA_n , it is considered to be a dead consonant C_d . A consonant C_n that does not precede VIRAMA_n is considered to be a live consonant C_l.

Image captured from 150% size of PDF page.

{ta.} + viram --> {t}
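Rule 1 can be tried out with a small Python sketch that labels each consonant in a string as dead or live. It rests on two simplifying assumptions of mine: only U0915..U0939 are treated as consonant letters, and a nukta standing between a consonant and its virama is not handled. The function names are made up for illustration.

```python
VIRAMA = "\u094D"  # U094D DEVANAGARI SIGN VIRAMA

def is_consonant(ch):
    """Treat U0915..U0939 (KA..HA) as consonant letters; a simplification."""
    return "\u0915" <= ch <= "\u0939"

def tag_consonants(text):
    """Return (consonant, 'dead' or 'live') for each consonant, following Rule 1."""
    tags = []
    for i, ch in enumerate(text):
        if is_consonant(ch):
            following = text[i + 1] if i + 1 < len(text) else ""
            tags.append((ch, "dead" if following == VIRAMA else "live"))
    return tags

# The word for "ruling class" quoted earlier: KA, virama, SSA, TA, virama, RA, vowel sign I, YA.
ksatriya = "\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"
for consonant, state in tag_consonants(ksatriya):
    print(f"U{ord(consonant):04X}", state)
```

In this word KA and TA come out dead, and SSA, RA and YA come out live.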

Consonant RA Rules. The character U0930 र DEVANAGARI LETTER RA takes one of a number of visual forms depending on its context in a consonant cluster. By default, this letter is depicted with its nominal glyph form (as shown in the U0900 Devanagari code chart). In some contexts, it is depicted using one of two nonspacing glyph forms that combine with a base letterform.

Rule 2. If the dead consonant RA_d precedes a consonant, then it is replaced by the superscript nonspacing mark RA_sup , which is positioned so that it applies to the logically subsequent element in the memory representation.

Image captured from 150% size of PDF page.

Rule 3. If the superscript mark RA_sup is to be applied to a dead consonant and that dead consonant is combined with another consonant to form a conjunct ligature, then the mark is positioned so that it applies to the conjunct ligature form as a whole.

Image captured from 150% size of PDF page.

Rule 4. If the superscript mark RA_sup is to be applied to a dead consonant that is subsequently replaced by its half-consonant form, then the mark is positioned so that it applies to the form that serves as the base of the consonant cluster.

Image captured from 150% size of PDF page.


Rule 5. In conformance with the ISCII standard, the half-consonant form RRA_h is represented as eyelash-RA. This form of RA is commonly used in writing Marathi and Newari.

Image captured from 150% size of PDF page.

Rule 5a. For compatibility with The Unicode Standard, Version 2.0, if the dead consonant RA_d precedes ZERO WIDTH JOINER, then the half-consonant form RA_h, depicted as eyelash-RA, is used instead of RA_sup .

Image captured from 150% size of PDF page.

Rule 6. Except for the dead consonant RA_d, when a dead consonant C_d precedes the live consonant RA_l , then C_d is replaced with its nominal form C_n , and RA is replaced by the subscript nonspacing mark RA_sub, which is positioned so that it applies to C_n.

Image captured from 150% size of PDF page.

Rule 7. For certain consonants, the mark RA_sub may graphically combine with the consonant to form a conjunct ligature form. These combinations, such as the one shown here, are further addressed by the ligature rules described shortly.

Image captured from 150% size of PDF page.

Rule 8. If a dead consonant (other than RA_d ) precedes RA_d, then the substitution of RA for RA_sub is performed as described above; however, the VIRAMA that formed RA_d remains so as to form a dead consonant conjunct form.

Image captured from 150% size of PDF page.

A dead consonant conjunct form that contains an absorbed RA_d may subsequently combine to form a multipart conjunct form.

Image captured from 150% size of PDF page.
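The RA rules above can be summarized in a rough Python sketch that reports which form of RA a sequence calls for. It is my own simplification, not an implementation from the Unicode text: it covers only Rules 2, 5, 5a, 6 and 8, treats U0915..U0939 as the consonants, and ignores nukta. The function names and the form labels are made up for illustration.

```python
VIRAMA, ZWJ = "\u094D", "\u200D"
RA, RRA = "\u0930", "\u0931"

def is_consonant(ch):
    """Treat U0915..U0939 (KA..HA) as consonant letters; a simplification."""
    return "\u0915" <= ch <= "\u0939"

def ra_forms(text):
    """Return (index, form) for each RA or RRA whose form the rules decide."""
    forms = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        nxt2 = text[i + 2] if i + 2 < len(text) else ""
        # A dead consonant other than RA stands immediately before this letter.
        after_dead = (i >= 2 and text[i - 1] == VIRAMA
                      and is_consonant(text[i - 2]) and text[i - 2] != RA)
        if ch == RRA and nxt == VIRAMA:
            forms.append((i, "eyelash"))   # Rule 5
        elif ch == RA and nxt == VIRAMA and nxt2 == ZWJ:
            forms.append((i, "eyelash"))   # Rule 5a
        elif ch == RA and after_dead:
            forms.append((i, "RA_sub"))    # Rules 6 and 8
        elif ch == RA and nxt == VIRAMA and is_consonant(nxt2):
            forms.append((i, "RA_sup"))    # Rule 2 (repha)
    return forms

print(ra_forms("\u0930\u094D\u0915"))  # RA, virama, KA
print(ra_forms("\u0915\u094D\u0930"))  # KA, virama, RA
```

The subscript case is tested before the repha case so that a sequence such as TA, virama, RA, virama, YA is treated under Rule 8 rather than Rule 2.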


Modifier Mark Rules. In addition to vowel signs, three other types of combining marks may be applied to a component of an orthographic syllable or to the syllable as a whole: nukta, bindus, and svaras.

Rule 9. The nukta sign, which modifies a consonant form, is placed immediately after the consonant in the memory representation and is attached to that consonant in rendering. If the consonant represents a dead consonant, then NUKTA should precede VIRAMA in the memory representation.

Image captured from 150% size of PDF page.

Rule 10. The other modifying marks, bindus and svaras, apply to the orthographic syllable as a whole and should follow (in the memory representation) all other characters that constitute the syllable. In particular, the bindus should follow any vowel signs, and the svaras should come last. The relative placement of these marks is horizontal rather than vertical; the horizontal rendering order may vary according to typographic concerns.

Image captured from 150% size of PDF page.
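The ordering in Rule 9 (NUKTA before VIRAMA) can be checked against Unicode normalization in Python. This is an observation of mine using the standard unicodedata module, not something stated in the quoted text: the canonical combining class of the nukta is lower than that of the virama, so normalization puts a misordered pair into the order that Rule 9 asks for.

```python
import unicodedata

KA, NUKTA, VIRAMA = "\u0915", "\u093C", "\u094D"

# Canonical combining classes: within a run of marks, lower values sort first.
print(unicodedata.combining(NUKTA), unicodedata.combining(VIRAMA))

wrong_order = KA + VIRAMA + NUKTA  # virama typed before nukta
right_order = KA + NUKTA + VIRAMA  # the order given in Rule 9

normalized = unicodedata.normalize("NFC", wrong_order)
print([f"U{ord(ch):04X}" for ch in normalized])
```

Bindus and svaras are not covered by this check. Their placement after the vowel signs (Rule 10) is a matter of how the text is entered.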

Ligature Rules. Subsequent to the application of the rules just described, a set of rules governing ligature formation apply. The precise application of these rules depends on the availability of glyphs in the current font(s) being used to display the text.

Rule 11. If a dead consonant immediately precedes another dead consonant or a live consonant, then the first dead consonant may join the subsequent element to form a two-part conjunct ligature form.

Image captured from 150% size of PDF page.

Rule 12. A conjunct ligature form can itself behave as a dead consonant and enter into further, more complex ligatures.

Image captured from 150% size of PDF page.

A conjunct ligature form can also produce a half-form.

Image captured from 150% size of PDF page.


Rule 13. If a nominal consonant or conjunct ligature form precedes RA_sub as a result of the application of rule R6, then the consonant or ligature form may join with RA_sub to form a multipart conjunct ligature (see rule R6 for more information).

Image captured from 150% size of PDF page.

Rule 14. In some cases, other combining marks will combine with a base consonant, either attaching at a nonstandard location or changing shape. In minimal rendering there are only two cases, RA_l with U_vs or UU_vs.

Image captured from 150% size of PDF page.

Memory Representation and Rendering Order. The order for storage of plain text in Devanagari and all other Indic scripts generally follows phonetic order; that is, a CV syllable with a dependent vowel is always encoded as a consonant letter C followed by a vowel sign V in the memory representation. This order is employed by the ISCII standard and corresponds to both the phonetic and the keying order of textual data (see Figure 9-8).

Image captured from 150% size of PDF page.

Because Devanagari and other Indic scripts have some dependent vowels that must be depicted to the left side of their consonant letter, the software that renders the Indic scripts must be able to reorder elements in mapping from the logical (character) store to the presentational (glyph) rendering. For example, if C_n denotes the nominal form of consonant C, and V_vs denotes a left-side dependent vowel sign form of vowel V, then a reordering of glyphs with respect to encoded characters occurs as just shown.

Rule 15. When the dependent vowel I_vs is used to override the inherent vowel of a syllable, it is always written to the extreme left of the orthographic syllable. If the orthographic syllable contains a consonant cluster, then this vowel is always depicted to the left of that cluster. For example:

Image captured from 150% size of PDF page.
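The difference between memory order and rendering order can be sketched in Python for the one case named in Rule 15. The function visual_order is a made-up name, and it assumes its input is a single orthographic syllable. A real rendering engine must first find the syllable boundaries and works on glyphs, not characters.

```python
VOWEL_SIGN_I = "\u093F"  # U093F DEVANAGARI VOWEL SIGN I

def visual_order(syllable):
    """Return the characters of one syllable in left-to-right drawing order.

    Only the left-side vowel sign I is moved; everything else keeps its
    logical (memory) order. The result is for illustration only and is
    not a valid way to store text.
    """
    if VOWEL_SIGN_I in syllable:
        rest = syllable.replace(VOWEL_SIGN_I, "", 1)
        return VOWEL_SIGN_I + rest
    return syllable

ki = "\u0915\u093F"               # KA + vowel sign I
tri = "\u0924\u094D\u0930\u093F"  # TA + virama + RA + vowel sign I

for syllable in (ki, tri):
    print([f"U{ord(ch):04X}" for ch in syllable], "->",
          [f"U{ord(ch):04X}" for ch in visual_order(syllable)])
```

In both cases the vowel sign is stored last but drawn first, to the left of the whole cluster.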

Sample Half-Forms. Table 9-1 shows examples of half-consonant forms that are commonly used with the Devanagari script. These forms are glyphs, not characters. They may be encoded explicitly using ZERO WIDTH JOINER as shown; in normal conjunct formation, they may be used spontaneously to depict a dead consonant in combination with subsequent consonant forms.


Image captured from 150% size of PDF page.

Sample Ligatures. Table 9-2 shows examples of conjunct ligature forms that are commonly used with the Devanagari script. These forms are glyphs, not characters. Not every writing system that employs this script uses all of these forms; in particular, many of these forms are used only in writing Sanskrit texts. Furthermore, individual fonts may provide fewer or more ligature forms than are depicted here.

Note to TIL editor 140611: I could not get a clear picture by the regular method
of screen-capture. I had to do it twice, once for the upper portion, and a
second time for the lower portion. I have saved them as Table0902_1.gif
and Table0902_2.gif, and put them together on this html page.

Image captured from 150% size of PDF page. This table appeared across pages 229 and 230 in the original pdf file.

Sample Half-Ligature Forms. In addition to half-form glyphs of individual consonants, half-forms are used to depict conjunct ligature forms. A sample of such forms is shown in Table 9-3. These forms are glyphs, not characters. They may be encoded explicitly using ZERO WIDTH JOINER as shown; in normal conjunct formation, they may be used spontaneously to depict a conjunct ligature in combination with subsequent consonant forms.

Image captured from 150% size of PDF page.

Language-Specific Allographs. In Marathi and some South Indian orthographies, variant glyphs are preferred for U0932 ल DEVANAGARI LETTER LA and U0936 श DEVANAGARI LETTER SHA, as shown in Figure 9-9. Marathi also makes use of the eyelash form of the letter RA, as discussed previously in rule R5.

Image captured from 150% size of PDF page.


Combining Marks

Devanagari and other Indic scripts have a number of combining marks that could be considered diacritic. One class of these marks, known as bindus {bain~du.} (MLC MED2006-315), is represented by [UKT ¶]:

U0901 ँ DEVANAGARI SIGN CHANDRABINDU and

U0902 ं DEVANAGARI SIGN ANUSVARA.
Bur-Myan: {þé:þé:ting} 'dot above' . It imparts a nasal sound.

These marks indicate nasalization or final nasal closure of a syllable. [UKT ¶]:

U093C ़ DEVANAGARI SIGN NUKTA
   Bur-Myan: {auk-mric} 'dot below'

{auk-mric} 'dot below' gives a "creak". See MLC MED2006-620. It is used to shorten the short vowel duration from one eye-blink to 1/2 eye-blink. In Mon-Myan and Skt-Dev, its place is taken by {wic-sa.}. e.g.
   Mon-Myan: {na.} --> {na:.}
   Skt-Dev:      न «na» --> नः

[NUKTA] is a true diacritic. It is used to extend the basic set of consonant letters by modifying them (with a subscript dot in Devanagari) to create new letters. U0951 ॑.. U0954 ॔ are a set of combining marks used in transcription of Sanskrit texts.
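How the nukta extends the consonant set can be seen in a short Python sketch. It uses U0958 DEVANAGARI LETTER QA as the example, which is my choice and not one made in the text above. Decomposing that letter with the standard unicodedata module gives back the basic consonant plus the nukta.

```python
import unicodedata

QA = "\u0958"  # U0958 DEVANAGARI LETTER QA

# Canonical decomposition splits the extended letter into base consonant + nukta.
decomposed = unicodedata.normalize("NFD", QA)
print([(f"U{ord(ch):04X}", unicodedata.name(ch)) for ch in decomposed])
```

The output lists U0915 DEVANAGARI LETTER KA followed by U093C DEVANAGARI SIGN NUKTA.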


Digits

Each Indic script has a distinct set of digits appropriate to that script. These digits may or may not be used in ordinary text in that script. European digits have displaced the Indic script forms in modern usage in many of the scripts. Some Indic scripts - notably Tamil - lack a distinct digit for zero.
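As a small illustration in Python, the Devanagari digits U0966..U096F carry the same numeric values as the European digits, and Python's built-in int accepts them directly. The variable names are made up for this sketch.

```python
import unicodedata

# U0966..U096F are DEVANAGARI DIGIT ZERO .. DEVANAGARI DIGIT NINE.
devanagari_digits = "".join(chr(cp) for cp in range(0x0966, 0x0970))
values = [unicodedata.digit(ch) for ch in devanagari_digits]
print(devanagari_digits, values)

# int() understands any Unicode decimal digits, not only 0-9.
number = int("\u0967\u096F\u096B\u0966")  # digits one, nine, five, zero
print(number)
```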


Punctuation and Symbols

U0964 । DEVANAGARI DANDA is similar to a full stop. Corresponding forms occur in many other Indic scripts. U0965 ॥ DEVANAGARI DOUBLE DANDA marks the end of a verse in traditional texts. U0970 ॰ DEVANAGARI ABBREVIATION SIGN appears after letters or combinations.

UKT: Danda and double danda correspond to Bur-Myan {poad hprût} (colloquially {poad hti:}) and {poad ma.} .

Many modern languages written in the Devanagari script intersperse punctuation derived from the Latin script. Thus U002C , COMMA and U002E . FULL STOP are freely used in writing Hindi, and the danda is usually restricted to more traditional texts.

Encoding Structure. The Unicode Standard organizes the nine principal Indic scripts in blocks of 128 encoding points each. The first six columns in each script are isomorphic with the ISCII-1988 encoding, except that the last 11 positions (U0955 .. U095F in Devanagari, for example), which are unassigned or undefined in ISCII-1988, are used in the Unicode encoding.

The seventh column in each of these scripts, along with the last 11 positions in the sixth column, represent additional character assignments in the Unicode Standard that are matched across all nine scripts. For example, positions U+xx66 ... U+xx6F and U+xxE6 ... U+xxEF code the Indic script digits for each script.

The eighth column for each script is reserved for script-specific additions that do not correspond from one Indic script to the next.
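The parallel layout of the nine blocks can be sketched in Python: a character and its counterpart in another script sit at the same offset inside their 128-position blocks. The block starting points below are the standard Unicode block starts, which the text above does not list. The function name counterpart is made up for illustration.

```python
import unicodedata

BLOCK_SIZE = 0x80  # 128 code points per script block

# Start of each of the nine principal Indic script blocks.
SCRIPT_BASES = {
    "DEVANAGARI": 0x0900, "BENGALI": 0x0980, "GURMUKHI": 0x0A00,
    "GUJARATI": 0x0A80, "ORIYA": 0x0B00, "TAMIL": 0x0B80,
    "TELUGU": 0x0C00, "KANNADA": 0x0C80, "MALAYALAM": 0x0D00,
}

def counterpart(ch, script):
    """Return the character at the same block offset in another script."""
    offset = ord(ch) % BLOCK_SIZE
    return chr(SCRIPT_BASES[script] + offset)

print(unicodedata.name(counterpart("\u0915", "BENGALI")))  # from DEVANAGARI LETTER KA
for script in SCRIPT_BASES:
    print(unicodedata.name(counterpart("\u0967", script)))  # from DEVANAGARI DIGIT ONE
```

Not every position is assigned in every script (Tamil, for example, has far fewer consonant letters), so unicodedata.name can raise ValueError for some counterparts.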


Other Languages. Sindhi makes use of U0974 DEVANAGARI LETTER SHORT YA. Several implosive consonants in Sindhi are realized as combinations with nukta and U0952 ॒ DEVANAGARI STRESS SIGN ANUDATTA. Konkani makes use of additional sounds that can be made with combinations such as U091A च DEVANAGARI LETTER CA plus U093C ़ DEVANAGARI SIGN NUKTA and U091F ट DEVANAGARI LETTER TTA plus U0949 ॉ DEVANAGARI VOWEL SIGN CANDRA O.


UKT notes

Abugida

From: http://www.everything2.org/index.pl?node=abugida

An abugida, also called an alphasyllabary, is a writing system wherein the basic symbols represent a consonant plus an unmarked vowel. When a different vowel is wanted, a diacritic or some other modification is made to the sign. The sign used to indicate the vowel is dropped is called a virama, or a "vowel killer".

The word "abugida" comes from the first few signs of the Ethiopic Amharic script, which is an example of an abugida. Devanagari is another abugida, used in India.


Asoka and Kublai Khan

syn. Ashoka

- UKT sometime in 2004, from various sources
now being edited: 140607

Asoka {a.þau:ka.}, known as “the Great”. Died 232 B.C. 1. King of Magadha (273-232) who was converted to Buddhism and adopted it as the state religion. -- AHTD

Genghis Khan also Jen·ghis Khan or Jenghiz Khan, Originally Temujin. 1162?-1227 1. Mongol conqueror who united the Mongol tribes and in 1206 took the name Genghis Khan ( “supreme conqueror ”). He annexed northern China, central Asia, Iran, and southern Russia. -- AHTD

Asoka {a.þau:ka.} was the grandson of Chandra Gupta of the Mauriya Empire. A feared warrior like his grandfather, he ruled over a vast empire extending from Afghanistan in the west to the borders of Myanmar in the east, and from the foot-hills of the Himalayas in the north to about half of modern India in the south. He was converted to Buddhism due to the influence of his wife Devi.

Asoka wrote in a script now undeservedly named Brahmi, which the Hindu Brahmin-Poannas could not decipher when called upon by the then emperor of India, who was a Muslim. Edgerton calls the script Asokan or Asoka script.

Asoka definitely used the script throughout his vast empire for peoples who spoke different languages. The name "Brahmi script" is a misnomer because Asoka was not a Brahmin (a member of Indo-European stock who practised Hinduism). Asoka belonged to the class of rulers like Gotama Buddha (born about 250 years before). His language was Magadhi; Pali, the language of Theravada Buddhism, was derived from Magadha and Lankan of SriLanka by the Buddhist monks.

Magadha and Pali were not as rhotic as Sanskrit. Magadha being a Tib-Myan language was probably non-rhotic like Bur-Myan. Asoka extended his influence primarily due to his peaceful means augmented by his writing system. His influence extended to areas outside his empire.

It is probable that Kublai Khan, about a thousand years later, adopted the method of the reformed Asoka in inventing an abugida to be used throughout his vast empire.

It is interesting that the Myanmar script, used by Burmese, Karens, Mons, Shans and a few others, was supposed to be invented about the same period that Kublai was inventing his. Since northern Myanmar had common borders with the Mauriya empire of Asoka, it is probable that the Bur-Myan script has descended directly from that of Asoka instead of going through the intermediary of the Mons, as is now accepted. (This view is my own and it is not accepted by the majority of my peers, including U Tun Tint.) -- UKT


character types

by UKT, sometime in 2002, 140611:

The character types in Bur-Myan (Burmese-Myanmar) belonging to Tib-Bur (Tibeto-Burman) linguistic group, are given below.

Skt-Dev (Sanskrit-Devanagari) belongs to the IE (Indo-European) linguistic group. Bur-Myan is non-inflexional, whereas Skt-Dev has inflexions. Bur-Myan is devoid of hissing sounds, except those of the palatal POA, and is non-rhotic, whereas Skt-Dev has hissing sounds and is rhotic. We can sum up that Bur-Myan and Skt-Dev are entirely different, yet, because both are phonetic languages based on well-defined phonemic principles, they are comparable in many ways.

Consonant letters:

{ka.} - named {ka.kri:}. Corresponds to क «ka» (U0915) in Devanagari. It is made up of two phonemes, /k/ & /a/ = /ka/. It is a syllable and can be pronounced. Its POA is velar, and it is classed as a plosive-stop. Remember, {ka.kri:} is the name of the akshara, with pronunciation {ka.}. In the akshara matrix, its position is r1c1 (row#1-column#1) the same as that of क «ka». The generic name for both {ka.} & क «ka» is "Ka-akshara" or you can refer to them as "r1c1 character".
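The idea of an "r1c1 character" can be sketched in Python for the Devanagari side. The sketch assumes the usual five-by-five arrangement of the plosive-stops and nasals (velar, palatal, retroflex, dental, labial rows). The names ROW_STARTS and matrix_position are made up for illustration, and Bur-Myan would need its own table of starting points.

```python
# First code point of each row of five: velar, palatal, retroflex, dental, labial.
# U0929 DEVANAGARI LETTER NNNA sits between the dental and labial rows,
# which is why the labial row starts at U092A.
ROW_STARTS = [0x0915, 0x091A, 0x091F, 0x0924, 0x092A]

def matrix_position(ch):
    """Return (row, column), both 1-based, or None if ch is outside the matrix."""
    cp = ord(ch)
    for row, start in enumerate(ROW_STARTS, start=1):
        if start <= cp < start + 5:
            return (row, cp - start + 1)
    return None

print(matrix_position("\u0915"))  # KA
print(matrix_position("\u092E"))  # MA
```

KA comes out at row 1, column 1, and MA at row 5, column 5.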

Independent vowel:

{I} - named {ak-hka-ra i}. Corresponds to ई (U0908) in Devanagari with the same generic name "Akshara-i".

Dependent vowel:

The symbol {i} (which I am calling "vow-sign-i") has a corresponding symbol in Skt-Dev as ि. Both are /i/ in IPA, which does not differentiate the short vowels from long vowels. A short vowel has a duration of one eye-blink, whereas the long lasts two eye-blinks. We can use suprasegmentals to differentiate the two.

This sign can be combined with {ka.} to form the syllable {ki} or with a "dummy" as {i.}.

The so-called "dependent vowels" are NOT vowels. They are just signs and have no sound. This term by Unicode had misled me into thinking that they were vowels and I had expected them to have sounds of their own.

Though {ka.}, the consonant-akshara, is a syllable and a word in its own right (meaning: "to dance"), when its inherent vowel /a/ is killed, it loses both sound and meaning. It should no longer be termed an akshara -- in fact it has become an alphabet. I use the simile of "a man and his corpse" - "the corpse is not the man".


Inherent vowel

by UKT sometime in 2002

What is the inherent vowel? That has been my question ever since I came to study the akshara system of writing. To say that it is approximately the English "short-a" does not mean much, for the English <a> itself has a changing nature and can mean anything even to a "native-English" speaker. And when you say a "native-speaker", it becomes more confusing because the US-American, Australian, British, Canadian, and New Zealander speak in their own sweet ways. Yet, they are all native-speakers. And unless you are familiar with the Eng-Lat (English-Latin) vowels, to say that the inherent vowel is close to /a/ is meaningless.

The inherent vowel sometimes appears as a schwa in words such as {a.ni} 'colour red' /ə.ni/. See the insert on non-rhotic English dialects (spoken by South Indians, formerly dubbed "the Hindus" and Myanmars) below.

From: http://dictionary.laborlawtalk.com/Rhotic

Non-rhotic dialects of English began to emerge in about the year 1600. The loss of the sound [r] is known as derhotacization. Evidence of the earliest date of the sound change is shown in the English word juggernaut, which is first attested in the 1630s. This represents the Hindi word jagannâth, meaning "lord of the universe"; the English spelling shows that the digraph er was chosen to represent a Hindi sound that is close to the English schwa.

A non-rhotic speaker pronounces the [r] in red, torrid, watery (in each case the [r] is followed by a vowel) but not the written [r] of hard, nor that of car or water except when the word is followed by a vowel. In most non-rhotic accents, if a word ending in written [r] is followed closely by another word beginning with a vowel the [r] is, however, sounded — as in water ice. This phenomenon is referred to as "linking [r]". Many non-rhotic speakers also insert epenthetic [r]s between vowels (droring for drawing). This so-called "intrusive [r]" is frowned upon by those who use the non-rhotic Received Pronunciation (RP) but even they frequently "intrude" an epenthetic [r] at word boundaries, pronouncing, for example, Africa and Asia as Africa-r-and Asia.

For non-rhotic speakers, what was historically a vowel plus [r] is now usually realized as a long vowel. So car, hard, fur, born are phonetically /kaː/, /haːd/, /fəː/, /bɔːn/ (see International Phonetic Alphabet for a key to phonetic symbols). This length is retained in phrases, so car owner is /kaːɹ oʊnə/. But a final schwa remains short, so water is /wɔːtə/. For some speakers some long vowels alternate with a diphthong ending in schwa, so wear is /wɛə/ but wearing is /wɛːɹiŋ/. Some pairs of words are homophonic for non-rhotic speakers but not for rhotic speakers; for example, spa and spar are pronounced identically by many non-rhotic speakers, but differently by rhotic speakers.

...

Areas with non-rhotic accents include Africa, Australia, most of the Caribbean, most of England (especially Received Pronunciation speakers), New Zealand, South Africa, the southeastern United States (although pockets of rhotic speakers do exist in the southern United States, especially in northwest Alabama, central Tennessee and peninsular Florida — in general the non-rhotic accent is more common in coastal Southern styles, while the Appalachian accent is rhotic), the northeastern United States (New England and New York State), and Wales.

Loss of the inherent vowel makes the consonant akshara lose its sound. Thus a "killed" {ka.} is soundless.


Writing systems

- UKT, sometime in 2004, 140608

There are two main writing systems current: the Akshara (Abugida) and the Alphabet. Other writing systems are described below.

Izumi Suzuka, speaking at a student seminar in 2003 at The Laboratory of Professor Osato http://mlcr.nagaokaut.ac.jp/main1/signs_of_syllables.htm spoke on Writing Systems -- Signs of Syllables. The speaker classified Writing Systems into:

1. Alphabet: A writing system, in which consonants and vowels are represented equally by separate letters. Greek and Roman alphabets, Cyrillic alphabet, and some artificial alphabets such as Armenian that was invented in 405 and still in use today belong to this category.

2. Consonantal alphabet: Alphabets that consist of consonantal letters only. Some vowels may be optionally indicated in writing, however vowel sounds had to be supplied by the speaker or reader. North Semitic* languages such as Arabic and Hebrew, and also the languages that use modified Arabic letters: Persian, Urdu (Pakistan), Uighur belong to this category. (*Semitic: A language family that includes Arabic, Hebrew, Amharic (spoken in Ethiopia) and Tigrinya (Northern Ethiopia and Eritrea).)

3. Alpha-syllabary (or Syllabic alphabet, Semi-syllabary): Writing systems, in which characters sometimes represent a single consonant or vowel, as in an alphabet, and sometimes a syllable, as in a syllabary. More specifically, each basic consonant character is represented by the graphic syllable where the consonant is modified by the inherent vowel, the most frequently used vowel, usually /a/. When a consonant is modified by any other vowel, a signature (or diacritic form) of the vowel is added around the graphic syllable.
Alpha-syllabary is a characteristic for Amharic and languages in or derived from India such as Devanagari, Bengali, Oriya, Assamese, Gujarati, Tamil, Malayalam, Telugu, Sinhalese, Burmese, Thai, Lao, Khmer, and Tibetan.

4. Syllabary: Writing system in which each character represents a syllable, typically consisting of either a CV- or a V-type syllable: e.g. Japanese Katakana and Hiragana.

5. Logogram (or Ideogram): A character in writing which represents a complete word is called a logogram. Examples are Chinese characters, early Egyptian hieroglyphs and Sumerian cuneiform.
