Jim Monty wrote: The following sequence of characters (four Unicode code points)…
เพื่
U+0E40 THAI CHARACTER SARA E
U+0E1E THAI CHARACTER PHO PHAN
U+0E37 THAI CHARACTER SARA UEE
U+0E48 THAI CHARACTER MAI EK
…constitutes one Unicode extended grapheme cluster. Is it just one Thai letter? In general, are Unicode extended grapheme clusters equivalent to Thai letters and vice versa?
I'm a computer programmer. I don't speak or read Thai.
Thank you very much for your kind response to my inquiry.
Jim Monty
U+0E1E =>
พ is a stand-alone character in the sense that it need not occur with other characters to define a component of sound in Thai.
พ in syllable-initial position is /ph/ (voiceless aspirated bilabial stop) and simply /p/ in final position.
Thus,
พา /phaa/ ‘to take (along), to guide’;
พูด /phûut/ ‘to speak, to say (as a literal utterance)’;
พิมพ์ /phim/ ‘to print, type’;
สะพาน /saphaan/ ‘bridge’
U+0E40 =>
เ /ee/ (as in the girl’s name ‘Renee’), a non-standalone that is written before the consonant it actually *follows* in pronunciation.
Thus,
เท /thee/ ‘to pour out; slanting’;
เขน /khěen/ ‘shield, buckler’;
เสน่ห์ /sanèe/ ‘charm(s), spell’
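For a programmer, the practical consequence is that the stored order of code points follows the written order, not the spoken one. A minimal Python sketch, using only the standard unicodedata module (the helper name code_points is mine, purely for illustration), lists the code points of เท in the order they are stored:

```python
import unicodedata

# The word เท /thee/, written with escapes so the stored order is explicit.
word = "\u0e40\u0e17"


def code_points(text):
    """Return (U+XXXX, Unicode name) pairs in stored order (hypothetical helper)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for cp, name in code_points(word):
    print(cp, name)
```

The vowel sign comes first in the string even though the consonant is pronounced first.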
U+0E37 =>
อื (the upper character only), a non-standalone that is written above the consonant it actually *follows* in pronunciation. Use of this character
อื /yy/ (lower-high central, like the ‘i’ in US English ‘robin’) requires a final consonant.
Thus,
มืด /mŷyt/ ‘dark, obscure’;
หรือ /ry̌y/ ‘or’;
คือ /khyy/ ‘that is to say, (literally) is’;
ดื้อ /dŷy/ ‘headstrong’
Not shown in your listing is
อ, which, in addition to being a mute consonant that makes possible syllables beginning with a vowel, is also a vowel symbol used in specific combinations with other symbols to denote long and short vowels and vowel diphthongs.
เพื่อ /phŷa/ ‘for the purpose of, in order to’
Here, the
เ,
อื (upper character only) and
อ *together*, as one larger unit, denote the diphthong /ya/.
Likewise,
เนื้อ /nýa/ ‘meat, flesh’;
เลือด /lŷat/ ‘blood’;
เสื่อม /sỳam/ ‘to decline, deteriorate, wear out’
In each of these, the
เ,
อื and
อ => /ya/
U+0E48 =>
อ่ (the upper character only), a non-standalone that denotes the tone (pitch contour of the voice). With *certain* consonants like
พ, it denotes a falling tone:
เพื่อ /phŷa/ ‘for the purpose of, in order to’, in which /^/ is a fairly common transcription for the falling tone.
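Since you are a programmer, here is a rough Python sketch that breaks the full word เพื่อ into its code points, using only the standard unicodedata module (the helper name describe is mine, not anything standard). It shows that the word is five code points: the four in your listing plus อ (U+0E2D). It also prints each character's Unicode general category. Note that the category is a Unicode property, not a linguistic one: เ is classed as a letter (Lo) even though it never stands alone in Thai writing, while the two marks written above the consonant are nonspacing marks (Mn).

```python
import unicodedata

# The word เพื่อ /phŷa/, written with escapes so each code point is explicit.
word = "\u0e40\u0e1e\u0e37\u0e48\u0e2d"


def describe(text):
    """Return (U+XXXX, general category, Unicode name) per code point (hypothetical helper)."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in text
    ]


for cp, category, name in describe(word):
    print(cp, category, name)
```

None of this says anything about tone or pronunciation; that still depends on the consonant class and the surrounding symbols, as described above.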