Khmer Unicode Mailing List 2001/11/24

To the concerned parties,

I have read the opinions and assertions of many people so far. I beg all of the concerned parties to pay attention to the following three points in order to reach a sensible judgment.

1. What is the appropriate Khmer encoding?
During the discussion, some people agreed to the proposal of the Cambodian national body and some people insisted on the existing Khmer encoding. The proposal of the Cambodian national body was based on efficient processing matched to the structure of Khmer script. The existing Khmer encoding, on the other hand, was made under an Indic approach, on the grounds that Khmer script originated from Brahmic script. After a careful examination of the Virama model as applied to Khmer script in the existing Khmer encoding, it became apparent that this model is not the correct 'Virama model' of the original Indic scripts. Consequently, the authors of the existing Khmer encoding were obliged to call the model "COENG" instead of Virama. In other words, the authors of the existing Khmer encoding created a new artificial model, which means the "Coeng model" is not an appropriate model for Khmer encoding. This background is clearly set out in documents N2380, N2380R and N2406.

2. What have we discussed in the past?
It is my great regret that some people merely insisted on the existing Khmer encoding without discussing, in a constructive manner, an encoding appropriate to Khmer script. Their theory was "Khmer script should follow Indic script. For good or bad, existing Khmer encoding is part of standard, so you must accept it." I, however, point out the fact that there was no participation by the Cambodian national body and no endorsement by the Cambodian government in this standardization process. The standardization committee never asked the Cambodian national body for even a comment during the process; this negligence cannot be justified.

3. What do we learn from the past failure?
Some people insist that implementers and end-users will be disturbed and confused if we do not use the existing Khmer encoding. I trust that the Cambodian national body has no intention of causing trouble or confusion for users. The Cambodian national body desires to improve the encoding of its national script through a reasonable process, as stated in documents N2380R and N2406. If the parties do not deliberate on the requests of the Cambodian national body now, there will be no constructive result. We should learn from the past failure and should not close our eyes and ears. I ask everyone to refer the issue to their own conscience.

Let me attach a mail in conclusion. The author of the mail is Prof. Alain Daniel, a Khmer linguist who has made many efforts on this issue in order not to repeat past failures. I offer special thanks to Prof. Alain Daniel and ask his pardon for publishing the mail here.

Also, I would like to ask everyone to carefully reread the relevant JTC1 documents and related correspondence. For instance:

ISO/IEC JTC1/SC2/WG2 N893, Dr. Yannis Haralambous

ISO/IEC JTC1/SC2/WG2 N2380, N2380R, Cambodian official objection to the existing Khmer block in UCS, Cambodian National Body

ISO/IEC JTC1/SC2/WG2 N2385, Response to Cambodian official objection to Khmer Block (N2380), Maurice Bauhahn and Michael Everson

ISO/IEC JTC1/SC2/WG2 N2406, Response to WG2 Document N2385, Cambodian National Body

ISO/IEC JTC1/SC2 N3571, Letter from Cambodia to JTC1 Chairman regarding Khmer character encoding in ISO/IEC 10646

Letters of objection from the National Information Communications Technology Development Authority of Cambodia (NiDA) to ISO Central Secretariat, to JTC1 Chairman, to IEC Central Office, 30 and 31 May 2001

Letters supporting the Cambodian effort from the eASEAN Task Force to ISO Central Secretariat, to IEC Central Office, to JTC1 Secretariat, 22 Aug. 2001

Letter from the Industrial Standards Bureau of Cambodia (ISC) to IEC General Secretary, 09 Oct. 2001

Kind regards,
Lao Kim Leang
Member of Khmer Philology Project

*************

It is disappointing that this discussion should be dragged back to a polemic versus a rational consideration of the issues.

This debate may be suffering from the recent legacy (how is that for a dialectic?) of glyph-based encodings. Excessive attention has been given to visual representations at the expense of phonetic realities. Happily, new font technology liberates encoders from that quagmire.

-----Original Message-----
Lao Kim Leang
Sent: 24 November 2001 00:55
To: khmer@unicode.org

> script in the existing Khmer encoding, it became apparent that this model is not the correct 'Virama model' of the original Indic scripts. Consequently, the authors of the existing Khmer encoding were obliged to call the model "COENG" instead of Virama. In other words, the authors of the existing Khmer encoding created a new artificial model, which means the "Coeng model" is not an appropriate model for Khmer encoding.

There is a saying, 'A rose by any other name is just as sweet'. The word COENG is used when audibly spelling every subscript which I have ever heard in Khmer. COENG performs the same function as VIRAMA of removing the vowel from the character that precedes it. But perhaps my experience is too limited: Please supply a sound file of a subscript-containing Khmer word being verbally spelled that does not contain COENG.

The virama model is one characteristic that unites Indic scripts. It has been localised in the Cambodian context to a verbal COENG model (whereas in many other Indic scripts there is a commonly used visual glyph for the VIRAMA). The assertion that this model has nothing to do with Khmer conveniently ignores the history of the Khmer script and the fact that the model 'fits' with Khmer script facts. The details of these facts may be conveniently reviewed in our earlier discussions, which are archived at: http://www.bauhahnm.clara.net/Khmer/KhmerUnicodeMailingList.html The point has been repeatedly raised in these discussions that if the end user is offended by the COENG model, it is relatively easy to hide that from them.

Furthermore, we need to be mindful that this Khmer Unicode is also (in addition to Khmer) meant to facilitate encoding of Sanskrit and Pali in the Khmer script. U+17D2 is obviously an appropriate piece in the puzzle for them.
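
For readers following the technical point, here is a minimal sketch of how a subscript is stored under the COENG model in the existing Khmer block. It is an illustration only, not taken from any of the documents cited in this thread. The word used is "Khmer" written in Khmer script; the helper names describe and subscript_pairs are hypothetical, and the pairing logic is deliberately simplified (it reports the character immediately before each U+17D2, which in a stacked cluster may itself be a subscript).

```python
import unicodedata

COENG = "\u17D2"  # KHMER SIGN COENG, the code point discussed in this thread

# The word "Khmer" in Khmer script as stored in the existing encoding:
# KHA, COENG, MO, VOWEL SIGN AE, RO.
# The COENG signals that the following consonant (MO) is written as a subscript.
word = "\u1781\u17D2\u1798\u17C2\u179A"


def describe(text):
    """List each code point with its Unicode character name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def subscript_pairs(text):
    """Return (preceding character, subscript consonant) pairs.

    Simplified assumption: every COENG that has a character on both
    sides links those two characters.
    """
    pairs = []
    for i, ch in enumerate(text):
        if ch == COENG and 0 < i < len(text) - 1:
            pairs.append((text[i - 1], text[i + 1]))
    return pairs


if __name__ == "__main__":
    for code, name in describe(word):
        print(code, name)
    print(subscript_pairs(word))
```

Under an explicit-subscript model, by contrast, the subscript form of MO would presumably have a code point of its own and no COENG would appear in the stored sequence; that alternative is not sketched here because its code points are not given in this thread.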

> the fact that there was no participation by the Cambodian national body,

Please do not accuse the drafters of Khmer Unicode of failing to attempt to involve the Cambodian government in the encoding. Numerous occasions and conferences addressing Unicode were held with Cambodian government officials before Khmer Unicode was standardised. Extensive discussions held with an official government committee of Khmer linguists formed the basis of this encoding (see scans of the four pages of that report starting at http://www.bauhahnm.clara.net/Khmer/Page1.JPG). If they had wanted to dictate code points they could have done so. It is puzzling that the recently constituted Cambodian national body is rejecting the decisions of the preceding linguists' committee. Furthermore, the possibility that Khmer subscripts might be implemented using a COENG method was not opposed by that committee.

I agree that the Khmer government did not stamp the codes of Khmer Unicode with its approval. When the final drafts of Khmer Unicode were being debated (in a mailing list of which this is a reincarnation!) and in advance of the review period, Norbert Klein (a member of the recent Cambodian delegation to Singapore) was tasked with contacting government officials. He had the motivation to involve government officials to weigh in on his side of the argument (he personally argued against the COENG model). Ask Norbert why the Cambodian government did not become involved in the encoding at that point.

Obviously it was not possible four years ago to consult a Khmer encoding committee which only came into being this year.

Alain Daniel is entitled to his opinions. The facts as elaborated in our earlier discussions tend to weigh against them.

Nevertheless, the argument at this point has moved beyond the relative merits of an explicit subscript versus COENG model. A COENG model has already been encoded in Khmer Unicode. It is not judicious or practical at this point to change that foundation.

Let us move on (as was the UNANIMOUS decision of WG2) to work together to in effect fine tune the existing Khmer Unicode.

Sincerely,

Maurice

**********

> It is disappointing that this discussion should be dragged back to a
> polemic versus a rational consideration of the issues.
>
> This debate may be suffering from the recent legacy (how is that for a
> dialectic?) of glyph-based encodings. Excessive attention has been given to
> visual representations at the expense of phonetic realities. Happily, new
> font technology liberates encoders from that quagmire.

I'm very disappointed that you are using our discussions, and our presence in them, only to push your own idea forward. You have no intention of finding a solution to this problem.

You have not yet responded to our document N2406; you have only proposed your idea for sorting. We will wait for the response from Michael.

> Let us move on (as was the UNANIMOUS decision of WG2) to work
> together to in effect fine tune the existing Khmer Unicode.

You said that it was a UNANIMOUS decision, but nobody could have been convinced by your proposal at that time, because they were not present. You are using the presence of some people at the final stage, when they could not do anything. It's not fair.

Svay Leng