To: "Ietf-Provreg (E-mail)" <ietf-provreg@cafax.se>
From: Robert Burbidge <robert.burbidge@poptel.coop>
Date: Mon, 12 Aug 2002 17:51:33 +0100
Sender: owner-ietf-provreg@cafax.se
Subject: RE: Byte order marks and character sets

> This is addressed in more detail in section 5, though perhaps a "UTF-8 is
> RECOMMENDED" would make things more clear. I don't think it wise to say
> that only UTF-8 can be used as the protocol may clearly be useful in
> localized environments where something other than UTF-8 or UTF-16 might be
> more useful.

In that case, perhaps you can add "UTF-8 is RECOMMENDED" in the next draft.

> So what's your suggested improvement? Section 4.3.3 of the XML rec (a
> normative reference) already says that "XML processors must be able to use
> this character to differentiate between UTF-8 and UTF-16 encoded documents",
> and there's no mention of BOMs in RFC 2279 (UTF-8) so I don't know what you
> mean by "the UTF-8 BOM".

See http://www.w3.org/TR/REC-xml#sec-guessing which mentions a BOM for UTF-8.
My understanding is that a BOM does exist for UTF-8, although that section is
a non-normative discussion of inferring character encodings. The .NET
framework, for example, will generate the EF BB BF UTF-8 BOM. A few minutes
with Google also found http://www.unicode.org/unicode/faq/utf_bom.html#25.

My problem seems to be that no clients read the UTF-8 BOM. I think they
should understand it even if it's optional; that's what a conforming XML
parser should do. If I'm right, then the problem is not strictly with the EPP
spec but with client implementations, which is outside the strict scope of
this discussion group, but hey...

> Sorry, I don't see BOMs as a transport issue. It's an encoding issue that
> should be addressed in the core document, and as far as I can tell it
> already is through the normative reference to the W3C XML recommendation.

I think you are correct on this point. I just felt that it was insufficiently
clear taking the document set as a whole.

Rob Burbidge
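
To illustrate the BOM handling discussed above, here is a minimal Python sketch of how a client might recognise and strip a leading BOM before handing the bytes to an XML parser. The function name sniff_bom and the sample <epp/> payload are hypothetical and not taken from any EPP implementation; the sketch only covers the UTF-8 and UTF-16 marks and does not try to distinguish UTF-32.

```python
import codecs


def sniff_bom(data: bytes):
    """Return (encoding, payload) for a byte string that may start with a BOM.

    encoding is None when no UTF-8 or UTF-16 BOM is present, in which case
    the payload is returned unchanged. This is an illustrative helper, not
    part of any EPP or XML library.
    """
    # UTF-8 BOM: EF BB BF
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", data[len(codecs.BOM_UTF8):]
    # UTF-16 big-endian BOM: FE FF
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", data[len(codecs.BOM_UTF16_BE):]
    # UTF-16 little-endian BOM: FF FE
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", data[len(codecs.BOM_UTF16_LE):]
    return None, data


if __name__ == "__main__":
    encoding, payload = sniff_bom(b"\xef\xbb\xbf<epp/>")
    print(encoding, payload)
```

A client that applies a check like this before parsing would accept documents from producers that emit the EF BB BF prefix as well as from those that do not.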