To:
"'Robert Burbidge'" <robert.burbidge@poptel.coop>, "Ietf-Provreg (E-mail)" <ietf-provreg@cafax.se>
From:
"Hollenbeck, Scott" <shollenbeck@verisign.com>
Date:
Wed, 14 Aug 2002 09:32:05 -0400
Sender:
owner-ietf-provreg@cafax.se
Subject:
RE: Byte order marks and character sets
> My problem seems to be that no clients read the UTF-8 BOM. I think they
> should understand it even if it's optional; that's what a conforming XML
> parser should do. If I'm right, then the problem is not strictly a problem
> with the EPP spec but with client implementations which is outside the
> strict scope of this discussion group but hey....

OK, investigation complete. Use of the BOM to identify UTF-8 is covered in the errata for the XML 1.0 second edition recommendation:

http://www.w3.org/XML/xml-V10-2e-errata#E22

While it MAY be used, it's certainly not necessary, and most people I talked to strongly suggested that its use should be discouraged. Thus, this is what I intend to add to Section 2 of the EPP document to address the issue:

"Normative section 4.3.3 and non-normative appendix F of [XML] describe use of a byte order mark (BOM) to identify the character encoding in the absence of an XML declaration or encapsulating headers. Appendix F includes a BOM to represent UTF-8 encoding, though section 4.3.3 notes that a BOM is not needed to identify UTF-8 encoding. Section 4.3.3 was later amended (see [XMLE]) to clarify that a BOM MAY be used to identify UTF-8 encoding. EPP clients and servers MUST accept a UTF-8 BOM if present, though emitting a UTF-8 BOM is NOT RECOMMENDED."
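
For implementers, the accept-but-don't-emit rule in the proposed text could look roughly like the sketch below. It is only an illustration, not part of the EPP text: the function names (strip_utf8_bom, decode_incoming, encode_outgoing) are hypothetical, and it assumes the payload has already been read off the transport as a complete byte string.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF (codecs.BOM_UTF8).

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

def decode_incoming(data: bytes) -> str:
    """Accept a UTF-8 payload with or without a BOM (the MUST-accept half)."""
    return strip_utf8_bom(data).decode("utf-8")

def encode_outgoing(text: str) -> bytes:
    """Encode as UTF-8 without a BOM (emitting one is NOT RECOMMENDED)."""
    # Python's plain "utf-8" codec never writes a BOM; "utf-8-sig" would.
    return text.encode("utf-8")
```

A receiver would pass raw bytes through decode_incoming before handing the text to its XML layer, and a sender would use encode_outgoing so that nothing precedes the XML declaration. Python's built-in "utf-8-sig" codec gives the same tolerant behaviour on decode; the explicit check is shown here only to make the rule visible.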