To: "'Robert Burbidge'" <robert.burbidge@poptel.coop>, "Ietf-Provreg (E-mail)" <ietf-provreg@cafax.se>
From: "Hollenbeck, Scott" <shollenbeck@verisign.com>
Date: Wed, 14 Aug 2002 09:32:05 -0400
Sender: owner-ietf-provreg@cafax.se
Subject: RE: Byte order marks and character sets

> My problem seems to be that no clients read the UTF-8 BOM. I think they
> should understand it even if it's optional; that's what a conforming XML
> parser should do. If I'm right, then the problem is not strictly a problem
> with the EPP spec but with client implementations which is outside the
> strict scope of this discussion group but hey....

OK, investigation complete.  Use of the BOM to identify UTF-8 is covered in
the errata for the XML 1.0 second edition recommendation:

http://www.w3.org/XML/xml-V10-2e-errata#E22

While it MAY be used, it's certainly not necessary, and most people I talked
to strongly suggested that its use should be discouraged.  Thus, this is
what I intend to add to Section 2 of the EPP document to address the issue:

"Normative section 4.3.3 and non-normative appendix F of [XML] describe
use of a byte order mark (BOM) to identify the character encoding in
the absence of an XML declaration or encapsulating headers.  Appendix F
includes a BOM to represent UTF-8 encoding, though section 4.3.3
notes that a BOM is not needed to identify UTF-8 encoding.  Section
4.3.3 was later amended (see [XMLE]) to clarify that a BOM MAY be used
to identify UTF-8 encoding.  EPP clients and servers MUST accept a
UTF-8 BOM if present, though emitting a UTF-8 BOM is NOT RECOMMENDED."
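
For implementers, the accept-but-don't-emit rule in that text might look
roughly like the following in Python.  This is only a sketch under my own
assumptions: the function names (strip_utf8_bom, decode_epp_payload,
encode_epp_payload) are made up for illustration and are not part of EPP
or of any library; only codecs.BOM_UTF8 comes from the standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_epp_payload(data: bytes) -> str:
    """Receiving side (hypothetical helper): accept a UTF-8 BOM if present."""
    # Drop the optional BOM first, then decode the rest as plain UTF-8.
    return strip_utf8_bom(data).decode("utf-8")


def encode_epp_payload(text: str) -> bytes:
    """Sending side (hypothetical helper): never emit a UTF-8 BOM."""
    # The plain "utf-8" codec does not prepend a BOM ("utf-8-sig" would).
    return text.encode("utf-8")
```

A receiver written this way treats b"\xef\xbb\xbf<epp/>" and b"<epp/>"
identically, and a sender using the plain "utf-8" codec never writes the
three BOM bytes.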

