To:
"Tan, William" <William.Tan@Neustar.biz>
Cc:
Dan Maharry <dan@mcd.coop>, ietf-provreg@cafax.se
From:
Gregory Berezowsky <gberezow@ca.afilias.info>
Date:
Thu, 12 Apr 2007 14:04:36 -0400
In-Reply-To:
<461E513A.7000903@neustar.biz>
Sender:
owner-ietf-provreg@cafax.se
Subject:
Re: [ietf-provreg] Re: EPP Extensions for IDN
On 12-Apr-07, at 11:33 AM, Tan, William wrote:

>
> Dan Maharry wrote:
>> RFC4646 may be a way forward, but is it possible to look at this
>> unambiguously but in a more compact way?
>> IDNs are domain names using the Unicode character set which is
>> divided into codepages.
>> So maybe all the <domain:create> command actually needs is the
>> Unicode name of the domain and the codepage on which those
>> characters are located.
> When you say codepages, do you mean scripts? "Code pages" are
> legacy encodings in my dictionary, and have very little to do with
> Unicode other than the fact that Unicode imported characters from
> them.
>
>> From reading this discussion, that covers the Simplified Chinese
>> \Kanji issue of writing out 'East Capital' (I think) - the owner
>> of Tokyo.com should register two IDNs, one for each codepage \
>> character set.
>>
>> Registries could let it be known which code pages they support and
>> registrars likewise
>>
>> So you get
>> <domain:create>
>> <name>blahblah.com</name>
>> <IDN:codepage>1232</IDN:codepage>
>> ...
>> </domain:create>
>>
>> And the A-name is derived later on - name and codepage are the
>> primary key for the name in the registry.
>>
>
> I would advocate for a looser interpretation of the
> script/language/codepage value. Registries today use a variety of
> tag types for that field. NeuStar uses ISO 639-2 to identify our
> tables. It is quite conceivable that another registry may allow many
> Latin characters in a table, and then allow the registration to be
> optionally tagged with a language just for statistics collection
> purposes.

A possibility could be for the value to be a reference to a table (or
subtable...) as suggested in informational RFC 4290. The URI for the
supported table(s) for a given registry could be published in its EPP
greeting. This would allow a quite open-ended definition of what a
registry allows while making it fairly simple (and standardized) to
determine from the client side what a registry actually supports.

>
> Likewise, different registries collect and store variations of the
> U-label and A-label combination. Some only take the A-label, do
> IDNA toUnicode to validate the characters, and then toASCII again
> to make sure it matches the original A-label. Some take both the
> U-label and A-label, and perform a variation of the previous steps.
> So, my suggestion is to offer the ability to hold a U-label in EPP,
> but make it optional in the schema.
>
> For example, we could have:
>
> <domain:create>
> <name>xn--rsum-bpad.com</name>
> <IDN:ulabel>résumé.com</IDN:ulabel>
> <IDN:tag type="iso639-2">fr</IDN:tag>
> </domain:create>
>
>
>
> =wil

---
Gregory Berezowsky
Systems Architecture Team Lead
Afilias Canada
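
For illustration only, the A-label round trip William describes above
(toUnicode, then toASCII, then compare with the original) might look
roughly like the sketch below. It assumes Python's built-in "idna"
codec, which implements IDNA as specified in RFC 3490, and the helper
names roundtrip_a_label and matches_u_label are hypothetical. It is not
a description of what any particular registry actually does.

```python
from typing import Tuple


def roundtrip_a_label(a_name: str) -> Tuple[bool, str]:
    """Hypothetical helper: ToUnicode, then ToASCII, then compare.

    Returns (ok, u_name). ok is True only when re-encoding the decoded
    name yields the A-label form that was submitted.
    """
    try:
        # ToUnicode: decode each xn-- label into Unicode.
        u_name = a_name.encode("ascii").decode("idna")
        # ToASCII: encode the Unicode form back into A-labels.
        back = u_name.encode("idna").decode("ascii")
    except UnicodeError:
        # Non-ASCII input or labels the codec rejects.
        return False, ""
    return back == a_name.lower(), u_name


def matches_u_label(a_name: str, u_name: str) -> bool:
    """Hypothetical check for registries that collect both forms."""
    ok, derived = roundtrip_a_label(a_name)
    return ok and derived == u_name


if __name__ == "__main__":
    print(roundtrip_a_label("xn--rsum-bpad.com"))
    print(matches_u_label("xn--rsum-bpad.com", "r\u00e9sum\u00e9.com"))
```

A registry that takes only the A-label would use the first helper; one
that also takes the optional U-label could compare the two with the
second. Checking the characters against the registry's own table is a
separate step that is not shown here.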