

To: Dan Maharry <dan@mcd.coop>
Cc: ietf-provreg@cafax.se
From: "Tan, William" <William.Tan@neustar.biz>
Date: Thu, 12 Apr 2007 11:33:14 -0400
In-Reply-To: <557337CEDE6F294DBF74610FD7FF7BDF3851AF@nbhex1.osgcs.local>
Sender: owner-ietf-provreg@cafax.se
User-Agent: Thunderbird 2.0.0.0 (Windows/20070326)
Subject: Re: [ietf-provreg] Re: EPP Extensions for IDN


Dan Maharry wrote:
> RFC 4646 may be a way forward, but is it possible to look at this unambiguously but in a more compact way?
> IDNs are domain names using the Unicode character set which is divided into codepages.
> So maybe all the <domain:create> command actually needs is the Unicode name of the domain and the codepage on which those characters are located. 
>   
When you say codepages, do you mean scripts? In my dictionary, "code 
pages" are legacy encodings, and they have very little to do with 
Unicode other than the fact that Unicode imported characters from them.

> From reading this discussion, that covers the Simplified Chinese\Kanji issue of writing out 'East Capital' (I think) - the owner of Tokyo.com should register two IDNs, one for each codepage \ character set.
>
> Registries could let it be known which code pages they support and registrars likewise
>
> So you get 
>
> <domain:create>
> 	<name>blahblah.com</name>
> 	<IDN:codepage>1232</IDN:codepage>
> ...
> </domain:create>
>
> And the A-name is derived later on - name and codepage are the primary key for the name in the registry.
>   

I would advocate for a looser interpretation of the 
script/language/codepage value. Registries today use a variety of tag 
types for that field. NeuStar uses ISO 639-2 to identify our tables. It 
is quite conceivable that another registry may allow many Latin 
characters in a table, and then allow the registration to be optionally 
tagged with a language just for statistics collection purposes.

Likewise, different registries collect and store different combinations 
of U-label and A-label. Some take only the A-label, run IDNA ToUnicode 
to validate the characters, and then run ToASCII again to make sure the 
result matches the original A-label. Some take both the U-label and the 
A-label, and perform a variation of the previous steps. So, my 
suggestion is to offer the ability to hold a U-label in EPP, but make it 
optional in the schema.
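
The A-label-only flow described above can be sketched roughly as 
follows. This is only an illustration, assuming Python's built-in 
"idna" codec (IDNA 2003) stands in for ToUnicode/ToASCII; the ALLOWED 
table and the validate_alabel name are made up for the example and do 
not reflect any registry's actual policy.

```python
# Sketch only: Python's built-in "idna" codec implements IDNA 2003.
# ALLOWED is a hypothetical character table (ASCII LDH plus e-acute,
# e-grave, e-circumflex, a-grave), not a real registry table.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-\u00e9\u00e8\u00ea\u00e0")


def validate_alabel(alabel: str) -> str:
    """Return the U-label if a single A-label passes the round-trip check."""
    # Step 1: ToUnicode, to see which characters the label really contains.
    ulabel = alabel.encode("ascii").decode("idna")

    # Step 2: check every character against the (hypothetical) table.
    bad = sorted({ch for ch in ulabel if ch not in ALLOWED})
    if bad:
        raise ValueError(f"characters not in table: {bad!r}")

    # Step 3: ToASCII again, and compare with what was submitted.
    roundtrip = ulabel.encode("idna").decode("ascii")
    if roundtrip != alabel.lower():
        raise ValueError("A-label does not survive the round trip")
    return ulabel
```

A label whose decoded form contains a character outside the table, or 
that is not valid Punycode at all, raises ValueError (UnicodeError is a 
subclass of it).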

For example, we could have:

<domain:create>
  <name>xn--rsum-bpad.com</name>
  <IDN:ulabel>résumé.com</IDN:ulabel>
  <IDN:tag type="iso639-2">fra</IDN:tag>
</domain:create>
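
With an optional U-label, a server that receives both forms would 
presumably want to confirm that they agree. A minimal sketch of such a 
check, again assuming Python's built-in "idna" codec (IDNA 2003); the 
function name labels_agree is hypothetical and not part of any EPP 
implementation:

```python
from typing import Optional


def labels_agree(name: str, ulabel: Optional[str] = None) -> bool:
    """Check that an optional U-label matches the A-label form in <name>.

    `name` is the ASCII (A-label) domain name; `ulabel` is the optional
    Unicode form. Hypothetical helper, for illustration only.
    """
    if ulabel is None:
        # The U-label is optional, so there is nothing to cross-check.
        return True
    # ToASCII on the Unicode form, then a case-insensitive comparison.
    return ulabel.encode("idna").decode("ascii") == name.lower()
```

For the values in the example, 
labels_agree("xn--rsum-bpad.com", "résumé.com") would be expected to 
hold, while pairing the same name with "resume.com" would not.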



=wil
