To: "'ietf-provreg@cafax.se'" <ietf-provreg@cafax.se>
From: "Hollenbeck, Scott" <shollenbeck@verisign.com>
Date: Fri, 16 Nov 2001 12:43:03 -0500
Sender: owner-ietf-provreg@cafax.se
Subject: XML Schema <choice> Problems

While working with one of my implementers I became aware of an issue with XML Schema's <choice> model groups. The situation impacts several schemas as currently written, and while I have a request for info out to some XML Schema gurus I have a sneaking suspicion that we're looking at an inherent limitation of the language.

First, a description of the issue:

Several of the schemas that we're working with have choice model groups that look like this:

    <choice minOccurs="1" maxOccurs="x">
      <Several elements, all of which are optional>
    </choice>

This is commonly found in the <update> structures for each object, and it's used to ensure that at least one <add>, <rem>, or <chg> element is present. The problem is that this construct allows multiple occurrences of the same element. For example, if "x" == 3, the parser would allow three instances of the <chg> element instead of just one. I'm not sure if that's a good idea; I'd much rather see just one <add>, <rem>, and/or <chg> element to keep things smaller and to avoid inefficient element duplication.

So what are the alternatives? I believe there are three, assuming that it's not possible to specify "all are optional but at least one unique element MUST be present":

1. We could leave things as-is, which means that the schemas will not detect instances of repeated elements. This also means that if you put in three <add> elements (using the example above), there's no room for any <rem> or <chg> elements. I think this is bad, and that's the whole reason for this note.

2. We could go back to the older <sequence> constructs that were used in earlier versions of the specs, but this option would allow the parser to accept empty sequences. It was this "empty sequence" situation that prompted the change to the <choice> construct in the first place.

3. We could change the schemas so that each <add>, <rem>, or <chg> would have to be performed in a separate <update> command. This would preclude repeated elements and empty sequences, but it means changes to the way <update>s are done and ultimately more <update> commands.

I'm leaning towards option #2. It wouldn't be hard to check for an empty sequence outside the parser and return an error code noting that a required element is missing, and this change should have minimal impact on anyone who's working with code right now -- unless you happened to have noticed that the <choice> allows repeated elements and you're taking advantage of that fact.

-Scott-
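
For illustration, the out-of-parser check described under option #2 might look roughly like the Python sketch below. It is only a sketch under stated assumptions: the function name check_update, the helper local_name, and the wording of the problem strings are hypothetical, and a real server would map a failure to whatever "required element missing" error code the protocol defines. The element names <update>, <add>, <rem>, and <chg> are the ones discussed above. The duplicate check is included only to show what the current <choice> construct lets through.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Child elements named in the message above.
UPDATE_CHILDREN = ("add", "rem", "chg")


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_update(update_xml):
    """Return a list of problems found in one <update> element.

    Hypothetical helper: an empty list means the element passed both checks.
    """
    update = ET.fromstring(update_xml)
    counts = Counter(
        local_name(child.tag)
        for child in update
        if local_name(child.tag) in UPDATE_CHILDREN
    )
    problems = []
    # The "empty sequence" case that a <sequence> of optional elements allows.
    if not counts:
        problems.append("required element missing: need <add>, <rem>, or <chg>")
    # The repeated-element case that the <choice> construct allows.
    for name in UPDATE_CHILDREN:
        if counts[name] > 1:
            problems.append(f"<{name}> appears {counts[name]} times")
    return problems
```

Calling check_update("<update/>") reports the missing-element problem, while an <update> holding one <add> and one <chg> returns an empty list. If the schemas did go back to a <sequence> of optional elements, the parser itself would reject repeats, so only the first check would remain necessary.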