

To: dnsop@cafax.se
From: Robert Elz <kre@munnari.OZ.AU>
Date: Sat, 08 May 1999 12:45:47 +1000
Reply-To: dnsop@cafax.se
Sender: owner-dnsop@cafax.se
Subject: Re: Experiments in multi-placed root servers

I think I understand what the aim is here, and how it is to be accomplished.
I'm afraid I can't see what the problem is supposed to be - at least, not
one serious enough to prevent an experiment.

The issues would seem to be whether or not the routing system can reasonably
handle a route being injected in multiple different places, ending at
different local nets (or originating at), but all with the same number.

Then, whether the root can handle more servers.

And last, if there happens to be a problem (that is local to a single
server), whether it will be possible to locate at which server the problem
is being caused.

For the first of those, the answer is very simple - just try it.   I'm
sure the internet core is not going to be bothered by seeing one more
number - it shouldn't matter a lot which particular version (or versions)
of the number the core actually believes.   Note: we don't need to stick
root servers on this advertised net to test this - just pick the address
at which the root server would be located, advertise the route to it from
all over the place, and then have a bunch of people (like say, the members
of this list) ping the address, and see if they get responses (and perhaps
even measure the RTTs).  A few traceroutes would not hurt either.

Perhaps that has already been done, I don't know.  If not, it should be.
If it has, the results ought to be reported.   In any case, the proposal can
clearly go nowhere unless this can be made to work.  If the core sees the
route from multiple sources, throws up its hands (metaphorically of course)
and refuses to deal with the route at all, then until (perhaps) some changes
are made there, nothing more can be done.   If pings get through and replies
are received, then there's no reason to believe that DNS packets wouldn't
work just as well.

On the second point, we already have 13 root servers.   It is hard to believe
that increasing that to 14, 15, or 16 (as a first step) is going to make 
anything noticeably less stable than it now is.   For sure, the more servers
there are the greater the probability that one of them will be misbehaving (or
not behaving at all) at any particular time - but the incremental risk here
looks pretty small to me.   Further, this never stopped more root servers being
added before.

And last, on the third, we'll never really know until we test it with real
live root servers.   Fortunately, the failure mode here is not too
unpalatable.  If one of the replicated servers (one sharing an address
with others) fails in some way, then a smaller fraction of the net will be
affected than if one of the current root servers fails.  (nb: if it simply
fails to respond at all, that is normal, and the resolver will just pick a
different root server address to try - it's only when bad answers are
returned that there's a real problem.   Fortunately from the root servers
that is pretty rare (where "bad" answer here means one different from what
the root zone maintainer instructed the root to reply, not one that is perhaps
not the ideal reply .. eg: if by accident, the delegation of COM was removed
from the root zone, giving NXDOMAIN for every .COM lookup would not be "bad"
for this purpose...)

I would expect that there would be two fairly easy methods available to find
such a bad root server - the most obvious is just to run a traceroute to the
shared IP - the hop before the shared net number will indicate which root
server is being reached from any particular source.  (Run the traceroute
at the time queries are being sent to it and bad answers returned - the
routing system might not be a paragon of stability, but it doesn't
generally flap that wildly either.)
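As a sketch of that method - the addresses below come from the documentation
ranges and are purely hypothetical - the interesting datum is just the hop
that appears immediately before the shared address:

```python
def penultimate_hop(traceroute_lines, shared_ip):
    """Return the hop address seen just before the shared IP.

    That hop identifies which of the replicated instances a given
    vantage point is actually reaching.  Input is traceroute output,
    one hop per line, e.g. " 5  192.0.2.17  3.1 ms".
    """
    previous = None
    for line in traceroute_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        hop_ip = fields[1]
        if hop_ip == shared_ip:
            return previous
        previous = hop_ip
    return None

# A made-up trace toward a shared root address 198.51.100.1
trace = [
    " 1  10.0.0.1      0.4 ms",
    " 2  203.0.113.9   2.2 ms",
    " 3  192.0.2.17    3.1 ms",
    " 4  198.51.100.1  3.5 ms",
]
print(penultimate_hop(trace, "198.51.100.1"))  # → 192.0.2.17
```

The 192.0.2.17 hop would then be matched against the known upstreams of each
replicated server to name the instance being reached.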

Second, each of these new replicated root servers ought to have two
addresses.  One is the address that is advertised in the DNS as the IP
address of X.root-servers.net, and the other is one which isn't advertised
that way at all, but which is unique to each server (ie: an address taken
from the normal addressing space wherever the server is connected).   Those
could be made widely known outside the DNS (perhaps with an alternate name
for each of the servers, not listed as an NS for '.').   Then, as needed,
each of the servers could trivially be queried using its alternate address
if there's a problem.   (If there's anyone attempting to set up such a root
server who doesn't understand how to associate multiple addresses with the
server, then they're probably not qualified to be running it.)
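One way to use those alternate addresses, sketched below, is to send each
instance a hand-built SOA query for the root and compare the serial numbers
in the replies; a stale or wrong serial points at the misbehaving instance.
The packet layout is the standard one from RFC 1035; the address in the
comment is invented.

```python
import struct

def build_soa_query(qid, name="."):
    """Build a minimal DNS query for the SOA record of `name` (class IN).

    Header: id, flags (RD set), 1 question, no answer/authority/additional.
    The root name "." encodes as a single zero-length label.
    """
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.strip(".").split("."):
        if label:
            qname += bytes([len(label)]) + label.encode("ascii")
    qname += b"\x00"  # the root label terminates every name
    question = qname + struct.pack("!HH", 6, 1)  # QTYPE=SOA, QCLASS=IN
    return header + question

pkt = build_soa_query(0x1234)
# To probe one particular instance (hypothetical alternate address):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.sendto(pkt, ("192.0.2.53", 53))
print(len(pkt))  # → 17: 12-byte header + root name + QTYPE + QCLASS
```

Collect the serials from every instance; any instance whose serial (or
answer content) differs from what the root zone maintainer published is the
one to take out of service.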


About the only other issue we need to deal with I think is the possibility
of someone inserting a new route to their own server (not an authorised one,
and one which does not provide the answers that root servers are supposed to
provide), and capturing a part of the net that way.    To an extent, that is
possible right now of course, and is prevented only by the route filtering
that the various net providers accomplish.   I don't see that anything needs
to change there, with the sole exception that for this one magic IP number,
the people who do route filtering would not all be following a common database
indicating where the route should be directed (who is permitted to advertise
it).   Or perhaps they would (I'm afraid I haven't been following the routing
policy discussions in the past several years).   If it is now possible for
a route to originate from several sources, then this would just be one of
those (it would appear just like a  widely multi-homed site to the routing
system).   If that isn't possible, then different ISPs would simply have to
pick one of the possible sources each, and allow that one.   They would be
wise to pick a nearby one, but nothing breaks if they simply pick one at
random (perhaps one "special" one that the routing policy says is "the"
way to the address - by default everyone would go to one of these replicated
servers, only by arrangement would a different one be used).   If anything
new is needed there it seems like it would be quite small - and in any case
this need would be discovered, and handled (or not) during the initial "ping"
test of this scheme (the first step above) - long before any root server
data were affected.
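The per-ISP choice described above can be modelled as a toy import filter -
the prefix and AS numbers below are invented, and an ISP's ordinary policy
is reduced here to "accept everything else":

```python
def filter_magic_prefix(announcements, magic_prefix, allowed_origin):
    """Apply one ISP's import filter for the shared root prefix.

    Each announcement is a (prefix, origin_as) pair.  For the one magic
    prefix, accept only the origin this ISP has chosen to trust; all
    other routes pass through unchanged (a stand-in for normal policy).
    """
    accepted = []
    for prefix, origin in announcements:
        if prefix == magic_prefix and origin != allowed_origin:
            continue  # a replica (or an impostor) this ISP hasn't chosen
        accepted.append((prefix, origin))
    return accepted

# Two replicas announce the shared prefix; this ISP trusts only AS 64501.
routes = [
    ("198.51.100.0/24", 64501),
    ("198.51.100.0/24", 64502),
    ("203.0.113.0/24", 64999),
]
print(filter_magic_prefix(routes, "198.51.100.0/24", 64501))
```

Nothing breaks if two ISPs pick different origins - their customers simply
reach different replicas, which is the whole point of the scheme.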

Lastly, I don't see in this anywhere at all where the DNS would be relying
upon the routing system any more than it does now.   We already rely upon
it delivering packets to all the DNS servers (root, and otherwise), that's
not about to change.   Beyond that, if there's some other problem with this
multi-address scheme, that results in packets not being delivered to this
magic address, that's just equivalent to a root server being down.   I'm sure
that happens from time to time, and I know I don't notice it especially...

I'd suggest that the ping test be done, if it hasn't been, and once that is
known to work OK, a replicated root server test be tried (just using one of
the 13 servers to start with of course).

kre

