To: David Terrell <dbt@meat.net>
cc: ngtrans@sunroof.eng.sun.com, namedroppers@ops.ietf.org, ipng@sunroof.eng.sun.com, dnsop@cafax.se
From: Robert Elz <kre@munnari.OZ.AU>
Date: Thu, 09 Aug 2001 10:21:35 +0700
In-Reply-To: <20010808182954.B31440@pianosa.catch22.org>
Sender: owner-dnsop@cafax.se
Subject: Re: (ngtrans) Joint DNSEXT & NGTRANS summary

    Date:        Wed, 8 Aug 2001 18:29:54 -0700
    From:        David Terrell <dbt@meat.net>
    Message-ID:  <20010808182954.B31440@pianosa.catch22.org>

For the purposes of this discussion, the difference between:

  | I think they're talking about reestablishing existing connections
  | if the address published in the DNS changes.

and

  | Application protocols should (where appropriate) be able to reconnect,

is immaterial. The question is whether the user sees an interruption, not how the interruption is handled.

Keith was asking for it to be fixed below the level of the applications, presumably because updating lots of applications (different kinds, and just different implementations) is going to be a very long process, whereas if things could be fixed below that everything would see instant benefits. That's also why SCTP isn't the immediate answer, as aside from actually getting that implemented and deployed itself, the applications would still need to be converted to use it.

  | or users can

But that is exactly what people are claiming is unacceptable. I suspect that what it really depends upon is the particular application's nature. Which also suggests that handling this in the applications, rather than below, is the appropriate place.

  | -- and DNS records
  | near a renumbering event should have low TTLs, or multiple A* records
  | for a multihomed situation, and applications should not be caching
  | records excessively (or at all, really), and making multiple attempts
  | at multiple A* records.

Yes - but the question is how the app determines that it needs to do this. If a peer renumbers, and you don't get told that happened, then all you can expect to see is either packets vanishing into the void, or perhaps an ICMP host unreachable. Both of those are also consistent with a net link going down. If you wait long enough to be fairly confident that it isn't just a transient net blip, then you've already caused enough of an interruption to the service that there's a noticeable problem.
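
To make the "multiple attempts at multiple A* records" suggestion concrete, here is a minimal sketch in Python, not taken from any implementation discussed in this thread. The names first_reachable and connect_any are hypothetical, and the 3 second timeout is an arbitrary assumption. Note that it only reacts after an attempt has failed or timed out, which is exactly the interruption described above.

```python
import socket


def first_reachable(addresses, attempt):
    """Try each address in order; return (address, result) for the first success.

    `attempt` is any callable that raises OSError on failure (hypothetical helper).
    """
    errors = []
    for addr in addresses:
        try:
            return addr, attempt(addr)
        except OSError as exc:
            # Could be a renumbered peer or just a dead link; we cannot tell which.
            errors.append((addr, exc))
    raise OSError(f"all {len(addresses)} addresses failed: {errors}")


def connect_any(host, port, timeout=3.0):
    """Resolve the name afresh and try every address record returned."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    sockaddrs = [info[4] for info in infos]
    # sockaddr is (host, port) for IPv4 and (host, port, flowinfo, scope_id) for IPv6.
    return first_reachable(
        sockaddrs,
        lambda sa: socket.create_connection(sa[:2], timeout=timeout),
    )
```

Keeping the address loop separate from the socket call means the retry policy can be exercised without a network, and each failed attempt still costs the caller up to one timeout.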
So, the question then is just how you decide that you should be establishing a new connection (or somehow making an old one shift addresses). That's where the possibility of looking in the DNS and seeing that the address you were told to use before is no longer there arose in the first place. If there's some way that you can do that before there's any packet interruption, then the renumbering can be made truly invisible to the users.

If we decide that we have to wait until we get some kind of failure indication (even just a single RTT (or RTO) without a response - even though that usually just indicates mild congestion) then you have already altered the operation of the app - though perhaps imperceptibly.

If it was possible to create a signalling protocol so peers could be informed of address changes, that would solve the problems (any address validity overlap time would allow for a seamless address switch). But I can't see how to make that work for stateless protocols (DNS, NFS, ...) where the server is renumbered - basically the server simply has no idea who should be informed.

Perhaps we can live with these failing, given that it is likely to be rare that such things cross site boundaries (the general DNS case is already doing DNS watching for address changes of course, using the TTL - it is a proof by example that it can be made to work - the cases that fail are the two where addresses are configured, that is root servers (where the problems are so great that the answer seems to be to simply give those immutable addresses forever) and locating a back end resolver for a stub - which is unlikely to cross a site boundary).

As for apps caching DNS replies - I disagree there. The TTL is there to allow caching; anyone who agrees to respect the TTL should be able to cache a DNS reply and re-use it until the TTL expires.
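
As a rough illustration of caching that respects the TTL, and of checking whether the address in use is still published, here is a minimal Python sketch. Everything in it is an assumption for illustration: TtlCache is a hypothetical class, and resolve is a caller-supplied function returning (addresses, ttl_seconds), since the standard socket.getaddrinfo call does not expose the TTL.

```python
import time


class TtlCache:
    """Cache DNS answers and re-use them until their TTL expires (sketch only)."""

    def __init__(self, resolve, clock=time.monotonic):
        self._resolve = resolve  # hypothetical: name -> (addresses, ttl_seconds)
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # name -> (frozenset of addresses, expiry time)

    def lookup(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            # Nothing cached, or the TTL has run out: ask the DNS again.
            addresses, ttl = self._resolve(name)
            entry = (frozenset(addresses), now + ttl)
            self._entries[name] = entry
        return entry[0]

    def still_published(self, name, address):
        """True if `address` is among the records currently believed valid."""
        return address in self.lookup(name)
```

A connection holder could call still_published() periodically and start moving to a new address once it returns False. With this approach a change is noticed at most one TTL after it is published, so it only helps if the old and new addresses overlap in validity for at least that long.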
Attempt to outlaw that and you just end up with semantic quibbles "That's not my application caching that address, it is my application's local resolver that is doing the caching..."

kre