mathis@FORNAX.ECE.CMU.EDU (Matt Mathis) (06/30/89)
Does anybody have a description of the bug (and fix) of the interaction between Yellow Pages and the Domain Name System? This bug causes some systems to send DNS requests at high rates for sustained periods in response to remote DNS server or network failures. "High rates" typically means 20 per second for workstations. I have clocked some at as much as 100 pps! Needless to say this is hard on gateways, and a disaster to people behind 56k links.

I have a vague description, so I don't need another. What I really want is a document which I can hand to the administrator of some wayward host and say "Fix this or else turn it off." Needless to say some of said host administrators are running broken "plug and play" systems, and need pretty explicit directions.

Also, it would be extremely useful if someone has a remote test bed which can be pointed at a suspect host to determine if it has the bug.

Thanks,
--MM--
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (07/02/89)
In article <8906292105.AA07216@fornax.ece.cmu.edu>, mathis@FORNAX.ECE.CMU.EDU (Matt Mathis) writes:
> Does anybody have a description of the bug (and fix) of the interaction
> between Yellow Pages and the Domain Name System? This bug causes some
> systems to send DNS requests at high rates for sustained periods in response
> to remote DNS server or network failures. High rates means typically 20
> per second for workstations. I have clocked some at as much as 100 pps!
> Needless to say this is hard on gateways, and disaster to people behind
> 56k links.

If this is not what you are talking about, please excuse me.

repeat by:
1) some program decides to do gethostbyname(foo.bar) or gethostbyaddr(1.2.3.4), checks with portmap & ypbind, and sends an rpc request to the correct ypserv.
2) ypserv gets the request, fails to find the key in the YP map, and since YP-to-DNS is turned on, forks a child which does a DNS lookup.
3) the link to the DNS root or correct authoritative server is down or congested, so the child does not get an answer for a while.
4) meanwhile, the original program in step #1 is waiting for the answer. If step #3 takes long enough, the original program does a normal YP-rpc timeout, retries, and everything is repeated from step #1.

This is worse than it looks because the time-out in step #4 is less than the one used by the child in step #3. One can get large numbers of children of ypserv, all asking the local DNS server for the same answer.

Some programs, ypmatch may be one, seem to try forever. This would generate an unbounded, linearly increasing amount of DNS traffic, except that one usually runs out of resources for the local nameserver and ypserv parent.

The Internet link to sgi.sgi.com is only 9.6 kb/s. When this happens here, it is noticed.

One version of ypserv from Sun adjusted the time the parent waits for its children to reduce the number of children. That helped. We've taken to doing more.
First, our ypserv limits the number of children it has outstanding. Second, it keeps track of what its children are asking and does not start new ones while old ones are asking the same DNS question. Third, it caches both negative answers and time-outs from children, and responds immediately rather than making a new child to ask DNS.

Caveats: Least bad values for the cache aging are not obvious to me. We've been running the sum of these fixes for a short time, and cannot be certain they are sufficient. Not all of these fixes are yet in currently shipping SGI products.

Vernon Schryver
Silicon Graphics
vjs@sgi.com
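The three fixes described in this post (a cap on outstanding children, no second child for a question already being asked, and a cache of negative answers and time-outs) can be sketched as a small decision table. This is only an illustrative sketch, not SGI's ypserv code: the names `lookup_decide` and `lookup_done`, the table size, the child cap, and the 60-second aging value are all hypothetical, and the post itself says good aging values are not obvious.

```c
#include <string.h>
#include <time.h>

#define MAX_ENTRIES  64   /* hypothetical table size */
#define MAX_CHILDREN 4    /* hypothetical cap on outstanding children */
#define NEG_TTL      60   /* hypothetical seconds a failure is remembered */

enum state  { EMPTY, PENDING, FAILED };
enum action { START_CHILD, WAIT_FOR_CHILD, ANSWER_FAILED, TOO_BUSY };

struct entry {
    char       name[256];
    enum state state;
    time_t     expires;   /* only meaningful while state == FAILED */
};

static struct entry table[MAX_ENTRIES];
static int children;      /* number of lookups currently outstanding */

/* Decide what to do with a request for `name` arriving at time `now`. */
enum action lookup_decide(const char *name, time_t now)
{
    struct entry *free_slot = NULL;

    for (int i = 0; i < MAX_ENTRIES; i++) {
        struct entry *e = &table[i];

        if (e->state == FAILED && e->expires <= now)
            e->state = EMPTY;              /* age out an old failure */
        if (e->state == EMPTY) {
            if (free_slot == NULL)
                free_slot = e;
            continue;
        }
        if (strcmp(e->name, name) == 0)    /* same question seen before */
            return e->state == PENDING ? WAIT_FOR_CHILD : ANSWER_FAILED;
    }
    if (free_slot == NULL || children >= MAX_CHILDREN)
        return TOO_BUSY;                   /* refuse to start another child */

    /* Names longer than the buffer are truncated in this sketch. */
    strncpy(free_slot->name, name, sizeof free_slot->name - 1);
    free_slot->name[sizeof free_slot->name - 1] = '\0';
    free_slot->state = PENDING;
    children++;
    return START_CHILD;
}

/* Record the outcome of a child; found == 0 covers "no such host" and time-outs. */
void lookup_done(const char *name, int found, time_t now)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        struct entry *e = &table[i];

        if (e->state == PENDING && strcmp(e->name, name) == 0) {
            children--;
            if (found) {
                e->state = EMPTY;          /* positive answers are not cached here */
            } else {
                e->state = FAILED;
                e->expires = now + NEG_TTL;
            }
            return;
        }
    }
}
```

A server loop would call `lookup_decide` for each request that misses the YP map, fork only on `START_CHILD`, and call `lookup_done` when the child reports back. Retries of the same question then cost a table scan instead of another DNS query.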
cpj@ENG.SUN.COM (Chuck Jerian) (07/03/89)
I've made about the same changes to a version of yp but never put them into a release. I also have a newer version that does positive and negative caching and never forks; it's based on a new asynchronous resolver interface. It records that a request is outstanding and starts up the asynchronous resolver. Any further yp requests for the same question are dropped until the asynchronous resolver succeeds or fails; the success or failure is then cached and used to answer subsequent requests for some lifetime.

I've also overloaded the no_more_keys response to be a soft error for hosts by name, and have arranged for multiple IP addresses to be returned by returning a yp response of the form:

    "bogus.com 1.2.3.4\nbogus.com 1.5.4.1"

I've also changed rpc to use exponential backoff at all times; this eliminates the close spacing of the rpc requests.

--cpj
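For readers unfamiliar with the term, exponential backoff means each retry waits twice as long as the previous one, up to some ceiling. The sketch below shows the idea only; it is not the modified Sun RPC code, and the function name `backoff_delay_ms`, the base interval, and the cap are hypothetical.

```c
/*
 * Delay before retry number `attempt` (0 = first retry):
 * base_ms doubled `attempt` times, never exceeding cap_ms.
 */
unsigned long backoff_delay_ms(unsigned attempt,
                               unsigned long base_ms,
                               unsigned long cap_ms)
{
    unsigned long delay = base_ms;

    while (attempt-- > 0 && delay < cap_ms) {
        if (delay > cap_ms / 2) {   /* doubling would overshoot the cap */
            delay = cap_ms;
            break;
        }
        delay *= 2;
    }
    return delay < cap_ms ? delay : cap_ms;
}
```

With an assumed base of 1000 ms and a cap of 30000 ms the retries are spaced 1, 2, 4, 8, 16 and then 30 seconds apart, so a client that retries for a long time sends far fewer packets than one using a fixed short interval.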
medin@NSIPO.NASA.GOV ("Milo S. Medin", NASA ARC NSI Project Office) (07/03/89)
This is silly. Either the DNS is fully authoritative or it isn't. This is like checking hosttable information and if not there, asking the nameserver to take a crack at it. This may seem sensible on the surface, but if you think about it a bit, it's a lose.

Say the YP domain has more info than the DNS. In this case, people outside the YP domain will get back incorrect or incomplete information, since they don't have access to the YP information. Say the DNS has more info than the YP system. In this case, you can get into a case where the YP system returns one thing and the DNS returns something different. There is of course the issue of who is right, but more important is that it's different. Once you get things consistent, it's a lot easier to make sure it's saying the correct thing. The only case where you win is where the DNS and the YP system are totally consistent with each other, in which case you could have used straight DNS all along, and not have gone through this mess.

The implementation issue is orthogonal to the issue of which system is more authoritative. Even if YP had the concept of a 'soft-err', you still have the consistency issue to deal with. The YP hostinfo architecture works fine in an environment where you don't have any reason to talk to the DNS. If you're connected to the Internet and want to interoperate, then you have a problem trying to make things play together.

What we have done since 3.5 is rebuild the libc.a library after replacing the YP hostinfo code completely with DNS resolver equivalents. This has worked very well, though under 3.X meant you had to relink the 3.5 utilities, which was a bit difficult since SUN didn't ship you the .o files. Under 4.X, it's much easier, as the hostent structures are consistent with 4.3 BSD, and all you need to do is rebuild the shared libc library, and all those dynamically linked utilities automagically work fine. The stuff that's left (rcp, arp, etc...) can be rebuilt with 4.3 code or SUN source.
The kind folks at SUN have even gone so far as to do this for you and make it available for FTP from uunet.uu.net. They are even handing out the -pic .o archives (the .o's just before they are linked into the shared libc), so you can replace whatever pieces you like. I even hear these things will be shipped as a standard part of 4.1. So SUN at least is trying to do the right thing (backwards compatibility can really be painful sometimes).

Note that I've talked about replacing just the YP hostinfo code. The rest of the YP system is left intact (so automounter and the rest work fine) to use if you want...

Thanks,
Milo
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (07/04/89)
In article <8907022301.AA01289@erendira.arc.nasa.gov>, medin@NSIPO.NASA.GOV ("Milo S. Medin", NASA ARC NSI Project Office) writes:
>
> This is silly. Either the DNS is fully authoritative or it isn't...
>
> Note that I've talked about replacing just the YP hostinfo code. The rest of
> the YP system is left intact (so automounter and the rest work fine) to
> use if you want...

This is reasonable, but there is a minor hassle. The named stuff I know about is not nearly as easy to set up in a small site as YP. There is more typing, and room for choice with 4.3+ named configuration files. Given that DNS is more powerful, one would be surprised if it were otherwise. In its current state, named is more appropriate for sites such as ours with network hacks who like to fiddle with resolv.conf's, named.boot's, .rev files, and so on.

Is there an alternative to a resolv.conf listing the servers on each client? The broadcast RPC of YP is insecure, but it is easier to reassign servers. Named is also a bit unforgiving--ever notice what happens if you happen to put a period after a host address in a database file? You define a name=-1. Don't forget the fun chaos one can make with resolver loops.

Obviously, most of these are characteristics of the 4.3 implementation and not DNS itself. Many (but not all) of the well known bad parts of YP are implementation shortcomings rather than protocol botches.

Notice that the vast majority of all workstations do not have access to the Internet, are on very small networks, and do not have an a priori need for DNS.

Always relying on either DNS or YP is an incomplete answer. It is sometimes necessary to have your own, private extensions to the central government's data. For example, imagine that you do not have root passwords for the DNS/YP server(s), and that you want to use rcp to a host which is not correctly in the databases--maybe one of the central governments has not processed the paper work needed before adding a new hostname, or made a mistake.

Vernon Schryver
Silicon Graphics
vjs@sgi.com
hedrick@geneva.rutgers.edu (Charles Hedrick) (07/04/89)
I accept your justification for providing the option to use pure YP to handle host names. What I think people are complaining about is the combination of YP and DNS. If people are on a self-contained network that doesn't use the DNS, by all means let them use YP. However if they need the DNS, they should use it directly -- not frontended by YP. Having two different sources for the same information is asking for trouble, and having systems as complex as YP and the DNS interact, particularly with caches in between, is going to make problem diagnosis a nightmare.

If somebody is using the mixed YP/DNS, they're going to have to learn how to set up the DNS anyway. You haven't gained them anything by mixing the two. You've just made their setup a lot more complex.

Sun's approach of providing different versions of the sharable libc is a good one. (So is providing libc_pic.a, since it lets me replace the resolver with the newest Berkeley one.) Even easier would be an approach where gethostbyname checks for some file (e.g. /etc/resolv.conf) to see which approach to use. That way they'd only have to distribute one binary. Pyramid checks for /etc/nameserver. It's a file whose contents don't matter. If it is present, their gethostbyname talks to the DNS rather than the host table.
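The marker-file idea in the last paragraph amounts to one existence check at lookup time. A minimal sketch follows, assuming a POSIX `access()` call; the enum and the function name `choose_name_service` are hypothetical, and this is not Pyramid's or Sun's actual library code.

```c
#include <unistd.h>

enum name_service { USE_LOCAL_TABLE, USE_DNS };

/*
 * Pick a lookup method from the mere presence of a marker file.
 * The file's contents are never read, only its existence matters.
 */
enum name_service choose_name_service(const char *marker_path)
{
    return access(marker_path, F_OK) == 0 ? USE_DNS : USE_LOCAL_TABLE;
}
```

A single gethostbyname binary could call `choose_name_service("/etc/nameserver")` (or use "/etc/resolv.conf" as the marker) and branch to the resolver or to the host table or YP code accordingly, so the administrator switches behaviour by creating or removing one file.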
lindberg@cs.chalmers.se (Gunnar Lindberg) (07/04/89)
In article <37409@sgi.SGI.COM> vjs@rhyolite.wpd.sgi.com (Vernon Schryver) writes:
> ...
>Always relying on either DNS or YP is an incomplete answer. It is
>sometimes necessary to have your own, private extensions to the central
>government's data...

Yes, isn't the answer "let's have both"? We've tried to get around most problems with a "gethostbyname()" that uses an algorithm like:

    if (index(host, '.'))               /* with '.', use DNS first */
    {
        if ( ! (hp = get_ns_hostbyname(host)))
            hp = get_yp_hostbyname(host);
    }
    else                                /* no '.', assume local YP */
    {
        if ( ! (hp = get_yp_hostbyname(host)))
            hp = get_ns_hostbyname(host);
    }

This way we can use short names for hosts within several domains (yes I know I'm lazy, :-) and catch them via YP, while real domain names always get resolved by DNS (of course without "ypserv -i"). Possibly one could argue against using YP at all for names which contain a '.' - personally I don't think that does anything bad.

We haven't tried this in "sharable libc" yet, but I would guess that's rather straightforward. If anyone wants the changes they're available for "ftp chalmers.se [129.16.1.1], get ucb/named-4.8-cth.shar".

Gunnar Lindberg
roy@phri.UUCP (Roy Smith) (07/04/89)
In <Jul.3.18.13.06.1989.7187@geneva.rutgers.edu> hedrick@geneva.rutgers.edu (Charles Hedrick) writes:
> What I think people are complaining about is the combination of YP and
> DNS. [...] If somebody is using the mixed YP/DNS, they're going to have
> to learn how to set up the DNS anyway. You haven't gained them anything
> by mixing the two. You've just made their setup a lot more complex.

For companies distributing complete operating systems, I probably agree with Charles. But, what about for somebody like me who is trying to get an MX-grokking sendmail to run under an originally YP-based system like SunOS-3.5? Sendmail needs to talk directly to the DNS system because it has to get at MX records. But for the zillions of other programs that have to do name translation, YP works just fine, and Sun's idea of having YP hand off to DNS any query it can't resolve itself seems logical. You don't really expect me to recompile *every* program that calls gethostbyname(), do you?

Besides, for all its grossness (and there is plenty) YP still provides a reasonably convenient way to share files with local additions. For example, all our suns share /etc/printcap using YP. Every machine that has direct control of a printer has a local /etc/printcap for that printer. The printcap parsing routines were written to read the local file first and only go to YP if the name can't be resolved there. Show me a way to do that that's easier than YP.

Yes, for some applications, YP is a big loose and you need to go full-frontal DNS. But just because a new and better tool comes along doesn't mean you should completely throw out the old ones; for some applications they might actually be better.

--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
{allegra,philabs,cmcl2,rutgers,hombre}!phri!roy -or- roy@alanine.phri.nyu.edu
"The connector is the network"
david@ms.uky.edu (David Herron -- One of the vertebrae) (07/04/89)
The latest Ultrix has an interesting file called /etc/svcorder. It contains lines telling which name service to consult in what order when looking up names. Like

    local
    bind

"local" means /etc/hosts, in which we list information for our local machines and which is generated from the same file which the nameserver information comes from. The person who administers that file isn't as trusting of BIND as I am ... :-)

The point is that you can mix & match as you please and even use yp. There aren't any dire warnings in the Ultrix manuals about possibly undebuggable situations arising from mixing bind and yp. Oh well. Nor anything about authoritative -vs- non-authoritative information (i.e. generating both "views" from the same file).

In any case, I happen to like this particular way of specifying this behaviour.

Now, I fail to see all but the most incidental connection to TCP/IP in this discussion. A better newsgroup to discuss this in is comp.protocols.tcp-ip.domains.

--
<- David Herron; an MMDF guy <david@ms.uky.edu>
<- ska: David le casse\*' {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<-
<- New word for the day: Obnoxity -- an act of obnoxiousness
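An ordered service list like the one described in this post can be modelled as a loop that tries each named service in turn and stops at the first answer. The sketch below is not Ultrix code and makes no claim about the real /etc/svcorder syntax beyond the two lines quoted in the post. The `struct service` type, `resolve_in_order`, the `demo_*` stubs, and the addresses they return are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* A lookup backend: returns an address string, or NULL if it has no answer. */
typedef const char *(*lookup_fn)(const char *host);

struct service {
    const char *name;     /* as it would appear in the order list */
    lookup_fn   lookup;
};

/* Try each service named in `order`, in order; the first answer wins. */
const char *resolve_in_order(const char *host,
                             const char *const order[], size_t norder,
                             const struct service services[], size_t nservices)
{
    for (size_t i = 0; i < norder; i++) {
        for (size_t j = 0; j < nservices; j++) {
            if (strcmp(order[i], services[j].name) == 0) {
                const char *addr = services[j].lookup(host);
                if (addr != NULL)
                    return addr;
                break;            /* no answer here, try the next service */
            }
        }
    }
    return NULL;
}

/* Stub backends that deliberately disagree about "printer". */
const char *demo_local(const char *host)
{
    return strcmp(host, "printer") == 0 ? "10.0.0.5" : NULL;
}

const char *demo_bind(const char *host)
{
    return strcmp(host, "printer") == 0 ? "192.0.2.9" : NULL;
}

const struct service demo_services[] = {
    { "local", demo_local },
    { "bind",  demo_bind  },
};
```

With the order `local, bind` the stub setup answers "printer" from the local table. Reversing the order returns the other address, which is the consistency problem other posters in this thread worry about whenever two sources hold different data.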
casey@gauss.llnl.gov (Casey Leedom) (07/05/89)
| From: vjs@rhyolite.wpd.sgi.com (Vernon Schryver)
|
| 1) some program decides to do gethostbyname(foo.bar) or
|    gethostbyaddr(1.2.3.4), checks with portmap & ypbind, and sends
|    an rpc request to the correct ypserv.
| 2) ypserv gets the request, fails to find the key in the YP map, and
|    since YP-to-DNS is turned on, forks a child which does a DNS
|    lookup.
| 3) the link to the DNS root or correct authoritative server is down or
|    congested, so the child does not get an answer for a while.
| 4) meanwhile, the original program in step #1 is waiting for the answer.
|    If step #3 takes long enough, the original program does a normal
|    YP-rpc timeout, retries, and everything is repeated from step #1
|
| This is worse than it looks because the time-out in step #4 is less than
| the one used by the child in step #3. One can get large numbers of
| children of ypserv, all asking the local DNS server for the same answer.
|
| Some programs, ypmatch may be one, seem to try forever. This would
| generate an unbounded, linearly increasing amount of DNS traffic, except
| that one usually runs out of resources for the local nameserver and
| ypserv parent.

Vernon has described the situation accurately. Most of the time the application in question is telnet, ftp or some other user instigated application. When this happens the user usually gets tired of waiting after a while and aborts. This causes a minor transient load on the network and the machine running the YP server, but usually nothing you couldn't live through.

One application in particular doesn't get tired though: sendmail. For every piece of mail you have queued up, there will be a sendmail waiting for address resolution. In the stock versions of the Sun OS 3.X, sendmail will hang forever. (Note that it really isn't sendmail, but rather the gethostbyXXXX(3) library routines.)
In any case, when I ran into this problem in November of 1987 I talked about it to Bill Nowicki at Sun, who was their sendmail guru at the time (and probably still is), and he gave me two new sendmail binaries. Either one solves the problem described above.

The first binary is pretty much identical to the standard sendmail binary, but includes timeouts around all the gethostbyXXXX calls. He set the timeout to 90 seconds which I think is way too long, but it gets the job done. If a timeout occurs, sendmail just leaves the mail queued up assuming a temporary delivery failure. It will return the mail after the normal three days of trying if it can't deliver it in that time.

The second binary is much more interesting however. It completely bypasses YP and goes directly for the name server itself. Thus, you get MX support! The only thing you have to do to run the MX sendmail is include the file /etc/resolv.conf if the name server isn't running on the local host.

If anyone wants, I have both Sun2 binaries for SUN OS 3.X. (The binaries will run fine on a Sun3 - trust me, I've been running them for a year and a half now.) They are available via anonymous ftp from lll-crg.llnl.gov under llnl/named/sun:

    % ls -l ~ftp/llnl/named/sun
    total 461
    -r--r--r--  1 root       1214 Jan  3  1989 nslookup.help
    -rw-r--r--  1 root        786 Dec  1  1987 sun.rc.local.diff
    -rwxr-xr-x  1 ftp      172032 Dec  1  1988 sun2.sendmail*
    -rwxr-xr-x  1 ftp      196608 Dec  1  1988 sun2.sendmail.mx*
    -rwxr-xr-x  1 root      81920 Jan  3  1989 sun3.nslookup*
    lrwxr-xr-x  1 root         13 Dec 24  1988 sun3.sendmail@ -> sun2.sendmail
    lrwxr-xr-x  1 root         16 Dec 24  1988 sun3.sendmail.mx@ -> sun2.sendmail.mx

Casey
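One classic way to put a timeout around a blocking library call on a Unix system is to arm `alarm()` and jump out of the call from the SIGALRM handler. The post does not say how the patched sendmail binary did it, so the sketch below is only an assumption about the general technique. `lookup_with_timeout` and the `demo_*` stubs are hypothetical names. Jumping out of a real resolver call can also leave its internal state (open sockets, static buffers) inconsistent, so treat this as an illustration and not as production code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <netdb.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

typedef struct hostent *(*host_lookup_fn)(const char *name);

static sigjmp_buf lookup_env;

static void on_alarm(int signo)
{
    (void)signo;
    siglongjmp(lookup_env, 1);      /* abandon the blocked lookup */
}

/*
 * Call fn(name), giving up after `seconds` (0 means no limit).
 * On timeout returns NULL and sets *timed_out to 1, so the caller
 * can treat it as a temporary failure and retry later.
 */
struct hostent *lookup_with_timeout(host_lookup_fn fn, const char *name,
                                    unsigned seconds, int *timed_out)
{
    struct sigaction sa, old;
    struct hostent *volatile hp = NULL;

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, &old);

    *timed_out = 0;
    if (sigsetjmp(lookup_env, 1) == 0) {
        alarm(seconds);
        hp = fn(name);
    } else {                        /* reached via siglongjmp from on_alarm */
        hp = NULL;
        *timed_out = 1;
    }
    alarm(0);                       /* cancel any pending alarm */
    sigaction(SIGALRM, &old, NULL); /* restore the previous handler */
    return hp;
}

/* Stubs standing in for a hung and a healthy name service. */
struct hostent *demo_slow_lookup(const char *name)
{
    (void)name;
    sleep(3);
    return NULL;
}

struct hostent *demo_fast_lookup(const char *name)
{
    static struct hostent h;
    (void)name;
    return &h;
}
```

In real use the stub would be replaced by `gethostbyname`, which has the same function type. A mailer could then leave the message queued whenever `*timed_out` is set, which matches the behaviour described for the first binary.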
medin@NSIPO.ARC.NASA.GOV ("Milo S. Medin", NASA ARC NSI Project Office) (07/05/89)
> This is reasonable, but there is a minor hassle. The named stuff I know
> about is not nearly as easy to set up in a small site as YP. There is more
> typing, and room for choice with 4.3+ named configuration files. Given
> that DNS is more powerful, one would be surprised if it were otherwise. In
> its current state, named is more appropriate for sites such as ours with
> network hacks who like to fiddle with resolv.conf's, named.boot's, .rev
> files, and so on. Is there an alternative to a resolv.conf listing the
> servers on each client? The broadcast RPC of YP is insecure, but it is
> easier to reassign servers. Named is also a bit unforgiving--ever notice
> what happens if you happen to put a period after a host address in a
> database file? You define a name=-1. Don't forget the fun chaos one can
> make with resolver loops.

If you ran named on the clients and told them about the root servers they could figure out everything else from there. Of course you could always broadcast named queries (ugly, but possibly useful for finding things initially). If you put as much effort into making the user interface to BIND as nice as YP's, you'd have it capable of being run more easily in small non-connected islands as well as in the Internet. In addition, there are awk and perl scripts available that take a hosttable as input and output a set of DNS files to feed to named.

Sendmail is complicated too, but people didn't throw it away and write something more friendly. They put time and effort into making configuration easier. It's also a lot more robust than it used to be.

> Obviously, most of these are characteristics of the 4.3 implementation and
> not DNS itself. Many (but not all) of the well known bad parts of YP are
> implementation shortcomings rather than protocol botches.

Agreed. It's too bad people reject protocol architectures because of implementation issues as opposed to fundamental problems with the architecture. That's why we have ugliness like Appletalk in the world... :-(

> Notice that the vast majority of all workstations do not have access to
> the Internet, are on very small networks, and do not have an a priori need
> for DNS.

How many times have we seen nets of small organizations winding up connected to the Internet? Also don't forget that DNS has things like multiple addresses per hostname and MX record support. YP doesn't. Even in a small organization, these things can be useful. Growing is much easier if you treat small systems the same way as big ones, rather than coming up with different solutions. Look what happens when you get a large pile of PC's running some brand X proprietary LAN software package and then you have to connect to a set of heterogeneous systems (like mini's or workstations). Here you throw away the PC oriented approach and go with something more standard. Besides, small systems and small networks have a strong tendency to get bigger, because of technology issues.

> Always relying on either DNS or YP is an incomplete answer. It is
> sometimes necessary to have your own, private extensions to the central
> government's data. For example, imagine that you do not have root
> passwords for the DNS/YP server(s), and that you want to use rcp to a host
> which is not correctly in the databases--maybe one of central governments
> has not processed the paper work needed before adding a new hostname, or
> made a mistake.

Most utilities let you use numbers rather than names. The fact that rcp and rlogin don't is an artifact of the implementation. That is, no one has bothered to fix it. In your example, I'd just use FTP. You will always get mistakes that break things, whether it's YP or DNS systems. Depending on how the YP space is set up, you may not be any better off than the DNS case in your example.

There are many organizations (like Apple) who have private DNS data that doesn't escape to the outside. That isn't an argument for YP, it's an argument for knobs to BIND to make that easier. Remember, the DNS is a protocol architecture, and not BIND.

Maybe what you're saying is that sometimes YP is the right answer and sometimes DNS is. I buy that, especially given certain implementation issues. But trying to get them to work together is a problem. That's my main argument anyway. I'm arguing for consistency. If YP is what you want, that's fine, just don't try and mix it up with DNS information.

> Vernon Schryver
> Silicon Graphics
> vjs@sgi.com

Thanks,
Milo

PS Usual disclaimers apply of course...
mar@ATHENA.MIT.EDU (07/05/89)
    Date: 4 Jul 89 13:42:16 GMT
    From: phri!roy@rutgers.edu (Roy Smith)

    Besides, for all its grossness (and there is plenty) YP still
    provides a reasonably convenient way to share files with local additions.
    For example, all our suns share /etc/printcap using YP. Every machine that
    has direct control of a printer has a local /etc/printcap for that printer.
    The printcap parsing routines were written to read the local file first and
    only go to YP if the name can't be resolved there. Show me a way to do that
    that's easier than YP.

But we do the exact same thing here at Athena using our Hesiod name service. Hesiod is a set of simple library routines layered over BIND that give us the ability to look up account information, access groups, filesystems, printers, etc. We have the flexibility of the DNS, and only one kind of database to maintain for hosts and other information. We find that it's faster to look up something through Hesiod than to scan a local text file, even if it's not already in the local named cache.

-Mark Rosenstein
chris@GYRE.UMD.EDU (Chris Torek) (07/05/89)
    From: phri!roy@rutgers.edu (Roy Smith)

    ... Yes, for some applications, YP is a big loose [sic] and you need to
    go full-frontal DNS. But just because a new and better tool comes along
    doesn't mean you should completely throw out the old ones; for some
    applications they might actually be better.

YP and the domain system are completely different beasts, and should never be asked to talk to each other. Host name service should be DNS based, except for isolated nets (here one might want to use YP simply because it makes for one less thing to learn). Here is why.

The domain service system is a fully distributed database. It deals with such issues as network partitioning and multiple administration (to the tune of thousands of individuals or groups administering their own part of the database).

YP is a centralised authoritarian database. It can, to some extent, handle a partitioned network, but it does not allow independent administration. It believes in a master/slave relationship; and for someone to be able to set up (e.g.) password or printer service one needs to be the master. By definition, everyone else is a slave.

Distributed and centralised databases just do not mix.

One caveat: We have never used YP at all here (at the CS Department; there are other groups at the College Park campus that do use YP), so in some cases I am speaking through a metaphorical hat. No doubt if I am wrong we will all hear about it.

Chris