[net.bugs.4bsd] unix domain loop or mbuf loss

arwhite@watmath.UUCP (Alex White) (02/21/84)

Subject: Loss of mbufs - or - system hang
Index:	sys/uipc_usrreq.c 4.2BSD

Description:
	Queued connections in the Unix domain are torn down incorrectly
	on a soclose.  As distributed, the kernel loops and the system
	hangs; the fix distributed as:
		From: madden@sdccsu3.UUCP (Jim Madden)
		Newsgroups: net.bugs.4bsd
		Subject: 4.2 IPC machine hang
		Article-I.D.: sdccsu3.1238
		Posted: Mon Nov  7 03:09:21 1983
		Organization: U.C. San Diego, Computer Center
	will fix the hang, but will cause you to lose 3 mbufs every
	time you do the soclose while a queued connection has not yet
	been accepted.
Repeat-By:
	First do a netstat -m and note the counts.  Run the following in
	the background:
	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/un.h>

	main()
	{
		struct sockaddr_un address;
		int s;

		s = socket(AF_UNIX, SOCK_STREAM, 0);
		address.sun_family = AF_UNIX;
		strcpy(address.sun_path, "xxx");
		if(bind(s, &address, sizeof(address.sun_family) + strlen(address.sun_path)) < 0) {
			perror("bind");
			exit(1);
		}
		listen(s, 5);
		pause();
	}
	Then run as many instances of the following as you want - 8 seems
	to be the max.
	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/un.h>

	main()
	{
		struct sockaddr_un address;
		int s, i;

		s = socket(AF_UNIX, SOCK_STREAM, 0);
		address.sun_family = AF_UNIX;
		strcpy(address.sun_path, "xxx");
		if((i = connect(s, &address, sizeof(address.sun_family) + 
			strlen(address.sun_path))) < 0) {
			perror("connect");
			exit(1);
		}
	}
	Kill the first process.  Do a netstat -m and compare; you will find
	8 extra mbufs allocated to socket structures, 8 to protocol
	control blocks, and 8 to socket addresses.
	(Note: if you didn't put in the fix from madden@sdccsu3,
	you will loop in the kernel instead.)
Fix:
	madden's fix was totally wrong.  The other protocols all free
	the socket in their cleanup routines, some to a disastrous extent:
	udp_usrreq, in PRU_ABORT, does a sofree right after invoking
	in_pcbdetach, which also does one, and then does a
	soisdisconnected on the socket it just freed!
	(I suspect you should delete the sofree call.)
	For the problem described above, however, first take out the
	fix from madden@sdccsu3.
	Then, in unp_drop in uipc_usrreq.c, change
		unp_disconnect(unp);
	to
		unp_detach(unp);
		sofree(unp->unp_socket);
	**DISCLAIMER: This works and fixes the above bug.  I haven't the
	foggiest idea whether it breaks other things; for example,
	the flow of control through the above routines is different
	for datagram service, which has a different set of queues, and
	I really don't know whether this will break it.  If somebody has
	a better fix, please send it to me.
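
	The mbuf accounting behind the leak can be sketched as a toy
	userland model.  This is not kernel code: every name in it
	(struct model, model_connect, leaked_after_close, and so on) is
	made up for illustration.  The only things taken from the report
	are that each unaccepted connection holds 3 mbufs (socket,
	protocol control block, address) and that, as the report implies,
	disconnecting alone does not release them while a detach plus
	sofree does.

```c
#include <assert.h>

/* Toy userland model of the mbuf accounting; NOT 4.2BSD kernel code.
 * All names here are hypothetical. */

/* Per the report: one mbuf each for the socket, the PCB and the address. */
enum { MBUFS_PER_CONN = 3 };

struct model {
	int mbufs_in_use;	/* roughly what netstat -m would count */
	int queued;		/* connections not yet accepted */
};

/* A client connects; the listener never accepts it. */
void model_connect(struct model *m)
{
	m->mbufs_in_use += MBUFS_PER_CONN;
	m->queued++;
}

/* Assumed behaviour with the earlier fix: each queued connection is
 * only disconnected, so it leaves the queue but keeps its mbufs. */
void model_close_disconnect_only(struct model *m)
{
	m->queued = 0;
}

/* Assumed behaviour with the detach + sofree change: each queued
 * connection gives back all of its mbufs as it is torn down. */
void model_close_detach_and_free(struct model *m)
{
	while (m->queued > 0) {
		assert(m->mbufs_in_use >= MBUFS_PER_CONN);
		m->mbufs_in_use -= MBUFS_PER_CONN;
		m->queued--;
	}
}

/* Queue nclients connections, close the listener, and return how
 * many mbufs are still in use afterwards. */
int leaked_after_close(int nclients, int detach_and_free)
{
	struct model m = { 0, 0 };
	int i;

	for (i = 0; i < nclients; i++)
		model_connect(&m);
	if (detach_and_free)
		model_close_detach_and_free(&m);
	else
		model_close_disconnect_only(&m);
	return m.mbufs_in_use;
}
```

	With 8 queued clients, leaked_after_close(8, 0) gives 24, which
	matches the 8 + 8 + 8 extra mbufs seen in netstat -m, and
	leaked_after_close(8, 1) gives 0.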