[comp.sys.sgi] digging out stuff from process address space

rayan@cs.toronto.edu (Rayan Zachariassen) (07/28/89)

q1: (curiosity) How can I reliably determine the end of the envp strings
	in the process address space?  I.e. what is between the last
	environment string value and stackbas(0) (from core.h) ?

q2: (problem) What magic do I need to read the top of user stack through
	/debug/<pid>?  I tried the following sequence:

		fd = open("/debug/<pid>", 0);
		fcntl(fd, DFCSTOP, 0);
		nfd = fcntl(fd, DFCOPENT, x);
		lseek(nfd, stackbas(some low number), 0)
		read(nfd, buf, sizeof buf)
		fcntl(fd, DFCRUN, &(something which is CLEARNOSIG))

	with x being variously 0, i, and &i, where i=0 or
	i=stackbas(some low number) without luck.  I also tried seeking
	and reading from fd, but got something that looks like initialized
	data and/or symbol table.  stackbas(some low number) ~= 0x7fffc700

q3: (curiosity) What is the reason to require the process to be stopped
	before doing DFCOPENT or read() from it?  This limits the usefulness
	of the operations provided.

My purpose is to rummage through the inherited environment of an arbitrary,
running, possibly critical, process (like login shells...).

Thanks for any info

rayan

jeffd@norge.sgi.com (Jeff Doughty) (07/29/89)

I can answer two of the three questions.  Perhaps one of my colleagues can
answer #1.

> q2: (problem) What magic do I need to read the top of user stack through
> 	/debug/<pid>?  I tried the following sequence:
> 
> 		fd = open("/debug/<pid>", 0);
> 		fcntl(fd, DFCSTOP, 0);
> 		nfd = fcntl(fd, DFCOPENT, x);
> 		lseek(nfd, stackbas(some low number), 0)
> 		read(nfd, buf, sizeof buf)
> 		fcntl(fd, DFCRUN, &(something which is CLEARNOSIG))

This is almost correct.  The problem is that you are reading the wrong file
descriptor.  DFCOPENT returns the file descriptor of the file that contains
the binary.  For example, if you are running "/usr/tmp/prog", the file
descriptor returned is as if your code said open("/usr/tmp/prog", 0).
This is of interest to debuggers who need symbol tables, but you are
interested in reading from the running process.  So read from fd, not nfd.

> q3: (curiosity) What is the reason to require the process to be stopped
> 	before doing DFCOPENT or read() from it?  This limits the usefulness
> 	of the operations provided.

This is due to a sticky implementation issue - I'll try to explain briefly.
Our kernel works on multiprocessors as well as uniprocessors.  Therefore
shared data must be protected from simultaneous access, which implies a lock
of some kind.  If an arbitrary process were to examine the address space of
another process, the structures that hold this information would have to
be locked.  Well, 99.9% of the time this data is private to the process
and need not be locked.  It needs to be protected only when dbx (which is
the primary user of /debug) is debugging the process.  By stopping the
process, we don't need locks, since we know that the data structures aren't
going to be mucked with.  In addition, it is arguably desirable that the
process be stopped while it is being examined by the debugger.

In the future, we may be relaxing this restriction by only locking when
modifying the virtual address space - a relatively infrequent event.
In the meantime, your code above is correct - stop the process and restart it.

> Thanks for any info

> rayan

Hope this helps.

					Jeff Doughty

rayan@CS.TORONTO.EDU (Rayan Zachariassen) (08/01/89)

Jeff, thanks for your response.  It did indeed guide me the right way.
My confusion about DFCOPENT was due to its description in the manual page:

	  DFCOPENT
	       If the given argument is zero, return the file
	       descriptor corresponding to the text region of the
	       process.  If the given argument is non-zero, interpret
	       this value as a virtual address of the process.  Return
	       the file descriptor corresponding to the region
	       containing this address.  This call may be used to
	       locate symbol tables of the process.

This gave me the impression that opening /debug/<pid> would give only the
text segment of the process and one would have to use DFCOPENT with a virtual
address to get the stack segment, for example.  I got even more confused by
seeing strange things in the top page of the stack segment as read from a
/debug/<pid> fd... once it was a symbol table, another time it was the
contents of the root crontab (which I had edited shortly before).  I get
the feeling that page isn't being cleared when mapped into the process
address space (if so, you will want to fix it, if not, I remain confused).

For people listening in, the reason I was wondering about this was to
be able to print the hosts people are logged in from in a who listing.
We have this (suid, of course) in /local/bin/who.  I really hate having to
stop login shells like this.

#include <stdio.h>
#include <sys/types.h>
#include <utmp.h>
#include <sys/schedctl.h>
#include <core.h>
#include <sys/fs/dbfcntl.h>
#include <sys/signal.h>


main(argc, argv)
	int argc;
	char *argv[];
{
	int i;
	char *at, *host;
	struct utmp *utp;
	extern struct utmp *getutent();
	extern char *ctime(), *gethost();

	/* if this isn't a normal everyday vanilla 'who', punt to the SGI one */
	if (argc > 1) {
		(void) execv("/bin/who", argv);
		(void) execv("/usr/bin/who", argv);
		fprintf(stderr, "%s: can't find normal 'who', giving up.\n",
				argv[0]);
		exit(1);
	}

	/* setup for critical region in gethost() */
	/*	ignore all signals */
	for (i = 1; i < MAXSIG; ++i)
		(void) signal(i, SIG_IGN);
	/*	up my priority so this runs fast... */
	(void) schedctl(NDPRI, 0, NDPHIMAX);
	
	while ((utp = getutent()) != NULL) {
		if (utp->ut_type != USER_PROCESS)
			continue;
		at = ctime(&utp->ut_time);
		at[16] = '\0';
		printf("%-8.8s %-8.8s%s", utp->ut_name, utp->ut_line, at+4);
		if ((host = gethost(utp->ut_pid)) != NULL)
			printf("   (%s)", host);
		putchar('\n');
	}
	exit(0);
}

char *
gethost(pid)
	int pid;
{
	register char *cp;
	int fd, n, i;
	static char buf[0x3000];
	extern char *strchr();

	(void) sprintf(buf, "/debug/%d", pid);
	if ((fd = open(buf, 0)) < 0) {
		perror("open");
		return NULL;
	}
	/* critical section start */
	if (fcntl(fd, DFCSTOP, 0) < 0) {
		perror("fcntl 1");
		(void) close(fd);
		return NULL;
	}
	n = -1;
	if (lseek(fd, (long)stackbas(sizeof buf), 0) < 0)
		perror("lseek");
	else if ((n = read(fd, buf, sizeof buf)) < 0)
		perror("read");
	/* restart the process even when the seek or read failed */
	i = CLEARNOSIG;
	if (fcntl(fd, DFCRUN, &i) < 0) {
		perror("fcntl 3");
		n = -1;
	}
	/* critical section end */
	(void) close(fd);
	if (n < 11)
		return NULL;
	if (n == (int) sizeof buf)
		--n;		/* give up the last byte for a NUL */
	buf[n] = '\0';
	/* start 11 bytes from the end so strncmp stays inside the data read */
	for (i = n - 11; i >= 0; --i) {
		cp = buf + i;
		if (*cp == 'R' && strncmp(cp, "REMOTEHOST=", 11) == 0)
			return strchr(cp, '=') + 1;
	}
	return NULL;
}
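
The scanning step at the end of gethost() can be tried out without /debug
at all.  What follows is only a sketch of that one step, written as a
standalone function in present-day C rather than the style of the program
above.  It assumes that the bytes read from the top of the stack hold
NUL-terminated "NAME=value" strings.  The name find_env_value is made up
for this illustration, and the check that an entry starts right after a
NUL is an extra precaution that the posted program does not take.

```c
#include <stddef.h>
#include <string.h>

/*
 * Scan the first n bytes of buf, from the end towards the start, for a
 * NUL-terminated string of the form "NAME=value".  Returns a pointer to
 * the value inside buf, or NULL when no complete entry is found.
 * find_env_value is a hypothetical helper, not part of the posted program.
 */
const char *
find_env_value(const char *buf, size_t n, const char *name)
{
	size_t len = strlen(name);
	size_t i;

	if (len == 0 || n < len + 1)
		return NULL;
	/* i is one past the candidate start, so the loop can stop at 0 */
	for (i = n - len; i > 0; --i) {
		const char *cp = buf + i - 1;
		size_t rest = n - (size_t)(cp - buf) - len - 1;

		/* an entry starts at the buffer start or just after a NUL */
		if (cp != buf && cp[-1] != '\0')
			continue;
		if (strncmp(cp, name, len) != 0 || cp[len] != '=')
			continue;
		/* accept only values terminated inside the bytes we hold */
		if (memchr(cp + len + 1, '\0', rest) == NULL)
			continue;
		return cp + len + 1;
	}
	return NULL;
}
```

Given a buffer holding "TERM=vt100", "REMOTEHOST=alpha.example" and
"HOME=/usr/people/guest" as consecutive NUL-terminated strings, a call
such as find_env_value(buf, n, "REMOTEHOST") points at "alpha.example".
Requiring a NUL before the name keeps a variable called XREMOTEHOST from
matching, and the memchr() check rejects a value that was cut off at the
end of the buffer.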