[unix-pc.sources] "Unstrip" command

ditto@cbmvax.UUCP (Michael "Ford" Ditto) (08/23/88)

Here is the "unstrip" source and man page (nroff -man -T37 unstrip.1 | col)
as advertised in unix-pc.bugs.  Basically, you run

	$ ld -r /lib/shlib.ifile ${program}
	$ unstrip a.out
	$ adb

And you are debugging an "unstripped" version of ${program}.  Only the
"public" global symbols from the shared library will be present, but
you can also add in any symbols you discover by putting them in an
ifile and including it in the ld line.  For example, if you discover
that "main" is at address 0x80036, you could put

	main = 0x80036;

in "mysyms", add the name "mysyms" to the ld command line above,
and run unstrip again.
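
If you collect many such addresses, writing the ifile lines by hand gets
tedious.  The following is only a sketch of how the entries could be
generated; format_ifile_entry() is a hypothetical helper that is not part
of unstrip, and it assumes nothing about ifile syntax beyond the
"name = 0xADDR;" form shown above.  It also assumes a C library that
provides snprintf().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one ifile assignment of the form
 * "name = 0xADDR;" followed by a newline into out.
 * Returns the number of characters written (not counting the NUL),
 * or -1 if the buffer is too small.
 */
int format_ifile_entry(char *out, size_t outsize, const char *name,
                       unsigned long addr)
{
    int n;

    assert(out != NULL && name != NULL);

    n = snprintf(out, outsize, "%s = 0x%lx;\n", name, addr);
    if (n < 0 || (size_t)n >= outsize)
        return -1;      /* output error or truncated */
    return n;
}
```

Calling format_ifile_entry(line, sizeof line, "main", 0x80036UL) fills
"line" with the same assignment used in the example above.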

This program allows easier debugging of non-stripped programs which
include the shared library, since it makes the shared-library symbols
accessible to adb (you see "jsr printf" instead of "jsr 0x3000c6" or
whatever).  Unstrip is also useful for tracking down bugs in system
programs, as I demonstrated in a recent unix-pc.bugs article.

Speaking of that article, I realized later that I was setting a bad
precedent when I posted a dozen or so lines of disassembled code from
the passwd program...  The Unix PC License agreement not only forbids
distributing any part of Unix, but also forbids disassembly itself,
even for personal use!  Now, AT&T is known for having a constructive
attitude toward posting of small code excerpts for bug-elimination
purposes, but we should all keep the license agreement in mind.

Technical note:  All unstrip does is patch the "n_scnum" (section number)
field in each absolute symbol entry.  The absolute symbols taken from the ifile
are not placed in any section by ld (because ld doesn't know where
/lib/shlib's sections are).  Unstrip looks at /lib/shlib to find out
where its text and data are and makes the necessary changes to the
file it is "unstripping".
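
The patching rule can be sketched in isolation.  This is a minimal
illustration rather than the code from unstrip.c below: the addr_range
type and the classify_symbol() function are hypothetical names made up
for this sketch, and it assumes the library's text and data regions are
given as half-open address ranges.

```c
#include <assert.h>

/* Hypothetical half-open address range [start, end) of one library region. */
struct addr_range {
    long start;
    long end;
};

/*
 * Return the section number an absolute symbol should be patched to,
 * or -1 (leave it absolute) if its value lies in neither region.
 * text_scn and data_scn are the 1-based section numbers of the text
 * and data sections in the file being patched.
 */
int classify_symbol(long value, struct addr_range text, struct addr_range data,
                    int text_scn, int data_scn)
{
    assert(text.start <= text.end && data.start <= data.end);

    if (value >= text.start && value < text.end)
        return text_scn;
    if (value >= data.start && value < data.end)
        return data_scn;
    return -1;
}
```

A symbol whose value falls outside both ranges keeps its absolute
marking, which is the case unstrip reports as "Ignoring address".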

Now, here are the goodies...

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	unstrip.1
#	unstrip.c
# This archive created: Sun Aug 21 20:53:23 1988
export PATH; PATH=/bin:$PATH
echo shar: extracting "'unstrip.1'" '(2370 characters)'
if test -f 'unstrip.1'
then
	echo shar: will not over-write existing file "'unstrip.1'"
else
sed 's/^X//' << \SHAR_EOF > 'unstrip.1'
X.TH UNSTRIP 1
X.SH NAME
Xunstrip \- add shared library symbols to stripped executable
X.SH SYNOPSIS
X.B unstrip
X[ -l libname ] files ...
X.SH DESCRIPTION
X.I Unstrip
Xmodifies the symbol table entries in the executable
X.I files,
Xmarking them as belonging to either the "text" or "data" section.
XThis allows the absolute symbols defined by the shared library ifile,
X.B /lib/shlib.ifile,
Xto be recognized and displayed by such programs as adb(1).
X.P
X.I Unstrip
Xrequires a symbol table to exist already.  The symbol table
Xmight be the result of linking the program with the shared library ifile
Xand not stripping it, or a stripped program can be re-linked with the
Xifile using the -r option.
X.I Unstrip
Xis then run, modifying the absolute symbols into text and data references
Xwhich can be seen by adb.
X.SH OPTIONS
XThe -l option causes
X.I libname
Xto be used instead of
X.B /lib/shlib
Xto find the locations of the text and data areas.  This option is normally
Xnot needed.
X.SH EXAMPLES
Xld -r /lib/shlib.ifile /bin/sync
X.br
Xunstrip a.out
X.br
X.RS
XThis adds the shared library public symbols to the "sync" program, placing
Xthe result in "a.out".  The symbol entries have been "fixed" so that if adb
Xis run on "a.out" it will correctly display names of shared library
Xfunctions.
X.RE
X.P
Xscc -o myprogram myprogram.c
X.br
Xunstrip myprogram
X.br
X.RS
XThis allows "myprogram" to be debugged with calls to library functions
Xbeing disassembled with the proper symbolic name.
X.B "scc"
Xis the shared-library-cc command, equivalent to cc(1) but using the shared
Xlibrary.
X.RE
X.SH DIAGNOSTICS
Xunstrip: filename: No symbols
X.br
X.RS
X.I filename
Xhas no symbols.  Perhaps they need to be added using "ld -r" as shown in
Xthe example above.
X.RE
X.P
XIgnoring address <address>
X.br
X.RS
XA symbol for
X.I <address>
Xdoes not belong to any section, but <address> is not within the text or
Xdata sections of
X.I libname
X(/lib/shlib).  This will not happen with stripped programs, but
Xabsolute symbols defined by the linker (like end, etext, edata)
Xcan fall into this category.
X.RE
X.P
XOther error messages are meant to be self-explanatory.
X.SH WARNING
XThe license agreement for Unix on the Unix PC specifies that the binary
Xexecutables provided under that agreement may not be disassembled or
Xotherwise converted to source code form.  The author of "unstrip" does
Xnot recommend violation of the Unix license agreement.
SHAR_EOF
if test 2370 -ne "`wc -c < 'unstrip.1'`"
then
	echo shar: error transmitting "'unstrip.1'" '(should have been 2370 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'unstrip.c'" '(4677 characters)'
if test -f 'unstrip.c'
then
	echo shar: will not over-write existing file "'unstrip.c'"
else
sed 's/^X//' << \SHAR_EOF > 'unstrip.c'
X/************************************************************
X *
X * This program was written by me, Mike "Ford" Ditto, and
X * I hereby release it into the public domain in the interest
X * of promoting the development of free, quality software
X * for the hackers and users of the world.
X *
X * Feel free to use, copy, modify, improve, and redistribute
X * this program, but keep in mind the spirit of this
X * contribution; always provide source, and always allow
X * free redistribution (shareware is fine with me).  If
X * you use a significant part of this code in a program of
X * yours, I would appreciate being given the appropriate
X * amount of credit.
X *				-=] Ford [=-
X *
X ************************************************************/
X
X#include <stdio.h>
X#include <fcntl.h>
X#include <a.out.h>
X
Xextern char *ctime();
Xextern long ftell();
X
X
Xchar *progname;
X
Xchar *libname = "/lib/shlib";
Xint textscn, datascn;
Xlong textstart, textend, datastart, dataend;
Xstruct filehdr filehdr;
Xstruct aouthdr aouthdr;
Xstruct scnhdr scnhdr;
Xstruct syment syment;
X
Xchar buf[256];
X
Xmain(argc, argv)
Xint argc;
Xchar *argv[];
X{
X    int status=0;
X    FILE *fp;
X
X    progname = *argv;
X
X    while ((++argv,--argc) && **argv=='-' && argv[0][1])
X    {
X	register char c, *p;
X
X	p = *argv+1;
X	while (c= *p++) switch(c)
X	{
X	case 'l':
X	    if (!(++argv, --argc))
X		goto badflag;
X	    libname = *argv;
X	    break;
X	default:
X	badflag:
X	    fprintf(stderr, "%s: bad flag `-%c'\n", progname, c);
X	usage:
X	    fprintf(stderr, "Usage: %s [-l libname] files\n", progname);
X	    return -1;
X	}
X    }
X
X    if ( (fp=fopen(libname, "r")) == NULL )
X    {
X	sprintf(buf, "%s: can't open %s", progname, libname);
X	perror(buf);
X	return -1;
X    }
X
X    if (readhdrs(fp, libname))
X    {
X	fclose(fp);
X	return -1;
X    }
X
X    textstart = aouthdr.text_start;
X    textend = aouthdr.text_start + aouthdr.tsize;
X    datastart = aouthdr.data_start;
X    dataend = aouthdr.data_start + aouthdr.dsize + aouthdr.bsize;
X
X#ifdef DEBUG
X    printf("textstart = 0x%06lx\n", textstart);
X    printf("textend   = 0x%06lx\n", textend);
X    printf("datastart = 0x%06lx\n", datastart);
X    printf("dataend   = 0x%06lx\n", dataend);
X#endif /* DEBUG */
X
X    while (argc-- > 0)
X	status += unstrip(*argv++);
X
X    return status;
X}
X
X
Xunstrip(file)
Xchar *file;
X{
X    int i;
X    FILE *fp;
X
X    if ( (fp=fopen(file, "r+")) == NULL )
X    {
X	sprintf(buf, "%s: can't open %s", progname, file);
X	perror(buf);
X	return 1;
X    }
X
X    if (readhdrs(fp, file))
X    {
X	fclose(fp);
X	return 1;
X    }
X
X    for ( i = 0 ; i<filehdr.f_nscns ; ++i )
X    {
X	if (fread((char *)&scnhdr, sizeof scnhdr, 1, fp) != 1)
X	{
X	    sprintf(buf, "%s: error reading scnhdr from %s", progname, file);
X	    perror(buf);
X	    fclose(fp);
X	    return 1;
X	}
X
X	if (scnhdr.s_flags & STYP_TEXT)
X	    textscn = i+1;
X	else if (scnhdr.s_flags & STYP_DATA)
X	    datascn = i+1;
X    }
X
X    if (!filehdr.f_symptr)
X    {
X	fprintf(stderr, "%s: %s: No symbols\n", progname, file);
X	fclose(fp);
X	return 1;
X    }
X
X    if (fseek(fp, filehdr.f_symptr, 0))
X    {
X	sprintf(buf, "%s: can't seek to symbol information in %s",
X		progname, file);
X	perror(buf);
X	fclose(fp);
X	return 1;
X    }
X
X    for ( i=0 ; i<filehdr.f_nsyms ; ++i )
X    {
X	if (fread(&syment, sizeof syment, 1, fp) != 1)
X	{
X	    sprintf(buf, "%s: can't read symbol information in %s",
X		    progname, file);
X	    perror(buf);
X	    fclose(fp);
X	    return 1;
X	}
X
X	if (syment.n_scnum == -1)
X	{
X	    if (syment.n_value >= textstart && syment.n_value < textend)
X		syment.n_scnum = textscn;
X	    else if (syment.n_value >= datastart && syment.n_value < dataend)
X		syment.n_scnum = datascn;
X	    else
X		fprintf(stderr, "Ignoring address 0x%06lx\n", syment.n_value);
X
X	    if (syment.n_scnum>0)
X	    {
X		fseek(fp, -(long) (sizeof syment), 1);
X		if (fwrite(&syment, sizeof syment, 1, fp) != 1)
X		{
X		    sprintf(buf, "%s: can't update symbol information in %s",
X			    progname, file);
X		    perror(buf);
X		    fclose(fp);
X		    return 1;
X		}
X		fseek(fp, 0L, 1);
X	    }
X	}
X
X	if (syment.n_numaux)
X	{
X	    fseek(fp, (long) sizeof (union auxent) * syment.n_numaux, 1);
X	    i += syment.n_numaux;
X	}
X    }
X
X    fclose(fp);
X    return 0;
X}
X
X
Xreadhdrs(fp, file)
XFILE *fp;
Xchar *file;
X{
X    if (fread((char *)&filehdr, sizeof filehdr, 1, fp) != 1)
X    {
X	sprintf(buf, "%s: error reading filehdr from %s", progname, file);
X	perror(buf);
X	return 1;
X    }
X
X    if (filehdr.f_opthdr != sizeof (aouthdr))
X    {
X	fprintf(stderr, "%s: no aouthdr in %s\n", progname, file);
X	return 1;
X    }
X
X    if (fread((char *)&aouthdr, sizeof aouthdr, 1, fp) != 1)
X    {
X	sprintf(buf, "%s: error reading aouthdr from %s", progname, file);
X	perror(buf);
X	return 1;
X    }
X
X    return 0;
X}
SHAR_EOF
if test 4677 -ne "`wc -c < 'unstrip.c'`"
then
	echo shar: error transmitting "'unstrip.c'" '(should have been 4677 characters)'
fi
fi # end of overwriting check
#	End of shell archive
exit 0
-- 
					-=] Ford [=-

	.		.		(In Real Life: Mike Ditto)
.	    :	       ,		ford@kenobi.cts.com
This space under construction,		...!ucsd!elgar!ford
pardon our dust.			ditto@cbmvax.commodore.com