[comp.os.minix] More on sh/shar

paradis@encore.UUCP (Jim Paradis) (05/28/87)

Remember my last message complaining about /bin/sh being unable to
unpack "real" shell archives?  Well, last night I got SO sick&tired
of the situation that I took a deep breath and dove head-first into
the code for /bin/sh.  Getting it to compile was a feat unto itself...
(seems the MINIX compiler handles globals and externs differently
from the PC/IX compiler...). 

Side note:  Why is it that command interpreter code is almost ALWAYS
devoid of useful comments?  I've had to swim around in System V sh,
BSD4.2 csh and MINIX sh, and in all three cases the only useful
comments in the code were those added after the fact by the local
developers...

Anyway, after dredging through the code, I found that my worst fears
were indeed true:  on "<<" input redirections, sh does indeed try to
read the ENTIRE redirected input into memory.  What's REALLY stupid
is that when it actually goes to feed the redirected input into a
command, it creates a temporary file and writes the data out to the
file.  Why not just dump the data into the temporary file in the
first place and skip the memory-hogging?  What's REALLY REALLY stupid
is that after it creates the temp file, it doesn't bother to release
the memory that was used to store the data!!

Anyhow, I modified the shell so as to read the data in and dump
it directly into a temporary file -- the only in-memory buffering
is that of the line currently being read (limit of 512 chars).
The damn thing actually worked the first time, too!  Anyway, as soon
as I'm done cleaning up the code a bit (making sure it releases all 
the memory it allocates, making sure it properly deletes the temporary
file, etc.), I'll post the changes for all to share.  I'll also post
the changes that were required for me to recompile the thing under
MINIX...
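
For readers who want to picture the approach described above, here is a minimal sketch of streaming a "<<" document straight into a file while holding only the current line in memory. It is not the MINIX sh code or the actual patch: the function name copy_heredoc, the HERE_LINE_MAX constant, and the use of stdio streams are all assumptions made for illustration. It also assumes every input line fits in the 512-char buffer.

```c
#include <stdio.h>
#include <string.h>

#define HERE_LINE_MAX 512   /* hypothetical name; mirrors the 512-char limit */

/*
 * Copy here-document text from `in` to `out` one line at a time, stopping
 * at a line that consists solely of `delim` (or at end of input).
 * Only the line currently being read is buffered in memory.
 * Returns the number of lines copied, or -1 on a write error.
 * Assumption: no input line is longer than HERE_LINE_MAX - 1 characters.
 */
long copy_heredoc(FILE *in, FILE *out, const char *delim)
{
    char line[HERE_LINE_MAX];
    size_t dlen = strlen(delim);
    long copied = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);

        /* The delimiter line is exactly `delim` plus a newline (or EOF). */
        if (strncmp(line, delim, dlen) == 0 &&
            (line[dlen] == '\n' || line[dlen] == '\0'))
            break;

        if (fwrite(line, 1, len, out) != len)
            return -1;
        copied++;
    }
    return copied;
}
```

A caller would presumably open the temporary file first (tmpfile() is one standard way), pass it as `out`, then rewind it and hand it to the command as its standard input, closing or deleting it afterwards.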

   +----------------+  Jim Paradis                  linus--+
+--+-------------+  |  Encore Computer Corp.       necntc--|
|  | E N C O R E |  |  257 Cedar Hill St.           ihnp4--+-encore!paradis
|  +-------------+--+  Marlboro MA 01752           decvax--|
+----------------+     (617) 460-0500             talcott--+
You don't honestly think ENCORE is responsible for this??!!

rmtodd@uokmax.UUCP (Richard Michael Todd) (05/31/87)

In article <1643@encore.UUCP>, paradis@encore.UUCP (Jim Paradis) writes:
> Remember my last message complaining about /bin/sh being unable to
> unpack "real" shell archives?  Well, last night I got SO sick&tired
> of the situation that I took a deep breath and dove head-first into
> the code for /bin/sh.  Getting it to compile was a feat unto itself...
> (seems the MINIX compiler handles globals and externs differently
> from the PC/IX compiler...). 
Yep.  Actually it's a difference between how UNIX-type linkers (of which
PC/IX's is apparently one) and most everybody else's handle globals.
Under UNIX, you can have multiple files each with a global var. defined,
say:
file foo.c:
	int a;
...

file bar.c:
	int a;
...
and the UNIX linker will see the variable declared multiple times in the
modules and allocate space in the result file only once.  Most other linkers
consider duplicate variable declarations to be errors.  For C programs
on those systems, the global definition can appear in only one file, and
the other files must use extern declarations (e.g. "extern int a;").
On a side note, I've also gotten the shell to recompile, both under MINIX
and under Aztec C (running the output thru dos2out).  I changed all the 
offending definitions in sh.h (which gets included in every file) to
read "EXTERN int whatever;" and #defined EXTERN extern in every file except
one.  Very minimal changes were needed to get it to compile under Aztec
C after I'd gotten it to compile under MINIX cc.
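
The EXTERN trick described above can be sketched as follows. This is an illustration only, not the actual sh.h: the SH_H_GLOBALS macro is a hypothetical stand-in for the header's contents, used so that one file can show both expansions, and set_whatever/get_whatever are made-up helpers. In a real source tree each .c file would #define EXTERN and then #include "sh.h".

```c
/*
 * Hypothetical stand-in for the globals in sh.h.  It is a macro only so
 * this single-file sketch can "include" the header twice.
 */
#define SH_H_GLOBALS  EXTERN int whatever;

/* What every file except one sees: EXTERN is "extern", so the line is
 * a declaration only and reserves no storage. */
#define EXTERN extern
SH_H_GLOBALS
#undef EXTERN

/* What the one chosen file sees: EXTERN is empty, so the same line
 * becomes the single definition that the linker allocates. */
#define EXTERN
SH_H_GLOBALS
#undef EXTERN

/* Any file that saw the declaration can use the variable. */
void set_whatever(int v)
{
    whatever = v;
}

int get_whatever(void)
{
    return whatever;
}
```

With this layout a strict linker sees exactly one definition of each global, and the UNIX-style linkers accept it as well, so the same source should build in both environments.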
--------------------------------------------------------------------------
Richard Todd
USSnail:820 Annie Court,Norman OK 73069
UUCP: {allegra!cbosgd|ihnp4}!okstate!uokmax!rmtodd

fdg@sortac.UUCP (06/03/87)

In article <581@uokmax.UUCP> rmtodd@uokmax.UUCP (Richard Michael Todd) writes:
>In article <1643@encore.UUCP>, paradis@encore.UUCP (Jim Paradis) writes:
>> the code for /bin/sh.  Getting it to compile was a feat unto itself...
>> (seems the MINIX compiler handles globals and externs differently
>> from the PC/IX compiler...). 
>Yep.  Actually it's a difference between how UNIX-type linkers (of which
>PC/IX's is apparently one) and most everybody else's handle globals.
>Under UNIX, you can have multiple files each with a global var. defined,
>say:
>file foo.c:
>	int a;
>...
>
>file bar.c:
>	int a;
>...
>and the UNIX linker will see the variable declared multiple times in the
>modules and allocate space in the result file only once.  Most other linkers
>consider duplicate variable declarations to be errors.  For C programs
>on those systems, the global definition can appear in only one file, and
>the other files must use extern declarations (e.g. "extern int a;").
>On a side note, I've also gotten the shell to recompile, both under MINIX
>and under Aztec C (running the output thru dos2out).  I changed all the 
>offending definitions in sh.h (which gets included in every file) to
>read "EXTERN int whatever;" and #defined EXTERN extern in every file except
>one.  Very minimal changes were needed to get it to compile under Aztec
>C after I'd gotten it to compile under MINIX cc.
The "BOOK" is on order; therefore, I do not have the MINIX code yet and
cannot comment on its handling of globals.  I do, however, use C and UNIX
daily and can comment on that.  In your example, if "int a" sits inside a
function, it is not global in either case; it is local to that function.
(Declared outside any function without "static", it is global; only
"static" would make it local to the file that declares it.)  To be
global, it must be declared outside of any function, whether directly in
the .c file or in a header file such as sh.h.  An example would be

int a, b;  /* These are global. */

main()
{
	extern int a, b;   /* a & b are globals used by main */
	int c, x;          /* c, x, & y are local to main    */
	char *y;
...
	foo(x,y);
...
	bar();
...
}

foo(w,m)
int w;                     /* w & m are local to foo */
char *m;
{
	extern int a;      /* a is a global used by foo */
...
}

bar()
{
	extern int b;      /* b is a global used by bar */
	int a;             /* a is local to bar - Note: the
			   ** compiler and linker should complete
			   ** but lint will warn you that a
			   ** has been redeclared here masking the
			   ** global a.
			   */
...
}

This posting is not meant to be critical.  I am attempting to clear up a
point of confusion.

Fred Gant  akgua!sortac!fdg
AT&T Network Systems