[alt.sources.wanted] Problem with PD ksh, want sdb

bengtl@maths.lth.se (Bengt Larsson) (12/16/90)

Help! I've tried to compile the PD ksh posted to alt.sources a while ago.
This is on a 3B2, SYSV.3.0.

The problem is: ksh doesn't seem to recognize end-of-file, neither from
the terminal, a file, or a file included with the "." command. It just
hangs. Otherwise ksh seems to work :-)

The only debugger on this machine is sdb, and I haven't got any manpages.
Could anyone send the manpage for sdb to me? Thanks!

(Or alternatively: has anyone else had the same problem with the PD ksh
 and solved it? More thanks!)
 
Bengt Larsson, hacking on the 3B2 which only has "sh".

-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET:    TYCHE::BENGT_L

wcs) (12/17/90)

In article <1990Dec16.143234.21154@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
> Help! I've tried to compile the PD ksh posted to alt.sources a while ago.
> This is on a 3B2, SYSV.3.0.
> The problem is: ksh doesn't seem to recognize end-of-file, neither from
> the terminal, a file, or a file included with the "." command. It just

I haven't tried it, since we have the REAL ksh on our 3B2s, but I
can guess the problem.  You probably need to set the #defines or
other parameters to indicate that your machine uses unsigned characters.

The classic hunk of code that demonstrates this is:
	char c;	/* WRONG - SHOULD BE int c; */
	while ( (c=getchar()) != EOF ) do_something(c);
If you look at the getchar() manual page, it says that getchar()
returns an int, which will be EOF ( -1 ) at end of file, or the numeric
value of the character read if successful.  Some machines use signed
characters (8 bits, values from -128 to 127), while others like the
3B2 use unsigned (8 bits, values 0 to 255), while others have
different-sized bytes.  Note that EOF is NOT a valid unsigned-character.

Assuming your machine does twos-complement integer arithmetic,
like most modern machines, here's what happens:
	getchar hits EOF, returns -1 = 1..1 11111111   (16 or 32 1's)
	value is stored in char c = 11111111	(8 1's)
	to compare c with (integer) EOF, expand c to int:
		1..1 11111111	(if characters are signed)
		0..0 11111111	(if characters are unsigned)
	compare the expanded version with 1..1 11111111 :
		TRUE	(if signed)
		FALSE	(if unsigned)
If you had correctly declared int c; the return-value from getchar()
would have been stored as 1..1 11111111, compared with EOF, and
resulted in TRUE, regardless of whether characters are signed or unsigned.
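
The two cases can be reproduced on a single machine by spelling out the
signedness instead of relying on plain char. This is only a sketch: the
function names are invented, and it assumes 8-bit chars and a negative
EOF that fits in a signed char (EOF is -1 on the systems discussed here).

```c
#include <stdio.h>

/* What the buggy comparison does where plain char is signed. */
int eof_seen_signed(void)
{
    signed char c = (signed char)EOF;   /* stored as 11111111 */
    return c == EOF;                    /* c sign-extends back to EOF */
}

/* What the buggy comparison does where plain char is unsigned (3B2). */
int eof_seen_unsigned(void)
{
    unsigned char c = (unsigned char)EOF;   /* stored as 11111111 */
    return c == EOF;                        /* c zero-extends to 255 */
}
```

Under those assumptions eof_seen_signed() returns 1 and
eof_seen_unsigned() returns 0, which is the "just hangs" case: the loop
never sees end-of-file.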

What this means for the user is that buggy code like the above will
work fine on VAXes and Suns, but fail on 3B2s, just like buggy code
which dereferences NULL pointers will work fine on VAXes and fail on 3B2s.
Is this good or bad?  I used to think that the unsigned-character
behavior was annoying, but then I started doing graphics work on a
signed-character machine.  The character 11111111 doesn't happen
much in ASCII text, so code with the char c; bug works fine.
But in graphics, 11111111 and 00000000 are white and black pixels or
groups of pixels, and are very common - the buggy code will stop
reading the first time it hits a white spot on the screen.
Two things I did a lot were using a character as an index into an
array, and comparing the brightness of two pixels -
both of these are easy with unsigned characters and annoying with
signed characters.
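
A small sketch of those two operations, with the pixel type declared
unsigned char explicitly so it does not depend on the machine. The names
histogram and brighter are made up for this example, and 8-bit pixels
are assumed.

```c
/* Hypothetical helper: count how often each 8-bit pixel value occurs. */
void histogram(const unsigned char *pixels, int n, long counts[256])
{
    int i;

    for (i = 0; i < 256; i++)
        counts[i] = 0;
    for (i = 0; i < n; i++)
        counts[pixels[i]]++;    /* index is always in 0..255 */
}

/* Hypothetical helper: nonzero if pixel a is brighter than pixel b. */
int brighter(unsigned char a, unsigned char b)
{
    return a > b;               /* 11111111 (white) beats 00000000 (black) */
}
```

With signed characters, a white pixel would typically become -1, which
is a negative array index and compares as darker than black, so both
helpers would need casts or masking.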

Anyway, back to your PD-KSH on 3B2 problem.  If PD-KSH is written
correctly, there will be a #define in one of its .h files or a
variable in a Makefile that indicates whether you have signed or
unsigned characters, or does something like
	typedef char small;
which you will have to change to
	typedef short int small;
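
If you are not sure which kind of machine you have, a quick check like
the following should tell you. It is a sketch, not something taken from
the PD ksh sources, and the function name is invented.

```c
/* Hypothetical check: returns 1 if plain char is signed, 0 if unsigned. */
int plain_char_is_signed(void)
{
    char c = -1;    /* becomes -1 if char is signed, 255 if unsigned */

    return c < 0;
}
```

From the description above, this would be expected to return 0 on a 3B2
and 1 on a VAX or Sun.
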
-- 
# Bill Stewart 908-949-0705 erebus.att.com!wcs AT&T Bell Labs 4M-312 Holmdel NJ
# "If it weren't for us, American troops would be invading exotic places like
# Lebanon and Grenada, and the Air Force would do stuff like bombing Libya"
#				Abbie Hoffman