[net.lang.c] Questions about C on a Prime

hokey@plus5.UUCP (Hokey) (03/19/86)

I went over to the local Prime office the other day in order to port some
software.  After the initial shock of discovering I will have to run
each file through dd in order to put it into the compressed/high-bit-on format
(a real thrill, they should have a special tar program to do it all at once),
I was told that their C port keeps the 8th bit *on* for all ascii characters.

This seems kind of strange.  Near as I can tell, this means the following
code fragment won't work:

	strcpy(dst, src)
	    char *dst;
	    char *src; {

		while(*dst++ = *src++) ;
		return;}

Can this be true?  If so, are there any other details I should know before
I try to do this again?
-- 
Hokey		..{ihnp4,seismo}!plus5!hokey
		314-725-9492

jsdy@hadron.UUCP (Joseph S. D. Yao) (03/22/86)

In article <988@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>I was told that their C port keeps the 8th bit *on* for all ascii characters.
>This seems kind of strange.  Near as I can tell, this means the following
>code fragment won't work:
>	strcpy(dst, src)
>	    char *dst;
>	    char *src; {
>		while(*dst++ = *src++) ;
>		return;}

I've never really liked this kind of code:  it always seemed to me
to be assuming something that, someday, on some weird machine, would
fail.  Surprise!
My rule is to always use an explicit reference to a defined constant:
	while ((*dst++ = *src++) != NUL);
This way, if my character set changes, I worry about this less.  By
the way, will your P**** machine take a constant like
	#define NUL	'\0'
and turn it into an eighth-high character?

By the way, your indentation above seems to me a tad peculiar.
char *strcpy(dst, src)
  char *dst;
  register char *src;
{
	register char *d = dst;

	while ((*d++ = *src++) != NUL);
	return(dst);
}
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

gwyn@BRL.ARPA (VLD/VMB) (03/24/86)

Any C implementation that insists on having the high bit set
for normal character (char)s should also treat (char) as
(unsigned char), or else there will be sign-propagation
problems when (char)s are widened to (int).  Otherwise,
this is a permissible (although unusual) implementation choice.
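
The sign-propagation concern can be made concrete with a small sketch.
It assumes 8-bit chars and a two's-complement machine; the function names
widen_signed and widen_unsigned and the sample byte 0xC1 (ASCII 'A' with
the high bit on) are illustrative only, not taken from Prime documentation.

```c
/* Widening a char whose high bit is set: signed vs. unsigned.
 * 0xC1 is ASCII 'A' (0x41) with the 8th bit turned on; the value
 * is an assumption chosen for illustration. */

/* Store the byte in a signed char, then widen to int.
 * Converting a value above 127 to signed char is implementation-defined;
 * on the usual two's-complement machines the sign bit is propagated. */
int widen_signed(unsigned char byte)
{
	signed char c = (signed char)byte;
	return c;
}

/* Store the byte in an unsigned char, then widen to int.
 * The result is always in the range 0..255. */
int widen_unsigned(unsigned char byte)
{
	unsigned char c = byte;
	return c;
}
```

On such a machine widen_signed(0xC1) comes out negative while
widen_unsigned(0xC1) is 193, so something like table[c] is only safe
when high-bit-on characters are treated as unsigned.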

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/24/86)

In article <325@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>In article <988@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>>I was told that their C port keeps the 8th bit *on* for all ascii characters.
>>This seems kind of strange.  Near as I can tell, this means the following
>>code fragment won't work:
>>	strcpy(dst, src)
>>	    char *dst;
>>	    char *src; {
>>		while(*dst++ = *src++) ;
>>		return;}
>I've never really liked this kind of code:  it always seemed to me
>to be assuming something that, someday, on some weird machine, would
>fail.  Surprise!
>My rule is to always use an explicit reference to a defined constant:
>	while ((*dst++ = *src++) != NUL);
>This way, if my character set changes, I worry about this less.  By
>the way, will your P**** machine take a constant like
>	#define NUL	'\0'
>and turn it into an eighth-high character?

No, the NUL terminator (both in practice and as you #define it)
will be a 0-valued byte.  The original code, although ugly, is
correct.
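
A minimal sketch of why the loop still terminates, assuming text
characters carry bit 0x80 while the terminator stays a 0-valued byte.
The names copy_str and set_high_bits are hypothetical; copy_str is the
same loop under another name so it does not clash with the library
strcpy.

```c
/* The copy loop from the original post, returning dst as the
 * library routine does.  It stops on a 0-valued byte only. */
char *copy_str(char *dst, const char *src)
{
	char *d = dst;

	while ((*d++ = *src++) != '\0')
		;
	return dst;
}

/* Hypothetical helper: turn on the 8th bit of every character in s.
 * The terminator is left alone, so it remains 0. */
void set_high_bits(char *s)
{
	for (; *s != '\0'; s++)
		*s = (char)((unsigned char)*s | 0x80);
}
```

No character with the high bit on can equal 0, so the test in the loop
is unaffected by the character representation.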

blarson@oberon.UUCP (Bob Larson) (03/26/86)

[I answered the original author's questions via mail.  I'm replying to
the replies posted by people not using Primes.]

Prime C does normally use characters with the most significant bit set.
Under Primos, this is for compatibility with their other compilers and
utilities.  On Primix (Prime's port of Unix Sys Vr0 on a Primos kernel
and filesystem, running concurrently with Primos), this is due to sharing
the file system with Primos.  Character constants of the form '\nnn' do not
have the high bit set unless nnn > 177.  '\0' is still the normal string
terminator.  As Doug Gwyn deduced, characters are unsigned.

There is a compiler option to produce character constants with the high
bit cleared.  Using this option will probably force you to write all of
your own i/o routines.
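
As a rough idea of the least such home-grown i/o routines would have to
do, here is a sketch that translates a buffer between plain ASCII and
the high-bit-on form.  It assumes the high bit is the only difference,
which is not the whole story for Prime disk files.  The names
to_high_bit and from_high_bit are hypothetical, not part of any Prime
library.

```c
#include <stddef.h>

/* Hypothetical helper: set the 8th bit on exactly n data bytes.
 * Do not include a string terminator in n, or it stops being 0. */
void to_high_bit(unsigned char *buf, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		buf[i] |= 0x80;
}

/* Hypothetical helper: clear the 8th bit on exactly n data bytes,
 * giving back ordinary 7-bit ASCII. */
void from_high_bit(unsigned char *buf, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		buf[i] &= 0x7F;
}
```

A read routine would call from_high_bit on each buffer it fills, and a
write routine would call to_high_bit just before handing data on.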

Actually, the high bit is not the most obnoxious thing about the Prime
file system.  Other things to watch for are space compression, trailing
blank deletion, and padding lines to an even number of characters.  
(Disk files cannot contain an odd number of bytes.)

My experience with Primix is from a conference almost a year ago, before
it was released.  I do use the C compiler under Primos.  (The same compiler
is used with Primix, but with different libraries.)

(Character handling is a dog on Prime C without the -ix option and a Prime
that will handle it.)

-- 
Bob Larson
Arpa: Blarson@Usc-Ecl.Arpa
Uucp: ihnp4!sdcrdcf!oberon!blarson

bud@hcrvx1.UUCP (Bud Greasley) (03/27/86)

In article <2023@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>In article <325@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>>In article <988@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>>>I was told that their C port keeps the 8th bit *on* for all ascii characters.
>>>This seems kind of strange.  Near as I can tell, this means the following
>>>code fragment won't work:
>>>	strcpy(dst, src)
>>>	    char *dst;
>>>	    char *src; {
>>>		while(*dst++ = *src++) ;
>>>		return;}
>>I've never really liked this kind of code:  it always seemed to me
>>to be assuming something that, someday, on some weird machine, would
>>fail.  Surprise!
>>My rule is to always use an explicit reference to a defined constant:
>>	while ((*dst++ = *src++) != NUL);
>>This way, if my character set changes, I worry about this less.  By
>>the way, will your P**** machine take a constant like
>>	#define NUL	'\0'
>>and turn it into an eighth-high character?

>No, the NUL terminator (both in practice and as you #define it)
>will be a 0-valued byte.  The original code, although ugly, is
>correct.

	This is correct.  The Prime C-compiler understands
	'\0' (octal 0) as the null character.  Comparisons such
	as described above will work correctly, as will comparisons
	of the form:
	  	while(*s++ != 0);

In article <2020@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:

>Any C implementation that insists on having the high bit set
>for normal character (char)s should also treat (char) as
>(unsigned char), or else there will be sign-propagation
>problems when (char)s are widened to (int).  Otherwise,
>this is a permissible (although unusual) implementation choice.

	This also is the case.  The Prime C-compiler does not
	do sign extension.
-- 
 	Bud Greasley
	{decvax|watmath|utzoo}!hcr!hcrvax!bud

bob@primerd.uucp (04/07/86)

On the other hand, character IO on a Prime with 32ix mode installed *really* 
screams.

By the way, it's files that are padded, not lines...


Bob Pellegrino
Prime Computer, Inc.

UUCP: decvax!genrad!mit-eddie!primerd!bobsun!bob
ARPA: Pellegrino@bbna

rbj@icst-cmr (Root Boy Jim) (04/09/86)

	Any C implementation that insists on having the high bit set
	for normal character (char)s should also treat (char) as
	(unsigned char), or else there will be sign-propagation
	problems when (char)s are widened to (int).  Otherwise,
	this is a permissible (although unusual) implementation choice.

Both K&R & the ANSI standard say that a printing character is
guaranteed to be positive. If the implementation (hardware) insists on
characters with the high order bit set, then my guess is that it must
be done in very low level code, possibly within putc just before
the char is sent to the device. Varian 620i (sounds like a BMW)
aka V-7x's wanted this as I seem to recall.

In any case, the following program must print `positive', for any
printing character 'x'.

main()
{	char c;
	c = 'x';
	if (c > 0)
		printf("positive\n");
	else	printf("negative\n");
}
	
References:	K&R	page 183	sect 6.1
(4/30/85)	ANSI	page 16		sect C.1.2.5