[comp.cog-eng] Null-terminated C strings

ralphw@TEMP.IUS.CS.CMU.EDU (Ralph Hyre) (12/24/87)

In article <422@anuck.UUCP> jrl@anuck.UUCP (j.r.lupien) writes:
>In article <14116@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
>> In article <174@quick.COM> srg@quick.COM (Spencer Garrett) writes:
>> >2) Having a CHARACTER to mark the end of a string is ever so much
>> >more convenient and efficient than having to compare lengths all the
>> >time
>> 
>> The problem with this is that you must reserve a character.  I've had
>
>This is a good point, but it is not a fatal problem. Note that whatever
>definition of strings you adopt, it is only relevant to string
>operations and string libraries...

This is true, but think about how many operations involving strings there
are in *nix.  One sloppy compiler or doprintf implementation (see below) will
kill you.  I submit that it's easier to screw up with null-terminated
strings than with length + data (or some variant, like dope vectors).
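
To make the comparison concrete, here is a rough sketch of a length + data
string in C.  The names (struct cstr, cstr_make, cstr_equal) are invented for
illustration and do not come from any real library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length travels with the data,
   so a zero byte is just another character. */
struct cstr {
    size_t len;
    const char *data;
};

/* Wrap an existing buffer; the caller keeps ownership of the bytes. */
struct cstr cstr_make(const char *data, size_t len)
{
    assert(data != NULL || len == 0);
    struct cstr s = { len, data };
    return s;
}

/* Equality over the full recorded length, embedded zero bytes included. */
int cstr_equal(struct cstr a, struct cstr b)
{
    return a.len == b.len && (a.len == 0 || memcmp(a.data, b.data, a.len) == 0);
}
```

With the five bytes "ab\0cd", strlen() reports 2 and strcmp() cannot tell
them apart from "ab\0ce", while the counted version sees all five bytes.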

In article <2447@hall.cray.com>, blu@hall.cray.com (Brian Utterback) writes:
>Amen to that.  I just spent hours trying to find out what was wrong with
>a rasterfile to laserprinter filter.  It turned out that the problem is 
>that fprintf cannot output a null.  At least the compiler should issue
>a warning if it eats a null.  I mean, what is the use of being able to
>specify a character in a string (i.e. \000) if the compiler won't really
>use it?  And it KNEW it, and didn't tell me. Sheesh.

Ideally, string libraries that 'eat' characters would internally use
byte-stuffing to encode NULs in the strings.  I guess that would violate
the 'simple but stupid' C philosophy:-)
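
For what it's worth, byte-stuffing could look something like the sketch
below.  The escape byte and the two codes after it are arbitrary choices of
mine, and stuff()/unstuff() are made-up names, not part of any C library.

```c
#include <assert.h>
#include <stddef.h>

#define ESC     0x1B  /* assumed escape byte; any nonzero value would do */
#define ESC_NUL 0x01  /* ESC ESC_NUL stands for a zero byte */
#define ESC_ESC 0x02  /* ESC ESC_ESC stands for a literal ESC */

/* Encode n raw bytes into a null-terminated string containing no zero
   bytes.  out must have room for 2*n + 1 chars.  Returns the encoded
   length, not counting the terminator. */
size_t stuff(const unsigned char *in, size_t n, char *out)
{
    size_t j = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 0)        { out[j++] = ESC; out[j++] = ESC_NUL; }
        else if (in[i] == ESC) { out[j++] = ESC; out[j++] = ESC_ESC; }
        else                   { out[j++] = (char)in[i]; }
    }
    out[j] = '\0';
    return j;
}

/* Decode a stuffed string back into raw bytes.  out never needs more
   room than strlen(in).  Returns the number of raw bytes written. */
size_t unstuff(const char *in, unsigned char *out)
{
    size_t j = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; in[i] != '\0'; i++) {
        if (in[i] == ESC) {
            i++;
            if (in[i] == '\0')
                break;                      /* truncated escape: stop */
            out[j++] = (in[i] == ESC_NUL) ? 0 : ESC;
        } else {
            out[j++] = (unsigned char)in[i];
        }
    }
    return j;
}
```

The cost is that the encoded string can be up to twice as long as the raw
data, and every consumer has to know about the escape convention.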

>As far as terminals are concerned, a null is a null. A driver that 
>returns 0 on no data without setting ERRNO or something is broken,
>and must be fixed.
>> Strings with lengths ALWAYS work.
>> 
>No they don't. Did you read the article you responded to? 
>Given a fixed format count (is it int? short? long?) there is 
>a length of string you can't give the length of, due to overflow.
>Null terminated strings ALWAYS work in this regard. 

Strings with lengths work to the limits of practicality.  A 32-bit length
will handle strings the size of a processor's address space.  If you've
got a string that long you should probably use other data structures.

If your string usage requirements are different, perhaps you could implement
a string package the Lisp hacker's way:

string:	implementation type ('length+data' or 'NULL-terminated' or 'complex')
[this is where those tag bits in LISP-oriented processors come in handy]
	# of substrings
	address of substring #1
	length of substring #1
	address of substring #2
	length of substring #2
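
A rough C rendering of the layout above might look like this.  All of the
names here (str_kind, piece, tstring and the two functions) are hypothetical,
and I'm assuming the simple kinds carry exactly one piece.

```c
#include <stddef.h>
#include <string.h>

/* Tag saying how the pieces should be interpreted. */
enum str_kind { STR_COUNTED, STR_NUL_TERMINATED, STR_COMPLEX };

/* One (address, length) pair from the layout. */
struct piece {
    const char *addr;
    size_t len;        /* ignored when the kind is STR_NUL_TERMINATED */
};

/* Tagged string: type tag, number of substrings, then the pairs. */
struct tstring {
    enum str_kind kind;
    size_t npieces;
    const struct piece *pieces;
};

/* Length of one piece, trusting the stored count unless the tag says
   the piece is null-terminated. */
static size_t piece_len(enum str_kind kind, const struct piece *p)
{
    return kind == STR_NUL_TERMINATED ? strlen(p->addr) : p->len;
}

/* Total length of the whole string across all pieces. */
size_t tstring_length(const struct tstring *s)
{
    size_t total = 0;
    for (size_t i = 0; i < s->npieces; i++)
        total += piece_len(s->kind, &s->pieces[i]);
    return total;
}

/* Copy every piece, in order, into out (which must be large enough).
   No terminator is added.  Returns the number of bytes copied. */
size_t tstring_flatten(const struct tstring *s, char *out)
{
    size_t j = 0;
    for (size_t i = 0; i < s->npieces; i++) {
        size_t n = piece_len(s->kind, &s->pieces[i]);
        memcpy(out + j, s->pieces[i].addr, n);
        j += n;
    }
    return j;
}
```

Concatenation then becomes a matter of building a new list of pairs rather
than copying bytes, which is presumably the attraction of the 'complex' kind.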

Anyway, I think that anything that can be said on this subject has been said.
Please stop.
--
					- Ralph W. Hyre, Jr.

Internet: ralphw@ius2.cs.cmu.edu    Phone:(412)268-{2847,3275} CMU-{BUGS,DARK}
Amateur Packet Radio: N3FGW@W2XO, or c/o W3VC, CMU Radio Club, Pittsburgh, PA