GQ.RLG@forsythe.stanford.edu (Dick Guertin) (02/07/89)
In article <28200268@mcdurb>, aglew@mcdurb.Urbana.Gould.COM writes:
->
->Abstract from: comp.arch 8281, 6 Feb 89, 56 lines.
->
->May I encourage people implementing string libraries to use an extra
->level of indirection? Instead of length immediately preceding the string,
->let length be associated with a pointer to the string. Makes
->substringing operations much easier, and has the ability to reduce
->unnecessary copies (at the risk of increased aliasing).
->
->        +------+---+
->        |length|ptr|
->        +------+---+
->                 |
->          +------+
->          |
->          V
->        +---+---+---+---+---+---+---+---+---+---+---+---+---+
->        | H | E | L | L | O | , |   | W | O | R | L | D | \n|
->        +---+---+---+---+---+---+---+---+---+---+---+---+---+

Such an implementation has adverse effects when the string is sent
to/from an external device, such as a file. The 'length' must be
with the string, or the string needs a terminator character.
Furthermore, when a 'ptr' is changed to point to a new string,
what happens to the 'length' information for the old string?
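
The length-plus-pointer layout in the quoted diagram can be sketched in C roughly as follows. This is a minimal illustration, not code from any of the posters: the names `struct str`, `str_from_cstr`, and `str_sub` are hypothetical, and the sketch assumes the character storage outlives every descriptor that points into it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: the length travels with the pointer,
   not with the character storage. */
struct str {
    size_t      len;
    const char *ptr;
};

/* Wrap a NUL-terminated C string. No bytes are copied. */
static struct str str_from_cstr(const char *s)
{
    struct str d = { strlen(s), s };
    return d;
}

/* Substring of `count` bytes starting at byte `start`.
   Only the descriptor changes; the result aliases the original storage. */
static struct str str_sub(struct str s, size_t start, size_t count)
{
    assert(start <= s.len && count <= s.len - start);
    struct str d = { count, s.ptr + start };
    return d;
}
```

With this layout, `str_sub(str_from_cstr("HELLO, WORLD\n"), 7, 5)` describes "WORLD" without copying it. Because the length and the pointer live in one struct, assigning a new descriptor replaces both together, and the old length is discarded along with the old pointer. The aliasing risk mentioned in the quoted post is visible here too: nothing stops the underlying storage from being changed or freed while a substring descriptor still refers to it.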
jk3k+@andrew.cmu.edu (Joe Keane) (02/08/89)
Dick Guertin writes:
> Such an implementation has adverse effects when the string is sent to/from an
> external device, such as a file. The 'length' must be with the string, or the
> string needs a terminator character.

Look at what read() and write() use: length and pointer. The DMA operations
they do probably use the same thing. For these operations, you can't use a
terminator character, and I don't see why you'd want the length next to the
characters.
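
As a rough illustration of the point about write(), a length-plus-pointer descriptor maps directly onto the arguments of the POSIX call. The sketch below is an assumption-laden example, not anyone's actual library: `struct str` and `str_write` are hypothetical names, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>   /* POSIX write() */

/* Hypothetical descriptor, as in the quoted diagram. */
struct str {
    size_t      len;
    const char *ptr;
};

/* Write every byte the descriptor covers, retrying after short writes.
   Returns 0 on success, -1 on any error reported by write(). */
static int str_write(int fd, struct str s)
{
    assert(s.ptr != NULL || s.len == 0);
    while (s.len > 0) {
        ssize_t n = write(fd, s.ptr, s.len);
        if (n < 0)
            return -1;
        s.ptr += n;            /* skip the bytes already written */
        s.len -= (size_t)n;
    }
    return 0;
}
```

No terminator is consulted anywhere: the byte count alone decides how much is sent, so the data may contain any byte value, including NUL.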
shapiro@rb-dc1.UUCP (Mike Shapiro) (02/10/89)
In article <1944@lindy.Stanford.EDU> GQ.RLG@forsythe.stanford.edu (Dick Guertin) writes:
<<< deleted reference >>>
>Such an implementation has adverse effects when the string is sent
>to/from an external device, such as a file. The 'length' must be
>with the string, or the string needs a terminator character.
>Furthermore, when a 'ptr' is changed to point to a new string,
>what happens to the 'length' information for the old string?

Note that in SNOBOL4, the Bell Labs language designed for string
handling, which appeared in the 1960s (before C, UNIX, et al), string
descriptors had three fields for strings in storage:

 -- pointer to start of string in memory
 -- offset from start of string storage where string actually starts
 -- length of string

Because the language had many operations on strings, a substring was
easy to compute: copy the string descriptor to the substring descriptor
and then adjust the offset and length fields.

For more information on the implementation of string operations, see
Ralph Griswold's book on the Macro Implementation of SNOBOL4. Or see
his later work on the Icon language. (If desperate for material on how
this relates to computer architecture, dig up a copy of my 1972
dissertation on the architecture of a machine for string manipulation
-- a SNOBOL machine.)

Michael Shapiro
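
The three-field descriptor and the copy-then-adjust substring step described above might look something like this in C. It is only a sketch of the idea as stated in the post, not SNOBOL4's actual implementation; the names `struct sdesc`, `sdesc_sub`, and `sdesc_chars` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Three fields as listed in the post; field names are assumptions. */
struct sdesc {
    const char *base;    /* start of the string's storage in memory */
    size_t      offset;  /* where this string begins within that storage */
    size_t      length;  /* number of characters in this string */
};

/* Substring: copy the descriptor, then adjust offset and length.
   `start` is relative to the string described by `s`. */
static struct sdesc sdesc_sub(struct sdesc s, size_t start, size_t count)
{
    assert(start <= s.length && count <= s.length - start);
    struct sdesc sub = s;    /* copy the descriptor; base is unchanged */
    sub.offset += start;
    sub.length  = count;
    return sub;
}

/* Address of the first character the descriptor covers. */
static const char *sdesc_chars(struct sdesc s)
{
    return s.base + s.offset;
}
```

Starting from a descriptor `{ "HELLO, WORLD\n", 0, 13 }`, `sdesc_sub(whole, 7, 5)` yields offset 7 and length 5, covering "WORLD" while still pointing at the same storage. Keeping the base pointer separate from the offset means every substring still identifies the storage block it came from.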