shelley@atc.sps.mot.com (Norman K. Shelley) (02/15/90)
New Desired Upgrade for Eiffel 3.0 (14Feb90) Currently Using; Eiffel 2.2B Class upgrade: STRING Problem Statement: 1.) The current Eiffel model of the class "STRING" is based on the C model of strings found in the C subroutine library. The model requires an allocation of one more byte than the number of characters in a string and this byte is the last byte in a string and it is NULL signifing the end of the string. String lengths are dynamically computed and there is no idea of saving/updating this state. 2.) Eiffel needs a better model that reflects the unique features Eiffel offers without sacrificing execution speed. Assertions heavily rely on checking the length of the string. This is a function and not an attribute thereby demanding much more CPU resources then should be required. Givens: 1.) The feature "count" of class "STRING" is an Eiffel function that calls the C function "strlen". When "strlen" is called it takes the input pointer and starts a count of the number of characters pointed to by the input pointer. It stops the count when it finds a NULL character. Therefore this function is expensive in time. This can become compounded when it is called repeatedly or for long strings. 2.) The class "STRING" uses the function "count" in MANY of its assertions therefore increasing the number of calls to this expensive function. 3.) Not counting uses of the function "count" in assertions it is called in seven different places (mainly from append, prepend, extend, and precede) Desired Features: 1.) Until a more suitable model is found and implemented the function "count" should be implemented as an attribute. Problems in modifing class STRING: 1.) Only an I.S.E. employee can change I.S.E.'s compiler and the compiler treats class "STRING" specially e.g. a constant STRING never calls the normal STRING create therefore the count field can not be set for STRING constants. 2.) Other C calls like "rm_all" change the string with no concept of updating a dynamic count field. Norman Shelley Motorola - ATC 2200 W. Broadway AZ09/M350 Mesa, AZ 85202 ...!uunet!dover!atc!shelley shelley@atc.sps.mot.com (602) 962-2473
alonzo@microsoft.UUCP (Alonzo GARIEPY) (02/20/90)
In article <48a5572e.12c9a@digital.sps.mot.com> shelley@atc.sps.mot.com (Norman K. Shelley) writes: > Problem Statement: > > 2.) Eiffel needs a better model that reflects the unique features Eiffel > offers without sacrificing execution speed. Assertions heavily rely on > checking the length of the string. This is a function and not an attribute > thereby demanding much more CPU resources then should be required. I'm not convinced that the speed performance of assertions is of primary concern. Assertions will be turned off in high-performance versions of a program. The speed and memory performance of production programs on production machines is more important. Zero-terminated strings are convenient because they save memory and interface well with runtime libraries intended for use with C. Many algorithms execute faster with zero-terminated strings because a separate index need not be maintained. If your style of programming uses lengths and indexes more than pointers or iterators, counted strings are probably better. I tend to agree that Eiffel is such a lengths-and-indexes language, and that the use of zero-terminated strings isn't buying much performance except to avoid continual conversions when dealing with the operating system. But conversion can require a very annoying amount of memory management. The ideal would be a string type that has both a length attribute and zero- termination. You can keep the length and termination updated all the time or calculate and cache one of them as necessary. We have been using strings of this kind at Microsoft for years. Zero terminated strings are a more primitive type, so it doesn't surprise me that languages rely on the programmer to optimize the length calculation themselves. A little more support for compile time calculation would make subclassing a viable solution to this kind of problem: constant objects could be fully constructed at compile time, making a programmer's classes equal in performance to the built-in ones. Alonzo Gariepy // The opinions expressed above are not alonzo@microsoft // necessarily those of Microsoft Corp.
bbadger@x102c.harris-atd.com (Badger BA 64810) (02/24/90)
In article <51017@microsoft.UUCP> alonzo@microsoft.UUCP (Alonzo GARIEPY) writes: [...] >I tend to agree that Eiffel is such a lengths-and-indexes language, and that >the use of zero-terminated strings isn't buying much performance except to >avoid continual conversions when dealing with the operating system. But >conversion can require a very annoying amount of memory management. The >ideal would be a string type that has both a length attribute and zero- >termination. You can keep the length and termination updated all the time >or calculate and cache one of them as necessary. We have been using strings >of this kind at Microsoft for years. [...] >Alonzo Gariepy // The opinions expressed above are not >alonzo@microsoft // necessarily those of Microsoft Corp. In a recent issue of ACM SIGPlan notices, a hybrid technique was described which puts a string length indicator at the end of the string buffer. That is, when a fixed-length buffer is allocated, the *last* byte tells how many bytes to subtract to find the zero-terminator byte. By design, this is exactly 0 when the last byte itself must serve as the zero terminating byte. I don't recall if this could be extended to buffers of size > 256. Something like buf[last] = number of bytes before buf[last] which contain the count. Might do it: "ABCDEFGHIJ" = buf[10] = 0 "ABCDEFGHI." = buf[10] = 1, buf[9]= 0 "ABCDEFGH.." = buf[10] = 1, buf[9]= 1, buf[8] = 0 Length of 300 - 259 = "0123 ... BCDEFGH.." = buf[300] = 2, buf[299]= 1(*256), buf[298] = 3 ---- Bernard A. Badger Jr. 407/984-6385 |"Get a LIFE!" --- J.H. Conway bbadger@x102c.ess.harris.com |Buddy, can you paradigm? bbadger%x102c@trantor.harris-atd.com |'s/./&&/g' Tom sed expansively.