gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) (11/09/90)
When passing strings to other functions, what is the BEST way to find the bytes remaining in the formal string parameter (to prevent overwriting the end while in the function)?? Does it involve using the current starting address of the string parameter and calculating (somehow) the DEFINED end?? Thanks for any help here...
gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) (11/09/90)
Just a note of clarification here...I am talking about a character array and I am looking for a solution (not the obvious '...add another length parameter')...I would like the function to be able to 'figure it out!'

Thanks again!

--
John L. Bradberry        |Georgia Tech Research Inst|uucp:..!prism!gt4512c
Scientific Concepts Inc. |Microwaves and Antenna Lab|Int : gt4512c@prism
2359 Windy Hill Rd. 201-J|404 528-5325 (GTRI)       |GTRI:jbrad@msd.gatech.
Marietta, Ga. 30067      |404 438-4181 (SCI)        |'...is this thing on..?'
karl@ima.isc.com (Karl Heuer) (11/10/90)
In article <16758@hydra.gatech.EDU> gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
>Just a note of clarification here...I am talking about a character array
>and I am looking for a solution (not the obvious '...add another length
>parameter')...I would like the function to be able to 'figure it out!'

Here are the options that spring to mind:
(a) Pass a length parameter.
(b) Pass a pointer to the end of the string.
(c) Implement a string structure that does one of the above for you, e.g.
    typedef struct { char *start; char *current; char *end; } string_t;
(d) Use only implementations that support the "Read Operator's Mind" syscall.

Next question?

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
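Option (c) can be fleshed out along the following lines. This is only a minimal sketch: the struct is the one from the post, but the helper names (string_init, string_remaining, string_append) are invented here for illustration, and the sketch assumes that `current` always points at the terminating '\0' and `end` points one past the last byte of the array.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    char *start;    /* first byte of the underlying array */
    char *current;  /* the terminating '\0' of the text stored so far */
    char *end;      /* one past the last byte of the array */
} string_t;

/* Wrap an existing array of `capacity` bytes (capacity must be >= 1). */
void string_init(string_t *s, char *array, size_t capacity)
{
    s->start = s->current = array;
    s->end = array + capacity;
    *s->current = '\0';
}

/* Characters that can still be appended, not counting the '\0'. */
size_t string_remaining(const string_t *s)
{
    return (size_t)(s->end - s->current) - 1;
}

/* Append text if it fits; returns 0 on success, -1 if there is no room. */
int string_append(string_t *s, const char *text)
{
    size_t n = strlen(text);

    if (n > string_remaining(s))
        return -1;                      /* would overrun the array */
    memcpy(s->current, text, n + 1);    /* copies the '\0' as well */
    s->current += n;
    return 0;
}
```

The called function still cannot "figure it out" from a bare char pointer; the caller records the extent once, in string_init(), and everything downstream reads it from the structure.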
gwyn@smoke.brl.mil (Doug Gwyn) (11/10/90)
In article <16752@hydra.gatech.EDU> gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
>When passing strings to other functions, what is the BEST way to find
>the bytes remaining in the formal string parameter (to prevent
>overwriting the end while in the function)?? Does it involve using the
>current starting address of the string parameter and calculating
>(somehow) the DEFINED end??

What on earth do you mean? You don't pass strings to C functions; the best you can do is pass pointers to arrays containing chars. There is no way to determine in the called function where the end of the array allocation might be, given merely a pointer into it.

Conventionally, C programming relies heavily on 0-terminated char arrays to represent character strings, but the 0 terminator value does not normally indicate anything about the valid extent of the array within which it lies. (For string LITERALS it does, but you can't count on being able to write into a string literal. Some systems put them into read-only memory.)
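To make the distinction between the terminator and the array extent concrete, here is a small sketch. The function name bytes_remaining is hypothetical, and the capacity has to be supplied by the caller precisely because it cannot be recovered from the pointer.

```c
#include <stddef.h>
#include <string.h>

/* Bytes still free after the current text and its '\0', given a
   capacity that only the caller knows. */
size_t bytes_remaining(const char *s, size_t capacity)
{
    size_t used = strlen(s) + 1;    /* text plus terminator */

    return used > capacity ? 0 : capacity - used;
}
```

For `char buf[32] = "hello";` strlen(buf) reports 5 regardless of how large the array is; only the caller, where `sizeof buf` is still 32, can pass the real extent along.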
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/10/90)
In article <16758@hydra.gatech.EDU>, gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
> Just a note of clarification here...I am talking about a character array
> and I am looking for a solution (not the obvious '...add another length
> parameter')...I would like the function to be able to 'figure it out!'

When you pass an array to a function, the function gets a pointer. Even ANSI C has no way of declaring a function argument that really _is_ an array. The syntax "SomeType an_arg[]" declares a pointer. That pointer looks just like any other pointer into that array. The function can't even find the _beginning_ of the array, let alone the end. (Not portably, at any rate.)

If you want your function to receive things like one-dimensional arrays in Smalltalk or Lisp or Pop or Algol 68 or PL/I, then you'll have to implement that data structure yourself using C primitives.
--
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net. --Alasdair Macintyre.
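The adjustment from array to pointer can be checked with a few lines. This is a sketch only; the function name is made up, and the 32 in the parameter declaration is there to show that the compiler ignores it.

```c
#include <stddef.h>

/* The parameter is written as an array, but it is adjusted to char *. */
size_t size_seen_by_callee(char an_arg[32])
{
    char **p = &an_arg;   /* compiles only because an_arg is a char * */

    return sizeof *p;     /* sizeof(char *), not 32 */
}
```

In the caller, `sizeof buf` for `char buf[32];` is 32; inside the function the same object is visible only through a pointer, so the size information is gone.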
fraser@bilby.cs.uwa.oz.au (Fraser Wilson) (11/10/90)
In <16752@hydra.gatech.EDU> gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
>When passing strings to other functions, what is the BEST way to find
>the bytes remaining in the formal string parameter (to prevent
>overwriting the end while in the function)?? Does it involve using the
>current starting address of the string parameter and calculating
>(somehow) the DEFINED end??

The language supports no feature like this. All the function knows is that it has a pointer to char. If you want to know how much space is available, you have to do it yourself, e.g.

    #include <stdio.h>

    #define LAST_CHARACTER '\1'   /* no trailing semicolon on a macro */

    char s[32];   /* static storage, so every byte starts out as '\0' */

    /* Count the bytes from s up to the sentinel. */
    int space(char *s)
    {
        int i;

        for (i = 0; s[i] != LAST_CHARACTER; i++)
            ;
        return i;
    }

    int main(void)
    {
        s[31] = LAST_CHARACTER;
        printf("%i\n", space(s));
        return 0;
    }

>Thanks for any help here...

You're welcome. Hope it does.

Fraser.
hagins@gamecock.rtp.dg.com (Jody Hagins) (11/11/90)
In article <16758@hydra.gatech.EDU>, gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
|> Just a note of clarification here...I am talking about a character array
|> and I am looking for a solution (not the obvious '...add another length
|> parameter')...I would like the function to be able to 'figure it out!'
|>
|> Thanks again!

There are several ways to do this.

1.

    typedef struct
    {
        int size;    /* Alternatively, this could be the last address */
        char * s;
    } string_t;

    void strinit(string_t *sp, char *cp, int size)
    {
        sp->s = cp;
        sp->size = size;
    }

    main()
    {
        char some_var[SOME_SIZE];
        string_t string;

        strinit(&string, some_var, SOME_SIZE);
        ...
    }

Whenever you want to use the C string functions, send string.s. If you want to use your own, send &string and access both start address and the length.

2.

    #define DEFINED_EOS ((char)1)  /* Any char you can guarantee not in a string */
    #define FILL_CHAR   '\0'       /* Any char except DEFINED_EOS */

    char *strinit(char *s, int size)
    {
        int i;

        for (i=0; i<size; i++)
            s[i] = FILL_CHAR;
        s[size] = DEFINED_EOS;     /* the extra byte just past the usable space */
        return(s);
    }

    int defined_strlen(char *s)
    {
        int i=0;

        while (*s++ != DEFINED_EOS)
            i++;
        return(i);
    }

    main()
    {
        char s[SOME_SIZE+1];
        /*
         * For this implementation, you will need one extra byte to store the
         * defined-end-of-string location.  Otherwise, it could get overwritten
         * by '\0'.
         */
        strinit(s, SOME_SIZE);
        ...
    }

Now, the C routines work, and you can get the defined length of any string by calling defined_strlen(s).

Hope this helps.

Jody Hagins
hagins@gamecock.rtp.dg.com
gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) (11/11/90)
The 'curt' response here is unnecessary! The question was raised only because this can be done in many other languages (easily!) I was simply looking for a few creative ideas if possible in C. If it's not possible (the current average response) a simple 'it can't be done in C' would suffice...Maturity seems elusive for some...
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (11/11/90)
After much email, John's got the picture. (Hint.)

In article <14411@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <16752@hydra.gatech.EDU> gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
>>When passing strings to other functions, what is the BEST way to find
>>the bytes remaining in the formal string parameter (to prevent over-
>
>What on earth do you mean? You don't pass strings to C functions; the
>best you can do is pass pointers to arrays containing chars. There is
>no way to determine in the called function where the end of the array
>allocation might be, given merely a pointer into it.

"pointers to arrays containing chars"?

    typedef char A[SIZE];
    ...
    A *foo;
    ...
    func(foo);

    func( A *bar ) /* bar points to array of chars */
    {
        /* sizeof does not yield an int, so cast it for %d */
        printf( "%d\n", (int) sizeof (*bar) );
    }

Note, however, that foo may not point to an arbitrary string (unless you're careful to put all your strings in these arrays). Note also that we're not really using an arbitrary string, but only one that's stored in an array SIZE of char. Note also that, since it is a pointer to an array rather than a pointer to char, you have to get the chars through `*foo[i]' or `**foo' rather than `*foo' or `foo[i]'.

The most important thing to note is that `func()' *knows* the size of the array. You have compiled it in. This defeats certain purposes.

The usual convention for "passing strings" is to pass a pointer to the first character of the array rather than a pointer to the array. This depends on the fact that you've stored the string as chars in contiguous memory locations and on the assumption that the last character of the string is a '\0'. This is also how strings are handled in all of the library functions and system calls you're likely to encounter.
It provides flexibility; otherwise, all strings would have a minimum storage allocation and a maximum length (although, technically, with the current convention the minimum usable is one char ('\0'), and certain features of systemic i/o often lead one to limit the SIZE consistently to BUFSIZ...).

The only reason you should want the size of the array a string is stored in is if you are at some point more interested in the array than the string (as is fgets(), for example, which of course asks for a pointer to the first character's location and for the size).

				--Blair
				  "Did I hear an 'oops'?"
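The fgets() convention mentioned above (pointer to the first character plus the size of the array) carries over to any function that writes into a caller's buffer. A minimal sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* Copy src into dst without ever writing more than `size` bytes.
   The result is always '\0'-terminated when size > 0.
   Returns 1 if all of src fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t size, const char *src)
{
    size_t i;

    if (size == 0)
        return 0;               /* no room even for the terminator */
    for (i = 0; i + 1 < size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return src[i] == '\0';
}
```

A caller that owns the array writes `copy_bounded(buf, sizeof buf, text)`, so the size travels with the pointer instead of being guessed inside the function.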
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (11/13/90)
In article <921@inews.intel.com> bhoughto@cmdnfs.intel.com (Blair P. Houghton) writes:
>"pointers to arrays containing chars"?
>	typedef char A[SIZE];
>	A *foo;
>
>Note also that, since it is a pointer
>to an array rather than a pointer to char, you have to get
>the chars through `*foo[i]' or `**foo' rather than `*foo'
>or `foo[i]'.

I'm sorry, this is wrong. The `[]' have a higher precedence than the `*', so `*foo[i]' means `*(foo[i])'; it would have to be `(*foo)[i]' for the subscripted version.

In order to use `foo' (as anything other than NULL) you have to provide an array of the proper type, properly allocated, and then assign the address of that array to `foo'. The address of the array has the same value as the address of its first element, but the two have different types (pointer to array versus pointer to char), so they are not compatible types[*].

Under the ANSI rules `*foo' is the array itself, which turns into a pointer to its first element in most expressions. That means `**foo' does yield `array[0]' and `(*foo)[i]' yields `array[i]'. `foo[i]', on the other hand, is the i'th whole array counting from the one `foo' points at, not the i'th char.

[*] The assignment `foo = &bar' is right, the assignment `foo = bar' is incorrect (gcc -ansi just produces a warning, but, if you believe in the Pointer Fairy (as you and Doug know I do :-), you believe that all pointers are the same, anyway, so `foo = bar' works even though it's hubris; but don't trust it).

				--Blair
				  "The PF still owes me a quarter.
				   She's probably just waiting for those
				   pointer-subtraction semantics I promised
				   I'd post... :-/"
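As a reference point, here is a minimal sketch of how a pointer to an array behaves under ANSI/ISO C rules; compilers of the period may have treated `&array' differently. SIZE is given an arbitrary value and the helper names are made up for illustration.

```c
#include <stddef.h>

#define SIZE 16
typedef char A[SIZE];

/* sizeof *bar is the size of the whole array, known at compile time. */
size_t array_bytes(A *bar)
{
    return sizeof *bar;
}

/* *bar is the array; it decays to a pointer to its first char. */
char first_char(A *bar)
{
    return **bar;
}

/* The parentheses are needed because [] binds tighter than unary *. */
char nth_char(A *bar, size_t i)
{
    return (*bar)[i];
}
```

With `A text = "hello"; A *foo = &text;` these return SIZE, 'h', and text[i] respectively. The cost is the one already noted in the thread: the size is compiled into the type, so such a function only accepts arrays of exactly SIZE chars.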
weimer@ssd.kodak.com (Gary Weimer) (11/13/90)
In article <1990Nov09.183957.15122@dirtydog.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
>In article <16758@hydra.gatech.EDU> gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) writes:
>>Just a note of clarification here...I am talking about a character array
>>and I am looking for a solution (not the obvious '...add another length
>>parameter')...I would like the function to be able to 'figure it out!'
>
>Here are the options that spring to mind:
>(a) Pass a length parameter.
>(b) Pass a pointer to the end of the string.
>(c) Implement a string structure that does one of the above for you, e.g.
>    typedef struct { char *start; char *current; char *end; } string_t;
>(d) Use only implementations that support the "Read Operator's Mind" syscall.

(e) Use the "standard" C workaround for this problem.

As other people have pointed out, this question looks like it was posed by a C crossover from another language, so why don't we tell them what C can do, instead of what it can't.

NOTE: for both solutions given below, don't forget to count the extra space (byte, or whatever you want to call it) required by the end-of-string character (\0).

EASY SOLUTION:

If all the strings you will be using are less than some number N (and you have enough memory), then create a constant:

    #define MAX_LEN N

where N can be any number greater than 0 (I like 255 for most cases). Now define all your character arrays as:

    char name[MAX_LEN];

when performing loops, range checking, etc., use MAX_LEN

ROBUST SOLUTION:

If you don't have a maximum length, or can't afford to waste memory, use character pointers and malloc() memory as it is needed. This will allow you to continue using the C string library; however, functions like strcat() should probably be avoided (unless you malloc'd enough space for this).
An example (note I didn't say good) of a strcat() replacement is:

    #include <stdlib.h>
    #include <string.h>

    char *mystrcat(char *s, char *t)
    {
        char *str;

        str = (char *) malloc(strlen(s) + strlen(t) + 1);
        if (str == NULL)        /* malloc() failed; s is left untouched */
            return(NULL);
        strcpy(str, s);
        strcat(str, t);

        /* NOTE: these next 2 statements disallow passing an  */
        /* array of char as s (use strcat() for this)         */
        free(s);
        s = str;

        return(s);
    }

With this solution, you will still want one string of some maximum size to read in strings of unknown length. This could then be copied to a string of the appropriate size (strdup() might be a good method):

    char *strdup(const char *s) /* not in the ANSI C library, though */
                                /* many systems supply one           */
    {
        char *str;

        str = (char *) malloc(strlen(s) + 1);
        if (str != NULL)
            strcpy(str, s);

        return(str);    /* notice that s is unchanged, and could */
                        /* have been declared: char s[MAX_LEN]   */
    }

(OH BOY, now I get to see how many people think this is stupid...)

Gary Weimer
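A variant that avoids the free(s) restriction is to leave both arguments alone and hand back a freshly allocated result. This is a sketch under that assumption, not the poster's method; the name concat_new is invented here, and the caller is responsible for calling free() on the result.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly malloc'd string holding s followed by t, or NULL if
   malloc() fails.  Neither argument is modified or freed, so arrays
   and string literals are acceptable as inputs. */
char *concat_new(const char *s, const char *t)
{
    size_t ls = strlen(s);
    size_t lt = strlen(t);
    char *str = malloc(ls + lt + 1);

    if (str == NULL)
        return NULL;
    memcpy(str, s, ls);
    memcpy(str + ls, t, lt + 1);    /* includes t's terminating '\0' */
    return str;
}
```

The trade-off is an extra allocation per call and one more pointer for the caller to free, in exchange for not caring how s was allocated.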
gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) (11/14/90)
In article < 34449 weimer@ssd.kodak.com> (Gary Weimer) writes:
>>>In article <16758@hydra.gatech.EDU> gt4512c@prism.gatech.EDU
>>>(BRADBERRY,JOHN L) writes:
. . .
>>>Here are the options that spring to mind:
>>>(a) Pass a length parameter.
>>>(b) Pass a pointer to the end of the string.
>>>(c) Implement a string structure that does one of the above for
>>>you, e.g.
>>> typedef struct { char *start; char *current; char *end; }
>>>string_t;
>>>(d) Use only implementations that support the "Read Operator's
>>>Mind" syscall.
>>
>>(e) Use the "standard" C workaround for this problem.
>
>As other people have pointed out, this question looks like it was
>posed by a C crossover from another language, so why don't we
>tell them what C can do, instead of what it can't.
>

Actually, in the graphics and signal processing area, I frequently have to port (rewrite) thousands of lines of code from other languages to C. In the process, I find it quite interesting to attempt where practical (possible) to duplicate some features so that the code algorithms appear as similar as possible. C makes this possible more often than not.

The original post was in no way a criticism of C (I think the language is tremendous!), but a question of how something might be done! To date I've gotten close to 100 very creative 'workarounds' which is the next best thing to an exact solution. For that I am very thankful because I'm sure few of us would like to 'intentionally' recreate the wheel...
hamish@mate.sybase.com (Just Another Deckchair on the Titanic) (11/15/90)
In article <931@inews.intel.com> bhoughto@cmdnfs.intel.com (Blair P. Houghton) writes:
>
<				--Blair
>				  "The PF still owes me a quarter.
<				   She's probably just waiting for those
>				   pointer-subtraction semantics I promised
<				   I'd post... :-/"

Ummm, Blair, the pointer fairy keeps telling me it was pointer *addition* you owe her for. My how the humble are risen....

This is your past tapping you on the shoulder...

	Hamish
----------------------------------------------------------------------------
Hamish Reid           Sybase Inc, 6475 Christie Ave, Emeryville CA 94608 USA
+1 415 596-3917       hamish@sybase.com       ...!{mtxinu,sun}!sybase!hamish
msb@sq.sq.com (Mark Brader) (11/17/90)
> EASY SOLUTION:
> If all the strings you will be using are less than some number N (and
> you have enough memory), then create a constant:
>     #define MAX_LEN N
> ... Now define all your character arrays as:
>     char name[MAX_LEN];
> when performing loops, range checking, etc., use MAX_LEN

It generally seems to me to produce clearer code if the constant that one defines specifies, not the length of the buffer (as above), but the maximum length of the string contained in it. That is:

    char name[MAX_LEN+1];	/* +1 for '\0' */

If you use this declaration style routinely, you get rid of a lot of -1's scattered through the code wherever there are loops and limit checks; and if you do make an off-by-one error, it tends to fail safe.

It is conceded that the preference for this is somewhat a matter of opinion, and followups merely to agree or disagree with this opinion are dissuaded. Likewise for the presence or absence of the comment.
--
Mark Brader, SoftQuad Inc., Toronto
utzoo!sq!msb, msb@sq.com

"It's simply a matter of style, and while there are many wrong styles,
 there really isn't any one right style." -- Ray Butterworth

This article is in the public domain.
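A short sketch of what the MAX_LEN+1 style buys in a loop. The value 8 and the function name set_name are arbitrary choices made for this illustration.

```c
#include <stddef.h>

#define MAX_LEN 8   /* longest string, not the size of the buffer */

/* Copy at most MAX_LEN characters of src into name, which must point
   to an array declared as char name[MAX_LEN + 1].  Returns the number
   of characters stored, not counting the '\0'. */
size_t set_name(char *name, const char *src)
{
    size_t i;

    for (i = 0; i < MAX_LEN && src[i] != '\0'; i++)   /* no "- 1" here */
        name[i] = src[i];
    name[i] = '\0';   /* i <= MAX_LEN, so this stays inside the array */
    return i;
}
```

Had the array been declared as `char name[MAX_LEN]`, the same loop would need `MAX_LEN - 1` as its bound, and forgetting the `- 1` would write the terminator one byte past the end.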
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (11/19/90)
In article <11757@sybase.sybase.com> hamish@mate.sybase.com (Just Another Deckchair on the Titanic) writes:
>Ummm, Blair, the pointer fairy keeps telling me it was pointer
>*addition* you owe her for. My how the humble are risen....

Addition, subtraction, who can tell the diff? :-)

				--Blair
				  "Sedition, Abstraction,
				   who can tell a Wizard?"
boyd@necisa.ho.necisa.oz (Boyd Roberts) (11/20/90)
In article <1990Nov17.070228.29295@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
>It generally seems to me to produce clearer code if the constant that
>one defines specifies, not the length of the buffer (as above), but
>the maximum length of the string contained in it. That is:
>
>    char name[MAX_LEN+1];	/* +1 for '\0' */

Ever written any Pascal? Therein lies madness.

All these fixed length character strings are a poor man's solution. Dynamic structures aren't that hard to code up, and once you have the right library routines it's trivial to use them with future code. Code them once, use them everywhere.

Only today, did my mail user agent trash the RFC 822 address parser because it [the parser*] decided that lines _never_ exceeded 256 characters. My user agent says lines are as long as you've got virtual memory for. So it's time to persuade the parser about dynamics. Should be easy.

Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

* Snarfed off the net at some stage.
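One way such a dynamic structure might look, as a minimal sketch. The type and function names (dynstr, dynstr_append, dynstr_free) and the doubling growth policy are assumptions made for this example, not taken from any library mentioned in the thread.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *data;   /* malloc'd storage, '\0'-terminated once non-NULL */
    size_t len;    /* characters stored, not counting the '\0' */
    size_t cap;    /* bytes currently allocated */
} dynstr;

/* Append text, growing the storage as needed.  Start from a dynstr
   initialised to {NULL, 0, 0}.  Returns 0 on success, -1 if out of
   memory (the existing contents are left intact in that case). */
int dynstr_append(dynstr *d, const char *text)
{
    size_t n = strlen(text);

    if (d->len + n + 1 > d->cap) {
        size_t cap = d->cap ? d->cap : 16;
        char *bigger;

        while (cap < d->len + n + 1)
            cap *= 2;                       /* doubling keeps appends cheap */
        bigger = realloc(d->data, cap);     /* realloc(NULL, n) acts as malloc */
        if (bigger == NULL)
            return -1;
        d->data = bigger;
        d->cap = cap;
    }
    memcpy(d->data + d->len, text, n + 1);  /* copies the '\0' too */
    d->len += n;
    return 0;
}

void dynstr_free(dynstr *d)
{
    free(d->data);
    d->data = NULL;
    d->len = d->cap = 0;
}
```

A line reader built on this has no 256-character ceiling; the only limit is what realloc() can deliver.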
martin@mwtech.UUCP (Martin Weitzel) (11/24/90)
In article <1946@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
[in a thread dealing with the C representation for character strings]
>
>All these fixed length character strings are a poor man's solution.
>Dynamic structures aren't that hard to code up, and once you have
>the right library routines it's trivial to use them with future code.
>Code them once, use them everywhere.

I can second this. I once tried to enhance C's elegant and space efficient character string representation with dynamic space allocation. I ended up with a trick similar to the one often used in malloc: Character strings of varying length were represented by a pointer to the first byte and terminated by a '\0'-character - exactly the way C does this normally. So, a program could use normal "char *" variables for such strings and hand them to all functions with only read-access in the normal way.

                                        +----------+  for "read only" access
                                        |   '\0'   |  use normally, for write
                                        | ........ |  access use special functions
                                        | 3rd char |
    +--------+                          | 2nd char |
    | char * -------------------------> | 1st char |
    +--------+                          +----------+
        ^                               |  length  |
        |                               +----------+
        +------------------------------- char **   |  back-pointer
                                        +----------+

Additionally, for such varying strings there was a special setup operation for the said "char *" variable, which allocated space for the string itself and additionally, at some address immediately *below* the string, space for a length field and a back-pointer to the variable. After such an initialization, things looked basically as above.

There were some special functions that - when changing the string - looked for the length field and, if necessary, realloc'ed the space. The only thing which had to be done carefully (besides only using the special functions to change them and not forgetting to free them when the pointer goes out of scope) was not to set up pointers into them, as the reallocation could only change the one reference which it knew from the back-pointer.

This meant that these strings should never be passed to functions as arguments and then changed more than once from within the function ... (Well, no problem without solution: One could either pass pointers to them or "swap them out and in" from local instances.)
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
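The layout described above can be approximated as follows. This is a reconstruction for illustration only, not the original code: all names (vs_header, vs_init, vs_assign, vs_capacity, vs_free) are invented, and it relies on a C99 flexible array member, which was not available in 1990.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header stored immediately below the characters. */
struct vs_header {
    size_t capacity;   /* bytes available for text, including the '\0' */
    char **owner;      /* back-pointer to the one "char *" variable */
    char   text[];     /* the string itself starts here */
};

/* Step back from the first character to the hidden header. */
static struct vs_header *vs_header_of(char *s)
{
    return (struct vs_header *)(s - offsetof(struct vs_header, text));
}

/* Set up *var as a varying string holding a copy of init. */
int vs_init(char **var, const char *init)
{
    size_t need = strlen(init) + 1;
    struct vs_header *h = malloc(sizeof *h + need);

    if (h == NULL)
        return -1;
    h->capacity = need;
    h->owner = var;
    memcpy(h->text, init, need);
    *var = h->text;
    return 0;
}

/* Replace the contents with src, which must not point into the string
   itself.  Grows the block if needed and repairs the owning variable
   through the back-pointer. */
int vs_assign(char **var, const char *src)
{
    struct vs_header *h = vs_header_of(*var);
    size_t need = strlen(src) + 1;

    if (need > h->capacity) {
        struct vs_header *g = realloc(h, sizeof *g + need);

        if (g == NULL)
            return -1;
        g->capacity = need;
        h = g;
    }
    memcpy(h->text, src, need);
    *h->owner = h->text;   /* only this one reference can be fixed up */
    return 0;
}

size_t vs_capacity(char *s)
{
    return vs_header_of(s)->capacity;
}

void vs_free(char **var)
{
    free(vs_header_of(*var));
    *var = NULL;
}
```

Read-only consumers such as strlen() or printf() can be handed the char pointer directly. Any other copy of the pointer goes stale as soon as vs_assign() has to move the block, which is exactly the restriction described in the post.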