scs@adam.pika.mit.edu (Steve Summit) (07/16/89)
Two minor points:

In article <547@cybaswan.UUCP> iiit-sh@cybaswan.UUCP (Steve Hosgood) writes:
>In article <2619@yunexus.UUCP>, davecb@yunexus.UUCP (David Collier-Brown) writes:
>> 0) 6 characters is a lower limit of significance. You're allowed to use more.
>Yeah, but though you may be able to use more, a 'strictly conforming' program
>can't, otherwise it won't port to sites with 6-character linkers.

Presumably the posters and most readers of this group understand
it, but there are apparently many who interpret statements like
the above to mean that external identifiers may not have more
than six characters at all. I'm always seeing subroutine
packages that have incredibly strained, abbreviated identifier
names, to keep them six characters long. You're allowed to use
extra characters for readability; the only problem is that if you
say

    int identifier1, identifier2;

you may get

    "identi: multiply defined."

It should be perfectly legal to say

    int identifier, anotheridentifier;

>If Ada becomes important due to the
>military backing, can't 'C' ride its wake (so to speak) and insist on a
>linker with sensible namewidth?

I believe X3J11 is on record as strongly encouraging better
linkers, and indicating that the six-character significance limit
is likely to disappear in future revisions. (Happily, this would
be a backwards-compatible change, except for programs with typos.)

No one (not even the maintainers of the systems that have them)
likes the situation with respect to old-fashioned linkers.
However, the compromise had to be made to ensure a standard that
could and would be used.

                                        Steve Summit
                                        scs@adam.pika.mit.edu

P.S. I think that Ada places much more complicated demands on a
linker than simply that it have more than six characters' worth
of significance. (That is, Ada tends to require a new linker
anyway, not just for longer names.) Every Ada implementation
I've seen (not many) comes with its own linker, at least
partially superseding the operating system's default one, even on
systems such as VMS which already have remarkably powerful
standard linkers and object file formats. A language that
supports object/package/cluster concepts typically requires
complicated name resolution and binding in the link phase which
many (most) existing linkers simply aren't equipped to do. (C++
has the same problem.)
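
To make the truncation rule in the post above concrete, here is a minimal sketch of how a linker with six significant characters might compare two external names. The names EXT_SIGNIFICANT and same_external_name are invented for this illustration, and the sketch assumes plain truncation only; some old linkers also fold case, which is not modelled here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limit: the six significant characters discussed above. */
#define EXT_SIGNIFICANT 6

/*
 * Return 1 if a linker that looks only at the first EXT_SIGNIFICANT
 * characters would treat a and b as the same external name, else 0.
 * Case folding, which some old linkers also do, is not modelled.
 */
int same_external_name(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, EXT_SIGNIFICANT) == 0;
}
```

Under this sketch, same_external_name("identifier1", "identifier2") is 1, since both names truncate to "identi", while same_external_name("identifier", "anotheridentifier") is 0. Long names are therefore usable as long as they differ somewhere in their first six characters.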
ps@celerity.UUCP (Patricia Shanahan) (07/26/89)
In article <12711@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:
>Two minor points:
>
>In article <547@cybaswan.UUCP> iiit-sh@cybaswan.UUCP (Steve Hosgood) writes:
>>In article <2619@yunexus.UUCP>, davecb@yunexus.UUCP (David Collier-Brown) writes:
>>> 0) 6 characters is a lower limit of significance. You're allowed to use more.
>>Yeah, but though you may be able to use more, a 'strictly conforming' program
>>can't, otherwise it won't port to sites with 6-character linkers.
>
>Presumably the posters and most readers of this group understand
>it, but there are apparently many who interpret statements like
>the above to mean that external identifiers may not have more
>than six characters at all. I'm always seeing subroutine
>packages that have incredibly strained, abbreviated identifier
>names, to keep them six characters long. You're allowed to use
>extra characters for readability; the only problem is that if you
>say
>
>    int identifier1, identifier2;
>
>you may get
>
>    "identi: multiply defined."
>
>It should be perfectly legal to say
>
>    int identifier, anotheridentifier;

I strongly dislike systems that accept extra characters in identifiers and
ignore them. This feature creates a difference between the program as the
programmer reads it, and the program as the compiler reads it. It can in
fact move porting problems from compile time detection to run time errors.

Suppose I have a large and complicated program that uses two libraries,
libA and libB. Suppose also that identifier1 is the name of a function in
libA and identifier2 is the name of a function in libB.

A short identifier system that treats ignored characters in identifiers as
an error will flag the use of more than six characters in the function names
during the library make, and will also flag the use of the long names for
external references during the compile. The problem will be found and easily
fixed before the first successful make.

A short identifier system that simply ignores excess characters will silently
resolve references to both identifier1 and identifier2 to whichever appears
in the first library examined. If you are lucky and have good tests, it will
be found during program test.

Even after the existence of the bug is known because of a crash or wrong
result, it may still be difficult to fix. A programmer reading the code will
see the correct function being called. If the system is sufficiently large,
the programmer who is trying to fix a failure in a piece of code using
identifier1 may not even be consciously aware of the existence of identifier2.
If they do a grep for identifier1 it will not find identifier2.

This is especially serious if the significant identifier length is scope
dependent. The same type of problem can happen because the scope of a function
has been changed from static to external and it has been moved to a library.

        Patricia Shanahan
        uucp : ucsd!celerity!ps
        arpa : ucsd!celerity!ps@nosc
        phone: (619) 271-9940
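
The libA/libB scenario above can be modelled with a toy lookup routine. This is only a sketch under stated assumptions: struct definition, resolve_truncated and resolve_exact are names invented here, the limit of six characters is taken from the thread, and no real linker is claimed to be implemented this way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIGNIFICANT 6   /* hypothetical linker limit from the thread */

struct definition {
    const char *name;     /* name as written in the source */
    const char *library;  /* library that defines it */
};

/*
 * Toy model of a linker that silently ignores excess characters:
 * return the first definition whose leading SIGNIFICANT characters
 * match those of the wanted name, or NULL if there is none.
 */
const struct definition *resolve_truncated(const struct definition *defs,
                                           size_t ndefs, const char *wanted)
{
    size_t i;

    assert(defs != NULL && wanted != NULL);
    for (i = 0; i < ndefs; i++) {
        if (strncmp(defs[i].name, wanted, SIGNIFICANT) == 0)
            return &defs[i];
    }
    return NULL;
}

/* The same search on the full name, as the programmer reads it. */
const struct definition *resolve_exact(const struct definition *defs,
                                       size_t ndefs, const char *wanted)
{
    size_t i;

    assert(defs != NULL && wanted != NULL);
    for (i = 0; i < ndefs; i++) {
        if (strcmp(defs[i].name, wanted) == 0)
            return &defs[i];
    }
    return NULL;
}
```

With a table holding identifier1 from libA followed by identifier2 from libB, resolve_truncated(defs, 2, "identifier2") returns the libA entry, while resolve_exact returns the libB entry. The source text still says identifier2, which is why neither reading the code nor a grep reveals the problem.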
david@psitech.UUCP (david Fridley) (07/28/89)
Following is free code which anybody may use; it demonstrates how to build
symbols without having to place any restrictions on their length. I hope that
everybody who writes a compiler, linker, assembler, etc. will look at this.
Believe me, there don't need to be arbitrary limits, and I prefer products
that do not have arbitrary limits.

david.
DISCLAIMER: If it's important have a backup. If it ain't broke don't fix it.
Proceed at your own risk. My oponions are MY own. Spelling does not count.
My fondest dream is to leave this planet.

----cut here---
/*****************************************************************************
 * getsym.c
 *
 * This module implements a get sym(bol) function which reads the next
 * symbol in from the input stream, and returns a pointer to it.
 *
 * if TEST is defined at compile time, the following two routines are provided
 * in order to test, and demonstrate getsym()
 *
 * put_sym_in_table() adds symbols to a static symbol table.
 *
 * main() is a simple program to read words from the standard input, build
 * a symbol table, print out the symbol table, free the symbols created,
 * and then free the symbol table.
 *
 * DISCLAIMER: This code worked the first time I tried it, so obviously there
 * is something wrong with it. Assume it is defective until proven otherwise.
 *
 * if BULLET_PROOF is defined at compile time, additional bullet proofing is
 * added, that is not required for normal operation.
 *
 * BASIC_SYMBOL_SIZE can be defined on the command line to override the basic
 * symbol size assumed by this module.
 *
 * HINT: tab 4,9
 *
 * First Created: 25 July 1989 by david
 * Last Modified: 25 July 1989 by david
 *****************************************************************************/
#include <stdio.h>
#include <stdlib.h>     /* malloc(), realloc(), free(), exit() */

/*
 * these values, BASIC_SYMBOL_SIZE and BASIC_TABLE_SIZE are given small
 * values for debugging purposes.
 */
#ifndef BASIC_SYMBOL_SIZE
#define BASIC_SYMBOL_SIZE 4 /* this value defines the first guess */
                            /* as to the size of the symbol. If that */
                            /* guess is wrong, it is guessed that that */
                            /* much more space will be required. The */
                            /* actual value affects the speed of */
                            /* the routine, that's all */
#endif

#ifndef BASIC_TABLE_SIZE
#define BASIC_TABLE_SIZE 4  /* this value defines the first guess */
                            /* as to the number of table entries. */
                            /* if this guess is wrong, it is guessed */
                            /* that that many more entries will be */
                            /* required */
#endif

/*****************************************************************************
 * char *getsym(f)
 *
 * INPUT:
 *  f is the stream (FILE pointer) to read the next symbol from.
 *
 * OUTPUT:
 *  a pointer to the next symbol on the input is returned. This buffer has
 *  been malloc()ed and should be free()ed when it is no longer needed. if
 *  NULL is returned there was an error getting the next symbol, if (-1) is
 *  returned there were no more symbols.
 *
 * First Created: 25 July 1989 by david
 * Last Modified: 25 July 1989 by david
 *****************************************************************************/
char *getsym(FILE *f)
{
    char *tmp;
    char *bigger;
    size_t actual_symbol_size;
    size_t symbol_size;
    int c;      /* int, not char, so that EOF can be told from a character */

    actual_symbol_size=BASIC_SYMBOL_SIZE;
    symbol_size=0;

    /* get the initial buffer for the symbol */
    if((tmp=malloc(actual_symbol_size))==NULL)
    {
#ifdef BULLET_PROOF
        fprintf(stderr,"getsym: malloc() returned NULL\n");
        exit(1);
#endif
        return NULL;
    }
    while((c=getc(f))!=EOF)
    {
        if( (c>='A' && c<='Z') || (c>='a' && c<='z') )
        {
            /* if we need more buffer area (always keep room for the '\0') */
            if(symbol_size+1>=actual_symbol_size)
            {
                actual_symbol_size+=BASIC_SYMBOL_SIZE;
                if((bigger=realloc(tmp,actual_symbol_size))==NULL)
                {
#ifdef BULLET_PROOF
                    fprintf(stderr,"getsym: realloc() returned NULL\n");
                    exit(1);
#endif
                    free(tmp);  /* the old buffer is still ours to release */
                    return NULL;
                }
                tmp=bigger;
            }
            tmp[symbol_size++]=(char)c;
        }else if(symbol_size) /* if there is at least one character in the symbol */
        {
            break;  /* a separator ends the symbol */
        }
        /* otherwise eat up separators until there is at least one non separator */
    }
    /* we got a separator after a symbol, or an EOF */
    if(symbol_size==0)  /* if we did not find a symbol, return (-1) */
    {
        free(tmp);
        return((char *)(-1));
    }
    tmp[symbol_size]='\0';  /* terminate the symbol, also at EOF */
    /* trim the buffer to the size actually used; keep it as is on failure */
    if((bigger=realloc(tmp,symbol_size+1))!=NULL)
    {
        tmp=bigger;
    }
    return(tmp);    /* return the pointer to the symbol */
}

#ifdef TEST
/*****************************************************************************
 * put_sym_in_table(sym)
 *
 * put the symbol in the symbol table. If there is a problem, print a useful
 * message and exit();
 *
 * First Created: 25 July 1989 by david
 * Last Modified: 25 July 1989 by david
 *****************************************************************************/
static char **symbol_table=NULL;
static unsigned int table_size=0;
static unsigned int actual_table_size=0;

static void put_sym_in_table(char *sym)
{
    char **bigger;

    if(table_size >= actual_table_size)
    {
        /* realloc(NULL,n) acts like malloc(n), so the first call */
        /* needs no special case (and no zero-byte malloc) */
        if((bigger=realloc(symbol_table,
            (actual_table_size+BASIC_TABLE_SIZE)*sizeof(char *)))==NULL)
        {
            fprintf(stderr,"put_sym_in_table: realloc() returned NULL\n");
            exit(1);
        }
        symbol_table=bigger;
        actual_table_size+=BASIC_TABLE_SIZE;
    }
    symbol_table[table_size++]=sym;
}

/*****************************************************************************
 * main()
 *
 * This is a simple demonstration program for getsym and put_sym_in_table.
 *
 * First Created: 25 July 1989 by david
 * Last Modified: 25 July 1989 by david
 *****************************************************************************/
int main(void)
{
    char *tmp;
    unsigned int i;

    while((tmp=getsym(stdin))!=((char *)(-1)))
    {
        if(tmp==NULL)
        {
            fprintf(stderr,"main: getsym returned NULL\n");
            exit(1);
        }
        put_sym_in_table(tmp);
    }
    for(i=0;i<table_size;i++)
    {
        fprintf(stdout,"symbol_table[%u]=%s\n",i,symbol_table[i]);
        free(symbol_table[i]);
    }
    free(symbol_table);
    return 0;
}
#endif