davidson@sdcsvax.UUCP (12/05/83)
It has been possible for most of us to ignore the C compilers which use only one case and fewer than 7 characters of identifier significance, simply because they only cause trouble when porting into such an environment, and few of us have had to do that. However, now that Berkeley has removed the identifier length restriction, non-Berkeley UNIX programmers are faced with the ever more frequent and onerous task of squeezing Berkeley programs into reduced identifier significance. Hence this CHALLENGE: To write a program (for the public domain) which converts a program designed for a given identifier significance of N1 characters, and with C1 cases (either one or two), into an equivalent program designed for identifier significance N2 and cases C2. These would be given as program arguments. Note that when relaxing restrictions, it may be necessary to truncate identifiers and regularize case, e.g., with N1 = 6, N2 = 100, foobarhere, foobarthere -> foobar, foobar. When tightening restrictions, it may be necessary to change the identifiers, e.g., this_one_here, this_one_there -> this1, this2. The latter case is more difficult, and requires avoiding collisions with other identifiers. I believe it very important that this program be written, ideally by someone with C compiler experience, although anyone is welcome to try. We are forced to clean up someone else's mess here, as Berkeley should never have released their compiler without providing such a compatibility filter. -Greg
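The core test behind the challenge, whether two identifiers become the same name under a given significance N and case rule, might be sketched as follows. This is only an illustration: `ids_collide` and its parameters are hypothetical names, not part of any posted program.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a and b look identical to a compiler
 * that keeps only the first n characters of an identifier and, when
 * fold_case is nonzero, treats upper and lower case as the same. */
int ids_collide(const char *a, const char *b, size_t n, int fold_case)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];

        if (fold_case) {
            ca = (unsigned char)tolower(ca);
            cb = (unsigned char)tolower(cb);
        }
        if (ca != cb)
            return 0;       /* differ inside the significant part */
        if (ca == '\0')
            return 1;       /* both ended before the limit: same name */
    }
    return 1;               /* first n characters all match */
}
```

With n = 6, foobarhere and foobarthere collide; with n = 7 they do not. A converter would run a check like this over every pair of identifiers before deciding which ones need new names.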
guy@rlgvax.UUCP (Guy Harris) (12/07/83)
Rumor has it that System VI C may implement long variable names. If you don't like the C or UNIX environment, wait a minute, it'll change... :-) Guy Harris {seismo,ihnp4,allegra}!rlgvax!guy
mem@sii.UUCP (Mark Mallett) (12/09/83)
Once upon a time, in the middle ages, I worked with a compiler on a TOPS-10 system which supported arbitrarily long names and yet had to reduce these names to 6 characters apiece for the TOPS-10 linker. It made its reductions (as I recall) by making use of the fact that people tend to punctuate long names; it selected particular characters from each syllable and strung them together in some predictable way. Unfortunately I don't remember what its rules were, but if anybody wants to they could try to look it up (see below). A symbol such as MARKS_MAGIC_NUMBER might turn into MSMCNR. I never saw it get confused; I don't know what it would do if it did. Second part: I wonder if anybody has ever heard of the above-mentioned compiler. It compiled a subset of PL/I, was written in LISP sometime around the late 1960s to early 1970s, and was called PL/E (PL/I for Eastman's Exec). It was written, I think, at Applied Logic Corp which used to be in NJ. I rather liked it; I wish that the people I resurrected it for hadn't lost it. Mark E. Mallett decvax!sii!mem
jim@ism780.UUCP (12/12/83)
I have posted to net.sources a simple program (shortc.c) which produces a mapping from arbitrary-length identifiers to identifiers which are unambiguous in the first N (default 7) characters. It produces them as #defines; all it takes is a flexnames version of cpp to compile the original source files with a non-flexname compiler, with no modification beyond including the output of shortc into the sources. If such a cpp is not available or creatable (not hard given any cpp source), the shortc output can be turned into a sed script and the sources can be compiled after being modified. Such modification does not fill the sources with identifiers like X12345 or MuWdIdr (for MultiWordIdentifier); rather it converts

    MultiWordIdentifier           MultiWordIdentifier
    MultiWordThingy        into   AMultiWordThingy
    MultiWordyProgrammer          BMultiWordyProgrammer

Not so tough. -- Jim Balter (decvax!yale-co!ima!jim) Interactive Systems Corp
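A mapping of this kind could be produced roughly as below. This is a sketch under assumptions, not the shortc source: the names `shorten_all`, `clashes`, `print_defines`, and `MAXID` are invented here, and the strategy (try prefix letters A, B, C, ... until the first n characters are unique) is inferred only from the example in the post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXID 64    /* assumed upper bound on identifier length */

/* Nonzero if s matches any of the first `used` names within n characters. */
static int clashes(const char *s, char taken[][MAXID], size_t used, size_t n)
{
    size_t i;

    for (i = 0; i < used; i++)
        if (strncmp(s, taken[i], n) == 0)
            return 1;
    return 0;
}

/* Fills out[i] with a name for ids[i] that is unique, in its first n
 * characters, among out[0..i].  Returns 0, or -1 if no letter works. */
int shorten_all(const char *const ids[], size_t count, size_t n,
                char out[][MAXID])
{
    size_t i;

    for (i = 0; i < count; i++) {
        char c;

        assert(strlen(ids[i]) + 2 <= MAXID);
        strcpy(out[i], ids[i]);             /* keep the name if we can */
        for (c = 'A'; clashes(out[i], out, i, n); c++) {
            if (c > 'Z')
                return -1;                  /* ran out of prefix letters */
            snprintf(out[i], MAXID, "%c%s", c, ids[i]);
        }
    }
    return 0;
}

/* Emits one #define per identifier that had to change. */
void print_defines(FILE *fp, const char *const ids[], char out[][MAXID],
                   size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(ids[i], out[i]) != 0)
            fprintf(fp, "#define %s %s\n", ids[i], out[i]);
}
```

Run over the three identifiers from the post with n = 7, this keeps MultiWordIdentifier and renames the other two to AMultiWordThingy and BMultiWordyProgrammer. A real tool would also have to extract the identifiers from the source and skip keywords and library names, which this sketch leaves out.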
ldl@genix.UUCP (12/13/83)
In light of the fact that we (programmers) cannot depend on having identifiers of more than 'n' characters, I have taken to writing code somewhat differently. I 'beat up' on cpp's features. I'm not too sure if this plan will really work, but it does on our V7 environment.

In a header:

    #define foobarthere snm00 /* routine description */
    #define foobarhere  snm01 /* routine description */

In code:

    foobarthere(...)
    {
        ...
        foobarhere(...);
        ...
    }

    foobarhere(...)
    {
    }

Notice that in the source code, the 'long ids' are used. The macro processor 'remaps' the names into something that the compiler can handle, and the user doesn't have to think about it. A further use of this (needed in my case) is that there are several 'support' routines (our stuff is divided into libraries) that are called by the 'main' routines in the library. Using this technique, there is no need to be concerned about having routines in one library uniquely named from all other routines. The 'real' name is controlled by the 'remapped' name that is handled by cpp. At first, I was rather concerned about the 'limits' of C external tags, etc, but using the above technique, no problems have been encountered to date (across over 80000 lines of lex, yacc, and C). Use cpp! P.S. I have hit the limit of 'too much defining' in one area. I took care of this by building a simple interface that uses m4 (yuck! for C). I still work in 'pure' C, and let a couple of scripts and make work out the details. -- Spoken: Larry Landis USnail: 5201 Sooner Trail NW Albuquerque, NM 87120 MaBell: (505)-898-9666 UUCP: {ucbvax,gatech,parsec}!unmvax!genix!ldl
eric@whuxle.UUCP (12/14/83)
#R:sdcsvax:-6000:whuxle:23200003:000:659 whuxle!eric Dec 13 17:22:00 1983 In reference to the suggestion of using #define massively_long_name short and then using massively_long_name();, or massively_long_name = 3;, etc.... THIS IS A BITCH TO DEBUG, ESPECIALLY IF SOMEONE DOESN'T READ THE HEADER FILES....... i.e., I print out the code above, have the hard copy next to me, and call "adb" (how arcane!!) to figure out why massively_long_name(); causes "Memory fault -- core dumped". To my surprise, massively_long_name() is NEVER called in the program. Instead, all I see are short(), or worse xx3ef();...... ARGHHH!!! If someone ever did that to me and I wasted time chasing it I would KILL!!! From the world of adb, eric
lee@unmvax.UUCP (12/14/83)
Larry, when was the last time you used ADB (you don't have SDB or its kin)? If I were attempting to debug using ADB I would sure find it inconvenient to have to go look up "snm00" to find it was really "foobarandbedamned" in my source. Think I would just try to be imaginative in the standard number of characters. Ready to get my buns torched, --Lee {ucbvax,gatech,parsec}!unmvax!lee
rpw3@fortune.UUCP (12/14/83)
#R:sdcsvax:-6000:fortune:16200011:000:804 fortune!rpw3 Dec 14 04:01:00 1983 I don't know which PDP-10 compiler was meant, but there was an abbreviation standard for squeezing 6-character program names into three characters, as needed for file extensions and per-job tempfiles. As I recall, it was derived from some work done at Bell Labs on place-name abbreviations, and went like this: Take the first letter, the next consonant, and the last consonant, duplicating as necessary, EXCEPT if the word is already 3 chars, leave it alone (so PIP => PIP, not PPP). Examples:

    LOGIN   => LGN
    ALGOL   => ALL
    FORTRAN => FRN
    BASIC   => BSC
    FREE    => FRR

There was something about "Y" as a consonant, but I forget. Rob Warnock UUCP: {sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3 DDD: (415)595-8444 USPS: Fortune Systems Corp, 101 Twin Dolphins Drive, Redwood City, CA 94065
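The rule as recalled above can be written down in a few lines of C. This is a sketch of the rule as stated in the post, not of the original PDP-10 code; `abbrev3` and `is_consonant` are made-up names, and two details are assumptions because the post leaves them open: "Y" is treated as a consonant, and when no consonant follows the first letter the previous output letter is duplicated.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Assumption: every letter other than A, E, I, O, U counts as a
 * consonant, including Y. */
static int is_consonant(char ch)
{
    int c = toupper((unsigned char)ch);

    return isalpha(c) && strchr("AEIOU", c) == NULL;
}

/* Writes the abbreviation of word into out, which must hold 4 bytes. */
void abbrev3(const char *word, char out[4])
{
    size_t len, i;

    assert(word != NULL && out != NULL);
    len = strlen(word);
    if (len <= 3) {                 /* already short: PIP stays PIP */
        strcpy(out, word);
        return;
    }
    out[0] = word[0];               /* first letter */
    out[1] = out[0];                /* assumed fallback: duplicate */
    for (i = 1; i < len; i++)       /* next consonant */
        if (is_consonant(word[i])) {
            out[1] = word[i];
            break;
        }
    out[2] = out[1];                /* duplicate unless a later one exists */
    for (i = len - 1; i >= 1; i--)  /* last consonant */
        if (is_consonant(word[i])) {
            out[2] = word[i];
            break;
        }
    out[3] = '\0';
}
```

Searching for the last consonant from the end of the word is what yields the duplicated letters in ALL and FRR: the scan simply arrives back at the same consonant that was already chosen as the "next" one.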
lee@haddock.UUCP (12/15/83)
#R:sdcsvax:-6000:haddock:12400001:000:111 haddock!lee Dec 12 22:01:00 1983 In the recently announced System V.2, flexnames are supported. The past is, finally, behind us on this issue.
fair@dual.UUCP (Erik E. Fair) (12/15/83)
For the gentleman who was `beating on cpp' for flexnames: god help you if you have to use adb on a core file! None of the compiled variables will make any sense w/o the header file... Erik E. Fair {ucbvax,amd70,zehntel,unisoft,onyx,its}!dual!fair Dual Systems Corporation, Berkeley, California
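The lookup that the debugger user ends up doing by hand (from a short name seen in adb back to the long name in the source) could be mechanized along these lines. This is a minimal sketch assuming the header contains only simple lines of the form `#define LONG SHORT`; `parse_define`, `find_long_name`, and `NAMELEN` are hypothetical names, and nothing here talks to adb itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAMELEN 64  /* assumed maximum name length, including the NUL */

/* Returns 1 and fills both buffers if line looks like
 * "#define LONG SHORT"; trailing comments are ignored.  Function-like
 * macros and "# define" spellings are not handled. */
int parse_define(const char *line, char long_name[NAMELEN],
                 char short_name[NAMELEN])
{
    assert(line != NULL);
    return sscanf(line, " #define %63s %63s", long_name, short_name) == 2;
}

/* Scans header text for the long name that maps to short_name.
 * Returns 1 and fills long_name if found; on failure the contents of
 * long_name are unspecified. */
int find_long_name(FILE *in, const char *short_name,
                   char long_name[NAMELEN])
{
    char line[256], s[NAMELEN];

    while (fgets(line, sizeof line, in) != NULL)
        if (parse_define(line, long_name, s) && strcmp(s, short_name) == 0)
            return 1;
    return 0;
}
```

Given the header from the earlier post, looking up snm00 would report foobarthere. It does not remove the objection, since the core file still shows only the short names, but it makes the header lookup mechanical.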
ado@elsie.UUCP (12/19/83)
In particular, regarding-- In the recently announced System V.2, flexnames are supported. The past is, finally, behind us on this issue. Well, maybe. As I recall, though, the UNIX (Bell Labs trademark) gurus have a history of abandoning support for machines they no longer feel like supporting-- like the PDP 11/40 when Version 7 was originally released. This has typically been done to "enhance portability"; how an operating system that can't run on systems it used to run on is "more portable" than its predecessor is beyond me. And of course there are folks using C compilers other than ones supplied by Berkeley or Bell; given that "flexible names" aren't "required" in "The C Programming Language," such compilers may lack support for them. I think the challenges of "historic" dialects of C are still with us. -- UUCP: decvax!harpo!seismo!rlgvax!cvl!elsie!ado Phone: (301) 496-5688
msc@qubix.UUCP (Mark Callow) (12/21/83)
Regarding --

In a header:

    #define foobarthere snm00 /* routine description */
    #define foobarhere  snm01 /* routine description */

In code:

    foobarthere(...)
    {
        ...
        foobarhere(...);
        ...
    }

    foobarhere(...)
    {
    }

----------

This would make symbolic debugging awfully hard... Of course I don't know many v7 systems with symbolic debuggers. -- From the Tardis of Mark Callow msc@qubix.UUCP, decwrl!qubix!msc@Berkeley.ARPA ...{decvax,ucbvax,ihnp4}!decwrl!qubix!msc, ...{ittvax,amd70}!qubix!msc
ldl@genix.UUCP (02/29/84)
I realize that this is a bit late, and that this discussion has died down to some extent, but our news source was in the process of upgrading to 4.2 and we were out of contact for a while. Please forgive so late a response. This will clarify how it can be done without too much hassle. First of all, the names (short ones) MUST be chosen with care. If you divide the routines up into libraries, then all short names should indicate where the name originates. For example, I have a library named 'runtime'. In this library, there are both user calls and support routines. The names of all 'user calls' are tagged 'rtgnn' and 'supports' are tagged 'rtlnn', for 'RunTime Global' and 'RunTime Local', respectively. Similarly, any variables used (common) by these routines are named 'rtgxnn' and 'rtlxnn' for 'RunTime Global eXternal' and 'RunTime Local eXternal'. (The 'nn' is a number sequence like '01', '02', etc). In general, there are fewer than 20 user calls and 30 supports, and no more than 10 variables (of each, global and local). Secondly, I have rarely had to resort to going to adb. I guess that it comes from having worked on systems that did not allow highly interactive debugging (i.e. communications front-ends, etc), but a long session of desk checking is worth 10 times that amount of time in debugging. (Typically, the 'debug' time, running the new code, is less than 10% of the total coding time, for me). Those few times that I have resorted to using adb, I was able to quickly track down what was happening and correct the fault. I guess that I should mention that I have put a preprocessor in front of the actual 'cc' that does the real remapping, thus the mapped names are easily accessible mechanically (I have not interfaced it to adb, thus the objections are valid from the standpoint of adb or even sdb). As for name conflicts between library support routines (i.e. routines that are never accessed, nor accessible, outside of the library proper), the funky names have helped rather than hurt, by preventing the conflicts that could otherwise result (like the wrong routine being linked into the right spot because the name was the same). There have been many good suggestions, but apart from the language being changed (universally, on all Unix environments), there are disadvantages to all shortening approaches. The solution (apart from changing the definition of the language) appears to be deciding which poison tastes least bad. :-) -- Spoken: Larry Landis USnail: 5201 Sooner Trail NW Albuquerque, NM 87120 MaBell: (505)-898-9666 UUCP: {ucbvax,gatech,parsec}!unmvax!genix!ldl
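The tagging scheme described in this post is regular enough to generate mechanically. The following is a sketch only: `make_tag` and its parameters are invented for illustration and are not taken from the preprocessor mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds a short name such as "rtg01" (global routine) or "rtlx07"
 * (local variable) into buf.  lib is the library prefix, e.g. "rt";
 * seq is the two-digit sequence number.  Returns 0 on success, -1 if
 * the name did not fit in buf. */
int make_tag(char *buf, size_t size, const char *lib,
             int is_global, int is_variable, int seq)
{
    int n;

    assert(buf != NULL && lib != NULL && strlen(lib) > 0);
    assert(seq >= 0 && seq <= 99);
    n = snprintf(buf, size, "%s%c%s%02d", lib,
                 is_global ? 'g' : 'l',     /* Global or Local */
                 is_variable ? "x" : "",    /* eXternal variable */
                 seq);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a two-letter library prefix the longest tag, such as rtgx01, is six characters, so every generated name stays inside even a 6-character significance limit.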