mcdonald@uxe.cso.uiuc.edu (08/29/88)
>As I understand the draft standard, you may portably compute the address
>of the location *after* the last element of the array, but not the
>location *before* the first element of the array.
>Here is at least one architecture that breaks: 80X86.
> char *foo = malloc(65635)
> /* foo <- address 4a33:0000 */
> char *bar = foo-1;
>Ok, now what is the value of "(4a33:0000)-1"? Answer: there isn't
>one. The draft standard doesn't say that nonconforming programs will
>break on all machines, just that they won't work on all machines.
Interesting. The documentation says that the argument of malloc is an unsigned int, for which the maximum value is 65535. Heaven knows what would happen if you actually tried this. But let's say we go to huge model and use halloc(65635,1), which is legal but obviously nonportable. OR you could say char foo[65635]; and compile in the huge model. In either case you get something like 2000:0000 for bar = foo; and 1000:FFFF for bar = foo; bar--; and if you do bar = foo; bar--; bar++; you get back 2000:0000. Thus on the 8086 in huge model it indeed works (but, we all agree, is nonportable).
Actually, if you write char foo[65536]; in large model it works: the pointer arithmetic works on only the offset portion of the address, so you get wraparounds, but as long as you don't try to dereference the resulting pointers everything works. If you DO try to dereference bar = foo; bar--; you of course actually access foo[65535].
In small data model, the first few bytes of the data segment are reserved for the word "Microsoft", so the worst you can do there is mess with their name. The last bytes of the segment are the stack, so if you DO dereference past the end of the legal data area, disaster most certainly COULD occur.
The fact that it may work doesn't mean that it is pretty, though.
Doug McDonald
P.S. This is for Microsoft C.
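[Editorial sketch, not part of the original posts.] The rule quoted at the top of this post can be illustrated in a few lines of C. The function names `sum_forward` and `sum_backward` are made up for this example. It only shows one way to walk an array in both directions while forming nothing lower than the array's first address; it makes no claim about what any particular 80X86 compiler does with `foo-1`.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array front to back. `end` is the one-past-the-end address:
   it may be computed and compared, but it is never dereferenced. */
long sum_forward(const int *a, size_t n)
{
    const int *p;
    const int *end;
    long total = 0;

    assert(a != NULL);
    end = a + n;                 /* legal: one past the last element */
    for (p = a; p != end; ++p)
        total += *p;
    return total;
}

/* Sum back to front without ever forming a - 1: the pointer is
   decremented just before it is used, so it stops at a itself. */
long sum_backward(const int *a, size_t n)
{
    const int *p;
    long total = 0;

    assert(a != NULL);
    p = a + n;                   /* start at one past the end */
    while (p != a)
        total += *--p;
    return total;
}
```

The backward loop is the part that matters for this thread: the usual temptation is to loop `for (p = a + n - 1; p >= a; --p)`, which computes `a - 1` on the final decrement.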
mcdonald@uxe.cso.uiuc.edu (09/05/88)
>There exist machines whose protection philosophy is to prevent you from
>even thinking something illegal. In particular, on the Unisys A-series,
>the compiler must implement all memory addressing protection--there is
>no kernel/user state protection on memory.* A program cannot be allowed
>to form an invalid address, as there is nothing to stop it from using it,
>and nothing in the hardware to stop you from stomping on another user
>if you do. Therefore, the compiler and the operating system would be
>written so as to cause an interrupt if computing 'b - 1' were attempted.
>Note that there is no C compiler for the A-series today, although one is
>rumored.
This seems logically inconsistent. You say that on the Unisys A-series the problem is in the compiler. But then you say there is no C compiler. If the problem exists in other language compilers, simply leave it out of C! Simply write the compiler so that it doesn't check pointers. (If it did do that, wouldn't it be a horrendous time penalty? Every time you said "pointer++" it would have to check bounds, unless the pointer were declared the nonexistent "noalias".)
What about assembly language? What is to stop things from happening there with out-of-bounds pointers?
Doug McDonald
ok@quintus.uucp (Richard A. O'Keefe) (09/06/88)
In article <225800063@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>This seems logically inconsistent. You say that on the Unisys A-series
>that the problem is in the compiler. But then you say there is no
>C compiler. If the problem exists in other language compilers,
>simply leave it out of C! Simply write the compiler so that it doesn't
>check pointers. (If it did do that, wouldn't it be a horrendous time
>penalty? Every time you said "pointer++" it would have to check
>bounds, unless the pointer were declared the non-existant "noalias".)
>
>What about assembly language? What is to stop things from
>happening there with out of bounds pointers?
>
>Doug McDonald
WHAT assembly language? NEWP (an Algol-like language) is as close to assembly language as one gets on the A-series. The point is that the machine was not designed to do pointer arithmetic (there isn't even a single notion of "address"; Indirect Reference Words and Indexed Descriptors have different tags and interpretations). The operating system has been able to rely on this. If you generate code to do "pointer arithmetic" on, say, Indexed Descriptors, you find that (a) you just bypassed the Virtual Memory system, and (b) bye-bye system integrity!
The compilers don't generate special code to check pointers, so it isn't something you can "leave out of" C. The systems-programming languages do have things called pointers, which are Indexed Descriptors, and can be adjusted. But incrementing such a pointer by N involves touching each word of storage referenced so that a boundary word won't be missed (not a performance hit, because this is not the kind of thing A-series machines are normally asked to do).
I know of two BCPL compilers for the A-series, one actual one and one that was designed but not finished. Neither of them was a pretty sight. (The PL/I compiler had similar troubles.)
The A-series is a "high level" architecture for a particular set of languages (Algol, Fortran, COBOL), and you can't expect languages outside that set to map well onto it.
mcdonald@uxe.cso.uiuc.edu (09/09/88)
>What's to stop you from doing the following:
> Generate code in an array.
> Jump to the beginning of the array. *
>Now you've blown the protection. You can do anything. I hope this isn't a
>multiuser machine...
It is certainly possible to design machine/compiler combinations that prevent this. I call them "totalitarian" or "Stalin" operating systems. Apparently ANSI C does not prohibit this behaviour: a fatal flaw in the ANSI standard. If you can't do this, an entire class of programs becomes absolutely impossible: incremental compilers. It would prohibit a Turbo C or Quick C clone, for example. All of the programs I have designed for teaching chemistry and physics wouldn't work.
It is even possible to design an operating system so that it is impossible (inside it, of course) to write compilers: there is some magic cookie necessary to make an executable file, and no compiler or assembler allows setting such a cookie*. VMS makes it rather difficult to set such a thing (but possible). Does the Unisys A series REALLY make it all that impossible? If so, maybe that is why no one has ever heard of them!
Doug McDonald
* I mean that the compiler can make an executable, but that you can't write a program that will make an executable.
dricej@drilex.UUCP (Craig Jackson) (09/11/88)
In article <225800063@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes: [Although there is no attribution, I wrote the >> stuff. CEJ] >>There exist machines whose protection philosophy is to prevent you from >>even thinking something illegal. In particular, on the Unisys A-series, >>the compiler must implement all memory addressing protection--there is >>no kernel/user state protection on memory.* A program cannot be allowed >>to form an invalid address, as there is nothing to stop it from using it, >>and nothing in the hardware to stop you from stomping on another user >>if you do. Therefore, the compiler and the operating system would be >>written so as to cause an interrupt if computing 'b - 1' were attempted. > >>Note that there is no C compiler for the A-series today, although one is >>rumored. >This seems logically inconsistent. You say that on the Unisys A-series >that the problem is in the compiler. But then you say there is no >C compiler. If the problem exists in other language compilers, >simply leave it out of C! Simply write the compiler so that it doesn't >check pointers. (If it did do that, wouldn't it be a horrendous time >penalty? Every time you said "pointer++" it would have to check >bounds, unless the pointer were declared the non-existant "noalias".) You don't completely understand. The problem is not in the compiler, the 'problem' is in the architecture that leaves things up to the compiler to check. Theoretically the system could be a little faster by not doing as many security checks at run time; in reality, they don't save any logic, I believe. The upshot of this is that if you wrote a compiler that allowed undisciplined pointer operations, the system would be about as safe as MS-DOS. The nice thing about the hardware is that "pointer++" is checked by the hardware, assuming that the arrays are set up in the normal manner. There's a special 'add to pointer' instruction, which checks the tags on memory. 
There's another instruction to 'subtract from pointer', which is going to be used for 'int b[10];int *bb = b - 1;'. This instruction, in attempting to move the pointer down from the beginning of the array, would hit a word with an illegal tag and cause an interrupt. >What about assembly language? What is to stop things from >happening there with out of bounds pointers? There is no assembler for the A-series. Normal programs are written in ALGOL, COBOL, FORTRAN, PASCAL, or PL/I. The operating system, and certain operating systems extensions, are written in an extended ALGOL called NEWP. NEWP cannot be used to write normal user programs, and NEWP libraries (which are sort of an operating system extension) must be blessed by the operator before they are executed. >Doug McDonald As a further note, I believe that one reason why A-series C might not use the hardware stack and hardware pointers in a normal manner is varargs. What can be done about a system which *must* check argument count & type before execution? -- Craig Jackson UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej BIX: cjackson
gwyn@smoke.ARPA (Doug Gwyn ) (09/11/88)
In article <225800065@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes: >> Generate code in an array. >> Jump to the beginning of the array. * >It is certainly possible to design machine\compiler combinations that >prevent this. I call them "totalitarian " or "Stalin" operating systems. >Apparently ANSI C does not prohibit this behaviour: a fatal flaw >in the ANSI standard. IF you can't do this, an entire class of programs >becomes absolutely impossible: incremental compilers. It would prohibit >a Turbo C or Quick C clone, for example. All of my programs I have designed >for teaching chemistry and physics wouldn't work. I'm getting a bit tired of talk about "fatal flaws" in the proposed ANSI C standard from people who don't understand the goals and constraints under which such a standard is developed. It is simply NOT FEASIBLE for a global C standard to dictate characteristics of an implementation environment such as the ability to (somehow) switch the thread of execution into a process's data space. The proposed C standard does not prohibit an implementation from offering support for such a feature, but it also does not require such support. Any application that depends on such a feature, or on dynamic linking, communication with coprocesses, or other specific techniques for run-time creation and execution of machine instructions, is already inherently nonportable. It is not the job of a C standard to render already nonportable code suddenly, magically portable. Feel free to do anything that happens to work at the moment on your particular system. Just be aware that it may not work elsewhere or elsewhen, and please have the good sense not to blame this on people who have no direct control over that aspect of reality.
ldh@hcx1.SSD.HARRIS.COM (09/13/88)
This may have been specified before ... but I may have missed it. 1) is "numerical recipes in C" PD, Shareware or $$$$$$$ 2) where do I get a copy of it 3) I gather from the discussions that it will work on a PC, but which compiler is best suited to the games they play with the arrays? (TC1.5?) 4) will it work (at all?) better with sysV or UCB compilers/libs ? Thanks to all ... Leo Hinds *net: ldh@hdw.harris.com uunet!hcx1!hardy!ldh
mcdonald@uxe.cso.uiuc.edu (09/15/88)
In article <225800065@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu (that's me folks) writes: >> Generate code in an array. >> Jump to the beginning of the array. * >It is certainly possible to design machine\compiler combinations that >prevent this. I call them "totalitarian " or "Stalin" operating systems. >Apparently ANSI C does not prohibit this behaviour: a fatal flaw >in the ANSI standard. IF you can't do this, an entire class of programs >becomes absolutely impossible: incremental compilers. It would prohibit >a Turbo C or Quick C clone, for example. All of my programs I have designed >for teaching chemistry and physics wouldn't work. The usually sane Doug Gwyn replies: >I'm getting a bit tired of talk about "fatal flaws" in the proposed >ANSI C standard from people who don't understand the goals and >constraints under which such a standard is developed. It is simply >NOT FEASIBLE for a global C standard to dictate characteristics of >an implementation environment such as the ability to (somehow) switch >the thread of execution into a process's data space. The proposed C >standard does not prohibit an implementation from offering support >for such a feature, but it also does not require such support. >Any application that depends on such a feature, or on dynamic linking, >communication with coprocesses, or other specific techniques for >run-time creation and execution of machine instructions, is already >inherently nonportable. It is not the job of a C standard to render >already nonportable code suddenly, magically portable. I don't care one whit about what the goals and constraints of X3J11 (or X3J3 for that matter) ARE. I care about what they OUGHT to do. I don't see why being able to create code and execute it could cause the hardware of any machine fits. I can see how it might make a compiler vendor have fits if a cast of a data pointer to a code pointer wasn't simply a no-op, as it is on most sane machines. 
On the vast majority of machines it IS either a no-op or, as for example in OS/2, there is a simple system call that turns a data pointer into a code pointer which you can call. The cast would simply have to call the operating system. I can conceive of an architecture where it is absolutely impossible to have code and data in the same address space: say a physically different memory. But even there it could be done: somehow the system has to get code into the code memory, perhaps the only way being to write it to disk and read it out. In that case the run time library has to write out the data, and read it back in.
I don't accept the argument that "our operating system doesn't allow user programs to do that". If it were in the C language spec they would have to CHANGE THE OPERATING SYSTEM TO MAKE IT WORK or else admit "our operating system is so broken that we can't have a C compiler". I want it put in the language definition so that systems that can't do it are made to say to all the world "Look at me, I'm the big bright computer of the future, I'll tell you how great this hotshot new protection scheme is, it's so great that I'm terminally unable to offer a C compiler to my users (if there are any)." I want the C standard to essentially force vendors to fix their machines.
Dynamic linking, coprocessors, etc. really ARE operating system issues, and outside C. I am less than happy over the raw-terminal-io discussion going on in another comp.lang.c thread: I think that a portable way to get raw io MIGHT be possible, and should be thought about. But the issue there is PORTABILITY, not IMPOSSIBILITY.
I find it quite interesting to compare X3J11 to X3J3. X3J3 has been known to give the same argument that Gwyn uses, to wit, "it would discombobulate one vendor", to argue against adding features to Fortran, when the very same features are ALREADY in C! Among these are bit operations ( | & ^ in C) and external names longer than 6 (six) characters.
Doug McDonald
rob@kaa.eng.ohio-state.edu (Rob Carriere) (09/16/88)
In article <44100012@hcx1> ldh@hcx1.SSD.HARRIS.COM writes:
>This may have been specified before ... but I may have missed it.
>1) is "numerical recipes in C" PD, Shareware or $$$$$$$
None of the above. It is a book, published by Cambridge University Press for 40-some dollars. The programs listed in it can be obtained in machine readable form for another 20 or so.
>2) where do I get a copy of it
See above.
>3) I gather from the discussions that it will work on a PC, but which
>compiler is best suited to the games they play with the arrays? (TC1.5?)
I have no idea, but you can always eliminate the ``games'' at the cost of a small amount of storage.
>4) will it work (at all?) better with sysV or UCB compilers/libs ?
I am using it on a Sun 3/50 (BSD) with both cc and GNU cc. 'Tworks fine.
Rob Carriere
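[Editorial sketch, not part of the original posts.] One reading of "eliminate the games at the cost of a small amount of storage", assuming the games in question are the `b - 1` style offset pointers discussed earlier in the thread: allocate one extra element and simply leave index 0 unused. The names `vector1` and `sum1` are invented for this illustration and are not taken from the book.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate storage that can be indexed as v[1]..v[n].
   v[0] exists but is never used; that is the "small amount of
   storage" paid to avoid forming a pointer below the block.
   Returns NULL if the request is too large or malloc fails. */
double *vector1(size_t n)
{
    if (n >= (size_t)-1 / sizeof(double))
        return NULL;             /* (n + 1) * sizeof(double) would overflow */
    return malloc((n + 1) * sizeof(double));
}

/* Sum elements 1..n of a vector obtained from vector1(). */
double sum1(const double *v, size_t n)
{
    double s = 0.0;
    size_t i;

    assert(v != NULL);
    for (i = 1; i <= n; ++i)
        s += v[i];
    return s;
}
```

Because the pointer returned is the one `malloc` produced, it can be handed straight to `free`; no adjustment back by one element is needed.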
ok@quintus.uucp (Richard A. O'Keefe) (09/16/88)
In article <225800069@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes: > >In article <225800065@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu >(that's me folks) writes: >>> Generate code in an array. >>> Jump to the beginning of the array. * >The usually sane Doug Gwyn replies: > >>I'm getting a bit tired of talk about "fatal flaws" in the proposed >>ANSI C standard from people who don't understand the goals and >>constraints under which such a standard is developed. ... >>It is not the job of a C standard to render >>already nonportable code suddenly, magically portable. > >I don't care one whit about what the goals and constraints of X3J11 >(or X3J3 for that matter) ARE. I care about what they OUGHT to do. >I don't see why being able to create code and execute it could >cause the hardware of any machine fits. The most famous example is the B6700, where memory consisted of 52-bit words (1 parity, 3 tag, 48 data). Even tags (0 = single precision, 2 = double precision, 4 & 6 hairy) were things user code could manipulate, odd tags (1 = indirect reference, 5 = array description, 7 = procedure, 3 = boundary/stack control word/code) were not. At my home university we installed a hack (for the benefit of a load-and-go Fortran compiler) which took an array and changed it to code. But you couldn't use it as code and data *both* at the same time, and there were a number of other restrictions. When MCP 3.0 of the operating system came out, a better approach would have been to create a code file and attach it as a dynamic library (that way the code would not have been locked in physical memory). There are quite a few machines with separate I/D. The UNIX PERQ was (is?) one of them. Some modern RISCs are. A micro-controller with execute access only to a ROM would not be able to do this. And so on. But all of this misses what I think Doug Gwyn's point is. 
If you are generating code into an array, *that* part of the program is *already* non-portable (because the code is machine-dependent). The ANSI C committee cannot be expected to demand that everyone emulate the 80286 in order to make programs which generate 80286 code into an array and jump to it portable. If you move your program to another machine you are going to have to rewrite much if not most of the code that generates the instructions. What is so terrible about changing the call as well?
gwyn@smoke.ARPA (Doug Gwyn ) (09/16/88)
In article <225800069@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes: >If it were in the >C language spec they would have to CHANGE THE OPERATING SYSTEM TO >MAKE IT WORK or else admit "our operating system is so broken that >we can't have a C compiler". ... I want the C standard to essentially >force vendors to fix their machines. X3J11 has rightly observed that such an attitude would most likely lead to the ANSI C standard failing to gain the widespread support necessary for a true standard. There are practical reasons for promoting a C standard, but imposition of a particular philosophy of hardware architecture design on the computing industry is not one of them. >... when the very same features are ALREADY in C! Among >these are bit operations ( | & ^ in C) and external names longer than >6 (six) characters. C extern names are not necessarily unique beyond 6 characters, monocase. In some environments they are and in some they aren't. Acknowledging this constraint was one of the most distressing decisions that X3J11 had to make. But the fact is, many C implementors are not in a position to improve the linker that will of necessity be used with the object code their compiler generates.
dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/17/88)
In article <8507@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >C extern names are not necessarily unique beyond 6 characters,... >But the fact is, many C >implementors are not in a position to improve the linker that >will of necessity be used with the object code their compiler >generates. (This is not meant to be a flame, just a comment.) I think Doug Gwyn exaggerates in saying "many" and "of necessity". -- Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
gwyn@smoke.ARPA (Doug Gwyn ) (09/17/88)
In article <3981@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
->But the fact is, many C
->implementors are not in a position to improve the linker that
->will of necessity be used with the object code their compiler
->generates.
-I think Doug Gwyn exaggerates in saying "many" and "of necessity".
No. (Sometimes I wonder why I waste my breath, er, fingers.)
henry@utzoo.uucp (Henry Spencer) (09/18/88)
In article <3981@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>But the fact is, many C
>>implementors are not in a position to improve the linker that
>>will of necessity be used with the object code their compiler
>>generates.
>
>I think Doug Gwyn exaggerates in saying "many" and "of necessity".
No. The world does not consist primarily of Unix systems with sources, or of hobbyist-owned micros that can abandon standard software whenever it's convenient to do so. Most C compilers have to fit into existing environments, which the compiler writer cannot change without greatly diminishing the market for his compiler. Given a choice of conforming to ANSI C or conforming to the de facto standards set by the operating system in question, most compiler writers know which side their bread is buttered on.
Speaking as an amateur compiler writer with professional compiler-writer friends, we don't like this any more than you do. We don't like income tax, either. We have no illusions about being able to change either problem.
--
NASA is into artificial | Henry Spencer at U of Toronto Zoology
stupidity. - Jerry Pournelle | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
will.summers@p6.f18.n114.z1.fidonet.org (will summers) (09/18/88)
In article <3981@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
> In article <8507@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>)
> writes:
> >C extern names are not necessarily unique beyond 6 characters,...
> I think Doug Gwyn exaggerates in saying "many" and "of necessity".
I hate this restriction (big deal! **everybody hates this**, even the committee!) So what to do? Liberally paraphrasing the Rationale:
dpANS work-around 2: Use defines:
#define real_long_name a_xyz_real_long_name
#define real_long_name2 a_rwt_real_long_name2
dpANS work-around 3: Use longer names and kiss portability to short-extern environments goodbye.
What to do? Well, dpANS *permits* the implementor to honor as much significance as he wishes. In practice an implementor affected by market forces will honor as many characters as his environment permits.
So I choose (3), and will add #defines a la (2) if I ever need to port to a short-extern environment. I think so many programmers in longer-extern environments will do the same that those importing to short-extern environments will encounter the problem often enough to develop tools to generate the #defines automatically.
\/\/ill
--
St. Joseph's Hospital/Medical Center - Usenet <=> FidoNet Gateway
Uucp: ...{gatech,ames,rutgers}!ncar!noao!asuvax!stjhmc!18.6!will.summers
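[Editorial sketch, not part of the original posts.] Work-around 2 can be tried out in a small, self-contained form. The function names and the `a_rws_` / `a_cls_` prefixes below are hypothetical, chosen only so that the two external names differ within their first six characters even when case is ignored. Without the two `#define` lines, both names would begin with `widget`, and a six-character monocase linker could not tell them apart.

```c
#include <assert.h>

/* The source keeps the readable long names; the preprocessor
   rewrites them so the linker sees a_rws_... and a_cls_... */
#define widget_count_rows a_rws_widget_count_rows
#define widget_count_cols a_cls_widget_count_cols

int widget_count_rows(void) { return 3; }
int widget_count_cols(void) { return 4; }

/* Callers are written against the long names and are unaffected
   by the renaming, provided they see the same #define lines. */
int widget_cells(void)
{
    int rows = widget_count_rows();
    int cols = widget_count_cols();

    assert(rows > 0 && cols > 0);
    return rows * cols;
}
```

In a real program the `#define` lines would presumably live in one shared header, so that every translation unit agrees on the mapping.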
seanf@sco.COM (Sean Fagan) (09/18/88)
In article <225800069@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
[lots of ranting and raving, deleted; see Doug Gwyn (the "normally sane")'s]
[reply for some good answers]
>I can conceive of an architecture where it
>is absolutely impossible to have code and data in the same address
>space: say a physically different memory.
The PDP-11, using split I&D, is unable to generate code and then jump to it. The 80386 (and, I think, the 80286) cannot execute data. You have to change permissions for the page (or segment, I forget which). Um, I don't see a problem with this.
>somehow the system has to get code into the code memory,
>prehaps the only way being to write it to disk and read it out.
Yep, that's how XENIX does it. Reads it all into data segments, and then changes it (magically) into a text segment. Again, I see nothing wrong with this.
>I don't accept the argument that "our operating
>system doesn't allow user programs to do that". If it were in the
>C language spec they would have to CHANGE THE OPERATING SYSTEM TO
>MAKE IT WORK or else admit "our operating system is so broken that
>we can't have a C compiler".
(Please excuse me, I don't normally do this. Also, I would like to reiterate the disclaimer below: I alone share my opinions.)
Your problem is that you grew up on a machine which could execute data (such as a VAX), and you think that all machines should then be like that. You are ranting and raving, calling Doug Gwyn insane (ok, you didn't out and out say that, but you darn well implied it), and also insinuating that the X3J11 committee, me, gobs of other people in the world, and the Intel Microprocessor design team are brain damaged and/or incompetent (well, Intel is questionable 8-) ).
Perhaps we should also put bit counting operators into C.
Then, we could write programs that require said operator, and say that all other machines are slow and stupid because they don't have such things built into the hardware (CDC Cybers do, Crays might, it was the only thing I could think of at 12:40 am 8-)). Or maybe we should require that all ints be 32 bits. And doubles be 64 bits, to hell with any machine which has a superior floating point scheme.
There is very rarely any need to be able to execute data that you have created on the fly. If you really need to, you can create an executable relatively easily, and then execute that. I, personally, dislike the idea, but that's just MHO.
Not all machines are alike, nor are all memory management schemes, nor are all operating systems. And, like it or not, at no point in C's history was it stated (or even implied) that you could jump to data. All the function pointers in K&R were assigned to functions created by the programmer (such as main, exit, printf, etc.), or NULL (which is, of course, a valid pointer).
Argc.
>Doug McDonald
--
Sean Eric Fagan | "Joy is in the ears that hear, not in the mouth that speaks"
seanf@sco.UUCP | -- Saltheart Foamfollower (S. R. Donaldson)
(408) 458-1422 | Any opinions expressed are my own, not my employers'.
sjs@jcricket.ctt.bellcore.com (Stan Switzer) (09/19/88)
In article <1988Sep17.212624.8858@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
> In article <3981@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
> >>But the fact is, many C
> >>implementors are not in a position to improve the linker that
> >>will of necessity be used with the object code their compiler
> >>generates.
> >
> >I think Doug Gwyn exaggerates in saying "many" and "of necessity".
>
> No. The world does not consist primarily of Unix systems with sources, or
> of hobbyist-owned micros that can abandon standard software whenever it's
> convenient to do so.
Two is a couple. A few is at least three (in my book). I guess *many* will have to be at least four. Let's put this question to the test.
"How many C implementations are constrained by 6 character monocase linkers and how badly are they constrained?"
In order to avoid netting too many red herrings, we'll exclude machine and operating system combinations for which no C compilers exist (if there is a viable implementation in the works, we'll let it slide). Also, different designations of the same basic architecture or OS count only once.
I can think of one, so I'll start:
1) GECOS / GCOS / GCOS 8
for the GE 600 / Honeywell 6000 / DPS 8 series
Being essentially quantitative, the first part of this controversy is easier to resolve than the second, but as of my last experience w/ GCOS (1982), I don't feel I'd have lost very much in abandoning the standard linker in favor of a "C" linker.
Stan Switzer sjs@ctt.bellcore.com
ok@quintus.uucp (Richard A. O'Keefe) (09/20/88)
In article <10295@bellcore.bellcore.com> sjs@ctt.bellcore.com (Stan Switzer) writes: >> >>But the fact is, many C >> >>implementors are not in a position to improve the linker that >> >>will of necessity be used with the object code their compiler >> >>generates. >Two is a couple. A few is at least three (in my book). I guess >*many* will have to be at least four. Let's put this question to the >test. >I can think of one, so I'll start: > 1) GECOS / GCOS / GCOS 8 > for the GE 600 / Honeywell 6000 / DPS 8 series Here are two very well known ones: 2) MVS/XA for IBM S/370 series 3) VM/CMS for IBM S/370 series There are some similarities between these two operating systems, but there are major differences too. There is a Japanese workalike for MVS, but let's ignore workalikes. The S/370 range can run System V (Amdahl's UTS) and SunOS, but they haven't got this linker problem (:-). There was a C compiler for TOPS-10 on the DEC-10, but I guess we can regard TOPS-10 as dead and not count it. One more, and we'll be there! But the question is not the number of _system types_ but the number of _implementors_. I know of four C compilers for VM/CMS, and I'm sure there must be more in progress.
seanf@sco.COM (Sean Fagan) (09/21/88)
In article <10295@bellcore.bellcore.com> sjs@ctt.bellcore.com (Stan Switzer) writes:
>Two is a couple. A few is at least three (in my book). I guess
>*many* will have to be at least four. Let's put this question to the
>test.
>I can think of one, so I'll start:
> 1) GECOS / GCOS / GCOS 8
2) CDC Cybers, 170 series. (I have to hedge a bit here, we can use *7* character identifiers, but, since it also uses, I believe, an underscore, that takes up one of the characters.)
It is, however, monocase. And, surprising though it may be to those who know the machine (and those who don't should 8-)), there exist at least *two* C Compilers for the machine: UofTexas (or is it Austin, I forget) ported PCC to NOS (ugh!), and I and a couple of friends (Hi mike!) ported Small-C (almost as much ugh!).
The Compilers work, but there is not much we can do about the linker (part of the operating system, you see; generally, you build a ".o" equivalent, then, when you try to run it, the OS recognizes that it is non-linked and then proceeds to link it).
>Stan Switzer sjs@ctt.bellcore.com
--
Sean Eric Fagan | "Never underestimate the bandwidth of a pickup full of
seanf@sco.UUCP | 9-track tapes!" - Eric Green (elg@killer)
(408) 458-1422 | Any opinions expressed are my own, not my employers'.
will.summers@p6.f18.n114.z1.fidonet.org (will summers) (09/21/88)
(Re: dpANS guarantee of only 6 monocase characters of external name significance)
In article <10295@bellcore.bellcore.com> sjs@jcricket.ctt.bellcore.com (Stan Switzer) writes:
> Two is a couple. A few is at least three (in my book). I guess
> *many* will have to be at least four.
Ah... the way I heard it was two's company, three's a crowd, four's a fist fight and five's a riot. Guess we need six. :-)
> "How many C implemenations are constrained by 6 character monocase
> linkers and how badly are they constrained?"
> 1) GECOS / GCOS / GCOS 8
> for the GE 600 / Honeywell 6000 / DPS 8 series
>
> Being essentially quantitative, the first part of this controversy is
> easier to resolve than the second, but as of my last experience w/
> GCOS (1982), I don't feel I'd have lost very much in abandoning the
> standard linker in favor of a "C" linker.
I believe the committee's concern was over those installations where security prevented all but "secure" programs from generating an executable module. Does GCOS qualify? I -think- the Waterloo C compiler for GCOS (single segment) recognizes 100 case-significant characters in external names.
I am a supporter of dpANS, but have trouble understanding this decision. Even if the implementor could not generate his own linker, it would seem that he could implement a pre-link pass that mapped longer identifiers in the .o files (or whatever). Non-dpANS .LIB files would need an associated mapping file. Maybe I just don't understand, but it seems a small price for the rest of the world to enjoy 32-bit externs.
I foresee this limitation as one of the most widely ignored, even by many programmers that are otherwise careful about portability considerations.
\/\/ill
--
St. Joseph's Hospital/Medical Center - Usenet <=> FidoNet Gateway
Uucp: ...{gatech,ames,rutgers}!ncar!noao!asuvax!stjhmc!18.6!will.summers
mcdonald@uxe.cso.uiuc.edu (09/22/88)
>I can think of one, so I'll start: > 1) GECOS / GCOS / GCOS 8 > 2) CDC Cybers, 170 series. And a third: PDP-11/RT11. And all of this is rather unimportant, because it should be possible to write a linker that links all the C files together and leaves only operating system calls and calls to other languages for the system linker.
henry@utzoo.uucp (Henry Spencer) (09/22/88)
In article <1305@scolex> seanf@sco.COM (Sean Fagan) writes: [6-character linkers in C environments] >> 1) GECOS / GCOS / GCOS 8 > > 2) CDC Cybers, 170 series. (I have to hedge a bit here, we can use *7* >character identifiers, but, since it also uses, I believe, an underscore, >that takes up one of the characters.) ... Unless RT-11 has changed a lot since I last saw it, it's a 6-character environment. And yes, there is at least one C compiler for it. -- NASA is into artificial | Henry Spencer at U of Toronto Zoology stupidity. - Jerry Pournelle | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
smryan@garth.UUCP (Steven Ryan) (09/23/88)
>The Compilers work, but there is not much we can do about the linker (part >of the operating system, you see; generally, you build a ".o" equivalent, >then, when you try to run it, the OS recognizes that it is non-linked and >then proceeds to link it). Which is nice. 170 Loader runs like a bat out of hell because it has to. ld runs like a turtle out of Antarctica.
dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/23/88)
In article <225800072@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes: [re linkers with 6-char limit] >And all of this is rather unimportant, because it should be possible >to write a linker that links all the C files together and leaves only >operating system calls and calls to other languages for the system linker. Actually, it's even easier than that. The C compiler can generate an internal object format. A custom post-processor takes these object files, scans for all long identifiers, shortens them to unique 6-char names, and produces as its output system-format object files ready for the standard linker. No linking need be done by this post processor. -- Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
ok@quintus.uucp (Richard A. O'Keefe) (09/23/88)
In article <4071@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes: >In article <225800072@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes: >[re linkers with 6-char limit] >Actually, it's even easier than that. The C compiler can generate an >internal object format. A custom post-processor takes these object >files, scans for all long identifiers, shortens them to unique 6-char >names, and produces as its output system-format object files ready for >the standard linker. No linking need be done by this post processor. There are several reasons why one wants the names in the source code to bear a simple predictable relation to the names the system sees, such as mixed language programming and system-supplied debugging tools like load maps. The people in comp.lang.c++ often complain about compiler-generated names. There was a program posted to one of the sources newsgroups a while back that did the long name -> unique name mapping on the source code; sorry I can't remember the name or the date, ask in comp.sources.wanted.
gwyn@smoke.ARPA (Doug Gwyn ) (09/24/88)
In article <703.2339B3CB@stjhmc.fidonet.org> will.summers@p6.f18.n114.z1.fidonet.org (will summers) writes: >but it seems a small price for the rest of the world to enjoy 32-bit >externs. Nothing is stopping the rest of the world from enjoying 32-bit externs. A little (very little) information theory will show that this cannot be guaranteed by any amount of trickery in a 6-character extern environment, if one does not have control over the linker etc. The proposed ANS for C does NOT repeat NOT prohibit implementations from supporting more than 6 monocase characters of significance in external identifiers. >I forsee this limitation as one of the most widely ignored, even by >many programmers that are otherwise careful about portability >considerations. It's already ignored, and already causes problems.
henry@utzoo.uucp (Henry Spencer) (09/25/88)
In article <4071@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes: >Actually, it's even easier than that. The C compiler can generate an >internal object format. A custom post-processor takes these object >files, scans for all long identifiers, shortens them to unique 6-char >names, and produces as its output system-format object files ready for >the standard linker. No linking need be done by this post processor. Right, so we build it into the output phase of the compiler, since it doesn't have to do any linking. Now we have a compiler whose output contains only 6-character names. How is this an improvement on simply doing that from the beginning? Remember that the rule applies only to external names, so it's how the names appear to the outside world -- to libraries, to modules written in other languages, to linkers -- that matters. It's easy to say "shortens them to unique 6-char names", but making that nice phrase *work* is just a wee bit harder. -- NASA is into artificial | Henry Spencer at U of Toronto Zoology stupidity. - Jerry Pournelle | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/26/88)
I wrote: A custom post-processor takes these object files, scans for all long identifiers, shortens them to unique 6-char names, and produces as its output system-format object files ready for the standard linker. In article <1988Sep24.212346.26591@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >Right, so we build it into the output phase of the compiler, since it >doesn't have to do any linking. Now we have a compiler whose output >contains only 6-character names. How is this an improvement on simply >doing that from the beginning? *If* existence of the post-processor could be assumed on the handful of systems with old linkers, using the post-processor would be better than using 6-char externs in the source to begin with, because: It would let people on systems with modern linkers use long externs in their C programs, knowing that their code would still be portable to systems with old 6-char linkers. Existence of the post-processor could be assumed *if* ANSI were to mandate long externs in all conforming compilers and recommend such post-processing to implementors stuck with an old linker. By the way, you can't really build the post-processor into the output phase of the compiler. It has to have access to all user files that will be linked so it can look for conflicting symbols and disambiguate them. The compiler itself might be used in a makefile to compile only one file at a time, so it won't know about all identifiers that conflict when truncated to 6 characters. (The above discussion is largely moot, because the 6-char limit on portable programs is here to stay for the next few years. But it's worth seeing that this limit was not necessary, and that the common arguments in its favor are not valid. This is *not* meant to be a flame.) -- Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
news@ism780c.isc.com (News system) (09/27/88)
In article <8569@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >The proposed ANS for C does NOT repeat NOT prohibit implementations >from supporting more than 6 monocase characters of significance in >external identifiers. > Absolutely true. But it does prevent me from *using* external identifiers with more than 6 monocase characters if I want to be certain that my programs will be accepted by *all* conforming C compilation systems. Marv Rubinstein
henry@utzoo.uucp (Henry Spencer) (09/28/88)
In article <16711@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes: >>The proposed ANS for C does NOT repeat NOT prohibit implementations >>from supporting more than 6 monocase characters of significance in >>external identifiers. >Absolutely true. But it does prevent me from *using* external identifiers >with more than 6 monocase characters if I want to be certain that my programs >will be accepted by *all* conforming C compilation systems. No, not quite right. For one thing, the identifiers can be longer than 6 characters, they just can't *rely* on being longer, i.e. they must be distinct in the first six. And second, it is not ANSI which is causing this, it is the deficiencies of existing computer systems. Anyone who wants to be certain about portability has had to observe this restriction all along. Moreover, it is not within ANSI's powers to cure that, since the systems that have the 6-character limit are the ones that can't change easily anyway. Encore une fois: standards committees are in the business of recognizing reality, not trying to change it just because the new version would be nicer. -- The meek can have the Earth; | Henry Spencer at U of Toronto Zoology the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu
henry@utzoo.uucp (Henry Spencer) (09/30/88)
In article <4111@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes: > A custom post-processor takes these object files, scans for all > long identifiers, shortens them to unique 6-char names, and > produces as its output system-format object files ready for the > standard linker. > >By the way, you can't really build the post-processor into the output >phase of the compiler. It has to have access to all user files that >will be linked so it can look for conflicting symbols and disambiguate >them... So we are talking about a partial linking step after all. And the postprocessor has to scan all the libraries, to prevent name conflicts with them. And the object modules and libraries can't be postprocessed until linking time. How, precisely, is this different from defining a new object-module format and writing a new linker? -- The meek can have the Earth; | Henry Spencer at U of Toronto Zoology the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu
news@ism780c.isc.com (News system) (10/01/88)
[Doug] >>The proposed ANS for C does NOT repeat NOT prohibit implementations >>from supporting more than 6 monocase characters of significance in >>external identifiers. [Marv] >>Absolutely true. But it does prevent me from *using* external identifiers >>with more than 6 monocase characters if I want to be certain that my programs >>will be accepted by *all* conforming C compilation systems. [Henry] >No, not quite right. For one thing, the identifiers can be longer than >6 characters, they just can't *rely* on being longer, i.e. they must be ^^^^ who are the 'they' that can't rely? :-) >distinct in the first six. And second, it is not ANSI which is causing >this, [Marv again] I was not suggesting that ANSI should do anything about the 6 character problem. I was just pointing out that even though some compiler implementers are kind enough to provide long names, I could not take advantage of their kindness and write programs with names like 'interval_two' and 'interval_three' if I want to run on an old-fashioned system. BTW, it isn't all that hard to supply long names on the old systems. I once had to write a compiler supporting long names on a 6 character system. What I did was write my own library-archive program and my own linker. My linker linked objects from the special archive and built a module that the standard system linker could process so as to finish the job. The effort added about six staff weeks to the compiler project. Marv Rubinstein
gwyn@smoke.ARPA (Doug Gwyn ) (10/03/88)
In article <16711@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes: -In article <8569@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: ->The proposed ANS for C does NOT repeat NOT prohibit implementations ->from supporting more than 6 monocase characters of significance in ->external identifiers. -Absolutely true. But it does prevent me from *using* external identifiers -with more than 6 monocase characters if I want to be certain that my programs -will be accepted by *all* conforming C compilation systems. Wrong -- it is not the dpANS that prevents you from doing that, but rather the way that some system environments happen to work. The dpANS simply acknowledges this externally-imposed constraint.