jharkins@sagpd1.UUCP (Jim Harkins) (11/28/89)
In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >I would set strict standards that deal with well written programs >(e.g. IMHO a. few if any globals, b. one routine per file, c. well >documented, etc.). I've never understood why one routine per file is such a *good* thing and thou shalt not deviate from this under penalty of dirty looks. In a lot of cases I feel that routines that belong together should be in the same file. An example is malloc/free. They use the same data structures, nobody else has any business seeing these structures, so why not have them in the same file? Other times I'll write a function to do something and, in the interests of small functions doing one thing, the body of the function is "call fred, call wilma, call barney, return something". Fred, wilma, and barney are usually declared static as they aren't much use to anyone else. I can understand the 1 routine 1 file idea, but in the real world it falls on its face pretty often. The real solution is to change C. As most software projects are divided into pieces, it seems I should be able to control which variable/function names are visible to other pieces. This calls for 2 different types of globals: a half-baked global that's visible amongst files in my piece, and full globals that are visible to other pieces. This is more of a linker issue than a C issue; all C needs is a new keyword to declare half-baked globals. I'm sure you language designers out there have a term for what I'm talking about in that last paragraph, but hopefully you all understand my point. my piece jim "Congress. Outside of Zsa Zsa the most bloated, conceited, self-indulgent entity in the world."
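The "call fred, call wilma, call barney" arrangement Jim describes might look roughly like the sketch below. The file name, the arithmetic inside each helper, and the names bedrock() and bedrock_calls() are all made up for illustration; only the use of static to keep helpers and data private to one file comes from the post.

```c
/* flintstone.c -- hypothetical file; every name in it is an assumption. */

/* File-scope state: no other source file can see or touch this. */
static int call_count = 0;

/* Helpers declared static, so their names never reach the linker's
 * global name space. The arithmetic is a placeholder. */
static int fred(int x)   { call_count++; return x + 1; }
static int wilma(int x)  { call_count++; return x * 2; }
static int barney(int x) { call_count++; return x - 3; }

/* The small public function whose body is just the three calls. */
int bedrock(int x)
{
    return barney(wilma(fred(x)));
}

/* Public read-only view of the private counter. */
int bedrock_calls(void)
{
    return call_count;
}
```

Under a strict one-routine-per-file rule, fred(), wilma(), barney(), and call_count would all have to become external names, which is the conflict with "few if any globals" raised later in the thread.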
hollombe@ttidca.TTI.COM (The Polymath) (11/29/89)
In article <546@sagpd1.UUCP> jharkins@sagpd1.UUCP (Jim Harkins) writes: }In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: }>I would set strict standards that deal with well written programs }>(e.g. IMHO ... one routine per file ... }I've never understood why one routine per file is such a *good* thing and }thou shalt not deviate from this under penalty of dirty looks. In a lot }of cases I feel that routines that belong together should be in the same }file. ... I think this goes back to the days of Data General FORTRAN compilers that _required_ one routine per file. I used to do a lot of work with those beasts. Keeping the common blocks straight was such a pain the company wrote an elaborate utility to automate most of it (and, incidentally, implement a pretty nifty data dictionary). This is the only environment I've ever worked in where the "one routine per file" rule was even considered. The only real advantage I can see for it is cheaper (faster) compiles because only the changed routines get re-compiled. In these days of relatively cheap CPU cycles that's a poor trade off for the hassle of keeping track of all those little files. I agree with Jim's attitude and have found it to be the de facto situation in most of the jobs I've done (and am doing). Related routines and functions should be grouped together and invisible to anything that doesn't need to use them directly. -- The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com) Illegitimis non Citicorp(+)TTI Carborundum 3100 Ocean Park Blvd. (213) 452-9191, x2483 Santa Monica, CA 90405 {csun|philabs|psivax}!ttidca!hollombe
hue@netcom.UUCP (Jonathan Hue) (11/30/89)
>In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >>I would set strict standards that deal with well written programs >>(e.g. IMHO a. few if any globals, b. one routine per file, c. well >>documented, etc.). (Following assumes programming in C) My $0.02: I don't agree with (b) at all. In practice, (a) and (b) are often mutually exclusive. Sometimes two or more functions will need to use the same variable, and if you can shove the functions that use it in one file and make the variable static, you save polluting your program's global name space. If you adhere to (b), you've no choice but to make it global. You can also use static to hide functions that no one else is interested in. I think adhering to (b) would tempt you into writing functions longer than you really should (Aw, I don't want to come up with another filename for this 40 line chunk of code. I'll just shove it in inline wherever I need it). At my last job we had a guy that adhered to (b) because he "didn't like searching around to find out where a function was". I suggested using tags, but since he used Microsoft Word on a Mac to edit programs (then he would upload them to a Sun to compile via TOPS - I'm not kidding!) tags weren't very useful. Because of his strict adherence to (b), and his desire to keep the number of files down, he would write 700 to 2000 line functions. Regarding (c), well documented to me doesn't necessarily mean lots of comments. At one place I worked there were separate documents which described how each subsystem of the product worked. The documents gave an overview, and had separate sections for each of its parts which described what all the functions did and how they worked. The comments in the code described anything which wouldn't be obvious to someone who had read the documentation. Having this type of documentation is extremely valuable when bringing new people on board. Much better than sitting down with 100K lines of code and going through it with a new hire.
'Course, none of this ever gets written until the first release goes out... -Jonathan
dmt@pegasus.ATT.COM (Dave Tutelman) (11/30/89)
In article <4727@netcom.UUCP> hue@netcom.UUCP (Jonathan Hue ) writes: >>In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >>>(e.g. IMHO a. few if any globals, b. one routine per file... > >(Following assumes programming in C) > >My $0.02: I don't agree with (b) at all. In practice, (a) and (b) are >often mutually exclusive. Sometimes two or more functions will need to >use the same variable, and if you can shove the functions that use it in >one file and make the variable static, you save polluting your program's >global name space. I'm glad you posted that; I was thinking of doing so. Right on! >I suggested using tags... Ditto! >.... but since he used Microsoft Word on a Mac to edit programs, tags >weren't very useful. I have a similar problem, in that most of my editors on the PC don't support tags, but I use "stevie" when I need tags on the PC. I also use "cpr" to generate function indices in my hard copy. There IS one argument, in some cases a compelling one, for "one function per file". In general, linkers aren't smart enough to link just PART of a binary file (.OBJ or .o), when that file contains a function needed by the link. Consider, therefore, developing a library of functions to be used by several programs. For instance, consider a library that manipulates a widget, and can open, close, bump, and bash the widget. As Jonathan correctly notes, these functions are likely to share a set of "limited global data" about the widget, private to the outside but public among open(), close(), bump(), and bash() of widget. So we are tempted to write the four functions in a single C file, and declare the limited-globals as "static". However, suppose the library will be used by an application that: 1. Has strict memory constraints, and 2. The only thing it does to the widget is "bash()" it. The linker wouldn't be able to separate out the bash() function from the binary file, and the application would carry the memory burden of all the widget functions. 
The alternative is to put each of the four public widget functions in its own file, compiling to its own binary, and use a librarian program to combine them into a library file. Decent linkers can easily link part of a library file, in quanta of the original object files.

How do we deal with the shared data? There are two ways, one ugly and one clean (but more effort, and a little bigger, which partially offsets the memory savings):

 - UGLY: pollute the global data space, and choose names not likely to be used by the application (like "widg_lib_firstone").
 - HARDER: write a "data-manager" function, in yet another file, which owns the static variables and responds to requests for them from the library functions. The variable names can be translated into integers through a header file common to the library functions. The only pollution of the global name space is the name of the data manager for widget data.

So calls would look like:

        firstone = widg_lib_datamgr( GET, FIRSTONE );
        error = widg_lib_datamgr( SET, FIRSTONE, firstone );

I've occasionally had to write libraries where this sort of thing was important. Hope this helps.

+---------------------------------------------------------------+
| Dave Tutelman                                                 |
| Physical - AT&T Bell Labs - Lincroft, NJ                      |
| Logical - ...att!pegasus!dmt                                  |
| Audible - (201) 576 2194                                      |
+---------------------------------------------------------------+
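For anyone wondering what the "HARDER" option might look like in practice, here is a minimal sketch. Only the name widg_lib_datamgr and the GET, SET, and FIRSTONE tokens come from the post; the second variable LASTONE, the int-only storage, the variadic signature, and the -1 error return are assumptions made to get something that compiles.

```c
/* widg_data.c -- hypothetical sketch of the "data manager" approach. */
#include <stdarg.h>

/* These two enums would normally live in a header shared by the
 * widget library's source files (an assumption of this sketch). */
enum { GET, SET };
enum { FIRSTONE, LASTONE, WIDG_NVARS };

/* The shared widget state lives here and nowhere else. */
static int widg_vars[WIDG_NVARS];

/*
 * GET: returns the current value of variable `which`.
 * SET: stores the third (int) argument into `which` and returns 0.
 * Returns -1 for an unknown variable or operation. Note that a GET of
 * a variable that really holds -1 cannot be told apart from an error;
 * a production version would want a separate status channel.
 */
int widg_lib_datamgr(int op, int which, ...)
{
    va_list ap;
    int value;

    if (which < 0 || which >= WIDG_NVARS)
        return -1;

    switch (op) {
    case GET:
        return widg_vars[which];
    case SET:
        va_start(ap, which);
        value = va_arg(ap, int);
        va_end(ap);
        widg_vars[which] = value;
        return 0;
    default:
        return -1;
    }
}
```

With this layout, open(), close(), bump(), and bash() can each sit in their own object file in the library, and the only external name added for the shared data is widg_lib_datamgr.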
sccowan@watmsg.waterloo.edu (S. Crispin Cowan) (12/01/89)
In article <4727@netcom.UUCP> hue@netcom.UUCP (Jonathan Hue ) writes: >>In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >>>I would set strict standards that deal with well written programs >>>(e.g. IMHO a. few if any globals, b. one routine per file, c. well >>>documented, etc.). > >(Following assumes programming in C) [good stuff] >At my last job we had a guy that adhered to (b) because he "didn't like >searching around to find out where a function was". I suggested using >tags, but since he used Microsoft Word on a Mac to edit programs (then >he would upload them to a Sun to compile via TOPS - I'm not kidding!) tags >weren't very useful. Because of his strict adherence to (b), and his desire >to keep the number of files down, he would write 700 to 2000 line functions. I would want anyone who produced 2000 line functions fired, unless they had REALLY good reasons, and "I don't like vi" doesn't even come CLOSE to cutting it. >-Jonathan ---------------------------------------------------------------------- Login name: sccowan In real life: S. Crispin Cowan Office: DC3548 x3934 Home phone: 570-2517 Post Awful: 60 Overlea Drive, Kitchener, N2M 1T1 UUCP: watmath!watmsg!sccowan Domain: sccowan@watmsg.waterloo.edu "We have to keep pushing the pendulum so that it doesn't get stuck in the extremes--only the middle is worth having." Orwell, Videobanned -- Kim Kofmel
tim@hoptoad.uucp (Tim Maroney) (12/02/89)
In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >I would set strict standards that deal with well written programs >(e.g. IMHO a. few if any globals, b. one routine per file, c. well >documented, etc.). In article <4727@netcom.UUCP> hue@netcom.UUCP (Jonathan Hue ) writes: >My $0.02: I don't agree with (b) at all. [...] >I think adhering to (b) would tempt you into writing >functions longer than you really should (Aw, I don't want to come up with >another filename for this 40 line chunk of code. I'll just shove it in >inline wherever I need it). > >At my last job we had a guy that adhered to (b) because he "didn't like >searching around to find out where a function was". [...] >Because of his strict adherence to (b), and his desire >to keep the number of files down, he would write 700 to 2000 line functions. I share your distaste for the rule of one routine per file. Short functions are almost always easier to read than long ones. A medium-sized project (say, 15,000 lines) with functions averaging out at a reasonable size (say, forty lines) would have 375 source files using this rule. What an atrocity! >Regarding (c), well documented to me doesn't necessarily mean lots of comments. Absolutely. Clear code shouldn't *need* a lot of comments; a programmer should be able to read it and understand what's going on from the routine names, the variable names, and the flow of control, with just a few added comments if any. A lot of extraneous comments about things that would be perfectly clear just from reading the code actually damage code readability; the control structures become much harder to follow. There are a lot of people who adhere to a rule that more comments are always better. I worked with a piece of code like that this year. I couldn't make heads or tails out of the commented version, which wound up at a few hundred lines.
So I sat down and ruthlessly stripped out all the comments, and when the code was reduced to a few tens of lines, I then reduced the control structures to the simpler forms which emerged when you could actually start to see the forest for the trees. After that, it became comprehensible. In summary: Clear code is far more important than extensive comments. >At one place I worked there were separate documents which described how >each subsystem of the product worked. The documents gave an overview, and >had separate sections for each of its parts which described what all the >functions did and how they worked. The comments in the code described anything >which wouldn't be obvious to someone who had read the documentation. Having >this type of documentation is extremely valuable when bring new people on >board. Much better than sitting down with 100K lines of code and going through >it with a new hire. 'Course, none of this ever gets written until the first >release goes out... Again, I agree. External documentation is very useful; far more so than most code comments. -- Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com "Every institution I've ever been associated with has tried to screw me." -- Stephen Wolfram
wayne@dsndata.uucp (Wayne Schlitt) (12/04/89)
In article <9157@hoptoad.uucp> tim@hoptoad.uucp (Tim Maroney) writes: > In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: > >I would set strict standards that deal with well written programs > >(e.g. IMHO a. few if any globals, b. one routine per file, c. well > >documented, etc.). > > [ .... ] > > I share your distaste for the rule of one routine per file. Short > functions are almost always easier to read than long ones. A > medium-sized project (say, 15,000 lines) with functions averaging out > at a reasonable size (say, forty lines) would have 375 source files > using this rule. What an atrocity! > hmmm... one of the first things i usually do to code that i get off the net is break it up into one function per file. i am not that dogmatic about it, i just do it because it seems to me to be easier to work with. take your example of a project with 15,000 lines of C code. the two extremes are one file of 15,000 lines, or 375 files with around 40 lines per file. if given only the choice between these two extremes, i would definitely pick the latter. the compile time alone will kill you if you choose just one file. in practice i usually do not go to the extreme of always just one function per file, but i rarely let any given file go over 1000 lines. i only put functions in the same file if they are closely related. if the directory gets too many files, i break the directory up into sub directories. maybe some of the reason why i find it easier to work with lots of files is that i use emacs and it is very easy to have several files loaded at the same time. when i am looking at the code it is very easy to see what another function does by doing a "Ctl-x 4 f function_name.c" and then i can see both the calling function and the called function on the screen at the same time. going to the beginning or end of a function is as easy as going to the beginning or the end of a file. searching and replacing text doesn't spill over into other functions.
yes, you can do these things when everything is in one file too, but to me it seems simpler and easier to have one function per file. another reason why i may lean toward one function per file is that i am used to working on large projects (>100k lines) and when you get to that size, you almost _have_ to work with lots of files, directories of directories, libraries and such. oh well... to each their own... -wayne
crm@romeo.cs.duke.edu (Charlie Martin) (12/04/89)
The problem with (do/do not) use one file per function is that it's an optimization. I'll assume that we're talking about C, not Ada or Fortran, say. There are some really good reasons to use more than one function per file in C; one of the best is to take advantage of C's feature(?) of file scope. You could, for example, implement a stack object of a hidden type by having the stack itself declared static in file stack.c, then implementing push, pop, etc. as functions that access this static stack. The type of the stack representation and the details of storage are hidden from the user. This same trick can be done in, say, Fortran, using functions and a BLOCK DATA subprogram, but Fortran doesn't care much about how many files are used. (Does anyone know of any other languages than C -- other than C++ etc -- that have this file scope mechanism?) On the other hand, there are real good reasons not to put a whole 10KSLOC program into one file: editing and compilation time strike me. What the optimum is between one function per file and only one file is probably determined by the problem and program architecture. Charlie Martin (crm@cs.duke.edu,mcnc!duke!crm)
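The stack.c idea above can be sketched in a few lines. The element type (int), the fixed capacity, the return-code convention, and the extra stack_depth() function are assumptions of this sketch; the post only specifies a static stack in stack.c accessed through push and pop.

```c
/* stack.c -- hypothetical sketch of a stack hidden behind file scope. */
#include <stddef.h>

#define STACK_MAX 64            /* arbitrary capacity for the sketch */

/* Representation: visible only inside this file. Callers cannot tell
 * whether this is an array, a linked list, or something else. */
static int    items[STACK_MAX];
static size_t depth = 0;

/* Pushes value; returns 0 on success, -1 if the stack is full. */
int push(int value)
{
    if (depth == STACK_MAX)
        return -1;
    items[depth++] = value;
    return 0;
}

/* Pops the top element into *out; returns 0, or -1 if empty. */
int pop(int *out)
{
    if (depth == 0)
        return -1;
    *out = items[--depth];
    return 0;
}

/* Number of elements currently on the stack. */
size_t stack_depth(void)
{
    return depth;
}
```

Splitting push() and pop() into separate files would force items and depth to become external names, which is exactly what the file-scope trick avoids. The price is that there is only one stack per program.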
perry@apollo.HP.COM (Jim Perry) (12/05/89)
In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: >In article <4727@netcom.UUCP> hue@netcom.UUCP (Jonathan Hue ) writes: >>>In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >>>>(e.g. IMHO a. few if any globals, b. one routine per file... >> >>My $0.02: I don't agree with (b) at all. In practice, (a) and (b) are > I'm glad you posted that; I was thinking of doing so. Right on! > >There IS one argument, in some cases a compelling one, for "one function >per file". In general, linkers aren't smart enough to link just >PART of a binary file (.OBJ or .o), when that file contains a function >needed by the link. Consider, therefore, developing a library of functions ... >The linker wouldn't be able to separate out the bash() function from >the binary file, and the application would carry the memory burden >of all the widget functions. > >The alternative is to put each of the the four public widget functions >in its own file, compiling to its own binary, and use a librarian >program to combine them into a library file. Decent linkers can >easily link part of a library file, in quanta of the original object >files. > >How do we deal with the shared data? There are two ways, one ugly >and one clean (but more effort, and a little bigger which partially ... > >I've occasionally had to write libraries where this sort of thing >was important. Instead of having zillions of programmers standing on their heads like this in order to get their jobs done, how about a couple of people smarten up the linkers to do the job right? I'm not familiar with the particular linkage conventions used by the compilers/linkers that affect this group (presumably unix and popular-PC ones), but there's nothing fundamental keeping linkers from separating out the bash() function from the binary file, because I've used such linkers/loaders. 
Jim Perry perry@apollo.com HP/Apollo, Chelmsford MA This particularly rapid unintelligible patter isn't generally heard and if it is it doesn't matter.
perry@apollo.HP.COM (Jim Perry) (12/05/89)
In article <9157@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >In article <??> dopey@sun.UUCP (Can't ya tell by the name) writes: >>Regarding (c), well documented to me doesn't necessarily mean lots of comments. > >Absolutely. Clear code shouldn't *need* a lot of comments; a >programmer should be able to read it and understand what's going on >from the routine names, the variable names, and the flow of control, >with just a few added comments if any. A lot of extraneous comments >about things that would be perfectly clear just from reading the code >actually damages code readability; the control structures become much >harder to follow. > >There are a lot of people who adhere to an rule that more comments are >always better. I worked with a piece of code like that this year. I >couldn't make heads or tails out of the commented version, which wound >up a few hundreds of lines. So I sat down and ruthlessly stripped out >all the comments, and when the code was reduced to a few tens of lines, >I then reduced the control structures to the simpler forms which >emerged when you could actually start to see the forest for the trees. >After that, it became comprehensible. > >In summary: Clear code is far more important than extensive comments. Clear code and clear comments are both important. As you observe, it's quite possible to obfuscate a program in any number of ways. However, this example doesn't say much, other than that you were presented with a small program you didn't understand (presumably because it was badly written/commented), and by extensively editing it, and substantially rewriting a significant percentage of it, came to understand it. Let's assume that you've now rearranged the code to the optimal C language (again, an assumption) description of the solution, but no comments. I submit that I can then pass over that file adding comments, and by so doing produce an even better program. My definition of "even better"? 
I assign an arbitrary engineer, who's never seen that piece of code (or who last worked on it six months ago, effectively the same thing), to make some functional modification to the program. The sooner the correct new solution is reached, the better the (original) program. >> Much better than sitting down with 100K lines of code and going through >>it with a new hire. 'Course, none of this ever gets written until the >>release goes out... > >Again, I agree. External documentation is very useful; far more so than >most code comments. Again, you're throwing out the baby with the bathwater. External documentation has a fundamental flaw alluded to by dopey (no offense): it's not generally there, and it's out of date. "Most code comments" are also missing or out of date, but only because most code is poorly documented. As Fred Brooks says in The Mythical Man-Month: "[external] Program documentation is notoriously poor, and its maintenance is worse. Changes made in the program do not promptly, accurately, and invariably appear in the paper." "The solution, I think, is to merge the files, to incorporate the documentation into the source program. This is at once a powerful incentive toward proper maintenance, and an insurance that the documentation will always be handy to the program user. Such programs are called *self_documenting*". The proper rule, of course, is not that more comments are always better, but that sufficient comments are always better. In your example there were presumably too many comments, but then the code was apparently not clearly written either. It is true that what Knuth calls a literate programmer must have both the skill of coding, and that of documenting. All programmers are in effect technical writers, documenting their work for other programmers who will see it/work on it. Not all current programmers excel at both of these skills, but it is a goal to aspire to. 
>> Much better than sitting down with 100K lines of code and going through >>it with a new hire. Well, of course this is the heart of the matter. A few-hundred or few-ten line program tells us very little about real life software engineering situations. Actually, if the code is properly self-documenting, then the new hire *can* just sit down with the code and learn from the code itself. Documentation, like code, is hierarchical. At the beginning of each program, library, whatever, is a broad overview of that unit. More specific comments would be associated with modules, functions, algorithms, etc. For instance, let's say I've been asked to change the memory allocation implementation of a moderately large program I've never seen before. From the documentation of the program I determine generally what it does and what sort of data it deals with, and further that it's internally broken down into twelve modules, one of which deals with storage allocation. In that module's primary .c file is a description of the general memory model, a breakdown of the operations on that memory (functions in the module), and perhaps a summary of what the cost and benefit of that model are compared to likely alternatives. At each subsidiary function the particular algorithms used are described, potential pitfalls, potential interaction with other functions. Within a function the variables are described, and the high points of the algorithm, such as potential trouble sites for concurrency, etc. There's not much time overhead in generating this documentation, assuming a basic competence at technical writing to one's own level. At design time most of this information is probably either already written down or on the forefront of the programmer's brain (I often design code by writing the documentation). This sort of information *can't* easily be reconstructed from reading C code. 
("now WHY was I cocky enough to code this loop without explicitly guarding against interrupts?") I experienced an epiphany once when I realized that for the fourth time in two years I was drawing little linked-list boxes-and-lines to prove to myself that a list handling function was correct in all cases. I put that diagram into the code (and subsequently did in fact refer to it a few times on later occasions, saving myself significant time). I hope if subsequent maintainers have had occasion to visit that code they benefit from it, but it doesn't really matter, in this case, I've already benefitted myself. - Jim Perry perry@apollo.com HP/Apollo, Chelmsford MA This particularly rapid unintelligible patter isn't generally heard and if it is it doesn't matter.
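As an illustration of the boxes-and-lines diagram kept in the code, here is a small sketch. It is not Jim's actual function, which the post does not show; the node layout, the name list_remove(), and the pointer-to-pointer technique are all assumptions chosen to show the idea of a diagram that covers every case.

```c
#include <stddef.h>

struct node {
    int          value;
    struct node *next;
};

/*
 * Unlink `victim` from the list whose head pointer is *head.
 *
 *   before:  *link --> [victim] --> [after] --> ...
 *   after:   *link ---------------> [after] --> ...
 *
 * `link` starts out pointing at the head pointer itself and then at
 * each node's `next` field in turn. Because of that, removing the
 * first node, a middle node, or the last node (where [after] is NULL)
 * is the same single assignment through `link`.
 *
 * Returns 1 if victim was found and unlinked, 0 otherwise.
 */
int list_remove(struct node **head, struct node *victim)
{
    struct node **link;

    for (link = head; *link != NULL; link = &(*link)->next) {
        if (*link == victim) {
            *link = victim->next;
            victim->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

The comment records the argument that would otherwise be redrawn on paper each time: one picture, plus a sentence saying why the head and tail cases are not special.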
tada@athena.mit.edu (Michael J Zehr) (12/05/89)
In article <473ae701.20b6d@apollo.HP.COM> perry@apollo.HP.COM (Jim Perry) writes: >There's not much time overhead in generating [hierarchical >documentation], assuming a >basic competence at technical writing to one's own level. At design time >most of this information is probably either already written down or on the >forefront of the programmer's brain (I often design code by writing the >documentation). This sort of information *can't* easily be reconstructed >from reading C code. ("now WHY was I cocky enough to code this loop >without explicitly guarding against interrupts?") >- >Jim Perry perry@apollo.com HP/Apollo, Chelmsford MA I heartily agree with this, but unfortunately i rarely see it in practice. The largest project i ever did entirely by myself (a library of calls to handle a user interface, including things like 'make button' and 'call this function when the user clicks this region', etc.) amounted to around 2000 lines of C code. After designing it, I started the coding by writing 3-5 lines describing each function. When i went to write the actual code, i was able to do it all very quickly, and those 3-5 lines of description for each function have saved enormous time and effort enhancing the library. On top of that, I don't think it took any extra time to write, even to start with. I was basically sitting at the terminal thinking "how do i start this" and decided typing what i knew was the best way to start. I think the time to write the comments plus the time to write the code was less than if i had just started in on the code. Unfortunately, i've gotten a lot of comments along the lines of "i'm too busy writing code to write a comment describing it" when suggesting it to others. I guess keeping maintenance costs high is what keeps some people in business... :-\ -michael j zehr
tim@hoptoad.uucp (Tim Maroney) (12/05/89)
In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: >There IS one argument, in some cases a compelling one, for "one function >per file". In general, linkers aren't smart enough to link just >PART of a binary file (.OBJ or .o), when that file contains a function >needed by the link. WHAT? What year is this? I don't think I've ever used a linker that didn't eliminate unused routines. Any such linker would be seriously brain damaged. -- Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com "Everything that gives us pleasure gives us pain to measure it by." -- The Residents, GOD IN THREE PERSONS
bill@twwells.com (T. William Wells) (12/05/89)
I'll add my two cents about commenting. I am a big fan of *useful* comments. And I despise over-commenting. The latter is any comment that is not the former. I have three simple rules: 1) Comments should explain the purpose of the code and its relationship to the rest of the program. 2) Comments should explain the *abstract* character of the code. 3) Comments should *never* explain how the code works or what it is doing *unless* you have done something tricky. (Tricky: if, six months later, after reading the comments and the code you have to think about it, it is tricky.) BTW, the abstract character of the code and its purpose, while they can usually be explained simultaneously, are not always close enough that this will work. A routine that converts binary time to a useful printable form (this is its abstract character) might have as its purpose permitting printing binary time usefully; in that case, one comment easily enough serves two purposes. On the other hand, you might have this complicated date conversion routine whose character would be described by its I/O relationships. However, the purpose of the routine might be: "to satisfy the wants of our VP from Alpha Centauri". :-) --- Bill { uunet | novavax | ankh | sunvice } !twwells!bill bill@twwells.com
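To make the three rules concrete, here is one way the time-conversion example might be commented. The function name format_time(), the output layout, and the "report code" mentioned in the comment are all hypothetical; gmtime() and strftime() are the standard C library calls.

```c
#include <stddef.h>
#include <time.h>

/*
 * Purpose: lets the (hypothetical) report code print timestamps that
 * are stored in binary form.
 *
 * Abstract character: maps a time_t to the text "YYYY-MM-DD HH:MM:SS"
 * in UTC. Returns the number of characters written, not counting the
 * terminating NUL, or 0 if the time cannot be converted or the text
 * does not fit in `len` bytes.
 */
size_t format_time(time_t t, char *buf, size_t len)
{
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S", utc);
}
```

There are no comments inside the body: under rule 3 nothing there is tricky, so explaining how it works would count as over-commenting.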
nick@lfcs.ed.ac.uk (Nick Rothwell) (12/05/89)
In article <9157@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >Absolutely. Clear code shouldn't *need* a lot of comments; a >programmer should be able to read it and understand what's going on >from the routine names, the variable names, and the flow of control, >with just a few added comments if any. As long as we're forced to use old fashioned low-level languages like C, where it's impossible to express the pure algorithm directly in the target language, there's a need for comments. There are two reasons. The first is that the original algorithm might use concepts which can't be expressed directly in C (higher order functions, or polymorphic data objects, for example). The second is that there has to be some low-level implementation of the things which were assumed as part of the "universe" of the high-level description (e.g. garbage collection). Let me see you write a garbage collector (for example), where it's clear exactly what GC algorithm you're using and what assumptions you're making about the format, storage, invariants of the objects, in C, without comments. Nick. -- Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh. nick@lfcs.ed.ac.uk <Atlantic Ocean>!mcvax!ukc!lfcs!nick ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ "You're gonna jump!?" "No, Al. I'm gonna FLY!"
bill@polygen.uucp (Bill Poitras) (12/05/89)
In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: %In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: %%There IS one argument, in some cases a compelling one, for "one function %%per file". In general, linkers aren't smart enough to link just %%PART of a binary file (.OBJ or .o), when that file contains a function %%needed by the link. % %WHAT? What year is this? I don't think I've ever used a linker that %didn't eliminate unused routines. Any such linker would be seriously %brain damaged. %-- %Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com Yes you have. ANY linker you have does this. What you are thinking of is a LIBRARY, i.e. a .lib file (lib*.a if you're a Unix person). When a library is used in the link process, only the functions used in the program being linked get linked. Although I'm not a compiler/linker expert, I'm almost positive that this is true. +-----------------+---------------------------+-----------------------------+ | Bill Poitras | Polygen Corporation | {princeton mit-eddie | | (bill) | Waltham, MA USA | bu sunne}!polygen!bill | +-----------------+---------------------------+-----------------------------+
bill@twwells.com (T. William Wells) (12/05/89)
In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: : In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: : >There IS one argument, in some cases a compelling one, for "one function : >per file". In general, linkers aren't smart enough to link just : >PART of a binary file (.OBJ or .o), when that file contains a function : >needed by the link. : : WHAT? What year is this? I don't think I've ever used a linker that : didn't eliminate unused routines. Any such linker would be seriously : brain damaged. Then you have been in a very limited universe. Most linkers will not take, from a single object file, just those routines needed by the rest of the program. Most linkers *will* take only those object files needed from an archive, but that is not the same thing. --- Bill { uunet | novavax | ankh | sunvice } !twwells!bill bill@twwells.com
jef@well.UUCP (Jef Poskanzer) (12/06/89)
In the referenced message, bill@twwells.com (T. William Wells) wrote: }In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: }: WHAT? What year is this? I don't think I've ever used a linker that }: didn't eliminate unused routines. Any such linker would be seriously }: brain damaged. } }Most linkers will not take, from a single object file, just those }routines needed by the rest of the program. Tim is (almost certainly) wrong that he has never used such a brain damaged linker, since every Unix linker is brain damaged in this fashion. However, T. Bill is wrong that most linkers have this brain damage, since pretty much every NON-Unix linker works correctly. --- Jef Jef Poskanzer jef@well.sf.ca.us {ucbvax, apple, hplabs}!well!jef "Kirk to Enterprise -- beam down Yeoman Rand and a six-pack."
sccowan@watmsg.waterloo.edu (S. Crispin Cowan) (12/06/89)
In article <1989Dec5.115934.24535@twwells.com> bill@twwells.com (T. William Wells) writes:
>I'll add my two cents about commenting. I am a big fan of
>*useful* comments. And I despise over-commenting. The latter is
>any comment that is not the former.
I don't understand the problem with over-commenting. First of all, it
is _very_ rare (both :-) and :-(), and secondly, just skip it if it
bugs you.
>I have three simple rules:
[good list of three rules]
I also like to see ALL variables described. I can figure out what a
for-loop is doing, but it's not at all obvious what the
trans_rec_count variable is a count of (total transactions, to-date,
how many gropple-grommits in this shipment, etc.). Unless it's just a
scratch counter such as `i', it should be commented.
>Bill { uunet | novavax | ankh | sunvice } !twwells!bill
>bill@twwells.com
----------------------------------------------------------------------
(S.) Crispin Cowan, CS grad student, University of Waterloo
Office: DC3548 x3934 Home phone: 570-2517
Post Awful: 60 Overlea Drive, Kitchener, N2M 1T1
UUCP: watmath!watmsg!sccowan
Domain: sccowan@watmsg.waterloo.edu
"The most important question when any new computer architecture
is introduced is `So what?'"
-someone on comp.arch
(if it was you, let me know & I'll credit it)
dmt@pegasus.ATT.COM (Dave Tutelman) (12/06/89)
>%In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: >%%There IS one argument, in some cases a compelling one, for "one function >%%per file". In general, linkers aren't smart enough to link just >%%PART of a binary file (.OBJ or .o), when that file contains a function >%%needed by the link. >In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >%WHAT? What year is this? I don't think I've ever used a linker that >%didn't eliminate unused routines. Any such linker would be seriously >%brain damaged. In article <600@fred.UUCP> bill@fred.UUCP (Bill Poitras) writes: >Yes you have. ANY linker you have does this. What you are thinking of is >a LIBRARY, ie. .lib file. (lib*.a if you're a Unix person), which when used in >the link process, only the functions used in the program begin linked, get >linked. Although I'm not a compiler/linker expert, I almost positive that >this is true. Thanks, Bill. I generally use "dumb" linkers, though the linkers that Tim claims to use wouldn't violate any laws of thermodynamics. Consider: - It isn't too difficult for a C compiler to demark beginning and end of function in a .OBJ or .o, or even just guarantee adjacency within a single function. (Of course, the last function would have to be demarked.) - The external variables (static or otherwise) ALLOCATED in that file could be loaded, depending on whether they are referenced by any of the loaded functions. So why don't any of the linkers I use get this smart? Because their authors wanted to have a single linker that would handle arbitrary object files, without depending on their being generated by their favorite C compiler. In particular, object files from hand-coded assembler could also be linked in. (This is an IMPORTANT feature of MSDOS linkers, since a lot of programs use a little assembler for their lowest-level routines.) Hand-coded assembly code yields object files that CAN'T be split up into the functions that are actually called. 
Just a few of the obvious things that make it impossible are:
  - Self-modifying code.
  - Gotos whose scope isn't restricted to be in the function.
This problem is solved, as Bill notes, by keeping enough information in
libraries to keep the object files separate. If you write one function
per file, then the linker only loads the essential functions. If you write
several functions per file, then all the functions from that file (but not
all the functions in the library) get loaded if ANY function from the file
does.
When I wrote the base note, I thought about including this discussion, but
didn't because the note was long enough. Bad decision?
+---------------------------------------------------------------+
| Dave Tutelman                                                 |
| Physical - AT&T Bell Labs - Lincroft, NJ                      |
| Logical - ...att!pegasus!dmt                                  |
| Audible - (201) 576 2194                                      |
+---------------------------------------------------------------+
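The loading rule described above (the whole object file comes in if ANY of its functions is needed) can be sketched as a toy model in C. This is only an illustration of the granularity argument, not how any real linker is implemented; `struct object`, `load_symbol`, `symbol_linked`, and `MAX_SYMS` are hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4   /* per-object limit, chosen only for this sketch */

/* One object file: the symbols it defines and the symbols it references. */
struct object {
    const char *name;
    const char *defs[MAX_SYMS];  /* defined symbols, unused entries NULL */
    const char *refs[MAX_SYMS];  /* referenced symbols, unused entries NULL */
    int loaded;                  /* nonzero once the object is in the image */
};

/* Index of the object defining sym, or -1 if none does. */
static int find_definer(const struct object *objs, size_t n, const char *sym)
{
    assert(sym != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && objs[i].defs[j] != NULL; j++)
            if (strcmp(objs[i].defs[j], sym) == 0)
                return (int)i;
    return -1;
}

/* Load the object that defines sym, then everything that object references.
 * The whole object is the unit of loading, never a single function. */
static void load_symbol(struct object *objs, size_t n, const char *sym)
{
    int i = find_definer(objs, n, sym);
    if (i < 0 || objs[i].loaded)
        return;
    objs[i].loaded = 1;
    for (size_t j = 0; j < MAX_SYMS && objs[i].refs[j] != NULL; j++)
        load_symbol(objs, n, objs[i].refs[j]);
}

/* Nonzero if sym ended up in the linked image. */
static int symbol_linked(const struct object *objs, size_t n, const char *sym)
{
    int i = find_definer(objs, n, sym);
    return i >= 0 && objs[i].loaded;
}
```

Fed an object that defines both b and c, a request for b drags c along; with b and c in separate objects, c stays out. That is the whole case for one function per file under such a linker.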
bill@twwells.com (T. William Wells) (12/06/89)
In article <14836@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: : In the referenced message, bill@twwells.com (T. William Wells) wrote: : }In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: : }: WHAT? What year is this? I don't think I've ever used a linker that : }: didn't eliminate unused routines. Any such linker would be seriously : }: brain damaged. : } : }Most linkers will not take, from a single object file, just those : }routines needed by the rest of the program. : : Tim is (almost certainly) wrong that he has never used such a brain : damaged linker, since every Unix linker is brain damaged in this fashion. : : However, T. Bill is wrong that most linkers have this brain damage, since : pretty much every NON-Unix linker works correctly. Eh? I've worked on a dozen or so non-Unix machines. Only a few of them were capable of taking apart an object file and using only the routines you needed. And those linkers could not be used with a C compiler that did not play games with static variable names. (They had no notion of static at all.) I'll admit that many of those machines were used over eight years ago, so things might be better now, but I doubt it. IBM, for example, does not change all that quickly. Care to name some specific systems where the linker could take apart an object file, and for which a reasonable C compiler exists? --- Bill { uunet | novavax | ankh | sunvice } !twwells!bill bill@twwells.com
porges@inmet.inmet.com (12/07/89)
>/* Written 9:33 pm Dec 4, 1989 by tim@hoptoad.uucp in inmet:comp.misc */
>In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes:
>>There IS one argument, in some cases a compelling one, for "one function
>>per file". In general, linkers aren't smart enough to link just
>>PART of a binary file (.OBJ or .o), when that file contains a function
>>needed by the link.
>WHAT? What year is this? I don't think I've ever used a linker that
>didn't eliminate unused routines. Any such linker would be seriously
>brain damaged.
--
>Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com
Then you've never used Sun3 ld, or any other Unix linker. Remember, he's
talking about parts of a .o file, not members of a library. For example:
    a.c:
        main() { b(); }
    b.c:
        b() { }
        c() { }
The resulting link of a.o and b.o will include the routine 'c'. This can
be verified by nm, or by inspecting the a.out file with adb.
-- Don Porges
porges@inmet.inmet.com
..uunet!inmet!porges
jef@well.UUCP (Jef Poskanzer) (12/07/89)
In the referenced message, bill@twwells.com (T. William Wells) wrote: }Care to name some specific systems where the linker could take }apart an object file, and for which a reasonable C compiler }exists? Why the second requirement, Bill? To be crystal clear about what is being discussed, it is the ability to make a library from a single source file, and then at link time extract only the referenced routines from that library. No one is talking about eliminating unreferenced routines from the main program. Everyone who is getting hysterical about their call-by-string hacks can stop screaming now. Anyway, the last time this discussion came up, I posted a transcript of a session with the VMS FORTRAN compiler and the VMS linker. They have no problem at all separating a single source file into one object module per routine. The reaction then was, "Oh sure, FORTRAN can do that, but we were discussing *real* languages." "Real" languages meaning C, of course. So, why the second requirement, Bill? Have you ever actually checked whether any of the non-Unix systems you've used have this ability? Are you afraid of what you might find? --- Jef Jef Poskanzer jef@well.sf.ca.us {ucbvax, apple, hplabs}!well!jef "An object never serves the same function as its image - or its name." -- Rene Magritte
bill@twwells.com (T. William Wells) (12/07/89)
In article <14850@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes:
: In the referenced message, bill@twwells.com (T. William Wells) wrote:
: }Care to name some specific systems where the linker could take
: }apart an object file, and for which a reasonable C compiler
: }exists?
:
: Why the second requirement, Bill?
Specifically because those few linkers I know of that permit
disassembling an object file and using just the pieces work with
object files that are, essentially, archives. That is to say, if
you compiled several functions into the one object file, each
function occupied a physically distinct part of the file; taking
the object file apart was little more complex than just copying
some particular part of the file.
None of these machines ran C. They would have had real problems
with C, since it would have been hard to implement file scope
with those linkers. (I know, I had to try to do something similar
with one of them.)
The second requirement is there, not as an absolute requirement,
but as a "reasonableness" requirement. None of those linkers
would have been useful in a modern environment. Certainly a linker
that a C compiler exists for is minimally "reasonable". I'd be
willing to entertain other linkers, so long as they aren't overly
restrictive.
: So, why the second requirement, Bill? Have you ever actually checked
: whether any of the non-Unix systems you've used have this ability? Are
: you afraid of what you might find?
An ad hominem deserves an ad hominem in response: fuck you, Mr.
Poskanzer. I do not appreciate personal attacks.
And, to answer your question: yes, of course I looked.
We now have one linker (a VMS linker, mentioned in a deleted part
of the article). But I'd like to see some more.
After all, the point under discussion is:
: }Most linkers will not take, from a single object file, just those
: }routines needed by the rest of the program.
Which is to say that there *are* some linkers that will. I happen
not to have used any recently, and the ones that I did were
really brain damaged, but I can see how one would do such a
linker. (BTW, it happens that I've never used VMS.)
*One* linker certainly does not negate "most". So, without at
least a few more examples, there isn't any reason to doubt the
"most". And, unless someone comes up with a few more, there is no
point in discussing this further.
---
Bill { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com
murphyn@cell.mot.COM (Neal P. Murphy) (12/07/89)
bill@polygen.uucp (Bill Poitras) writes: >In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >%In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: >%%... >%%per file". In general, linkers aren't smart enough to link just >%... >%WHAT? What year is this? I don't think I've ever used a linker that >%... >Yes you have. ANY linker you have does this. What you are thinking of is >... Of course, if you want to get into esoterica, you might as well mention the linker/loader on the old TOPS-10 and TOPS-20 O/S from DEC. They would read every module from every object file linked, unless the object file was specified as name.rel/LIB, whereupon it would be treated as a library. Of course, this action requires more memory, ... Strike that, the TOPS-10 compiler ran in 30k words. If I might opine, such selective action usually requires extra thought on the part of linker designers, and if they DGAS (dontgivea...), they don't design it in. NPN
perry@apollo.HP.COM (Jim Perry) (12/07/89)
bill@polygen.uucp (Bill Poitras) writes: >In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >%In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: >%%There IS one argument, in some cases a compelling one, for "one function >%%per file". In general, linkers aren't smart enough to link just >%%PART of a binary file (.OBJ or .o), when that file contains a function >%%needed by the link. >% >%WHAT? What year is this? I don't think I've ever used a linker that >%didn't eliminate unused routines. Any such linker would be seriously >%brain damaged. >%-- >%Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com >Yes you have. ANY linker you have does this. What you are thinking of is >a LIBRARY, ie. .lib file. (lib*.a if you're a Unix person), which when used in >the link process, only the functions used in the program begin linked, get >linked. Although I'm not a compiler/linker expert, I almost positive that >this is true. This is true only in the very narrow context of UNIX; in more advanced systems (both predating and postdating UNIX) the output of the compilers (foo.o or a.out in UNIX terms) is what you think of as a library, and the associated linkers/loaders/library editors can do the right thing, for instance doing type-checking on cross-module function calls, but certainly including only procedures and data that are actually referenced. On another point, speaking of a.out, it's relatively rare in the world at large for compilers to generate *assembly source*; that's another UNIXism. (They compile directly to the appropriate machine language -- in a relocatable library form, of course). People on both sides are surprised by this, for some reason. Anyway, I suspect there may be some relation between this issue and the object module thing, but I don't know for sure. 
It's also true that the traditional UNIX philosophy calls for multiple simple tools rather than complex tools, and ease of writing over ease or efficiency of use; thus rather than a cc that knows about libraries, you have cpp/cc/as/ar/ranlib. Easier to write and almost as good. (Or, if you will, "brain damaged" :-). - Jim Perry perry@apollo.hp.com HP/Apollo, Chelmsford MA This particularly rapid unintelligible patter isn't generally heard and if it is it doesn't matter.
scott@bbxsda.UUCP (Scott Amspoker) (12/08/89)
In article <4748ed31.20b6d@apollo.HP.COM> perry@apollo.HP.COM (Jim Perry) writes:
>On another point, speaking of a.out, it's relatively rare in the world at
>large for compilers to generate *assembly source*; that's another UNIXism.
>(They compile directly to the appropriate machine language -- in a
>relocatable library form, of course). People on both sides are surprised
>by this, for some reason. Anyway, I suspect there may be some relation
>between this issue and the object module thing, but I don't know for sure.
I think it has more to do with the fact that an assembler already exists.
Why re-invent the wheel? The code generation phase of a compiler is
much simpler if it outputs assembly source, resulting in a compiler that
is more portable.
--
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232
unmvax.cs.unm.edu!bbx!bbxsda!scott
dplatt@coherent.com (Dave Platt) (12/08/89)
In article <14850@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: > Anyway, the last time this discussion came up, I posted a transcript of > a session with the VMS FORTRAN compiler and the VMS linker. They have > no problem at all separating a single source file into one object > module per routine. The reaction then was, "Oh sure, FORTRAN can do > that, but we were discussing *real* languages." "Real" languages > meaning C, of course. The Honeywell CP-6 object language and linker supports this sort of feature, as well. Object-language records are stored in a "keyed file" (a B-tree file, in effect); it's possible to store many object-language packages in a single keyed file with no interference. The linker brings in the necessary object-language modules, by searching the "external functions defined" record for each module, and loading only those modules which define a function that's actually needed. All external (global) data variables are, by definition, contained within an object module.. and hence within a function. This isn't entirely consistent with the C model, which defines globals as being those variables which lie _outside_ of any function. One way in which a savvy C compiler could resolve this, would be to bundle all of the global variables into a dummy module. Each real module (function) in the source-file would access the global variables as if they had been declared "extern". If the linker fetched a function-module from the object file, it would "see" that the module was accessing some extern variables, would search the "external variables defined" records in the object file, and link in the dummy module (which defined the globals) as a result. There's a cost to this approach, though. It requires that all global variables be accessed as "externs", even if they were defined in the same source-file as they're being used. This makes it difficult to have "static" variables of file scope... 
because the very use of the "static" keyword requires that the variables' names not be exported (for reasons of data-hiding, name-space pollution, etc.) An effective C compiler can get around this problem by hashing the names of the static variables into some horrible string that's guaranteed not to collide with any other name. -- Dave Platt VOICE: (415) 493-8805 UUCP: ...!{ames,apple,uunet}!coherent!dplatt DOMAIN: dplatt@coherent.com INTERNET: coherent!dplatt@ames.arpa, ...@uunet.uu.net USNAIL: Coherent Thought Inc. 3350 West Bayshore #205 Palo Alto CA 94303
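The name-hashing trick mentioned above can be sketched in C. This is a minimal illustration, assuming the compiler derives the link-level name of a file-scope static from a hash of the source-file name plus the variable name. `mangle_static`, `fnv1a`, and the `__static_` prefix are hypothetical, and a plain hash only makes collisions unlikely; a real compiler would need something that actually guarantees uniqueness.

```c
#include <assert.h>
#include <stdio.h>

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static unsigned long fnv1a(const char *s)
{
    unsigned long h = 2166136261UL;              /* FNV offset basis */
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h = (h * 16777619UL) & 0xFFFFFFFFUL;     /* FNV prime, kept to 32 bits */
    }
    return h;
}

/* Build a link-level name for a file-scope static variable.
 * Writes "__static_<8 hex digits of file hash>_<name>" into out and
 * returns what snprintf reports: the length needed, excluding the NUL. */
static int mangle_static(const char *file, const char *name,
                         char *out, size_t outlen)
{
    assert(file != NULL && name != NULL && out != NULL);
    return snprintf(out, outlen, "__static_%08lx_%s", fnv1a(file), name);
}
```

Two files that each declare `static int count;` would then export different names, so the linker could treat both as ordinary externs without either one colliding with the other or with a real global called `count`.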
bill@twwells.com (T. William Wells) (12/08/89)
In article <32359@watmath.waterloo.edu> sccowan@watmsg.waterloo.edu (S. Crispin Cowan) writes:
: In article <1989Dec5.115934.24535@twwells.com> bill@twwells.com (T. William Wells) writes:
: >I'll add my two cents about commenting. I am a big fan of
: >*useful* comments. And I despise over-commenting. The latter is
: >any comment that is not the former.
: I don't understand the problem with over-commenting. First of all, it
: is _very_ rare (both :-) and :-(), and secondly, just skip it if it
: bugs you.
This is the syndrome where someone writes:
    /* Increment a. */
    ++a;
Such comments don't help, but they do waste the time needed to read
them. You don't know ahead of time whether the comment is
important, so you have to read it. To no purpose. This particular
kind of commenting is all too common.
: >I have three simple rules:
: [good list of three rules]
:
: I also like to see ALL variables described. I can figure out what a
: for-loop is doing, but it's not at all obvious what the
: trans_rec_count variable is a count of (total transactions, to-date,
: how many gropple-grommits in this shipment, etc.). Unless it's just a
: scratch counter such as `i', it should be commented.
    1) Comments should explain the purpose of the code and its
       relationship to the rest of the program.
applies to variables as well as executable code.
BTW, I follow a consistent rule when commenting variables. If the
variable needs a complex description, that goes before the
declaration. On the declaration goes a short comment which
indicates what the variable is; that comment is always a noun
phrase with any leading determiner deleted.
I'll add a fourth rule to my list:
    4) Comments are literary objects; write them with your audience
       in mind. Write clear and standard English (or whatever).
       Avoid unnecessary abbreviations and ellipses. Read a few
       good books on writing and take at least *some* of their
       suggestions to heart; Strunk & White's _The Elements of
       Style_ is, at least, short and entertaining (not to mention
       useful) and there are many others.
---
Bill { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com
dricejb@drilex.UUCP (Craig Jackson drilex1) (12/08/89)
In article <14836@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: >In the referenced message, bill@twwells.com (T. William Wells) wrote: >}In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >}: WHAT? What year is this? I don't think I've ever used a linker that >}: didn't eliminate unused routines. Any such linker would be seriously >}: brain damaged. >} >}Most linkers will not take, from a single object file, just those >}routines needed by the rest of the program. > >Tim is (almost certainly) wrong that he has never used such a brain >damaged linker, since every Unix linker is brain damaged in this fashion. Tim has been on the Mac for a while; I think Mac linkers *may* be different in this regard. >However, T. Bill is wrong that most linkers have this brain damage, since >pretty much every NON-Unix linker works correctly. > Jef Poskanzer jef@well.sf.ca.us {ucbvax, apple, hplabs}!well!jef I know for a fact that MS-DOS linkers have this 'brain damage', even though real librarians are available. The semantics of C make it somewhat harder to eliminate the dead code. (Not impossible. The problem is making sure the static stuff is handled correctly.) I suspect that the MIPS linker can do this, because it can do link-time optimization. I haven't seen it, but the linker for Unisys A-Series C may be able to do this, because it can do this for other languages on that machine. -- Craig Jackson dricejb@drilex.dri.mgh.com {bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}
tim@hoptoad.uucp (Tim Maroney) (12/08/89)
In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: >There IS one argument, in some cases a compelling one, for "one function >per file". In general, linkers aren't smart enough to link just >PART of a binary file (.OBJ or .o), when that file contains a function >needed by the link. In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >WHAT? What year is this? I don't think I've ever used a linker that >didn't eliminate unused routines. Any such linker would be seriously >brain damaged. bill@polygen.uucp (Bill Poitras) wrote: > Yes you have. ANY linker you have does this. What you are thinking of is > a LIBRARY, ie. .lib file. (lib*.a if you're a Unix person), which when used in > the link process, only the functions used in the program begin linked, get > linked. The linkers I use every day on the Macintosh routinely remove all unreferenced routines from output files. On the other hand, it turns out that very few, if any, UNIX linkers do this. I *have* used UNIX linkers (I used to use them all the time, in fact), so my impression was incorrect, as a number of people have kindly informed me by e-mail. I was going to apologize, but I've just read all ten pages of the ld(1) manual page, and it never explicitly says this, so I feel justified in the error. I'm not used to Mac development tools being smarter than UNIX! -- Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com FROM THE FOOL FILE: "The men promise to provide unconditionally for their wives. The women in turn serve unconditionally to provide the other household services necessary for the men to fulfill their obligations to the women. The women are satisfied because they have the men working for THEM." -- Colin Jenkins, soc.women
murphyn@cell.mot.COM (Neal P. Murphy) (12/08/89)
bill@twwells.com (T. William Wells) writes:
>...
>*One* linker certainly does not negate "most". So, without at
>least a few more examples, there isn't any reason to doubt the
>"most". And, unless someone comes up with a few more, there is no
>point in discussing this further.
>...
A few more examples? DEC10/TOPS10 and DEC20/TOPS20 linking loader would
extract only the functions that were referenced, provided it was informed
that it should use the object file as a library, e.g.,
    .algol fubar,jnil,myfncs        ; compile the three ALGOL sources
    .                               ; creating fubar.rel, jnil.rel, and
    .                               ; myfncs.rel
    .load fubar,jnil,myfncs/LIB     ; load the desired functions into memory
    .save fubar                     ; save the image in fubar.run
The procedure on a DEC20 would be similar. It's been ten years since I used
this DEC10, so there could be a minor error in my syntax. This was a KA-10
processor, 196k words memory, 45 jobs, swapping drum and 602 monitor, so
memory size is no reason for not performing selective linking.
NPN
throopw@sheol.UUCP (Wayne Throop) (12/09/89)
> tim@hoptoad.uucp (Tim Maroney)
> The linkers I use every day on the Macintosh routinely remove all
> unreferenced routines from output files. On the other hand, it turns
> out that very few, if any, UNIX linkers do this.
There are many different reasons for this, but it is perhaps worth noting
that the problem isn't purely a linker problem. If the compiler doesn't
generate object code with the separable portions marked out, the linker
simply can't separate them.
There are valid reasons for generating unravelable object files, having
to do with just what tradeoffs one wishes to make between compile, link,
and runtime efficiency. The general Mac environment makes a compelling
case for striking this tradeoff differently than traditional mainframe
and minicomputer language systems do.
> but I've just read all ten pages of
> the ld(1) manual page, and it never explicitly says this, so I feel
> justified in the error.
I think the real meat of it is in the a.out(5) (or is that (6)) man page,
that is, the executable/object-file data format definition. Reading
between the lines of this, it becomes apparent that it would be difficult
(though not impossible) for a compiler to generate separable object files,
and thus at least as difficult for a linker to separate them. (This is,
of course, still not an explicit statement of functional deficit.)
Question: is it possible to convince the Mac language environment to
leave the unreferenced routines and data in the executable image
equivalent? If not, unit testing from a good debugger becomes more
difficult. (Mind you, not impossible.... one could simply artificially
reference every routine you want to unit test, but this could be
tedious or even problematical.)
--
Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org
tim@hoptoad.uucp (Tim Maroney) (12/09/89)
In article <1989Dec6.154103.2078@twwells.com> bill@twwells.com (T. William Wells) writes: >Eh? I've worked on a dozen or so non-Unix machines. Only a few of >them were capable of taking apart an object file and using only >the routines you needed. And those linkers could not be used with >a C compiler that did not play games with static variable names. >(They had no notion of static at all.) > >Care to name some specific systems where the linker could take >apart an object file, and for which a reasonable C compiler >exists? MPW for the Apple Macintosh. I just elaborately verified that the linker does this for a skeptical friend. The C compiler is nearly a full ANSI C, and it certainly does include "statics". I just did another test, and it even deletes unused "static" functions. It's actually a pretty strong development system overall; I've been praising it on the net ever since I was one of the original beta testers. -- Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com This message does represent the views of Eclectic Software.
decot@hpisod2.HP.COM (Dave Decot) (12/09/89)
The questions I want answered by comments are:
What do the possible values of this variable or type represent,
within the user's abstraction?
and
What does this code expect, and what does it assume, about the
values of its arguments and surrounding variables?
and
What, abstractly, does this code guarantee when it's done?
For instance,
    int status;             /* GOOD if the laser hit the target, BAD if not */
    status = zap(plane2);   /* try to blast the enemy plane */
    if (status == GOOD)     /* the airplane was successfully destroyed */
    {
        ++k;                    /* bump the death toll */
        kill(pid2, SIGUSR1);    /* notify the air traffic controller */
    }
I don't care what bits you're twiddling, I want to know what it's supposed
to be for, and what it means abstractly.
The interesting thing about comments like this is you can grab them straight
out of a specification document. >> HINT!!! <<
Dave Decot
hpda!decot
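As a further illustration of the three questions above (what the values mean, what the code expects, what it guarantees), here is a small sketch of a C routine whose header comment answers them and whose assertions check the stated expectations. `fuel_percent` and its parameters are hypothetical, invented only for this example.

```c
#include <assert.h>

/* fuel_percent: share of the tank that is full, as a whole percentage.
 *
 * Expects:    capacity_litres > 0, 0 <= level_litres <= capacity_litres,
 *             and level_litres * 100 fits in a long.
 * Guarantees: the result is in 0..100, rounded down.
 */
static int fuel_percent(long level_litres, long capacity_litres)
{
    assert(capacity_litres > 0);
    assert(level_litres >= 0 && level_litres <= capacity_litres);
    return (int)(level_litres * 100 / capacity_litres);
}
```

Nothing in the comment restates the arithmetic; it says what the numbers stand for and what a caller may rely on, which is the kind of text that could be lifted from a specification.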
hue@netcom.UUCP (Johathan Hue) (12/09/89)
In article <473ae701.20b6d@apollo.HP.COM>, perry@apollo.HP.COM (Jim Perry) writes:
> Again, you're throwing out the baby with the bathwater. External
> documentation has a fundamental flaw alluded to by dopey (no offense): it's
> not generally there, and it's out of date. "Most code comments" are also
> missing or out of date, but only because most code is poorly documented.
> As Fred Brooks says in The Mythical Man-Month:
One of my rules is "You shouldn't have to look at the code to understand
what it does and how it works". If the external documentation isn't there,
then your programmers have no discipline and your managers are a bunch of
wimps. Maybe HP will be able to whip you Apollo boys into shape. :-)
> Well, of course this is the heart of the matter. A few-hundred or few-ten
> line program tells us very little about real life software engineering
> situations. Actually, if the code is properly self-documenting, then the
> new hire *can* just sit down with the code and learn from the code itself.
> Documentation, like code, is hierarchical. At the beginning of each
> program, library, whatever, is a broad overview of that unit. More
> specific comments would be associated with modules, functions, algorithms,
> etc.
I'm sorry, this just doesn't work for me. If you're going to read the code
you're going to have several hundred to several thousand page listing, and
are going to be forever flipping through it trying to trace the flow of
control and read your comment boxes which describe what your functions do.
I once worked on a device driver and in the external documentation I drew
a state machine of how one part of it worked. Are you going to use ascii
text graphics to draw boxes and arrows and put that in your comments?
-Jonathan
hue@netcom.UUCP (Johathan Hue) (12/09/89)
In article <WAYNE.89Dec3140323@dsndata.uucp>, wayne@dsndata.uucp (Wayne Schlitt) writes: > hmmm... one of the first things i usually do to code that i get off > the net is break it up into one function per file. i am not that > dogmatic about it, i just it because it seems to me to be easier to > work with. You're giving up some features of the language if you do that. You no longer have static functions, or static variables outside of functions, so everything becomes global, and good luck if someone decided to have a global and a static with the same name. You also lose some compiler optimizations. For instance, most C compilers can't do inline functions if caller and callee aren't in the same file. -Jonathan
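What is given up by splitting a file this way can be sketched with a small module in C. This is a hypothetical example (the names `pool_get`, `pool_put`, `slot_index`, and `POOL_SLOTS` are invented here), in the spirit of the malloc/free case raised earlier in the thread: two public routines share bookkeeping and a helper that are `static`, and therefore invisible outside the file.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 4            /* size chosen only for this sketch */

/* Visible only inside this file: the shared bookkeeping. */
static int slot_used[POOL_SLOTS];
static long slot_value[POOL_SLOTS];

/* Visible only inside this file: helper used by the public routines.
 * p must point into slot_value. */
static int slot_index(const long *p)
{
    ptrdiff_t i = p - slot_value;
    assert(i >= 0 && i < POOL_SLOTS);
    return (int)i;
}

/* Public: hand out the lowest free slot, or NULL when the pool is exhausted. */
long *pool_get(void)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = 1;
            return &slot_value[i];
        }
    }
    return NULL;
}

/* Public: give back a slot previously obtained from pool_get. */
void pool_put(long *p)
{
    slot_used[slot_index(p)] = 0;
}
```

Moving `pool_get` and `pool_put` into separate files would force `slot_used`, `slot_value`, and `slot_index` to become externally visible, which is exactly the loss of file scope described above.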
bill@twwells.com (T. William Wells) (12/10/89)
In article <9228@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: : In article <1989Dec6.154103.2078@twwells.com> bill@twwells.com (T. : William Wells) writes: : >Care to name some specific systems where the linker could take : >apart an object file, and for which a reasonable C compiler : >exists? : : MPW for the Apple Macintosh. I just elaborately verified that the : linker does this for a skeptical friend. The C compiler is nearly a : full ANSI C, and it certainly does include "statics". I just did : another test, and it even deletes unused "static" functions. It's : actually a pretty strong development system overall; I've been praising : it on the net ever since I was one of the original beta testers. When I had to work on the Mac, my development environment consisted of a cross compiler from a VAX and a downloader originally written in *machine code* to be executed from BASIC! (My first program was a shell with a download command. Surprised?) There *was* no C compiler for the Mac then. I considered it a major revolution to get Manx C, with its tiny shell and a native compiler. I'm glad that things are better now. :-) --- Bill { uunet | novavax | ankh | sunvice } !twwells!bill bill@twwells.com
tim@hoptoad.uucp (Tim Maroney) (12/10/89)
In article <14850@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: >No one is talking >about eliminating unreferenced routines from the main program. Yes, we are. Get a clue. -- Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com "There's a real world out there, with real people. Go out and play there for a while and give the Usenet sandbox a rest. It will lower your stress levels and make the world a happier place for us all." -- Gene Spafford
jef@well.UUCP (Jef Poskanzer) (12/11/89)
In the referenced message, tim@hoptoad.UUCP (Tim Maroney) wrote: }In article <14850@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: }>No one is talking }>about eliminating unreferenced routines from the main program. } }Yes, we are. Get a clue. You are right, I was wrong. Let me rephrase my statement. No one who has been paying attention to the discussion is talking about eliminating unreferenced routines from the main program. --- Jef Jef Poskanzer jef@well.sf.ca.us {ucbvax, apple, hplabs}!well!jef "...for DEATH awaits you all, with nasty sharp pointy teeth!" -- Tim
perry@apollo.HP.COM (Jim Perry) (12/12/89)
In article <5007@netcom.UUCP> hue@netcom.UUCP (Johathan Hue) writes: >In article <473ae701.20b6d@apollo.HP.COM>, perry@apollo.HP.COM (Jim Perry) writes: >> Again, you're throwing out the baby with the bathwater. External >> documentation has a fundamental flaw alluded to by dopey (no offense): it's >> not generally there, and it's out of date. "Most code comments" are also >> missing or out of date, but only because most code is poorly documented. >> As Fred Brooks says in The Mythical Man-Month: > >One of my rules is "You shouldn't have to look at the code to understand >what it does and how it works". If the external documentation isn't >there, then your programmers have no discipline and your managers are a >bunch of wimps. Maybe HP will be able to whip you Apollo boys into shape. :-) I originally wrote a fairly windy response to this, but on the whole I suspect we agree more than we disagree. For the record, my opinions as expressed here are based on all of my experiences, and don't necessarily reflect either policy or practice at HP/Apollo. Most of my opinions about good and bad code documentation were formed by experiences before I came here. Based on my experience, I've come to the conclusion that code should be completely self-documenting to the maximum degree possible. I'm not saying that there shouldn't be external documentation, but that it's unlikely, in practice, to keep up with program mutation to the extent that internal comments are. Again, this is not to say that more comments are always better, but there should be sufficient comments to completely document the code. I stand by my statement that such programs are possible, and that a programmer/engineer can use the documentation in such a program, as a formatted listing on paper or fiche, or just as source text in your favorite browsing editor, to come quickly up to speed on that program. I say this, again, from experience. I do use ASCII characters to draw box-and-line diagrams, as it happens.
It's not perfect, but it's better than nothing. Design specs and other external documents get fancier Interleaf graphics. - Jim Perry perry@apollo.hp.com HP/Apollo, Chelmsford MA This particularly rapid unintelligible patter isn't generally heard and if it is it doesn't matter.
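As a rough illustration of the ASCII box-and-line diagrams discussed above, here is a sketch of a state machine documented in a comment next to the code that implements it. The states, events, and the function next_state are hypothetical; they are not taken from any driver mentioned in the thread.

```c
#include <assert.h>

/*
 * Hypothetical device states, drawn in the comment itself:
 *
 *   +------+   EV_START   +------+
 *   | IDLE | -----------> | BUSY |
 *   |      | <----------- |      |
 *   +------+   EV_DONE    +------+
 *      ^                     |
 *      | EV_RESET            | EV_ERROR
 *      |                     v
 *      |                 +--------+
 *      +---------------- | FAILED |
 *                        +--------+
 */
enum state { IDLE, BUSY, FAILED };
enum event { EV_START, EV_DONE, EV_ERROR, EV_RESET };

/* Return the next state; any event not drawn above leaves the state
   unchanged. */
enum state next_state(enum state s, enum event e)
{
    assert(s == IDLE || s == BUSY || s == FAILED);
    switch (s) {
    case IDLE:
        return e == EV_START ? BUSY : IDLE;
    case BUSY:
        if (e == EV_DONE)
            return IDLE;
        if (e == EV_ERROR)
            return FAILED;
        return BUSY;
    case FAILED:
        return e == EV_RESET ? IDLE : FAILED;
    }
    return s; /* not reached for valid states */
}
```

The diagram costs a dozen comment lines and sits where a maintainer changing the switch statement is most likely to see it, which is the trade-off both posters are arguing over.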
hollombe@ttidca.TTI.COM (The Polymath) (12/12/89)
In article <5007@netcom.UUCP> hue@netcom.UUCP (Johathan Hue) writes: }In article <473ae701.20b6d@apollo.HP.COM>, perry@apollo.HP.COM (Jim Perry) writes: }One of my rules is "You shouldn't have to look at the code to understand }what it does and how it works". If the external documentation isn't }there, then your programmers have no discipline and your managers are a }bunch of wimps. ... Alternatively, the programmers want to keep their jobs and the managers are a bunch of non-technical slave drivers (i.e.: marketing types), who don't give a hoot about the software life cycle as long as the code gets out the door on schedule. }I once worked on a device driver and in the external documentation I drew a }state machine of how one part of it worked. Are you going to use ascii text }graphics to draw boxes and arrows and put that in your comments? I once worked for a place that actually did this. They were a gov't subcontractor and all gov't specs call for flow-charts of all the code. They wrote a program that could read FORTRAN source files and draw a flow-chart, either on a Tektronix graphics display _or_ in ASCII on a line printer. Your tax dollars at work. (-: -- The Polymath (aka: Jerry Hollombe, hollombe@ttidca.tti.com) Illegitimis non Citicorp(+)TTI Carborundum 3100 Ocean Park Blvd. (213) 450-9111, x2483 Santa Monica, CA 90405 {csun | philabs | psivax}!ttidca!hollombe
tim@hoptoad.uucp (Tim Maroney) (12/13/89)
In article <14913@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: >In the referenced message, tim@hoptoad.UUCP (Tim Maroney) wrote: >}In article <14850@well.UUCP> Jef Poskanzer <jef@well.sf.ca.us> writes: >}>No one is talking >}>about eliminating unreferenced routines from the main program. >} >}Yes, we are. Get a clue. > >You are right, I was wrong. Let me rephrase my statement. No one >who has been paying attention to the discussion is talking about >eliminating unreferenced routines from the main program. Wrong again. This has been the subject of every message bearing on the topic. Only you are saying that this is not what the discussion is about. -- Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com "Americans will buy anything, as long as it doesn't cross the thin line between cute and demonic." -- Ian Shoales
psrc@pegasus.ATT.COM (Paul S. R. Chisholm) (12/13/89)
In articles <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman),
<9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney), <600@fred.UUCP>
bill@fred.UUCP (Bill Poitras), and <4304@pegasus.ATT.COM>
dmt@pegasus.ATT.COM (Dave Tutelman again) argue about whether their
linkers are smart enough to ignore unused names.
I'd like to point out that Borland's Turbo Pascal for MS-DOS lets
you do something very much like this: "Turbo Pascal 5.0's built-in
linker automatically removes unused code and data when building an
[executable] file. Procedures, functions, variables, and typed
constants that are part of the compilation, but never get referenced,
are removed in the [executable] file. The removal of unused code takes
place on a per procedure basis, and the removal of unused data takes
place on a per declaration section basis." (Source: Turbo Pascal 5.0
Reference Guide, p. 220.)
But Borland cheated, sort of. Turbo Pascal doesn't use the .OBJ file
format that Intel defined. Instead, each separate compilation is a
"unit", more like a library than what a C programmer would think of as
an object file. Since Borland defined what a unit looks like, they
could set it up to allow for smart linking. (TP 5.0 and 5.5 can also
link in ordinary MS-DOS .OBJ files, but I think the implication is that
these are just dragged in whole hog.) Since TP 5.5 offers object
oriented extensions, this smart linking can come in extremely handy.
I'm not sure if Pascal's block structure makes smart linking easier or
harder.
Paul S. R. Chisholm, AT&T Bell Laboratories
att!pegasus!psrc, psrc@pegasus.att.com, AT&T Mail !psrchisholm
I'm not speaking for the company, I'm just speaking my mind.
meissner@dg-rtp.dg.com (Michael Meissner) (12/14/89)
In article <600@fred.UUCP> bill@polygen.uucp (Bill Poitras) writes: | In article <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: | %In article <4290@pegasus.ATT.COM> dmt@pegasus.ATT.COM (Dave Tutelman) writes: | %%There IS one argument, in some cases a compelling one, for "one function | %%per file". In general, linkers aren't smart enough to link just | %%PART of a binary file (.OBJ or .o), when that file contains a function | %%needed by the link. | % | %WHAT? What year is this? I don't think I've ever used a linker that | %didn't eliminate unused routines. Any such linker would be seriously | %brain damaged. | %-- | %Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com | Yes you have. ANY linker you have does this. What you are thinking of is | a LIBRARY, ie. .lib file. (lib*.a if you're a Unix person), which when used in | the link process, only the functions used in the program being linked, get | linked. Although I'm not a compiler/linker expert, I'm almost positive that | this is true. I suspect that most real world linkers work the way UNIX does (with regard to loading the entire contents of an object file, instead of just the 'functions' that are needed). However, if the linker is this 'helpful', it breaks a debugging paradigm that we in DG languages and also GNU C use. Mainly, you include some functions that are otherwise unused in the program that take a pointer to some internal datatype, and print the datatype out in an implementation defined manner.
For example, here is a fragment of a dbx debugging session on GCC that calls the function 'debug_rtx' to print out an RTL tree (reformatted to fit in 80 columns):

(2) Stopped in final (first=(rtx) 0x412010, file=(struct FILE *) 0x405820,
    write_symbols=NO_DEBUG, optimize=0, prescan=0), file final.c, line 537
  537       insn = final_scan_insn (insn, file, write_symbols, optimize,
(dbx) print insn
insn = (rtx) 0x4121b8
(dbx) print debug_rtx(insn)
(insn 7 6 8 (set (reg:SI 2)
        (symbol_ref:SI ("*@LC0"))) 89 (nil)
    (nil)
    (nil))
debug_rtx(insn) = 1
(dbx)
--
Michael Meissner, Data General.
Until 12/15: meissner@dg-rtp.DG.COM
After 12/15: meissner@osf.org
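The debugging idiom described above can be sketched roughly as follows. This is not GCC's debug_rtx; struct node, format_node, and debug_node are hypothetical stand-ins, and the sketch assumes each node's label is a non-null string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical internal datatype; stands in for something like an RTL
   node. The label field is assumed to be non-null. */
struct node {
    int id;
    const char *label;
    const struct node *next;
};

/* Format one node into buf. Returns whatever snprintf returns: the
   number of characters the full text needs, excluding the terminator. */
int format_node(char *buf, size_t size, const struct node *n)
{
    assert(buf != NULL && size > 0);
    if (n == NULL)
        return snprintf(buf, size, "(nil)");
    return snprintf(buf, size, "(node %d \"%s\" next=%s)",
                    n->id, n->label, n->next ? "yes" : "(nil)");
}

/* Never called by the program itself. It exists so that a debugger
   command such as `print debug_node(p)` has something to invoke.
   Returns 1 so the debugger has a value to display. */
int debug_node(const struct node *n)
{
    char buf[128];

    format_node(buf, sizeof buf, n);
    fprintf(stderr, "%s\n", buf);
    return 1;
}
```

Nothing in the program references debug_node, so a linker that discards unreferenced functions would remove exactly the routine the debugger session depends on.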
decot@hpisod2.hp.com@canremote.uucp (decot@hpisod2.HP.COM) (12/21/89)
From: decot@hpisod2.HP.COM (Dave Decot)
Subj: Re: Coding standards (was Re: Programmer productivity)
Orga: Hewlett Packard, Cupertino
The questions I want answered by comments are:

    What do the possible values of this variable or type represent,
    within the user's abstraction?

and

    What does this code expect, and what does it assume, about the
    values of its arguments and surrounding variables?

and

    What, abstractly, does this code guarantee when it's done?
For instance,
    int status;              /* GOOD if the laser hit the target, BAD if not */

    status = zap(plane2);    /* try to blast the enemy plane */
    if (status == GOOD)      /* the airplane was successfully destroyed */
    {
        ++k;                 /* bump the death toll */
        kill(pid2, SIGUSR1); /* notify the air traffic controller */
    }
I don't care what bits you're twiddling, I want to know what it's
supposed to be for, and what it means abstractly.
The interesting thing about comments like this is you can grab them
straight out of a specification document. >> HINT!!! <<
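The second and third questions (what the code expects, and what it guarantees) might look like this when attached to a function. This is only a sketch; the function hottest and its units are hypothetical and not part of the original post.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Expects:    readings points to count values, count >= 1; each value is
 *             a temperature in tenths of a degree Celsius.
 * Guarantees: returns the hottest reading; the array is not modified.
 */
int hottest(const int *readings, size_t count)
{
    size_t i;
    int best;

    assert(readings != NULL && count >= 1);
    best = readings[0];            /* hottest reading seen so far */
    for (i = 1; i < count; ++i)
        if (readings[i] > best)
            best = readings[i];
    return best;
}
```

The comment says nothing about loops or indices; it states the contract in the user's terms, and the assert records the part of that contract that can be checked cheaply.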
Dave Decot
hpda!decot
---
* Via MaSNet/HST96/HST144/V32 - UN Misc. Computer Topics
* Via Usenet Newsgroup comp.misc
ts@cup.portal.com (Tim W Smith) (12/31/89)
In article <9157@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >Absolutely. Clear code shouldn't *need* a lot of comments; a >programmer should be able to read it and understand what's going on >from the routine names, the variable names, and the flow of control, >with just a few added comments if any. A lot of extraneous comments >about things that would be perfectly clear just from reading the code >actually damages code readability; the control structures become much >harder to follow. I tend to agree. When I code, I tend to comment on the difficult parts, and leave the obvious parts to speak for themselves. On the other hand, something that may be obvious to me because of my several years of experience may not be obvious to a junior member of the staff who is getting his first look at whatever arcane thing is being dealt with. So, what should one do? Include a lot of "extraneous" stuff to help the jr. guys, or should one be minimalist to avoid distracting the other sr. engineers when they look at the code? I think we came up with a pretty good solution to this problem where I work. The comments in the source code are aimed at a person who has a detailed knowledge of the field that the code is part of. For example, if the code was a Unix disk driver, it is assumed that the person reading the code is an expert on Unix disk drivers. Part of the final documentation for any code we write is something we call "left hand pages". This is a detailed commentary on the code written by the implementor of the code. We call it "left hand pages" because we print a copy of the source code and put it in a binder together with the "left hand pages" in a way that causes one to have the source code on the right and the commentary on the left. The LHPs are aimed at a much lower level. For example, if the code is for a Unix disk driver, the LHPs would assume that you can spell "Unix" and know that by "disk" we don't mean "Frisbee". 
Well, perhaps not quite this low a level...:-). For example, a recent Unix disk driver we did had about 45k of source code. The LHPs came to about 120k. As a side effect, we've discovered that quite a few bugs are found when the implementor is producing the LHPs. Tim Smith