dg@wrs.UUCP (David Goodenough) (10/16/87)
In article <117@nusdhub.UUCP> rwhite@nusdhub.UUCP (Robert C. White Jr.) writes:
> First, a linker is a "linking loader" and its only REAL
>purpose is to resolve references and set up dynamic relocation
>information. ALL "short" jumps [add/subtract value from the
>instruction pointer directly] are completely generated and closed
>in the assembler. If there were to be an alteration in the size
>of a block of code-text, the linker would have to "reassemble" the
>code block to make up for the altered "short" jumps and such. In
>order for the linker to do this it would either need the source
>code, or it would have to take a stab at disassembling the object
>code and hope not to get it wrong.

Not true - every (useful) assembler out there can generate information saying that a given word needs to be relocated in some way, so why not just have two types of relocation: far (for unresolved, long distance) and near (for resolved, short distance)?

> The assembler is designed to AUTOMATICALLY determine the
>"near"ness or "far"ness of a reference. This works beautifully as
>long as you act like a Pascal programmer and only use backward
>references. For the forward references you must use a keyword like
>"near" or "far" or else take what you get, and hope what you get
>is close enough to work.

Wrong again - I have seen many assemblers in my time, and only one was a single-pass animal (and it was a real sharp piece of software - it had to maintain two symbol tables: one for defined labels, and another for unresolved references). Think about it for half a second: you have to do a second pass to resolve all the forward references anyway, so all you do is add a link field to your symbol table, linking the symbols together in address order. Then whenever you munge a far / near branch, just run up the chain adjusting all the references (this actually requires a third pass, or O(N^2) time in the number of symbols).
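The adjustment of forward references described above can be sketched in Python. This is only an illustration under stated assumptions, not the assembler discussed in the post: it uses repeated address passes until nothing changes, rather than a linked symbol chain, and the names `relax`, `SHORT` and `NEAR` and the tuple-based program format are all invented for the example. The sizes assume 16-bit 8086 jumps (2 bytes for a rel8 jump, 3 bytes for a rel16 jump).

```python
SHORT, NEAR = 2, 3  # assumed sizes: 8086 JMP rel8 is 2 bytes, JMP rel16 is 3


def relax(program):
    """Pick a size for every jump (hypothetical helper, not from the post).

    program is a list of ("label", name), ("code", nbytes) or ("jmp", target).
    Every jump starts out short and is widened only when it cannot reach.
    """
    sizes = {i: SHORT for i, item in enumerate(program) if item[0] == "jmp"}
    while True:
        # Address pass: lay out the code using the current size guesses.
        addr, labels, starts = 0, {}, {}
        for i, (kind, arg) in enumerate(program):
            starts[i] = addr
            if kind == "label":
                labels[arg] = addr
            elif kind == "code":
                addr += arg
            else:
                addr += sizes[i]
        # Check pass: a rel8 displacement is measured from the next instruction.
        changed = False
        for i, size in list(sizes.items()):
            if size == SHORT:
                disp = labels[program[i][1]] - (starts[i] + SHORT)
                if not -128 <= disp <= 127:
                    sizes[i] = NEAR  # widening moves every later address
                    changed = True
        if not changed:
            return sizes, labels


if __name__ == "__main__":
    prog = [("jmp", "edge"), ("jmp", "far"), ("code", 125),
            ("label", "edge"), ("code", 200), ("label", "far")]
    print(relax(prog))
```

In the sample program the first jump only just fits as a short jump, but widening the second jump pushes its target one byte out of range, so a further pass is needed. Jumps only ever grow, so the loop terminates.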
> It should be obvious that the assembler's job is to assemble
>and the linker's job is to link.

Open to discussion. I added the -X flag to my assembler to do an assemble and link all in one go on a source. It saves some time, as I get away with three passes (two for the assembly and one for linkage) as opposed to four (two each for the separate assembly and linkage).
--
	dg@wrs.UUCP - David Goodenough

					..... !rtech!--\
							>--!wrs!dg
					..... !sun!--/

					+---+
					| +-+-+
					+-+-+ |
					  +---+
rwhite@nusdhub.UUCP (Robert C. White Jr.) (10/18/87)
As far as your comments on my comments on the assembler, I must cede to you greater knowledge. My only exception to your statements is this: when I used "SHORT" you started talking about "NEAR" and "FAR". Under the Intel [grab bag] there is an instruction which simply adds/subtracts a signed byte from the instruction pointer. In fact ALL the conditional branches in the x86 family are this type of thing.

The symbol table for load-time fixups, not runtime relocation, would be horrendous. Since we were talking about shrinking and growing the operations between word and dword parameters for every FAR reference which became NEAR during linking [i.e. "Why, that's right over here, I don't need a segment on this reference."], the problems become numerous.

1) All the conditional branches will have to have their inline constants checked for validity, and each "short" label, which the assembler now disposes of, will have to be placed in the external symbol table.

2) The linker will have to look at the entire body of code, and determine if any of the internal or external references to a procedure [i.e. use of the "call" op] are actually far. If none are, then the linker can go ahead and change all the call ops and the return ops to no longer contain a reference to a segment. It must then:

   a) Scan the body of the call for any segment overrides referring to the code segment, and turn them into NEAR ops.

   b) Scan the body of the text, and remove one of the negative-of-frame pops, because the frame has been shrunk by one word. This scanning would have to include iterated loops and simple addition of constants to the stack pointer or base pointer, or any register or memory location which assumes the value of the above during the functioning of the call, or any of its branches or children.

<Jumps aren't this bad, but they have their own stickiness about them to complicate these "fixups">

..... [stuff and issues ad nauseam deleted] .....
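Point 1) can be made concrete with a small Python sketch. It assumes 16-bit code, where a direct far call takes 5 bytes and a near call takes 3, so turning one into the other removes 2 bytes. The function name `recheck_short_branches` and its argument layout are hypothetical, chosen only for illustration.

```python
def recheck_short_branches(branches, edit_at, delta):
    """Recompute rel8 displacements after the code changes size at one spot.

    branches: list of (insn_addr, target_addr) for 2-byte short branches.
    edit_at:  address at which the linker grows or shrinks the code.
    delta:    size change in bytes (negative when a far call becomes near).
    Returns one (old_disp, new_disp, still_fits) tuple per branch.
    """
    def moved(addr):
        # Everything at or beyond the edit point slides by delta bytes.
        return addr + delta if addr >= edit_at else addr

    report = []
    for insn, target in branches:
        old_disp = target - (insn + 2)  # relative to the next instruction
        new_disp = moved(target) - (moved(insn) + 2)
        report.append((old_disp, new_disp, -128 <= new_disp <= 127))
    return report


if __name__ == "__main__":
    # One branch spans the edit at address 50, the other lies wholly beyond it.
    for row in recheck_short_branches([(0, 100), (200, 150)], 50, -2):
        print(row)
```

A branch that spans the edit needs its inline constant patched even though it still fits, while a branch entirely on one side keeps its displacement. To find the spanning branches at all, the linker needs the addresses of short labels that the assembler would normally have thrown away.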
My point was that any assembler worth its salt, in a similarly valued programmer's hands, can serve MUCH better through the judicious use of "NEAR" and "FAR" than through screaming about how easy it would be to make a linker which would "take a far call, turn it into a near call, and then delete the no-ops to tighten the code".

Yes, assemblers send all the pertinent information to the linker, and yes, the linker can do amazing things with it, and yes, assemblers can do all sorts of optimizing things to assembly code. That was part of my point. But when someone tells me that they want to have a "Linker" go through the output of their various compilers and assemblers and have it re-optimize their code, right down to removing all those pesky extra no-ops and over-referenced calls, I SCOFF. Isn't that what I bought my compiler and assembler for? <sort of>

If there is such a product, a "disassembler-optimizer-assembler-linker" which can work with "any object output, no matter what the source" and optimize it to such a persnickety level, there won't have to be any programmers, because the language it was written in will take instructions like "make me an inventory cost-accounting database package customized to my needs.<CR>" as a fully functional programming directive.

I say: Write one for me, and I'll see how well it works.

Rob.
daveb@geac.UUCP (10/19/87)
A very reasonable (and reasoned) discussion between David Goodenough and Robert C. White Jr., et al., prompts me to paraphrase Christopher Fraser thusly:

It is noticeable that assemblers do a lot of address resolution, passing resolution on to a linker in cases where the information in the assembler file is insufficient. The linker, in turn, does a lot of machine-dependent loading.

It is a good idea to teach the assembler to be just a single-pass translator, passing all resolution off to the linker.

It is admitted that this means identifying internal -vs- external symbols to the linker so it does not mis-link an internal label "i" to an external one or vice versa.

It is now apparent that the linking algorithm is a merge-sort, given a properly ordered set of linkable objects.

It is admitted that this ordering may require a topological sort to properly order the commands in the linkage control-file.

It is finally apparent that the algorithm is portable to many (but not all) machines, with the machine-dependent portion placed in a separate loader.

This is not to say that the classical form of linking is wrong, merely that it is historical...

--dave
--
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.
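The topological sort mentioned above can be sketched as follows. The post does not say what "properly ordered" means, so this example assumes it means that an object defining a symbol comes before every object that uses it. The function name `link_order` and the object and symbol names are invented for illustration.

```python
from collections import deque


def link_order(objects):
    """Order object files so that definers come before their users.

    objects maps a name to (defined_symbols, used_symbols). This is an
    assumed reading of "properly ordered"; the post does not define it.
    """
    definer = {sym: name for name, (defs, _) in objects.items() for sym in defs}
    # deps[name] = the other objects that must be placed before name.
    deps = {name: {definer[s] for s in uses
                   if s in definer and definer[s] != name}
            for name, (_, uses) in objects.items()}
    waiting = {name: len(d) for name, d in deps.items()}
    ready = deque(sorted(n for n, k in waiting.items() if k == 0))
    order = []
    while ready:  # Kahn's algorithm
        cur = ready.popleft()
        order.append(cur)
        for name in sorted(deps):
            if cur in deps[name]:
                waiting[name] -= 1
                if waiting[name] == 0:
                    ready.append(name)
    if len(order) != len(objects):
        raise ValueError("cyclic references: no single ordering exists")
    return order


if __name__ == "__main__":
    print(link_order({
        "main.o": ({"main"}, {"print", "sqrt"}),
        "io.o": ({"print"}, {"strlen"}),
        "str.o": ({"strlen"}, set()),
        "math.o": ({"sqrt"}, set()),
    }))
```

Mutually referencing objects have no such ordering, which is why the sketch raises an error on a cycle instead of guessing.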