news@brl-adm.UUCP (04/17/87)
> I just had an interesting thought. There is such a thing
> as a dis-assembler, but is it possible to write an un-loader? In
> other words, I'd like to have a tool where I can break up an executable
> into the object files that it was loaded with, i.e. un-load it. With
> such a tool, a computer site with only executables can un-load a
> program and then re-load it with their own version of a subroutine.

Your thought about an "un-loader" is indeed interesting. However, given
current machine architectures and the current format of executable
modules, such a device would be of limited capability.

The process of linking individual relocatable binary modules into a
single absolute executable module (or a set of overlays) reduces the
total amount of information in the system. The loss of information
makes the process irreversible.

Each relocatable object module contains information identifying the
location of externally referenced and globally declared addresses and
whether they are in the data or text ("code" to us pre-unix oldtimers)
sections. The size of the instruction field which references each
relocatable address is also encoded into the relocatable object module.
When the individual modules are linked, much of this information is
discarded. The loss of information makes it impossible to re-create the
original environment.

As a demonstration, consider that you have the following Motorola 68000
assembly language code in a source module:

        EXTERNAL EXTNVAR
        .
        .
        TEXT                    ;BEGINNING OF TEXT (CODE) SECTION
        MOVE.L  #$0011FE,D3     ;LOAD CONSTANT INTO D3 REGISTER
        .
        .
        MOVE.L  #EXTNVAR,D3     ;LOAD ADDRESS OF EXTERNAL VARIABLE
        .
        .
        DATA                    ;BEGINNING OF DATA SECTION
        DC.W    0               ;16 BIT CONSTANT
        DC.W    $0011FE         ;16 BIT CONSTANT
        .
        .
        DC.W    0               ;16 BIT CONSTANT
        DC.W    EXTNVAR         ;16 BIT RELOCATABLE CONSTANT
        DC.L    EXTNVAR         ;32 BIT RELOCATABLE CONSTANT

Now consider that when the program is loaded, the external variable
EXTNVAR could reside at address $0011FE.
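
The effect of that coincidence can be sketched with a toy linker. This is
only an illustration, not any real loader or object format: the
(offset, size, symbol) relocation records, the link() helper and the
variable names are all hypothetical, and the opcode word $263C is my own
encoding of MOVE.L #imm,D3. It shows that once the relocation records are
dropped, the constant and the relocated forms leave identical bytes in the
output.

```python
import struct

# Assumed load address of EXTNVAR, matching the scenario in the post.
SYMBOLS = {"EXTNVAR": 0x0011FE}

# Opcode word for MOVE.L #imm,D3 on the 68000 (followed by a 32-bit immediate).
MOVE_L_IMM_D3 = 0x263C


def link(section, relocs, symbols):
    """Patch each relocation into a copy of the section bytes.

    relocs is a list of hypothetical (offset, size_in_bytes, symbol) records.
    Only the patched bytes are returned; the records themselves are
    discarded, which is the information loss described in the post.
    """
    out = bytearray(section)
    for offset, size, name in relocs:
        fmt = ">H" if size == 2 else ">I"  # big-endian, as on the 68000
        struct.pack_into(fmt, out, offset, symbols[name])
    return bytes(out)


# Text section: a plain constant versus a relocatable external address.
text_const = link(struct.pack(">HI", MOVE_L_IMM_D3, 0x0011FE), [], SYMBOLS)
text_reloc = link(struct.pack(">HI", MOVE_L_IMM_D3, 0),
                  [(2, 4, "EXTNVAR")], SYMBOLS)

# Data section: three different source forms.
data_two_consts = link(struct.pack(">HH", 0, 0x0011FE), [], SYMBOLS)
data_word_reloc = link(struct.pack(">HH", 0, 0), [(2, 2, "EXTNVAR")], SYMBOLS)
data_long_reloc = link(struct.pack(">I", 0), [(0, 4, "EXTNVAR")], SYMBOLS)

print(text_const.hex(), text_reloc.hex())
print(data_two_consts.hex(), data_word_reloc.hex(), data_long_reloc.hex())
```

Both text fragments print the same six bytes and all three data fragments
print the same four bytes, so nothing in the linked output says which
source form produced them.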
The relocatable object module contains encoded information that
identifies the immediate operand field of the MOVE.L #EXTNVAR,D3
instruction as a reference to an external variable whose address is to
be determined at load time. After the executable module has been
created, the relocation information is discarded and it is now
impossible to distinguish the two MOVE.L instructions, one of which
holds a constant value while the other holds a relocatable address.

The constants in the data section create a similar problem. Is the
first zero constant a separate 16-bit value, or the upper half of a
32-bit $0011FE? Is the second zero constant a separate value followed by
"DC.W EXTNVAR", or the upper half of a "DC.L EXTNVAR"?

The text section is relatively easy to decode (via dis-assemblers)
because of the well known structure of the instructions. The data
section, however, has no well known structure, and its structure cannot
always be totally determined from the information in the text section.
Note that dis-assemblers also suffer from the loss of information
syndrome and sometimes err in their interpretations. Also note that
dis-assemblers do not attempt to describe the true structure of the data
section, whereas an un-loader must!

You could use an "un-loader" type of program to identify the places
where a specific number appears. However, you must examine the context
to determine whether it is a constant or an address, and how many bits
are in the reference. The context examination is not a trivial task
(especially in the data section) and may be indeterminate.

As a partial solution to the problem, UNIX loaders have an option to
retain the relocation information in the loaded module, but most object
modules distributed to users do not have this information in them.

I have touched on the most obvious problems; there may be others which
are equally intractable.

In summary, the loss of information in the system during the loading
process makes the creation of a perfect un-loader impossible, and a
usable one very difficult.

D.A.R.Y.L.
Daryl Crandall
1820 Dolley Madison Blvd.
McLean, VA 22102
daryl@gateway.mitre.org
(703) 883-7278