wwho@ucdavis.edu (W. Wilson Ho) (09/06/90)
I am looking for any information related to disassembling object code into assembly language or even a higher-level language such as C. Would someone please give me pointers to program sources, documentation or papers related to this?

Thanks in advance!

W. Wilson Ho                 | INTERNET: how@ivy.ucdavis.edu
Division of Computer Science | UUCP: ...!ucbvax!ucdavis!ivy!how
EECS Department              | BITNET: wwho@ucdavis.bitnet
University of California     |
Davis, CA 95616              |

[Turning object code back into assembler is pretty straightforward, and every debugger does it. Someone else asked about disassembling into higher level languages a little while ago, but I didn't see any responses. -John]
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.
hankd@dynamo.ecn.purdue.edu (Hank Dietz) (09/10/90)
In article <HOW.90Sep5173755@sundrops.ucdavis.edu> you write:
> I am looking for any information related to disassembling
> object code into assembly language or even higher-level language such
> as C. Would someone please give me pointers to program sources,
> documentation or papers related to this?

Basic disassembly is trivial, particularly if you have an object module with a name list. The interesting problems are:

[1] Determining which portions of a raw memory image are code and which are data. Typically, this is done by providing a set of code entry points and having the disassembler trace program flow, marking each word with type information as each flow path is followed.

[2] Dealing with self-modifying code. At least the technique of [1] can detect when this might happen.... I don't know of any reasonable way to deal with it.

Notice that indirect jump tables are particularly difficult to flow trace (see [1]), as are techniques which use a Call instruction but follow the instruction with the argument values (raw data) and tweak the return address appropriately (as in some threaded interpreters).

Notice that knowing that the code image came from a particular compiler can make these problems much easier to deal with, since you can simply recognize the compiler's code generation idioms.

-hankd@ecn.purdue.edu

PS: Back around 1981-2 I did a flow analyzing disassembler for several then-popular microprocessors (e.g., 8080). I still have it, but it really isn't very impressive... especially when it hits some of those problem cases noted above (e.g., PCHL).
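The flow-tracing approach described in [1] can be sketched in a few lines. This is only an illustration under stated assumptions, not the 1981 disassembler mentioned in the PS: the opcode table covers a small, deliberately incomplete subset of the 8080, and the names `OPCODES`, `trace`, and the tag strings are made up for this example. Each byte starts out tagged as data and is promoted to code or operand when some path from an entry point reaches it. Indirect jumps such as PCHL are only reported, since their targets cannot be read from the image.

```python
from collections import deque

# Hypothetical, incomplete 8080 subset: opcode -> (mnemonic, length, kind).
# kind: "seq" falls through, "jmp" jumps, "jcc" jumps or falls through,
# "call" calls and returns, "stop" ends the path, "ind" is an indirect jump.
OPCODES = {
    0x00: ("NOP", 1, "seq"),
    0x3E: ("MVI A", 2, "seq"),
    0x21: ("LXI H", 3, "seq"),
    0xC3: ("JMP", 3, "jmp"),
    0xC2: ("JNZ", 3, "jcc"),
    0xCA: ("JZ", 3, "jcc"),
    0xCD: ("CALL", 3, "call"),
    0xC9: ("RET", 1, "stop"),
    0x76: ("HLT", 1, "stop"),
    0xE9: ("PCHL", 1, "ind"),
}


def trace(image, entry_points):
    """Tag each byte of image as 'code', 'operand' or 'data'."""
    tags = ["data"] * len(image)   # data until some flow path reaches it
    unresolved = []                # addresses of indirect jumps we cannot follow
    work = deque(entry_points)     # addresses still to be traced
    while work:
        pc = work.popleft()
        # Follow one path until it ends or rejoins bytes already tagged.
        while 0 <= pc < len(image) and tags[pc] == "data":
            entry = OPCODES.get(image[pc])
            if entry is None:
                break              # opcode outside the subset: give up on this path
            _name, length, kind = entry
            if pc + length > len(image):
                break              # instruction would run off the end of the image
            tags[pc] = "code"
            for i in range(pc + 1, pc + length):
                tags[i] = "operand"
            if kind in ("jmp", "jcc", "call"):
                # 8080 addresses are 16-bit, low byte first.
                work.append(image[pc + 1] | (image[pc + 2] << 8))
            if kind == "ind":
                unresolved.append(pc)
            if kind in ("jmp", "stop", "ind"):
                break              # no fall-through after these
            pc += length
    return tags, unresolved


if __name__ == "__main__":
    image = bytes([
        0x3E, 0x05,        # 0:  MVI A,5
        0xCA, 0x0A, 0x00,  # 2:  JZ 000A
        0xC3, 0x0B, 0x00,  # 5:  JMP 000B
        0x48, 0x49,        # 8:  two bytes that no path reaches
        0xC9,              # 10: RET
        0x21, 0x0A, 0x00,  # 11: LXI H,000A
        0xE9,              # 14: PCHL
    ])
    tags, unresolved = trace(image, [0])
    for addr, (byte, tag) in enumerate(zip(image, tags)):
        print(f"{addr:04X}  {byte:02X}  {tag}")
    print("indirect jumps not followed:", unresolved)
```

In the sample image the two bytes at addresses 8 and 9 keep their data tag because no traced path reaches them, and the PCHL at address 14 ends up in the unresolved list. A real tool would need extra entry points or compiler-specific knowledge to go further, and the call-followed-by-inline-arguments idiom would mislead this sketch because it assumes every CALL returns to the next instruction.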