pete@octopus.UUCP (Pete Holzmann) (07/15/88)
In article <2274@sugar.UUCP> karl@sugar.UUCP (Karl Lehenbauer) writes:
>[I wrote]
>> ...There are plenty of
>> PC-based tools for binary analysis that can be quickly run over
>> reasonably sized programs (and slowly run over big programs)...
>> automatic disassemblers that produce comments for anything that touches
>> the external environment (system memory, I/O ports, interrupts, system
>> calls), etc...
>Well, I assume automatic disassemblers blow off stuff they don't understand
>and just .DATA it or whatever as binary data. It would be no problem for a
>Trojan Horse to decrypt the portion of itself that actually trashes your
>system when it has decided that the time is right. That way, string searches
>and code that looks for anything that touches system memory, I/O ports,
>interrupts, system calls, etc, will fail to locate the Trojan. More clever
>variations can be envisioned in which the encrypted part, or code that
>generates the code to do the trashing, etc, etc appears to be something
>useful, or at least seems too complicated to bother to decipher.

Let me first say that I am sure it *is* possible to confuse the heck out of
a great disassembler, even if it is human :-)! The same goes for source code.
And my response to either is the same: If I can't understand what the code is
doing, or if it looks 'funny', I don't trust it.

As far as your examples go, a good disassembler (Sourcer for the PC is a
pretty good one, for example) keeps working at the program until it
understands everything that is at all normal, and *marks* everything that
isn't. Thus:

	- all code is simulated [simple simulation, but enough to handle most
	  requirements] enough to trace every path of existing code.
	  [Encrypted code is not properly disassembled, but...]
	- all destinations of flow control transfer are marked.
	  If the destination is a data area, that's a heavy duty flag for
	  weird code [either we're dealing with encrypted code,
	  self-modifying code, or something similar]

The job is much harder on an Intel-architecture CPU than for a good CPU :-).
The addressing is rather strange, with overlapped address bits from the
segment and offset registers for the current instruction address.

From experience, the examples you gave *would* be found by a good
disassembler. Actually, what gives it the worst fits is inline data following
a subroutine call. It marks everything, but in order to get a usable
disassembly, I've got to go through and fix up the markers by hand.
Fortunately, normal languages don't do this.

MY NEWS.ADMIN POINT: Sure, somebody could trick me. But the resources I have
available for verifying binary programs [not the least of which is trusting
the members of the net] make me as confident about using binaries as about
using source code.

Followups about disassembler technology should probably be redirected to
comp.arch or something.
-- 
  OOO   __| ___      Peter Holzmann, Octopus Enterprises
 OOOOOOO___/ _______ USPS: 19611 La Mar Court, Cupertino, CA 95014
  OOOOO \___/        UUCP: {hpda,pyramid}!octopus!pete
___| \_____          Phone: 408/996-7746
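
For anyone who hasn't fought with it, the overlapped segment/offset addressing mentioned above can be shown in a few lines. This is only a minimal Python sketch, assuming 8086 real mode (physical address = segment * 16 + offset, wrapped to 20 bits). The function names are made up for illustration and are not taken from Sourcer or any other tool.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode 8086 address: segment shifted left 4 bits, plus offset.

    The result is masked to 20 bits, as on the original 8086 address bus.
    """
    return ((segment << 4) + offset) & 0xFFFFF


def aliases(segment: int, offset: int, count: int = 3) -> list:
    """Hypothetical helper: list up to `count` other segment:offset pairs
    that name the same physical byte as segment:offset.

    Raising the segment by 1 adds 16 to the address, so the offset must
    drop by 16 to compensate. Stop when the offset would go negative.
    """
    pairs = []
    for k in range(1, count + 1):
        new_offset = offset - 16 * k
        if new_offset < 0:
            break
        pairs.append(((segment + k) & 0xFFFF, new_offset))
    return pairs


if __name__ == "__main__":
    print(hex(physical_address(0x1234, 0x0020)))
    for seg, off in aliases(0x1234, 0x0020):
        print(f"{seg:04X}:{off:04X} -> {physical_address(seg, off):05X}")
```

Here 1234:0020, 1235:0010 and 1236:0000 all name physical address 0x12360, so a tool that marks flow control destinations has to compare physical addresses rather than raw segment:offset pairs.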