eric@snark.UUCP (Eric S. Raymond) (10/17/88)
In article <7226@ihlpl.att.com>, knudsen@ihlpl.ATT.COM (Knudsen) writes:
>In article <e6m10#2eDFfC=eric@snark.UUCP>, I write:
>>Excuse me, but I thought the security problem in for-sale software was to
>>guard it from unauthorized *copying* and *use*, not unauthorized
>>*understanding*.
>
> Well, some vendors are afraid of people (competitors?) understanding
> their code.

And distributing a uMIIL isn't going to make automatic disassembly *easier*?

Long ago, in my pre-UNIX days, I once started writing a smart disassembler for
8086 code, one that would recognize illegal instructions, do flow-of-control
analysis on jumps and assign symbolic labels (then allow you to change the
names to meaningful ones). It would recognize and interpret OS service calls,
so you'd be able to spot I/O subroutines at a glance. It would keep its
deductions and your comments on them in a database so you could analyze code
interactively in stages.

The Cracker, I called it. You'd sic this thing on a binary, watch the listings
it generated, and add comments through it. The end product: a text database
which, when merged against the binary through the Cracker, would produce a
neat commented listing.

What stopped me? Well, I got this 68010 UNIX box (which is now dying and being
replaced by an 80386). Suddenly cracking 8086 machine code didn't look very
interesting anymore...but if the code for both machines had been distributed
in a uMIIL, I would have had lots of incentive to *finish* the Cracker.

And then I'd have given it to the world. BBSs would start swapping
comment/label databases for popular programs. And the uMIIL-using
manufacturers' code would suddenly be naked, stripped of whatever dubious
protection the uMIIL was supposed to get them.

Now, even if (in this uMIIL-using alternate reality) *I'd* never finished a
uMIIL Cracker, *someone would have*! Machine-language distribution doesn't
concentrate the incentive to produce such a program the way a uMIIL would,
because in a uMIIL world the program would only have to be done *once*. What
price 'knowledge security' then?

No, manufacturers are better off without a uMIIL and *with* multiple barriers
to code-cracking.

P.S. on the Cracker concept:

Does anyone know of something like this having been actually implemented?

Notice that all the code except the single-instruction disassembler would have
been machine-independent; plug in a new such routine, and you support a new
instruction set. Since the only code the Cracker would need to be smart about
was control transfers and service-request traps, I even thought about trying
to make it table-driven from some kind of instruction-set description
language.

You know, maybe I *should* go back and finish it, just as an interesting
research problem of course...
--
Eric S. Raymond (the mad mastermind of TMN-Netnews)
UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
Post: 22 S. Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
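
The split described in the P.S. (one machine-specific decoder, everything else generic) can be sketched roughly as follows. This is only an illustration, not the Cracker itself: the toy instruction set in TOY_TABLE, the function names, and the L_xxxx label format are all invented here. The `names` argument stands in for the comment/label database, and only jumps and traps get special treatment, as the post suggests.

```python
# Hypothetical toy instruction set: opcode -> (mnemonic, has operand byte, kind).
# kind is "plain", "jump" or "trap"; none of this matches a real CPU.
TOY_TABLE = {
    0x00: ("NOP", False, "plain"),
    0x01: ("JMP", True, "jump"),
    0x02: ("JZ", True, "jump"),
    0x03: ("TRAP", True, "trap"),
    0x04: ("RET", False, "plain"),
}


def decode_toy(code, pc):
    """The only machine-specific routine: decode one instruction at pc.

    Returns (mnemonic, operand or None, length in bytes, kind).
    """
    entry = TOY_TABLE.get(code[pc])
    if entry is None or (entry[1] and pc + 1 >= len(code)):
        # Unknown opcode, or operand byte missing at the end of the image.
        return ("ILLEGAL", code[pc], 1, "plain")
    mnemonic, has_operand, kind = entry
    if not has_operand:
        return (mnemonic, None, 1, kind)
    return (mnemonic, code[pc + 1], 2, kind)


def disassemble(code, decode, names=None):
    """Machine-independent part: find jump targets, then print with labels.

    names maps address -> label; entries supplied by the user win over
    the automatically generated L_xxxx labels.
    """
    names = dict(names or {})
    decoded = []
    pc = 0
    while pc < len(code):                      # pass 1: decode, collect targets
        ins = decode(code, pc)
        decoded.append((pc, ins))
        if ins[3] == "jump":
            names.setdefault(ins[1], "L_%04X" % ins[1])
        pc += ins[2]
    listing = []
    for pc, (mnemonic, operand, _length, kind) in decoded:   # pass 2: emit
        label = names[pc] + ":" if pc in names else ""
        if operand is None:
            arg = ""
        elif kind == "jump":
            arg = names[operand]
        else:
            arg = "0x%02X" % operand
        listing.append(("%-8s %s %s" % (label, mnemonic, arg)).rstrip())
    return listing


if __name__ == "__main__":
    image = bytes([0x00, 0x02, 0x05, 0x03, 0x01, 0x04, 0xFF])
    for line in disassemble(image, decode_toy, {5: "done"}):
        print(line)
```

Supporting another instruction set would, under these assumptions, mean writing another function with the same signature as decode_toy and passing it to disassemble unchanged.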
usenet@cps3xx.UUCP (Usenet file owner) (10/18/88)
In article <e8amX#27Cbjc=eric@snark.UUCP>, Eric S. Raymond (eric@snark.uucp) writes:
> [...]
>Long ago, in my pre-UNIX days, I once started writing a smart disassembler for
>8086 code, one that would recognize illegal instructions, do flow-of-control
>analysis on jumps and assign symbolic labels (then allow you to change the
>names to meaningful ones). It would recognize and interpret OS service calls,
>so you'd be able to spot I/O subroutines at a glance. It would keep its
>deductions and your comments on them in a database so you could analyze code
>interactively in stages. The Cracker, I called it.
>
> [... stuff about implications deleted ...]
>
>Does anyone know of something like this having been actually implemented?
>

There's a program called "Sourceror" on the Apple ][ series, by Glen Bredon
I think. It came with the Big Mac/Merlin assemblers. It knew about all
Apple's ROM calls (and many DOS calls), most global variables, the 6502
instruction set, and the "Sweet-16" pseudo-code instruction set.

It didn't add comments to the code, but it did take care of things like
assigning symbolic names to labels, etc. which helped a lot if you wanted
to understand programs (I used it on Applesoft BASIC, for example). It's
a good start, anyway....

        Anton Rang
        rang@cpswh.cps.msu.edu

+---------------------------+------------------------+----------------------+
| Anton Rang (grad student) | "UNIX: Just Say No!"   | "Do worry...be SAD!" |
| Michigan State University | rang@cpswh.cps.msu.edu | -- Jill Belscamper   |
+---------------------------+------------------------+----------------------+
bcase@cup.portal.com (Brian bcase Case) (10/20/88)
In article <7226@ihlpl.att.com>, knudsen@ihlpl.ATT.COM (Knudsen) writes:
>And distributing a uMIIL isn't going to make automatic disassembly *easier*?

This, I think, is the one real hurdle in getting the MIIL concept accepted.

>I once started writing a smart disassembler for 8086 code, ... recognize
>illegal instructions, flow-of-control analysis on jumps and assign symbolic
>labels. It would recognize and interpret OS service calls, so you'd be able
>to spot I/O subroutines at a glance. It would keep its deductions and your
>comments on them in a database so you could analyze code interactively in
>stages. The Cracker, I called it.

You have described "MacNosey" for the Mac by Jasik Designs! Check it out.

>...but if the code for both machines had been distributed
>in a uMIIL, I would have had lots of incentive to *finish* the cracker.
>And then I'd have given it to the world. And the uMIIL-using
>manufacturers' code would suddenly be naked, stripped of protection...

Yes, this is the problem. But the point of a MIIL is to prevent obsolete
software, not prevent reverse engineering. However, the prevention of
reverse engineering will probably be required to gain the kind of acceptance
it needs to make an appreciable impact.
pardo@june.cs.washington.edu (David Keppel) (10/21/88)
bcase@cup.portal.com (Brian bcase Case) writes:
>knudsen@ihlpl.ATT.COM (Knudsen) writes:
>>And distributing a uMIIL isn't going to make automatic disassembly *easier*?
>This, I think, is the one real hurdle in getting the MIIL concept accepted.

I think the nub of the matter is that it makes disassembly more
*useful*, not any easier.

I claim that I can distribute C code to my programs and it is
completely useless. I gave an example of this quite a while back.
I need to do things such as:

    * Rename all variables.
    * Hoist (inline) functions.
    * Do loop transformations (e.g. for() loop to a goto loop).
    * Strip out all comments.
    * Run the preprocessor to remove #ifdefs (Is this the same
      value "4" that appeared in the line before, or are they
      unrelated?)
    * Avoid standard libraries.
    * Do code motion.
    * Declare wasted variables, dead code, unoptimize code that
      an optimizer can put back together again later, ...

Essentially, perform all the optimizations that I can on the C source,
and steal liberally from the Obfuscated C Code Contest. Consider the
following (well-formatted) program. What does it do?

    extern struct _a7F9a1Xs3 {
        int _a7F6a1Xs3;
        char *_a7G9a1xs3;
        char *_a7G6a1xs3;
        int _a7G6a1xs7;
        short _a7F9a1xs7;
        char _a7F9a1xf7;
    } _iob[3];

    main(_a7F9a1xf3, _a7F61axf3)
    int _a7F9a1xf3;
    char *_a7F61axf3[];
    {
        int _a7G61asf3, _a7G61faf3;

        goto _a7G61afx3;
    _a7G61afs3:
        exit(0), _a7G61asf3&=(0x10)+1;
    _a7G61afx3:
        ((_a7G61asf3=(--((&_iob[0]))->_a7F6a1Xs3>=0
            ? *((&_iob[0]))->_a7G9a1xs3++&0377
            :_filbuf((&_iob[0]))))
            !=(-1));
        if (_a7G61asf3*(3-1)==(0-2))
            goto _a7G61afs3;
        (--((&_iob[1]))->_a7F6a1Xs3>=0
            ? ((int)(*((&_iob[1]))->_a7G9a1xs3++=(unsigned)(_a7G61asf3)))
            :_flsbuf((unsigned)(_a7G61asf3),(&_iob[1])));
        goto _a7G61afx3;
    _a7G71afs3:
        (--((&_iob[1]))->_a7F6a1Xs3>=0
            ? ((int)(*((&_iob[1]))->_a7G9a1xs3++=(unsigned)(_a7G61asf3)))
            :_flsbuf((unsigned)(_a7G61asf3),(&_iob[1])));
        exit(1);
    }

Did you guess:

    #include <stdio.h>

    main(argc, argv)
    int argc;
    char *argv[];
    {
        int c;

        while ((c=getchar())!=EOF)
            putchar(c);
    }

Enough.

    ;-D on  ( Throw a monkey in the wrench )  Pardo
--
pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
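
The first item on that list, renaming all variables, is the easiest to mechanize. The sketch below is an assumption-laden illustration rather than the tool used for the example above: the `_aNNNN` name format, the keyword list, and the `keep` parameter are made up here, and the pass deliberately ignores string literals, comments and preprocessor lines, which a real tool would have to handle. It also does not check that a generated name is not already in use.

```python
import re

# A partial list of C keywords that must never be renamed.
C_KEYWORDS = {
    "auto", "break", "case", "char", "continue", "default", "do", "double",
    "else", "enum", "extern", "float", "for", "goto", "if", "int", "long",
    "register", "return", "short", "sizeof", "static", "struct", "switch",
    "typedef", "union", "unsigned", "void", "while",
}

# \b on both sides keeps the "x10" inside a literal like 0x10 from matching.
IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def obfuscate(source, keep=frozenset()):
    """Replace every identifier not in C_KEYWORDS or keep with an opaque name.

    Returns (new_source, mapping), where mapping records old -> new names
    in order of first appearance.
    """
    mapping = {}

    def rename(match):
        name = match.group(0)
        if name in C_KEYWORDS or name in keep:
            return name
        if name not in mapping:
            mapping[name] = "_a%04X" % len(mapping)
        return mapping[name]

    return IDENTIFIER.sub(rename, source), mapping


if __name__ == "__main__":
    src = ("int main(argc, argv) int argc; char *argv[]; "
           "{ int c; while ((c = getchar()) != EOF) putchar(c); return 0x10; }")
    out, table = obfuscate(src, keep={"main", "getchar", "putchar", "EOF"})
    print(out)
    print(table)
```

Whoever keeps the mapping can reverse the renaming; whoever receives only the output cannot recover the original names, which is the point being made above.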
lyndon@nexus.ca (Lyndon Nerenberg) (10/21/88)
In article <e8amX#27Cbjc=eric@snark.UUCP>, eric@snark (Eric S. Raymond) writes:
>P.S. on the Cracker concept:
>
>Does anyone know of something like this having been actually implemented?

Yes indeed! Some friends had this running on an Amdahl under the Michigan
Terminal System back around 1980. The disassembler was part of a program
called Glass that ran on 3270 terminals. Glass was a "window" onto your
virtual memory space. By using the PF keys, you could "page" forward/backward
in memory, or jump to an arbitrary address. You were also able to toggle
between three display modes: EBCDIC, hex, and disassembly.

Glass was aware of the loader's symbol table lookup conventions; therefore it
was capable of inserting symbolic names for system subroutines and entry
points into user loaded code. There was talk of adding knowledge of symbolic
debugger load records, but this never got implemented. The program was
(apparently) inspired by a similar utility found floating around SHARE
someplace.

Oh yes, you could also use Glass to modify memory contents by setting the
display to hex mode, making changes to the screen, and hitting ENTER to write
the changes. There was also an undocumented PF key combination that would
invoke spells to nuke the hardware protection bits. Given that the entire OS
resided in shared virtual memory, this feature contributed to some rather
interesting evenings ... :-)

*** INLOOP PROTECTION TROUBLE SNARK *** HELP! SNARK IN MTS!

--lyndon
--
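
For readers who have not used a tool of this kind, the paging hex/EBCDIC view can be approximated in a few lines. This is a loose sketch and not a reconstruction of Glass: the function name, the row layout, and the choice of Python's cp037 codec as "the" EBCDIC code page are all assumptions made for illustration. Paging amounts to calling it again with a different `start`.

```python
def hex_view(memory, start, rows=4, width=16):
    """Return display lines: address, hex bytes, and an EBCDIC rendering.

    Bytes that do not decode to a printable character are shown as ".".
    cp037 is one common EBCDIC code page; the real terminal's may differ.
    """
    lines = []
    stop = min(start + rows * width, len(memory))
    for offset in range(start, stop, width):
        chunk = memory[offset:offset + width]
        hexpart = " ".join("%02X" % b for b in chunk)
        decoded = chunk.decode("cp037", errors="replace")
        text = "".join(ch if ch.isprintable() else "." for ch in decoded)
        # Pad the hex column so the text column lines up on short rows.
        lines.append("%06X  %-*s  %s" % (offset, width * 3 - 1, hexpart, text))
    return lines


if __name__ == "__main__":
    # 0xC8 0xC5 0xD3 0xD3 0xD6 spells HELLO in EBCDIC; 0x00 is not printable.
    for line in hex_view(bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x00]), 0, rows=1, width=8):
        print(line)
```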
eric@snark.UUCP (Eric S. Raymond) (10/21/88)
In <10191@cup.portal.com>, bcase@cup.portal.com (Brian bcase Case) writes:
> In article <7226@ihlpl.att.com>, knudsen@ihlpl.ATT.COM (Knudsen) writes:
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >And distributing a uMIIL isn't going to make automatic disassembly *easier*?

Eh? I wrote what he's quoting, and I am *not* "knudsen@ihlpl.ATT.COM".

> You have described "MacNosey" for the Mac by Jasik Designs! Check it out.

Interesting...does anybody know of a similar product for Intel-family
machines?

> Yes, this is the problem. But the point of a MIIL is to prevent obsolete
> software, not prevent reverse engineering.

My article was in reply to a claim that a uMIIL would somehow offer better
security against what we politely call "reverse engineering" than current
machine code.

> However, the prevention of
> reverse engineering will probably be required to gain the kind of acceptance
> it needs to make an appreciable impact.

True. And this raises an insuperable problem for uMIIL proponents, because
"preventing reverse engineering" and "easy portability" are diametrically
opposed goals. The uMIIL concept seems to me to be a particularly
ill-thought-out stab at serving both masters -- but machine-coded proprietary
software already well meets the requirements of the former and HLL source is
pretty good for the latter (modulo OS-standardization problems that any uMIIL
will *also* have).

In case it isn't obvious, I think this is yet another reason to class the
whole notion of uMIIL as a distracting red herring and forget it.
--
Eric S. Raymond (the mad mastermind of TMN-Netnews)
UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
Post: 22 S. Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
paul@unisoft.UUCP (n) (10/21/88)
Subject: Re: Universal Disassemblers vs. Universal MIILs

Isn't a 'Universal Disassembler' what the nanotechnology people use for
reverse engineering a competitor's products? :-)

        Paul
--
Paul Campbell, UniSoft Corp. 6121 Hollis, Emeryville, Ca
        ..ucbvax!unisoft!paul
Nothing here represents the opinions of UniSoft or its employees (except me)
"Gorbachev is returning to the heritage of the great Lenin" - Ronald Reagan 1988
(then the Washington Post attacked RR [from the right] for being a Leninist)
fox@marlow.uucp (Paul Fox) (10/25/88)
In article <e8amX#27Cbjc=eric@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
>In article <7226@ihlpl.att.com>, knudsen@ihlpl.ATT.COM (Knudsen) writes:
>>In article <e6m10#2eDFfC=eric@snark.UUCP>, I write:
>
>P.S. on the Cracker concept:
>
>Does anyone know of something like this having been actually implemented?
>

Yes - I did this once. It was for Z-80 machine code, and I did it for a
Z-80 ICE for which I needed to extend its functionality. It was pretty easy,
and it was command line driven. (You would create shell scripts containing
the long command lines).

It allowed you to do things like specify what the RST instructions were for,
and allowed things like having some of the RST instructions being followed
by a byte of sub-opcode. It allowed you to add labels (although not comments
for particular lines). Thus as you understood what parts of the code were
doing you could tell it the labels to use, and any references to that address
would come out symbolically.

Also, since it's difficult to make the machine decide whether something is
code or data, it allowed you to mark selected areas as being tables and
thus avoid disassembling them.

=====================
All opinions are my own.
The powers that be ...
Tel: +44 628 891313 x. 212
UUCP: fox@marlow.uucp
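
The two features described above (user-marked table regions, and RST instructions that consume a following sub-opcode byte) could be sketched as below. This is a guess at the shape of such a tool, not the original program: the function and parameter names are invented, and only NOP (00h), RET (C9h) and the eight RST opcodes are decoded, with every other byte emitted as a DB so the example stays short.

```python
def list_z80(code, tables=(), rst_with_subop=(), labels=None):
    """Produce (address, label, text) rows for a Z-80 image.

    tables          -- (start, end) half-open ranges the user marked as data
    rst_with_subop  -- RST targets (00h..38h) followed by one sub-opcode byte
    labels          -- address -> name, supplied by the user
    """
    labels = labels or {}
    rows = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        region = next(((s, e) for s, e in tables if s <= pc < e), None)
        if region is not None:
            # User said this is a table: dump it as data, do not decode it.
            end = min(region[1], len(code))
            text = "DB " + ",".join("%02Xh" % b for b in code[pc:end])
            size = end - pc
        elif (op & 0xC7) == 0xC7:
            # RST is encoded 11ttt111; the restart address is ttt * 8.
            target = op & 0x38
            text = "RST %02Xh" % target
            size = 1
            if target in rst_with_subop and pc + 1 < len(code):
                text += " ; sub-opcode %02Xh" % code[pc + 1]
                size = 2
        elif op == 0x00:
            text, size = "NOP", 1
        elif op == 0xC9:
            text, size = "RET", 1
        else:
            text, size = "DB %02Xh ; not decoded in this sketch" % op, 1
        rows.append((pc, labels.get(pc, ""), text))
        pc += size
    return rows


if __name__ == "__main__":
    image = bytes([0x00, 0xEF, 0x07, 0xC9, 0x48, 0x49, 0xFF])
    for row in list_z80(image, tables=[(4, 6)], rst_with_subop={0x28},
                        labels={4: "msg"}):
        print("%04X %-6s %s" % row)
```

Without the table marking, the two bytes at address 4 would be run through the instruction decoder like everything else, which is exactly the code-versus-data problem the post mentions.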
bpendlet@esunix.UUCP (Bob Pendleton) (10/27/88)
From article <6152@june.cs.washington.edu>, by pardo@june.cs.washington.edu (David Keppel):
> I claim that I can distribute C code to my programs and it is
> completely useless. I gave an example of this quite a while back.
> I need to do things such as:
>
> * Rename all variables.
> * Hoist (inline) functions.
> * Do loop transformations (e.g. for() loop to a goto loop).
> * Strip out all comments.
> * Run the preprocessor to remove #ifdefs (Is this the same
>   value "4" that appeared in the line before, or are they
>   unrelated?)
> * Avoid standard libraries.

Why?

> * Do code motion.
> * Declare wasted variables, dead code, unoptimize code that
>   an optimizer can put back together again later, ...

Again why?

> Essentially, perform all the optimizations that I can on the C source,
> and steal liberally from the Obfuscated C Code Contest.

Ignoring the deliberate obfuscation this gives you source code that a
fairly dumb compiler can convert to reasonably good object code. One
trouble with it is that it is portable, but not machine independent.
It can only become machine independent by establishing a standard for
the sizes of all data types and the semantics of C's "defined to be
undefined" operator/operand pairs. The MIIL cannot be C because C is
not machine independent.

Another problem with using C as an MIIL is that the only subroutine
calling conventions and scoping rules that can be efficiently
represented in C are those of C. The scoping rules and subroutine
linking mechanisms of languages like MODULA-2 and LISP do not map well
onto C.

Maybe this discussion will get a little farther if we drop the
"Intermediate Language" part of MIIL and try looking at it as a MISDL
(Machine Independent Software Distribution Language). Is it safe to
even try to talk about a machine independent dialect of C? With
extensions that provide low level mechanisms to allow several
different subroutine linking and scoping rules to be implemented
efficiently?
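
The point about data type sizes can be seen in miniature with Python's struct module, used here only as an analogy and not as a proposal for a MISDL. In native mode ("@") the sizes follow whatever the local C compiler uses; in standard mode ("<" or ">") the format itself fixes them. The record layout and function names below are invented for the example.

```python
import struct

# Native mode: sizes and padding come from the local C compiler, so these
# numbers can differ from one machine to the next.
NATIVE_SIZES = {code: struct.calcsize("@" + code) for code in "hil"}

# Standard mode: "<" means little-endian with fixed sizes and no padding
# (h = 2 bytes, i = 4 bytes), whatever machine this runs on.
RECORD = "<ih"


def encode_record(count, flags):
    """Pack a (count, flags) pair into exactly six bytes on every machine."""
    return struct.pack(RECORD, count, flags)


def decode_record(blob):
    """Inverse of encode_record."""
    return struct.unpack(RECORD, blob)


if __name__ == "__main__":
    print("native sizes on this machine:", NATIVE_SIZES)
    print("standard record size:", struct.calcsize(RECORD))
    print(decode_record(encode_record(-5, 7)))
```

A distribution format that fixed sizes and byte order this way would address the first half of the objection; the "defined to be undefined" operator semantics would still need their own standard.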

BTW, a quick pass with an editor to convert all your hard to read
names into short names like i1 for ints and c3 for chars makes your
example a lot easier to read. It's the macro expansions that make it
hard to follow.

        Bob P.
--
Bob Pendleton, speaking only for myself.
An average hammer is better for driving nails than a superior wrench.
When your only tool is a hammer, everything starts looking like a nail.
UUCP Address: decwrl!esunix!bpendlet or utah-cs!esunix!bpendlet
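
The "quick pass with an editor" could look something like the following. It is a sketch under several assumptions: the opaque names are taken to start with `_a7` as in the posted example, the type prefix is guessed only from simple declarations of the form `int name` or `char *name` (so the second name in `int a, b;` falls back to the generic prefix `n`), and no check is made that a short name is not already in use.

```python
import re

# Names produced by the obfuscation in the posted example all start with _a7.
OPAQUE = re.compile(r"\b_a7\w+\b")
# Simple declarations only: a base type, optional "*", then an opaque name.
DECL = re.compile(r"\b(int|char|short)\s+\*?\s*(_a7\w+)")


def shorten(source):
    """Rename opaque identifiers to short ones such as i1, c2, n3.

    The letter is the first letter of the declared type when a simple
    declaration was found, else "n"; the number is the order of first use.
    Returns (new_source, mapping).
    """
    prefixes = {}
    for ctype, name in DECL.findall(source):
        prefixes.setdefault(name, ctype[0])

    mapping = {}

    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = "%s%d" % (prefixes.get(name, "n"), len(mapping) + 1)
        return mapping[name]

    return OPAQUE.sub(rename, source), mapping


if __name__ == "__main__":
    src = ("int _a7G61asf3; char *_a7G9a1xs3; "
           "_a7G61asf3 = *_a7G9a1xs3++; goto _a7G61afx3;")
    print(shorten(src)[0])
```

As the post says, this only undoes the renaming; the expanded getchar() and putchar() macros remain just as hard to read afterwards.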
wcs@alice.UUCP (Bill Stewart, usually) (11/04/88)
In article <6152@june.cs.washington.edu> pardo@cs.washington.edu (David Keppel) writes:
:I think the nub of the matter is that it makes disassembly more
:*useful*, not any easier.
:I claim that I can distribute C code to my programs and it is
:completely useless. I gave an example of this quite a while back.
:I need to do things such as:
:* Rename all variables.
:* Strip out all comments.
:* Avoid standard libraries.
:* Do code motion.
:[.......]
:Essentially, perform all the optimizations that I can on the C source,
:and steal liberally from the Obfuscated C Code Contest. Consider the
There was once a consultant at a large telecommunications company who
was required to provide source for the products he developed. In
addition to preprocessing the source, he ran it through the C
equivalent of a "jive" filter: all the variable names were combinations
of capital O, lower-case L, and 0 and 1. Useless!
--
# Thanks;
# Bill Stewart, att!ho95c!wcs, AT&T Bell Labs Holmdel NJ 1-201-949-0705
# and/or
# Shelley Rosenbaum, att!ho95c!slr, 1-201-949-3615 ho95c.att.com
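
A generator for names of the kind described in that anecdote is short. The sketch below is not the consultant's filter, whose details are not given; the function name and the starting length of eight characters are arbitrary choices. The one real constraint is that a C identifier cannot begin with a digit, so the first character is always O or l.

```python
import itertools


def jive_names(min_length=8):
    """Yield distinct C identifiers built only from O, l, 0 and 1.

    Names never start with a digit, so each one is a legal identifier.
    """
    for length in itertools.count(min_length):
        for first in "Ol":
            for rest in itertools.product("Ol01", repeat=length - 1):
                yield first + "".join(rest)


if __name__ == "__main__":
    for name in itertools.islice(jive_names(), 5):
        print(name)
```

Feeding these names into a renaming pass in place of sequential ones gives output in which every identifier looks nearly identical in most fonts.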