[comp.arch] Universal Disassemblers vs. Univers

aglew@urbsdc.Urbana.Gould.COM (11/06/88)

>There was once a consultant at a large telecommunications company who
>was required to provide source for the products he developed.  In
>addition to preprocessing the source, he ran it through the C
>equivalent of a "jive" filter: all the variable names were combinations
>of capital O, lower-case L, and 0 and 1.  Useless!
>
>#				Thanks;
># Bill Stewart, att!ho95c!wcs, AT&T Bell Labs Holmdel NJ 1-201-949-0705
># and/or
># Shelley Rosenbaum, att!ho95c!slr, 1-201-949-3615   ho95c.att.com

I have resisted commenting on this, after pointing out that one of 
the original reasons for a universal intermediate language was
to make it difficult to steal source, but....

Do you really think that obfuscating your C will make it terribly
hard to understand your source?  After all, people disassemble
their competitors' object code and binaries every day.  In assembler,
at least, flow-of-control information is hidden; in obfuscated C
it is not, since you still have labels.

Remember, "understanding" doesn't necessarily mean being able to use
the whole program.  It might mean just going in and looking for
wired-in limits that you can exceed to make your competitor's code
look bad on benchmarks, or that you can exploit as security holes.
It might just mean tracing out a spreadsheet's recalculation
algorithm to steal proprietary ideas.

Besides, if you obfuscate the variable names, I can unobfuscate
them and convert them to single-letter identifiers.  Then it would
be just like reading UNIX source! :-)
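
As a minimal sketch of such an unobfuscator, here is a few lines of
Python.  It assumes the obfuscated names are built only from O, l, 0
and 1, as in the filter described above, and it makes no attempt to
skip string literals or comments.  The function name deobfuscate and
the v0, v1, ... naming scheme are illustrative assumptions, not taken
from any real tool; numbered names are used rather than single letters
so they are less likely to collide with identifiers the filter left
alone.

```python
import re
from itertools import count

# Assumption: obfuscated names use only 'O', 'l', '0' and '1', and start
# with a letter ('O' or 'l') so that they are valid C identifiers.
OBFUSCATED = re.compile(r"\b[Ol][Ol01]*\b")


def deobfuscate(source):
    """Replace each distinct obfuscated identifier with v0, v1, ...

    Returns the rewritten source and the old-name -> new-name mapping.
    """
    names = {}
    fresh = count()

    def rename(match):
        old = match.group(0)
        if old not in names:
            # First sighting of this identifier: allocate the next name.
            names[old] = f"v{next(fresh)}"
        return names[old]

    return OBFUSCATED.sub(rename, source), names
```

Numeric literals such as 10 are left alone because they do not start
with a letter, and ordinary words such as "long" are left alone because
the pattern must match a whole identifier.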

Inline substitution of functions, followed by optimization to remove
the parts that are unnecessary in each calling context, is the best
way to make reverse engineering really difficult.
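
A rough illustration of the effect, sketched in Python for brevity
even though the discussion is about C: the first version is what the
author might write, the second is roughly what would be left after the
helpers are substituted inline and the constant arguments are folded
away.  All of the function names here are hypothetical, and the
"inlined" version was produced by hand, not by a real optimizer.

```python
def scale(x, factor, offset):
    """General-purpose helper: linear rescaling."""
    return x * factor + offset


def clamp(x, low, high):
    """General-purpose helper: restrict x to the range [low, high]."""
    return max(low, min(x, high))


def to_percent(raw):
    """As written: the helper names document the intent."""
    return clamp(scale(raw, 100, 0), 0, 100)


def to_percent_inlined(raw):
    """After inlining by hand: no helper boundaries remain.

    The '+ 0' from scale() has been folded away, and clamp() has been
    reduced to two comparisons against literal constants.
    """
    t = raw * 100
    return 0 if t < 0 else (100 if t > 100 else t)
```

Both versions compute the same result, but a reader of the second one
no longer sees that a general rescale and a general clamp were ever
involved, which is the information a reverse engineer would want.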

henry@utzoo.uucp (Henry Spencer) (11/08/88)

In article <28200227@urbsdc> aglew@urbsdc.Urbana.Gould.COM writes:
>Do you really think that obfuscating your C will make it terribly
>hard to understand your source?  After all, people disassemble
>their competitors' object code and binaries every day.  In assembler,
>at least, flow-of-control information is hidden; in obfuscated C
>it is not, since you still have labels.

Uh, assembler has labels too, and a good disassembler will insert them.
It's not that hard to sort out flow of control from a binary.  Obfuscated
C probably won't be as hard to understand as binaries resulting from a
heavily optimizing compiler, but it could come pretty close to equalling
assembler-derived binaries, and the industry seems to regard those as an
acceptable method of software distribution.  (Comparisons should be
against realistic alternatives, not hypothetical perfection.)
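
The label-insertion step can be sketched in a few lines of Python.
This is a toy under stated assumptions: the instruction format
(address, mnemonic, operand), the set of branch mnemonics, and the
name insert_labels are all hypothetical, and a real disassembler also
has to decode the instructions and separate code from data first.

```python
# Assumption: these mnemonics take a target address as their operand.
BRANCHES = {"jmp", "jz", "jnz", "call"}


def insert_labels(instructions):
    """Turn branch target addresses into symbolic labels.

    instructions is a list of (address, mnemonic, operand) tuples,
    where operand is None for instructions that take no operand.
    Returns the listing as a list of text lines.
    """
    # Pass 1: every address that some branch refers to gets a label.
    targets = sorted({op for _, mn, op in instructions if mn in BRANCHES})
    labels = {addr: f"L{i}" for i, addr in enumerate(targets)}

    # Pass 2: emit the listing, defining and using the labels.
    lines = []
    for addr, mn, op in instructions:
        if addr in labels:
            lines.append(f"{labels[addr]}:")
        if mn in BRANCHES:
            lines.append(f"    {mn} {labels[op]}")
        elif op is None:
            lines.append(f"    {mn}")
        else:
            lines.append(f"    {mn} {op}")
    return lines
```

Given a four-instruction countdown loop whose jnz at address 2 jumps
back to address 1, the output defines L0 in front of the instruction at
address 1 and rewrites the branch as "jnz L0", so the loop is visible
at a glance.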
-- 
The Earth is our mother.        |    Henry Spencer at U of Toronto Zoology
Our nine months are up.         |uunet!attcan!utzoo!henry henry@zoo.toronto.edu