[comp.software-eng] reverse engineering report

kmagel@plains.NoDak.edu (ken magel) (02/11/91)

     Here are the informative replies I have received thus far to my query 
concerning the practices of reverse engineering.  Several people mentioned that
they would report or ask colleagues to report on actual reverse engineering 
efforts, but I have not yet received anything on those efforts.  If I do, I 
will post another summary.

From byrne@ksuvax1.cis.ksu.edu Mon Feb  4 12:47:41 1991
Received: from harris.cis.ksu.edu by plains.NoDak.edu; Mon, 4 Feb 91 12:47:06 -0600
Return-Path:  <byrne@ksuvax1.cis.ksu.edu>
Received: from ksuvax1.cis.ksu.edu by harris.cis.ksu.edu 
		(5.58/SWH-2.03); id AA15726; Mon, 4 Feb 91 12:46:12 CST
Received: by ksuvax1.cis.ksu.edu 
	(5.59++/CIS1.1) id AA01405; Mon, 4 Feb 91 12:46:05 CST
Date: Mon, 4 Feb 91 12:46:05 CST
From: byrne@ksuvax1.cis.ksu.edu (Eric J. Byrne)
Message-Id: <9102041846.AA01405@ksuvax1.cis.ksu.edu>
To: kmagel@plains.NoDak.edu
Subject: reverse engineering refs
Status: R


    I saw your request on the net for reverse engineering references.
  Reverse engineering seems to mean different things to different people,
  and there are a variety of reasons for doing it.  Here are some references
  that I have collected.  They range around the topic of reverse engineering,
  but some stray a bit.  The notes on each are mine and may or may not be
  completely accurate.  These refs are not in any kind of order, other than
  random ( if you consider that an order).

  Naturally, I hope to see a summary of what you receive.

   - Eric

#############
TITLE:   Reverse Engineering and Design Recovery: A Taxonomy
AUTHOR:   Chikofsky, Elliot J. and Cross, James H.
SOURCE:   IEEE Software
VOL:   7
NO:   1
DATE:   January 1990
PAGES:   13 - 17

 This article defines and relates six terms: reverse engineering, forward
 engineering, redocumentation, design recovery, restructuring, and
 reengineering.


TITLE:   A Knowledge-Based Approach to the Analysis of Code and Program
         Design Language (PDL)
AUTHOR:   Das, Bikas K.
SOURCE:   Conference on Software Maintenance
DATE:   October 16-19, 1989
PAGES:   290 - 296

 This paper presents a knowledge-based technique for understanding
 programs ( PDL and corresponding code ) in terms of their plans.
 Also understanding code supports certain QA activities.

TITLE:   Program Recognition
AUTHOR:   Ourston, Dirk
SOURCE:   IEEE Expert
VOL:   4
NO:   4
DATE:   Winter 1989
PAGES:   36 - 49

 Reviews in some detail three systems in program recognition research -
 the Program Recognizer, Talus, and Proust.  Gives strengths and limitations
 of each system.


TITLE:   Recognizing Design Decisions in Programs
AUTHOR:   Rugaber, Spencer,  Ornburn, Stephen B., and LeBlanc, Richard J.
SOURCE:   IEEE Software
VOL:   7
NO:   1
DATE:   January 1990
PAGES:   46 - 54

 This article discusses design decisions and their effect on
 source code.  Discusses how to characterize decisions and how
 to find them.


TITLE:   Using Function Abstraction to Understand Program Behavior
AUTHOR:   Hausler, Philip A., Pleszkoch, Mark G., Linger, Richard C., and
          Hevner, Alan R.
SOURCE:   IEEE Software
VOL:   7
NO:   1
DATE:   January 1990
PAGES:   55 - 63

 Discusses proposed characteristics and techniques of an automated
 system for function abstraction.  Goal of function abstraction is
 to extract business rules ( requirements ) from code and express them
 in nonprocedural terms for inspection and analysis.


TITLE:   Knowledge-Based Program Analysis
AUTHOR:   Harandi, Mehdi T., and Ning, Jim Q.
SOURCE:   IEEE Software
VOL:   7
NO:   1
DATE:   January 1990
PAGES:   74 - 81

 Describes PAT, a support tool for program maintenance.  It uses an
 object-oriented framework of programming concepts and a heuristic-based
 concept-recognition mechanism to understand programs.


TITLE:   Recognizing a Program's Design: A Graph-Parsing Approach
AUTHOR:   Rich, Charles, and Wills, Linda M.
SOURCE:   IEEE Software
VOL:   7
NO:   1
DATE:   January 1990
PAGES:   82 - 89

 Describes the Recognizer, a program that automatically finds all
 occurrences of a given set of cliches in a program and builds a
 description of that program.  Also discusses difficulties with
 recognizing cliches.


TITLE:   A Reverse Engineering Methodology to Reconstruct Hierarchical
         Data Flow Diagrams For Software Maintenance
AUTHOR:   Benedusi, P., Cimitile, A., and De Carlini, U.
SOURCE:   Conference on Software Maintenance
DATE:   October 16-19, 1989
PAGES:   180 - 189

 Describes a methodology used to produce from code a hierarchy of
 Data Flow Diagrams (DFDs) at different levels of abstraction.


TITLE:   Design Recovery for Maintenance and Reuse
AUTHOR:   Biggerstaff, Ted J.
SOURCE:   IEEE Computer
VOL:   22
NO:   7
DATE:   July 1989
PAGES:   36 - 49

 This article discusses design recovery, proposes an architecture
 to implement the concept, illustrates how the architecture operates,
 and describes the progress toward implementing it.  Architecture is
 based on a domain model that can detect instances of known procedural
 entities.


TITLE:   Understanding and Documenting Programs
AUTHOR:   Basili, Victor R., and Mills, Harlan D.
SOURCE:   IEEE Transactions on Software Engineering
VOL:   SE-8
NO:   3
DATE:   May 1982
PAGES:   270 - 283

 This paper reports on an experiment in trying to understand an unfamiliar
 program.  The program was re-structured and a specification and correctness
 proof were developed for it.  The techniques used included function
 specification, the discovery of loop invariants, case analysis, and the
 use of a bounded indeterminate auxiliary variable.


TITLE:   The Retrospective Introduction of Abstraction into Software
AUTHOR:   Colbrook, A., and Smythe, C.
SOURCE:   Conference on Software Maintenance
DATE:   October 16-19, 1989
PAGES:   166 - 173

 A technique is proposed which facilitates the retrospective introduction
 of abstract data types into existing systems and the corresponding
 software tool to aid this process is presented.  The purpose of this
 paper is to show that it is possible to take the original source code
 data structures and to remap them onto a set of more rigidly defined
 and understood data structures.


TITLE:   Software Renewal: A Case Study
AUTHOR:   Sneed, Harry M.
SOURCE:   IEEE Software
VOL:   1
NO:   3
DATE:   July 1984
PAGES:   56 - 63

 Describes a design recovery effort to re-document an existing system and
 develop an environment for future maintenance of the system.  Gives a good
 description of the recovery steps and their interrelationships and the products
 produced.  Techniques and problems are skimmed over.


TITLE:   Using Modern Design Practices To Upgrade Aging Software Systems
AUTHOR:   Britcher, Robert N., and Craig, James J.
SOURCE:   IEEE Software
VOL:   3
NO:   3
DATE:   May 1986
PAGES:   16 - 24

 Gives experiences by IBM in upgrading FAA's NAS en route software. Existing
 software was abstracted to a mathematical PDL notation.  This level was
 redesigned and reimplemented.  Only 100,000 LOC out of the 1.5 million LOC
 system were redesigned and 52,000 new LOC were created.  Having to redesign
 to function with existing code increased the difficulty of the task.


TITLE:   Maintenance and Porting of Software by Design Recovery
AUTHOR:   Arango, Guillermo,  Baxter, Ira,  Freeman, Peter, and
          Pidgeon, Christopher
SOURCE:   Conference on Software Maintenance
DATE:   1985
PAGES:   42 - 49

 Proposes a method for the re-implementation of programs by recovering
 the design of such programs, and using the recovered design, to
 re-implement the program in new environments (porting), with different
 functionality (maintenance), or with different performance (enhancement).
 The method is integrated with the Draco development paradigm.


TITLE:   Re-engineering vs. Reverse Engineering
AUTHOR:   Ulrich, William
SOURCE:   Software Magazine
VOL:   8
NO:   11
DATE:   September 1988
PAGES:   8,10

 Gives definitions of re-engineering and reverse engineering.  Briefly
 explains the benefits of each.  Claims that re-engineering is the
 first step in reverse engineering and that reverse engineering tools
 don't exist yet, but will be available within two years.


TITLE:   TMM: Software Maintenance by Transformation
AUTHOR:   Arango, Guillermo, Baxter, Ira, Freeman, Peter, and
          Pidgeon, Christopher
SOURCE:   IEEE Software
VOL:   3
NO:   3
DATE:   May 1986
PAGES:   27 - 39

 Proposes two methods for maintenance.  TMM, the transformation maintenance
 model, works with the Draco paradigm.  It reverses design decisions
 to reach a least common abstraction for the current implementation and
 the desired implementation.  MBA, maintenance by abstraction, handles
 the situation where no specification or design information exists.
 This paper is a rewrite of the authors' Conference on Software Maintenance-85
 paper.


TITLE:   Inverse Transformation of Software From Code To Specification
AUTHOR:   Sneed, Harry M., and  Jandrasics, Gabor
SOURCE:   Conference on Software Maintenance
DATE:   October 24-27, 1988
PAGES:   102 -  109

 Describes how existing Cobol software can be retranslated into a logical
 software design stored in the form of a relational database for both
 data and program design elements and how these relations can be
 retranslated into an entity/relationship model of the system with entity,
 structure, and relationship descriptions.  Relations are described and
 information sources are given, but transformation techniques are not
 mentioned.



TITLE:   A CASE for Reverse Engineering
AUTHOR:   Bachman, Charlie
SOURCE:   Datamation
VOL:   34
NO:   13
DATE:   July 1, 1988
PAGES:   49 - 56

 Non-technical article that discusses the potential for CASE tools that
 incorporate support for reverse engineering, re-engineering, and expert
 systems to help with backward and forward development.  The article
 deals mostly with reverse engineering, what it is, its benefits, and
 why CASE tools need to support it.  Business oriented article.



TITLE:   Simple Tools To Automate Documentation
AUTHOR:   Kuhn, D. Richard, and Hollis, Carol G.
SOURCE:   Conference on Software Maintenance
DATE:   November 11-13, 1985
PAGES:   203 - 210

 This paper describes how program information can be extracted from source
 code using simple programs.  The technique relies on the use of a
 programming standard when writing the software.  Information retrieved
 includes calling interfaces, variable usage, calling-called relationships,
 etc.  Oriented towards PL/1, but is general enough for other languages.
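
 The kind of extraction described in this note can be sketched with a
 few lines of pattern matching.  This is only an illustration, not the
 tools from the paper: it assumes a made-up coding standard in which every
 procedure starts on its own line as "NAME: PROC" or "NAME: PROCEDURE",
 procedures are not nested, and every call is written "CALL NAME".  The
 names call_graph and SAMPLE are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Assumed coding standard (hypothetical): a label followed by PROC/PROCEDURE
# opens a procedure, and calls are always written "CALL NAME".
PROC_RE = re.compile(r"^\s*(\w+):\s*PROC(?:EDURE)?\b", re.IGNORECASE)
CALL_RE = re.compile(r"\bCALL\s+(\w+)", re.IGNORECASE)


def call_graph(source: str) -> Dict[str, List[str]]:
    """Map each procedure name to the distinct procedures it calls."""
    graph: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in source.splitlines():
        header = PROC_RE.match(line)
        if header:
            # A new procedure header: later CALLs belong to this procedure.
            current = header.group(1).upper()
            graph.setdefault(current, [])
            continue
        if current is None:
            continue  # text before the first procedure is ignored
        for callee in CALL_RE.findall(line):
            name = callee.upper()
            if name not in graph[current]:
                graph[current].append(name)
    return graph


SAMPLE = """\
MAIN: PROCEDURE OPTIONS(MAIN);
   CALL INIT;
   CALL REPORT;
   CALL INIT;
END MAIN;
INIT: PROC;
END INIT;
REPORT: PROC;
   CALL INIT;
END REPORT;
"""

graph = call_graph(SAMPLE)
for caller, callees in graph.items():
    print(caller, "->", ", ".join(callees) or "(none)")
```

 The sketch works only because the source follows the assumed standard;
 code that breaks the convention (nested procedures, calls split across
 lines) would need a real parser.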


TITLE:   SRE: A Knowledge-based Environment for Large-Scale Software
         Re-engineering Activities
AUTHOR:   Kozaczynski, Wojtek,  and Ning, Jim Q.
SOURCE:   11th International Conference on Software Engineering
DATE:   1989
PAGES:   113 - 122

 This paper describes the underlying principles of a knowledge-based
 Software Re-engineering Environment (SRE). Issues related to the re-engineering
 of large-scale software systems are addressed.  The focus seems to be more
 on reverse engineering support rather than supporting modifications.


TITLE:   PROMPTER: A Knowledge Based Support Tool for Code Understanding
AUTHOR:   Fukunaga, Koichi
SOURCE:   8th International Conference on Software Engineering
DATE:   August 28-30, 1985
PAGES:   358 - 363

 Reports on a prototype tool called PROMPTER for code understanding.  Given
 assembler source code for a program, it produces a higher level description
 of the program using programming knowledge, hardware knowledge, and 
 program conventions.  The basis for the tool and its structure are given.


TITLE:   The Evolution of Programs: Program Abstraction and Instantiation
AUTHOR:   Dershowitz, Nachum
SOURCE:   5th International Conference on Software Engineering
DATE:   March 9-12, 1981
PAGES:   79 - 88

 Gives two detailed examples demonstrating a methodology for deriving
 an abstract program schema that captures a shared technique underlying
 a set of concrete programs.  A schema can be instantiated to create
 a new concrete program.  The method is based on formal logic and
 specifications.  The concrete programs must have input/output and
 body assertions given for this method to work.


TITLE:   A Knowledge-Based System for Software Maintenance
AUTHOR:   Calliss, F. W., Khalil, M., Munro, M., and Ward, M.
SOURCE:   Conference on Software Maintenance
DATE:   October 24-27, 1988
PAGES:   319 - 324

 Describes a project to develop a knowledge-based tool called the
 Maintainer's Assistant.  The tool is directed at helping a maintainer
 develop an understanding of unknown code.  The tool uses a formal
 language to model source code, transformations are used to "realize"
 the purpose of the code, and programming plans are used to spot known
 algorithms.



TITLE:   A Documentation Method Based on Cross-Referencing
AUTHOR:   Foster, John R., and Munro, Malcolm
SOURCE:   Conference on Software Maintenance
DATE:   September 21-24, 1987
PAGES:   181 - 185

 This paper is concerned with the use and maintenance of documentation.
 It describes a toolset ( and methodology) called DOCMAN that uses a
 cross-referencer to collect the names of all program items.  A maintainer
 can then add a description about an item as an understanding of that
 item is gained.  The tool can also be used to view recorded item information.


TITLE:   MAP : a Tool for Understanding Software
AUTHOR:   Warren, Sally
SOURCE:   Sixth International Conference on Software Engineering
DATE:   1982
PAGES:   28 - 37

 This paper describes MAP, a tool that helps maintenance programmers
 understand their programs.  The paper lists the MAP command set and
 shows examples of its use.  Its targeted support areas are also well
 explained.  MAP helps show procedural structure, follow control-flow
 and data-flow, understand data aliasing, search for patterns, and
 compare two different versions of the same program.  It supports COBOL.



TITLE:   Automatic Documentation Methodologies For Software Maintenance
AUTHOR:   Landis, Larry D., Hyland, Patricia M., Gilbert, Alton L., and
          Fine, Andrew J.
SOURCE:  Prepared by Technical Solutions, Inc. for U.S. Army Research
         Office  DTIC Number : AD-A204 752
DATE:   January 25, 1989

Based on the idea that software maintenance is made easier by accurate
documentation, this report deals with automatic techniques
for generating documentation from source code.  This is a brief report
indicating the goals of the research and the results reached.  No details


TITLE:   Redocumenting Software Systems using Hypertext Technology
AUTHOR:   Fletton, Nigel T., and Munro, Malcolm
SOURCE:   Conference on Software Maintenance
DATE:   October 24-27, 1988
PAGES:   54 - 59

 This is a brief paper that discusses the possible advantages of
 using Hypertext to document a software system.  Most of the paper
 discusses current software documentation troubles and how they relate to
 maintenance.  Then the idea of using hypertext is advocated.  The
 authors have just begun an experimental study using hypertext to
 collect program information gained by maintainers while documenting a program.



TITLE:   Maintenance and Reverse Engineering: Low-Level Design Documents 
         Production and Improvement
AUTHOR:   Antonini, P., Benedusi, P., Cantone, G., and Cimitile, A.
SOURCE:   Conference on Software Maintenance
DATE:   September 21-24, 1987
PAGES:   91 - 100

 This paper presents a technique for generating Jackson's logic diagrams
 and Warnier/Orr logical process structures from Cobol source code. The
 techniques are general enough to be applied to most programming languages.
 The paper also presents a methodology based on using reverse engineering
 in the maintenance phase.  The methodology is based on the generation
 and comparison of new design documents with earlier document versions.


TITLE:   Documentation in a Software Maintenance Environment
AUTHOR:   Landis, Larry D., Hyland, Patricia M., Gilbert, Alton L., and
          Fine, Andrew J.
SOURCE:   Conference on Software Maintenance
DATE:   October 24-27, 1988
PAGES:   66 - 73

 This paper describes a project to help maintenance programmers develop
 an understanding of a program by generating documentation from source
 code.  Recovered documentation includes extended Nassi-Shneiderman charts,
 a data dictionary, pictorial representations of data structures, and a
 pretty-printer.  The system supports Fortran, C, and Ada, which are translated
 into an internal language called the Documentation Language (DL).
 A good review of documentation methodologies is given in an appendix.


TITLE:   PAT: A Knowledge-based Program Analysis Tool
AUTHOR:   Harandi, Mehdi T.,  and Ning, Jim Q.
SOURCE:   Conference on Software Maintenance
DATE:   October 24-27, 1988
PAGES:   312 - 318

 This article describes a knowledge based tool to support program
 understanding and debugging.  PAT uses program plans to recognize
 common algorithms and typical implementation mistakes.  The paper gives
 an overview of the tool architecture and explains the program plan
 notation and use.  An example tool session is included.


TITLE:   Software Maintenance as an Engineering Discipline
AUTHOR:   Linger, Richard C.
SOURCE:   Conference on Software Maintenance
DATE:   October 24-27, 1988
PAGES:   292 - 297

 This paper argues that software maintenance must use more formal
 techniques in order to become a manageable activity.  This paper
 discusses the use of the Linger-Mills theory of program primes as
 a formal construct.  Its use for program control restructuring is
 given.


TITLE:  Reverse Software Engineering
AUTHOR:  Prywes, N., Ge, X., Lee, I., and Song, M.
SOURCE:  Tech Report MS-CIS-88-99 Department of Computer and Information
         Science, University of Pennsylvania
DATE:  December 1989


From Invader%cup.portal.com@nova.unix.portal.com Thu Feb  7 00:34:54 1991
Received: from portal.COM by plains.NoDak.edu; Thu, 7 Feb 91 00:34:39 -0600
Received: by nova.unix.portal.com (3.1.18.113)
	id m0j459s-0000pvC; Wed, 6 Feb 91 22:31 PST
Received: by portal.unix.portal.com (%I%) 
	id AA27836; Wed, 6 Feb 91 22:31:05 PST
Received: by hobo.corp.portal.com (4.0/4.0.3 1.6) 
	id AA20576; Wed, 6 Feb 91 22:31:03 PST
To: kmagel@plains.nodak.edu
From: Invader@cup.portal.com
Subject: reverse engineering
Lines: 27
Date: Wed,  6 Feb 91 22:31:02 PST
Message-Id: <9102062231.1.18964@cup.portal.com>
X-Origin: The Portal System (TM)
Status: RO

OK.  I'll bite.  Reverse engineering has been my hobby for years.
I started off with a paper tape of an interpreter system for the
PDP-11 and have gone from there.

Since I'm not dead serious about it, I don't have a great deal of advice.
In general, though, it requires a tool of some sort.  I have built tools that
progressively attack the problem and that save hints about the outcome.
The more interactive, the better.

The idea is to show something useful and when you understand that,
make it more symbolic.

An ideal system would have a lot of built-in knowledge of how programs
work and would do a lot more analysis, breaking into basic blocks and
using flow analysis techniques like a compiler would use.

Finally, compiled code is the easiest to deal with because it is
normally very regular (unless the compiler did some code motion or
cross jumping.)  Regularity is the key, especially in dealing
with assembly language.  What is really cool about it is that
even at the bit level you can tell that different people wrote
different parts of a program.  The personality comes through.

I don't know why you want to know any of this stuff, but I'd be
happy to talk more about it if you have specific ideas, etc.
	mkd
ps  I'd be glad to receive any other info you get, too.
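
The basic-block step mentioned in the reply above can be sketched briefly.
This is a minimal illustration, not the poster's tool: the instruction
format (address, mnemonic, optional jump target), the mnemonics, and the
sample program are all invented for the example.  A block starts at the
first instruction, at every branch target, and right after every
instruction that transfers control.

```python
from typing import List, Optional, Tuple

# Toy instruction: (address, mnemonic, jump target or None).
# The mnemonics below are assumptions, not a real instruction set.
Instr = Tuple[int, str, Optional[int]]

BRANCHES = {"jmp", "beq", "bne"}   # instructions that carry a jump target
TERMINATORS = BRANCHES | {"ret"}   # instructions that end a block


def basic_blocks(program: List[Instr]) -> List[List[int]]:
    """Split a linear instruction list into basic blocks (lists of addresses)."""
    if not program:
        return []
    leaders = {program[0][0]}                  # the first instruction leads a block
    for i, (_addr, op, target) in enumerate(program):
        if op in BRANCHES and target is not None:
            leaders.add(target)                # a branch target leads a block
        if op in TERMINATORS and i + 1 < len(program):
            leaders.add(program[i + 1][0])     # so does the instruction after a transfer
    blocks: List[List[int]] = []
    current: List[int] = []
    for addr, _op, _target in program:
        if addr in leaders and current:
            blocks.append(current)             # close the block before each leader
            current = []
        current.append(addr)
    blocks.append(current)
    return blocks


program = [
    (0, "mov", None),
    (1, "cmp", None),
    (2, "beq", 5),
    (3, "add", None),
    (4, "jmp", 1),
    (5, "ret", None),
]
blocks = basic_blocks(program)
print(blocks)  # [[0], [1, 2], [3, 4], [5]]
```

Flow analysis of the kind a compiler does would then work on these blocks
and the edges between them rather than on individual instructions.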

From bnfb@cs.washington.edu Mon Feb  4 15:03:17 1991
Received: from june.cs.washington.edu by plains.NoDak.edu; Mon, 4 Feb 91 15:03:15 -0600
Received: by june.cs.washington.edu (5.64/7.0jh)
	id AA21642; Mon, 4 Feb 91 13:03:19 -0800
Date: Mon, 4 Feb 91 13:03:19 -0800
From: bnfb@cs.washington.edu (Bjorn Freeman-Benson)
Return-Path: <bnfb@cs.washington.edu>
Message-Id: <9102042103.AA21642@june.cs.washington.edu>
To: kmagel@plains.NoDak.edu
Subject: Re: reverse engineering
Newsgroups: comp.software-eng
In-Reply-To: <7940@plains.NoDak.edu>
Organization: University of Washington, Computer Science, Seattle
Cc: 
Status: R

>Does anyone have any real world experiences they are willing to
>share?  Thanks.

What the heck, here's what I do: I work on a contract basis writing
linkers for Zortech C++.  My linker must be compatible with the
Microsoft linker, but there are no specs on what or how the MS
linker works.  Thus I run test cases through the MS linker and check
the output.  Then I run a slightly different test case, and check it
again.  For example, I want to know how segment attributes are
combined: is it a logical OR?  a last-come, last-served?  a defaults
are always overridden?  a logical AND?  etc.  I write dozens of
little test cases based on what I believe the solution to be and
check my hypothesis.  Sometimes I am right, sometimes I am
surprised, and (rarely, if I am a good engineer) I miss the truth.
The trick for my work is to apply all the off-by-one, corner-cases,
etc. experience that I have to choosing the test cases.

Regards,
Bjorn N. Freeman-Benson
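
The test-and-eliminate approach in the reply above can be sketched as
follows.  This is a hedged illustration only: attributes are modelled as
small bit masks, the four candidate rules are assumptions chosen for the
example, and mystery_tool is a stand-in for the program under study (a
real run would invoke the linker and inspect its output instead).

```python
from functools import reduce
from typing import Callable, Dict, List, Sequence

Rule = Callable[[Sequence[int]], int]

# Candidate explanations for how attributes are combined (assumed set).
HYPOTHESES: Dict[str, Rule] = {
    "logical OR": lambda attrs: reduce(lambda a, b: a | b, attrs),
    "logical AND": lambda attrs: reduce(lambda a, b: a & b, attrs),
    "first wins": lambda attrs: attrs[0],
    "last wins": lambda attrs: attrs[-1],
}


def surviving_hypotheses(black_box: Rule,
                         test_cases: List[Sequence[int]]) -> List[str]:
    """Keep only the hypotheses that agree with the black box on every case."""
    alive = list(HYPOTHESES)
    for case in test_cases:
        observed = black_box(case)
        alive = [name for name in alive if HYPOTHESES[name](case) == observed]
    return alive


def mystery_tool(attrs: Sequence[int]) -> int:
    """Hypothetical stand-in for the undocumented tool being studied."""
    return attrs[-1]


# Each case is picked so that at least two hypotheses predict different results.
cases = [(0b01, 0b10), (0b11, 0b01), (0b00, 0b10, 0b00)]
survivors = surviving_hypotheses(mystery_tool, cases)
print(survivors)  # ['last wins']
```

The value of a test case lies in how many hypotheses it separates; an input
on which all candidates agree tells you nothing, which is why corner cases
matter so much when choosing tests.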

From jcardow@blackbird.afit.af.mil Mon Feb  4 15:14:15 1991
Received: from [129.92.1.2] by plains.NoDak.edu; Mon, 4 Feb 91 15:14:05 -0600
Received: by blackbird.afit.af.mil (5.64+/a0.25)
	id AA10390; Mon, 4 Feb 91 16:14:05 -0500
Date: Mon, 4 Feb 91 16:14:05 -0500
From: James E. Cardow <jcardow@blackbird.afit.af.mil>
Message-Id: <9102042114.AA10390@blackbird.afit.af.mil>
To: kmagel@plains.NoDak.edu
Subject: Re: reverse engineering
Newsgroups: comp.software-eng
References: <7940@plains.NoDak.edu>
Status: R

In comp.software-eng you write:


>    About ten days ago, I requested information concerning how people do 
>reverse engineering of computer software.  To date, there have been six 
>requests to post what I found out, but very little in the way of informative
>responses.  The January, 1990 issue of IEEE Software is devoted in part to 
>reverse engineering.  There are some good articles there and some good 
>references.  Does anyone have any real world experiences they are willing to
>share?  Thanks.

Ken, 

I am trying to prepare a course in reverse engineering/re-engineering for
working professionals so while I can't give you practical examples I can 
give you some pointers.  I did participate in one effort several years ago
without a good understanding of what was going on, which is why I am interested
in developing the course.

Some references of interest:

IEEE Computer Society Tutorial on Software Restructuring by Robert Arnold.
  (I attended a tutorial by Mr Arnold last month and he said it is in 
   revision).

IEEE Computer Society Proceedings from the Conference on Software Maintenance
for 1990.  Several tracks addressed reverse engineering etc.  One paper by
Takis Katsoulakos was especially interesting as an overview of efforts in
Europe.

Conference currently in planning by one of the Navy outfits around D.C. is
on current reverse engineering efforts.

Articles by Biggerstaff, Chikofsky, and Rugaber in IEEE Software.

Work by Eric Byrne of Kansas State U.  for the Air Force as summer study.

Don't know if this will help, but I too would be interested in finding out 
your results.  I'd appreciate any information you would care to share.

Jim Cardow
Air Force Institute of Technology
Wright Patterson AFB, OH


From kim@unagi.cis.upenn.edu Tue Feb  5 13:02:27 1991
Received: from LINC.CIS.UPENN.EDU by plains.NoDak.edu; Tue, 5 Feb 91 13:02:23 -0600
Received: from UNAGI.CIS.UPENN.EDU by linc.cis.upenn.edu
	id AA04374; Tue, 5 Feb 91 14:02:32 -0500
Return-Path: <kim@unagi.cis.upenn.edu>
Received: by unagi.cis.upenn.edu
	id AA27771; Tue, 5 Feb 91 14:02:31 EST
Date: Tue, 5 Feb 91 14:02:31 EST
From: kim@unagi.cis.upenn.edu (JEE-IN KIM)
Posted-Date: Tue, 5 Feb 91 14:02:31 EST
Message-Id: <9102051902.AA27771@unagi.cis.upenn.edu>
To: kmagel@plains.NoDak.edu
Subject: Re: reverse engineering
Newsgroups: comp.software-eng
In-Reply-To: <7940@plains.NoDak.edu>
Organization: University of Pennsylvania
Cc: 
Status: R

Noah Prywes (nsp@central.cis.upenn.edu) and his colleagues have been
developing an equational language called MODEL which was used as an
intermediate language for reverse engineering from CMS-2 code to
Ada and C.  He and his consulting company have lots of REAL WORLD
experiences, I should say. I think you had better ask him for a list
of references including his technical reports.

Best,

Jee-In
-- 
---------------------------
Jee-In Kim
kim@unagi.cis.upenn.edu
---------------------------