[comp.lang.c] PROTOIZE 1.00 - now available via FTP

rfg@pink.ACA.MCC.COM (Ron Guilmette) (06/21/89)

PROTOIZE 1.00  --  now available for anonymous FTP

Many people have noted the need for tools to automate the
conversion of old C code (in non-prototyped form) to new
ANSI C and C++ prototype form.  Well, I noted it too and I
decided to do something about it.

I would like to announce the availability of an automated
prototyping assistant tool.  This tool can automate much (but
definitely not all) of the otherwise tedious work of converting
a large system of source files to ANSI-C (or C++) prototype
format.  This assistant can convert most of the obvious cases,
leaving you to do only the occasional tricky case manually.
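
As a rough illustration of the kind of rewrite involved (a hand-written
sketch, not actual protoize output; the function name scale is made up),
an old-style definition and its prototyped equivalent look like this.
The old form appears only inside a comment, since some newer compilers
no longer accept it:

```c
#include <assert.h>

/*
 * Old-style (K&R) definition, shown only as a comment:
 *
 *     double scale(x, factor)
 *     double x;
 *     int factor;
 *     {
 *         return x * factor;
 *     }
 *
 * Hypothetical example; not taken from the protoize sources.
 */

/* Prototype-style declaration, as it might appear in an include file. */
double scale(double x, int factor);

/* Prototype-style definition of the same function. */
double scale(double x, int factor)
{
    return x * factor;
}

/* With the prototype in scope, the int argument 3 is converted to double. */
static void scale_demo(void)
{
    assert(scale(3, 2) == 6.0);
}
```

With only the old-style definition visible, a call such as scale(3, 2)
would pass an int where a double is expected, and the compiler would not
complain.  That is the sort of interface checking prototypes provide.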

This prototyping assistant tool comes in two separate parts.
The first part is a tool to gather up prototype information,
and the second part is a tool to automatically edit this information
into the proper places (i.e. function declarations and definitions)
within a set of existing source files (both base files and include
files).

The information gathering tool is really a modified version of
the GNU Project C compiler (GCC) Version 1.35 (or thereabouts).
This was the most expedient base for an information gatherer,
because it already had a whole (debugged) parser for full ANSI C.

The other half of the automatic prototyping "system" is the tool
that actually munges the prototypes back into the source code
at the right places.  This is called "protoize".  It is written
in (reasonably portable) C.  It is about as intelligent as it can
be, given the limitations of its input.  For instance, it knows
the difference between static and extern functions, and about many
other subtle points.

The current "man page" for protoize is given below.  To read it,
run it through nroff or troff while using the -man option.

Since the protoize program must be used with GCC, the current version
(1.00) is being distributed as a set of patches against the (virgin)
GCC 1.35 sources.  These patches add several files to the GCC 1.35
sources, specifically:

	protoize.c	the protoize program
	protoize.1	the man page
	proto-gen.c	new addition to GCC (cc1)
	std.c		prototypes for system functions

You may obtain the current version of protoize via anonymous FTP
from yahi.stanford.edu (36.83.0.92).  The compressed set of GCC 1.35
patches is in:

	~ftp/pub/protoize-1.00.Z

This software is distributed with the same terms and conditions as other
GNU software distributed by the Free Software Foundation.  Specifically,
if while using it, your source files vanish (perhaps because you didn't
bother to read the documentation first) and if you were too dumb to have
regular backups made, you cannot hold me, my employer, FSF, or anyone
else liable for direct, incidental, or consequential damages, or for loss
of revenue, teeth, hair, or other vital or disposable organs or appendages.

        ****** MAKE BACKUPS ******  **** YOU'VE BEEN WARNED ****

P.S.  If you like this program, or if you use it for any big conversion,
I'd like to hear from you.  Drop me a line at the address below.

Also, if you find this code useful, and if you would like to see it become
a permanent part of GCC, and be distributed with future versions (and
be maintained), please send a note to that effect to:

	rms@wheaties.ai.mit.edu

// Ron Guilmette  -  MCC  -  Experimental Systems Kit Project
// 3500 West Balcones Center Drive,  Austin, TX  78759  -  (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg


man page - "protoize.1"  -  cut here
--------------------------------------------------------------------------
.TH PROTOIZE 1 "10 June 1989" ""
.SH NAME
protoize \- convert old style (K&R) C source code to ANSI C prototype format
.SH SYNOPSIS
.B protoize
[
.B -C
] [
.B -v
] [
.B -d
] [
.B -s<func-name>
\&... ] [
.I prototypes-file
\&... ]
.SH DESCRIPTION
.I Protoize
aids in the conversion of 
old style (K&R) C source code files to new style
C source code files with ANSI function prototypes.
This conversion is useful for eliciting more complete
interface checking from ANSI C compilers, or as
a preliminary step in the conversion of C programs to C++.
.PP
.I Protoize
is designed to be used in conjunction
with some C compiler which does preliminary
.I "information gathering"
of prototype information from the files to be converted.
Currently, only the GNU C compiler is capable of producing
the information needed by
.I protoize
in a form which is acceptable to
.I protoize.
.PP
.I Protoize
actually has two primary functions.  First, it converts
existing (old style) function declarations and definitions
to prototype form.  Second, for cases in which functions
are called before they have been declared
(i.e. points of
.I implicit
function declarations),
.I protoize
inserts new prototype style function declarations
into the source code.
For each case of an implicit function declaration,
.I protoize
inserts the new (explicit) function declaration
at the very beginning of the block which contains
the implicit declaration.
The insertion of these additional explicit function
declarations assures that
.B all
function calls in the source code being converted will be 
checked for correctness (by an ANSI C compiler).
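.PP
As a hedged illustration of the second function (a hand-written sketch,
not actual output of the program; the names total and count_items are
hypothetical), an explicit declaration placed at the start of the
enclosing block might look like this:
.nf
```c
#include <assert.h>

/* Hypothetical example: count_items() is defined after its first use. */
int total(int boxes)
{
    /* Explicit declaration of the kind described above, placed at the
       start of the block that contains the call.  Without it, old C
       would implicitly declare count_items as "int count_items()". */
    int count_items(int boxes, int per_box);

    return count_items(boxes, 12);
}

int count_items(int boxes, int per_box)
{
    return boxes * per_box;
}

static void total_demo(void)
{
    assert(total(3) == 36);
}
```
.fi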
.PP
.I Protoize
supports the conversion of whole systems of C source
code to function prototype form.  Such
.I "en masse conversions"
may be performed in a batch mode, with
.I protoize
converting an entire system in one run.
.I Protoize
is able to convert entire systems of C source code because
it knows how to use information (gleaned by the C compiler) from one
source file to add prototypes for
function definitions and declarations in
other source files.
.PP
Each conversion of a system of source code
to prototype format consists of two parts.  First,
you must do
.I "information gathering"
by recompiling all of the source files that make up a given
executable program.  These recompilations must be done
with the GNU C compiler, and you must use the
.B -fgen-prototypes
option for each of these compilations.
.PP
As you perform individual compilation steps (using the
.B -fgen-prototypes
option)
you may notice that
a side-effect of
these compilations is to leave files with a
.B .p
suffix in the same directory with the original
.I base
source files.
During compilation with
.B -fgen-prototypes,
one such
.B .p
file is created for
each
.I base
source file compiled.  These files contain
function prototype information and
additional coded information which can be used by
.I protoize
to convert your source code
to function prototype format.
.PP
After all of the
.B .p
files corresponding to all of the
.B .c
files for the source code
have been created,
you may perform the actual prototype conversion step for
the entire source code system by using the
.I protoize
program.
To perform the actual conversion step, you simply invoke
.I protoize
and supply it with the list of
.B .p
files for the system to be converted.
Usually, this list can be automatically expanded
by your shell, and you may simply use the notation
.B *.p
as the command line argument to
.I protoize.
.PP
Execution of the
.I protoize
program causes your original source files to be converted
so that they contain ANSI C style function prototypes both
for function declarations and for function definitions.
After the conversion of your system, you should be
left with a set of equivalent (but prototyped) source files
with the same names as your original files.
Before it writes each converted file back to disk,
.I protoize
deletes the original file and creates a new output file
with the same name.  This ensures that any copies of the
input files which are really just hard links to the same original
files will not be altered by the conversion process.
.PP
.B WARNING.
It is strongly recommended that you make backup copies of
all files that may be converted by
.I protoize
(including both
.I base
files and
.I include
files)
before you execute the
.I protoize
program.
It is also recommended that you
.B never
run
.I protoize
when logged in as user
.B root.
Doing this may cause your system include files to be
undesirably scrambled.
.PP
After conversion of your entire set of source files,
it is recommended that you check the changes made by
.I protoize.
There are two ways to do this.  First, you can run
.I diff(1)
and look at the differences between the converted and
unconverted (backup) versions of each of your source files (including
the
.I include
files).  Second,
it is recommended that you fully recompile and re-link all of the
converted programs
(using some ANSI compatible C compiler
such as the GNU C compiler) immediately following conversion.
This will quickly alert you to any
anomalies in the conversion process.
.SH OPTIONS
.I Protoize
recognizes the following options:
.TP
.BI \-s<func-name>
Suppress option.
This option gives the user the ability to
force
.I protoize
to ignore a given function (or several functions) during the conversion
process.
When this option is used, no prototyping (for either function
definitions or declarations) will take place for the function named by
.B <func-name>.
This option is useful for avoiding conversion of certain functions
which use old fashioned mechanisms to accept a variable number of
arguments.  Such functions are often called
.I varargs
functions.  It is generally inappropriate to convert these
.I varargs
functions to prototype form via automated means.
Rather, you should convert these functions
.I by hand
(and possibly convert them to use the new ANSI C
.I stdarg
conventions at the same time).
Note that you may invoke
.I protoize
with as many
.B \-s
options as you like, thereby suppressing the conversion of as
many different functions as you like.
.TP
.BI \-C
C++ conversion mode.
Normally,
.I protoize
writes its (converted) output files back to files of the same names
as the original (unconverted) input files (but only after deleting the
original input files).
In C++ conversion mode, the same things happen except
that after each output file is written,
a check is made to see if the given output file has a
.B .c
suffix.  If it does, then the given file is renamed, and its suffix
is changed to
.B .C.
This makes the output file
acceptable as a C++ input file for either the GNU C++ compiler or
for the AT&T Cfront translator.
.TP
.BI \-v
Verbose mode.  For each output file
created, print the file's name and, as the conversion for that file
proceeds, print out the (original file) line number at which the
current item conversion is
taking place.  These messages are
printed on
.I stderr.
This option is useful if
.I protoize
gets confused by your source code for any reason (which happens only rarely).
In such cases, using this option can help you find the approximate
source line for the offending source code.
.TP
.BI \-d
Debug mode.  This option is to be used only by those doing maintenance
on the
.I protoize
program itself.  It causes additional messages concerning the internal
state of
.I protoize
to be written to
.I stderr.
.SH EXAMPLES
Assume that you have
a directory with
all of the files for your system in it.  Also
assume that your system consists of two
executable programs, one built from the files
.B s1.c, s2.c,
and
.B s3.c,
and the other built from the files
.B s4.c
and
.B s5.c.
Finally, assume that these source files share some include files called
.B s1.h
and
.B s2.h
in the same directory.
.PP
In order to properly convert such a system of programs, you
would need to perform the following steps in exactly the order shown below
(after making backup files of course).
.sp 1
.in +0.5i
.ft B
gcc -fgen-prototypes -o prog1 s1.c s2.c s3.c
.br
gcc -fgen-prototypes -o prog2 s4.c s5.c
.br
protoize *.p
.br
rm *.p
.sp 1
.ft R
.in -0.5i
.PP
In the example above, the first invocation of
.I gcc
causes three
.B .p
files to be created.
These are called
.B s1.c.p, s2.c.p,
and
.B s3.c.p.
These files contain prototyping information
.I both
for their corresponding
.B .c
files and for any and all related include files
which are used by these
.I base
.B .c
files.
.PP
After the compilation of all of the files which make up
.I prog1,
you would compile all of the files which make up
.I prog2
in a similar fashion (i.e. remembering to use the
.B -fgen-prototypes
option again).
This step would create two more
.B .p
files called
.B s4.c.p
and
.B s5.c.p.
.PP
Finally, after all of the necessary
.B .p
files have been generated, you would invoke the
.I protoize
program, and give it the names of all of the
.B .p
files needed for conversion.
.I Protoize
will then proceed to read in all of the information from the
.B .p
files and then convert all of the source files (both
.I base
files and
.I include
files) which form a part of your system.
.PP
Note that the only
.I include
files which are converted are
ones for which the current user has write access to
the containing directory.  Thus, no
.I "system include"
files are ever converted (unless the user also has write access to
the directories which contain them).
.PP
After the protoization of your entire source code system,
you will probably want to delete all leftover
.B .p
files via the
.I rm(1)
command.
In the example above, this is accomplished simply via the
.I "rm *.p"
command.
This completes the entire protoization process.
.SH CAVEATS
The
.I protoize
program doesn't just get information from your own
.I \.p
files.  Every time
.I protoize
executes, it also reads the file std.c.p
from some standard installation directory
(if it exists) to obtain a pre-written set of function prototypes for
various standard system-supplied functions.  These prototypes are effectively
added to the set of prototypes which 
.I protoize
can use to perform prototype substitutions on your source files.
For this reason, if the source code for your own system of programs
contains its own function definitions
for functions with the same names as standard system-supplied functions,
.I protoize
will issue warning messages about multiple function definitions.
In such cases, it is suggested that you universally change the name of
.I your
function(s) to prevent any further conflicts or confusion
with system-supplied functions.
.PP
Normally, the information which
.I protoize
uses to perform its work comes from
the GNU C compiler via the
.B -fgen-prototypes
option.  When this option is used, full ANSI-C style function
prototypes are generated, but all type specifications within these
prototypes (for both parameter types and function return types) are
.I "fully expanded,"
that is to say that symbolic type names, created via
.B typedef
statements are
.B not
used in the prototypes which GCC emits by default when the
.B -fgen-prototypes
option is used.
.PP
If you wish to create more readable (but still fully prototyped)
programs, you should use the
.B -fproto-typedefs
option in addition to the
.B -fgen-prototypes
option when you initially compile your programs (with GCC)
in preparation for protoization.  Using
.B -fproto-typedefs
causes GCC to write function prototypes to your
.B .p
files such that symbolic type names declared in
.B typedef
statements are used (where possible) to specify
formal parameter types and return types for all functions.
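.PP
The difference between the two forms can be sketched as follows.
This is a hand-written illustration, not the actual contents of a
prototype file; the typedef count_t and the function bump are hypothetical.
.nf
```c
#include <assert.h>

/* Hypothetical typedef; it must be visible before any prototype that uses it. */
typedef unsigned long count_t;

/* Fully expanded form: the typedef name is replaced by its underlying type. */
unsigned long bump(unsigned long n);

/* Typedef form: the same prototype written with the symbolic name. */
count_t bump(count_t n);

/* The two declarations above are compatible, so one definition satisfies both. */
count_t bump(count_t n)
{
    return n + 1;
}

static void bump_demo(void)
{
    assert(bump(41) == 42);
}
```
.fi
Note that the typedef form compiles only where the typedef
is already in scope, whereas the expanded form has no such dependency.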
.PP
If you decide to use the
.B -fproto-typedefs
option, you should be aware
that this option can have undesirable side effects.  Specifically,
it may force you to move around your
.B typedef
statements in the converted code (before you can successfully recompile it).
If you simply want to do temporary testing of your code (with prototypes)
it is strongly suggested that you
.B not
use the
.B -fproto-typedefs
option to GCC when compiling in preparation for protoization.
.PP
Note also that even when you request the use of
.B typedef
names in your prototypes (via
.BR -fproto-typedefs )
you will not always get all of the
.B typedef
names that you might expect.  Specifically,
.B typedef
names which are defined in terms of the
.B const
and/or
.B volatile
type modifiers
may sometimes cause trouble.  This is due partly to
the complicated semantics of these modifiers, and partly
to limitations in GCC's handling of them.  Problems
with these modifiers should be rare in practice.
.PP
As noted above in the description of the
.B -s
option, so-called
.I varargs
functions do not map well to prototyped form.  Usually, the best
way to handle
.I varargs
functions will be to
suppress the conversion of these functions via
the
.B -s
(suppress) option.
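.PP
For reference, a function converted by hand to the ANSI C
.I stdarg
conventions might look like the following sketch.
The function sum_ints is hypothetical and is shown only to
illustrate the target form of such a manual conversion.
.nf
```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical function: add up count int arguments using <stdarg.h>.
   An old <varargs.h> version would have been declared as
   "int sum_ints(va_alist) va_dcl", with no named parameters. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);        /* count is the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

static void sum_ints_demo(void)
{
    assert(sum_ints(3, 10, 20, 12) == 42);
    assert(sum_ints(0) == 0);
}
```
.fi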
.PP
If you use
.I protoize
to convert a system which consists of
multiple programs, and if these programs share some non-system
include files, conflicts may arise
when
.I protoize
attempts to convert function declarations in the common include files.
.PP
For instance, in the example above, if the
.B .c
files for
.I prog1
and the
.B .c
files for
.I prog2
both contained function definitions for the
function named
.I foobar,
and if there is a function
.I declaration
for the function
.I foobar
within one of the common
.I include
files which are shared between
.I prog1
and
.I prog2,
then
.I protoize
cannot know which of the two parameter lists from the two function
.I definitions
should be used to convert the
.I declaration
of 
.I foobar
in the common
.I include
file.
If the two formal parameter lists for the two different
function
.I definitions
in the two programs
both have exactly the same number, order, and types of
parameters, then it does not matter which of the two
formal parameter lists
.I protoize
chooses to perform the conversion of the
.I declaration
of
.I foobar
in the include file.
If, however, these two parameter lists differ (even slightly)
then there is an irreconcilable conflict.
In such cases, the
.I protoize
program issues a warning, picks one formal parameter list arbitrarily,
performs the conversion, and moves on.  The user is advised to heed such
warnings, and (if possible) to rename one of the two
conflicting functions in one
of the two programs (so that conflicts will not arise) and to begin again
on the whole conversion process.
.PP
Finally, note that it is naive to assume that conversion to
prototype format will make old style C code into legitimate
ANSI C code (or legitimate C++ code for that matter).  The
automatic protoization of your source files via the
.I protoize
program is only one step (albeit a big one) towards
full conversion.  A full conversion may also require
lots of editing "by hand".
.SH WARNINGS
There are several possible warning and/or error messages which
.I protoize
will issue for strange circumstances (e.g. missing input
files).  These are mostly self-explanatory.
.PP
Note that
.I protoize
will refuse to convert anything if any of the
input
.B .p
files are older than any of the files it thinks should be
considered for conversion (i.e. all of the
.I base
and
.I include
files which were used to build the program(s) being converted).
This ensures that all
.B .p
prototype files used to guide the operation of
.I protoize
are up-to-date (relative to the input files being converted).
If
.I protoize
complains about the last modification time of
.B .p
files (or the files to be converted) this indicates that you need to recompile
some or all of the files for the program(s) being converted (using the
.B -fgen-prototypes
option), and thereby ensure that all of the
necessary
.B .p
files are up-to-date.
.SH FILES
.ta 2.0i
/usr/local/bin/gcc	GNU C compiler
.br
/usr/local/bin/protoize	the protoize program
.br
/usr/local/lib/std.c.p	standard system prototypes file
.SH "SEE ALSO"
gcc(1), g++(1)
.SH BUGS
There are many cases in which
.I protoize
can be hopelessly confused by
source code which has
comments, preprocessor commands, or macro-calls in the
vicinity of something which it has to convert.
Fortunately, these cases seem to be very rare
in practice.  All of the source files for GCC itself (with the exception of the
.I hard-params.c
file) have been successfully
protoized without any special modifications.
Keep in mind that
.I protoize
knows nothing about preprocessor commands or macro-calls,
and it knows very little about C style comments.
.PP
Because the information written to the prototype
files by GCC is derived from the original source code
.I after
it has been preprocessed, there are certain problems resulting from
the uses of macro-calls in function definitions and/or declarations.
These problems are a permanent part of
.I protoize
and there is no use complaining about them.
.PP
Bugs (and requests for reasonable enhancements) should be reported to
rfg@mcc.com.  Bugs may actually be fixed if they can be easily
reproduced, so it is in your interest to report them
in such a way that reproduction is easy.
.SH COPYING
Copyright (c) 1989 Free Software Foundation, Inc.
.br
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.
.br
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.
.br
Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be included in
translations approved by the Free Software Foundation instead of in
the original English.
.SH AUTHORS
Written by Ron Guilmette at the Microelectronics and Computer Technology
Corporation. (rfg@mcc.com)
.sp 1
See the GNU CC Manual for the contributors to GNU CC.