rfg@pink.ACA.MCC.COM (Ron Guilmette) (06/21/89)
PROTOIZE 1.00 -- now available for anonymous FTP

Many people have noted the need for tools to automate the conversion of old C code (in non-prototyped form) to new ANSI C and C++ prototype form. Well, I noted it too and I decided to do something about it.

I would like to announce the availability of an automated prototyping assistant tool. This tool can automate much (but definitely not all) of the otherwise tedious work of converting a large system of source files to ANSI-C (or C++) prototype format. This assistant can convert most of the obvious cases, leaving you to do only the occasional tricky case manually.

This prototyping assistant tool comes in two separate parts. The first part is a tool to gather up prototype information, and the second part is a tool to automatically edit this information into the proper places (i.e. function declarations and definitions) within a set of existing source files (both base files and include files).

The information gathering tool is really a modified version of the GNU Project C compiler (GCC) Version 1.35 (or thereabouts). This was the most expedient base for an information gatherer, because it already had a whole (debugged) parser for full ANSI C.

The other half of the automatic prototyping "system" is the tool that actually munges the prototypes back into the source code at the right places. This is called "protoize". It is written in (reasonably portable) C. It is about as intelligent as it can be, given the limitations of its input. For instance, it knows the difference between static and extern functions, and about many other subtle points.

The current "man page" for protoize is given below. To read it, run it through nroff or troff while using the -man option.

Since the protoize program must be used with GCC, the current version (1.00) is being distributed as a set of patches against the (virgin) GCC 1.35 sources.
These patches add several files to the GCC 1.35 sources, specifically:

    protoize.c      the protoize program
    protoize.1      the man page
    proto-gen.c     new addition to GCC (cc1)
    std.c           prototypes for system functions

You may obtain the current version of protoize via anonymous FTP from yahi.stanford.edu (36.83.0.92). The compressed set of GCC 1.35 patches is in:

    ~ftp/pub/protoize-1.00.Z

This software is distributed with the same terms and conditions as other GNU software distributed by the Free Software Foundation. Specifically, if while using it, your source files vanish (perhaps because you didn't bother to read the documentation first) and if you were too dumb to have regular backups made, you cannot hold me, my employer, FSF, or anyone else liable for direct, incidental, or consequential damages, or for loss of revenue, teeth, hair, or other vital or disposable organs or appendages.

    ****** MAKE BACKUPS ******
    **** YOU'VE BEEN WARNED ****

P.S. If you like this program, or if you use it for any big conversion, I'd like to hear from you. Drop me a line at the address below.

Also, if you find this code useful, and if you would like to see it become a permanent part of GCC, and be distributed with future versions (and be maintained), please send a note to that effect to:

    rms@wheaties.ai.mit.edu

// Ron Guilmette - MCC - Experimental Systems Kit Project
// 3500 West Balcones Center Drive, Austin, TX 78759 - (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg

man page - "protoize.1" - cut here
--------------------------------------------------------------------------
.TH PROTOIZE 1 "10 June 1989" ""
.SH NAME
protoize \- convert old style (K&R) C source code to ANSI C prototype format
.SH SYNOPSIS
.B protoize
[
.B -C
] [
.B -v
] [
.B -d
] [
.B -s<func-name>
\&... ] [
.I prototypes-file
\&...
]
.SH DESCRIPTION
.I Protoize
aids in the conversion of old style (K&R) C source code files to new style C source code files with ANSI function prototypes. This conversion is useful for eliciting more complete interface checking from ANSI C compilers, or as a preliminary step in the conversion of C programs to C++.
.PP
.I Protoize
is designed to be used in conjunction with some C compiler which does preliminary
.I "information gathering"
of prototype information from the files to be converted. Currently, only the GNU C compiler is capable of producing the information needed by
.I protoize
in a form which is acceptable to
.I protoize.
.PP
.I Protoize
actually has two primary functions. First, it converts existing (old style) function declarations and definitions to prototype form. Second, for cases in which functions are called before they have been declared (i.e. points of
.I implicit
function declarations),
.I protoize
inserts new prototype style function declarations into the source code. For each case of an implicit function declaration,
.I protoize
inserts the new (explicit) function declaration at the very beginning of the block which contains the implicit declaration. The insertion of these additional explicit function declarations ensures that
.B all
function calls in the source code being converted will be checked for correctness (by an ANSI C compiler).
.PP
.I Protoize
supports the conversion of whole systems of C source code to function prototype form. Such
.I "en masse conversions"
may be performed in a batch mode, with
.I protoize
converting an entire system in one run.
.I Protoize
is able to convert entire systems of C source code because it knows how to use information (gleaned by the C compiler) from one source file to add prototypes for function definitions and declarations in other source files.
.PP
Each conversion of a system of source code to prototype format consists of two parts.
First, you must do
.I "information gathering"
by recompiling all of the source files that make up a given executable program. These recompilations must be done with the GNU C compiler, and you must use the
.B -fgen-prototypes
option for each of these compilations.
.PP
As you perform individual compilation steps (using the
.B -fgen-prototypes
option) you may notice that a side-effect of these compilations is to leave files with a
.B .p
suffix in the same directory as the original
.I base
source files. During compilation with
.B -fgen-prototypes,
one such
.B .p
file is created for each
.I base
source file compiled. These files contain function prototype information and additional coded information which can be used by
.I protoize
to convert your source code to function prototype format.
.PP
After all of the
.B .p
files corresponding to all of the
.B .c
files for the source code have been created, you may perform the actual prototype conversion step for the entire source code system by using the
.I protoize
program. To perform the actual conversion step, you simply invoke
.I protoize
and supply it with the list of
.B .p
files for the system to be converted. Usually, this list can be automatically expanded by your shell, and you may simply use the notation
.B *.p
as the command line argument to
.I protoize.
.PP
Execution of the
.I protoize
program causes your original source files to be converted so that they contain ANSI C style function prototypes both for function declarations and for function definitions. After the conversion of your system, you should be left with a set of equivalent (but prototyped) source files with the same names as your original files. Before it writes each converted file back to disk,
.I protoize
deletes the original file and creates a new output file with the same name. This ensures that any copies of the input files which are really just hard links to the same original files will not be altered by the conversion process.
.PP
.B WARNING.
It is strongly recommended that you make backup copies of all files that may be converted by
.I protoize
(including both
.I base
files and
.I include
files) before you execute the
.I protoize
program. It is also recommended that you
.B never
run
.I protoize
when logged in as user
.B root.
Doing this may cause your system include files to be undesirably scrambled.
.PP
After conversion of your entire set of source files, it is recommended that you check the changes made by
.I protoize.
There are two ways to do this. First, you can run
.I diff(1)
and look at the differences between the converted and unconverted (backup) versions of each of your source files (including the
.I include
files). Second, it is recommended that you fully recompile and re-link all of the converted programs (using some ANSI compatible C compiler such as the GNU C compiler) immediately following conversion. This will quickly alert you to any anomalies in the conversion process.
.SH OPTIONS
.I Protoize
recognizes the following options:
.TP
.BI \-s<func-name>
Suppress option. This option gives the user the ability to force
.I protoize
to ignore a given function (or several functions) during the conversion process. When this option is used, no prototyping (for either function definitions or declarations) will take place for the function named by
.B <func-name>.
This option is useful for avoiding conversion of certain functions which use old fashioned mechanisms to accept a variable number of arguments. Such functions are often called
.I varargs
functions. It is generally inappropriate to convert these
.I varargs
functions to prototype form via automated means. Rather, you should convert these functions
.I by hand
(and possibly convert them to use the new ANSI C
.I stdarg
conventions at the same time). Note that you may invoke
.I protoize
with as many
.B \-s
options as you like, thereby suppressing the conversion of as many different functions as you like.
.TP
.BI \-C
C++ conversion mode.
Normally,
.I protoize
writes its (converted) output files back to files of the same names as the original (unconverted) input files (but only after deleting the original input files). In C++ conversion mode, the same things happen except that after each output file is written, a check is made to see if the given output file has a
.B .c
suffix. If it does, then the given file is renamed, and its suffix is changed to
.B .C.
This makes the output file acceptable as a C++ input file for either the GNU C++ compiler or for the AT&T Cfront translator.
.TP
.BI \-v
Verbose mode. For each output file created, print the file's name and, as the conversion for that file proceeds, print out the line number (in the original file) at which the current item is being converted. These messages are printed on
.I stderr.
This option is useful if
.I protoize
gets confused by your source code for any reason (which happens only rarely). In such cases, using this option can help you find the approximate source line for the offending source code.
.TP
.BI \-d
Debug mode. This option is to be used only by those doing maintenance on the
.I protoize
program itself. It causes additional messages concerning the internal state of
.I protoize
to be written to
.I stderr.
.SH EXAMPLES
Assume that you have a directory with all of the files for your system in it. Also assume that your system consists of two executable programs, one built from the files
.B s1.c, s2.c,
and
.B s3.c,
and the other built from the files
.B s4.c
and
.B s5.c.
Finally, assume that these source files share some include files called
.B s1.h
and
.B s2.h
in the same directory.
.PP
In order to properly convert such a system of programs, you would need to perform the following steps in exactly the order shown below (after making backup files of course).
.sp 1
.in +0.5i
.ft B
gcc -fgen-prototypes -o prog1 s1.c s2.c s3.c
.br
gcc -fgen-prototypes -o prog2 s4.c s5.c
.br
protoize *.p
.br
rm *.p
.sp 1
.ft R
.in -0.5i
.PP
In the example above, the first invocation of
.I gcc
causes three
.B .p
files to be created. These are called
.B s1.c.p, s2.c.p,
and
.B s3.c.p.
These files contain prototyping information
.I both
for their corresponding
.B .c
files and for any and all related include files which are used by these
.I base
.B .c
files.
.PP
After the compilation of all of the files which make up
.I prog1,
you would compile all of the files which make up
.I prog2
in a similar fashion (i.e. remembering to use the
.B -fgen-prototypes
option again). This step would create two more
.B .p
files called
.B s4.c.p
and
.B s5.c.p.
.PP
Finally, after all of the necessary
.B .p
files have been generated, you would invoke the
.I protoize
program, and give it the names of all of the
.B .p
files needed for conversion.
.I Protoize
will then proceed to read in all of the information from the
.B .p
files and then convert all of the source files (both
.I base
files and
.I include
files) which form a part of your system.
.PP
Note that the only
.I include
files which are converted are ones for which the current user has write access to the containing directory. Thus, no
.I "system include"
files are ever converted (unless the user also has write access to the directories which contain them).
.PP
After the protoization of your entire source code system, you will probably want to delete all leftover
.B .p
files via the
.I rm(1)
command. In the example above, this is accomplished simply via the
.I "rm *.p"
command. This completes the entire protoization process.
.SH CAVEATS
The
.I protoize
program doesn't just get information from your own
.I \.p
files.
Every time
.I protoize
executes, it also reads the file std.c.p from some standard installation directory (if it exists) to obtain a pre-written set of function prototypes for various standard system-supplied functions. These prototypes are effectively added to the set of prototypes which
.I protoize
can use to perform prototype substitutions on your source files. For this reason, if the source code for your own system of programs contains its own function definitions for functions with the same names as standard system-supplied functions,
.I protoize
will issue warning messages about multiple function definitions. In such cases, it is suggested that you universally change the name of
.I your
function(s) to prevent any further conflicts or confusion with system-supplied functions.
.PP
Normally, the information which
.I protoize
uses to perform its work comes from the GNU C compiler via the
.B -fgen-prototypes
option. When this option is used, full ANSI-C style function prototypes are generated, but all type specifications within these prototypes (for both parameter types and function return types) are
.I "fully expanded,"
that is to say that symbolic type names, created via
.B typedef
statements, are
.B not
used in the prototypes which GCC emits by default when the
.B -fgen-prototypes
option is used.
.PP
If you wish to create more readable (but still fully prototyped) programs, you should use the
.B -fproto-typedefs
option in addition to the
.B -fgen-prototypes
option when you initially compile your programs (with GCC) in preparation for protoization. Using
.B -fproto-typedefs
causes GCC to write function prototypes to your
.B .p
files such that symbolic type names declared in
.B typedef
statements are used (where possible) to specify formal parameter types and return types for all functions.
.PP
If you decide to use the
.B -fproto-typedefs
option, you should be aware that this option can have undesirable side effects.
Specifically, it may force you to move around your
.B typedef
statements in the converted code (before you can successfully recompile it). If you simply want to do temporary testing of your code (with prototypes) it is strongly suggested that you
.B not
use the
.B -fproto-typedefs
option to GCC when compiling in preparation for protoization.
.PP
Note also that even when you request the use of
.B typedef
names in your prototypes (via
.BR -fproto-typedefs )
you will not always get all of the
.B typedef
names that you might expect. Specifically,
.B typedef
names which are defined in terms of the
.B const
and/or
.B volatile
type modifiers may sometimes cause trouble. This is due partly to the complicated semantics of these modifiers, and partly to limitations of GCC's handling of them. Problems with these modifiers should be rare in practice.
.PP
As noted above in the description of the
.B -s
option, so-called
.I varargs
functions do not map well to prototyped form. Usually, the best way to handle
.I varargs
functions will be to suppress the conversion of these functions via the
.B -s
(suppress) option.
.PP
If you use
.I protoize
to convert a system which consists of multiple programs, and if these programs share some non-system include files, conflicts may arise when
.I protoize
attempts to convert function declarations in the common include files.
.PP
For instance, in the example above, if the
.B .c
files for
.I prog1
and the
.B .c
files for
.I prog2
both contain function definitions for the function named
.I foobar,
and if there is a function
.I declaration
for the function
.I foobar
within one of the common
.I include
files which are shared between
.I prog1
and
.I prog2,
then
.I protoize
cannot know which of the two parameter lists from the two function
.I definitions
should be used to convert the
.I declaration
of
.I foobar
in the common
.I include
file.
If the two formal parameter lists for the two different function
.I definitions
in the two programs both have exactly the same number, order, and types of parameters, then it does not matter which of the two formal parameter lists
.I protoize
chooses to perform the conversion of the
.I declaration
of
.I foobar
in the include file. If, however, these two parameter lists differ (even slightly) then there is an irreconcilable conflict. In such cases, the
.I protoize
program issues a warning, picks one formal parameter list arbitrarily, performs the conversion, and moves on. The user is advised to heed such warnings, and (if possible) to rename one of the two conflicting functions in one of the two programs (so that conflicts will not arise) and to begin again on the whole conversion process.
.PP
Finally, note that it is naive to assume that conversion to prototype format will make old style C code into legitimate ANSI C code (or legitimate C++ code for that matter). The automatic protoization of your source files via the
.I protoize
program is only one step (albeit a big one) towards full conversion. A full conversion may also require lots of editing "by hand".
.SH WARNINGS
There are several possible warning and/or error messages which
.I protoize
will issue for strange circumstances (e.g. missing input files, etc.). These are mostly self-explanatory.
.PP
Note that
.I protoize
will refuse to convert anything if any of the input
.B .p
files are older than any of the files it thinks should be considered for conversion (i.e. all of the
.I base
and
.I include
files which were used to build the program(s) being converted). This ensures that all
.B .p
prototype files used to guide the operation of
.I protoize
are up-to-date (relative to the input files being converted).
If
.I protoize
complains about the last modification time of
.B .p
files (or the files to be converted) this indicates that you need to recompile some or all of the files for the program(s) being converted (using the
.B -fgen-prototypes
option), and thereby ensure that all of the necessary
.B .p
files are up-to-date.
.SH FILES
.ta 2.0i
/usr/local/bin/gcc	GNU C compiler
.br
/usr/local/bin/protoize	the protoize program
.br
/usr/local/lib/std.c.p	standard system prototypes file
.SH "SEE ALSO"
gcc(1), g++(1)
.SH BUGS
There are many cases in which
.I protoize
can be hopelessly confused by source code which has comments, preprocessor commands, or macro-calls in the vicinity of something which it has to convert. Fortunately, these cases seem to be very rare in practice. All of the source files for GCC itself (with the exception of the
.I hard-params.c
file) have been successfully protoized without any special modifications. Keep in mind that
.I protoize
knows nothing about preprocessor commands or macro-calls, and it knows very little about C style comments.
.PP
Because the information written to the prototype files by GCC is derived from the original source code
.I after
it has been preprocessed, there are certain problems resulting from the uses of macro-calls in function definitions and/or declarations. These problems are a permanent part of
.I protoize
and there is no use complaining about them.
.PP
Bugs (and requests for reasonable enhancements) should be reported to rfg@mcc.com. Bugs may actually be fixed if they can be easily reproduced, so it is in your interest to report them in such a way that reproduction is easy.
.SH COPYING
Copyright (c) 1989 Free Software Foundation, Inc.
.br
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
.br
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
.br
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be included in translations approved by the Free Software Foundation instead of in the original English.
.SH AUTHORS
Written by Ron Guilmette at the Microelectronics and Computer Technology Corporation. (rfg@mcc.com)
.sp 1
See the GNU CC Manual for the contributors to GNU CC.
--
// Ron Guilmette - MCC - Experimental Systems Kit Project
// 3500 West Balcones Center Drive, Austin, TX 78759 - (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg
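
For readers who have not seen the two styles side by side, here is a minimal sketch of the kind of change the man page above describes. It is an illustration only, not actual protoize output. The function names (scale, sum_ints, self_check) are made up for this example. The old-style definition appears only inside a comment, since newer C standards reject it. The second function shows roughly what a hand conversion of a varargs function to the ANSI C stdarg conventions might look like, which is the route the man page recommends for such functions.

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Old-style (K&R) definition, kept in a comment for comparison:
 *
 *     int scale(value, factor)
 *         int value;
 *         long factor;
 *     {
 *         return (int) (value * factor);
 *     }
 */

/* Prototype-style declaration, of the sort that belongs in an include file. */
int scale(int value, long factor);

/* Prototype-style definition; the body is unchanged from the old form. */
int scale(int value, long factor)
{
    return (int) (value * factor);
}

/*
 * A variable-argument function converted by hand to use <stdarg.h>.
 * It adds up `count` int arguments that follow the count itself.
 */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Small self-check showing how the two functions are called. */
void self_check(void)
{
    assert(scale(3, 4L) == 12);
    assert(sum_ints(3, 1, 2, 3) == 6);
}
```

With the prototype for scale in scope, an ANSI C compiler can diagnose a call with the wrong number of arguments, which is the "more complete interface checking" the DESCRIPTION section mentions.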