[comp.sys.mac] Utility for joining split files

dlv059@Mipl3.JPL.Nasa.Gov (07/27/88)

A few weeks ago I posted a question about a problem I was having with joining 
multi-part files that were received from comp.binaries.mac.  I received many 
helpful responses.  To show my gratitude I am here posting a short DCL command 
file that I developed to append the files, strip out comments, and download 
the resulting file to my Mac.  Of course this will only be useful to those of
you who use VMS.

The file JOINER.COM (which has some internal documentation) is executed by
typing @JOINER file, where 'file' is the root name of the multi-part document.
This assumes that you use a naming convention similar to mine. For example, 
the program "Equation Solver" was divided into three parts.  On the VAX, I 
called them EQUATE1.HQX, EQUATE2.HQX, and EQUATE3.HQX.  I joined them by 
typing @JOINER EQUATE, and the file EQUATE.HQX was created and downloaded to 
my Mac.
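
For readers not on VMS, the naming convention above can be sketched in C.
This is only a rough illustration and not part of JOINER.COM: the helper
names part_name and count_parts are made up for this example, and it
assumes lowercase ".hqx" extensions on a case-sensitive file system.

```c
#include <stdio.h>

/* Build "<root><part>.hqx" into buf (hypothetical helper).
   Returns 0 on success, -1 if the name does not fit in buf. */
int part_name(char *buf, size_t size, const char *root, int part)
{
    int n = snprintf(buf, size, "%s%d.hqx", root, part);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Count consecutive parts root1.hqx, root2.hqx, ... that can be
   opened for reading; stops at the first missing part. */
int count_parts(const char *root)
{
    char name[256];
    int part = 1;
    FILE *fp;

    while (part_name(name, sizeof name, root, part) == 0 &&
           (fp = fopen(name, "r")) != NULL) {
        fclose(fp);
        part++;
    }
    return part - 1;
}
```

With the three-part example above, part_name(buf, sizeof buf, "EQUATE", 2)
would fill buf with "EQUATE2.hqx", and count_parts("EQUATE") would report
how many parts are present before any joining is attempted.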

By the way, on my system, the command "KERMIT" runs the VMS Kermit, and "EDT" 
runs the EDT editor.  Your mileage may differ.

Hope this is helpful for some of you.

--------------------------cut here-----------------------------------------
!This is a short command file to append all the parts of a split .hqx file 
!into one file, edit out all the non-binhexed material, and send the result
!to the Mac via Kermit.  The split files are assumed to be named filen.hqx,
!where 'file' is the root file name and 'n' is the part number.  The required
!argument P1 is 'file', and the joined file will be called file.hqx.
!	NOTE: the edit commands were designed for files received from 
!comp.binaries.mac, and will not work with files from sumex or other archives.
!	--Dave Votaw, Jet Propulsion Lab  (dlv059@ipl.jpl.nasa.gov) 7/26/88
$ copy 'P1'1.hqx 'P1'.hqx
$ part = 2	! Assume at least two parts
$LOOP:		! This section finds out how many parts to the file
$ file = p1 + f$string(part) + ".hqx"
$ if F$SEARCH(file) .eqs. "" then goto append
$ part = part + 1
$ goto loop
$APPEND:	! Here we assemble the parts into one file
$ parts = part - 1
$ part = 2
$LOOP2:
$ file = p1 + f$string(part) + ".hqx"
$ append 'file' 'p1'.hqx
$ if part .eq. parts then goto clean
$ part = part + 1
$ goto loop2
$CLEAN:		! Here we clean up the assembled file
$ open/write outfile clean.com		! Create a com file to do it for us
$ write outfile "$ edt/nocom " + p1 + ".hqx"
$ write outfile "delete 1 thru ""---"""		! Removes introductory stuff
$ part = 2
$LOOP3:
$ write outfile "delete "" ---"" thru ""---"""	! Removes between-part stuff
$ if part .eq. parts then goto final
$ part = part + 1
$ goto loop3
$FINAL:
$ write outfile "delete ""---"""		! Removes last line
$ write outfile "exit"
$ close outfile
$ @clean			! Execute the file, then delete it
$ delete clean.com;*
$ open/write outfile ker.com	! Create a command file for Kermit
$ write outfile "send ''p1'.hqx"
$ write outfile "exit"
$ close outfile
$ kermit @ker			! Execute Kermit with command file
$ delete ker.com;*
$ exit

Dave Votaw	dlv059@ipl.jpl.nasa.gov

straka@ihlpf.ATT.COM (Straka) (08/03/88)

In article <2955@utastro.UUCP> werner@utastro.UUCP (Werner Uhrig) writes:
...

I'm sorry to post this to this newsgroup because of its length, but here is
the utility I wrote ~1.5 years ago to combine split binhex files.  I tried to
post it to comp.sources.mac a few months back, but it never appeared, even
after assurances by Roger Long that it would.

It is called bhcomb (yes, I know, lousy name), and although not a perfect
utility, is still VERY robust.  Since it is compiled and special purpose, it
is reasonably efficient.  It was written under SYS5 UNIX(R), but as I
understand it, it also compiles and runs just fine under BSD4.x.
(Nothing cute, just std.in and std.out (and std.err).)

Rich Straka ihnp4!ihlpf!straka

Anyway, here goes!:

/*	bhcomb.c: combine and strip header information from BinHexed files.
	          for Macintosh file transfer.
	Author: R. J. Straka
		(revised by G. A. Taylor)
	Revision 1.1a
	Date: July 6, 1987 (1.1a: March 31, 1988)

	Bhcomb is a program that takes a BinHexed Macintosh file that
	has been broken into several pieces to avoid electronic mailer
	handling problems and splices them back together again.  Bhcomb
	does this process in a totally automated fashion (when
	accompanied by an appropriate shell script), and attempts to be
	fairly rigorous by:

	1) Looking for the logical start of the file (delimited by the
	   string: "(This file ...)")
	2) Checking each line of the input for proper length and validity
	   of all characters.
	3) Looking for the logical end of the file.

	Bhcomb was developed under UNIX SVR2, and uses stdin, stdout
	   and stderr exclusively:
		Stdin is used for the input.
		Stdout is used for the valid file output.
		Stderr is used for the garbage and diagnostics.

	After bhcombing, the user would typically use xmodem (or
	  similar) or macput on the resulting file.

	Bhcomb assumes the following BinHex file structure:

	several lines of unrelated header
	(This file must be converted with BinHex 4.0)
	:123456789012345678901234567890123456789012345678901234567890123
	1234567890123456789012345678901234567890123456789012345678901234
	.
	.
	.
	1234567890123456789012345678901234567890123456789012345678901234
	1234...4321:
	several lines of unrelated footer

	Additional unrelated headers and/or footers may be present
	   within the data stream.
	The actual data is prepended with "(This file... BinHex 4.0)"
	   and an extra blank line.
	The actual data begins with a framing ":" (not checked)
	The actual data must end with a framing ":"
	All data lines (except potentially the last) are of the same
	   length (default=64).
	The last data line is of arbitrary length, and ends with a ":".
	Certain characters are never seen within the BinHex portion:
		nothing < \012
		nothing > \012, yet < \040
		no spaces
		no . / ; < = > ? O g n o s t u v w x y z { } characters
		no | ~ \ ] ^ _ characters
		nothing > \176

	Data is gathered through stdin.
	Good data is sent to stdout.
	Bad data and diagnostics are sent to stderr.

	A shell line (or procedure) of the following form is recommended:

	   cat foo?.net | bhcomb >foo 2>foo.doc || echo ^G bhcomb Failed!

	Where the input filenames are foo1.net, foo2.net, ...  The shell
	   should put the files in the proper order given proper naming
	   convention by the files' creator.

	BUGS:
		More than one BinHexed file per invocation ignores all
		  but the first BinHexed file.
		Does not check for additional ":"s inside of the valid
		  portion of the data.
		Has no way to check for files in inappropriate order
		  (except for the first and last)
		Could be made more efficient by being table driven.
		No manual page.  (You can tell I don't write
		  applications code for a living.)
	
	Revision Notes:
		1.0:	Original Release
		1.1:	Now recognizes last line of exactly LENGTH chars
			  without complaining.
			Minor check added for out of sequence input files.
*/

#include <stdio.h>
#include <string.h>
#define	LENGTH	64			/* LENGTH = default BinHex line
					     length = 64
					*/
char header[] = "(This file must be converted with BinHex 4.0)";

main(argc,argv)
{
int valid=0, started=0, lth;
					/* started = "we have started
					     collecting valid BinHex data"
					   valid = "the last line encountered
					     was a valid BinHex line"
					   lth = line length
					*/
char inline[256];
while (gets (inline) != NULL)
	{
	/* Hacked to accept "StuffIt" encoding -- GAT 3/31/88 */
	if (strncmp (inline,"(This file must be converted with BinHex 4.0)",45)==0
	 || strncmp (inline,"(Convert with BinHex or StuffIt)",32)==0)
		{
		if (started != 0)	/* Have we already started? */
					/* If so, something is wrong! */
			{
			started=0;	/* Unused hook for multiple files */
			fprintf(stderr,"%s\n",inline);		  /* Print it */
			fprintf(stderr,"More than one BinHex file!\n");/*ERROR*/
			exit (1);
			}
		else
			{
			/* Print "standard header" and blank line -- GAT 3/31/88 */
			printf("%s\n",header);  /* "standard" header line */
			printf("\n");		/* dummy blank line */
			valid=1;		/* This line of data is valid */
			started=1;		/* We started data gathering
			*/
			}
		}
	else
		{
		lth=strlen (inline);

		if (lth == 0) /* Consume blank lines quietly  -- GAT 3/31/88 */
		    {
		    fprintf(stderr,"\n");
		    continue;
		    }

		if (badchars (inline,lth) != 0)	/* Do we have illegal chars? */
			{
			fprintf(stderr,"%s\n",inline);	/* Put to stderr */
			valid=0;			/* Line not valid */
			}
		else
			{				/* All chars OK */
			if (strlen (inline) != LENGTH)	/* if bad line length */
				{
				if (valid!=1)	/*not expecting last line with :*/
					{
					fprintf(stderr,"%s\n",inline); /*bad line*/
					valid=0;	/* Line not valid */
					}
				else			/*expecting last line with : */
					{
					if (findcolon (inline) == 0)
					/* if colon at end of line */
						{
						printf("%s\n",inline); /* last line */
						started=2;	/* FINISHED */
						exit (0);    /* NORMAL EXIT */
						}
					else
						{
						fprintf(stderr,"%s\n",inline);
						valid=0;	/* bad line */
						}
					}
				}
			else
				{
				if (started != 1)
					{
					fprintf(stderr,"%s\n",inline);	/* Print it */
					fprintf(stderr,"No beginning BinHex message; files may be out of order.\n");  /*ERROR*/
					fprintf(stderr,"Out of phase, get help. :-)\n");  /*ERROR*/
					exit (1);
					}
				else
					{
					printf("%s\n",inline);	/* Good line */
					valid=1;
					if (findcolon (inline) == 0)
					/* if colon at end of
					   this 64 character line */
						{
						started=2;	/* FINISHED */
						exit (0);    /* NORMAL EXIT */
						}
					}
				}
			}
		}
	}
fprintf(stderr,"Improper EOF; no ending colon!\n");  /* should never get here */
exit (2);
}

badchars(lptr,length)			/* Look for illegal characters */
char *lptr;
int length;
{
int badchar, p;
char c;
c='a';
badchar=0;
for (p=0;p<length;p++)
	{
	c=lptr[p];
	if (c < '\n')            {badchar=1; break;}
	if (c > '\n' && c < '!') {badchar=1; break;}
	if (c > '-' && c < '0')  {badchar=1; break;}
	if (c > ':' && c < '@')  {badchar=1; break;}
	if (c == 'O')            {badchar=1; break;}
	if (c > '[' && c < '`')  {badchar=1; break;}
	if (c == 'g')            {badchar=1; break;}
	if (c > 'm' && c < 'p')  {badchar=1; break;}
	if (c >  'r')            {badchar=1; break;}
	}
return (badchar);
}

findcolon(lptr)			/* Look for : at end of line */
char *lptr;
{
int p;
p=strlen(lptr)-1;			/* index of the last character */
while (p>=0 && lptr[p]=='\n') p--;	/* skip any trailing newlines */
if (p>=0 && lptr[p]==':')
	{
	return (0);
	}
else
	{
	return (1);
	}
}
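
The BUGS list above notes that the character check could be table driven.
A minimal sketch of that idea follows.  The name badchars_table and the
lookup table are not part of bhcomb; the table is built from the standard
BinHex 4.0 alphabet plus the framing ":", which is an assumption here and
is slightly stricter than the range tests in badchars() (it also rejects
'7', 'W' and newline).

```c
/* The 64 characters of the BinHex 4.0 encoding alphabet. */
static const char BINHEX_CHARS[] =
    "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr";

static unsigned char ok_table[256];   /* 1 = allowed on a data line */
static int table_ready = 0;

/* Fill the lookup table once from the alphabet string. */
static void build_table(void)
{
    const char *p;

    for (p = BINHEX_CHARS; *p != '\0'; p++)
        ok_table[(unsigned char)*p] = 1;
    ok_table[(unsigned char)':'] = 1; /* framing colon (assumed allowed) */
    table_ready = 1;
}

/* Return 1 if the line holds any character outside the table, else 0.
   Hypothetical stand-in for badchars(); one table lookup per character. */
int badchars_table(const char *line)
{
    if (!table_ready)
        build_table();
    for (; *line != '\0'; line++)
        if (!ok_table[(unsigned char)*line])
            return 1;
    return 0;
}
```

badchars_table(line) returns 0 for a clean line, so it could stand in for
badchars(inline,lth) in the main loop, trading the chain of comparisons
for a single array lookup per character.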


-- 
Rich Straka     ihnp4!ihlpf!straka

Avoid BrainDamage: MSDOS - just say no!