[comp.unix.questions] Redirecting output in AWK

rkxyv@mergvax (Rob Kedoin) (10/04/88)

I am trying to use AWK to split one file into many formatted, smaller files.

The problem I am having is that I cannot output to more than 10 files.  I am
doing my output in a manner similar to:
	MY_FILE="foo";
	printf("abc\n") >> MY_FILE;
	close(MY_FILE);
Even though I am executing the close, I am still running out of files.  Does
anyone know of a fix to this ?

Thanks in advance,

		-Rob Kedoin

UUCP:	...{philabs,motown,icus}\!mergvax\!rkxyv
ARPA:	rkxyv%mergvax.UUCP@uunet.uu.net
BITNET:	rkxyv%mergvax.UUCP@uunet.uu.net.BITNET
SNAIL-mail: Linotype Company - R&D 425 Oser Avenue Hauppauge, NY 11788
VOICE: (516) 434 - 2729

dce@mips.COM (David Elliott) (10/04/88)

In article <816@mergvax> rkxyv@mergvax (Rob Kedoin) writes:
>
>I am trying to use AWK to split one file into many formatted, smaller files.
>
>The problem I am having is that I cannot output to more than 10 files.  I am
>doing my output in a manner similar to:
>	MY_FILE="foo";
>	printf("abc\n") >> MY_FILE;
>	close(MY_FILE);
>Even though I am executing the close, I am still running out of files.  Does
>anyone know of a fix to this ?

I tested awk (SVR3.1) on our system using '>' together with close(), and it
works just fine.  Are you sure close() actually releases files that were
opened with '>>'?
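
One quick way to check (untested) is something like this; if close() isn't
really releasing files opened with '>>', it should fall over with a "too many
output files" sort of complaint around the tenth iteration:

	awk 'BEGIN {
		for (i = 1; i <= 20; i++) {
			f = "junk." i;
			print i >> f;	# open the i-th file with '>>'
			close(f);	# and, we hope, release it again
		}
	}' < /dev/null
	rm -f junk.*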

Anyway, in case you can't get a fix in time, here are some tricks you
can use to get the job done:

1. Use awk to generate a set of lines of the form:

	filename text

   and pipe it to a shell "while read" loop.  The code would look
   like:

	awk '{ ... }' | sed 's/\\/\\\\/g' | while read file line
	do
		echo "$line" >> "$file"
	done

   (The sed script is required if your 'read' processes backslashes; mine
   does.)

2. Make multiple passes on the input, processing only 9 files per pass,
   putting error messages in the 10th.  The code would look like:

	INPUT=/usr/tmp/script.input.$$
	TEMPF=/usr/tmp/script.tempf.$$
	ERRORS=/usr/tmp/script.errs.$$
	cp input $INPUT		# "input" is your original data file
	while [ -s $INPUT ]  # while input is not empty
	do
		awk 'BEGIN {
			nfiles = 0;
			errfile = "'$ERRORS'";
		}
	(nfiles >= 9) {
			print;
			next;
		}
		{
			process input, putting errors into errfile and
			incrementing nfiles when you open a new file
		}' $INPUT > $TEMPF
		if [ -s $ERRORS ]
		then
			cat $ERRORS 1>&2
			break
		fi
		cp $TEMPF $INPUT
	done
	rm -f $INPUT $TEMPF $ERRORS

3. Combine these two, doing 9 files of output in the awk, and printing
   the "filename text" lines to stdout for the while loop to process.
   If you have fewer than 10 files, you get all the work done in awk,
   and if you have more, you still come out OK, but slower.

   This should be easy to do if you define a function to do your
   outputting, letting it keep track of the number of output files
   and deciding whether to put the output into a file or to format
   it to stdout.
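
   Something along these lines, for instance (untested, and it assumes an
   awk that supports user-defined functions; put(), the 9-file cutoff, and
   the "filename text" handling in the main block are only illustrative,
   so substitute your real processing):

	awk '
	# put() writes directly while fewer than 9 output files are in use,
	# and after that emits "filename text" lines on stdout for the
	# shell loop below to finish off.
	function put(file, text) {
		if (file in used) {
			print text >> file;
			return;
		}
		if (nfiles < 9) {
			used[file] = 1;
			nfiles++;
			print text >> file;
			return;
		}
		print file " " text;
	}
	{
		# ... your real processing goes here; for illustration,
		# pretend the file name is $1 and the text is the rest.
		line = $2;
		for (i = 3; i <= NF; i++) line = line " " $i;
		put($1, line);
	}' input | sed 's/\\/\\\\/g' | while read file line
	do
		echo "$line" >> "$file"
	done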

-- 
David Elliott		dce@mips.com  or  {ames,prls,pyramid,decwrl}!mips!dce

les@chinet.UUCP (Leslie Mikesell) (10/05/88)

In article <816@mergvax> rkxyv@mergvax (Rob Kedoin) writes:

>I am trying to use AWK to split one file into many formatted, smaller files.

>The problem I am having is that I cannot output to more than 10 files.  I am
>doing my output in a manner similar to:
>	MY_FILE="foo";
>	printf("abc\n") >> MY_FILE;
>	close(MY_FILE);
>Even though I am executing the close, I am still running out of files.  Does
>anyone know of a fix to this ?

This is supposed to be fixed in the SysVr3 new awk (nawk).  However, I've
done something similar by generating shar-like output and piping it
to /bin/sh.  The output can look something like this:
cat >file1 <<!EOF
..contents of file1
!EOF
cat >file2 <<!EOF
..contents of file2
!EOF
..etc.

Or, of course, you could use Larry Wall's perl, which does not have
awk's limitations.

Les Mikesell

henry@utzoo.uucp (Henry Spencer) (10/06/88)

In article <816@mergvax> rkxyv@mergvax (Rob Kedoin) writes:
>I am trying to use AWK to split one file into many formatted, smaller files.
>The problem I am having is that I cannot output to more than 10 files...

Well, it won't help you right now, but the long-term fix is to complain
to your software supplier and ask them to get rid of the silly limit.
It's not that hard.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu

domo@riddle.UUCP (Dominic Dunlop) (10/13/88)

In article <816@mergvax> rkxyv@mergvax (Rob Kedoin) writes:
>I am trying to use AWK to split one file into many formatted, smaller files.
>The problem I am having is that I cannot output to more than 10 files...

The way around this problem is to have awk produce output which looks like

	cat << E_O_F > file1
	Data for file1
	...
	E_O_F
	cat << E_O_F > file2
	Data for file2
	E_O_F
	...
	cat << E_O_F > file763
	Data for file 763
	...
	E_O_F

and pipe it into the standard input of a shell.  You can even have awk do
the piping by appending  | "sh"  to the relevant print and printf lines in
the awk script.
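
For what it's worth, here is a rough sketch of the idea (untested; it assumes
the input has already been massaged into "filename text" lines with all of a
file's lines grouped together, and that E_O_F never occurs in the data):

	awk '
	$1 != current {
		# switch to a new destination file
		if (current != "") print "E_O_F" | "sh";
		current = $1;
		# \E_O_F quotes the delimiter, so sh leaves $ and backquotes
		# in the data alone
		print "cat << \\E_O_F > " current | "sh";
	}
	{
		# strip the file name field and pass the rest through
		text = $2;
		for (i = 3; i <= NF; i++) text = text " " $i;
		print text | "sh";
	}
	END {
		if (current != "") print "E_O_F" | "sh";
	}' input
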
-- 
Dominic Dunlop
domo@sphinx.co.uk  domo@riddle.uucp