[fa.info-vax] sort makes big working files

JDEIFIK@USC-ISIB.ARPA (Jeff Deifik) (10/19/85)

I am attempting to sort a text file with the following characteristics:
2722 blocks
11343 lines
variable length lines maximum 32K
stream LF
carriage return carriage control
1848 characters/line actual maximum

I defined sortwork0..9 to be on 5 different big disks and said
$sort/work_files=1 foo bar

My working set went up to 6087
The free disk space on one disk went down by 500,000 blocks
Sort seemed to use space from mainly sortwork1
I aborted sort when the disk space on one disk ran low
On other runs sort reported out of disk space, and aborted

Why did sort try to use 500,000 blocks of space?
What can I do to sort this file?
	(I really want the key to be the entire line of text)

	Jeff Deifik	jdeifik@usc-isib.ARPA	jdeifik@isi.ARPA
-------

info-vax@sri-kl (10/22/85)

From: Wahl.ES@Xerox.ARPA

Try:

$SORT/PROC=TAG

I found that I HAD to do that to sort some files or the sort would fill
up the ENTIRE disk and then die because of lack of space.

--Lisa
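
For readers unfamiliar with the term, here is a minimal Python sketch of
what a tag sort does in general: it sorts small "tags" (a key plus the
record's position in the input) and only copies the full records once, in
final order. It is an illustration of the technique, not a description of
how VMS SORT implements /PROC=TAG. The function name tag_sort and its key
parameter are hypothetical. Note that when the key is the entire line, as
in the original question, each tag is about as large as its record.

```python
import io


def tag_sort(src, dst, key=lambda line: line):
    """Sort newline-terminated records from binary stream `src` into `dst`.

    Only (key, offset, length) tags are sorted; the records themselves
    are re-read from `src` one at a time, in sorted order.
    """
    tags = []
    offset = src.tell()
    # Pass 1: record where each line starts and how long it is.
    for line in iter(src.readline, b""):
        tags.append((key(line.rstrip(b"\n")), offset, len(line)))
        offset += len(line)

    # Sort the tags; the records stay where they are in the input.
    tags.sort()

    # Pass 2: fetch each record by position and write it out in order.
    for _key, start, length in tags:
        src.seek(start)
        record = src.read(length)
        if not record.endswith(b"\n"):
            record += b"\n"  # last input line may lack a terminator
        dst.write(record)


if __name__ == "__main__":
    source = io.BytesIO(b"pear\napple\nfig\n")
    result = io.BytesIO()
    tag_sort(source, result)
    print(result.getvalue().decode(), end="")
```

The second pass reads the input in random order, so a tag sort trades
work-file space for extra seeks on the input file.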

info-vax@sri-kl (10/24/85)

From: emacs!infinet!wanginst!decvax!ittatc!dcdwest!sdcsvax!hutch@cca-unix (Jim Hutchison)

In article <4594@cca.UUCP> you write:
>From: Jeff Deifik <JDEIFIK@USC-ISIB.ARPA>
>
>I am attempting to sort a text file with the following characteristics:
>2722 blocks
>11343 lines
>variable length lines maximum 32K
>stream LF
>carriage return carriage control
>1848 characters/line actual maximum
>
>I defined sortwork0..9 to be on 5 different big disks and said
>$sort/work_files=1 foo bar
>
>My working set went up to 6087
>The free disk space on one disk went down by 500,000 blocks
>Sort seemed to use space from mainly sortwork1
>I aborted sort when the disk space on one disk ran low
>On other runs sort reported out of disk space, and aborted
>
>Why did sort try to use 500,000 blocks of space?
>What can I do to sort this file?
>	(I really want the key to be the entire line of text)
>
>	Jeff Deifik	jdeifik@usc-isib.ARPA	jdeifik@isi.ARPA
>-------

Sort uses buckets; those work files are the buckets.  There is an
option for this, but I don't have the manuals at hand (they are down
a floor).  It is also nice to know that if you drop the temp file
size down, small input files will get sorted faster (less filesystem
overhead).

Shrinking the temp file size for big sorts will make them take
longer, but then again longer is sooner than never.

/*
	Jim Hutchison	UUCP:	{dcdwest,ucbvax}!sdcsvax!hutch
			ARPA:	hutch@sdcsvax
  [ Of course, these statements were typed into my terminal while I was away. ]
*/

mike@yetti.UUCP (Mike Clarkson ) (10/28/85)

Almost all sort algorithms use geometrically increasing space (and time)
to sort a file.  Although the sizes of the files created seem huge, 
they may be correct.  The simple way around this is to break the large
file into, say, 10 smaller files, SORT them individually, and then use
MERGE to combine the 10 sorted files into your final sorted file.

{decvax|ihnp4|linus|watmath}!utzoo!yetti!mike
-- 
Mike Clarkson,		  ...!allegra \		     BITNET: FS300013@YUSOL
CRESS, York University,	  ...!decvax   \			
4700 Keele Street,	  ...!ihnp4     > !utzoo!yetti!mike
North York, Ontario,	  ...!linus    /		     
CANADA M3J 1P3.		  ...!watmath /		     AT&T^G: (416) 667-3954
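
The split, sort, and merge approach suggested above can be sketched in
Python as follows. This is a minimal illustration under stated
assumptions, not the VMS SORT and MERGE utilities themselves: the function
name split_sort_merge and the default of 1200 lines per chunk (which would
give 10 chunks for an 11343-line file) are hypothetical choices. The key
is the entire line, compared bytewise. The temporary chunk files together
take about as much space as the input.

```python
import heapq
import itertools
import os
import tempfile


def line_key(line):
    """Compare lines by their text, ignoring the trailing newline."""
    return line.rstrip(b"\n")


def split_sort_merge(in_path, out_path, lines_per_chunk=1200):
    """Sort a large line-oriented file using small sorted chunk files."""
    chunk_paths = []
    try:
        # Step 1: split the input into chunks and sort each in memory.
        with open(in_path, "rb") as src:
            while True:
                chunk = list(itertools.islice(src, lines_per_chunk))
                if not chunk:
                    break
                if not chunk[-1].endswith(b"\n"):
                    chunk[-1] += b"\n"  # last input line may lack a terminator
                chunk.sort(key=line_key)
                fd, path = tempfile.mkstemp(suffix=".chunk")
                with os.fdopen(fd, "wb") as tmp:
                    tmp.writelines(chunk)
                chunk_paths.append(path)

        # Step 2: merge the sorted chunks, reading one line at a time
        # from each, so memory use stays small.
        files = [open(path, "rb") for path in chunk_paths]
        try:
            with open(out_path, "wb") as dst:
                dst.writelines(heapq.merge(*files, key=line_key))
        finally:
            for f in files:
                f.close()
    finally:
        # Step 3: remove the temporary chunk files.
        for path in chunk_paths:
            os.remove(path)
```

A call such as split_sort_merge("foo", "bar") would read foo and write the
sorted lines to bar; a smaller lines_per_chunk lowers peak memory at the
cost of more chunk files to merge.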