[comp.unix.programmer] sort difficulties

levisonm@qucis.queensu.CA (Mark Levison) (03/05/91)

  I have been trying to use the sort facility on my machine (Sun 3/80, SunOS
4.0.3) for the past few days. I have run into several problems. First, the -z
option gets rejected:

maestrosh[277]sort -m -z 5000 breakup
sort: invalid use of command line options
maestrosh[278]

  Second, to avoid running out of memory during the sort phase I have had to use
the split command and then sort each file individually, followed by a merge
phase. When merging the results I have tried to use the -u option, i.e.,

sort -n -mu file1 file2 >! outputfile

but this causes sort to include only one of the following lines:

-----line 1 (wrapped around to avoid news/mailer problems)
77 0 4408 4410 1 DOCUMENT 1 0 8 SECTION 1 7 9 HEADING 1 8
/DOCUMENT/SECTION/HEADING

-----line 2
77 0 4689 4690 1 DOCUMENT 1 0 8 SECTION 2 15 9 HEADING 1 16
/DOCUMENT/SECTION/HEADING

  These lines are obviously different. In the short term I have gotten around
the problem by leaving the uniqing until after the merge phase and using the
uniq command which appears to work. As far as I can tell from the man page the
sort fields should be the entire line unless I specify otherwise on the command
line.

  Can anyone shed some light on these two problems?

Mark Levison
levisonm@qucis.queensu.ca

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/12/91)

In article <1105@maestro.queensu.CA> levisonm@qucis.queensu.CA (Mark Levison) writes:
> maestrosh[277]sort -m -z 5000 breakup
> sort: invalid use of command line options

Some sorts insist that you leave out the space: -z5000. (Pretty silly
interface, if you ask me. I don't see why sort should care what the
maximum line length is beforehand.)

> sort -n -mu file1 file2 >! outputfile

Wasn't there some talk about this in comp.bugs.sys5 recently? Basically,
yes, that's what some sorts do with -n, and you should either get used
to it or use some field specifiers to avoid it.

---Dan
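
A minimal sketch of the -n/-u interaction discussed above, for anyone who wants
to reproduce it. Under -n, sort compares lines by their leading number only, and
-u keeps one line from each set that compares equal, so two different records
that both start with 77 collapse into one. The records below are shortened
versions of the two lines in the original post, and the file names and the
-k1,1n -k2 key list are illustrative assumptions, not anything the posters used.
The -k syntax is the POSIX form; an older sort such as the one on SunOS 4.0.3
may only accept the +pos -pos style of field specifier.

```shell
#!/bin/sh
# Sketch only: file names, records and the key list are assumptions.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Two distinct records sharing the same leading number (shortened from the post).
printf '%s\n' '77 0 4408 4410 1 DOCUMENT' > "$tmp/file1"
printf '%s\n' '77 0 4689 4690 1 DOCUMENT' > "$tmp/file2"

# 1. Numeric merge with -u: both keys are 77, so only one line survives.
collapsed=$(sort -n -m -u "$tmp/file1" "$tmp/file2" | wc -l | tr -d ' ')

# 2. Workaround from the post: merge without -u, then let uniq compare
#    whole lines (uniq only drops adjacent duplicates).
via_uniq=$(sort -n -m "$tmp/file1" "$tmp/file2" | uniq | wc -l | tr -d ' ')

# 3. Field specifiers: field 1 numerically, then the rest of the line as text,
#    so -u only drops lines that match on both keys. With real data the inputs
#    to -m must already be sorted with this same key list.
via_keys=$(sort -m -u -k1,1n -k2 "$tmp/file1" "$tmp/file2" | wc -l | tr -d ' ')

echo "sort -n -m -u keeps:         $collapsed line(s)"
echo "sort -n -m | uniq keeps:     $via_uniq line(s)"
echo "sort -m -u -k1,1n -k2 keeps: $via_keys line(s)"
```

If the sort behaves as POSIX describes, the first count should be lower than
the other two. Whether the uniq route is safe on a given system depends on that
sort placing identical lines next to each other after a numeric merge.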