klm@goon.cme.nbs.gov (Ken Manheimer) (05/06/89)
Issue: Unix 'join' utility dumps core when applied to certain very simple data

Environment
-----------
OS:       Sun OS 3.5, 4.0, 4.0.1
CPUs:     Apparently irrelevant - explicitly tested on 3/50, 3/180, 3/280
Severity: moderate

Problem Description
-------------------
Applied to specifically formed data in a specific mode, join works part
way and then dumps core.

Repeat By:
----------
Given files:

"/tmp/base":
    a b
    b b b b b b b b b b b b b b b b b b b b
    <EOF>

and "/tmp/add":
    ac c
    <EOF>

then:

    % join -a3 /tmp/base /tmp/add
    a b
    Segmentation fault (core dumped)
    %

Some apparently crucial aspects of these files - a line in the first
file ('/tmp/base'), other than the first line, must have exactly twenty
fields in it, and that line must come up for comparison against a line
from the other file ('/tmp/add').

I cooked down these files from a real situation where I was merging two
files of well over one thousand lines apiece. The situation is not
artificial - the lines represent file-names followed by dates in a
registry for a backup system, where the new items are being merged with
the established entries.

Workaround
----------
Use 'cat', 'sort', and 'awk' to accomplish the 'join -a3':

    % cat /tmp/base /tmp/add | sort -b +0 -1 +n1 - | awk '\
    BEGIN { previous = "" }\
    $1 == previous { printf " %s",$2 }\
    $1 != previous { if(NR != 1) printf "\n"\
                     printf "%s", $0;\
                     previous = $1 }\
    END {printf "%c",10} '
    a b
    ac c
    b b b b b b b b b b b b b b b b b b b b
    %

Ken Manheimer           klm@cme.nbs.gov or ..!uunet!cme-durer!klm
National Institute of Standards and Technology
(Formerly "National Bureau of Standards")
CME Factory Automation Systems, Software Support

"For without the inner the outer loses its meaning; and without the
outer, the inner loses its substance."
        - R.D.Laing _The Politics of Experience_
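Addendum: the workaround above can also be wrapped up as a Bourne shell function. This is only a sketch under stated assumptions, not a drop-in replacement for 'join': the name 'merge_on_first_field' is made up for illustration, each key is assumed to appear at most once per file, and it uses the newer 'sort -k' key syntax plus the '-s' (stable) flag, which GNU and BSD sort accept but which is not guaranteed everywhere. Unlike the one-liner, it appends every remaining field of a matching line rather than just the second.

```shell
# Hypothetical helper (not part of any standard toolset): merge two
# whitespace-separated files on their first field, keeping unpaired
# lines from both files, roughly what 'join -a3' was being asked to do.
# Assumption: each key appears at most once per file.
merge_on_first_field() {
    # Stable sort on field 1 only, so that for equal keys the line from
    # the first file stays ahead of the line from the second file.
    # LC_ALL=C keeps the ordering independent of the user's locale.
    cat "$1" "$2" |
    LC_ALL=C sort -s -k1,1 |
    awk '
        NR > 1 && $1 == previous {
            # Same key as the line before: append its remaining fields.
            for (i = 2; i <= NF; i++) printf " %s", $i
            next
        }
        {
            # New key: close the previous output line, start a new one.
            if (NR != 1) printf "\n"
            printf "%s", $0
            previous = $1
        }
        END { if (NR > 0) printf "\n" }
    '
}
```

Invoked as 'merge_on_first_field /tmp/base /tmp/add', it avoids 'join' entirely, so the twenty-field line never reaches the code that dumps core.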