rick@ariel.UUCP (R.MAUS) (01/10/85)
The following report was recently submitted to the UNIX* Support Group. I am reposting it for the benefit of the network community.

PROBLEM:
    The "-f" option of "sort(1)" does not perform as advertised when used in conjunction with the "-u" option. In the following three examples, the output hinges on the first occurrence of the text. The word is not folded to lower case as described, other than for comparison purposes.

REPEAT-BY:
    $ sort -fu <<!    $ sort -fu <<!    $ sort -fu <<!
    JUNK              Junk              junk
    Junk              junk              JUNK
    junk              JUNK              Junk
    !                 !                 !

PRODUCES:
    JUNK              Junk              junk

CONCLUSION:
    The output should always be in lower case. Changing the manual page to indicate the problem will only serve to confuse the situation.

* UNIX is a trademark of AT&T Bell Laboratories.

Richard L. Maus, Jr. (Rick)
AT&T-ISL HO 1K313 201-834-4532
...!ho???!ariel!rick
eli@ahuta.UUCP (e.mantel) (01/11/85)
REFERENCES: <826@ariel.UUCP>

With regard to "f" and several other options, the sort(1) manual page I'm looking at says:

    "The *ordering* is affected by the following options..."

It is an idiosyncrasy of the "u" option that the selection of the line output from a set of equal lines is not predictable by the user.

The behavior of the "f" option is probably preferable to what ariel!rick suggests it should be. If I want to ignore case, I'll just use tr(1) before I ever turn it over to sort.

Eli Mantel, AT&T Information Systems, Holmdel, NJ 07733 (ahuta!eli)
mike@enmasse.UUCP (Mike Schloss) (01/13/85)
> CONCLUSION:
>     The output should always be in lower case. Changing the manual
>     page to indicate the problem will only serve to confuse the
>     situation.
>

WRONG CONCLUSION:
    Extraneous flags should not be added to utilities when there already exists a perfectly good utility to perform that function. "sort -fu" caused some confusion. The output from "sort -f | uniq" is self-explanatory.

What you probably want is:

    ... | tr "[A-Z]" "[a-z]" | sort -f | uniq | ...
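A rough sketch of that pipeline, assuming a POSIX shell and the C locale. The function name `fold_uniq` is made up for illustration, and the range form 'A-Z' 'a-z' stands in for the bracketed System V spelling. Because tr(1) runs first, the whole line is folded, not just the part used for comparison:

```shell
#!/bin/sh
# fold_uniq: hypothetical helper name, not part of any standard utility.
# Folds every input line to lower case, sorts, and drops duplicate lines.
fold_uniq() {
    LC_ALL=C tr 'A-Z' 'a-z' | LC_ALL=C sort -f | uniq
}

# The three spellings from the original report collapse to one line.
printf 'JUNK\nJunk\njunk\n' | fold_uniq
```

After the fold the -f on sort no longer changes anything; it is kept only to mirror the pipeline as posted.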
kpmartin@watmath.UUCP (Kevin Martin) (01/14/85)
>     Extraneous flags should not be added to utilities when there
> already exists a perfectly good utility to perform that function.
>     "sort -fu" caused some confusion.
>     the output from "sort -f | uniq" is self explanatory.

Of course, this only works if you want to sort and uniq by entire lines. If only part of the line is the sort key, you're stuck with the -u flag.
bruce@ISM780.UUCP (01/14/85)
> CONCLUSION:
>     The output should always be in lower case. Changing the manual
>     page to indicate the problem will only serve to confuse the
>     situation.

I disagree; you've ignored the fact that there may be other stuff on the line that isn't part of the key. For example, if my input were:

    sort -fu +0 -1 <<!
    JUNK fizz
    Junk frap
    junk fart
    !

Your fix would produce:

    junk fizz

This doesn't correspond to an actual input line.
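The keyed case can be tried with a sketch like the following, assuming a POSIX sort(1), where "-k 1,1" is the current spelling of the obsolescent "+0 -1". The function name `uniq_by_first_field` is hypothetical. Which of the three lines survives is left to the implementation, so no particular output is claimed here; the point is only that the survivor is a complete, unmodified input line:

```shell
#!/bin/sh
# uniq_by_first_field: hypothetical helper name.
# Compares on field 1 only, ignoring case, and keeps one line per key.
uniq_by_first_field() {
    LC_ALL=C sort -f -u -k 1,1
}

input='JUNK fizz
Junk frap
junk fart'

# Prints exactly one of the three input lines, unchanged.
printf '%s\n' "$input" | uniq_by_first_field
```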
jim@ISM780B.UUCP (01/14/85)
The manual says "u -- Suppress all but one in each set of equal lines". It doesn't say which one.

Now a *real* bug with sort -f is that it folds lower case onto upper case rather than vice versa as advertised. This means that [\]^_` (the ASCII characters between Z and a) sort after the letters instead of before.
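The effect of the fold direction can be imitated outside sort(1) with a sketch like this one. It assumes the C locale, a POSIX awk, and input lines without tabs; `sort_folded` is a made-up name, and the helper only models the two fold directions, it does not inspect what any particular sort -f does:

```shell
#!/bin/sh
# sort_folded DIRECTION: hypothetical helper, DIRECTION is "upper" or "lower".
# Prefixes each line with a folded copy of itself as the key, sorts
# bytewise, then strips the key. Assumes input lines contain no tabs.
sort_folded() {
    LC_ALL=C awk -v dir="$1" '{
        key = (dir == "upper") ? toupper($0) : tolower($0)
        print key "\t" $0
    }' | LC_ALL=C sort | cut -f 2-
}

printf 'a\n_\nZ\n' | sort_folded upper   # prints a, Z, _ (underscore after the letters)
printf 'a\n_\nZ\n' | sort_folded lower   # prints _, a, Z (underscore before the letters)
```

In ASCII the underscore is 95, between Z (90) and a (97), which is why the two folds place it on opposite sides of the letters.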
preece@ccvaxa.UUCP (01/16/85)
The use of tr with sort is unlikely to do the desired thing. When I sort things I usually want the output to look like the input, including use of upper and lower case. It's only the ordering mechanism that should ignore case.

You could write a similar sort using simple tools to (1) extract key fields into a file, (2) tr them, (3) attach the key fields on to the front of the corresponding data lines, (4) sort, (5) strip off the key fields. This seems like an awful lot of effort to do the natural kind of sort for text fields.

The use of the -f flag seems like a perfectly natural way to tell the sort program to use special conventions for a text field, just as the -n flag tells sort to use special conventions for numbers.

scott preece
ihnp4!uiucdcs!ccvaxa!preece
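The five steps above can be sketched roughly as follows, assuming a POSIX shell, awk, and the C locale, with the first whitespace-delimited field as the key. The name `fold_sort_first_field` is made up, and awk's tolower() stands in for the separate extract, tr, and attach steps:

```shell
#!/bin/sh
# fold_sort_first_field: hypothetical helper name.
# Steps 1-3: awk extracts field 1, folds it, and attaches it (plus a tab)
#            to the front of the original line.
# Step 4:    sort compares on the attached key only.
# Step 5:    cut strips the key, leaving the line as it was typed.
fold_sort_first_field() {
    LC_ALL=C awk '{ print tolower($1) "\t" $0 }' |
        LC_ALL=C sort -t "$(printf '\t')" -k 1,1 |
        cut -f 2-
}

printf 'banana x\nApple y\napple z\nCherry w\n' | fold_sort_first_field
```

The output keeps the original capitalization; lines whose folded keys tie fall back to a bytewise comparison of the whole decorated line, so "Apple y" comes out ahead of "apple z".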