tml@santra.UUCP (Tor Lillqvist) (12/23/87)
I have noticed some strange behaviour in sed (similar problems probably also exist in other programs that use regexp(3)). I have the following sed script:

    sed -e 's/^\([^.]*\)[^:]*:\([^ ]*\) \1/\2 \1/' \
        -e 's/^\([^.]*\)[^:]*:\([^ ]*\) /\2 \1:/' \
        -e 's/\\-/-/g' -e 's/\\\*-/-/g'\
        -e 's/^\.TH [^ ]* \([^ ]*\).* \([^-]*\)/\2(\1) /'

and this input file:

    ssignal.3c:.TH SSIGNAL 3C "" "" HP-UX ssignal, gsignal \- software signals
    stdio.3s:.TH STDIO 3S "" "" HP-UX stdio \- standard buffered input/output stream file package
    stdipc.3c:.TH STDIPC 3C "" "" HP-UX ftok \- standard interprocess communication package
    string.3c:.TH STRING 3C "" "" HP-UX strcat, strncat, strcmp, strncmp, strcpy, strncpy, strlen, strchr, strrchr, strpbrk, strspn, strcspn, strtok \- character string operations
    strtod.3c:.TH STRTOD 3C "" "" HP-UX strtod, atof, nl_strtod, nl_atof \- convert string to double-precision number

I get the output:

    ssignal, gsignal (3C) - software signals
    stdio (3S) - standard buffered input/output stream file package
    ftok (3C) - standard interprocess communication package
    strcat, strncat, strcmp, strncmp, strcpy, strncpy, strlen, strchr, strrchr, strpbrk, strspn, strcspn, strtok (3C) - character string operations
    strtod, atof, nl_strtod, nl_atof (3C) - convert string to double-precision number

which isn't what I want. However, if I change the sed script to:

    sed -e 's/^\([^.]*\)\.[^:]*:\([^ ]*\) \1/\2 \1/' \
        -e 's/^\([^.]*\)\.[^:]*:\([^ ]*\) /\2 \1:/' \
        -e 's/\\-/-/g' -e 's/\\\*-/-/g'\
        -e 's/^\.TH [^ ]* \([^ ]*\).* \([^-]*\)/\2(\1) /'

I get:

    ssignal, gsignal (3C) - software signals
    stdio (3S) - standard buffered input/output stream file package
    stdipc:ftok (3C) - standard interprocess communication package
    string:strcat, strncat, strcmp, strncmp, strcpy, strncpy, strlen, strchr, strrchr, strpbrk, strspn, strcspn, strtok (3C) - character string operations
    strtod, atof, nl_strtod, nl_atof (3C) - convert string to double-precision number

which is what I want. I.e. the only change is that I add a \. after the \([^.]*\) in the first two expressions.

(As you probably notice, I am trying to enhance the /usr/lib/mkwhatis script so that the whatis database would include the title of the manual page in case it isn't the same as the (first) entry.)

Is this a bug in sed or regexp(3), or what? The same behaviour occurs both in HP-UX on the 9000/840 and BSD4.3 on a VAX.

--
Tor Lillqvist, Technical Research Centre of Finland
tml@fingate.bitnet == tml@santra.uucp == mcvax!santra!tml
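
The difference between the two patterns can be reduced to a one-line test case. This is only a sketch: the input line is a shortened, made-up record, and the replacement text is chosen purely to display what \1 and \2 captured; neither comes from mkwhatis.

```shell
#!/bin/sh
# Hypothetical reduced input: a file-name prefix followed by a .TH request.
line='stdipc.3c:.TH STDIPC'

# First pattern: nothing forces \1 to extend up to the first dot.
without_dot=$(printf '%s\n' "$line" |
    sed -e 's/^\([^.]*\)[^:]*:\([^ ]*\) \1/1=[\1] 2=[\2] rest=/')

# Second pattern: the literal \. pins \1 to the text before the first dot.
with_dot=$(printf '%s\n' "$line" |
    sed -e 's/^\([^.]*\)\.[^:]*:\([^ ]*\) \1/1=[\1] 2=[\2] rest=/')

printf '%s\n' "$without_dot"   # prints: 1=[] 2=[.TH] rest=STDIPC
printf '%s\n' "$with_dot"      # prints: stdipc.3c:.TH STDIPC (no match, line unchanged)
```

In the first case \1 ends up empty, so the trailing \1 matches trivially and the substitution always succeeds. In the second case \1 must be "stdipc", which does not match the upper-case "STDIPC", so the expression does not apply. That would be consistent with ordinary regexp backtracking rather than a bug, though I have not checked the regexp(3) sources.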