rjk@sawmill.uucp (Richard Kuhns) (11/10/89)
I'm not entirely sure that this is the newsgroup I should use, but I've seen a number of perl questions/answers and I don't know of a better newsgroup (until comp.lang.perl comes along).

My question: I'd dearly love to have a filter, written in perl (the rest of the code for this project is in perl, and I'll post it when I get it working), which would turn the string `B^HBO^HOL^HLD^HD' into `$bold_startBOLD$bold_end', where $bold_start and $bold_end are predefined character strings. I have a filter that does this already written in C, but it seems to me I should be able to do it easier in perl (using regular expressions?), but I can't come up with a good way to do it.

/(.)\010$1/ recognizes one element of such a string (always the first). s/(.)\010$1/$1/g specifically does NOT work (it only changes the first occurrence).

Thanks (in advance, of course).

Rich Kuhns
newton.physics.purdue.edu!sawmill!rjk
tchrist@convex.COM (Tom Christiansen) (11/11/89)
In article <RJK.89Nov9162936@sawmill.uucp> rjk@sawmill.uucp (Richard Kuhns) writes:
|I'm not entirely sure that this is the newsgroup I should use, but
|I've seen a number of perl questions/answers and I don't know of a
|better newsgroup (until comp.lang.perl comes along).
|My question: I'd dearly love to have a filter, written in perl (the
|rest of the code for this project is in perl, and I'll post it when I
|get it working), which would turn the string `B^HBO^HOL^HLD^HD' into
|`$bold_startBOLD$bold_end', where $bold_start and $bold_end are
|predefined character strings. I have a filter that does this already
|written in C, but it seems to me I should be able to do it easier in
|perl (using regular expressions?), but I can't come up with a good way
|to do it. /(.)\010$1/ recognizes one element of such a string (always
|the first). s/(.)\010$1/$1/g specifically does NOT work (it only
|changes the first occurrence).

This is quite close to what you want:

    $SO = "\033[1m";
    $SE = "\033[m";

    $_ = "this string is B\010BO\010OL\010LD\010D today\n";

    if (/(.)\010$1/) {
        $begin = $`;
        do {
            s/$&/$1/;
        } while /(.)\010$1/;
        ( $end = $' ) =~ s/.(.*)/$1/;
        s/^$begin/$&$SO/;
        s/$end$/$SE$&/;
    }

    print;

I say "quite close" because if you consider the following string:

    $_ = "this string is B\010BO\010OL\010LD\010D and B\010BR\010RI\010IG\010GH\010HT\010T today\n";

The "and" also gets emboldened, which isn't quite right, but this should be a good starting point.

It would be really nice if just

    s/((.)\010$1)+/${SO}$1${SE}/g;

would somehow work without any explicit looping, but as with your substitute, $1 won't be reset on each scan.

I'll forward this to the perl-users mailing list (who are waiting on comp.lang.perl) to see whether anybody there has any bright ideas.

--tom

Tom Christiansen                  {uunet,uiucdcs,sun}!convex!tchrist
Convex Computer Corporation       tchrist@convex.COM
"EMACS belongs in <sys/errno.h>: Editor too big!"
tchrist@convex.COM (Tom Christiansen) (11/12/89)
I just got mail from Larry Wall who pointed out that you need to use \1 in the LHS of the substitute. He said:

|You want something like this:
|
|    s/(.)\010\1/<<<$1>>>/g;
|    s/>>><<<//g;
|
|where <<< and >>> can be anything that don't occur in the text.
|
|Within a pattern you want to use \1, not $1, because $1 means interpolate
|the previous pattern match.

Which makes it work. When you're done, change the <<< and >>> into start-standout and end-standout, like this:

    s/<<</$SO/g;    # or s/<<</\033[1m/g; or whatever
    s/>>>/$SE/g;

--tom

Tom Christiansen                  {uunet,uiucdcs,sun}!convex!tchrist
Convex Computer Corporation       tchrist@convex.COM
"EMACS belongs in <sys/errno.h>: Editor too big!"
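A minimal sketch of Larry's two-step marker idea as a self-contained script, written for a present-day Perl 5 rather than the perl of this thread. The sub name `embolden`, the sample string, and the use of `strict`/`warnings` are assumptions for illustration and appear in none of the posts. The first part shows why `$1` inside the pattern misbehaves: it is interpolated before matching starts, so the pattern that actually runs is `/(.)\010/`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: wrap each overstruck run in $text with $so ... $se.
# Assumes the literal markers <<< and >>> never occur in the input.
sub embolden {
    my ($text, $so, $se) = @_;
    $text =~ s/(.)\010\1/<<<$1>>>/g;   # \1 means "the same character again"
    $text =~ s/>>><<<//g;              # join adjacent single-character marks
    $text =~ s/<<</$so/g;              # swap in the real start sequence
    $text =~ s/>>>/$se/g;              # and the real end sequence
    return $text;
}

# $1 in a pattern is interpolated before the match begins. No match has
# happened yet, so it is empty here and any "char, backspace" pair matches.
my $with_dollar;
{
    no warnings 'uninitialized';
    $with_dollar = ("a\010b" =~ /(.)\010$1/) ? 1 : 0;
}
# \1 is a true backreference, so 'a' followed by 'b' is rejected.
my $with_backref = ("a\010b" =~ /(.)\010\1/) ? 1 : 0;

print "\$1 in pattern matches a^Hb: $with_dollar\n";
print "\\1 in pattern matches a^Hb: $with_backref\n";

my $sample = "this string is B\010BO\010OL\010LD\010D and B\010BI\010IG\010G today\n";
print embolden($sample, "\033[1m", "\033[m");
```

Because the marks are merged only where an end mark is immediately followed by a start mark, the plain word between two bold runs stays outside both of them.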
maart@cs.vu.nl (Maarten Litmaath) (11/14/89)
At first I wrote it in sed:

----------8<----------8<----------8<----------8<----------8<----------
: top
s/\(\n.*\)\n\(.\)^H\2/\1\2\
/
t top
s/\(.\)^H\1/\
\1\
/
t top
s/\n\([^\n]*\)\n/${bold}\1${endbold}/g
----------8<----------8<----------8<----------8<----------8<----------

Then I ran s2p on the script -> wrong perl script! ('\n' is handled incorrectly.)

A rewrite in perl:

----------8<----------8<----------8<----------8<----------8<----------
#!/usr/local/bin/perl

eval "exec /usr/local/bin/perl -S $0 $*"
    if $running_under_some_shell;

$SO = "\033[7m";
$SE = "\033[m";

line: while (<>) {
    while (1) {
        while (s/(\n.*)\n(.)\010\2/$1$2\n/) {
            ;
        }
        last if !s/(.)\010\1/\n$1\n/;
    }
    s/\n(.*)\n/$SO$1$SE/g;
    print;
}
--
"Richard Sexton is actually an AI program (or Construct, if you will) running
on some AT&T (R) 3B" (Richard Brosseau) | maart@cs.vu.nl, mcsun!botter!maart
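The loop above can be read as "open a marked region around one overstruck character, then keep pulling the following overstruck characters into it". A rough sketch of the same idea as a testable function follows. The sub name `mark_bold`, the `chomp`, and the bracket markers in the usage line are assumptions added for illustration. In particular, the original script does not chomp its input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper following the newline-as-marker loop from the post.
sub mark_bold {
    my ($line, $start, $end) = @_;
    # Assumption: remove the record's own newline so that "\n" inside
    # $text is used only as a temporary region marker.
    chomp(my $text = $line);
    while (1) {
        # Grow an open region: "\nAB" "\n" "C^HC" becomes "\nABC\n".
        1 while $text =~ s/(\n.*)\n(.)\010\2/$1$2\n/;
        # Open a new region around the next overstruck character, if any.
        last unless $text =~ s/(.)\010\1/\n$1\n/;
    }
    # Replace each pair of temporary markers with the real sequences.
    $text =~ s/\n(.*)\n/$start$1$end/g;
    return "$text\n";
}

print mark_bold("x B\010BO\010O y C\010C\n", "[", "]");
```

Using a newline as the marker works because `.` never matches it, so a region cannot accidentally grow across text that was not overstruck.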
merlyn@iwarp.intel.com (Randal Schwartz) (11/15/89)
In article <RJK.89Nov9162936@sawmill.uucp>, rjk@sawmill (Richard Kuhns) writes:
| I'm not entirely sure that this is the newsgroup I should use, but
| I've seen a number of perl questions/answers and I don't know of a
| better newsgroup (until comp.lang.perl comes along).
|
| My question: I'd dearly love to have a filter, written in perl (the
| rest of the code for this project is in perl, and I'll post it when I
| get it working), which would turn the string `B^HBO^HOL^HLD^HD' into
| `$bold_startBOLD$bold_end', where $bold_start and $bold_end are
| predefined character strings. I have a filter that does this already
| written in C, but it seems to me I should be able to do it easier in
| perl (using regular expressions?), but I can't come up with a good way
| to do it. /(.)\010$1/ recognizes one element of such a string (always
| the first). s/(.)\010$1/$1/g specifically does NOT work (it only
| changes the first occurrence).

I saw this question come through the perl-users@virginia.edu mailing list first, but I'll post my reply here (being the token Perl wizard...:-):

    #!/usr/bin/perl

    $bold_start = "whatever";
    $bold_end = "whatever";

    while (<>) {
        if (/\010/) {
            s/(.)\010\1/\201\1\202/g; # surround bold with \201 and \202
            s/\202\201//g;            # optimize away all end-start pairs
            s/\201/$bold_start/og;    # replace start with real start
            s/\202/$bold_end/og;      # and likewise for end
        }
        print;
    }

There you have it. OK, so it's not a one-liner... big deal.

Just another Perl hacker, (lwall says he's "Not just another Perl hacker"... :-)
--
/== Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ====\
| on contract to Intel's iWarp project, Hillsboro, Oregon, USA, Sol III |
| merlyn@iwarp.intel.com ...!uunet!iwarp.intel.com!merlyn |
\== Cute Quote: "Welcome to Oregon... Home of the California Raisins!" ==/
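For comparison, the single substitution wished for earlier in the thread can be approximated in a present-day Perl 5. This is only a sketch under stated assumptions: it relies on non-capturing groups `(?:...)` and the `/e` modifier as they work in Perl 5, the sub name `embolden_runs` is hypothetical, and the bracket markers merely stand in for `$bold_start` and `$bold_end`. It is not something any poster proposed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: one pass, no temporary marker characters.
sub embolden_runs {
    my ($line, $start, $end) = @_;
    # Group 1 captures a whole run of overstruck characters; group 2 is
    # the character being checked on each repetition of the inner group.
    $line =~ s{((?:(.)\010\2)+)}{
        my $run = $1;
        $run =~ s/.\010//g;
        $start . $run . $end
    }ge;
    return $line;
}

print embolden_runs("this is B\010BO\010OL\010LD\010D and B\010BI\010IG\010G\n", "[", "]");
```

The inner substitution deletes each "character, backspace" prefix from the captured run, leaving one copy of every letter. Compared with the marker-based versions, this avoids reserving characters such as \201 and \202, at the cost of running code inside the replacement.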