[comp.editors] SED Question

smaxwell@hpcuhc.HP.COM (Susan Maxwell) (04/11/89)

Got a sed question for you:  I'm creating a sed filter to mask out and
delete unwanted characters in a trace file.  I'm having trouble with this
configuration:

        LINE I:    PROMPT>
        LINE I+1:  >
        LINE I+2:  <

Whenever I see this series, I want to delete ALL THREE lines.  I can't
create a pattern /PROMPT\>\\n\>\\n\</d   as far as I can tell, because
sed won't allow patterns to span lines.  Is there a way to do this, without
resorting to awk or ed?   

Susan Maxwell

uucibg@sw1e.UUCP (3929]) (04/11/89)

In article<2360001@hpcuhc.HP.COM> smaxwell@hpcuhc.HP.COM (Susan Maxwell) writes:
>Got a sed question for you:  I'm creating a sed filter to mask out and
>delete unwanted characters in a trace file.  I'm having trouble with this
>configuration:
>        LINE I:    PROMPT>
>        LINE I+1:  >
>        LINE I+2:  <
>Whenever I see this series, I want to delete ALL THREE lines.  I can't
>create a pattern /PROMPT\>\\n\>\\n\</d   as far as I can tell, because
>sed won't allow patterns to span lines.  Is there a way to do this, without
>resorting to awk or ed?   
>
>Susan Maxwell

cat "file to filter" | sed -e "/^PROMPT>/,/^</d"

should do the trick.  It says:  delete (inclusive) all lines from one which
starts with the text "PROMPT>" through the next one which starts with the
text "<".
Note that you can do

sed -e "/^PROMPT>$/,/^<$/d" if you need the lines to contain exactly that
text.
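
A quick check of what the range form does, as a minimal sketch: the sample
input and the function name range_delete are made up for illustration.  It
shows that the range removes everything from the PROMPT> line through the
next "<" line, whether or not the line in between is the single ">" the
question asks about.

```shell
#!/bin/sh
# range_delete is a hypothetical helper name; it wraps the exact-match
# range address from the post above.
range_delete() {
    sed -e '/^PROMPT>$/,/^<$/d'
}

# The three-line series is removed; keep1 and keep2 survive.
printf 'keep1\nPROMPT>\n>\n<\nkeep2\n' | range_delete

# But any other line caught between PROMPT> and the next "<" goes too;
# only keep3 survives here.
printf 'PROMPT>\nother\n<\nkeep3\n' | range_delete
```

If no closing "<" line follows, the range runs on to the end of the input,
so this form is only safe when PROMPT> is always followed by the full series.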


Brian R. Gilstrap                          Southwestern Bell Telephone
One Bell Center Rm 17-G-4                  ...!ames!killer!texbell!sw1e!uucibg
St. Louis, MO 63101                        ...!bellcore!texbell!sw1e!uucibg
(314) 235-3929
#include <std_disclaimers.h>

hm@uva.UUCP (HansM) (04/12/89)

In article <2360001@hpcuhc.HP.COM> smaxwell@hpcuhc.HP.COM (Susan Maxwell) writes:
>
>Got a sed question for you:  I'm creating a sed filter to mask out and
>delete unwanted characters in a trace file.  I'm having trouble with this
>configuration:
>
>        LINE I:    PROMPT>
>        LINE I+1:  >
>        LINE I+2:  <
>
>Whenever I see this series, I want to delete ALL THREE lines.  I can't
>create a pattern /PROMPT\>\\n\>\\n\</d   as far as I can tell, because
>sed won't allow patterns to span lines.  Is there a way to do this, without
>resorting to awk or ed?   
>
>Susan Maxwell


How about "sed -f sedscript", where the file named sedscript contains:

/PROMPT>/ {
	    N
	    /PROMPT>\n>/{
			  N
			  /PROMPT>\n>\n</d
			}
	  }

Hope this is what you are looking for.

--
Hans Mulder	mcvax!uva!hm	hm%uva@hp4nl.nluug.nl

jgrace@bbn.com (Joe Grace) (04/12/89)

In article <2360001@hpcuhc.HP.COM> smaxwell@hpcuhc.HP.COM (Susan Maxwell) writes:
>
>Got a sed question for you:  I'm creating a sed filter to mask out and
>delete unwanted characters in a trace file.  I'm having trouble with this
>configuration:
>
>        LINE I:    PROMPT>
>        LINE I+1:  >
>        LINE I+2:  <
>
>Whenever I see this series, I want to delete ALL THREE lines.  I can't
>create a pattern /PROMPT\>\\n\>\\n\</d   as far as I can tell, because
>sed won't allow patterns to span lines.  Is there a way to do this, without
>resorting to awk or ed?   
>
>Susan Maxwell

Yes, there is a way to get sed to do this, but you have to be
wary of sed's shortcomings.

Try:
----
#! /bin/sh

(cat trace.out;  echo "SomeUniqueID")  \
| sed  \
    -e '/^PROMPT>$/{'  \
      -e 'N'  \
      -e '/\nSomeUniqueID$/{;s@\nSomeUniqueID$@@;q;}'  \
      -e 'N'  \
      -e '/\nSomeUniqueID$/{;s@\nSomeUniqueID$@@;q;}'  \
      -e '/^PROMPT>\n>\n<$/d'  \
    -e '}'  \
    -e '/^SomeUniqueID$/d'

If sed had a way of handling an EOF without quitting, the
SomeUniqueID would be unnecessary.  As sed is, the SomeUniqueID
is used to avoid losing lines which start out the same as your
pattern but which do not fully match your pattern --- and which
you therefore want to keep.

Cheers,

= Joe =
Joe Grace
ARPA: jgrace@bbn.com
UUCP: {harvard,husc6,decvax,etc.}!bbn!jgrace

rupley@arizona.edu (John Rupley) (04/12/89)

In article <38564@bbn.COM>, jgrace@bbn.com (Joe Grace) writes:
> In article <2360001@hpcuhc.HP.COM> smaxwell@hpcuhc.HP.COM (Susan Maxwell) writes:
> >Got a sed question for you:  I'm creating a sed filter to mask out and
> >delete unwanted characters in a trace file.  I'm having trouble with this
> >configuration:
> >        LINE I:    PROMPT>
> >        LINE I+1:  >
> >        LINE I+2:  <
> >Whenever I see this series, I want to delete ALL THREE lines.  I can't
> >create a pattern /PROMPT\>\\n\>\\n\</d   as far as I can tell, because
> >sed won't allow patterns to span lines.  Is there a way to do this, without
> >resorting to awk or ed?   
> 
> Yes, there is a way to get sed to do this, but you have to be
> wary of sed's shortcomings.
> 
> Try:
> ----
> #! /bin/sh
> 
> (cat trace.out;  echo "SomeUniqueID")  \
> | sed  \
>     -e '/^PROMPT>$/{'  \
>       -e 'N'  \
>       -e '/\nSomeUniqueID$/{;s@\nSomeUniqueID$@@;q;}'  \
>       -e 'N'  \
>       -e '/\nSomeUniqueID$/{;s@\nSomeUniqueID$@@;q;}'  \
>       -e '/^PROMPT>\n>\n<$/d'  \
>     -e '}'  \
>     -e '/^SomeUniqueID$/d'
> 
> If sed had a way of handling an EOF without quitting, the
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> SomeUniqueID would be unnecessary.  As sed is, the SomeUniqueID
> is used to avoid losing lines which start out the same as your
> pattern but which do not fully match your pattern --- and which
> you therefore want to keep.

Sed can check for EOF, I believe, and fairly simply, by use of the
address "$", for last line of file.  The following script should take
care of the last line problem, as well as some, but probably not all,
other abnormalities.

sed -e '/^PROMPT>$/{
	: restart
	/^PROMPT>$/{
		h
		$q
		N
		/^PROMPT>\n>$/{
			h
			$q
			N
			/^PROMPT>\n>\n<$/d
			x
			p
			x
			s/^PROMPT>\n>\n//
			b restart
		}
		x
		p
		x
		s/^PROMPT>\n//
		b restart
	}
}'
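
For comparison, here is a shorter sketch of the same "$" idea.  It is not the
script above but a different arrangement: "$!N" appends the next line only
when there is one, and P followed by D releases one line at a time so that a
partial match is rescanned from its second line.  The function name
delete_triples and the sample input are assumptions for illustration.

```shell
#!/bin/sh
# delete_triples is a hypothetical name.  For each line that is exactly
# PROMPT>, pull in up to two following lines (never past the last line)
# and delete the pattern space only if all three lines match.
# Otherwise P prints the first line and D drops it and restarts the
# script on whatever is left, without reading new input.
delete_triples() {
    sed -e '/^PROMPT>$/{
        $!N
        /^PROMPT>\n>$/{
            $!N
            /^PROMPT>\n>\n<$/d
        }
    }
    P
    D'
}

# Sample: a false start (PROMPT>, >) runs straight into a real series.
printf 'PROMPT>\n>\nPROMPT>\n>\n<\ntesting22\n' | delete_triples
```

On that sample the first PROMPT> and > are kept, the series after them is
deleted, and testing22 is kept.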


John Rupley
 uucp: ..{uunet | ucbvax | cmcl2 | hao!ncar!noao}!arizona!rupley!local
 internet: rupley!local@megaron.arizona.edu
 (H) 30 Calle Belleza, Tucson AZ 85716 - (602) 325-4533
 (O) Dept. Biochemistry, Univ. Arizona, Tucson AZ 85721 - (602) 621-3929

jgrace@bbn.com (Joe Grace) (04/13/89)

John is definitely right about the "$" and the bugs, and even his sed style
is better (Thanks!).  But I just thought of another way you *could* do
it using the (new :-) "$" feature, assuming your sed hold buffer is big
enough (likely, it won't be :-( :-( :-().  This script is the
simplest so far, so...

  sed -n -e 'H;${;x;s@^\(.\)\(.*\)@\2\1@;s@PROMPT>\n>\n<\n@@g;s@\n$@@;p;}'
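
A variation on the same hold-buffer idea, as a sketch only: it keeps the
leading newline that H puts in front of every line and uses it to anchor
PROMPT> to the start of a line, and it loops instead of using the g flag so
that back-to-back series are all caught.  The function name slurp_delete is
made up, and like the one-liner above it assumes the whole file fits in the
hold buffer.

```shell
#!/bin/sh
# slurp_delete is a hypothetical name.  Every line is appended to the
# hold space; on the last line the whole file is swapped into the
# pattern space as "\nline1\nline2...\nlineN" and edited in one go.
slurp_delete() {
    sed -n -e 'H
$!d
x
:a
s/\nPROMPT>\n>\n<\(\n\)/\1/
ta
s/\nPROMPT>\n>\n<$//
s/^\n//p'
}

# xxxPROMPT> is not at the start of a line, so its three lines stay;
# the two real series after it are removed.
printf 'xxxPROMPT>\n>\n<\nPROMPT>\n>\n<\nPROMPT>\n>\n<\nend\n' | slurp_delete
```

The last substitution prints only if it removed the leading newline, so an
input that consists of nothing but matching series produces no output.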
  
= Joe =

Joe Grace
ARPA: jgrace@bbn.com
UUCP: {harvard,husc6,decvax,etc.}!bbn!jgrace

smaxwell@hpcuhc.HP.COM (Susan Maxwell) (04/14/89)

/ hpcuhc:comp.editors / smaxwell@hpcuhc.HP.COM (Susan Maxwell) /  5:43 pm  Apr 10, 1989 /

Got a sed question for you:  I'm creating a sed filter to mask out and
delete unwanted characters in a trace file.  I'm having trouble with this
configuration:

        LINE I:    PROMPT>
        LINE I+1:  >
        LINE I+2:  <

Whenever I see this series, I want to delete ALL THREE lines.  I can't
create a pattern /PROMPT\>\\n\>\\n\</d   as far as I can tell, because
sed won't allow patterns to span lines.  Is there a way to do this, without
resorting to awk or ed?   

Susan Maxwell

----------

jackh%twinpeaks@Sun.COM (John Hevelin) (04/17/89)

This works for me.  Put the following in a script
and call with "sed -f script file"


/PROMPT>/ {
N
/PROMPT>\n[^>]/ {
P
D
p
b
}
/PROMPT>\n>/ {
N
/PROMPT>\n>\n[^<]/ {
P
D
P
D
p
b
}
/PROMPT>\n>\n</ {
s/\n//g
d
}
}
}

merlin@violet.berkeley.edu (04/18/89)

I think this does it (script run with "-f"):

/PROMPT>/ {
$b eof
N
$b eof
N
/PROMPT>\n>\n</ {
d
b
}
p
d
}
b
: eof
p
d
q

rupley@arizona.edu (John Rupley) (04/18/89)

In article <23224@agate.BERKELEY.EDU>, merlin@violet.berkeley.edu writes:
> I think this does it (script run with "-f"):
> 
> /PROMPT>/ {
> $b eof
> N
> $b eof
> N
> /PROMPT>\n>\n</ {
> d
> b
> }
> p
> d
> }
> b
> : eof
> p
> d
> q

No cigar - try:

xxxPROMPT>
>
<
PROMPT>
>
PROMPT>
>
<

The script deletes what it shouldn't and doesn't delete what it should.

At least three previous postings give apparently correct solutions.

John Rupley
rupley!local@megaron.arizona.edu

rupley@arizona.edu (John Rupley) (04/18/89)

In article <99289@sun.Eng.Sun.COM>, jackh@sun.UUCP (John Hevelin) writes:
> This works for me.  Put the following in a script
> and call with "sed -f script file"

[sed script deleted] 

Seems like a buggy script -- for starters try it on:

xxxPROMPT>
>
<
PROMPT>
>
PROMPT>
>
<
testing22
PROMPT>
>

Several previous postings handle such oddities, as does the Lex one-liner:

%%
^PROMPT>\n>\n<\n		;

John Rupley
rupley!local@megaron.arizona.edu

bernsten@phoenix.Princeton.EDU (Dan Bernstein) (04/18/89)

How about

  (tr 'X\012' '\377X';echo '') |
    sed 's/XPROMPTX<X>X/X/g' |
    tr '\377X' 'X\012' | sed -n \$\!p

The echo '' to add a blank line and sed -n \$\!p  to take it away
are necessary because sed can't handle input without a terminating
newline. The above commands will delete any three lines with exactly
PROMPT, <, and >; I don't remember if that was the original question
but this solution is much easier to generalize than the others given,
not to mention faster.

As usual, since we're dealing with sed, if your original input does not
end with a newline then the last line will disappear. If you're worried
about this, take away the sed -n \$\!p; then normal inputs will have an
extra line on the end and no-newline last lines will not be munched.

Of course, both tr and sed will munch nulls, and any 255's will be
changed to X's.
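
A sketch of the same tr trick, adapted under a few assumptions: the pattern
follows the original question (PROMPT>, then >, then <), a blank line is put
in front so that a series on the very first line also has an X before it, and
a loop replaces the g flag so that back-to-back series are all removed.  The
function name delete_triples_tr is made up, LC_ALL=C is there so that tr and
sed treat the 255 byte as a plain byte, and the caveats about nulls and
existing 255's still apply.

```shell
#!/bin/sh
# delete_triples_tr is a hypothetical name.
# Step 1: add a leading blank line, then swap X -> \377 and newline -> X,
#         so the whole file becomes one long line; the trailing echo
#         gives sed the terminating newline it wants.
# Step 2: collapse each XPROMPT>X>X<X to a single X until none is left.
# Step 3: swap the bytes back and drop the two helper blank lines.
delete_triples_tr() {
    { { echo; cat; } | LC_ALL=C tr 'X\012' '\377X'; echo; } |
        LC_ALL=C sed -e ':a' -e 's/XPROMPT>X>X<X/X/' -e 'ta' |
        LC_ALL=C tr '\377X' 'X\012' |
        sed -e '1d' -e '$d'
}

# Both series are removed, including the one on the first line;
# the line containing a literal X comes through unchanged.
printf 'PROMPT>\n>\n<\nPROMPT>\n>\n<\nboX\n' | delete_triples_tr
```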

---Dan Bernstein, bernsten@phoenix.princeton.edu