[comp.unix.questions] a Sed question

slj@mtung.ATT.COM (S. Luke Jones) (12/06/88)

Here is my problem:
	I have a file composed of multi-line records.
	Continuation lines begin with arbitrary white space.
	I want to extract *some* of the records.

for example:
	In this file, the "Here is my..." and "for example" lines
	begin multi-line records.  Suppose the records I want
	are those that begin with an uppercase letter ("Here...")

	sed -n -e '/^[A-Z]/,/^[^(spc)(tab)]/p' file

	will print each such record, but would also get the "for..."
	line of the second record.

	sed -n -e '/^[A-Z]/,/^[(spc)(tab)]/p' file

	will not get "for..." but will only get the first following
	line that begins with white space.

I've tried combinations and permutations of -n and '/pat/,/pat/!d'
and I give up.

Why oh why can't SED act like AWK does and *not* print this line???
-- 
                                                  S. Luke Jones
                                       AT&T Infor#####Bell Labs
        200 Laurel Avenue, Room MT 2E-337, Middletown, NJ 07748
    slj@mtung.att.com  -or-  ...!att!mtung!slj   (201)-957-2733

maart@cs.vu.nl (Maarten Litmaath) (12/07/88)

slj@mtung.ATT.COM (S. Luke Jones) writes:
\Here is my problem:
\	I have a file composed of multi-line records.
\	Continuation lines begin with arbitrary white space.
\	I want to extract *some* of the records.

\for example:
\	In this file, the "Here is my..." and "for example" lines
\	begin multi-line records.  Suppose the records I want
\	are those that begin with an uppercase letter ("Here...")

\	sed -n -e '/^[A-Z]/,/^[^(spc)(tab)]/p' file

\	will print each such record, but would also get the "for..."
\	line of the second record.

\	sed -n -e '/^[A-Z]/,/^[(spc)(tab)]/p' file

\	will not get "for..." but will only get the first following
\	line that begins with white space.

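#! /bin/sh

# Print every record whose first line begins with an uppercase letter.
# The last test in the loop catches a wanted record that directly
# follows another one; without it, "n" would read it and drop it.
exec sed -n -e '
	/^[A-Z]/{
: loop
		p
		n
		/^[ 	]/b loop
		/^$/ b loop
		/^[A-Z]/b loop
	}
' ${1+"$@"}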
-- 
fcntl(fd, F_SETFL, FNDELAY):          |Maarten Litmaath @ VU Amsterdam:
      let's go weepin' in the corner! |maart@cs.vu.nl, mcvax!botter!maart
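
A minimal sketch of the print-and-fetch loop above, run against a small made-up sample (the sample text is hypothetical, not from the thread). It assumes a sed that understands the POSIX class `[[:blank:]]`, used here in place of the literal space and tab, and it pins the locale to C so that `[A-Z]` means uppercase ASCII only. The final `/^[A-Z]/b loop` test covers a wanted record that directly follows another; treat it as an assumption about the desired behaviour.

```shell
#!/bin/sh
# Pin the locale so [A-Z] matches uppercase ASCII letters only.
LC_ALL=C; export LC_ALL

# Hypothetical sample: three records; continuation lines start with blanks.
sample='Here is a wanted record
  its continuation
Another wanted record
  with a continuation
for example, an unwanted record
  also continued'

out=$(printf '%s\n' "$sample" | sed -n -e '
  /^[A-Z]/{
: loop
    p
    n
    /^[[:blank:]]/b loop
    /^$/b loop
    /^[A-Z]/b loop
  }
')

# Shows the first two records and nothing from the third.
printf '%s\n' "$out"
```

The loop prints the current line, fetches the next one with `n`, and branches back for as long as that line still belongs to a wanted record.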

pjh@otter.hpl.hp.com (Patrick Hyland) (12/08/88)

If you're willing to use '-f' rather than '-e', then the following
sed script does what you want:

#n
: noprint
s/^[A-Z]/&/
t print
n
b noprint
: print
p
n
s/^[ 	]/&/
t print
b noprint


							Patrick Hyland.
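
A sketch of how the script above might be tried out, assuming `mktemp` is available and using a made-up sample. The script is written to a temporary file because `#n` only suppresses automatic printing when it is the first line of a script file given with `-f`. `[[:blank:]]` again stands in for the literal space and tab.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

script=$(mktemp) || exit 1
trap 'rm -f "$script"' EXIT

# "#n" on the first line has the same effect as the -n option.
cat > "$script" <<'EOF'
#n
: noprint
s/^[A-Z]/&/
t print
n
b noprint
: print
p
n
s/^[[:blank:]]/&/
t print
b noprint
EOF

# Hypothetical input: two wanted records back to back, then an unwanted one.
out=$(printf '%s\n' \
  'Here is a wanted record' \
  '  its continuation' \
  'Another wanted record' \
  'for example, unwanted' \
  '  also continued' | sed -f "$script")

printf '%s\n' "$out"
```

Each `s/.../&/` replaces the match with itself, so it changes nothing; it is there only so that `t` can branch on whether the pattern matched. Because a line that ends one record is sent back to `noprint` and tested again, a wanted record that directly follows another is still printed.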

muller@mirsa.UUCP (Christophe Muller) (02/02/90)

That might be a stupid question, but does somebody know how to
do that job with "sed"? I wanted to find all the lines matching a
pattern, delete these lines _and_ the line immediately following.

After half an hour of tests, man pages, Unix books, I gave up and used a
small awk script that worked well the first time! I just thought that
sed might be faster, but is it possible? (Note: we still don't have
perl...)

   Cheers,
               Christophe.

-- Important: e-mail address is: muller@phoenix.src.umd.edu --

merlyn@iwarp.intel.com (Randal Schwartz) (02/02/90)

In article <520@mirsa.UUCP>, muller@mirsa (Christophe Muller) writes:
| That might be a stupid question, but does somebody know how to
| do that job with "sed"? I wanted to find all the lines matching a
| pattern, delete these lines _and_ the line immediately following.
| 
| After half an hour of tests, man pages, Unix books, I gave up and used a
| small awk script that worked well the first time! I just thought that
| sed might be faster, but is it possible? (Note: we still don't have
| perl...)

Foreseeing a possible reply involving Perl means you've been following
c.u.q for a while.  Well, here's a sed solution:

sed '/pattern/{N
d
}' <<EOF
foo
bar
pattern
bletch
biz
EOF

prints 'foo', 'bar' and 'biz'.  That should be a start.  The important
thing to remember is that 'd' ends the current cycle at once: the
pattern space is thrown away, and the next line starts again at the top
of your script.  So, get all your stuff together, and then do your 'd'.

Just another sed hacker (when I'm not hacking Perl),
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
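
A small sketch, on made-up input, of what `N` followed by `d` does when two matching lines are adjacent. The second match is consumed as "the line immediately following" the first, so the line after it survives.

```shell
#!/bin/sh
# Hypothetical input with two consecutive matching lines.
out=$(printf '%s\n' foo pattern pattern bar biz | sed '/pattern/{N
d
}')

printf '%s\n' "$out"
```

If the line after the second match must go as well, a different script is needed. Also note that when the match falls on the last input line, `N` has nothing to append; POSIX specifies that sed then quits without printing, while GNU sed prints the pattern space first unless POSIXLY_CORRECT is set.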

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (02/02/90)

In article <520@mirsa.UUCP> muller@mirsa.UUCP (Christophe Muller) writes:
: I wanted to find all the lines matching a
: pattern, delete these lines _and_ the line immediately following.
: 
: After half an hour of tests, man pages, Unix books, I gave up and used a
: small awk script that worked well the first time! I just thought that
: sed might be faster, but is it possible?

Certainly.

/pattern/{
    N
    d
}

If pattern matches, append the next line to the current line, and then
discard the whole pattern buffer.

: (Note: we still don't have perl...)

It doesn't buy you much in this case, depending on the pattern.  Searching
for /window/ in /etc/termcap, perl beats sed by only .2 seconds, less than
a 20% speedup.  On a more complicated pattern, you might get more.  Or less.

That's using the perl command:

    time perl -pe '<>,$_ = "" if /window/;' /etc/termcap > /dev/null
    1.0u 0.1s 0:02 40% 191+162k 23+5io 7pf+0w
vs
    time sed -f window.sed /etc/termcap >/dev/null
    1.2u 0.1s 0:02 56% 29+40k 24+0io 1pf+0w

where window.sed contains the appropriate commands.

awk weighs in at

    time awk '/window/{getline;next}{print}' /etc/termcap > /dev/null
    2.3u 0.1s 0:03 72% 55+112k 22+0io 4pf+0w

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov
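
A sketch of what window.sed might contain, assuming it holds the three-line script quoted earlier in the post with "window" as the pattern. The temporary directory and the sample input are hypothetical; the sed and awk commands are run on the same input so their output can be compared.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT

# Assumed contents of window.sed: the N/d script with /window/ as the pattern.
printf '%s\n' '/window/{' N d '}' > "$dir/window.sed"

# Hypothetical stand-in for /etc/termcap.
printf '%s\n' one 'a window entry' two three > "$dir/input"

sed_out=$(sed -f "$dir/window.sed" "$dir/input")
awk_out=$(awk '/window/{getline;next}{print}' "$dir/input")

printf 'sed:\n%s\nawk:\n%s\n' "$sed_out" "$awk_out"
```

Both commands drop the matching line and the one after it, so timing them against each other compares like with like.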

maart@cs.vu.nl (Maarten Litmaath) (02/02/90)

In article <520@mirsa.UUCP>,
	muller@mirsa.UUCP (Christophe Muller) writes:
\... I wanted to find all the lines matching a
\pattern, delete these lines _and_ the line immediately following.  [...]

----------8<----------8<----------8<----------8<----------8<----------
#!/bin/sh

case $# in
0)
	echo "Usage: `basename $0` pattern [files]" >&2
	exit 1
esac

q=\\
pattern=`echo x"$1" | sed -e 's/.//' -e "s-/-$q$q/-g"`
shift

sed "/$pattern/,/.*/d" ${1+"$@"}
----------8<----------8<----------8<----------8<----------8<----------
--
  What do the following have in common:  access(2), SysV echo, O_NONDELAY?  |
  Maarten Litmaath @ VU Amsterdam:  maart@cs.vu.nl,  uunet!mcsun!botter!maart
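
A sketch of the two ideas in the script above, on made-up input: escaping the slashes in the pattern so it can sit between `/` delimiters, and deleting with a range whose end address `/.*/` matches whatever line comes next. `printf` is used instead of `echo` here, an assumption made to sidestep differences between echo implementations.

```shell
#!/bin/sh
q=\\   # a single backslash

# Turn every "/" in the pattern into "\/"; "-" is the s command's delimiter.
pattern=$(printf '%s\n' 'usr/lib' | sed -e "s-/-$q$q/-g")

# Hypothetical input: the matching line and the line after it are deleted.
out=$(printf '%s\n' keep /usr/lib/x gone stay | sed "/$pattern/,/.*/d")

printf 'pattern: %s\n%s\n' "$pattern" "$out"
```

The end address of a range is only tested from the line after the one that opened it, so `/.*/` closes the range on the very next line.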