[comp.editors] vi: search for or go to a line not of length n

lee@chsun1.uchicago.edu (dwight lee) (09/25/90)

A BITNET site sent me a uuencoded binary, and it showed up in my mailbox with
blanks at the ends of lines stripped, and some lines concatenated.  I'd like
to use vi to patch the file up so that it can be properly decoded.

Most uuencoded lines are 61 characters long.  I'd like to be able to search
for a line that isn't 61 characters long.  Any reasonably automated method
will do!  Regexps, macros... anything.

Also, is there a way for a search in vi to be inverted?  ie "search for the
next line which does not begin with a capital M".  This would also be good for
checking on mangled uuencoded files.

Feel free to suggest an alternate method of repair, too.  Thanks to you all!
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Dwight A Lee  /  lee@chsun1.uchicago.edu  /  815-758-1389  /  tCS/BB  /  Font
I speak only for myself.  /  "I am not the only dust my mother raised" - TMBG

brister@decwrl.dec.com (James Brister) (09/25/90)

On 25 Sep 90 08:50:27 GMT, lee@chsun1.uchicago.edu (dwight lee) said:

> I'd like to be able to search
> for a line that isn't 61 characters long.  Any reasonably automated method
> will do!  Regexps, macros... anything.

This isn't so easy. Searching for a line that's LONGER than 61 characters
isn't too bad:

	/...............................................................*

(That's 63 periods followed by a *.  The * applies only to the last period,
so the pattern matches any line with 62 or more characters.)
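
As a sanity check outside vi, the same pattern can be tried in Python. This
is only a sketch: it assumes Python's re module treats a run of periods
followed by * the same way vi does for this simple case, and the names
used are illustrative.

```python
import re

# 63 periods followed by "*": the star binds to the last period only,
# so this is 62 mandatory characters plus zero or more extra ones.
pattern = re.compile("." * 63 + "*")


def is_longer_than_61(line: str) -> bool:
    """Return True if the line (without its newline) has 62+ characters."""
    return pattern.search(line) is not None
```

A line of exactly 61 characters is not matched, while one of 62 is.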

I'm not sure of a simple way to look for lines that are less than 61
characters in length.

> Also, is there a way for a search in vi to be inverted?  ie "search for
> the next line which does not begin with a capital M".  This would also be
> good for checking on mangled uuencoded files.

This is easy: 

	/^[^M]


James
--
James Brister                                           brister@decwrl.dec.com
DEC Western Software Lab., Palo Alto, California.         .....!decwrl!brister

dylan@ibmpcug.co.uk (Matthew Farwell) (09/25/90)

In article <lee.654252627@chsun1> lee@chsun1.uchicago.edu (dwight lee) writes:
>Most uuencoded lines are 61 characters long.  I'd like to be able to search
>for a line that isn't 61 characters long.  Any reasonably automated method
>will do!  Regexps, macros... anything.

I don't think there's any easy way to actually do this within vi. If you
want to do this, use something like awk or perl.

>Also, is there a way for a search in vi to be inverted?  ie "search for the
>next line which does not begin with a capital M".  This would also be good for
>checking on mangled uuencoded files.

Using /^[^M] would do this (where ^M is a caret followed by M, NOT ctrl-M).
This means match any character other than M as the first character of the
line. If you want to do an operation on all lines which match this regexp,
use the ex commands :g or :v.

:g/^[^M]/<ex command>
or
:v/^M/<ex command>

:v is just the opposite of :g (as in the -v option of grep)

I think it would be far easier for you to sort things out in perl or
awk. If you have any problems, mail me.
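
For what it's worth, the same idea in a scripting language might look like
the sketch below. It is written in Python rather than perl or awk, and the
function name odd_length_lines and the default of 61 are just assumptions
taken from the question, not anything standard.

```python
def odd_length_lines(lines, expected=61):
    """Yield (line_number, length) for each line whose length is not `expected`.

    Line numbers start at 1, matching what vi's G command expects.
    """
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")  # ignore the line terminator when measuring
        if len(text) != expected:
            yield number, len(text)
```

It could be fed an open file, e.g. `for n, size in odd_length_lines(open(path)):`
with `path` naming the damaged file. The "begin" header, a short final data
line, and the "end" trailer of a normal uuencoded file would be reported as
well, which is expected.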

Dylan.
-- 
Matthew J Farwell                 | Email: dylan@ibmpcug.co.uk
The IBM PC User Group, PO Box 360,|        ...!uunet!ukc!ibmpcug!dylan
Harrow HA1 4LQ England            | CONNECT - Usenet Access in the UK!!
Phone: +44 81-863-1191            | Sun? Don't they make coffee machines?

steinbac@hpl-opus.HP.COM (Gunter Steinbach) (09/25/90)

> / hpl-opus:comp.editors / lee@chsun1.uchicago.edu (dwight lee) /
> 1:50 am  Sep 25, 1990 /

> Most uuencoded lines are 61 characters long.  I'd like to be able to search
> for a line that isn't 61 characters long.  Any reasonably automated method
> will do!  Regexps, macros... anything.

Just look for
/...................................................................../

(Make that 61 dots - I didn't count mine.)

> Also, is there a way for a search in vi to be inverted?  ie "search for the
> next line which does not begin with a capital M".  This would also be good for
> checking on mangled uuencoded files.

That one is easy:  
/^[^M]/ 

The first caret anchors you at the front of the line, the caret inside
square brackets inverts character-set search.  You should not have "set
ignorecase" on.

Good luck!

	 Guenter Steinbach		gunter_steinbach@hplabs.hp.com

Dan_Bloch@TRANSARC.COM (09/26/90)

lee@chsun1.uchicago.edu (dwight lee) writes:

> A BITNET site sent me a uuencoded binary, and it showed up in my mailbox
> with blanks at the ends of lines stripped, and some lines concatenated.
> I'd like to use vi to patch the file up so that it can be properly decoded.
>
> Most uuencoded lines are 61 characters long.  I'd like to be able to
> search for a line that isn't 61 characters long.  Any reasonably automated
> method will do!  Regexps, macros... anything.

Okay, here we go.  What you want to do is something like

    :v/^..............................................$/s/.*/&<TAG>/

That should be 61 dots.  I put in fewer in the interest of having it all
fit on one line.  What this does is find every line that isn't exactly
61 characters long and put <TAG> on the end of it.  Then you can search
through the file normally for <TAG>, take whatever actions you want,
and delete all the <TAG>s when you're done.  Obviously, you can use any
handy marker instead of <TAG>.
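
The tagging trick can also be sketched outside the editor. The Python below
is only an illustration of the same logic (keep lines of exactly 61
characters, append a marker to everything else); the name tag_lines is made
up, and lines are assumed to have had their newlines removed already.

```python
import re

# Exactly 61 characters of any kind, nothing more and nothing less.
EXACTLY_61 = re.compile(r".{61}")


def tag_lines(lines, tag="<TAG>"):
    """Return a copy of `lines` with `tag` appended to every line that is
    not exactly 61 characters long (empty lines get tagged too)."""
    return [
        line if EXACTLY_61.fullmatch(line) else line + tag
        for line in lines
    ]
```

Searching for the tag afterwards and deleting it when done works just as
described above.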

> Also, is there a way for a search in vi to be inverted?  ie "search for
> the next line which does not begin with a capital M".  This would also be
> good for checking on mangled uuencoded files.

That's what the :v command above does.  It's exactly like :g, only it
searches for lines not matching a given pattern.  It only works as a
line mode command, so if you want to treat each line individually you
need some trick like tagging them.

> Feel free to suggest an alternate method of repair, too.  Thanks to you all!

Couldn't you get the guy to retransmit it?

Dan Bloch
dan@transarc.com

avi@taux01.nsc.com (Avi Bloch) (09/26/90)

In article <62420017@hpl-opus.HP.COM> steinbac@hpl-opus.HP.COM (Gunter Steinbach) writes:
>> / hpl-opus:comp.editors / lee@chsun1.uchicago.edu (dwight lee) /
>> Also, is there a way for a search in vi to be inverted?  ie "search for the
>> next line which does not begin with a capital M".  This would also be good for
>> checking on mangled uuencoded files.
>
>That one is easy:  
>/^[^M]/ 
>

This doesn't always work, since an empty line is also a line 'which does not
begin with a capital M' and that search pattern will only find lines with at
least 1 character in them.

I don't know of any way that will also catch empty lines.

Any takers?
-- 
	Avi Bloch
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel		Tel: (972) 52-522263
avi@taux01.nsc.com

dattier@ddsw1.MCS.COM (David W. Tamkin) (09/27/90)

Dan_Bloch@TRANSARC.COM wrote in <gazyNEr0BwwZIof1o_@transarc.com>:

|     :v/^..............................................$/s/.*/&<TAG>/
| 
| That should be 61 dots.  I put in fewer in the interest of having it all
| fit on one line.  What this does is find every line that isn't exactly
| 61 characters long and put <TAG> on the end of it.  Then you can search
| through the file normally for <TAG>, take whatever actions you want,
| and delete all the <TAG>s when you're done.  Obviously, you can use any
| handy marker instead of <TAG>.

Wouldn't 
    :v/^.....{sixty-one periods total}....$/s/$/<TAG>/
make the same substitution?

David Tamkin  Box 7002  Des Plaines IL  60018-7002  708 518 6769  312 693 0591
MCI Mail: 426-1818  GEnie: D.W.TAMKIN  CIS: 73720,1570   dattier@ddsw1.mcs.com

zlsiial@mcc.ac.uk (A.V. Le Blanc) (09/27/90)

>> Couldn't you get the guy to retransmit it?

Perhaps the original submission came from a site which, like mine,
truncates blank characters from the ends of all lines received.
I have never yet received a UUencoded file in the mail or by
ordinary FTP which I have been able to decode without preprocessing,
and that usually produces error messages anyway.

The code used for UUencoding was simply badly chosen, and some
of us hope it will be changed.

				Yours,
				A. V. Le Blanc
				ZLSIIAL@UK.AC.MCC.CMS

dylan@ibmpcug.co.uk (Matthew Farwell) (09/27/90)

In article <4769@taux01.nsc.com> avi@taux01.UUCP (Avi Bloch) writes:
>This doesn't always work, since an empty line is also a line 'which does not
>begin with a capital M' and that search pattern will only find lines with at
>least 1 character in them.
>
>I don't know of any way that will also catch empty lines.

Offhand, I can't think of anything that will do this.  The only thing
that comes close is

:v/^M/#

which prints out lines which don't have an M at the beginning (including
empty lines), with their respective line numbers.  You can then use the
G command to go to those line numbers.  Bit of a bodge really.
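
The difference between the positive pattern and the inverted match can be
shown with a small Python sketch. It assumes lines are held without their
newlines; the function names and the sample data are invented for the
illustration.

```python
import re


def caret_class_hits(lines):
    """Line numbers matched by the positive pattern ^[^M]."""
    return [n for n, text in enumerate(lines, start=1)
            if re.match(r"[^M]", text)]


def inverted_hits(lines):
    """Line numbers of lines that do NOT start with M, like :v/^M/#."""
    return [n for n, text in enumerate(lines, start=1)
            if not text.startswith("M")]


sample = ["Mfirst", "", "oops", "Mlast"]
```

On this sample the positive pattern reports only line 3, while the inverted
match reports lines 2 and 3, so only the latter catches the empty line.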

Dylan.
-- 
Matthew J Farwell                 | Email: dylan@ibmpcug.co.uk
The IBM PC User Group, PO Box 360,|        ...!uunet!ukc!ibmpcug!dylan
Harrow HA1 4LQ England            | CONNECT - Usenet Access in the UK!!
Phone: +44 81-863-1191            | Sun? Don't they make coffee machines?

new@ee.udel.edu (Darren New) (09/29/90)

In article <1744@m1.cs.man.ac.uk> zlsiial@cms.mcc.ac.uk (A.V. Le Blanc) writes:
>The code used for UUencoding was simply badly chosen, and some
>of us hope it will be changed.

Hmmmm.... I think it has already been changed.  My version uses graves instead
of spaces.  Another alternative would be to use "abe", a clearly superior
choice.             -- Darren
-- 
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---
----- Network Protocols, Graphics, Programming Languages, 
      Formal Description Techniques (esp. Estelle), Coffee -----

dylan@ibmpcug.co.uk (Matthew Farwell) (09/29/90)

In article <1744@m1.cs.man.ac.uk> zlsiial@cms.mcc.ac.uk (A.V. Le Blanc) writes:
>>> Couldn't you get the guy to retransmit it?
>Perhaps the original submission came from a site which, like mine,
>truncates blank characters from the ends of all lines received.
>I have never yet received a UUencoded file in the mail or by
>ordinary FTP which I have been able to decode without preprocessing,
>and that usually produces error messages anyway.

UUencode shouldn't even have blanks in it. (see below)

>The code used for UUencoding was simply badly chosen, and some
>of us hope it will be changed.

Tell me if I'm wrong, but this was my understanding of the uuencode
algorithm.

Three 8-bit bytes are taken from the file.  These are mapped onto four 8-bit
bytes, each holding 6 bits of data, with the top 2 bits of each padded with 0s.

ie:

11110110 11001001 10101001 gets converted to:

  111101   101100   100110   101001 which then gets mapped to:

00111101 00101100 00100110 00101001

You then have ASCII characters with values ranging between 0 and 63.  In
ASCII, this still includes control characters and spaces, so 33 is added
to these values.  You now have a file of characters which range between
33 and 96.  In ASCII, these characters are all printable + therefore can
be mailed without mailers barfing.  The file is then split into lines
about 60 characters long, and each line has a 'M' added to the front to
stop mailers taking any ~'s as commands to the mailer.

Main point of all this.  UUencode adds 33 (not as would be expected 32)
so as not to include the complications which come with having spaces in
a file, which as seems to have happened in your case might get stripped
from the end of a line.  UUencode was designed to be as simple + robust
as possible, for transferring files between systems.  If you have a
uuencode which includes spaces, change it.
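
For comparison, here is a minimal Python sketch of how one line is encoded
in the traditional uuencode format as it is commonly documented. It differs
from the description above in two respects: the documented offset is 32
rather than 33 (so a zero value comes out as a space, which many later
encoders replace with a backquote), and the leading M is the encoded byte
count of a full 45-byte line (32 + 45 = 77, ASCII 'M') rather than a guard
character. The function name uu_encode_line and the zero_char parameter are
made up for this illustration.

```python
def uu_encode_line(chunk: bytes, zero_char: str = "`") -> str:
    """Encode up to 45 bytes as one uuencoded line (no trailing newline)."""
    if len(chunk) > 45:
        raise ValueError("a uuencoded line carries at most 45 bytes")

    def enc(value: int) -> str:
        # Each 6-bit value is offset by 32.  Zero would become a space,
        # so zero_char (a backquote by default here) is emitted instead.
        return zero_char if value == 0 else chr(32 + value)

    out = [enc(len(chunk))]  # leading character encodes the byte count
    padded = chunk + b"\0" * (-len(chunk) % 3)  # pad to a multiple of 3
    for i in range(0, len(padded), 3):
        b0, b1, b2 = padded[i:i + 3]
        out.append(enc(b0 >> 2))
        out.append(enc(((b0 & 0x03) << 4) | (b1 >> 4)))
        out.append(enc(((b1 & 0x0F) << 2) | (b2 >> 6)))
        out.append(enc(b2 & 0x3F))
    return "".join(out)
```

A full 45-byte chunk yields 1 + 60 = 61 characters, which is where the line
length discussed in this thread comes from. Passing zero_char=" " gives the
older behaviour, whose trailing spaces are exactly what gets stripped in
transit.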

Dylan.
-- 
Matthew J Farwell                 | Email: dylan@ibmpcug.co.uk
The IBM PC User Group, PO Box 360,|        ...!uunet!ukc!ibmpcug!dylan
Harrow HA1 4LQ England            | CONNECT - Usenet Access in the UK!!
Phone: +44 81-863-1191            | Sun? Don't they make coffee machines?

ted@isgtec.uucp (Ted Richards) (10/02/90)

In article <1990Sep25.135441.10647@ibmpcug.co.uk> dylan@ibmpcug.CO.UK (Matthew Farwell) writes:
: In article <lee.654252627@chsun1> lee@chsun1.uchicago.edu (dwight lee) writes:
: >Most uuencoded lines are 61 characters long.  I'd like to be able to search
: >for a line that isn't 61 characters long.  Any reasonably automated method
: >will do!  Regexps, macros... anything.
 
: I don't think there's any easy way to actually do this within vi. If you
: want to do this, use something like awk or perl.

How about

  v/^.............................................................$/<cmd>

This executes <cmd> for every line that does not contain exactly 61
characters (provided you type in exactly 61 dots :-).  <cmd> could,
for example, put a mark at the beginning or end of every line that you
want to edit individually.
--
Ted Richards          ...uunet!utai!lsuc!isgtec!ted         ted@isgtec.UUCP
ISG Technologies Inc.   3030 Orlando Dr. Mississauga  Ont.  Canada   L4V 1S8