[comp.editors] A vi question

patkar@amide.ecn.purdue.edu (The Silent Dodo) (08/13/90)

I need to do the following in vi.

I have a file containing multiple occurrences of a single letter.
A sample line is 

ccccoooossss

I need to replace each quadruple occurrence with a single letter.
Thus the above line should become

cos

I tried something like

:1,$s/\(a-zA-Z)\)\1\1\1/\1/g

Vi does not allow \1 in the pattern to be searched.
Any pointers will be appreciated. 


 -- Anant
 (patkar@cn.ecn.purdue.edu)

meyering@cs.utexas.edu (Jim Meyering) (08/13/90)

In article <1990Aug12.194738.7902@ecn.purdue.edu> patkar@amide.ecn.purdue.edu (The Silent Dodo) writes:
    > I need to do the following in vi.
[ Change text like "ccccoooossss" to "cos" ]
    > :1,$s/\(a-zA-Z)\)\1\1\1/\1/g
[ doesn't work ]

If your "in vi" requirement is strict, I'm not
sure I can help.  Otherwise, `sed' can do it.

From within vi,
:%!sed 's/\([a-zA-Z]\)\1\1\1/\1/g'
works for me.

(It looks like maybe you had an extra right
paren and forgot the square brackets)
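
For anyone who wants to check the substitution outside the editor first,
here is a small sketch run from the shell.  The function name
collapse_quads is made up for illustration, and it assumes a sed that
accepts back-references in the search pattern, as the command above does.

```shell
# collapse_quads: hypothetical helper name, not part of any tool.
# Replaces each run of four identical letters with a single copy.
collapse_quads() {
    sed 's/\([a-zA-Z]\)\1\1\1/\1/g'
}

echo 'ccccoooossss' | collapse_quads    # prints: cos
echo 'loop accent'  | collapse_quads    # prints: loop accent (doubles untouched)
```

Ordinary doubled letters are left alone because the pattern only matches
four in a row.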

Regards,
-- 
Jim Meyering          meyering@cs.utexas.edu

was@hp-lsd.COS.HP.COM (Bill Stubblebine) (08/14/90)

patkar@amide.ecn.purdue.edu (The Silent Dodo)

> I need to do the following in vi.
> ccccoooossss -> cos

:%s/cc*/c/g
:%s/oo*/o/g
:%s/ss*/s/g

grahj@gagme.chi.il.us (jim graham) (08/14/90)

> I need to do the following in vi.
>[ Change text like "ccccoooossss" to "cos" ]

If you are trying to get rid of the attempt at BOLD text that nroff uses
(i.e., backspacing and re-typing the same character over and over again),
there is a better solution....  There is a program called ``pep'' that
resides on a number of the machines that archive sources for UN*X, that
will filter out all of this nonsense for you.  If the original file is called
foo.txt, you would do something like this:

   pep < foo.txt > foo.new
   mv foo.new foo.txt       (if you like foo.new....)  

NOTE:  I include the part about ``if you like foo.new'' because I'm not sure
       if you need to use just ``pep'' or ``pep -b''.  I know one of those
       works perfectly (IF THIS IS WHAT YOU HAD IN MIND).

Hope this helps.....
Jim Graham

patkar@ecn.purdue.edu (Anant Y Patkar) (08/15/90)

In article <401@taumet.com>, steve@taumet.com (Stephen Clamage) writes:
|> |:%s/cc*/c/g
|> |:%s/oo*/o/g
|> |:%s/ss*/s/g
|> 
|> This then also changes
|> 	'accent' to 'acent'
|> 	'loop' to 'lop'
|> 	'ass' to 'as'.
|> I wonder if that is what the original poster had in mind.

  Just to clear up the confusion.  What I needed was to change
  quadruple occurrences of any letter to a single occurrence.
  That means if I use individual substitute commands for each
  letter, I am stuck with 52 commands (some of my files need
  even more, to take care of other characters).

  Also, it should change only quadruple occurrences.  I don't want
  any spelling mistakes in the resulting file.  The solutions I
  received by mail indicate that (possibly) there is no way to
  do this using the "ex" commands.  You have to invoke some UNIX
  command like 'sed' from inside vi to do this.

  Hope this helps!

  -- Anant
 (patkar@cn.ecn.purdue.edu)

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) (08/16/90)

In article <14420003@hp-lsd.COS.HP.COM> was@hp-lsd.COS.HP.COM (Bill Stubblebine) writes:
>patkar@amide.ecn.purdue.edu (The Silent Dodo)
>
>> I need to do the following in vi.
>> ccccoooossss -> cos
>
>:%s/cc*/c/g
>:%s/oo*/o/g
>:%s/ss*/s/g


Why not: :%s/cc*oo*ss*/cos/g ?  Am I missing something?

steve@taumet.com (Stephen Clamage) (08/16/90)

was@hp-lsd.COS.HP.COM (Bill Stubblebine) writes:

|> I need to do the following in vi.
|> ccccoooossss -> cos

|:%s/cc*/c/g
|:%s/oo*/o/g
|:%s/ss*/s/g

This then also changes
	'accent' to 'acent'
	'loop' to 'lop'
	'ass' to 'as'.
I wonder if that is what the original poster had in mind.
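
The effect is easy to reproduce from the shell.  This is only a sketch:
sed stands in for the three ex commands (the patterns mean the same thing
in both), and the name squeeze_runs is invented for the example.

```shell
# squeeze_runs: hypothetical name; applies the three run-squeezing
# substitutions quoted above, one after another.
squeeze_runs() {
    sed -e 's/cc*/c/g' -e 's/oo*/o/g' -e 's/ss*/s/g'
}

echo 'ccccoooossss'    | squeeze_runs   # prints: cos
echo 'accent loop ass' | squeeze_runs   # prints: acent lop as
```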
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

egt@hprnd.HP.COM (Eric Tausheck) (08/17/90)

> > I need to do the following in vi.
> > ccccoooossss -> cos
> 
> :%s/cc*/c/g
> :%s/oo*/o/g
> :%s/ss*/s/g
> ----------

   Wouldn't one of the following be more accurate?
:%s/cc*oo*ss*/cos/g
:%s/ccccoooossss/cos/g

   I had the need to do this kind of editing too until I learned about
   col -b.

   If what you started with had a bunch of back-spaces between the
   letters too (as in an nroff or man output), the filter "col -b" does a
   nice job of getting rid of all this junk for you.
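
   A minimal sketch of what that looks like, assuming col(1) is installed
   (the wrapper name strip_overstrikes is made up, and the sample input is
   hand-built rather than real nroff output):

```shell
# strip_overstrikes: hypothetical wrapper around "col -b".
strip_overstrikes() {
    col -b
}

if command -v col >/dev/null 2>&1; then
    # "cos" in nroff-style bold: each letter struck four times,
    # with a backspace (\b) between strikes.
    printf 'c\bc\bc\bco\bo\bo\bos\bs\bs\bs\n' | strip_overstrikes   # prints: cos
fi
```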

grahj@gagme.chi.il.us (jim graham) (08/17/90)

In article <14420003@hp-lsd.COS.HP.COM> was@hp-lsd.COS.HP.COM (Bill Stubblebine) writes:
>patkar@amide.ecn.purdue.edu (The Silent Dodo)
>
>> I need to do the following in vi.
>> ccccoooossss -> cos
>
>:%s/cc*/c/g
>:%s/oo*/o/g
>:%s/ss*/s/g

First off, if the sources I sent patkar@..... work, this is a dead issue.
The intent was to get rid of the overstriking done by nroff (etc) for bold
type.

Second, and here's where I may simply have more to learn about vi (which is
the case all the time.....), with your search/replace commands above, what
happens with strings such as "success", "pool", and so on?  I guess the real
question should be, does the "*" tell vi that you're talking about strings
of more than 2 of that character?  

Thanks,
Jim Graham

PS:  RTFM isn't a valid answer, since I didn't find the answer there.....
     at least, not in the manuals I have.

afsipmh@cid.aes.doe.CA (Patrick Hertel) (08/20/90)

In article <3440002@hprnd.HP.COM> egt@hprnd.HP.COM (Eric Tausheck) writes:
>> > I need to do the following in vi.
>> > ccccoooossss -> cos
>> 
>> :%s/cc*/c/g
>> :%s/oo*/o/g
>> :%s/ss*/s/g
>> ----------
>
>   Wouldn't one of the following be more accurate?
>:%s/cc*oo*ss*/cos/g
>:%s/ccccoooossss/cos/g
>
 I have seen a lot of answers of this type in the last few days, but it
seems obvious, to me at any rate, that the question was asking for a more
general solution, i.e., he wants to know how to edit out the quadrupling
of letters, not just the specific case of ccccoooossss.  I am not
experienced enough with vi to supply an answer, but so far I haven't seen it.

-- 
Pat Hertel                 Canadian Meteorological Centre
Analyst/Programmer         2121 N. Service Rd.
phertel@cmc.aes.doe.ca     Dorval,Quebec
Environment Canada         CANADA           H9P1J3

was@hp-lsd.COS.HP.COM (Bill Stubblebine) (08/20/90)

> First off, if the sources I sent patkar@.....  work, this is a dead
> issue.  The intent was to get rid of the overstriking done by nroff
> (etc) for bold type.

	It would have helped if you had said initially that these were
	nroff overstrikes.  In this case, it's easier to cure the problem
	than to cure the symptoms.  Try:

		... | nroff [options] | col -b

     NAME
	  col - filter reverse line-feeds and backspaces

     SYNOPSIS
	  col [ -blfxp ]

     DESCRIPTION

	  (stuff deleted)

	  If the -b option is given, col assumes that the output device in
	  use is not capable of backspacing.  In this case, if two or more
	  characters are to appear in the same place, only the last one
	  read will be output.

	  (more stuff deleted)


> Second, and here's where I may simply have more to learn about vi (which
> is the case all the time.....), with your search/replace commands above,
> what happens with strings such as "success", "pool", and so on?

	Well, you have a good point here.  No editor I know of (other than
	a guy with a green eye shade and a red pencil) knows that two 'o's
	belong in 'loop', but only one belongs in 'lop'.

> I guess the real question should be, does the "*" tell vi that you're
> talking about strings of more than 2 of that character?

	The '*' in a regular expression (RE) means 0 or more occurrences of
	the character preceding the '*'.  Thus, the RE 'cc*' means one or
	more adjacent instances of the character 'c'.
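
	The difference between 'c*' and 'cc*' can be seen with a few
	one-liners.  This is just a sketch, using sed from the shell as a
	stand-in for the editor's substitute command:

```shell
# c* may match zero c's, so "bc*" matches a bare "b" as well as "bcc".
echo 'abd'   | sed 's/bc*/X/'    # prints: aXd
echo 'abccd' | sed 's/bc*/X/'    # prints: aXd

# cc* needs at least one c, so a line with no c is left alone.
echo 'abd'   | sed 's/cc*/X/'    # prints: abd

# Any run of ONE or more is squeezed, so ordinary doubles are hit too.
echo 'success pool' | sed -e 's/cc*/c/g' -e 's/oo*/o/g' -e 's/ss*/s/g'
# prints: suces pol
```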

If you cannot locate col(1) at your site, try this shell script, colorfully
known as a 'here document':

ex << !
r file
%s/aaaa/a/g
%s/bbbb/b/g
...
%s/zzzz/z/g
%s/AAAA/A/g
%s/BBBB/B/g
...
%s/ZZZZ/Z/g
(...etc...)
w
q
!
                                Bill Stubblebine
                                Hewlett-Packard Logic Systems Div.
                                8245 N. Union Blvd.
                                Colorado Springs, CO  80920
                                was@hp-lsd.hp.com  (Internet)
                                (719) 590-5568
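
Rather than typing all 52 substitute lines by hand, they could be
generated.  The following is a sketch only: gen_quad_cmds is a made-up
name, it covers just the ASCII letters, and the idea of piping the result
into ex (last comment) is an untested assumption about your ex.

```shell
# gen_quad_cmds: hypothetical helper.  Prints one ex substitute command
# per ASCII letter, i.e. the lines abbreviated with "..." in the script.
gen_quad_cmds() {
    for c in a b c d e f g h i j k l m n o p q r s t u v w x y z \
             A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    do
        # %% prints a literal "%"; the letter is repeated four times.
        printf '%%s/%s%s%s%s/%s/g\n' "$c" "$c" "$c" "$c" "$c"
    done
}

gen_quad_cmds | head -2
# %s/aaaa/a/g
# %s/bbbb/b/g

# Assumption: something like  { gen_quad_cmds; echo w; echo q; } | ex - file
# would apply them, if your ex reads its commands from standard input.
```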

reddy@lion.austin.ibm.com (/150000;Austin) (08/20/90)

Hi,

Is there a quick way to make a range of lines in a file the same length?
In other words, I want the lines in question to contain the same number of
characters (padded with blanks at the end if necessary).
(awk, sed or shell script solutions are fine too.)

Satish

fyl@ssc.UUCP (Phil Hughes) (08/21/90)

In article <1990Aug16.130232@ecn.purdue.edu>, patkar@ecn.purdue.edu (Anant Y Patkar) writes:

>   Just to clear up the confusion.  What I needed was to change
>   quadruple occurrences of any letter to a single occurrence.
>   That means if I use individual substitute commands for each
>   letter, I am stuck with 52 commands (some files I have need
>   more commands to take care of some other characters).

>                                                 The solutions I 
>   received by mail indicate that (possibly) there is no way to
>   do this using the "ex" commands.  You have to invoke some UNIX
>   commands like 'sed' from inside vi to do this.

Here is my solution, which worked fine in ed but does not work in vi
(which I don't understand).  I expect that ex would be happy with it
as well, but I didn't try it.  Anyway, the idea is to look for any
character followed by 3 more of the same character and replace all
four with a single copy of that character.

%s/\(.\)\1\1\1/\1/g

(And to think I avoided the "remember this" constructs for years. :-)
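
A few cases worth trying before turning this loose on a real file.  This
is a sketch using sed as a stand-in for ed (collapse_any_quads is a
made-up name):

```shell
# collapse_any_quads: hypothetical helper; "." matches ANY character,
# not just letters.
collapse_any_quads() {
    sed 's/\(.\)\1\1\1/\1/g'
}

echo 'ccccoooossss' | collapse_any_quads   # prints: cos
echo '1111----'     | collapse_any_quads   # prints: 1-
echo 'ccccc'        | collapse_any_quads   # prints: cc   (four collapse, the fifth stays)
echo 'a    b'       | collapse_any_quads   # prints: a b  (four spaces become one)
```

The last case is the one to watch: indentation made of spaces gets
squeezed along with everything else.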

-- 
Phil Hughes, SSC, Inc. P.O. Box 55549, Seattle, WA 98155  (206)FOR-UNIX
     uunet!pilchuck!ssc!fyl or attmail!ssc!fyl            (206)527-3385

gwc@root.co.uk (Geoff Clare) (08/22/90)

Somebody asked:

>>> > I need to do the following in vi.
>>> > ccccoooossss -> cos

afsipmh@cid.aes.doe.CA (Patrick Hertel) wrote:

> I have seen a lot of answers of this type in the last few days but it
>seems obvious, to me at any rate , that the question referred to a more 
>general solution. i.e. he wants to know how to edit out the quadrupling
>of letters not just the specific case of ccccoooossss. I am not 
>experienced enough with vi to supply an answer but so far I haven't seen it.

That's also how I read the original question.  My first attempt was
to try this:

    :s/\(.\)\1\1\1/\1/g

which works in "ed", but for some reason it didn't work in "vi".  So the
next best thing (if you really must do it from within vi) is to use '!' to
filter the relevant text through "sed".  E.g. for the whole file:

    1G!Gsed 's/\(.\)\1\1\1/\1/g'

or, if the backspaces from nroff emboldening are still there:

    1G!Gsed 's/\(.\)^H\1^H\1^H\1/\1/g'

(where ^H is entered using CTRL-V CTRL-H).
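
In a shell script the backspace can be produced with printf instead of
being typed.  A sketch, with made-up names (bs, unbold) and a hand-built
sample line rather than real nroff output:

```shell
# bs holds one literal backspace character (what ^H stands for above).
bs=$(printf '\b')

# unbold: hypothetical helper; collapses a letter struck four times
# (letter, backspace, letter, ...) into a single letter.
unbold() {
    sed "s/\(.\)$bs\1$bs\1$bs\1/\1/g"
}

printf 'c\bc\bc\bco\bo\bo\bos\bs\bs\bs\n' | unbold   # prints: cos
```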
-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, Hayne Street, London EC1A 9HH, England.   Tel: +44-71-315-6600

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) (08/23/90)

In article <2394@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
<Somebody asked:
<
<>>> > I need to do the following in vi.
<>>> > ccccoooossss -> cos
<
<[ stuff deleted about how to use sed to deal with it ]
<
<or, if the backspaces from nroff emboldening are still there:
<
<    1G!Gsed 's/\(.\)^H\1^H\1^H\1/\1/g'
<
<(where ^H is entered using CTRL-V CTRL-H).
This can be done in plain vi by typing:

	%s/^H.//g

Since the backspace sequence is used only to reprint the preceding character,
this command removes all the duplications.
-- 
Dolf Grunbauer      Tel: +31 55 433233 Internet dolf@idca.tds.philips.nl
Philips Information Systems            UUCP     ...!mcsun!philapd!dolf
Some kind of happiness is measured out in miles

gbastin@x102c.harris-atd.com (Gary Bastin 60293) (08/23/90)

In article <1117@ssp11.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
>In article <2394@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
><Somebody asked:
><
><>>> > I need to do the following in vi.
><>>> > ccccoooossss -> cos
><
>This can be done in plain vi by typing:
>
>	%s/^H.//g
>
>As the backspace sequence is used only to reprint the preceding character and
>with this command all duplications are removed.

Nope.  Afraid this one won't work, either.  It DOES clean up all the
occurrences of "^H."   But there is a subtle problem.  For example, on
my system, the man pages for csh when piped to a file create lots of
"^H." occurrences, which this does clean.  This works fine for overstrikes
used to just embolden particular words.  But also generated are lines
of the form _^HC_^Hs_^Hh,  which generates "Csh" with an underline.  Using
the global replacement of "^H." with nothing completely obliterates the stuff
that uses the overstrike for really making underlined words, leaving just
the underline characters with nothing above them!  Which is
not what is desired, I am sure.  (Not a flame, just a subtle point!)

Gary Bastin              /-/-/      Internet: gbastin@x102c.ess.harris.com
Mail Stop 102-4826         |        phone: (407) 729-3045
Harris Corporation GASD    |        packet: WB4YAF @ N4JLR.FL.USA.NA   
P.O.B. 94000, Melbourne FL 32902    Speaking from, but not for, Harris!

new@ee.udel.edu (Darren New) (08/23/90)

>>	%s/^H.//g
>>As the backspace sequence is used only to reprint the preceding character and
>>with this command all duplications are removed.

Much better is %s/.^H//g which throws out the character *before* the backspace
in case it is an underline.   You know this is right because the backspace
when displayed overwrites the *previous* character, not the next. -- Darren
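
The two orderings can be compared side by side from the shell.  A sketch
only: sed stands in for the vi substitute, the names drop_after and
drop_before are invented, and the input is hand-built.

```shell
bs=$(printf '\b')                      # one literal backspace (^H)

drop_after()  { sed "s/$bs.//g"; }     # the  %s/^H.//g  idea
drop_before() { sed "s/.$bs//g"; }     # the  %s/.^H//g  idea

# Underlined "Csh": underscore, backspace, letter.
printf '_\bC_\bs_\bh\n' | drop_after    # prints: ___
printf '_\bC_\bs_\bh\n' | drop_before   # prints: Csh

# Bold "c": the letter struck four times.
printf 'c\bc\bc\bc\n'   | drop_before   # prints: c
```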
-- 
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---

peter@ficc.ferranti.com (Peter da Silva) (08/24/90)

In article <1117@ssp11.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
> 	%s/^H.//g

This will be fun if you're underlining. How about:

	%s/.^H//g

Which more accurately emulates what actually happens, anyway.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

egt@hprnd.HP.COM (Eric Tausheck) (08/24/90)

>
>   I had the need to do this kind of editing too until I learned about
>   col -b.
>
>   If what you started with had a bunch of back-spaces between the
>   letters too (as in an nroff or man output), the filter "col -b" does a
>   nice job of getting rid of all this junk for you.

I should have added that before I discovered the "col -b" filter, I used to
run the following vi command on nroff output:

	:%s/.^H//g

(where the "^H" is entered with <ctrl>V<ctrl>H)



On a more interesting note...
>
>    :s/\(.\)\1\1\1/\1/g
>
>which works in "ed", but for some reason it didn't work in "vi".  So the
>next best thing (if you really must do it from within vi) is to use '!' to
>filter the relevant text through "sed".  E.g. for the whole file:
>
>    1G!Gsed 's/\(.\)\1\1\1/\1/g'

This is a very powerful feature (of ed and sed).  Thanks for sharing this
gem regarding sed/ed.

A shame vi/ex doesn't implement it because it would (almost?) give vi/ex
true complete regular expression matching capabilities.

I'm very rusty on my formal language theory so for all I know this
(ability to self reference to patterns during regular expression
matching) may have given vi/ex full regular expression matching
capabilities - anyone out there care to speculate?
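
One data point on the theory: back-references step outside regular
expressions in the formal sense rather than completing them.  The set of
"doubled" strings (some string followed by an exact copy of itself) is a
textbook example of a language no finite automaton can recognize, yet one
back-reference matches it.  A small sketch using grep, with the made-up
helper name is_doubled:

```shell
# is_doubled: hypothetical helper; succeeds if the argument is some
# string followed by an exact copy of itself.
is_doubled() {
    printf '%s\n' "$1" | grep -q '^\(.*\)\1$'
}

is_doubled 'abcabc' && echo 'abcabc: doubled'        # prints: abcabc: doubled
is_doubled 'abcabd' || echo 'abcabd: not doubled'    # prints: abcabd: not doubled
```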

   Eric Tausheck			email: egt@hprnd.hp.com
   Hewlett Packard Co.

roland@ai.mit.edu (Roland McGrath) (08/24/90)

No, don't replace ^H. with nothing; replace .^H with nothing to get rid of
overstriking and underlining.
--
	Roland McGrath
	Free Software Foundation, Inc.
roland@ai.mit.edu, uunet!ai.mit.edu!roland