cadman@cbnewsm.att.com (jerome.schwartz) (03/29/91)
How would I do a simple edit command, vi map or sed script, to
replace all occurrences of repeated characters in a file with one
of each of the characters.

ie:  This:    AAAAPPPPLLLLEEEE ppppiiiieeee
     to this: APPLE pie

Thanks in advance,
Jerry

**********************************************************************
Jerome Schwartz           | Standard Disclaimer:
AT&T Bell Laboratories    |
Crawfords Corner Rd.      | Views expressed are my own and
Holmdel, N.J. 07733       | do not necessarily reflect those
Phone : (201)-949-8560    | of my employer.
**********************************************************************
FFAAC09@cc1.kuleuven.ac.be (Nicole Delbecque & Paul Bijnens) (03/29/91)
In article <1991Mar28.181335.7813@cbnewsm.att.com>, cadman@cbnewsm.att.com (jerome.schwartz) says:
>How would I do a simple edit command, vi map or sed script, to
>replace all occurrences of repeated characters in a file with one
>of each of the characters.
>
>ie: This: AAAAPPPPLLLLEEEE ppppiiiieeee
>    to this: APPLE pie

try (on SYSV)

    tr -s '[\001-\377]' '[\001-\377]' < input > output

giving:

    APLE pie
     ^
     |
     one letter P!

This program does not know how to spell the words!

--
Polleke (Paul Bijnens)
Linguistics dept., K. University Leuven, Belgium
FFAAC09@cc1.kuleuven.ac.be
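The tr suggestion above can be tried with a small shell sketch. The function name squeeze is made up for illustration, LC_ALL=C is an assumption added here so that the octal range is read as plain bytes, and the one-argument form of tr -s is used in place of the SYSV bracket form quoted in the post.

```shell
#!/bin/sh
# Sketch only: squeeze every run of a repeated byte down to one byte.
# "squeeze" is a hypothetical helper name, not something from the thread.
squeeze() {
    # LC_ALL=C (an assumption) makes tr treat \001-\377 as raw byte values.
    LC_ALL=C tr -s '\001-\377'
}

# The poster's original example: one run of four P's, so a single P survives.
printf '%s\n' 'AAAAPPPPLLLLEEEE ppppiiiieeee' | squeeze
# prints: APLE pie
```

Because every byte is in the set, this also collapses runs of spaces and runs of blank lines, which may or may not be wanted.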
cadman@cbnewsm.att.com (jerome.schwartz) (03/29/91)
In article <1991Mar28.181335.7813@cbnewsm.att.com>, cadman@cbnewsm.att.com (jerome.schwartz) writes:
>
> How would I do a simple edit command, vi map or sed script, to
> replace all occurrences of repeated characters in a file with one
> of each of the characters.
>
> ie: This: AAAAPPPPLLLLEEEE ppppiiiieeee
>     to this: APPLE pie
>
> Thanks in advance,
> Jerry

The example should have been:

AAAAPPPPPPPPLLLLEEEE ppppiiiieeee

With 2 sets of 4 P's.

Still with no success,
Jerry
ruhtra@turing.toronto.edu (Arthur Tateishi) (03/30/91)
In article <1991Mar28.181335.7813@cbnewsm.att.com> cadman@cbnewsm.att.com (jerome.schwartz) writes:
>
>How would I do a simple edit command, vi map or sed script, to
>replace all occurrences of repeated characters in a file with one
>of each of the characters.
>
>ie: This: AAAAPPPPLLLLEEEE ppppiiiieeee
>    to this: APPLE pie

If your example was a typo and should have been: APLE pie
then the following reg-exp will work.

:s/\(.\)\1*/\1/g

--
Red Alert.
        -- Q, "Deja Q", stardate 43539.1
Arthur Tateishi                 g9ruhtra@zero.cdf.utoronto.edu
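For anyone who wants to try that substitution outside vi, the same pattern can be handed to sed. This is a sketch, with squeeze_runs as an invented function name. In vi, :s acts on the current line only, so :%s would be needed to cover a whole file.

```shell
#!/bin/sh
# Sketch only: the vi substitution above, run through sed instead.
# "squeeze_runs" is a hypothetical helper name.
squeeze_runs() {
    # \(.\) captures one character; \1* swallows any further copies of it.
    sed 's/\(.\)\1*/\1/g'
}

# With the corrected input (eight P's) the whole run is still one match,
# so the result keeps only a single P.
printf '%s\n' 'AAAAPPPPPPPPLLLLEEEE ppppiiiieeee' | squeeze_runs
# prints: APLE pie
```

The pattern collapses a run of any length, which is why it suits the "APLE" reading of the question and not the corrected one.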
FFAAC09@cc1.kuleuven.ac.be (Nicole Delbecque & Paul Bijnens) (03/30/91)
In article <1991Mar29.145724.5569@cbnewsm.att.com>, cadman@cbnewsm.att.com (jerome.schwartz) says:
>
>In article <1991Mar28.181335.7813@cbnewsm.att.com>, cadman@cbnewsm.att.com
>(jerome.schwartz) writes:
>>
>> How would I do a simple edit command, vi map or sed script, to
>> replace all occurrences of repeated characters in a file with one
>> of each of the characters.
>>
>> ie: This: AAAAPPPPLLLLEEEE ppppiiiieeee
>>     to this: APPLE pie
>>
>> Thanks in advance,
>> Jerry
>
>The example should have been:
>
>AAAAPPPPPPPPLLLLEEEE ppppiiiieeee
>
>With 2 sets of 4 P's.
>
>Still with no success,

The slightly changed problem has the following solution:

    sed -e 's/\(.\)\1\1\1/\1/g' inputfile > outfile

This changes 4 repeats of the same letter to 1 letter.
It leaves all the other letters (the space in your example) as is.

--
Polleke (Paul Bijnens)
Linguistics dept., K. University Leuven, Belgium
FFAAC09@cc1.kuleuven.ac.be
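A quick way to check the four-repeat sed command on the corrected example is sketched below. The function name collapse_fours and the second test word are made up for illustration.

```shell
#!/bin/sh
# Sketch only: collapse each group of exactly four identical characters.
# "collapse_fours" is a hypothetical helper name.
collapse_fours() {
    # \(.\)\1\1\1 matches one character followed by three copies of itself.
    sed -e 's/\(.\)\1\1\1/\1/g'
}

# Corrected example: the eight P's are two groups of four, so two P's remain.
printf '%s\n' 'AAAAPPPPPPPPLLLLEEEE ppppiiiieeee' | collapse_fours
# prints: APPLE pie

# A genuine double letter (eight o's here) survives as a double.
printf '%s%s%s\n' 'bbbb' 'oooooooo' 'kkkk' | collapse_fours
# prints: book
```

A character that appears only once, such as the space, does not match and passes through unchanged.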
noraa@cbnewsk.att.com (aaron.l.hoffmeyer) (04/01/91)
In article <1991Mar30.024851.24414@jarvis.csri.toronto.edu> ruhtra@turing.toronto.edu (Arthur Tateishi) writes:
>In article <1991Mar28.181335.7813@cbnewsm.att.com> cadman@cbnewsm.att.com (jerome.schwartz) writes:
>>
>>How would I do a simple edit command, vi map or sed script, to
>>replace all occurrences of repeated characters in a file with one
>>of each of the characters.
>>
>>ie: This: AAAAPPPPLLLLEEEE ppppiiiieeee
>>    to this: APPLE pie
>
>If your example was a typo and should have been: APLE pie
>then the following reg-exp will work.
>:s/\(.\)\1*/\1/g
>
>--
>Red Alert.
>        -- Q, "Deja Q", stardate 43539.1
>Arthur Tateishi                 g9ruhtra@zero.cdf.utoronto.edu

I've seen the original question asked several times in this newsgroup
in just the last six months. Yes, there are many solutions to this
problem, using tr, sed, reg exps, awk, perl etc. etc. etc. But some of
the responses and even the people asking the question ignore the
situation that creates the problem. The ONLY time I have ever seen 4
characters in place of one character is when someone directs nroff
output to a file, then searches for the literal backspace characters
(and underscores, if present) and replaces them with nothing. nroff
uses the trick of overstriking to embolden words (or underline them).

So, the simplest solution to this problem is to not create it in the
first place. If you filter the nroff output through col (which is a
standard command in UNIX, I think - maybe it is just System V) with the
-b option, then you get plain ASCII output that does not have
backspaces (or underscores and backspaces) and multiple occurrences of
characters.

I can't recall if there is an "Often Asked Questions" posting in this
newsgroup, but if there is, I am sure this solution is in there. If
there isn't such a posting for this group, maybe there should be and
this question should be included.

Aaron L. Hoffmeyer
TR@CBNEA.ATT.COM
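The overstrike effect described above can be reproduced by hand. The sketch below assumes bold text is written as a character struck four times with a backspace between strikes, as the post describes. The sample input is built with printf and does not come from real nroff output, and both function names are invented. The sed filter only approximates what col -b does for this simple case; col -b remains the tool the post recommends.

```shell
#!/bin/sh
# Sketch only: why deleting backspaces yields "ppppiiiieeee", and how
# dropping each overstruck character together with its backspace avoids it.
bs=$(printf '\b')   # a literal backspace character

# The mistake described in the post: remove only the backspaces.
strip_backspaces_only() {
    tr -d '\b'
}

# Rough stand-in for "col -b" (an approximation, not the real tool):
# delete every character that is immediately followed by a backspace.
strip_overstrikes() {
    sed "s/.$bs//g"
}

# Hand-built "bold" pie: each letter struck four times.
printf 'p\bp\bp\bpi\bi\bi\bie\be\be\be\n' | strip_backspaces_only
# prints: ppppiiiieeee

printf 'p\bp\bp\bpi\bi\bi\bie\be\be\be\n' | strip_overstrikes
# prints: pie

# Underlining (underscore, backspace, letter) comes out clean as well.
printf '_\bp_\bi_\be\n' | strip_overstrikes
# prints: pie
```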