[comp.unix.questions] unwanted control characters in downloads

ldstern@rodan.acs.syr.edu (Larry Stern) (04/10/91)

To all: when I download certain files from a Sun/4 (running SunOS 4.1), such
as a 'man' file, my files contain all the _^H and other sequences used in
that file. Is there an easy way to eliminate this?

						Thanks, Larry Stern


-- 

Larry Stern                                  LDSTERN@RODAN.ACS.SYR.EDU

Dan_Jacobson@ATT.COM (04/10/91)

>>>>> On 9 Apr 91 19:53:18 GMT, ldstern@rodan.acs.syr.edu (Larry Stern) said:

Larry> To all: when I download certain files from a Sun/4 (running
Larry> SunOS 4.1), such as a 'man' file, my files contain all the _^H
Larry> and other sequences used in that file. Is there an easy way to
Larry> eliminate this?

Pipe them through "col -b".
-- 
Dan_Jacobson@ATT.COM  Naperville IL USA  +1 708 979 6364

sog@bierstadt.scd.ucar.edu (Steve Gombosi) (04/10/91)

Pipe the "man" output through "col" as in:

  man mumble | col -b > mumble_file

col will strip out all the whizbang backspace formatting stuff.
Some "man" programs have parameters which cause them not to produce
the backspace sequences. Other versions of "man" do not generate these
sequences when writing to a file. This varies from manufacturer to
manufacturer -- I think that this lack of consistency is what they mean
by "open systems" :-).
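
Since some versions of "man" already drop the backspace sequences when writing to a file, it can be worth checking whether a saved file contains any backspaces before filtering it. A minimal sketch, assuming a POSIX shell; has_backspaces is a hypothetical helper name and the sample input is made up for illustration:

```shell
# A literal backspace (ASCII 8), built with printf so that no control
# character has to be typed.
bs=$(printf '\b')

# has_backspaces (hypothetical helper): succeeds if standard input
# contains at least one backspace character.
has_backspaces() {
    grep -q "$bs"
}

# Made-up sample: "foo" underlined as underscore, backspace, letter.
if printf '_\bf_\bo_\bo\n' | has_backspaces; then
    echo "overstrikes present"
else
    echo "clean"
fi
```

On a saved file this would be used as "has_backspaces < mumble_file".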

wolf@grasp1.univ-lyon1.fr (Christophe Wolfhugel) (04/10/91)

In article <1991Apr9.195318.18030@rodan.acs.syr.edu> ldstern@rodan.acs.syr.edu (Larry Stern) writes:
>To all: when I download certain files from a Sun/4 (running SunOS 4.1), such
>as a 'man' file, my files contain all the _^H and other sequences used in
>that file. Is there an easy way to eliminate this?

One solution could be to use vi to replace these characters:

vi the_file
:1,$s/.^H//g

^H can be obtained by pressing Ctrl+V and then Ctrl+H. The dot before ^H
means that whatever character precedes the ^H is also replaced by the
null string.  Hope this helps.
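
For anyone who would rather not type the control character in vi, the same substitution can be sketched non-interactively with sed. This assumes a POSIX printf that understands \b; the helper name strip_overstrike and the sample input are illustrative and not from the post above:

```shell
# Build a literal backspace (ASCII 8) without typing a control character.
bs=$(printf '\b')

# strip_overstrike (hypothetical helper): delete every
# "any character + backspace" pair, like :1,$s/.^H//g in vi.
# Reads the named files, or standard input if none are given.
strip_overstrike() {
    sed "s/.${bs}//g" "$@"
}

# Made-up sample: each letter of "NAME" overstruck once, as nroff does for bold.
printf 'N\bNA\bAM\bME\bE\n' | strip_overstrike
```

Because the pattern removes the character before each backspace, it handles both underlining (_^Hx) and bold (x^Hx).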

-- 
Christophe Wolfhugel (on irc: Zolf)  |  Email: wolf@grasp1.univ-lyon1.fr
INSA Lyon - Departement Informatique |  "Lapalisse au bordel: la duree de"
69621 Villeurbanne Cedex             |  "l'attente est fonction de la longueur"
France                               |  "de la queue."

leilabd@syma.sussex.ac.uk (Leila Burrell-Davis) (04/12/91)

wolf@grasp1.univ-lyon1.fr (Christophe Wolfhugel) writes:

> In article <1991Apr9.195318.18030@rodan.acs.syr.edu> ldstern@rodan.acs.syr.edu (Larry Stern) writes:
> >To all: when I download certain files from a Sun/4 (running SunOS 4.1), such
> >as a 'man' file, my files contain all the _^H and other sequences used in
> >that file. Is there an easy way to eliminate this?
>
> One solution could be to use vi to replace these characters:
>
> vi the_file
> :1,$s/.^H//g

We have a sed filter which will eliminate this kind of backspacing.
You need to replace <BACKSPACE> by the backspace character, ASCII 8,
then put the script in a file, make it executable and try:

	no_ul inputfile > outputfile

#-----start of script------
# no_ul - remove underscore-backspace underlining sequences
sed \
    -e 's/_<BACKSPACE>//g' \
    "$@"
#-----end of script------
-- 
Leila Burrell-Davis, Computing Service, University of Sussex, Brighton, UK
Tel:   +44 273 678390              Fax:   +44 273 678470
Email: leilabd@syma.sussex.ac.uk  (JANET: leilabd@uk.ac.sussex.syma)

ires@kaspar.ires.com (Bruce R Larson) (04/19/91)

> To all: when I download certain files from a Sun/4 (running SunOS 4.1), such
> as a 'man' file, my files contain all the _^H and other sequences used in
> that file. Is there an easy way to eliminate this?
>

If you have "col", you can do this:

  		man "foo" | col -b >savefile

From the man page:

  NAME
	col - filter reverse line-feeds

	[ ... ]

	If the -b option is given, col assumes that the output
	device in use is not capable of backspacing.  In this case,
	if two or more characters are to appear in the same place,
	only the last one read will be output.
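
The "last one read" rule quoted above can be tried on a small sample without involving "man" at all. This is only a sketch: the input strings are made up, and it assumes a "col" that accepts -b as described in the excerpt:

```shell
# Made-up input: "foo" underlined the nroff way (underscore, backspace,
# letter), then "b" overstruck on top of "a".
if command -v col >/dev/null 2>&1; then
    printf '_\bf_\bo_\bo\n' | col -b
    printf 'a\bb\n' | col -b
else
    echo "col is not installed on this system" >&2
fi
```

In each overstruck position only the character read last should survive, so the underscores and the "a" are dropped.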

Bruce
-- 
Bruce R. Larson
Integral Resources, Milton MA
Internet:  blarson@ires.com
Uucp:  ..!{world|uunet}!ires.com!blarson