[comp.text.tex] Zoo file for msdos: change ``LF'' to ``CR'' for ascii files

xiaofei@acsu.buffalo.edu (Xiaofei Wang) (04/12/91)

/* nico@cs.ruu.nl (Nico Verwer) wrote */:
* The extension .zoo in multicol.zoo means that this file is a compressed

There is a difficulty using .zoo files on msdos.

Files on unix use ``LF'' as the end of line. After the files are zoo'ed
on unix, ftp'ed to a msdos machine and unzoo'ed there, the ``LF'' remains.
However, on msdos one needs ``CR LF'' as the end of line. This creates some
difficulties. [If the files have ``CR LF'' originally, i.e. they were zoo'ed
on a msdos machine, then there is no problem.]

For example, I ftp'ed sb30tex.zoo from eepsd.gtech.edu and unzoo'ed it.
The file texfonts.sub could not be read by the program until I changed all
of the ``LF'' to ``CR LF''. The easiest way of doing this is to use micro-emacs
to resave the file. If there are too many ascii files, I would suggest
unzoo'ing on unix and ftp'ing the ascii files to msdos in ascii mode.
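
A quick way to see which convention an extracted file actually uses is
to count its line-ending bytes. The Python sketch below is only an
illustration; the function name count_line_endings is made up here and
is not part of zoo or of any tool mentioned in this thread.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs, LFs without a preceding CR, and CRs without a following LF."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,                          # MS-DOS style line ends
        "lf_only": data.count(b"\n") - crlf,   # unix style line ends
        "cr_only": data.count(b"\r") - crlf,   # bare CR, e.g. overstriking
    }
```

Read the file in binary mode before counting, for example
count_line_endings(open("texfonts.sub", "rb").read()). A file that shows
only "lf_only" counts most likely came from unix unchanged.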

The zoo executable and other compression tools can be obtained
as described in the following file:

 ALL ABOUT ARCHIVES, LZHS, ZIPS, ZOOS, LIBRARIES, and SQUEEZED FILES

Some of the files in the SIMTEL20 MS/PCDOS Software Libraries have
been transformed by using one or another of the standard public domain
utilities that either SQueezes, LiBRaries, ARChives, LZHs, ZIPs, or
ZOOs files.

This transformation is performed to compress the files to minimize
download time, and/or combine several related files into a single
easily-managed file.  You cannot use or run any of these files without
first transforming them back to their original state.

These processed files are specially named with a file type (the last 3
letters of a file name after the '.') that signifies the transformation.
These are:

            .ARC   for files archived with PKPAK.EXE,
            .LZH   for files archived with LHARC.EXE,
            .ZIP   for files archived with PKZIP.EXE,
            .ZOO   for files archived with ZOO.EXE,
            .LBR   for files libraried with LU.EXE, and
            .?Q?   for squeezed files (middle letter is a Q).
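
As a rough illustration, the list above can be written as a lookup from
file type to the program that produced it. This is a Python sketch; the
names ARCHIVERS and tool_for are hypothetical, and the tool names are
simply the ones given in the list.

```python
import os
from typing import Optional

# File type -> program named in the list above (illustrative table only).
ARCHIVERS = {
    "ARC": "PKPAK.EXE",
    "LZH": "LHARC.EXE",
    "ZIP": "PKZIP.EXE",
    "ZOO": "ZOO.EXE",
    "LBR": "LU.EXE",
}

def tool_for(filename: str) -> Optional[str]:
    """Return the program associated with a file type, or None if unknown."""
    ext = os.path.splitext(filename)[1].lstrip(".").upper()
    if ext in ARCHIVERS:
        return ARCHIVERS[ext]
    # Squeezed files have a three-letter type whose middle letter is Q.
    if len(ext) == 3 and ext[1] == "Q":
        return "squeezed file"
    return None
```

For example, tool_for("multicol.zoo") gives "ZOO.EXE", and a name such
as FILE.TQT is reported as a squeezed file.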


                          ARC FILES

PKPAK is used to create and maintain file archives.  An archive is a
group of files collected together into one file in such a way that the
individual files may be recovered intact.  PKPAK will automatically
compress member files when adding them to the archive, and PKUNPAK
will expand them upon extraction.  For files with the .ARC extension,
you must have a copy of file PD1:<MSDOS.ARC-LBR>PK361.EXE to extract
the component files.  (PK361.EXE is a "self-extracting archive."  When
you run this program, it will produce PKPAK, PKUNPAK and related
documentation).  After you end up with a copy of PKUNPAK you can use
it to extract files.  An example of using PKUNPAK to unpack an ARChive
"FILE.ARC" is:
                     "A>pkunpak file"
You do not need to supply the ARC file type when specifying "file."


                          LZH FILES

LHARC is used to create and maintain file archives.  An archive is a
group of files collected together into one file in such a way that the
individual files may be recovered intact.  LHARC will automatically
compress member files when adding them to the archive, and will expand
them upon extraction.  For files with the .LZH extension, you must
have a copy of file PD1:<MSDOS.ARC-LBR>LH113C.EXE to extract the
component files.  LH113C.EXE is a "self-extracting archive."  When
you run this program, it will produce LHARC and related documentation.
After you end up with a copy of LHARC you can use it to extract files.
An example of using LHARC to unpack an LZH archive "FILE.LZH" is:
                     "A>lharc e file"
You do not need to supply the LZH file type when specifying "file."


                          ZIP FILES

PKZIP is used to create and maintain file archives.  An archive is a
group of files collected together into one file in such a way that the
individual files may be recovered intact.  PKZIP will automatically
compress member files when adding them to the archive, and PKUNZIP
will expand them upon extraction.  For files with the .ZIP extension,
you must have a copy of file PD1:<MSDOS.ZIP>PKZ110EU.EXE to extract the
component files.  (PKZ110EU.EXE is a "self-extracting archive."  When
you run this program, it will produce PKZIP, PKUNZIP and related
documentation).  After you end up with a copy of PKUNZIP you can use
it to extract files.  An example of using PKUNZIP to unpack an archive
"FILE.ZIP" is:
                     "A>pkunzip file"
You do not need to supply the ZIP file type when specifying "file."


                           ZOO FILES

ZOO.EXE is an archiving program that is similar to PKPAK, but
non-compatible.  ZOO can produce archives with long pathnames in them
(directory names as well as the file name) and it can store comments
about each file.  If you want to take apart a ZOO archive, you will
need a copy of ZOO.EXE.  Since it is a program in development, it's
hard to say what its file name will be when you read this, but
searching for ZOO*.* should turn up the correct file.  When this
article was written the current version of ZOO was ZOO201.EXE, which
may be found in the PD1:<MSDOS.ZOO> directory.  The zoo syntax for
file extraction is:
                    "A>zoo e file"
You do not need to supply the ZOO file type when specifying "file."


                           LBR FILES

LU and its relatives (LUP, LUU, LUE, LUT, LU86, LAR etc.) maintain
libraries of files.  Most LU-type programs do not perform any
compression.  Because of this, most people will squeeze files before
adding them to a library if they want to save space.  If you want to
remove the component files from an .LBR file, you should have a copy
of file PD1:<MSDOS.STARTER>LUE220.COM.  This will break up the library
into its component parts, and optionally unsqueeze any .?Q? files at
the same time.  The syntax for LUE would be:
                      "A>lue220 file"
where file was really FILE.LBR.

LUU.COM can be used to create a .LBR file.


                       SQUEEZED FILES

NUSQ.COM is used to unsqueeze, or expand files that have a "Q" as the
middle letter of the file type.  Such files have been squeezed, or
compressed with SQPC.COM or something similar.  These programs use
Huffman Encoding to reduce the size of the target file.  Depending on
the distribution of data in a file it can be reduced in size by 5% to
60% by squeezing it. If you download a file with a file type
indicating that it is squeezed, you will need file
PD1:<MSDOS.STARTER>NUSQ110.COM to expand it before you can use it.
The syntax to unsqueeze a file would be:
                   "A>nusq110 file.tqt"
where file.tqt was the file you wanted to unsqueeze.  You must supply
the full file name and type.
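
To see why the saving depends on the distribution of data, the Python
sketch below builds a Huffman code for a buffer and counts the bits
needed to encode it. It is an illustration of Huffman coding in general,
not the squeezed-file format: it ignores the space needed to store the
code table, and the function names are made up for this example.

```python
import heapq
from collections import Counter

def huffman_code_lengths(data: bytes) -> dict:
    """Return {byte value: code length in bits} for a Huffman code built from data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, smallest symbol in subtree, {symbol: depth}).
    # The second field is unique per entry, so the dicts are never compared.
    heap = [(w, sym, {sym: 0}) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, t1, d1 = heapq.heappop(heap)
        w2, t2, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, min(t1, t2), merged))
    return heap[0][2]

def encoded_bits(data: bytes) -> int:
    """Total bits needed to encode data with its own Huffman code."""
    lengths = huffman_code_lengths(data)
    return sum(lengths[b] for b in data)
```

A skewed input such as b"aaaabbc" needs 10 bits instead of 56, while an
input in which all 256 byte values are equally frequent gains nothing.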


                       MORE INFORMATION

For more information on ARChives, see the documentation for
PKPAK/PKUNPAK which is included in the PK361.EXE file.  For more
information on LZH archives, see the documentation for LHARC which is
included in the LH113C.EXE file.  For more information on ZIP
archives, see the documentation for PKZIP/PKUNZIP which is included in
the PKZ110EU.EXE file.  For ZOO archives, see Rahul Dhesi's excellent
documentation included in ZOO201.EXE and UGUIDE.ZOO.  The doc files
included with the various LU utilities will explain .LBR's, and
LUDEF5.DOC explains the layout of these files in detail.

                     -- Keith Petersen <w8sdz@WSMR-SIMTEL20.Army.Mil>
-- 
xiaofei@acsu.buffalo.edu / rutgers!ub!xiaofei / v118raqa@ubvms.bitnet

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (04/18/91)

In <70463@eerie.acsu.Buffalo.EDU> xiaofei@acsu.buffalo.edu (Xiaofei
Wang) writes:

>There is a difficulty using .zoo files on msdos.

>Files on unix use ``LF'' as the end of line. After the files are zoo'ed
>on unix, ftp'ed to a msdos machine and unzoo'ed there, the ``LF'' remains.

If a text file was originally archived on a UNIX system and is
unarchived under MS-DOS, you will need to do newline conversions.

The best way to do this is to use my "flip" program.  It was posted to
comp.sources.misc about two years ago.  If you're lucky, it may be
available for ftp from somewhere.  I don't email it to end users to
minimize traffic, but if any well-known major archive site needs it I
will send it.
--
Rahul Dhesi <dhesi@cirrus.COM>
UUCP:  oliveb!cirrusl!dhesi

xiaofei@acsu.buffalo.edu (Xiaofei Wang) (04/18/91)

/* Rahul Dhesi <dhesi@cirrus.COM> wrote */:
* 
* If a text file was originally archived on a UNIX system and is
* unarchived under MS-DOS, you will need to do newline conversions.
* 
* The best way to do this is to use my "flip" program.  It was posted to
* comp.sources.misc about two years ago.  If you're lucky, it may be
* available for ftp from somewhere.  I don't email it to end users to
* minimize traffic, but if any well-known major archive site needs it I
* will send it.

After my post under the current subject line, many people
replied that flip is a great program. But none of them mentioned where
I can get it. I tried simtel but had no luck. A path is absolutely necessary
for simtel! So if someone knows, please post.

The ways I deal with the problem are:

1) zoo at unix end and then transfer to msdos (ftp, for example)
2) read into micro-emacs and resave it. (micro-emacs is available from
                                         clarkson)
3) use WP5.1 ``convert'' to convert it from WP4.1 to WP5.1 and then to ASCII. 
   No kidding, it works.
-- 
xiaofei@acsu.buffalo.edu / rutgers!ub!xiaofei / v118raqa@ubvms.bitnet

truett@belton.nec.com (Truett Smith) (04/19/91)

In article <71539@eerie.acsu.Buffalo.EDU> xiaofei@acsu.buffalo.edu (Xiaofei Wang) writes:
>/* Rahul Dhesi <dhesi@cirrus.COM> wrote */:
>* 
>* If a text file was originally archived on a UNIX system and is
>* unarchived under MS-DOS, you will need to do newline conversions.
>* 
>* The best way to do this is to use my "flip" program.  It was posted to
>* comp.sources.misc about two years ago.  If you're lucky, it may be
>* available for ftp from somewhere.  I don't email it to end users to
>* minimize traffic, but if any well-known major archive site needs it I
>* will send it.
>
>After my post under the current subject line, many people
>replied that flip is a great program. But none of them mentioned where
>I can get it. I tried simtel but had no luck. A path is absolutely necessary
>for simtel! So if someone knows, please post.
>
>The ways I deal with the problem are:
>
>1) zoo at unix end and then transfer to msdos (ftp, for example)
>2) read into micro-emacs and resave it. (micro-emacs is available from
>                                         clarkson)
>3) use WP5.1 ``convert'' to convert it from WP4.1 to WP5.1 and then to ASCII. 
>   No kidding, it works.
>-- 
>xiaofei@acsu.buffalo.edu / rutgers!ub!xiaofei / v118raqa@ubvms.bitnet

An additional method for the MS-DOS user to convert text files is to use
Vern Buerg's LIST program.  Just invoke the program to list the file to be
converted, mark the top and bottom lines of the file (Alt-H will tell you
how to do that), write out the marked text to a file (it will prompt for a
new file name), and exit the program with Escape.  The file that was
written out will have the newlines automatically converted.

Truett Smith
NEC America
San Jose, CA

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (04/20/91)

In response to various suggestions for newline conversions such as:

>>3) use WP5.1 ``convert'' to convert it from WP4.1 to WP5.1 and then to ASCII. 
...
>An additional method for the MS-DOS user to convert text files is to use
>Vern Buerg's LIST program.

In about 99% of cases various newline conversion methods will work.
However, you will find that most newline conversion algorithms don't do
it right.  Most of them blindly add CR before each LF (or, in the
other direction, blindly strip all occurrences of CR).  This fails in
some rare cases.

     -- A file may contain CR characters for underlining.  These
	should not be stripped when going from MS-DOS to **IX.  So,
	"CR LF" should be translated to LF, but "CR <stuff> CR <stuff>"
	should be left unchanged.

     -- A file may contain a sequence like "CR LF".  When going from
	**IX to MS-DOS, this should remain unchanged, *not* translated
	into "CR CR LF", which will confuse MS-DOS programs.

The above is one reason why zoo doesn't try to do newline conversions.
It's hard to do it right.  It's better to have a smart utility like
"flip" always do it right, instead of doing it 99% right.
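
The two cases listed above can be sketched in Python as follows. This is
not the source of "flip", only an illustration of the two rules; the
function names are made up, and the sketch assumes the whole file fits
in memory and is already known to be text.

```python
import re

def dos_to_unix(data: bytes) -> bytes:
    """Turn each CR LF pair into LF; a CR not followed by LF is left alone."""
    return data.replace(b"\r\n", b"\n")

def unix_to_dos(data: bytes) -> bytes:
    """Put CR before each LF that is not already preceded by CR."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

def naive_unix_to_dos(data: bytes) -> bytes:
    """The 'blind' conversion, shown for comparison: CR before every LF."""
    return data.replace(b"\n", b"\r\n")
```

On the input b"a\nb\r\nc\n" the careful version gives b"a\r\nb\r\nc\r\n",
while the blind version turns the middle line end into "CR CR LF". In the
other direction, dos_to_unix keeps the CR in b"word\r____\r\n" that is
used for underlining and only rewrites the final "CR LF".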

I searched for a while for a good newline conversion program, and found
none that was both portable and correct, before deciding that I had to
write my own.
--
Rahul Dhesi <dhesi@cirrus.COM>
UUCP:  oliveb!cirrusl!dhesi