[comp.virus] VIRUSSUM format

kuhnle@ait.physik.uni-tuebingen.de (Volkmar Kuhnle) (05/16/91)

For about half a year, I have regularly obtained each new VIRUSSUM.DOC
by Patricia Hoffman. Compliments to Mrs. Hoffman for her excellent and
detailed work!

But over the months a lot of new viruses (and strains of existing ones)
have been uncovered, so VIRUSSUM.DOC has grown in size. The
current version is more than 500 K long, and it is getting
harder and harder to find information about a particular virus in
a file of this size, since I have to use a normal editor.

I came to the conclusion that an ASCII file is not appropriate for the
distribution of so much data. Therefore I would suggest supplying
future versions as DBF files (dBASE format). Database programs that
can read DBF files are very common in the PC world, and it
would be much easier to find information about a virus quickly in
a DBF file than in an ASCII file.

Any suggestions? Please e-mail them to this list, because I want to
start a discussion about the distribution of virus information.

Volkmar Kuhnle
kuhnle@aitxu2.ait.physik.uni-tuebingen.de

padgett%tccslr.dnet@mmc.com (Padgett Peterson) (05/17/91)

>From:    kuhnle@ait.physik.uni-tuebingen.de (Volkmar Kuhnle)

>But over the months a lot of new viruses (and strains of existing ones)
>have been uncovered, so VIRUSSUM.DOC has grown in size. The
>current version is more than 500 K long, and it is getting
>harder and harder to find information about a particular virus in
>a file of this size, since I have to use a normal editor.

The 9104 version is now in excess of 600k, something I have communicated
with Patricia about since last year. For the short term, Vern Buerg's
excellent shareware LIST facility can handle it easily and provides search
facilities to find particular occurrences. For editing, my WordStar has no
problem with large files (very good for translating DEBUG to source code
too - plug).

At one point last year I tried some simple reformatting & whitespace
elimination that reduced VSUM from 500k to around 300k without changing
the format. I also suggested page breaks at each virus to fit into a
loose-leaf folder and allow monthly updates so that the whole thing only
had to be downloaded once. My suspicion is that the original version was
done on an IBM mainframe (it used to be sorted in EBCDIC order).

However, since it is copyrighted material, such forms cannot be distributed
without permission.

The problem with a .DBF format is that not everyone can read it. For that
reason my preference is for flat ASCII files; these can even be put onto
a mainframe if desired. There are a number of flat-file databases available,
which is the reason I reformatted VSUM for Patti last May to have consistent
column boundaries & characteristic formats. There is no question that VSUM
could be made much smaller without losing any functionality; the .ZIP ratio
gives a good indication of that.
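
As a rough illustration of the whitespace elimination and the compression
ratio argument, here is a minimal Python sketch. It is not the filter
actually used on VSUM: the function names and the sample text are made up
for the example, and zlib's DEFLATE stands in for a .ZIP archiver only as
an approximation.

```python
import zlib


def squeeze_whitespace(text: str) -> str:
    """Strip trailing blanks and collapse runs of blank lines to one."""
    out = []
    previous_blank = False
    for line in text.splitlines():
        line = line.rstrip()
        blank = (line == "")
        if blank and previous_blank:
            continue  # drop the second and later blank lines in a run
        out.append(line)
        previous_blank = blank
    return "\n".join(out) + "\n"


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size (smaller means more redundancy)."""
    raw = text.encode("ascii", errors="replace")
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    # Hypothetical sample text, padded the way a fixed-layout listing might be.
    sample = "Virus Name:  Example      \n\n\n\nAliases:     None         \n" * 50
    squeezed = squeeze_whitespace(sample)
    print(len(sample), "->", len(squeezed), "bytes")
    print("deflate ratio of original: %.2f" % compression_ratio(sample))
```

A low ratio on the original file suggests there is a lot of padding and
repetition that a reformat could remove without touching the content.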

ebrewer@ux1.cso.uiuc.edu (Ellen Brewer) (05/18/91)

kuhnle@ait.physik.uni-tuebingen.de (Volkmar Kuhnle) writes:

>But over the months a lot of new viruses (and strains of existing ones)
>have been uncovered, so VIRUSSUM.DOC has grown in size. The
>current version is more than 500 K long, and it is getting
>harder and harder to find information about a particular virus in
>a file of this size, since I have to use a normal editor.
>
>I came to the conclusion that an ASCII file is not appropriate for the
>distribution of so much data. Therefore I would suggest supplying
>future versions as DBF files (dBASE format). Database programs that
>can read DBF files are very common in the PC world, and it
>would be much easier to find information about a virus quickly in
>a DBF file than in an ASCII file.

Distribution in ASCII file format is far preferable to any other
format.  While many people use dbase, it is far from universal. I
hesitate to think of the contortions I would have to go through to get
information from a DBF file. Nor do I think that an alternative
nonuniversal format is an acceptable solution.

What you need is a text file browser, since you don't need to edit,
just scan for strings and read. This would provide you with a general
purpose tool for looking at any text file, not just a way to look at
VIRUSSUM.DOC.

The software I use is LIST by Vernon D. Buerg. It's free for personal
use, with a suggested donation of $15 if you find it of value. It
doesn't insist on reading the whole file into memory at once. There's
probably a much more recent version than the one I have from 1987, but
even that one is very nice, and allows marking blocks of lines and
writing them to another file. I'm sure there are other programs to
fill this function too--this just happens to be what I use.
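
For readers without such a browser, the same idea (scan for a string
without loading the whole file into memory) can be sketched in a few lines
of Python. This is only an illustration, not how LIST works internally;
the function name find_lines and the sample entries are invented for the
example.

```python
from typing import Iterator, TextIO, Tuple


def find_lines(stream: TextIO, needle: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for each line containing needle.

    Iterating over the file object reads one line at a time, so memory
    use stays small no matter how large the file is. Matching is
    case-insensitive.
    """
    wanted = needle.lower()
    for number, line in enumerate(stream, start=1):
        if wanted in line.lower():
            yield number, line.rstrip("\n")


if __name__ == "__main__":
    import io

    # Hypothetical stand-in for a large text file.
    demo = io.StringIO(
        "Virus Name:  Alpha\nAliases:     none\nVirus Name:  Beta\n"
    )
    for number, line in find_lines(demo, "virus name"):
        print(number, line)
```

To run it against a real file, pass an open file object instead of the
StringIO, for example open("VIRUSSUM.DOC", encoding="cp437",
errors="replace"); the cp437 encoding is an assumption about a DOS-era
text file.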

--
Ellen Brewer (ebrewer@ux1.cso.uiuc.edu)
"Non ignara mali, miseris succurrere disco."

padgett%tccslr.dnet@mmc.com (Padgett Peterson) (05/30/91)

Mikael Larsson <vhc@Abacus.hgs.se> writes:

> > From:    Guillory@farwest.FidoNet.Org
> >
> > Other people may have beaten this to death but I propose that Patricia
> > Hoffman change her VIRUSSUM.DOC to something similar to PC Virus Index
> > (PCVI305.zip) by Dan McCool and Brian Clough.

>I disagree with this because that means that you can only access the
>database on a given computer platform. The VIRUSSUM.DOC should remain
>in pure ASCII given the benefits of being able to read it from
>whatever machine you're using.

For what it's worth, the individual virus files in PCVI305 are in
ASCII; it is the selector/viewer that is a program.

Since I automatically filter VIRUSSUM and reformat with form feeds
after each listing so that it goes into a loose-leaf binder easily, I
have felt for some time that separate per-virus files would be nicer
and would allow "updates" monthly rather than having to download/print
the whole thing each month. Telco fees can add up.
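
A form-feed filter of this kind might look roughly like the Python sketch
below. It is not the filter described above. It assumes, without
confirmation from this thread, that every listing starts with a line
beginning "Virus Name:"; the marker is a parameter so it can be changed.

```python
from typing import Iterable, Iterator

ENTRY_MARKER = "Virus Name:"  # assumed start-of-listing marker, not confirmed
FORM_FEED = "\f"


def paginate(lines: Iterable[str], marker: str = ENTRY_MARKER) -> Iterator[str]:
    """Yield lines, adding a form-feed line before every listing but the first.

    Putting the form feed before listing N+1 is the same as putting it
    after listing N, so each listing starts on a fresh page.
    """
    seen_entry = False
    for line in lines:
        if line.lstrip().startswith(marker):
            if seen_entry:
                yield FORM_FEED + "\n"
            seen_entry = True
        yield line


if __name__ == "__main__":
    # Hypothetical two-entry sample.
    sample = ["Virus Name:  Alpha\n", "details\n",
              "Virus Name:  Beta\n", "details\n"]
    print("".join(paginate(sample)), end="")
```

Any header text before the first listing stays on the same page as that
listing; a real filter would probably want to break there as well.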

The only problem with such a flat file arrangement (access can be very
fast) is that it takes up a lot of disk space (200 viruses @ 2k per
cluster = 400k - smaller on floppy of course).
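
The arithmetic can be checked with a short Python sketch. The 1000-byte
listing size is an assumption chosen so that each file fits in one 2k
cluster, and the smaller cluster sizes are only typical floppy values, not
figures from this thread.

```python
def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Disk space a file occupies when space is handed out in whole clusters."""
    if file_size == 0:
        return 0
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def total_allocated(file_sizes, cluster_size: int) -> int:
    """Sum of per-file allocations for a set of files."""
    return sum(allocated_bytes(size, cluster_size) for size in file_sizes)


if __name__ == "__main__":
    sizes = [1000] * 200  # assumed: 200 listings of about 1000 bytes each
    for cluster in (2048, 1024, 512):
        used_k = total_allocated(sizes, cluster) // 1024
        print("cluster", cluster, "bytes ->", used_k, "k on disk")
```

With 2048-byte clusters the 200 files occupy 400k even though they hold
under 200k of text; the smaller the cluster, the less space is lost to
rounding.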

The big thing to remember is that VIRUSSUM is copyrighted material and
Patricia has put a lot of effort into it. Her decision on whether it
exists at all is final.

					Warmly,
						Padgett