[comp.dcom.modems] size of archive-file of recent messages *TOO BIG*

CMP.WERNER@R20.UTEXAS.EDU.UUCP (04/06/87)

hi,
	I just thought I'd take a quick look at the recent messages of the last
few days or weeks - and was confronted with a file of HUGE size which I
hesitate to FTP, as it seems a waste of resources and time for the benefit.
In addition, many local users simply would not have the disk space available
to deal with files of this size (and I often don't have this kind of free space
either - and we don't have TEMP or SCRATCH directories for anyone to use).

FTP>dir,
FTP>>v
FTP>>
                          Size   Prot  Write     Read     Writer
   PS:<ARCHIVES.MODEMS>
MODEMS-ARCHIV.TXT.1        195 775656  5-Apr-87 29-Mar-87 info-modems-request

this file seems to be close to a MegaByte - would it be possible to keep things
to under 100K?

I found a similar situation with older archives of MODEM7/XMODEM:
(Please also note how the first of the two consecutive commands does not
display all the files I would have expected to see.)

FTP>dir <archives.xmodem>,<archives.modems>,
FTP>>v
FTP>>
                          Size   Prot  Write     Read     Writer
   PS:<ARCHIVES.XMODEM>
MODEM7.ARCHIV.60823         64 775252 12-Aug-86  1-Dec-86 WANCHO
XMODEM-ARCHIV.TXT.1         30 775656  1-Apr-87  6-Apr-87 051332@UOTTAWA.BITNET
FTP>dir <archives.xmodem>,
FTP>>v
FTP>>
                          Size   Prot  Write     Read     Writer
   PS:<ARCHIVES.XMODEM>
MODEM7.ARCHIV.50902        193 775252  2-Sep-85 29-Nov-86 WANCHO
MODEM7.ARCHIV.60807        194 775252  7-Aug-86 29-Nov-86 WANCHO
MODEM7.ARCHIV.60823         64 775252 12-Aug-86  1-Dec-86 WANCHO
XMODEM-ARCHIV.TXT.1         30 775656  1-Apr-87  6-Apr-87 051332@UOTTAWA.BITNET
FTP>q

also, isn't it shortsighted to name the files:

	MODEM7.ARCHIV.ymmdd    rather than
	MODEM7.ARCHIV.yymm

I'd also prefer to see the file with the most recent messages named in a way
that would show it first in a directory listing.  <group-name>.00_most-recent
is a first (not totally elegant) option that comes to mind.

		Cheers,		---Werner

PS: of course, no message should be sent without an expression of appreciation
	to the moderator for his efforts.  Anyway, my appreciation dwarfs
	any thoughts of criticism or dissatisfaction.  So there !!  (-:
-------

dpz@paul.UUCP (04/07/87)

> From: CMP.WERNER@R20.UTEXAS.EDU (Werner Uhrig)

> 	I just thought I'd take a quick look at the recent messages of the last
> few days or weeks - and was confronted with a file of HUGE size which I
...
>                           Size   Prot  Write     Read     Writer
>    PS:<ARCHIVES.MODEMS>
> MODEMS-ARCHIV.TXT.1        195 775656  5-Apr-87 29-Mar-87 info-modems-request
> 
> this file seems to be close to a MegaByte - would it be possible to
> keep things to under 100K?

Um... I think you are mistaking the protection for the size.  This
looks like a TOPS-20 site, which keeps file protections in octal
fields of self, group, and world.  The 775656 means that self has all
privs, and that group and the world can read, execute, append, and
"dir" the file.  The actual size of the file is 195 TOPS-20 pages,
which ends up being in real life about 487K (still not too small).
The possible file attributes are:

		read	40		append	04
		write	20		dir	02
		execute 10		unused	01
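
For anyone who wants to check a protection value by hand, here is a
minimal Python sketch of the decoding described above.  The bit values
and the self/group/world order come from this message; the function
name decode_protection is only an illustration, not a TOPS-20 utility.

```python
# Bit values as listed in the table above (octal).
BITS = [
    (0o40, "read"),
    (0o20, "write"),
    (0o10, "execute"),
    (0o04, "append"),
    (0o02, "dir"),
    (0o01, "unused"),
]


def decode_protection(prot: str) -> dict:
    """Split a six-digit octal protection into self, group and world fields
    and list the attribute names whose bits are set in each field."""
    if len(prot) != 6:
        raise ValueError("expected six octal digits, e.g. '775656'")
    result = {}
    for i, who in enumerate(("self", "group", "world")):
        field = int(prot[2 * i:2 * i + 2], 8)  # two octal digits per class
        result[who] = [name for bit, name in BITS if field & bit]
    return result


if __name__ == "__main__":
    for who, privs in decode_protection("775656").items():
        print(f"{who:5} {', '.join(privs)}")
```

For 775656 the group and world fields both decode to read, execute,
append and dir, matching the description above.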


						dpz
-- 
David P. Zimmerman     %     rutgers!dpz     %     dpz@rutgers.edu

WANCHO@SIMTEL20.ARPA.UUCP (04/07/87)

Werner,

Your points are well-taken.  At the moment, I am using a mail file
splitter program, designed to break up files too large for MM to
handle.  Its cutoff point is set at 200 disk pages or so, well within
the file size MM can handle.  By my arithmetic, a disk page holds
2.5K: (512 words/page * 5 chars/word) / 1024 chars/K.
Thus, 200 disk pages is 500K or about .49MB.  I suppose it can be
recompiled to break at some value you would consider more reasonable,
such as 100 disk pages.
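
As a rough check of the page arithmetic, here is a small Python sketch.
It assumes 512 words per page and five 7-bit characters packed into each
36-bit word, which is what yields the 2.5K-per-page figure; the function
names are made up for this example.

```python
WORDS_PER_PAGE = 512  # words in one disk page
CHARS_PER_WORD = 5    # assumption: five 7-bit ASCII chars per 36-bit word


def pages_to_chars(pages: int) -> int:
    """Number of text characters that fit in the given number of disk pages."""
    return pages * WORDS_PER_PAGE * CHARS_PER_WORD


def pages_to_k(pages: int) -> float:
    """Same quantity expressed in K (1024 characters)."""
    return pages_to_chars(pages) / 1024


if __name__ == "__main__":
    for pages in (100, 195, 200):
        k = pages_to_k(pages)
        print(f"{pages:3d} pages = {k:6.1f}K = {k / 1024:.2f}MB")
```

Under these assumptions 100 pages comes to 250K, 195 pages to 487.5K,
and 200 pages to 500K.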

You are also correct in that the generation numbers will cause a small
problem in 1990, with the leading digit a 0, which the system will
suppress.  However, there is a problem in that the yymm format implies
a file containing all of the correspondence for that month.  The mail
splitter isn't that smart (yet).  The second problem is that with the
smaller files you want, there may be more than one file for that
month.  I suppose yymmn would have to do as we can't have yymmnn as
the generation numbers can only go up to 131071...
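
To make the generation-number constraints concrete, here is a short
Python sketch.  The 131071 limit is the one quoted above (it equals
2**17 - 1); treating the generation as a plain integer, so that leading
zeros disappear, is an assumption made for illustration, and the helper
names are hypothetical.

```python
MAX_GENERATION = 131071  # limit quoted above; equals 2**17 - 1


def fits_generation(digits: str) -> bool:
    """True if a date-style digit string is small enough to be a generation."""
    return int(digits) <= MAX_GENERATION


def as_listed(digits: str) -> str:
    """Model leading-zero suppression by round-tripping through an integer."""
    return str(int(digits))


if __name__ == "__main__":
    samples = [
        ("ymmdd", "60823"),    # 23-Aug-86, as in MODEM7.ARCHIV.60823
        ("ymmdd", "00407"),    # a date in 1990: the leading 0 is lost
        ("yymmn", "87041"),    # first file for April 1987
        ("yymmnn", "870401"),  # two-digit sequence number: too large
    ]
    for scheme, digits in samples:
        verdict = "fits" if fits_generation(digits) else "too large"
        print(f"{scheme:7} {digits:>6} -> {as_listed(digits):>6}  {verdict}")
```

Any five-digit yymmn value stays below the limit, while any six-digit
yymmnn value for years 14 and later exceeds it.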

The type field has the most potential.  It is .ARCHIV only for
historical reasons - the six character limit per field on ITS files,
from whence some of these archives came.  (From whence?)  Anyway, the
most logical change is to use that field - yymmdd - good until around
2079...

Then there's a question of time.  There are 21 archives we keep here.
One is frozen, and one hasn't gotten off the ground yet.  Nineteen
active archives.  I'm lucky if I remember to check the sizes every so
often and run the splitter program.

Obviously, the answer is to fix the splitter program - well, not "fix"
- it happens to work - just make it smarter.  That takes a bit of
doing.  The program is written in TOPS-20 MACRO-20...  Care to
volunteer...  anybody?  The task: make the program split mail files
into monthly groups and name the output files appropriately.
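
For volunteers who want to see the shape of the task before touching the
MACRO-20 source, here is a minimal Python sketch of the grouping step
only.  It assumes the messages have already been parsed into
(date, text) pairs; parsing the mail file itself is not shown, and the
<name>.yymm output naming is just one hypothetical choice, not what the
existing splitter does.

```python
from collections import defaultdict
from datetime import date


def monthly_name(base: str, when: date) -> str:
    """Hypothetical output name: base name plus a yymm suffix."""
    return f"{base}.{when:%y%m}"


def split_by_month(base: str, messages):
    """Group (date, text) pairs by calendar month, keeping message order.

    Returns a dict mapping each output file name to its list of messages.
    """
    groups = defaultdict(list)
    for when, text in messages:
        groups[monthly_name(base, when)].append(text)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        (date(1987, 3, 29), "message A"),
        (date(1987, 4, 5), "message B"),
        (date(1987, 4, 6), "message C"),
    ]
    for name, texts in split_by_month("MODEMS-ARCHIV", sample).items():
        print(name, len(texts))
```

A month with heavy traffic would still produce one large file under
this scheme, so a size cutoff would have to be layered on top.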

I'll take a look at it - shortly after April 15th...

--Frank
-------