[net.micro.amiga] Speed of seeks

cc1@ucla-cs.ARPA (UCLA Computer Club) (05/05/86)

Wait a sec... A Seek() call on the Amiga HAS to read every sector.

Each sector on the amiga contains a (byte, word, long, whatever) indicating
how much of the sector is used. So, if you want to read byte 600, you can't
just go to the second record-- you have to go to the first, find out how
much space it uses, go to the second, find out how much, etc.

Incidentally, Amiga, how is your disk arranged? Is it 80 tracks, 160 tracks,
single sided, double sided, 512 bytes, 256 words, 128 long words, etc.?

I'm talking about logical, not physical, arrangement.
(And how big (in bytes) is one space, where there are 6 used spaces in each
sector?)
-- 
Views expressed here may not be those of the Computer Club, UCLA, or anyone.

bruceb@amiga.UUCP (Bruce Barrett) (05/07/86)

In article <12593@ucla-cs.ARPA> cc1@ucla-cs.UUCP (Michael Gersten) writes:
>Wait a sec... A Seek() call on the Amiga HAS to read every sector.
>
>Each sector on the amiga contains a (byte, word, long, whatever) indicating
>how much of the sector is used. 
	True.  It's a long.  For floppies it's >0 and <=488
>So, if you want to read byte 600, you can't
>just go to the second record-- you have to go to the first, find out how
>much space it uses, go to the second, find out how much, etc.
	False.  One sector is filled before the next is started.
>
>Incidentally, Amiga, how is your disk arranged? Is it 80 tracks, 160 tracks,
>single sided, double sided, 512 bytes, 256 words, 128 long words, etc.
For 3.5" Floppy:
	2	surfaces
	40	cylinders
	11	blocks per track
	512	bytes per block.
	488	usable data bytes per sector.

The 24 byte difference (512-488) is 6 long words (4 bytes each):
	sector type = data
	header key
	sequence number
	data size (thus the confusion)
	next data block number (forward link)
	checksum for the sector
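
A minimal sketch of that layout in C, assuming the six long words come
first in the order listed and the 488 data bytes follow. The field names
are made up for illustration and are not taken from any Amiga header file.
Because one sector is filled before the next is started, a byte offset
maps to a block by plain division, which is why byte 600 can be found
without reading the data sizes of earlier blocks:

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_BYTES  512                              /* bytes per block */
#define HEADER_LONGS 6                                /* long words of overhead */
#define DATA_BYTES   (BLOCK_BYTES - HEADER_LONGS * 4) /* 488 usable bytes */

/* Hypothetical field names; the order follows the list in the post. */
struct data_block {
    uint32_t type;        /* sector type = data */
    uint32_t header_key;
    uint32_t seq_num;     /* sequence number */
    uint32_t data_size;   /* bytes used in this block, 1..488 */
    uint32_t next_data;   /* forward link to the next data block */
    uint32_t checksum;
    uint8_t  data[DATA_BYTES];
};

/* The struct should occupy exactly one block. */
static_assert(sizeof(struct data_block) == BLOCK_BYTES, "block must be 512 bytes");

/* Which data block (0-based) holds a given byte offset of the file. */
static uint32_t block_index(uint32_t byte_off)  { return byte_off / DATA_BYTES; }

/* Position of that byte inside the block's data area. */
static uint32_t block_offset(uint32_t byte_off) { return byte_off % DATA_BYTES; }
```

With these definitions, byte 600 lands in block 1 (the second block) at
offset 112, since 600 = 488 + 112.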

Hope that helps!
--Bruce Barrett
--C-A

bruceb@amiga.UUCP (Bruce Barrett) (05/08/86)

In article <1121@amiga.amiga.UUCP> bruceb@hunter.UUCP (Bruce Barrett) 
[that's me!] writes:
>For 3.5" Floppy:
>	2	surfaces
>	40	cylinders	<-- This is wrong!!
>	11	blocks per track
>	512	bytes per block.
>	488	usable data bytes per sector.

There are 80 cylinders on a 3.5" floppy, not 40.  There can be 40 on a 5.25"
under Ver 1.2.
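
For reference, the arithmetic these figures imply can be sketched as
follows. The constant and function names are illustrative only, and the
data figure is an upper bound that assumes every block holds file data,
which a real disk will not reach because some blocks hold other
structures:

```c
#include <assert.h>

enum {
    SURFACES         = 2,
    CYLINDERS        = 80,   /* corrected figure for a 3.5" floppy */
    BLOCKS_PER_TRACK = 11,
    BLOCK_BYTES      = 512,
    DATA_BYTES       = 488
};

/* Six 4-byte long words account for the difference. */
static_assert(DATA_BYTES + 6 * 4 == BLOCK_BYTES, "six long words of overhead");

/* Total blocks on the disk: 2 * 80 * 11. */
static long total_blocks(void)   { return (long)SURFACES * CYLINDERS * BLOCKS_PER_TRACK; }

/* Raw capacity, counting all 512 bytes of every block. */
static long raw_bytes(void)      { return total_blocks() * BLOCK_BYTES; }

/* Upper bound on file data if every block were a full data block. */
static long max_data_bytes(void) { return total_blocks() * DATA_BYTES; }
```

That works out to 1760 blocks, 901,120 raw bytes, and at most 858,880
bytes of file data.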

Thanks Neil, for the correction.

>--Bruce Barrett
>--C-A

peter@baylor.UUCP (Peter da Silva) (05/14/86)

> Wait a sec... A Seek() call on the Amiga HAS to read every sector.
> 
> Each sector on the amiga contains a (byte, word, long, whatever) indicating
> how much of the sector is used. So, if you want to read byte 600, you can't
> just go to the second record-- you have to go to the first, find out how
> much space it uses, go to the second, find out how much, etc.

If this is the case, and if it is also the case that the Amiga directories
don't contain the file names (what do they contain? a hash table?), then C=
Amiga has some serious redesigning to do before I or anybody I know actually
go out and buy this thing.

Incidentally, despite the poor design of the files a seek() does not have to
read every sector... a mistake often made by library writers is to try to
make seek offsets simple integers. According to the library, the argument
to an absolute seek() (lseek(fd, off, 0) or lseek(fd, off, 2)) only needs
to be the returned value from a tell() call: it may indeed be a magic cookie
like a sector/offset pair (and in fact "magic cookie" is the way it's described
in the manual). It is under RSX/11M and on the ATARI 800.
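
As an illustration of the "magic cookie" idea, here is a minimal sketch
that packs a block number and an offset within the block into one long.
The encoding and the function names are assumptions made for this
example; they are not how any of the systems named above actually encode
positions:

```c
#include <assert.h>

#define OFFSET_BITS 9                          /* 2^9 = 512 covers any in-block offset */
#define OFFSET_MASK ((1L << OFFSET_BITS) - 1)

/* What a tell() might hand back: block number and in-block offset in one long. */
static long make_cookie(long block, long offset)
{
    assert(block >= 0 && offset >= 0 && offset <= OFFSET_MASK);
    return (block << OFFSET_BITS) | offset;
}

/* What a seek() would unpack: it can go straight to the block, no scanning. */
static long cookie_block(long cookie)  { return cookie >> OFFSET_BITS; }
static long cookie_offset(long cookie) { return cookie & OFFSET_MASK; }
```

The price is that cookies are only good for returning to a position that
was previously reported. Subtracting two of them does not give a byte
distance, so code that computes offsets arithmetically cannot use them.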

This error is not restricted to relative newcomers: there's an IBM mainframe
implementation of 'C' that copies all files into fixed record length files
when you open them just so you can use UNIX-like seeks. If you want to do
a UNIX-like seek, build UNIX-like files (either one long "record" or a bunch
of maximum length records) so your offset calculations work. It's not
meaningful to seek to an unknown depth in a text file or other weird file
anyway.

The Lattice 'C' runtime library on the IBM-PC has an obscure bug related to
this, by the way, so I'm not surprised they screwed up their Amiga library
too...
-- 
-- Peter da Silva
-- UUCP: ...!shell!{baylor,graffiti}!peter; MCI: PDASILVA; CIS: 70216,1076

peter@baylor.UUCP (Peter da Silva) (05/14/86)

> 	False.  One sector is filled before the next is started.

Thank god.

> >Incidentally, Amiga, how is your disk arranged? Is it 80 tracks, 160 tracks,
> >single sided, double sided, 512 bytes, 256 words, 128 long words, etc.
> For 3.5" Floppy:
> 	2	surfaces
> 	40	cylinders
> 	11	blocks per track
> 	512	bytes per block.
> 	488	usable data bytes per sector.

Gee... that means that programs designed to be efficient on machines with a
power-of-2 sector size (as in... everyone else) will die horribly on this
one. Time to add another parameter in my header-files: "SECSIZE". This is
awfully reminiscent of the only thing I didn't like about the ATARI-800: the
file system.

OK, what are the factors of 488:

	488
	244
	122
	 61
	  8
	  4
	  2

That's a mite inconvenient... what do I do with my 16-byte records?
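
To make the 16-byte case concrete, here is a small sketch that checks
whether a fixed-size record crosses a block boundary when records are
packed back to back into 488-byte data areas. The function name is
illustrative:

```c
#include <assert.h>

#define DATA_BYTES 488   /* usable data bytes per block, from the post above */

/* Returns 1 if record number rec (0-based) of rec_size bytes starts in one
 * block and ends in the next, 0 if it fits entirely inside one block. */
static int record_straddles(long rec, long rec_size)
{
    assert(rec >= 0 && rec_size > 0 && rec_size <= DATA_BYTES);
    long first = rec * rec_size;          /* offset of the record's first byte */
    long last  = first + rec_size - 1;    /* offset of the record's last byte */
    return first / DATA_BYTES != last / DATA_BYTES;
}
```

Since 488 = 30 * 16 + 8, thirty 16-byte records fit in the first block and
record 30 is split 8 bytes and 8 bytes across blocks 0 and 1. The pattern
repeats every 976 bytes (two blocks, 61 records), so one record in 61
straddles a boundary.
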
-- 
-- Peter da Silva
-- UUCP: ...!shell!{baylor,graffiti}!peter; MCI: PDASILVA; CIS: 70216,1076

zben@umd5.UUCP (Ben Cranston) (05/17/86)

In article <645@baylor.UUCP> peter@baylor.UUCP (Peter da Silva) writes:

>Incidentally, despite the poor design of the files a seek() does not have to
>read every sector... a mistake often made by library writers is to try to
>make seek offsets simple integers. According to the library, the argument
>to an absolute seek() (lseek(fd, off, 0) or lseek(fd, off, 2)) only needs
>to be the returned value from a tell() call: it may indeed be a magic cookie
>like a sector/offset pair (and in fact "magic cookie" is the way it's described
>in the manual). It is under RSX/11M and on the ATARI 800.

>This error is not restricted to relative newcomers: there's an IBM mainframe
>implementation of 'C' that copies all files into fixed record length files
>when you open them just so you can use UNIX-like seeks. If you want to do
>a UNIX-like seek, build UNIX-like files (either one long "record" or a bunch
>of maximum length records) so your offset calculations work. It's not
>meaningful to seek to an unknown depth in a text file or other weird file
>anyway.

The Software Tools NOTE/SEEK design uses two Fortran integers to store SEEK
addresses.  The predominant text data format on the Sperry 1100 system is a
variable length record, with the record length in a four byte header area.

My implementation of the Tools for the Sperry uses the first of the two
Fortran integers as the "character address within file" (i.e. 4 X wordaddr)
and the second Fortran integer as "character number within this record",
that is, how many characters back to go to get to the record header.  The
code uses this value to get "back in sync" after a random seek.

This has the advantage that the first word of the address appears to be a
normally-incrementing address, with 4-7 spaces between records.  It would
be possible to optimize NOTE address storage: if one knew that positions
stored would always be at the beginning of record and the file was always
ASCII one could keep just the first integer and supply "4" for the second.

Oh, and if the character code is "Fieldata" (tm) rather than ASCII then
the second word is negative.  For historical reasons only...
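
A rough C rendering of the two-integer address described above. The
struct and function names are invented for this sketch, and treating the
negative second word as "same distance, Fieldata flag" is an assumption
about how the sign is used rather than something stated in the post:

```c
#include <assert.h>
#include <stdlib.h>

#define HEADER_CHARS 4   /* record length lives in a four byte header */

/* Hypothetical C version of the two Fortran integers. */
struct note_addr {
    long char_addr;    /* character address within file (4 x word address) */
    long chars_back;   /* characters back to the record header; negative = Fieldata */
};

/* Where the record header starts, so a reader can get back in sync. */
static long header_addr(struct note_addr a)
{
    return a.char_addr - labs(a.chars_back);
}

/* Assumed meaning of the sign of the second word. */
static int is_fieldata(struct note_addr a) { return a.chars_back < 0; }

/* The optimization mentioned above: an ASCII position known to be at the
 * start of a record needs only the first integer, with 4 supplied for the second. */
static struct note_addr start_of_record(long char_addr)
{
    struct note_addr a;
    assert(char_addr >= HEADER_CHARS);
    a.char_addr  = char_addr;
    a.chars_back = HEADER_CHARS;
    return a;
}
```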

-- 
"We're taught to cherish what we have   |          Ben Cranston
 by what we have no longer..."          |          zben@umd2.umd.edu
                          ...{seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben

campbell@maynard.UUCP (Larry Campbell) (05/17/86)

> Incidentally, despite the poor design of the files a seek() does not have to
> read every sector... a mistake often made by library writers is to try to
> make seek offsets simple integers. According to the library, the argument
> to an absolute seek() (lseek(fd, off, 0) or lseek(fd, off, 2)) only needs
> to be the returned value from a tell() call: it may indeed be a magic cookie
> like a sector/offset pair (and in fact "magic cookie" is the way it's described
> in the manual). It is under RSX/11M and on the ATARI 800.
> -- Peter da Silva
> -- UUCP: ...!shell!{baylor,graffiti}!peter; MCI: PDASILVA; CIS: 70216,1076

Wrongo.  First, the manual describes the offset as a long (NOT a "simple
integer" nor a "magic cookie").  Second, if the offsets AREN'T integers,
then almost any database library or package won't work (like dbm(3)),
because they often hash keys into offsets in the index file.
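
To illustrate the kind of code at stake, here is a minimal sketch of
hashing a key into a byte offset in an index file. The hash function, the
bucket size, and the names are all made up for this example; this is not
dbm's actual algorithm:

```c
#include <assert.h>

#define BUCKET_BYTES 512L   /* hypothetical fixed bucket size in the index file */

/* A simple multiplicative string hash, for illustration only. */
static unsigned long hash_key(const char *key)
{
    unsigned long h = 5381UL;
    while (*key != '\0')
        h = h * 33UL + (unsigned char)*key++;
    return h;
}

/* Byte offset of the bucket for key in a file of nbuckets fixed-size buckets. */
static long bucket_offset(const char *key, unsigned long nbuckets)
{
    assert(key != 0 && nbuckets > 0);
    return (long)(hash_key(key) % nbuckets) * BUCKET_BYTES;
}
```

The result would be passed straight to something like
lseek(fd, bucket_offset(key, n), 0). That only works if offsets are
ordinary byte counts that a program may compute for itself.
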
-- 
Larry Campbell                                 The Boston Software Works, Inc.
ARPA: maynard.UUCP:campbell@harvard.ARPA       120 Fulton Street
UUCP: {harvard,cbosgd}!wjh12!maynard!campbell  Boston MA 02109

breuel@h-sc1.UUCP (thomas breuel) (05/17/86)

||Wait a sec... A Seek() call on the Amiga HAS to read every sector.
||
||Each sector on the amiga contains a (byte, word, long, whatever) indicating
||how much of the sector is used. So, if you want to read byte 600, you can't
||just go to the second record-- you have to go to the first, find out how
||much space it uses, go to the second, find out how much, etc.
|
|If this is the case, and if it is also the case that the Amiga directories
|don't contain the file names (what do they contain? a hash table?), then C=
|Amiga has some serious redesigning to do before I or anybody I know actually
|go out and buy this thing.

No, this is not true. A Seek() call has to read the block allocation
list, a linked list of pointers to blocks allocated to a file.
Each block, except for the last one, is filled (or at least considered
to be filled for the purpose of Seek's). This means that for every 72
blocks in the file (36k) there is another block in the block allocation
list. Therefore, for large files (720k), a seek may have to go
through as many as 20 blocks in the block allocation list, which
can be quite slow. This was the point of my original complaint.
Supposedly, something is being done about this in 1.2, although
nobody from C/A has said anything specific as to *what* is being
done. My proposals are: (1) change the file system to use allocation
trees (2) buffer all allocation pointers for a file in memory (as
a backwards compatible fix; on a two drive system, this can mean
at most 20k of memory and gives the fastest seeks possible).
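
A back-of-the-envelope sketch of that cost, assuming 72 block pointers
per block of the allocation list (the figure above) and 488 data bytes
per block (the figure Bruce Barrett posted). The names are illustrative,
and the real file system may count differently:

```c
#include <assert.h>

#define DATA_BYTES    488L   /* usable data bytes per block */
#define PTRS_PER_LIST 72L    /* data block pointers per allocation list block */

/* Data block (0-based) that holds byte offset off. */
static long data_block_of(long off)
{
    assert(off >= 0);
    return off / DATA_BYTES;
}

/* Links in the allocation list that must be followed, starting from the
 * first list block, before the pointer to that data block is in hand. */
static long list_links_to_follow(long off)
{
    return data_block_of(off) / PTRS_PER_LIST;
}
```

With 488 data bytes per block, each list block covers 72 * 488 = 35,136
bytes, slightly under the 36k quoted above. A seek to the end of a
720k file (offset 737,279) gives list_links_to_follow() == 20, in line
with the estimate of 20 blocks. Keeping the whole list in memory, as in
proposal (2), would turn that walk into an array lookup.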

|Incidentally, despite the poor design of the files a seek() does not have to
|read every sector... a mistake often made by library writers is to try to
|make seek offsets simple integers. According to the library, the argument
|to an absolute seek() (lseek(fd, off, 0) or lseek(fd, off, 2)) only needs
|to be the returned value from a tell() call: it may indeed be a magic cookie
|like a sector/offset pair (and in fact "magic cookie" is the way it's described
|in the manual). It is under RSX/11M and on the ATARI 800.

Both lseek and fseek are guaranteed to seek to byte offsets under
UN*X (2.9). There may be systems with other behaviours, but I would
not want to be forced to use such a disaster...

|It's not
|meaningful to seek to an unknown depth in a text file or other weird file
|anyway.

Sure it is. You can do a binary search (and gain by doing it) even if
you don't have a fixed record length.
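
A minimal sketch of such a search, assuming a file of newline-terminated
lines sorted in strcmp() order, each shorter than 255 characters, opened
in binary mode so that ftell() returns byte offsets. The function name is
made up for this example. The trick is to seek to an arbitrary byte, read
forward to the next newline to resynchronize, and compare the line that
follows:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the byte offset of the line equal to key, or -1 if absent. */
static long find_line(FILE *fp, const char *key)
{
    char line[256];
    long lo = 0, hi;      /* a matching line, if any, starts in [lo, hi) */

    assert(fp != NULL && key != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    hi = ftell(fp);

    while (lo < hi) {
        long mid = lo + (hi - lo) / 2;
        long start = 0, next;
        int cmp;

        if (mid > 0) {
            /* Seek to an arbitrary byte, then skip to the first line
             * that begins at or after mid. */
            if (fseek(fp, mid - 1, SEEK_SET) != 0 ||
                fgets(line, sizeof line, fp) == NULL)
                return -1;
            start = ftell(fp);
        } else if (fseek(fp, 0L, SEEK_SET) != 0) {
            return -1;
        }

        if (start >= hi) {        /* no line begins in [mid, hi) */
            hi = mid;
            continue;
        }
        if (fgets(line, sizeof line, fp) == NULL)
            return -1;
        next = start + (long)strlen(line);   /* start of the following line */
        line[strcspn(line, "\n")] = '\0';    /* drop the newline before comparing */

        cmp = strcmp(line, key);
        if (cmp == 0)
            return start;
        if (cmp < 0)
            lo = next;            /* key can only be in a later line */
        else
            hi = mid;             /* key can only be in an earlier line */
    }
    return -1;
}
```

Each probe costs one seek plus a short sequential read, so the number of
probes grows with the logarithm of the file size. That only pays off if
the seek itself does not have to walk the file.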

|The Lattice 'C' runtime library on the IBM-PC has an obscure bug related to
|this, by the way, so I'm not surprised they screwed up their Amiga library
|too...

I'm not sure what you mean by this. The Lattice runtime library
has (fortunately) nothing to do with the Amiga file system.

						Thomas.