[comp.unix.questions] meaning of "count=" for dd?

gst@wjh12.harvard.edu (Gary S. Trujillo) (10/24/89)

My empirical tests of dd leave me puzzled as to the real meaning of the
"count=" argument.  The manual page says "copy only n input records."
Trouble is, it doesn't say what sort of "record" it's talking about.
I would have thought that it would be the number of characters in a
block, as defined by one of the blocksize (e.g., "bs") arguments.

I find, however, that the number of characters copied by identical dd
commands is different depending on the data, leading one to suspect it's
counting newlines, or some such.  However, the numbers don't correlate
with what's reported for "lines" by wc.  (It's binary data - actually
a compressed cpio archive, so wc is the only way I can count what look
like newlines in the file.)

If anyone is interested in the problem, and would like more details,
please send email.  I'm working on improvements to a backup-to-floppy
script for my UNIXpc.  I just recently discovered a bug in the script
I've been using; my modifications are an attempt to deal with the bug
by inserting an intermediate disk-to-disk pass over the data using a
floppy-sized image before actually writing to floppy, to make sure
that everything gets onto each floppy.

I thank you, my backups thank you (in advance, of course)!


-- 
	Gary Trujillo
	(gst@wjh12.harvard.edu)

gst@wjh12.harvard.edu (Gary S. Trujillo) (10/25/89)

In article <419@wjh12.harvard.edu> gst@wjh12.UUCP (Gary S. Trujillo) writes:
>My empirical tests of dd leave me puzzled as to the real meaning of the
>"count=" argument.  The manual page says "copy only n input records."
>Trouble is, it doesn't say what sort of "record" it's talking about.
>I would have thought that it would be the number of characters in a
>block, as defined by one of the blocksize (e.g., "bs") arguments.

Thanks to everyone who sent email replies - including the AT&T person who
has actually worked on the dd code(!).  The upshot of the comments was that
my problem may come from the fact (it is a fact) that I'm reading from a
pipe in my application, and that a "record" (called "block" in other
versions of the man page, apparently) corresponds to one physical read.
Thus, the amount read may vary, depending on a number of factors.

There was a suggestion that I try 512-byte blocks, since that's normally
the size of a program buffer ("but it's not foolproof").  In my tests,
that suggestion seems to work.
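
One way to make "count=" predictable when reading from a pipe is to
re-block the stream first.  The following is only a minimal sketch, not
the fix actually used in the backup script.  The first dd gathers whatever
short reads the pipe delivers into full 512-byte output blocks, and the
second dd then counts those blocks.  The "producer" function and all the
sizes are made up for illustration.  The approach assumes that a 512-byte
write to a pipe arrives intact, which POSIX guarantees for writes of at
most PIPE_BUF bytes.

```shell
#!/bin/sh
# Hypothetical producer (an assumption for this sketch): writes 3000 bytes
# in 100-byte pieces, so a reader on the pipe may see reads of many sizes.
producer() {
    i=0
    while [ "$i" -lt 30 ]; do
        printf '%0100d' 0        # one 100-byte piece of '0' characters
        i=$((i + 1))
    done
}

# Stage 1 (obs=512): collect short reads into full 512-byte output blocks.
# Stage 2 (bs=512 count=2): count those blocks; two of them are 1024 bytes.
bytes=$(producer | dd obs=512 2>/dev/null | dd bs=512 count=2 2>/dev/null | wc -c)
echo "bytes copied: $bytes"
```

For a floppy-sized image the same pipeline would be used with a larger
count, chosen so that count times 512 equals the capacity of the disk.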

The AT&T guy also says that results vary between System V and BSD UNIX
implementations, having to do with different implementations of the file
system, I think: "Network files will usually break into 2K blocks, no
matter what you ask for."

Thanks again, and happy "dd-ing!" :-)

-- 
	Gary Trujillo
	(gst@wjh12.harvard.edu)

cpcahil@virtech.uucp (Conor P. Cahill) (10/25/89)

In article <419@wjh12.harvard.edu>, gst@wjh12.harvard.edu (Gary S. Trujillo) writes:
> My empirical tests of dd leave me puzzled as to the real meaning of the
> "count=" argument.  The manual page says "copy only n input records."

What dd counts as an input record is one read() performed on the input.
For a file, the number of reads should always be the same, but for a pipe
this will probably not be true.  The read() will get whatever data is
currently in the pipeline and may return fewer bytes than the block size.
dd reports such a short read as a partial record (the number after the
"+"), but it still counts as one record toward "count=".

The amount of data available in the pipeline will vary depending upon the
load of the system and the execution order of the two processes attached
to the pipeline.  The following dd runs illustrate the points above:

	cpcahil(virtech,428): ls -lRa | dd of=/dev/null obs=1024 ibs=8k
	0+202 records in
	246+1 records out

	cpcahil(virtech,429): ls -lRa > /tmp/ttt

	cpcahil(virtech,430): dd if=/tmp/ttt of=/dev/null obs=1024 ibs=8k
	30+1 records in
	246+1 records out

	cpcahil(virtech,431): ls -lRa | dd of=/dev/null obs=1024 ibs=8k
	0+205 records in
	246+1 records out

	cpcahil(virtech,432): dd if=/tmp/ttt of=/dev/null obs=1024 ibs=8k
	30+1 records in
	246+1 records out


Note that reading from a file will always return the same number of records,
but reading from a pipeline will not.
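
The pipe behaviour can also be reproduced on a small scale.  In the sketch
below, the "slow_writer" function and its one-second pauses are invented
for illustration.  The pauses make it very likely, though not guaranteed,
that each read() on the pipe returns only one 4-byte chunk, so "count=2"
should stop after two short reads instead of copying 2 x 512 bytes.

```shell
#!/bin/sh
# Hypothetical writer (an assumption for this sketch): emits three 4-byte
# chunks with a pause after each, so the reader usually sees one chunk
# per read().
slow_writer() {
    for chunk in aaaa bbbb cccc; do
        printf '%s' "$chunk"
        sleep 1
    done
}

# dd asks for 512 bytes per read, but count=2 ends the copy after two
# reads, however short those reads turned out to be.
bytes=$(slow_writer | dd bs=512 count=2 2>/dev/null | wc -c)
echo "bytes copied with count=2: $bytes"
```

Redirecting the writer's output to a file and running the same dd command
with "if=" on that file should instead copy all 12 bytes in a single
partial record.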


-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	|
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+