[comp.os.msdos.programmer] How do I SHORTEN a file without rewriting it?

alex@bilver.UUCP (Alex Matulich) (10/23/90)

I have a very large file which was made by writing a bunch of data
structures out to disk.  When I wish to delete a structure from this
data file, I simply read all the structures following the one to be deleted
and write them out one record-length forward, one at a time.  I use the
fseek(), fread(), and fwrite() functions for this.

This process does the job, but it leaves an unused record at the end of
the file, and of course the file length remains unchanged.
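
The shifting procedure described above can be sketched roughly as follows.
This is only an illustration: the name delete_record() and its parameters
are hypothetical, and it assumes fixed-size records in a file opened in
binary update mode ("r+b").

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Delete record number `index` (0-based) from a file of fixed-size records
 * by copying each later record one slot toward the start of the file.
 * Returns the number of records that remain valid, or -1 on error.
 * The file keeps its old length; the last slot is left holding stale data. */
long delete_record(FILE *fp, long index, size_t rec_size)
{
    char *buf;
    long total, i;

    assert(fp != NULL && rec_size > 0);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    total = ftell(fp) / (long)rec_size;
    if (index < 0 || index >= total)
        return -1;
    buf = malloc(rec_size);
    if (buf == NULL)
        return -1;

    for (i = index + 1; i < total; i++) {
        /* Read record i ... */
        if (fseek(fp, i * (long)rec_size, SEEK_SET) != 0 ||
            fread(buf, rec_size, 1, fp) != 1)
            break;
        /* ... and write it into slot i - 1. The fseek() between the
         * read and the write is required on an update stream. */
        if (fseek(fp, (i - 1) * (long)rec_size, SEEK_SET) != 0 ||
            fwrite(buf, rec_size, 1, fp) != 1)
            break;
    }
    free(buf);
    if (i < total || fflush(fp) != 0)
        return -1;
    return total - 1;
}
```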

Is there a way to shorten a file, that is, chop some data off the end of
it, so that it doesn't consume as much physical space on the disk?  The
file I have is too big to read into memory and write back out again, and
there is not enough room on the disk to write out a temporary file.

You can email any help you want, but I'll be looking in this newsgroup
for answers also.  Thanks...

-- 
 _ |__  Alex Matulich   (alex@bilver.UUCP)
 /(+__>  Unicorn Research Corp, 4621 N Landmark Dr, Orlando, FL 32817
//| \     UUCP:  ...uunet!tarpit!bilver!alex
///__)     bitnet:  IN%"bilver!alex@uunet.uu.net"

bad@atrain.sw.stratus.com (Bruce Dumes) (10/24/90)

In article <1162@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:
>
>Is there a way to shorten a file, that is, chop some data off the end of
>it, so that it doesn't consume as much physical space on the disk?  The
>file I have is too big to read into memory and write back out again, and
>there is not enough room on the disk to write out a temporary file.
>

Have you thought about using ftruncate()?

Bruce


--
Bruce Dumes			|  "You don't see many of *these* nowdays, |
bad@zen.cac.stratus.com		|   do you?"				   |

johnl@esegue.segue.boston.ma.us (John R. Levine) (10/24/90)

In article <1162@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:
>Is there a way to shorten a file, that is, chop some data off the end of
>it, so that it doesn't consume as much physical space on the disk?

It is a poorly documented but reliable feature of MS-DOS systems that a
zero length write truncates a file.  That is, you seek to where you want
the EOF to be, then do a write(fd, "", 0) or the equivalent.

Note that if you're using a stdio library (fopen et al.) you almost certainly
cannot do this with fwrite.  Also, if your stdio is doing lf to cr/lf
translation, you may need to turn that off as well.

Also note that this hack is extremely unportable.  Unix systems, for
example, have a variety of ways of truncating files (ftruncate, fcntl, etc.)
none of which involve a zero-length write.

Regards,
John Levine, johnl@esegue.segue.boston.ma.us, {spdcc|ima|world}!esegue!johnl
-- 
John R. Levine, IECC, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|spdcc|world}!esegue!johnl
Atlantic City gamblers lose $8200 per minute. -NY Times

michi@ptcburp.ptcbu.oz.au (Michael Henning) (10/25/90)

alex@bilver.UUCP (Alex Matulich) writes:

>Is there a way to shorten a file, that is, chop some data off the end of
>it, so that it doesn't consume as much physical space on the disk?  The
>file I have is too big to read into memory and write back out again, and
>there is not enough room on the disk to write out a temporary file.

Ftruncate() (BSD call) will do the job. Under AIX (maybe others), there
is an fclear() call that allows you to punch holes into a file at arbitrary
places. The blocks corresponding to the hole(s) are returned to the file system.
In SysV.4, you can use fcntl() to do the same.

							Michi.

-- 
      -m------- Michael Henning			+61 75 950255
    ---mmm----- Pyramid Technology		+61 75 522475 FAX
  -----mmmmm--- Research Park, Bond University	michi@ptcburp.ptcbu.oz.au
-------mmmmmmm- Gold Coast, Q 4229, AUSTRALIA	uunet!munnari!ptcburp.oz!michi

dougs@videovax.tv.tek.com (Doug Stevens) (10/26/90)

>Is there a way to shorten a file, that is, chop some data off the end of
>it, so that it doesn't consume as much physical space on the disk?

There's a function called chsize(int handle, long size) in the Turbo-C
library for exactly this purpose.
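
A minimal sketch of how such a call might be used to cut a file back to a
whole number of records. The helper keep_records() and the SET_SIZE macro
are hypothetical names, and the compiler tests are an assumption about
which macros a given DOS compiler defines. On Unix-like systems the sketch
falls back to ftruncate().

```c
/* Ask for POSIX declarations (ftruncate, fileno) on Unix-like systems. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

#if defined(__TURBOC__) || defined(__MSDOS__) || defined(_MSC_VER)
#include <io.h>                      /* chsize(int handle, long size) */
#define SET_SIZE(fd, n) chsize((fd), (n))
#else
#include <sys/types.h>
#include <unistd.h>                  /* ftruncate(int fd, off_t length) */
#define SET_SIZE(fd, n) ftruncate((fd), (off_t)(n))
#endif

/* Hypothetical helper: resize the file behind `fd` so that it holds
 * exactly `records` records of `rec_size` bytes each.
 * Returns 0 on success, -1 on error. */
int keep_records(int fd, long records, long rec_size)
{
    assert(records >= 0 && rec_size > 0);
    return SET_SIZE(fd, records * rec_size);
}
```

After deleting one record from a file of n records by shifting, the call
would be keep_records(handle, n - 1, record_size).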

david@csource.oz.au (david nugent) (11/02/90)

In <1162@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:

>Is there a way to shorten a file, that is, chop some data off the end of
>it, so that it doesn't consume as much physical space on the disk?  The
>file I have is too big to read into memory and write back out again, and
>there is not enough room on the disk to write out a temporary file.

Write zero bytes at that position.

Some C libraries have a chsize() function which does exactly that.
Since those libraries also don't seem to allow writing of zero bytes
you will need to create your own write function.


chsize.c:


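  #include <stdio.h>   /* NULL, SEEK_SET */
  #include <io.h>      /* lseek */

  int chsize (int fd, long newsize)
  {
     int r = -1;   /* assume failure until the zero-byte write succeeds */

     if (lseek (fd, newsize, SEEK_SET) != -1L)
          r = _write (fd, NULL, 0);
     return r;
  }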

  
_write.asm:

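.model small,c

.code

; DOS function 40h: write CX bytes from DS:DX to handle BX.
; A count of zero sets the file size to the current file position.
_write PROC, fd:WORD, buf:PTR, count:WORD

   mov bx,[fd]
   mov cx,[count]
   mov dx,[buf]
   mov ah,40H
   int 21H
   jnc .W0          ; carry clear: AX = number of bytes written
   mov ax,-1        ; carry set: report failure
.W0:
   ret

_write ENDP

END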
   
-- 

        Fidonet: 3:632/348   SIGnet: 28:4100/1  Imex: 90:833/387
              Data:  +61-3-885-7864   Voice: +61-3-826-6711
 Internet/ACSnet: david@csource.oz.au    Uucp: ..!uunet!munnari!csource!david

dfoster@jarthur.Claremont.EDU (Derek R. Foster) (11/03/90)

In article <747@csource.oz.au> david@csource.oz.au (david nugent) writes:
>In <1162@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:
>
>>Is there a way to shorten a file, that is, chop some data off the end of
>>it, so that it doesn't consume as much physical space on the disk?  The
>>file I have is too big to read into memory and write back out again, and
>>there is not enough room on the disk to write out a temporary file.
>
>Write zero bytes at that position.

If this works, it isn't documented in the Microsoft C manuals I have.
(And believe me, I searched!) After SEVERAL calls to Microsoft,
(Two separate people told me it couldn't be done from either within C or
through DOS! I thought these people were supposed to be knowledgeable!)
and a great deal of loud cursing, I was finally led to the chsize()
function. This seems to be the only way of doing this from within
Microsoft C, (And I suspect Turbo C as well.) If you are using
streams, you will probably have to close your stream, reopen the file
using handles, chsize() it, close it again, reopen using streams...
What a mess. But it works, and is better than (in my case) copying
a 20-meg file to a shorter length...

>Some C libraries have a chsize() function which does exactly that.
>Since those libraries also don't seem to allow writing of zero bytes
>you will need to create your own write function.

I'm not sure why it is preferable to create one's own write function
instead of just using chsize(). What is the advantage?

Derek Riippa Foster

ingea@IFI.UIO.NO (Inge Bj|rnvall Arnesen) (11/03/90)

In article <9505@jarthur.Claremont.EDU> you write:
>In article <747@csource.oz.au> david@csource.oz.au (david nugent) writes:
>>Write zero bytes at that position.
>
>If this works, it isn't documented in the Microsoft C manuals I have.
>(And believe me, I searched!) After SEVERAL calls to Microsoft,
>(Two separate people told me it couldn't be done from either within C or
>through DOS!

No, it's not documented in the MSC manuals, and I have never been able to
write 0 bytes to DOS through the C write() call. But the fact that writing
0 bytes sets the current file size is well documented in the MS-DOS
technical reference and other technical MS-DOS manuals and books, and the
fact that Microsoft could not help you says more about MS than about the
write system call.



-- 
Inge (BoB)  { ingea@ifi.uio.no }
=========================================================================
==   Inge Arnesen, University of Oslo, Norway.                         ==
==                                                                     ==

otto@tukki.jyu.fi (Otto J. Makela) (11/06/90)

In article <9505@jarthur.Claremont.EDU> dfoster@jarthur.Claremont.EDU (Derek R. Foster) writes:
   In article <747@csource.oz.au> david@csource.oz.au (david nugent) writes:
   >In <1162@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:
   >[how do I shorten a file ?]
   >Write zero bytes at that position. [MeSsy-DOS only solution]

   If this works, it isn't documented in the Microsoft C manuals I have.
   (And believe me, I searched!) After SEVERAL calls to Microsoft,
   (Two separate people told me it couldn't be done from either within C or
   through DOS! I thought these people were supposed to be knowledgeable!)
   and a great deal of loud cursing, I was finally led to the chsize()
   function. This seems to be the only way of doing this from within
   Microsoft C, (And I suspect Turbo C as well.) If you are using
   streams, you will probably have to close your stream, reopen the file
   using handles, chsize() it, close it again, reopen using streams...
   What a mess. But it works, and is better than (in my case) copying
   a 20-meg file to a shorter length...

Look at the fileno() function in your manual (your library does have it,
I hope).  Returns the file descriptor (handle as it's called in MeSsy-DOS)
for the given stream.
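
A rough sketch of that approach, which avoids closing and reopening the
stream. The function name truncate_stream() and the SET_SIZE macro are
hypothetical, and the compiler tests are an assumption. It uses chsize()
where a DOS compiler provides it and ftruncate() on Unix-like systems.

```c
/* Ask for POSIX declarations (ftruncate, fileno) on Unix-like systems. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

#if defined(__TURBOC__) || defined(__MSDOS__) || defined(_MSC_VER)
#include <io.h>                      /* chsize(int handle, long size) */
#define SET_SIZE(fd, n) chsize((fd), (n))
#else
#include <sys/types.h>
#include <unistd.h>                  /* ftruncate(int fd, off_t length) */
#define SET_SIZE(fd, n) ftruncate((fd), (off_t)(n))
#endif

/* Set the length of the file behind an open stream to `new_size` bytes.
 * Returns 0 on success, -1 on error. */
int truncate_stream(FILE *fp, long new_size)
{
    assert(fp != NULL && new_size >= 0);
    if (fflush(fp) != 0)             /* push buffered data to the handle */
        return -1;
    if (SET_SIZE(fileno(fp), new_size) != 0)
        return -1;
    /* Reposition the stream at the new end of file so that later reads
     * and writes start from a valid offset. */
    return fseek(fp, 0L, SEEK_END) == 0 ? 0 : -1;
}
```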
--
   /* * * Otto J. Makela <otto@jyu.fi> * * * * * * * * * * * * * * * * * * */
  /* Phone: +358 41 613 847, BBS: +358 41 211 562 (CCITT, Bell 24/12/300) */
 /* Mail: Kauppakatu 1 B 18, SF-40100 Jyvaskyla, Finland, EUROPE         */
/* * * Computers Rule 01001111 01001011 * * * * * * * * * * * * * * * * */

drd@siia.mv.com (David Dick) (11/08/90)

In <747@csource.oz.au> david@csource.oz.au (david nugent) writes:

>In <1162@bilver.UUCP> alex@bilver.UUCP (Alex Matulich) writes:

>>Is there a way to shorten a file, that is, chop some data off the end of
>>it, so that it doesn't consume as much physical space on the disk?  The
>>file I have is too big to read into memory and write back out again, and
>>there is not enough room on the disk to write out a temporary file.

>Write zero bytes at that position.

Of course, this won't work on a UNIX system.
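
A small sketch that can be used to check this on a POSIX system. The
function name size_after_zero_write() is hypothetical. For a regular file,
a zero-length write() there should return 0 and leave the file alone, so
the function should report the original size rather than the seek position.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek to `new_size` and issue a zero-length write, as in the DOS trick.
 * Returns the file size afterwards, or -1 on error. */
long size_after_zero_write(int fd, long new_size)
{
    assert(fd >= 0 && new_size >= 0);
    if (lseek(fd, (off_t)new_size, SEEK_SET) == (off_t)-1)
        return -1;
    if (write(fd, "", 0) != 0)       /* no bytes written, no truncation */
        return -1;
    return (long)lseek(fd, 0, SEEK_END);
}
```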