NORM%IONAACAD.BITNET@cunyvm.cuny.edu ( Norman Walsh) (03/14/91)
Hello,

I have a question that I cannot answer sufficiently for myself:

Given a collection of n files (.zip files in my case), is there an algorithm (short of brute force) for determining the best arrangement of files to fit on as few disks (of a given size) as possible?

Suppose, for example, I have 40 files that range in size from 30 to 400 kb. I want to move them all to floppies, but I want to do so in such a way as to use as few floppies as possible. I want each file to be complete on one disk, i.e. I don't want to split files across disks.

My first-guess algorithm is to always place the largest file that will fit on a particular disk on that disk. However, I have my suspicions that this is not the guaranteed best way of doing it. Unfortunately, I can't think of a better way, nor can I demonstrate mathematically that the biggest-first algorithm is the best... (or not)

Any thoughts?

ndw
forbesmc@matai.vuw.ac.nz (03/15/91)
In article <26272@adm.brl.mil>, NORM%IONAACAD.BITNET@cunyvm.cuny.edu ( Norman Walsh) writes:
> Given a collection of n files (.zip files in my case) is there an
> algorithm (short of brute-force) for determining the best arrangement
> of files to fit on as few disks (of a given size) as possible?

I think your first guess is right: make a list of the files in order of descending size, pick the largest file and write it to the disk, find out how much space is left on the disk, and put the largest remaining file that will fit onto it. Repeat until no more files will fit on the disk, then repeat the entire procedure for the next disk, and so on. (This method is used to minimise queuing time for printers etc., so it should work for minimising the number of disks.)
--
+------------------------------------------------------------------+
| I'm a toxophilite,         | Murray Forbes,                      |
| so what's your problem?    | Physics Department,                 |
|----------------------------| Victoria University of Wellington,  |
| standard disclaimer :      | New Zealand.                        |
| all opinions here are mine | FORBESMC@MATAI.VUW.AC.NZ            |
+------------------------------------------------------------------+
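The biggest-first strategy described above is usually called first-fit decreasing (FFD). A minimal sketch, assuming file sizes in kilobytes and a hypothetical 360 KB floppy capacity (function and variable names are illustrative, not from any poster's program):

```python
def pack_ffd(sizes, capacity):
    """First-fit decreasing: place each file (largest first) on the
    first disk with enough free space, opening a new disk if none fits."""
    free = []         # remaining free space on each opened disk
    assignment = []   # (file size, disk index) pairs
    for size in sorted(sizes, reverse=True):
        for i, space in enumerate(free):
            if size <= space:
                free[i] -= size
                assignment.append((size, i))
                break
        else:
            # no existing disk can hold this file: open a new one
            free.append(capacity - size)
            assignment.append((size, len(free) - 1))
    return len(free), assignment

# Example: six files onto 360 KB floppies
n_disks, layout = pack_ffd([300, 200, 150, 100, 60, 30], 360)
```

For this example the total is 840 KB, so three 360 KB disks is optimal, and FFD achieves it; in general FFD is a heuristic and can use slightly more than the optimal number of disks.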
ts@uwasa.fi (Timo Salmi) (03/15/91)
In article <26272@adm.brl.mil> NORM%IONAACAD.BITNET@cunyvm.cuny.edu ( Norman Walsh) writes:
>
>Given a collection of n files (.zip files in my case) is there an
>algorithm (short of brute-force) for determining the best arrangement
>of files to fit on as few disks (of a given size) as possible?
:
>Any thoughts?

Your problem is an operations research problem known as the cutting stock (or bin packing) problem. You could conduct a literature search using these keywords, or something near them. I'm not personally familiar with the exact algorithms, but I know that the task has been known and studied for a long time in the O.R. science. Perhaps readers of sci.math.num-analysis would have some further information on this.

...................................................................
Prof. Timo Salmi
Moderating at garbo.uwasa.fi anonymous ftp archives 128.214.12.37
School of Business Studies, University of Vaasa, SF-65101, Finland
Internet: ts@chyde.uwasa.fi  Funet: gado::salmi  Bitnet: salmi@finfun
mcastle@mcs213e.cs.umr.edu (Mike Castle {Nexus}) (03/15/91)
In article <26272@adm.brl.mil> NORM%IONAACAD.BITNET@cunyvm.cuny.edu ( Norman Walsh) writes:
>My first-guess algorithm is to always place the largest file that will
>fit on a particular disk on that disk. However, I have my suspicions
>that this is not the gauranteed best way of doing it.
>
>Any thoughts?

Amazingly enough, I happen to have to write a program that does something very similar to this (putting objects into backpacks) for my data structures class. Of course, I haven't started on it yet.

Starting off with the biggest file IS one method (a heuristic method, as my prof said). The method we have to use goes through all the possible packing combinations, reporting which one will work (well, we stop tracing a combination back once it becomes apparent that it won't work). I believe the complexity is O(n^2) once you apply dynamic programming in addition to the divide-and-conquer method, but I'm not sure (too lazy to look it up in my notes :-).

Maybe when I get done with that program, I'll upload the generic code to simtel and garbo. Interesting.
--
Mike Castle (Nexus) S087891@UMRVMA.UMR.EDU (preferred)       | XEDIT: Emacs
mcastle@mcs213k.cs.umr.edu (unix mail-YEACH!)                | on a REAL
Life is like a clock: You can work constantly, and be right  | operating
all the time, or not work at all, and be right twice a day.  | system. :->
ccoprrm@prism.gatech.EDU (Robert E. Minsk) (03/15/91)
This algorithm has a time complexity in O(n^2), and for n files of random size it will use approximately 0.3*sqrt(n) extra disks. For a good discussion of this and many more algorithms see "Computer Algorithms: Introduction to Design and Analysis" by Sara Baase.
--
Robert E. Minsk - Information Technology  | After WREKage there is Ruins...  |
ARPA: ccoprrm@prism.gatech.edu            | Fridays at 10pm WREK Atlanta 91.1|
uucp: ...!{allegra,amd,hplabs,seismo,ut-ngp}!gatech!prism!ccoprrm
Georgia Institute of Technology, Atlanta Georgia, 30332
amead@s.psych.uiuc.edu (alan mead) (03/16/91)
As a practical matter, you would be better served by utilities that "section" zip files into multiple parts of arbitrary (say 360K) sizes. I'm not sure where such a beast can be had, but it exists (I suppose you could write your own in TP pretty easily; just watch out for extra DOS characters like ^Z). I think a friend said that the version he had wasn't 100% reliable, so use it with some caution.

-alan mead : amead@s.psych.uiuc.edu
nmouawad@watmath.waterloo.edu (Naji Mouawad) (03/16/91)
In article <26272@adm.brl.mil> NORM%IONAACAD.BITNET@cunyvm.cuny.edu ( Norman Walsh) writes:
>Given a collection of n files (.zip files in my case) is there an
>algorithm (short of brute-force) for determining the best arrangement
>of files to fit on as few disks (of a given size) as possible?
>
>Any thoughts?
>
> ndw

This is closely related to the well-known knapsack problem, where you are given a knapsack and a set of different weights and are asked to fit as much weight as possible into the knapsack. (Packing files onto several disks is the bin packing problem, a near relative.) It is an NP-complete problem, meaning that the best known exact algorithm runs in time exponential in the number of weights.

If you really want to write such a program, your best bet is dynamic programming, where you build a table that solves the problem for a sequence of increasing knapsack capacities, using earlier solutions to construct the current one.

Having said that, your intuitive algorithm may on average give you a solution that differs from the optimal one by only a small constant factor. I suggest that you stick to it, unless you really want to squeeze the last byte out of your diskettes.

--Naji.

References:
- Garey and Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman.
- Aho, Hopcroft and Ullman, Data Structures and Algorithms, Addison-Wesley.
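The dynamic-programming idea sketched above can be illustrated for a single disk: a subset-sum table tells you the largest total file size, not exceeding the disk capacity, achievable from a given set of files. This is a minimal sketch assuming whole-kilobyte sizes; the function name and example sizes are illustrative:

```python
def best_fill(sizes, capacity):
    """Subset-sum DP: return the largest total <= capacity that some
    subset of `sizes` can reach (each file used at most once)."""
    reachable = [True] + [False] * capacity  # reachable[c]: total c achievable
    for s in sizes:
        # iterate downwards so each file is counted at most once
        for c in range(capacity, s - 1, -1):
            if reachable[c - s]:
                reachable[c] = True
    return max(c for c in range(capacity + 1) if reachable[c])

# Example: the best single-disk fill from six files on a 360 KB floppy
best = best_fill([300, 200, 150, 100, 60, 30], 360)
```

The table costs O(n * capacity) time per disk, which is pseudo-polynomial; repeatedly filling one disk as full as possible is itself only a heuristic for the multi-disk problem, as the posts above note.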
--
-------------------------------------------------------------------
| Naji Mouawad | nmouawad@watmath.waterloo.edu                    |
| University   |--------------------------------------------------|
| Of Waterloo  | "The Stranger in us is our most familiar Self"   |
hampton@tigger.Colorado.EDU (Jeff Hampton) (03/16/91)
Expires: 04/01/91
References: <26272@adm.brl.mil>
Sender: Jeff Hampton
Distribution: usa
Organization: University of Colorado, Boulder (dept. of Computer Science)
Keywords: algorithms

I have been thinking of a similar type of problem myself. The problem, however, is NP-complete. If you use the first-fit strategy, the worst-case time complexity is on the order of n^2, and it produces good results. Below is an example algorithm for nonincreasing first fit (taken from Computer Algorithms by Sara Baase, 2nd ed., p. 339):

input:  S = (s1, ..., sn), where 0 < si <= 1 for 1 <= i <= n
output: An array bin where for 1 <= i <= n, bin[i] is the number
        of the bin into which si is placed.

procedure NIFF (S: RealArray; n: integer; var bin: IntegerArray);
var
  used: array[1..n] of real;
    {used[j] is the amount of space in bin j already used up.}
  i, j: integer;
begin
  {Sort S into nonincreasing order.}
  for j := 1 to n do
    used[j] := 0;
  for i := 1 to n do
    begin
      {Look for a bin in which S[i] fits.}
      j := 1;
      while used[j] + S[i] > 1 do
        j := j + 1;
      bin[i] := j;
      used[j] := used[j] + S[i]
    end  { for }
end;  { NIFF }

The sort can be done in n*lg(n), and j is incremented while searching for an appropriate bin at most n(n-1)/2 times. All the other instructions are executed at most n times, so the worst-case time complexity is no more than n^2. The expected number of extra bins is at most roughly 0.3*sqrt(n).

Other strategies include best fit and next fit. Best fit, if the si are sorted in nonincreasing order, works about as well as first fit. Next fit is faster but will create extra bins.

Those are some things to think about when considering your problem. I hope this helps somewhat.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ Jeff Hampton | email: hampton@tigger.colorado.edu | University of Colorado +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
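For readers without a Pascal compiler handy, here is a rendering of the same nonincreasing-first-fit idea in Python. This is a sketch, not Baase's code: sizes are assumed already normalized to a bin capacity of 1, and bins are numbered from 0.

```python
def niff(sizes):
    """Nonincreasing first fit: return, for each input size, the index
    of the bin it is placed in. Sizes are fractions of one bin (0 < s <= 1)."""
    order = sorted(range(len(sizes)), key=lambda i: -sizes[i])
    used = []                       # space already consumed in each bin
    bin_of = [None] * len(sizes)    # bin index assigned to each input size
    for i in order:                 # consider files largest-first
        for j, u in enumerate(used):
            if u + sizes[i] <= 1.0:     # first bin with room wins
                used[j] += sizes[i]
                bin_of[i] = j
                break
        else:
            used.append(sizes[i])       # open a new bin
            bin_of[i] = len(used) - 1
    return bin_of

# Example: four files at half, 0.7, 0.3 and 0.4 of a disk fit on two disks
assignment = niff([0.5, 0.7, 0.3, 0.4])
```

The O(n^2) bound quoted above applies here too: the inner scan over `used` dominates, and the initial sort contributes only n*lg(n).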
phys169@csc.canterbury.ac.nz (03/19/91)
In article <26272@adm.brl.mil>, NORM%IONAACAD.BITNET@cunyvm.cuny.edu ( Norman Walsh) writes:
>
> Given a collection of n files (.zip files in my case) is there an
> algorithm (short of brute-force) for determining the best arrangement
> of files to fit on as few disks (of a given size) as possible?

As others have said, this can take a long time to solve. I have a program that does exactly this, and you should be able to get it via anonymous ftp to cantva.canterbury.ac.nz [132.181.30.3] in the pc subdirectory. It outputs a series of COPY and PAUSE commands to the standard output, and has various levels of optimisation. Sorry, the source isn't available at the moment.

> My first-guess algorithm is to always place the largest file that will
> fit on a particular disk on that disk. However, I have my suspicions
> that this is not the guaranteed best way of doing it.

That's what my program does at optimisation level zero. The default level (2) usually takes a bit longer (it varies a lot with the data, but 50% longer is typical), i.e. not as long as the brute-force method, but often enough it is better than your (DFF) method. There is little benefit in putting too much computational effort into finding the optimum; you might only save one diskette.

What is worth doing is using a backup program that can split files across diskettes. I notice that the PKZIP file format specifications allow for multiple-volume backups, and I wonder why this hasn't been implemented yet.

Mark Aitchison, Physics, University of Canterbury, New Zealand.