[comp.sources.d] PATCH usage

graham@sce.UUCP (Doug Graham) (06/29/88)

    I was applying the last few patches for "patch" today, and had
quite a bit of trouble because one of the patches got mangled in the
mail. I eventually got it figured out, but it took a while. In the
course of doing so, I noticed that the latest
patch (patch-10, haven't got 11 yet) is 80K bytes, while the entire
source for "patch" is only about 120K bytes.

    Does this make sense? Would it not be more reasonable to just
post the entire thing again with the patches installed? Then it
wouldn't be necessary to hunt around for previous patches to
bring it up to patchlevel 9 so that 10 could be applied.
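
    For the record, the drill goes something like this (filenames
invented for illustration, and I'm assuming the source carries the
usual patchlevel.h):

    cd patch                          # the source tree, at patchlevel 0
    for n in 1 2 3 4 5 6 7 8 9 10
    do
        patch < ../patch.$n || break  # a mangled patch stops you cold
    done
    grep PATCHLEVEL patchlevel.h      # should now say 10

Miss any one of those ten files and the rest are useless to you.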

    In a similar vein, I noticed that when "perl" first appeared
on the net, it was immediately followed by a large number of
patches. (Around 25 I think) Since these patches came so close
on the heels of the original source, it seems to me that they
must have been available when the original source was posted.
Why were they not applied before the source was posted?

    This is not a flame; these are questions that have baffled me
for some time, and I would really like to know the reasons for
doing things this way.

Doug.

nelson@sun.soe.clarkson.edu (Russ Nelson) (06/30/88)

In article <393@sce.UUCP> graham@sce.UUCP (Doug Graham) writes:
>    In a similar vein, I noticed that when "perl" first appeared
>on the net, it was immediately followed by a large number of
>patches. (Around 25 I think) Since these patches came so close
>on the heels of the original source, it seems to me that they
>must have been available when the original source was posted.
>Why were they not applied before the source was posted?

I cannot answer for Larry, but for my free IBM-PC editor, Freemacs, it
is *much* easier to have one distribution with patches than several
distributions.  I keep a copy of all my distributions and patches
forever, so I have an incentive to keep the number of distributions
low.

The comp.sources.unix programs often sit queued up for a while, and
during that time, perl && patch && virt are available for anonymous
ftp.  Therefore, by the time Usenet sees a program, there have already
been a number of "real" users finding bugs.  I suspect that Larry
keeps the moderator updated with patches, and he bundles them up when
he posts the program.

-- 
Pray that Bush gets re-elected so that the Republicans will be blamed for it.

lwall@devvax.JPL.NASA.GOV (Larry Wall) (07/01/88)

In article <393@sce.UUCP> graham@sce.UUCP (Doug Graham) writes:
>    In a similar vein, I noticed that when "perl" first appeared
>on the net, it was immediately followed by a large number of
>patches. (Around 25 I think) Since these patches came so close
>on the heels of the original source, it seems to me that they
>must have been available when the original source was posted.
>Why were they not applied before the source was posted?

In article <1129@sun.soe.clarkson.edu> nelson@sun.soe.clarkson.edu (Russ Nelson) writes:
: 
: I cannot answer for Larry, but for my free IBM-PC editor, Freemacs, it
: is *much* easier to have one distribution with patches than several
: distributions.  I keep a copy of all my distributions and patches
: forever, so I have an incentive to keep the number of distributions
: low.

I do make multiple distributions, but I don't generally send them off to
comp.sources.unix or I'd flood the net.  Those who ftp kits from me will
find subdirectories like ~ftp/pub/patch.2.0/kits@12, where the @12 indicates
that the kits in that directory are at patchlevel 12.
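
If you're wondering where your own copy stands, patchlevel.h will tell
you (a sketch, assuming you built from an unmodified kit):

    grep '#define PATCHLEVEL' patchlevel.h

Compare that number with the @nn on the kit directory and you know
whether you're current.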

: The comp.sources.unix programs often sit queued up for a while, and
: during that time, perl && patch && virt are available for anonymous
: ftp.  Therefore, by the time Usenet sees a program, there have already
: been a number of "real" users finding bugs.  I suspect that Larry
: keeps the moderator updated with patches, and he bundles them up when
: he posts the program.

This is the primary reason you saw 25 quick patches for perl.  When I send
things off to Rich they are almost always at patchlevel 0.  It also happens
that newer things evolve faster.  It is a measure of perl 2.0's stability
that it's been three weeks since I sent it to Rich and there's only one
published patch (so far).

Rich had 10 of those 25 patches at the time he posted perl 1.0.  (I
think I was up to about 14 when the first kit came in here.)  And he
did include those 10 as an extra message.

It's true that it would take less overall bandwidth to repost the kits,
but it's never true at any point in time that the next patch is longer
than the distribution.  Or rather, it's only been true once, and that is
why you got perl 2.0 patchlevel 0 instead of perl 1.0 patchlevel 30.
That 30th patch would have been about 3 times bigger than a new distribution.
About 10 times bigger if I'd sent out a new patch every time I made a change.
Sometimes you just have to break sync and start fresh.
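
The arithmetic is easy to check for yourself: a context diff carries
both the old lines and the new lines of every changed hunk, plus
context, so once most of a file has been touched the patch outweighs
the file.  A throwaway demonstration:

    awk 'BEGIN { for (i = 1; i <= 100; i++) print "line", i }' > old
    sed 's/line/LINE/' old > new   # touch every line
    diff -c old new | wc -l       # about as big as old and new combined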

The danger, of course, is that you might never get back in sync.  I went
off to my hermitage to come up with a new rn over a year and a half ago,
and I still don't have a runnable version, let alone a distributable one.
It's just sittin' around in pieces on the floor.  The news 3.0 people want
me to integrate it with them, and I can't even integrate it with 2.11 yet.
One has only so much spare time, sigh...

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

james@bigtex.uucp (James Van Artsdalen) (07/01/88)

In article <393@sce.UUCP>, graham@sce.UUCP (Doug Graham) wrote:
>     In a similar vein, I noticed that when "perl" first appeared
> on the net, it was immediately followed by a large number of
> patches. (Around 25 I think) Since these patches came so close
> on the heels of the original source, it seems to me that they
> must have been available when the original source was posted.
> Why were they not applied before the source was posted?

When a program is sent to comp.sources.unix, it isn't instantly posted
to the entire net.  If nothing else, Rich $alz has an employer who
doesn't pay him to handle net sources...  Also, I get the impression
Rich makes an effort to ensure that (1) junk doesn't get out and (2)
sources unpack correctly, have man pages, and all that stuff.  It
clearly takes a lot of time.

When Larry sends in sources, he usually (always?) makes it available
via ftp at the same time, so that people with arpanet connections can
get the sources without waiting for the comp.sources.unix postings.
Those people then start compiling & using *and sending in bug reports*
before the sources are actually posted.  Thus it is possible that bugs
would be fixed fairly quickly, before the sources made it out.
Recalling the sources from Rich would just delay the real release
indefinitely: there are *always* more bugs.

I have perl & the other programs from devvax.jpl.nasa.gov available on
bigtex for anonymous uucp download via TB+ if there's anybody who
can't wait.
-- 
James R. Van Artsdalen   ...!ut-sally!utastro!bigtex!james   "Live Free or Die"
Home: 512-346-2444 Work: 328-0282; 110 Wild Basin Rd. Ste #230, Austin TX 78746

chip@vector.UUCP (Chip Rosenthal) (07/01/88)

In article <393@sce.UUCP> graham@sce.UUCP (Doug Graham) writes:
>I noticed that the latest patch (patch-10, haven't got 11 yet) is
>80K bytes, while the entire source for "patch" is only about 120K bytes.

I sometimes wish patch had a mode which replaced the entire file
rather than patching it.  That would have allowed, for example,
patch's own patch number 11, which was really a shar archive, to be
distributed as a patch.

The obvious drawback is that you would lose your local hacks, so patch
should default to asking a "really?" question.  In that case you would
have to back out your hacks, run patch, and then put them back in.
But you run that risk anyway whenever you hack on software which may
see future patches.
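
Until patch grows such a mode, a few lines of shell will fake it.
Strictly a sketch; the name "freshen" and the calling convention are
my own invention:

    # freshen: overwrite a working file with a fresh copy, but ask first
    # usage: freshen working-file fresh-copy
    freshen() {
        echo "really replace $1 with $2?  (local hacks will be lost) [y/n]"
        read ans
        case "$ans" in
        [yY]*)  cp "$2" "$1" && echo "$1 replaced" ;;
        *)      echo "$1 left alone" ;;
        esac
    }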
-- 
Chip Rosenthal /// chip@vector.UUCP /// Dallas Semiconductor /// 214-450-0400
{uunet!warble,sun!texsun!rpp386,killer}!vector!chip
I won't sing for politicians.  Ain't singing for Spuds.  This note's for you.

karl@ddsw1.UUCP (Karl Denninger) (07/03/88)

In article <1129@sun.soe.clarkson.edu> nelson@sun.soe.clarkson.edu (Russ Nelson) writes:
>In article <393@sce.UUCP> graham@sce.UUCP (Doug Graham) writes:
>>    In a similar vein, I noticed that when "perl" first appeared
>>on the net, it was immediately followed by a large number of
>>patches. (Around 25 I think) Since these patches came so close
>>on the heels of the original source, it seems to me that they
>>must have been available when the original source was posted.
>>Why were they not applied before the source was posted?
>
>I cannot answer for Larry, but for my free IBM-PC editor, Freemacs, it
>is *much* easier to have one distribution with patches than several
>distributions.  I keep a copy of all my distributions and patches
>forever, so I have an incentive to keep the number of distributions
>low.

But the enormous volume engendered by all the patch traffic is the point of
contention.  It is very inefficient to send a 90K original posting, then
200K of patches.  Much, much better to send the original, patched...

Your convenience is a poor excuse for the extra bandwidth consumed.  It's
much better for the *net as a whole* if authors are efficient in the
bandwidth they consume.  The present scheme (send original, then 20+ sets of
context-diff patches) is horridly inefficient.  There have been many recent
episodes of "patch-mania" that illustrate this.

Patch should *not* be used to make wholesale changes to programs over the
net.  It IS a wonderful tool for those "small bugs" -- but when the sum of
the patches gets to the point where you're consuming 50% of the bandwidth of
a repost, you should repost.  Remember, the patch itself is *useless*
without the underlying program, and thus by definition has a limited
audience.  A repost is useful in and of itself, and thus (by definition) has 
a larger audience.  This is even true for programs such as patch itself,
which nearly everyone uses -- if you miss a patch, you must now FIND that
patch to be able to use all the "wonderful" patches which follow.  Ugh.

--
Karl Denninger (ddsw1!karl) Data: (312) 566-8912, Voice: (312) 566-8910
Macro Computer Solutions, Inc.    "Quality solutions at a fair price"

bd@hpsemc.HP.COM (bob desinger) (07/05/88)

Karl Denninger (karl@ddsw1.UUCP) flames:
> Much, much better to send the original, patched...

Ideally, yes.  But (as Larry and others pointed out), the patches
don't exist when the original enters the queue.  The cause of the
problem here is that it takes a long time for the comp.sources.unix
queue to empty; the "original then patch-mania" syndrome is really a
symptom.  Solving the problem requires someone to volunteer to help
Rich Salz out, or else a better queueing method.

BTW, Rich does a great job, especially considering it's volunteer
labor.  I'd rather he continue upholding his high standards than lower
the quality of the postings by pushing them out the door faster.

> Patch should *not* be used to make wholesale changes to programs over the
> net.  It IS a wonderful tool for those "small bugs" -- but when the sum of
> the patches gets to the point where you're consuming 50% of the bandwidth of
> a repost, you should repost.

The net doesn't seem to share your conclusions: patches seldom get
flamed, but reposts almost always do.

> Remember, the patch itself is *useless*
> without the underlying program, and thus by definition has a limited
> audience.  A repost is useful in and of itself, and thus (by definition) has 
> a larger audience.  This is even true for programs such as patch itself,
> which nearly everyone uses -- if you miss a patch, you must now FIND that
> patch to be able to use all the "wonderful" patches which follow.  Ugh.

It's actually quite easy to get patches for patch if you read the
prologue to any of them and can navigate the mail-routing network.
Several sites offer patch over the Internet for the privileged few, by
uucp for the masses, and by mail for just about everyone.  And don't
forget comp.sources.wanted.

But this is merely a detail.  I don't agree with your underlying
argument.  It's relatively easy, although time-consuming, to keep up
with patches if you have a reliable newsfeed and if you keep up with
comp.sources.bugs.  The alternatives are to fix the bugs yourself or
to run buggy software, neither of which appeals to me.

You want us to reduce net bandwidth by eliminating patches for
recently-posted sources.  But the number of bytes of patches is mouse
eyelashes compared to the non-technical junk that comes streaming from
your newsfeed.  Compare the source postings and patches with the size
of the angry screams over JJ@Portal, talk.politics.*, talk.bizarre, or
the rec.* groups.  If you want to speed up the net, don't waste time
trying to optimize the part that takes 2% of the bandwidth.

-- bd

barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) (07/07/88)

In article <1323@ddsw1.UUCP> karl@ddsw1.UUCP (Karl Denninger) writes:
|But the enormous volume engendered by all the patch traffic is the point of
|contention.  It is very inefficient to send a 90K original posting, then
|200K of patches.  Much, much better to send the original, patched...

It is better to get 13 patches than 13 copies of the entire
distribution. 

And since I USE Larry's software DAILY, I appreciate getting the
patches before I find the bugs the hard way.
-- 
	Bruce G. Barnett 	<barnett@ge-crd.ARPA> <barnett@steinmetz.UUCP>
				uunet!steinmetz!barnett

karl@ddsw1.UUCP (Karl Denninger) (07/08/88)

In article <4764@vdsvax.steinmetz.ge.com> barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) writes:
>In article <1323@ddsw1.UUCP> karl@ddsw1.UUCP (Karl Denninger) writes:
>|But the enormous volume engendered by all the patch traffic is the point of
>|contention.  It is very inefficient to send a 90K original posting, then
>|200K of patches.  Much, much better to send the original, patched...
>
>It is better to get 13 patches than 13 copies of the entire
>distribution. 

I'm not arguing for dispensing with the patch process -- just against
25 different sets of patches at once.

What I was suggesting was that when the patch size gets out of hand
(i.e., when you post 200K of patches, several sets all at once,
against a 90K source), consideration be given to sending out a new
base distribution.

This both cuts traffic and increases the average information content
of the group for the net as a whole.  Here's how:

o Those who don't have the posting will get the entire package at the
  next full repost, whenever it may occur.  Thus, for these users, it
  has a much higher utility than "another patch" would have.

o Those who do already have the original get the same utility as they would
  from a patch; that is, a new version with bug fixes and enhancements.

I don't think I was advocating a complete repost for each bug and
fix... that would be a gross waste.  What I was advocating was common
sense instead of the "make it easier on the authors" approach.  When
we're talking a major upgrade (i.e., a major version change), you're
probably best off doing a repost; context diffs get enormous.  A place
to start would likely be 75-80% of the size of a reposting... and
adjust to taste and net.desire.
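
Put numbers on it and the test is trivial for an author to run before
posting yet another set (a sketch; the kit and patch filenames are
whatever your distribution uses):

    dist=`cat kit.* | wc -c`        # size of a full distribution
    pats=`cat patches/* | wc -c`    # cumulative published patches
    if [ $pats -gt `expr $dist \* 3 / 4` ]
    then
        echo "patches are over 75% of a kit -- time to repost"
    fi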

Now, the mahjong tiles, that's a whole 'nother matter, better left to
a different group or unsaid completely :-)

--
Karl Denninger (ddsw1!karl) Data: (312) 566-8912, Voice: (312) 566-8910
Macro Computer Solutions, Inc.    "Quality solutions at a fair price"