[comp.software-eng] Paradigms for source control

jxh@cup.portal.com (05/11/88)

Some time ago, there was a discussion among a few friends of mine about
source-control programs.  I thought it was time to revive this issue, and
now that I have a news feed I thought that comp.software-eng was an
appropriate place for it to rage.  What follows is the entire text of the
discussion as I received it over email.  Beware of people's email addresses:
several have changed since then, notably mine.  Also, bear in mind that these
were personal messages originally, and were not composed for net consumption,
so first names are used freely.  Here is a list of the people involved, and
their real names and (I hope) current addresses:

Jim Hickstein (jxh)     jxh@cup.PORTAL.COM [me]
Jeff Woolsey (jlw)      woolsey@nsc.NSC.COM
Jeff Pomeroy (jlp)      jlp@CRAY.COM
Dan Germann (deg)       deg@kksys.UUCP
Mark Ransom (msr)       msr@kksys.UUCP
Alan Arndt (aga)        (not on the net just now: send to care of jxh)

I intended this discussion to revolve around issues of implementing our own
such program.  Now that I open this to the net, I would welcome thoughtful
insights about features of various (perhaps extant) programs; but still with
an eye toward implementing something.  Please, no flames about somebody-or-
other's lousy program that ate your file.  We should work toward specifying
a program that will solve problems for those involved.
============================================================================
-
From tsec!nsc!amdahl!meccts!kksys!deg Fri Sep 11 08:13:22 1987
Received: by nsc.NSC.COM; Thu Sep 10 21:44:44 1987
Received: by amdahl.UUCP (4.12/UTS580_/\o-/\)
id AA07489; Thu, 10 Sep 87 16:36:09 PDT
Received: by meccts.MECC.MN.ORG (smail2.5)
id AA16368; 10 Sep 87 04:57:37 CDT (Thu)
Received: by kksys.UUCP (smail2.3)
id AA01692; 10 Sep 87 03:36:15 CDT (Thu)
To: deg, meccts!amdahl!nsc!tsec!nsc-nca!jxh, meccts!amdahl!nsc!woolsey, msr,
        umn-cs!cray!jlp
Subject: Welcome to library-people mailing list
Date: 10 Sep 87 03:36:15 CDT (Thu)
From: deg@kksys.UUCP
Message-Id: <8709100336.AA01692@kksys.UUCP>

A week or so ago, Jim and I were talking about Revise, and noted that
it would be nice to have all our discussions down on paper.  We also
thought it would be nice if our discussions were not just between the two of
us, given that there are several people who are interested in source library
maintenance.  Therefore, I have created this mailing list.  Our intention is
to discuss the functions and features desirable in tools which maintain
source programs.

Anyone care to kick things off?

-Dan


From jxh Fri Sep 11 18:31:15 1987
To: jxh, tsec!nsc!deg@kksys, tsec!nsc!jlp@cray, tsec!nsc!msr@kksys, tsec!nsc!woolsey
Subject: library-people kickoff

Well, I suppose I'll start, but this is probably not the first kickoff, since
we're not synchronized.
Let me first propose that Alan Arndt be added to this group: Al works for
Ultron Labs in San Jose, and uses Polytron PVCS (Polytron Version Control
System).  He has quite a bit of experience with it.  I do not propose that he
be added to sell Polytron to us, but that his experience with its good *and*
bad points may serve us well.  He is not on the net as I speak, but I intend
to help him rectify this situation soon.  Tentatively, send to nsc-nca!aga.

For those of you who just tuned in, let me recap my discussions with Dan.
I have been sort of working on Revise, which is a descendent of UPDATE/MODIFY
from the Cybers, in that it has some of the same basic assumptions, namely that
there are Decks containing Lines (they got rid of the word Card at this point),
and that lines may be Active or Inactive, and that they may be inserted, and
deactivated, but
not actually deleted, by a modset.  A modset can also reactivate "deleted"
lines.  It is different from UPDATE/MODIFY in that
it is a standard Pascal program designed for portability, and it operates on
simple Text files: the PL (program library) format is specified to be so
general as not to cause problems on any known Pascal.  Tradeoffs were made in
favor of portability over efficiency of time and space.  This is a very pure
example of the tenet that the more adaptable a system is, the less well-adapted
it is in each different circumstance.  It is also different from UPDATE/MODIFY
in that it operates on only one PL at a time.  Dan and I identified this is
a major departure from a quite useful feature, namely the *OPL directive,
which indicates that several OPL files be considered as "OldLib" for one
run.
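
To make the model concrete (and because not everyone here grew up on the
Cybers), here is a rough sketch in C of the data model as I understand it.
These are NOT the Revise declarations -- just an illustration of decks,
lines, and the active/inactive business:

    /* A PL is a list of Decks; a Deck is a list of Lines; a Line is
       never physically removed, only marked inactive by the modset
       that "deleted" it, so it can be reactivated later. */
    struct line {
        char        *text;
        char        *origin;     /* deck or modset that inserted it */
        int          seq;        /* sequence number within that origin */
        char        *yanked_by;  /* NULL while active; otherwise the
                                    name of the modset that deactivated it */
        struct line *next;
    };

    struct deck {
        char        *name;
        struct line *lines;
        struct deck *next;       /* the PL is just the list of decks */
    };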

There are those in Unix land who use a program called SCCS, and another called
RCS (not caps).  I am less familiar with these, but I think that PVCS is
related: They operate by controlling the "checking out" and subsequent
"checking in" of modules.  One checks out a module, edits it, and checks it
in, identifying a "version" as the label for this instance of that module.
Basically, these programs run COMPARE over the current and new modules, and
store the differences.  Also, they tend to store the most "recent" "version"
in a form most efficient to access: the stored changes are "inverted" so that
the program applies a change to "go back" to a "previous" version.  All
this is in quotes because, as we know, program modifications are not
necessarily monotonic with time.  An example is the distribution kit for
Revise, which gives a base PL, and two Modsets "CYBER" and "VAX", one of
which is applied to make Revise compile in each environment.
Neither CYBER nor VAX is more "recent" than the other, although one could
certainly say that they are "versions."  I often use the word "flavors" to
express this concept.
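
To illustrate the "inverted" storage (schematically -- this is not any
particular tool's file format): if versions 1, 2, and 3 of a module exist,
the library might hold

    version 3 text   (complete, cheap to fetch)
    delta 3 -> 2     (changes that turn version 3 back into version 2)
    delta 2 -> 1     (changes that turn version 2 back into version 1)

so the common case, fetching the newest version, costs nothing extra, and
older versions are reconstructed by applying the stored reverse deltas in
sequence.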

The differences between SCCS/RCS and UPDATE/MODIFY are radical.  They are
so different, that I say that each represents a "paradigm" of source
control generally.  The UPDATE/MODIFY paradigm is the one with which I am
most familiar, so I naturally want a program that embodies it to run in my
preferred (enforced?) environment.  Those "brought up" on RCS or SCCS cannot
see any merit in the other paradigm, at least those to whom I have talked.

I invite you to make your own observations about these paradigms, and about
specific programs.  Certainly, there are features conceivable that exist in
neither of these worlds: let us consider them, as well.  Our goal should be
to establish a new paradigm which embodies the best features of all others,
and can be implemented well.  We can then move to design questions about
specific implementations.

Fire away.

Jim Hickstein
...!nsc!tsec!nsc-nca!jxh
VSAT Systems, Inc.
San Jose, CA


From tsec!nsc!nsc.NSC.COM!woolsey Tue Sep 15 20:23:20 1987
Received: by nsc.NSC.COM; Tue Sep 15 10:32:16 1987
Received: by pubs.nsc.com; Tue Sep 15 10:30:31 1987
Date: Tue, 15 Sep 87 10:30:31 PDT
From: Jeff Woolsey <woolsey@nsc.NSC.COM>
Message-Id: <8709151730.AA13546@pubs.nsc.com>
To: deg@kksys.UUCP, jlp@cray.COM, msr@kksys.UUCP, nsc!tsec!nsc-nca!jxh,
        tsec!nsc-nca!aga
Subject: library-people first down

>[Al] uses Polytron PVCS (Polytron Version Control System).

Department of Redundancy Department

I looked into a similar package while at TGC; I think Grady [Davis, now @ VSI]
got it for us to evaluate.  I can't remember the name of the package. That's
some indication of how impressed I was.  Perhaps Jim can remember its name.
It was of the general paradigm rampant among small machines for such packages.
(For these purposes a small machine is anything of VAX power or smaller, or
anything running Unix (sorry, Jeff).)

>it has some of the same basic assumptions, namely that
>there are Decks containing Lines (they got rid of the word Card at this point),

but not the word "Deck"?

>it is a standard Pascal program designed for portability, and it operates on
>simple Text files: the PL (program library) format is specified to be so
>general as not to cause problems on any known Pascal.  Tradeoffs were made in
>favor of portability over efficiency of time and space. 

Weren't there some cases in the code for Revise where an order of magnitude
improvement could be had at no cost in portability?  Didn't you find some of
those, Jim?

>It is also different from UPDATE/MODIFY in that it operates on only 
>one PL at a time.  Dan and I identified this is [sic]
>a major departure from a quite useful feature, namely the *OPL directive,
>which indicates that several OPL files be considered as "OldLib" for one
>run.

Strike one.  It could be difficult to restore this feature efficiently, judging
from present Revise performance with one PL on machines we can afford to 
purchase.

>There are those in Unix land who use a program called SCCS, and another called
>RCS (not caps).

Huh?  RCS is still the name of RCS, even though RCS might be invoked as rcs.

>I am less familiar with these, but I think that PVCS is
>related: They operate by controlling the "checking out" and subsequent
>"checking in" of modules.  One checks out a module, edits it, and checks it
>in, identifying a "version" as the label for this instance of that module.

Can any of these programs operate correctly if the directory and files where
the "library" are are read-only?  The check-out process usually wants to note
somewhere that someone is doing something which could cause inconsistencies
in someone's view of the state of the "library".

M/U users used a property of the NOS file system, namely that a D/A file
was BUSY (write-locked by someone else) or unwritable (someone has it in
READ without ALLOW-MODIFY or ALLOW-APPEND).

>Basically, these programs run COMPARE over the current and new modules, and
>store the differences.  Also, they tend to store the most "recent" "version"
>in a form most efficient to access: the stored changes are "inverted" so that
>the program applies a change to "go back" to a "previous" version.  All
>this is in quotes because, as we know, program modifications are not
>necessarily monotonic with time.  An example is the distribution kit for
>Revise, which gives a base PL, and two Modsets "CYBER" and "VAX", one of
>which is applied to make Revise compile in each environment.

I think the importance of this concept cannot be overstated, but has been
overlooked by all of the "small system" [as above] library tools.  This is
feature code, a sort of conditional compilation/assembly pulled back one
level of processing.

>The differences between SCCS/RCS and UPDATE/MODIFY are radical.

That's putting it mildly.  The only thing in common is that they are
attempts to solve the same problem.  Well, almost.  I think MODIFY/UPDATE
recognized additional sub-problems and tried to solve those, too.

>They are
>so different, that I say that each represents a "paradigm" of source

No need for quotes here...

>control generally.  The UPDATE/MODIFY paradigm is the one with which I am
>most familiar, so I naturally want a program that embodies it to run in my
>preferred (enforced?) environment.  Those "brought up" on RCS or SCCS cannot
>see any merit in the other paradigm, at least those to whom I have talked.

Indeed.  Perhaps we can attack their character by saying that they never saw
the need because they never worked on a VERY large product (such as NOS) 
requiring coordinated effort by sizable teams.

I see here also another major difference between M/U and *CS:  The former is
monolithic, while the latter is incremental.  Let me explain.  M/U run on 
machines where (at least theoretically) there is enough power available that
it is no great drain on resources to keep editing a modset and creating a 
COMPILE file every time you want to assemble/compile something.  You do not
notice how much (possibly redundant) work you are asking the machine to 
perform, because small increments of work are not noticeable.  This is true
only up to a point, as I would sometimes go the *CS route: pull out a source
file, edit it for two days, THEN use compare to make a modset.

*CS run on machines without sufficient power to hide these small increments
of work.  Thus the paradigm changed to permit the elimination of most of
the MODSET -> COMPILE file operations in an edit cycle.  The time required
for the edit cycle remains on the not-enough-time-to-get-coffee side of the
line, whereas with M/U (and Revise, as we have seen) it retreats past coffee
and on into time-enough-to-read-War-and-Peace territory.  Nothing like
breaking a train of thought to introduce errors and reduce engineer 
effectiveness.  So small increments of change are evaluated.  Other examples
of this technique are incremental compilers, and Chess 0.5 (as featured in BYTE
some time ago).  There was even some talk at UCC about a text editor that would
spit out a modset when you were done.

>I invite you to make your own observations about these paradigms, and about
>specific programs.  Certainly, there are features conceivable that exist in
>neither of these worlds: let us consider them, as well.  Our goal should be
>to establish a new paradigm which embodies the best features of all others.

Dream on, then?  OK.  

I like named modsets.  I like independent modsets.  I like being protected
from disaster (significant effort required to make changes stick).  I like
to remove modsets.  I like to make virtual OPLs (*OPLFILE) (often I'd use
four such PLs in building the Cray Station).  I would like to minimize
work and maximize speed using incremental techniques.  I'm not completely
comfortable with the smallest unit of change being a line, as the significance
of lines diminishes in modern languages. For that matter, we aren't always
maintaining programs.  Soon library-people shall enter the world of the DBMS.

The biggest difference between MODIFY and REVISE/UPDATE is random access, and
everything you can do with it.

Your turn.



From tsec!nsc!amdahl!meccts!kksys!deg Sat Oct  3 18:19:31 1987
Received: by nsc.NSC.COM; Tue Oct  6 00:28:21 1987
Received: by amdahl.UUCP (4.12/UTS580_/\o-/\)
id AA25752; Mon, 5 Oct 87 23:39:04 PDT
Received: by meccts.MECC.MN.ORG (smail2.5)
id AA06036; 6 Oct 87 00:04:28 CDT (Tue)
Received: by kksys.UUCP (smail2.3)
id AA10528; 5 Oct 87 22:30:41 CDT (Mon)
To: deg, meccts!amdahl!nsc!tsec!nsc-nca!aga,
        meccts!amdahl!nsc!tsec!nsc-nca!jxh, meccts!amdahl!nsc!woolsey, msr,
        umn-cs!cray!jlp
Subject: Ramblings from one of the guys "in the back"
Date: 5 Oct 87 22:30:41 CDT (Mon)
From: deg@kksys.UUCP
Message-Id: <8710052230.AA10528@kksys.UUCP>

>SCCS, RCS, etc...

SCCS attempted to enforce module integrity by allowing only one user to
gain access to the "source" of the module in "edit" mode at one time.
There were several commands in the SCCS package, "admin" to administrate
the SCCS files, "delta" to apply a change to a SCCS file, "get" to obtain
the source to a SCCS file, etc.  I believe that "get -e" (or something)
was used to declare that you intended to make changes to the file, rather
than just look at or compile it.  You were not allowed to do a "get -e" on
a file that someone else had interlocked by their "get -e".  The interlock
was cleared when the "delta" to the SCCS file was posted.  The "teeth"
in SCCS were due to the files being owned by a project leader, who created
and modified the SCCS files.  He (actually, SCCS probably did this by
default) would set the file permissions so that only he would be able
to write the SCCS versions of the files.  The project team members would
be in a unix group that had permission to read the SCCS files.
I beileve that RCS works similarly, except that the commands are
"ci" (check in) [don't mistype your editor name or the file disappears!]
and "co" (check out).  I have no idea how they work.  In either case, if
the SCCS/RCS files were not protected BY THE OPERATING SYSTEM, any person
with write permission to the files could apply any change at any time.
In fact, they could trash the files completely.  What an interesting
implementation of "source code control".  Actually, this is no different
from Modify/Update/Revise/yournamehere; perhaps we will have to rely on
the file protection facilities available in the host operating systems
to ensure PL integrity.
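
For concreteness, a typical cycle looks something like this.  This is from
memory, and the RCS half is second-hand, so treat the exact option
spellings as approximate:

    SCCS:
        admin -iprog.c s.prog.c    create the SCCS file from prog.c
        get -e s.prog.c            check out for editing; sets the interlock
        (edit prog.c)
        delta s.prog.c             post the change and clear the interlock
        get s.prog.c               fetch a read-only copy for compiling

    RCS:
        ci prog.c                  check in; creates prog.c,v
        co -l prog.c               check out locked, for editing
        co prog.c                  check out read-only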

>...non-monotonic program modifications...

Another advantage to the M/U pardigm is the ability to have a "debug"
or "test" modset that is not kept in the PL.  The Pascal Group at UCC
had a modset that introduced all sorts of good writelns into the compiler
source.  We used this whenever there was a bizarre code generation error.
All we had to do was apply the modset and recompile the compiler.
Unfortunately, when we made MAJOR changes to the compiler, we had to
resequence the modset.  Fortunately, this was an infrequent occurrence.
We could have made the modset a permanent part of the PL, but it would
have blurred the otherwise clear boundary between the compiler and the
debug code.  As part of the PL, the modset would have required updating
whenever a compiler change affected it.  If we put the debug modset
corrections in the compiler change, we could no longer simply "yank"
the debug modset to remove the debug code.  If we made the corrections in
a second compiler modset, we would have been unable to "yank" the compiler
changes without also yanking the debug code modset.  This sounds like
a good issue to address; perhaps what we need is an ability to group
modsets: when mod1 is yanked, also yank mod2, and when mod3 is yanked,
also yank mod2.  Then again, maybe this is a load of rubbish.

I recently found myself longing for this capability here at CFA.  We have
two versions of a printer driver: production and test [I know, I know...
it sounds like an IBM shop.  sorry.].  In the beginning of this year, we
drastically changed the production version.  As we all know, test versions
of software hang around forever.  This printer driver is no exception.
If I had a modset that could flip between the two versions, I'd be ecstatic.
However, this is CFA...  stone knives and bearskins, remember?  At least we
take offside dumps every now and then...  but source control?  Forget it...
we come as close to good source control as Bork comes to having a real beard.

>what else would you call the other Revise feature-vestige but "Common
>Deck?"

Associate(d) Deck?

>Observe, however, that if the logic of Revise changes too drastically,
>the source will not be recoverable.  (It also helps to take executable
>versions of Revise on field trips.)  This is the bootstrapping problem.

Gee, that's a good point.

>Virtual OPLs (*OPLFILE)...

Maybe we should reassess the need for this feature.  Why was it there
(in Modify) in the first place?  As near as I can tell, the only reason
for *OPLFILE was for an ABS assembly, where you had to *CALL all the
subroutines you used into the source fed to COMPASS so there would be
no external references.  Why did we have ABS assemblies?  Another good
question.  Answers include being able to create multiple entry point
programs, having the "good" loader tables (needed by Cyber loader if
you ran under RFL,0), and being able to fix inter-program communication
areas at specified addresses (did we REALLY do that?).  I'm not sure if
any of these reasons applies to what we're doing today.  We all use
language processors that generate relocatable object files, and don't
complain too much about having to link everything.  OK, so we use
Turbo Pascal, too.  But we've already complained about that.  Maybe they
will fix these annoyances in 518 ... er ... version 4.0.  At any rate,
do we really need to include subroutines with *CALL if we have pre-compiled
versions of them in a library somewhere?  Just think, if we don't have to
compile them every time, we can cut down the compile-test-edit cycle time.

>Revise, as Dan convinced me with great eloquence and amplitude, is in the
>past.  It is rather too fixed to bother spending much time reworking it...

>>Strange though this sounds, I think I now understand Revise
>>well enough to abandon it.  I embarked on my translation project
>>because I believed that Revise contained the Final Wisdom about
>>library management (well, sort of) if I could only get the oracle
>>to speak.  Having dug deep enough to understand Revise quite
>>completely, I can now see its shortcomings clearly.
>
>Abandoning it is perhaps a bit drastic.  Revise does embody some useful
>concepts.  It's just that they are in a form that is not abstract
>enough for our discussions.

There are many good things present in Revise; it's just that, along with
all the useful features and familiarity (with the M/U-like interface),
there's a lot of deadwood.  Jeff P. keeps telling me about the source
maintenance utilities they use at Cray.  I believe he said it was a collection
of three or four C-language programs, one to three pages each (how long is
Revise?).  He also said that most of the good features in Modify were
implemented.  Jeff, I'd sure like to hear more about those programs.

Other reasons for starting from scratch include having the freedom to
come up with a package (which, from the sounds of things, is heading toward
a DBMS...  there, now we've all said it) capable of performing the functions
we deem necessary, without having to work around an existing inadequate
framework, and being able to do with the resultant package as we please.

>The difference here is that we want transactions that can be backed out,
>and we want transactions that are independent of one another.

Well said, Jeff.  However, does this necessarily imply that a change to
the database, consisting of multiple interrelated transactions, will result
in a consistent update?  What I'm talking about here is the "classic" OPL
and JPL system used at UCC.

>>I think the edit-debug cycle followed by the
>>process of making a modset to describe what you've just been
>>through is valid.
>
>What that does is solidify (freeze) a set of changes.  By making them
>more permanent you're supposed to think about them more.  But the existence
>of COMPARE along with text editors subverts that intent.

I don't agree that Compare is responsible for sloppy work.  It does make
modset creation easier, but aren't we all looking for ways to make source
maintenance easier?  It is up to us to continue to be responsible and
professional about our changes; being able to easily generate a modset from
an edited source file is no excuse for failure to do a thorough job.

>The more I think about this, the less I know.

One thing is for certain: by the time we have a product, we'll all know
what the pitfalls of source maintenance are.  That, itself, may be the
most valuable thing each of us will carry away from this project.

>P.S.  How come I haven't heard anything from anyone but JLW?  Am
>I not getting through to faraway parts?  Can you HEAR ME IN THE
>BACK, THERE?  (This is the second note I have sent to
>library-people).

I don't think Mark has logged in yet, and I'm terminally disorganized.
Steve Oyanagi was mentioning a new Unicos release due out soon, so JLP
is probably up to his ears in testing.


From tsec!nsc!nsc.NSC.COM!woolsey Mon Sep 28 22:44:45 1987
Received: by nsc.NSC.COM; Thu Oct  1 14:39:28 1987
Received: by pubs.nsc.com; Thu Oct  1 14:39:21 1987
Date: Thu, 1 Oct 87 14:39:21 PDT
From: Jeff Woolsey <woolsey@nsc.NSC.COM>
Message-Id: <8710012139.AA04751@pubs.nsc.com>
To: deg@kksys.UUCP, msr@kksys.UUCP, nsc!tsec!nsc-nca!jxh,
        pyramid!crayamid!cray!jlp, tsec!nsc-nca!aga
Subject: more ramblings

>To: jxh, tsec!nsc!amdahl!meccts!kksys!deg, tsec!nsc!amdahl!meccts!kksys!msr,
>        tsec!nsc!amdahl!meccts!umn-cs!cray!jlp, tsec!nsc!woolsey

Well, it ought to get to those other sites.  I just hope that tsec is smart
enough to realize that all the recipients for the copies it got are going
the same direction.

>Subject: Response to JLW's First Down

I see you learned how to do article quoting.  Usually that is the
majority of any article over 50 lines.  Not in this case.  How
unusual.

>what else would you call the other Revise feature-vestige but "Common
>Deck?"

Other extant names are "include file" and "header file", neither of
which I like very much.  Somewhere out there there is a gem of a term
for describing this concept ("subroutine"?  "macro"?) but it hasn't
presented itself yet.

>Not at *no* cost.  I adapted Revise (not to say hosed over) to
>take better advantage of my environment.  Your statement does
>not imply respect for the authors.  I must defend them (having
>seen their code most closely): they implemented tradeoffs in
>favor of portability, but they implemented them well.  Revise is
>a beautiful Pascal work-of-art. Alas, art is seldom utilitarian. 

>Among other things, Pascal's character I/O was leaned on heavily;
>it is not implemented well in the Pascal compilers available to me.  
>Borland's "product" (for all its good traits, it is *NOT* a Pascal compiler
>if it can't compile Revise) fails to implement character I/O *at all*, 

Your statement does not imply respect for the authors.  I must defend
them (having used their code most closely): I contend that they DID
implement character I/O.  I base my statement on the presence of big,
ugly kludges like BLOCKREAD and the untyped FILE to provide block
I/O.  Granted, the character I/O of which you speak is characterized
[sorry] by file POINTERS, but that wasn't blindingly obvious in your
tirade.

As for the original authors of Revise, I intended not to impugn their
abilities as Pascal programmers--indeed, some, if not all, of them had
their fingers in the P-6000 pie.  I'd be rather surprised [why does
that word have two r's in it? -- never mind] if Revise was not already
fairly optimal from P-6000's point of view.  Rather, I meant to have
my recollection of Jim's efforts clarified, as was obviously needed.

Incidentally, I sure hope that our [NSC's] forthcoming Pascal compiler
has file pointers.  I think it does, owing to reports of efforts to pass
the Tasmanian test suite.

>>>...OPLFILE...
>
>>It could be difficult to restore this feature efficiently...
>
>It is not my intention to *restore* features of MODIFY to Revise. 

It would be easier to add it to something else.

>Rather we should concentrate on specifying a new program which
>has these desirable features.  Revise, as Dan convinced me with
>great eloquence and amplitude, is the past.  It is rather too
>fixed to bother spending much time reworking it until it no
>longer resembles its former self.  That was just my approach
>during the early stages of my work in this realm precisely
>because I could not afford to change the logic of Revise for fear
>of destroying the only *documentation* of its behavior on other
>systems: the code itself.  

There's a paradox here.  Fear of destroying this "documentation" can
be alleviated by making a copy.  Or by applying your changes as modsets.
Observe, however, that if the logic of Revise changes too drastically,
the source will not be recoverable.  (It also helps to take executable
versions of Revise on field trips.)  This is the bootstrapping problem.

>Strange though this sounds, I think I now understand Revise
>well enough to abandon it.  I embarked on my translation project
>because I believed that Revise contained the Final Wisdom about
>library management (well, sort of) if I could only get the oracle
>to speak.  Having dug deep enough to understand Revise quite
>completely, I can now see its shortcomings clearly.  

Abandoning it is perhaps a bit drastic.  Revise does embody some useful
concepts.  It's just that they are in a form that is not abstract
enough for our discussions.

>>>... They operate by controlling the "checking out" and
>>>subsequent "checking in" of modules.
>
>>Can any of these programs operate correctly if the directory and
>>files where the "library" are are read-only?
>
>Perhaps you phrase your question too narrowly.  How about:  Can
>these programs utilize existing file protection mechanisms to
>protect the library from accidental or unauthorized modification?
>I believe that PVCS has a network *version* ( <-irony ) which
>can work with, e.g. PC-Network-compatible thingies.

I have a great deal of difficulty believing that the mere presence of
a network version of something like PVCS prevents me from using a non-
network version to modify the library, or from going at the library
with ordinary DOS commands.  

Protection of the library is important, and UPDATE was able to take
advantage of the simple scheme available with 1/2" tape: write-rings.
(UPDATE (and I guess Revise, too) could process a library sequentially.)
The trouble is that existing file protection mechanisms vary widely, and
may not even exist in some places. 

>But even if some available program does this, it probably won't do it to our
>liking.  PVCS is awfully tightly coupled to the architecture of
>the "network", in that the application program interface for
>file- and record-locking and -sharing is quite specific to PC
>DOS.  I would hesitate to try to write a "portable" program which
>assumed that this interface existed everywhere.

Oh, you could go ahead and write such a program.  It would only be
portable to systems that supported that interface.  All you have to do,
then, is select an interface which is available everywhere that you
would want to take your program.  This is what standards committees are
for, and what they occasionally accomplish.  (e.g. POSIX. (P = Portable))

>I think we should simply say that our ideal program must have
>this capability.
      ^^^^^^^^^^
Property, please.  This should not be optional.

>Thinking further about this: MODIFY could use
>read-only PLs without telling anyone about it (i.e. writing in
>the PL some indication that I am making a modset); that
>information really just prevents two people making conflicting
>modsets; 

Actually, it just keeps files consistent.  As applied to MODIFY,
this provides the prevention mentioned.

>in MODIFY, that conflict was resolved later by the
>authority responsible for actually applying changes to the
>system-wide PL.  I think it would be neat if we could somehow
>automatically avoid such conflicts or at least automatically
>resolve them.

Avoiding them is easier than resolving them.

>Perhaps modsets being applied at "the same time"
>can be sorted out by a program and adjusted.  One such conflict
>that seems a good candidate is that of one modset naming a line
>just deleted by another; a program that is "co-applying" these
>modsets would "know" about this line until all modsets were
>applied.  There.  I just opened THE big can of worms.  Come and
>get it!

What an appealing metaphor.  Yecch.

In the MODIFY universe, there are names for each type of
problematic combination of modsets.  I'll describe them here and let
you match them to your ideas above (and below, it turns out).

Dependent modsets relate such that one modset refers to a line that the
other inserted or deleted.  Modsets can refer to deleted lines; actually,
they are inactive rather than deleted.  This kind of reference generated
warnings from UPDATE or some such program.

The trivial case of conflicting modsets would be two identical modsets with
different names.  Applying the second one should have no effect other than
doubling the inserted groups.  These modsets may be yanked or deleted 
independently of one another.  The generalized case is two modsets that do the
same thing to a line in a third modset or the original text.

Another kind of dependency [is that a word?] is two modsets that insert
after the same card (line).  Here the dependence is upon what order the
modsets are applied, with the newest modset's text appearing first.

Some or all of these conditions could be detected automatically, although
MODIFY never bothered.  UCC had a program called DEPEND to address some
of these issues.

Obviously, conflicting modsets are the product of disorganization among
programmers (or a deranged mind if the same programmer produced both 
conflicting modsets).  Dependent modsets were solved by producing a
GEN modset whose sole purpose in life was to be depended upon.

>Perhaps we are asking too much in an age when many language
>processors have, indeed, conditional preprocessors.  But using
>them in lieu of the feature at hand doesn't solve any problems of
>source control: it merely takes them out of the hands of the
>library management program and puts them squarely back onto the
>broad shoulders of the programmer, which is precisely where we
>are trying to AVOID putting such things.  If a library management
>program did not introduce horrendous delays in the
>edit-compile-debug cycle, I think it could be relied upon to take
>over this task from the language processor.  There are two sides
>to this coin, though: 1) Although it makes such a feature uniform
>across all language processors in use (notably those which have
>no such facility already), 2) it probably will not have exactly
>those features of the preprocessor in use that the programmer
>likes and uses.  Discussion is indicated.

The conditional "pre"processing available with various languages ranges
from none, to simple (include files, *CALL) but unconditional, to conditional
(IF ELSE ENDIF), on up to macro processing and full kitchen-sink
processing.  Typically your overgrown assembler is of this last
variety (remember COMPASS' DUP, IRP, and other pseudo-ops?).
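
In C the conditional flavor amounts to feature code selected at compile
time rather than at library-assembly time.  A trivial sketch (the symbols
and values here are made up, not from anybody's real source):

    /* pick a flavor by defining CYBER or VAX on the command line,
       e.g. cc -DCYBER ...; compare Jim's CYBER/VAX modsets, which do
       the same selection one level earlier. */
    #ifdef CYBER
    #define LINE_LEN 90      /* hypothetical Cyber flavor */
    #else
    #define LINE_LEN 132     /* hypothetical VAX flavor */
    #endif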

Another avenue for complicated preprocessing is to divorce such activity
from any language definition.  This yields things like m4, cpp (used by
languages other than C, and things known at cpp-time are unknown at 
compile-time), and things like sed, awk, & lex.  I may be wandering a bit,
as preprocessors should be transparent to anything that isn't directives.

I.e., I'm not sure such general preprocessing belongs in a library tool,
except as warranted for include files that are in the library.

Another point worth considering is that selecting between using a
preprocessor and a source-control tool to provide feature code depends
upon which is more widely available among the recipients.  Back in the
CDC days, everyone who received NOS had MODIFY, thus NOS was distributed
in source-control form.

>Perhaps MODIFY/UPDATE caused more problems, that then needed
>solving, but I think not.  M/U really did attempt to solve
>problems, but not in the normal fashion: that of so automating a
>task that the programmer could not hurt himself.  It would be
>nice if automation of this sort of thing took the form: the easy
>way is also the good way.

So what, then, is the easy way?

>What, then, is that problem that needs solving?  Seems to me that
>*the problem* can be simply stated as: multiple, simultaneous,
>dependent and independent modifications of a database, with the
      ^^^^^^^^
Yes, I think we are heading into that realm.  The more generalized
we want things, the more general must the tools be.  For example, the
notion of OPLFILE, which is really a virtual OPL, abstracts into creating
a virtual database by logically combining two or more smaller (an uncommon
word in the database world) ones.

>special case that the database is a set of source files subject
>to an iterative edit-compile-debug cycle.

Peanuts compared to transaction processing on a large database.

>"Dependent and independent" means that two kinds of conflicts can
>exist among any given set of modifications.  Independent mods
>(to the same file) are those which do not overlap; that is they
>do not modify the same part of a file.  They are tantamount to
>modsets to separate files, with the line between files being
>rather arbitrary.

The difference here is that we want transactions that can be backed out,
and we want transactions that are independent of one another.

I'm not yet sure whether the database model of program library
maintenance needs any extensions.  Every database that I can think of
is a model of some physical phenomena, such as the existence of people
or instruments.  People are unique (though their names might not be; I
don't think we are that interested in this problem) and financial
instruments must be unique or we quickly find ourselves in financial
ruin.  We can also think of these instruments as transactions, and one
would not want (in the interest of correctness) the same transaction
applied multiple (or zero) times!

>>There was even some talk at UCC about a text editor that would
>>spit out a modset when you were done.
>
>Well, let's not get into the holy war over "your favorite text
>editor" here.

That's not the point. Any editor could be encased in a procedure file
or shell script which would run COMPARE or diff between the before and
after files.
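
Something on the order of the following would do for a first cut.  It is a
crude sketch: the ".orig"/".delta" names, the choice of diff, and the total
lack of error checking are all placeholders, not a real tool:

    /* editmod: wrap an editor so that the change comes out the other
       end as a difference listing. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        char cmd[512];

        if (argc != 2) {
            fprintf(stderr, "usage: editmod file\n");
            return 1;
        }
        sprintf(cmd, "cp %s %s.orig", argv[1], argv[1]);   /* save "before" */
        system(cmd);
        sprintf(cmd, "${EDITOR:-vi} %s", argv[1]);         /* let the user edit */
        system(cmd);
        sprintf(cmd, "diff %s.orig %s > %s.delta",         /* capture the change */
                argv[1], argv[1], argv[1]);
        system(cmd);
        printf("change written to %s.delta\n", argv[1]);
        return 0;
    }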

>I think the edit-debug cycle followed by the
>process of making a modset to describe what you've just been
>through is valid.  

What that does is solidify (freeze) a set of changes.  By making them
more permanent you're supposed to think about them more.  But the existence
of COMPARE along with text editors subverts that intent.

>>I like to make virtual OPLs (*OPLFILE)...
>
>Let's generalize that: This is really an artifact of the line
>between one "file" and the next.  It may be that some environment
>blurs this line (PLATO comes to mind, where one "lesson" had many
>different types of "blocks" that were related by being part of
>the same lesson.  Sort of a subdirectory.)  An implementation
>must not allow arbitrary lines between "files" to define the
>boundary of the "library."

Paradoxically, on Cybers you can concatenate files full of REL records
and get one file of REL records (and the loader will like it), while on
UNIX you can concatenate two tar files and get one file that tar won't
like.

>>The biggest difference between MODIFY and REVISE/UPDATE is
>>random access, and everything you can do with it.

Even in an environment without random access, the library maintenance
tool can still maintain an index, for use if the library is ever taken
to an environment that has random access, or if such an environment
should arrive where the library is.

This is far too long.  I think a summary is in order, next time. The
more I think about this, the less I know.


From tsec!nsc!nsc.NSC.COM!woolsey Sun Oct  4 18:17:12 1987
Received: by nsc.NSC.COM; Tue Oct  6 16:28:22 1987
Received: by pubs.nsc.com; Tue Oct  6 16:27:57 1987
Date: Tue, 6 Oct 87 16:27:57 PDT
From: Jeff Woolsey <woolsey@nsc.NSC.COM>
Message-Id: <8710062327.AA20770@pubs.nsc.com>
To: deg@kksys.UUCP, msr@kksys.UUCP, nsc!tsec!nsc-nca!jxh,
        pyramid!crayamid!cray!jlp, tsec!nsc-nca!aga
Subject: good seats

(I say, these subject lines are getting pretty stupid.  We all know
what we're talking about:  general ramblings on the subject of source
control.  When (if?) we start getting more specific, they'll make more
sense.)

>He (actually, SCCS probably did this by
>default) would set the file permissions so that only he would be able
>to write the SCCS versions of the files.  The project team members would
>be in a unix group that had permission to read the SCCS files.

Boy, were we spoiled with the PERMIT command.  There was a discussion
raging in netnews many moons ago about grafting Access Control Lists
onto Unix.  Not much became of it, except lament for the systems that
people had left that had had them.

I mean, the method above sounds like a kludge, and I still don't know
whether the SCCS tools need to be setgid to the project group, or
setuid to the project leader, or what.  And if so, how does one use
it on multiple projects?

>I beileve [sic] that RCS works similarly, except that the commands are
>"ci" (check in) [don't mistype your editor name or the file disappears!]

In my case, ci is an alias for vi.  So there.  cu used to conflict, too,
but tip, and later telnet, came along.

>In either case, if
>the SCCS/RCS files were not protected BY THE OPERATING SYSTEM, any person
>with write permission to the files could apply any change at any time.

CDC manuals suggested that you access the system OPL with COMMON(OPL).
The system allowed certain people to create or access these files; a
deadstart (or judicious CM editing from the console followed by a DTKM)
was required to delete them.

(Incidentally, how many access word bits were required to make an ECS
file COMMON?  ECS files were an ancestor of RAM-disk, though we did not
know it back then.)

>perhaps we will have to rely on
>the file protection facilities available in the host operating systems
>to ensure PL integrity.

I think that that is a tautology.

>>...non-monotonic program modifications...
>
>Another advantage to the M/U pardigm [sic] is the ability to have a "debug"
>or "test" modset that is not kept in the PL.

I had such a modset for USERS/DSDSIM.  It used 1DS and QFM to do
dangerous things during system time.

>The Pascal Group at UCC
>had a modset that introduced all sorts of good writelns into the compiler
>source.  We used this whenever there was a bizarre code generation error.
>All we had to do was apply the modset and recompile the compiler.

And I bet that the reason it was done that way instead of

CONST DEBUG = TRUE;

IF DEBUG THEN 
WRITELN('DAG NODE = ', P^.LEFT^.DAG[PI].STYPE^.TOKEN: 6 OCT);

was code size in the production compiler.  I don't generally count on
HLL compilers to recognize this as a form of conditional compilation,
though an increasing number do.  Besides, this way you still have to
recompile.  Using DEBUG as a variable that is turned on by a switch is
probably a good idea in early stages of compiler building, but as the
bug density diminishes, this gets to be a drag.

>This sounds like
>a good issue to address; perhaps what we need is an ability to group
>modsets: when mod1 is yanked, also yank mod2, and when mod3 is yanked,
>also yank mod2.  Then again, maybe this is a load of rubbish.

So you want to teach the source control system about change dependency.
Sounds like a directed graph to me.

What if you yank mod3, but not mod1?  What happens to mod2?
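
If we did teach the tool about it, the record might be as simple as this
(a sketch only; the names are made up):

    /* Yank-dependencies as a directed graph: mod2 lists mod1 and mod3
       among its "needs", so yanking either of them must also yank mod2,
       and then anything that in turn needs mod2, and so on.  The
       question above is exactly the ambiguity a tool would have to
       resolve: yanking mod3 alone still drags mod2 out with it. */
    struct modset {
        char           *name;
        struct modset **needs;     /* modsets this one cannot live without */
        int             nneeds;
    };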

>However, this is CFA...  stone knives and bearskins, remember?  At least we
>take offside dumps every now and then...  but source control?  Forget it...
      ^^^^^^^
A five-yard penalty.  Hey, this football metaphor sure is pervasive!

>we come as close to good source control as Bork comes to having a real beard.

Have you been reading SU-BBOARD???  Nothing like turning a technical
discussion into a political one.  It's happened to SDI, and
disarmament, and 55 mph, and ....

Does the fact that half this list's members are bearded have anything
to do with that remark?

>>what else would you call the other Revise feature-vestige but "Common
>>Deck?"
>
>Associate(d) Deck?

But it's still a DECK!  Next thing you know, we'll have portholes,
gangways, bulkheads...

>>Observe, however, that if the logic of Revise changes too drastically,
>>the source will not be recoverable.  (It also helps to take executable
>>versions of Revise on field trips.)  This is the bootstrapping problem.
>
>Gee, that's a good point.

I thought so.  Has this been rubbed far enough in yet?

>>Virtual OPLs (*OPLFILE)...
>
>Maybe we should reassess the need for this feature.  Why was it there
>(in Modify) in the first place?  As near as I can tell, the only reason
>for *OPLFILE was for an ABS assembly, where you had to *CALL all the
>subroutines you used into the source fed to COMPASS so there would be
>no external references. 

You're forgetting your history.  The thing that was *CALLed is called a
COMMON deck.  It was intended for use in FORTRAN programs, to contain
the COMMON declarations for the program.

>Why did we have ABS assemblies?  Another good
>question.  Answers include being able to create multiple entry point
>programs, having the "good" loader tables (needed by Cyber loader if
>you ran under RFL,0), 

... to make sure that no unneeded trash (like CMM) got hauled in at 
link-time ...

>and being able to fix inter-program communication
>areas at specified addresses (did we REALLY do that?).

Sure.  ARGR, CDDR, FWPR, uh...  Gee, this stuff evaporates quickly
without an Instant.  Actually, those didn't need ABS; the loader just
started the text high enough that these constants could be used for IPC.

>I'm not sure if
>any of these reasons applies to what we're doing today.  We all use
>language processors that generate relocatable object files, and don't
>complain too much about having to link everything.

Oh yeah?  I miss having 1AJ call the loader when it didn't know _what_
it was looking at.

>OK, so we use
>Turbo Pascal, too.  But we've already complained about that.  Maybe they
>will fix these annoyances in 518 ... er ... version 4.0.

I doubt it.

>At any rate,
>do we really need to include subroutines with *CALL if we have pre-compiled
>versions of them in a library somewhere?  Just think, if we don't have to
>compile them every time, we can cut down the compile-test-edit cycle time.

It becomes compile-link-test-edit cycle time.  Linkers can be slow, too.

I'll give you one very good reason for having done ABS assemblies.
Confidence.  ABSolute confidence.  If I know that it was I who
assembled (or compiled) every last line of code in a program, I'll have
all of the information available about how it was created and what's in
there, so that there are no surprises.  All of the binaries
(relocatable, even) that go into making something are built from
sources that I can actually see.  There's even a little motivation for
disassemblers here, too, in case you must use a library without source
(e.g. Wreckage Mangler, or CTI).

"If you want something done right, do it yourself."

Still, that can be a lot of work, mitigated by the machine speed...

>There are many good things present in Revise; it's just that, along with
>all the useful features and familiarity (with the M/U-like interface),
>there's a lot of deadwood.  Jeff P. keeps telling me about the source
>maintenance utilities they use at Cray.

Perhaps he should tell _us_.

>I believe he said it was a collection
>of three or four C-language programs, one to three pages each (how long is
>Revise?).  He also said that most of the good features in Modify were
>implemented.  Jeff, I'd sure like to hear more about those programs.

I wonder just how portable are sources maintained with these tools.
(Not that they needed to be; they're proprietary, aren't they?)

>Other reasons for starting from scratch include having the freedom to
>come up with a package (which, from the sounds of things, is heading toward
>a DBMS...  there, now we've all said it) capable of performing the functions
>we deem necessary, without having to work around an existing inadequate
>framework, and being able to do with the resultant package as we please.

but not anytime soon.

>>The difference here is that we want transactions that can be backed out,
>>and we want transactions that are independent of one another.
>
>Well said, Jeff.  However, does this necessarily imply that a change to
>the database, consisting of multiple interrelated transactions, will result
 must
>in a consistent update?  What I'm talking about here is the "classic" OPL
>and JPL system used at UCC.

"Consistency is the mother of strange hobgoblins."

There's a parallel here between updating a database and managing a semaphore.

>>>I think the edit-debug cycle followed by the
>>>process of making a modset to describe what you've just been
>>>through is valid.
>>
>>What that does is solidify (freeze) a set of changes.  By making them
>>more permanent you're supposed to think about them more.  But the existence
>>of COMPARE along with text editors subverts that intent.
>
>I don't agree that Compare is responsible for sloppy work.  It does make
>modset creation easier, but aren't we all looking for ways to make source
>maintenance easier? 

Not only easier, but more reliable.

>>The more I think about this, the less I know.

...until pretty soon I know nothing about everything.

>One thing is for certain: by the time we have a product, we'll all know
>what the pitfalls of source maintenance are.  That, itself, may be the
>most valuable thing each of us will carry away from this project.

Platitudes, eh?  OK.  Source maintenance is an attempt to cure the
too-many-cooks disease.  (Sometimes one is too many.)

>>P.S.  How come I haven't heard anything from anyone but JLW?  Am
>>I not getting through to faraway parts?  Can you HEAR ME IN THE
>>BACK, THERE?  (This is the second note I have sent to
>>library-people).
>
>I don't think Mark has logged in yet, and I'm terminally disorganized.
>Steve Oyanagi was mentioning a new Unicos release due out soon, so JLP
>is probably up to his ears in testing.

I think someone's implying that Jim and I have little to do...  Too much
free time on our hands...

We can continue this discussion in person (Jim can debrief me) this weekend.
-- 
LERMINATING PREVIOUS SESSION.  PQEASE RETRY.

Jeff Woolsey  National Semiconductor  woolsey@nsc.UUCP  woolsey@umn-cs.EDU


From tsec!nsc!pyramid!crayamid!cray!jlp Mon Oct  5 17:17:55 1987
Received: by nsc.NSC.COM; Thu Oct  8 04:58:58 1987
Received: by pyramid.UUCP (5.51/OSx4.0b-870424)
id AA20821; Thu, 8 Oct 87 04:34:57 PDT
Received:  by vax2.cray.uucp (4.12/25-eef)
id AA19316; Wed, 7 Oct 87 22:23:08 cdt
Date: Wed, 7 Oct 87 22:23:08 cdt
From: cray!jlp (Jeff Pomeroy)
Message-Id: <8710080323.AA19316@vax2.cray.uucp>
To: crayamid!!pyramid!nsc!tsec!nsc-nca!aga,
        crayamid!!pyramid!nsc!tsec!nsc-nca!jxh, crayamid!!pyramid!nsc!woolsey,
        umn-cs!meccts!kksys!deg, umn-cs!meccts!kksys!msr
Subject: Is it too late to buy a round trip ticket to ...


Hi gang.  I have been reading the messages as they go by.

I think that i should start off by explaining what i have been up to for the
past two and a half years.  I work at Cray handling the source code of the 
UNICOS operating system.  Cray has something called UPDATE that was based on 
CDC's UPDATE and has run under COS for years and UNICOS for about a year.
UPDATE is written in CFT, and thus runs only on the Cray mainframes.
Cray also has something called 'scm', which is a locally designed and written
thingy that was intended to put UPDATE out of business.  Scm runs on the Crays,
the VAX and the Suns.  Some people within Cray also use SCCS on the VAX.  
I have to deal with all of them.

We have had many battles over source control, which has shown one thing:
there is no winner.

So, what can i add to this discussion?

I really do not want to start complaining about my day-to-day problems, and 
have you guys waste your time trying to think up some solution(s) which i am
sure would have little or no impact on what we do around here.  But, i could relate
to you some of the things that have happened around here in the past, just as
examples, to show how these things work in the real world.

I will start with SCCS.  It is owned and operated by AT&T.  They only let 
people use it.  You may think i am kidding.  We have both Suns running 4.2 
and VAXes running System V, side by side, on the same network. The Sun SCCS
was based on UNIX version 7 (or PWB).  Guess what?  AT&T changed SCCS around 
System III so they are not compatible.  OK, so some things did stay the same.  
Quick, what are they?
(Strike One)

SCCS has magic cookies.  (do i need to explain this?)  A magic cookie in this
case means in-band data. (That should wake jim up)  SCCS will do different 
things depending on the content of the file you are putting under source 
control.  Another way to say this is: "what goes in does not come out".  (if 
this gives you a bad feeling deep in your gut, good)  The SCCS magic cookies 
are of the form percent, letter, percent.  Like '%D%', which i think changes 
to the current date.  Nice, eh?  What about:

printf(" %D%D ", i, j);

(I think that %D is the double precision decimal format...)  This is silently
destroyed by SCCS.  This really happened in the Berkeley 4.2 release where 
under the user contributed software, some guy had developed a program without
SCCS, but Berkeley passed it through SCCS before releasing it.

At Cray, some people use SCCS to hold their assembly code.  What about:

IDENT  SIN
ENTRY  SIN
ENTRY  SIN%
ENTRY  SIN%R%

The entry for SIN saves and restores all registers.  The entry for SIN%
does not save or restore, this is used by the compiler with it knows it 
is 'safe' to bash registers.  And lastly SIN%R% means that operands are
passed in registers and not on the stack.  As i recall, SCCS changed %R%
to the current release level of that module.
(Strike Two)


SCCS is made up of many commands.  One of them is called 'help'.  This command
seems to be there for the sole purpose of almost but not quite helping the
user.  Here i will use proof by example...  

VAX running AT&T System V:
Type:
% help
Output:
%


Cray running UNICOS:
type:
% help
Output:

        The help command is unfortunately not what it would seem to be.
It provides a limited amount of help and only for SCCS commands.  Its
arguments are the small alpha-numeric strings that accompany the error
messages from those commands.  If you are looking for more general help
in using the rest of the system try the on-line manual that is available
through the man command.  Try: man man


Sun running Sun UNIX 4.2 Release 3.2:
Type:
% help
Output:
msg number or comd name?
Type return

Output:
ERROR:  not found (he1)
Type:
% help
Output:
msg number or comd name?
Type:
he1
Output:
he1:
"not found"
No helpful information associated with your argument was found.
If you're sure you've got it right, do a "help stuck".
Type:
help stuck
Output:
stuck:
First, if you know the value of the system error number (errno),
you can either look up a description of it in INTRO(II), or execute
"help err<number>" (e.g., if the error number is 1 execute "help err1").
If you don't know the error number, or you don't understand what's going on -
Try the following, in order:
1. Make sure the answer isn't in the documentation.
2. Try to write(I) to anyone logged in as "adm".
3. Contact your PWB/UNIX counsellor.
4. File an MR (see System Administrator for instructions).

You call this help?  Just for a laugh, i tried 'help he2' and got:
he2:
"argument too long"
Dost thou jest? Wilst thou mock HELP??
Please limit your blitherings in arguments to less than
fifty (50) characters.

(Strike Three, you're out!)


Lastly, here is a quote from the Sun man page for the SCCS get command:
(Lets see if anyone in the home audience can figure out why)

BUGS
     If the effective user has write  permission  (either  expli-
     citly  or  implicitly)  in the directory containing the SCCS
     files, but the real user doesn't, only one file may be named
     when the -e option is used.


The list goes on and on...  I will never put any of my own programs under SCCS.

As deg would say...  "SCCS is a 'FINE' piece of software"

It is getting late.

franka@mmintl.UUCP (Frank Adams) (05/14/88)

In article <5291@cup.portal.com> jxh@cup.portal.com writes:
>I have been sort of working on Revise, which is a descendent of UPDATE/MODIFY
>from the Cybers, in that it has some of the same basic assumptions, namely
>that there are Decks containing Lines (they got rid of the word Card at
>this point), and that lines may be Active or Inactive, and that they may be
>inserted, and deactivated, but not actually deleted, by a modset.  A
>modset can also reactivate "deleted" lines.
>
>There are those in Unix land who use a program called SCCS, and [others].
>They operate by controlling the "checking out" and subsequent "checking
>in" of modules.  One checks out a module, edits it, and checks it in,
>identifying a "version" as the label for this instance of that module.

I have considerable familiarity with this latter class of code control
systems (including having written one), but I have never before encountered
the former kind.

I am having some difficulty understanding just how it is supposed to work.
Everyone involved in the discussion apparently was quite familiar with them,
so the above is the best description of them supplied.  It leaves quite a
bit unanswered.

I will attempt to describe the system based on my understanding of it; I
would appreciate it if the original poster or someone equally competent
would review this, note any misconceptions, and answer my questions.  I will
be concentrating on "how to use the system", not "how the system works"; the
lines quoted above seem to cover that pretty well.

It appears that the main editing done by programmers using such a system is
the creation of "modsets".  These, in general, specify that certain lines
are to be inserted into a particular piece of source code.  (I gather some
systems allow more than one piece of source (source file or "deck") to be
updated with the same modset.  Maybe they all do?)  In addition, a modset
may specify that certain lines of code be deleted (deactivated), or that
certain lines which were previously deleted be restored.

I don't know how the lines to be inserted or deleted are identified.  I
would guess that each line has a line number, and that new lines are
inserted so that their line numbers remain in order.

It appears that traditionally, programmers directly created modsets, and
that it is a relatively new and far from universal thing for them to edit
the entire source file, and create the modfile mechanically.

It appears that the compilation step (in the development cycle) with such a
system is preceded by combining the programmer's modsets with the current
state of the code control system to produce the actual source to be compiled.

When a programmer is satisfied with his changes, he "signs in" his
modfile(s).  There may(?) be a review by someone before this actually
becomes part of the standard.

--------
It seems to me that the main problem with this kind of system is sorting out
simultaneous changes to the same piece of code.  It seems to me that the
advocates of this approach have become so used to dealing with this, that
they simply accept it as part of the system.  (In the comments I saw, there
were several suggestions for *mitigating* the problem, but no hint that it
was something one might want to *eliminate*.  It is a key advantage of the
SCCS-style approach that it does avoid the problem.)

I should note my own preference (not fully shared by my current co-workers).
I prefer an SCCS-style code control system, in conjunction with a convention
that there is only one entry point per file.  If, as is good for other
reasons as well, the functions are all kept relatively small, then all the
source files are small.  This means that two programmers trying to access
the same file at the same time are really trying to change the same code,
and one of them should wait for the other to finish.  (The inability for two
people to modify the same module at the same time is the characteristic
problem of this style of code control system, as integrating simultaneous
changes is the characteristic problem of the other.)

One *must* support this with some kind of include file system, so that
declarations can be consistent across modules.  The inclusion process need
not be regarded as the responsibility of the code control system, however.
(The include files themselves are source files to be maintained; but that is
another matter.)
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

jxh@cup.portal.com (05/17/88)

In <2846@mmintl.UUCP>, Frank Adams (franka@mmintl.UUCP) writes:
>>...a descendent of UPDATE/MODIFY from the Cybers...
>It leaves quite a bit unanswered.

Yes, sorry about that.  Your observation that the original parties to the
discussion were familiar with UPDATE/MODIFY is quite true.  I will try to
summarize the relevant facts about that universe (or perhaps Dan Germann or
Jeff Woolsey, who used it much more than did I); but your "understanding"
will stand for the moment.  It's pretty good.

>I will attempt to describe the system based on my understanding of it.

>I don't know how the lines to be inserted or deleted are identified.  I
>would guess that each line has a line number, and that new lines are
>inserted so that their line numbers remain in order.

Lines were identified by a deckname or modname, followed by a sequence number,
e.g. DECK.1, DECK.2, ..., DECK.999.  Sequence numbers were assigned beginning
with 1 for each different name.  Thus: DECK.1, DECK.2, MODSET.1, DECK.3 ... .
The order of these lines in the Program Library (PL) defined the sequence of
lines in the actual source.  Modsets consisted of directives, such as
  *DELETE DECK.1,DECK.4    <deletes range of lines, inclusive>
  *INSERT DECK.2           <inserts following lines after DECK.2, assigning
  insertion text           sequence numbers to each: MODSET.1, MODSET.2 >
  more insertion text
  *more directives
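
To make the mechanics concrete, here is a toy model of applying such a
modset -- a sketch in Python, meant as executable pseudo-code; the data
layout and the names (apply_modset, deck, and so on) are my own invention,
not anything in UPDATE or MODIFY.  Deleted lines are only deactivated, and
every inserted line is stamped with the modset's name and the next
sequence number:

  def apply_modset(deck, modset_name, directives):
      # deck: list of {'id': (name, seqno), 'text': str, 'active': bool},
      # kept in Program Library order.
      # directives: ('DELETE', first_id, last_id) deactivates a range;
      #             ('INSERT', after_id, [text, ...]) adds lines after a line.
      next_seq = 1
      for d in directives:
          if d[0] == 'DELETE':
              first, last, deleting = d[1], d[2], False
              for line in deck:
                  if line['id'] == first:
                      deleting = True
                  if deleting:
                      line['active'] = False      # deactivate; never remove
                  if line['id'] == last:
                      deleting = False
          elif d[0] == 'INSERT':
              after, texts = d[1], d[2]
              pos = next(i for i, l in enumerate(deck) if l['id'] == after)
              for text in texts:
                  pos += 1
                  deck.insert(pos, {'id': (modset_name, next_seq),
                                    'text': text, 'active': True})
                  next_seq += 1

  # The example above, run against a four-line deck DECK.1 .. DECK.4:
  deck = [{'id': ('DECK', n), 'text': 'line %d' % n, 'active': True}
          for n in range(1, 5)]
  apply_modset(deck, 'MODSET',
               [('DELETE', ('DECK', 1), ('DECK', 4)),
                ('INSERT', ('DECK', 2),
                 ['insertion text', 'more insertion text'])])

Note that the insertion anchors on DECK.2 even though DECK.2 has just been
deactivated: deactivated lines keep their identifiers, which is what keeps
the anchors stable.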

>It appears that traditionally, programmers directly created modsets, and
>that it is a relatively new and far from universal thing for them to edit
>the entire source file, and create the modfile mechanically.

Just so.  A separate text-comparison program was given the ability to
express differences as a set of directives suitable for feeding to
UPDATE for just this purpose.  Actually, this may be more universal than I
thought, as more-capable editors become available under NOS.
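
For the curious, the trick is easy to mimic with a stock difference
algorithm.  A rough sketch (Python again, purely for illustration; the real
COMPARE obviously shares none of this code, and the old text is assumed to
carry identifiers DECK.1, DECK.2, ...):

  import difflib

  def diff_to_modset(old_lines, new_lines, deckname='DECK'):
      directives = []
      matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
      for tag, i1, i2, j1, j2 in matcher.get_opcodes():
          if tag in ('delete', 'replace'):
              directives.append('*DELETE %s.%d,%s.%d'
                                % (deckname, i1 + 1, deckname, i2))
          if tag in ('insert', 'replace'):
              # Anchor on the last old line of the hunk; deleted lines keep
              # their identifiers, so they still serve as anchors.  Insertion
              # at the very top of a deck is not handled by this toy.
              directives.append('*INSERT %s.%d' % (deckname, max(i2, 1)))
              directives.extend(new_lines[j1:j2])
      return directives

Feed it the library's copy and your edited copy, and out comes a modset of
exactly the shape shown above.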

>It appears that the compilation step (in the development cycle) with such a
>system is preceded by combining the programmer's modsets with the current
>state of the code control system to produce the actual source to be compiled.

You're batting 1.000.  This step takes the PLs and modsets, and creates the
COMPILE file, which is the modified source.  COMPILE was typically an 
alternate default filename on, e.g., the assembler, to eliminate steps from
your batch job.  (Egad!)

>When a programmer is satisfied with his changes, he "signs in" his
>modfile(s).  There may(?) be a review by someone before this actually
>becomes part of the standard.

Someone else (Messrs. Woolsey or Germann) should elaborate on the procedural
aspects of source control that developed in their shop.  For the moment,
let me simply say "code review" and "proposed modset" and hope that gives
the right impression.  I think the fact that modsets were "hard" to
introduce into the database permanently had a positive effect on quality,
as they were subjected to tremendous scrutiny.
--------
>It seems to me that the main problem with this kind of system is sorting out
>simultaneous changes to the same piece of code.  It seems to me that the
>advocates of this approach have become so used to dealing with this, that
>they simply accept it as part of the system.  (In the comments I saw, there
>were several suggestions for *mitigating* the problem, but no hint that it
>was something one might want to *eliminate*.  It is a key advantage of the
>SCCS-style approach that it does avoid the problem.)

Well.  Here we go.  I would settle for *mitigating* if it meant I could get
named modsets.  *Eliminating* would be nice, but implies the (imho) too-
restrictive locking of source files.  How about *automating* the process of
identifying (and perhaps resolving) conflicts?
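
What I mean by automating the check, as a toy built on the representation
in my earlier sketch (no real tool that I know of is obliged to work this
way): run every pending modset against the common base PL and flag any two
whose deleted lines or insertion anchors overlap.

  def touched(deck, directives):
      # Line ids a modset deactivates or uses as an insertion anchor,
      # computed against the common base deck (before anything is applied).
      ids = [l['id'] for l in deck]
      hit = set()
      for d in directives:
          if d[0] == 'DELETE':
              hit.update(ids[ids.index(d[1]):ids.index(d[2]) + 1])
          elif d[0] == 'INSERT':
              hit.add(d[1])
      return hit

  def conflicts(deck, modsets):
      # modsets: {name: directives}; yields every pair that touches a
      # common line and therefore needs a human look.
      names = sorted(modsets)
      for i, a in enumerate(names):
          for b in names[i + 1:]:
              if touched(deck, modsets[a]) & touched(deck, modsets[b]):
                  yield (a, b)

Anything this flags gets reconciled by the people involved; anything it
does not flag merges mechanically, with no file locking in sight.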

>I prefer an SCCS-style code control system, in conjunction with a convention
>that there is only one entry point per file.  If, as is good for other
>reasons as well, the functions are all kept relatively small, then all the
>source files are small.  This means that two programmers trying to access
>the same file at the same time are really trying to change the same code,
>and one of them should wait for the other to finish.  (The inability for two
>people to modify the same module at the same time is the characteristic
>problem of this style of code control system, as integrating simultaneous
>changes is the characteristic problem of the other.)

It is true that, on the Cybers, programs tended to be huge and monolithic
(not because of bad programming practice but because of institutional biases
such as *LOCAL FILE LIMIT*); whereas small-computer programs tend to have
lots of small parts.  I applaud modularity when it is APPROPRIATE FOR THE
PROGRAM, not imposed by outside forces such as making source control easier,
or making things compile faster (however worthy those goals certainly are).
I detect a bias toward C programs, where all functions are at the same scope
level; in this case, putting them into separate files makes sense.  However,
Modula-2 programs tend to have many routines within a module; and this tends
to make single modules rather monolithic themselves.  (I have broken a single
module into several because of an implementation restriction, when I would
have preferred keeping it in one piece to ensure data integrity of private
objects.)

Making simultaneous changes to one module might easily mean implementing
two completely different and, conceptually, independent modifications.  A
source control program should allow me to modify the top while you're
working on the bottom; if there is no real conflict, then file-locking
should not preclude my getting some useful work done while you,
presumably, do the same.  PVCS tries to allow this by "branching," which
seems simply to be a cop-out, the program recognizing that a conflict is
(potentially) being created that will have to be sorted out later when the
two branches "merge."

Furthermore, my change might remain unrelated to yours forever.  Perhaps I
want to apply a temporary change to see what would happen.  I could, of
course, do this by making a local copy of the source file in question, and
do as I please with it, but if my prattlings in my sandbox become worthwhile
I would like to be able to put them into the real source for everyone's
benefit without having to reconstruct what I did.  A modset from the common
base source describes my actions succinctly, and can be stored in rather less
space.  (Flame off.  Sorry.  I really don't know the first thing about SCCS,
so if all this is possible there simply by sleight of hand, please let me
know).

Oh, this is exciting!  I thought this newsgroup would be a good place for
this discussion!

P.S.  My boss just walked into my office brandishing a copy of the glossy
for PVCS and, in a moment of candor, I told him that we should get it and
use it; that it is, at least, a giant step in the right direction, even
if it isn't quite all I hoped for.  Of course, we should all get Suns and
avoid the PC problems altogether, but he's not prepared to hear that.
Not yet.
--
Jim Hickstein, VSAT Systems, Inc., San Jose CA
jxh@cup.portal.com   ...!sun!portal!cup.portal.com

woolsey@nsc.nsc.com (Jeff Woolsey) (05/17/88)

In article <2846@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>I have considerable familiarity with this latter class of code control
(SCCS/RCS/PVCS et al. (S/R/P hereinafter))
>systems (including having written one), but I have never before encountered
>the former kind.
(MODIFY/UPDATE/REVISE et al. (M/U/R hereinafter))

>I am having some difficulty understanding just how it is supposed to work.
>Everyone involved in the discussion apparently was quite familiar with them,
>so the above is the best description of them supplied.  It leaves quite a
>bit unanswered.

Let me provide a simple, yet complete run-down of the M/U/R model.
The differences between the three are implementation issues, or small
variances within the paradigm.

There exists a Program Library (PL) consisting of source program
modules (decks), include files (common decks), named changes (modsets),
and a directory (for random access).  The PL is a single file.  Its
appeal is its integrity.  Some effort is required to delete part of a
PL.  If you have the PL, you have the whole source.

When you want to compile a program in a PL, you direct M/U/R to make
a compile file.  This file is fed to the assembler or compiler as is.
Each line in it consists of the source line and a sequence number.  The
sequence number is the name of the deck or modset followed by the
ordinal of that line within the deck or modset.

Common decks are the same as source decks except that they have a bit
set meaning that they can be included (at compile-file time) in another
source deck.  Their original purpose was to hold all the COMMON
declarations in FORTRAN programs.  You cannot generate a compile file
from just a common deck.
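
The compile-file expansion of common decks is easy to picture.  A
throwaway sketch (the *CALL directive name and the layout here are from my
memory of the Cyber tools and may not match any one of them exactly):

  def expand(deck_text, library):
      # library: {deckname: list of source lines}; a common deck is simply
      # a deck that is allowed to be pulled in here.
      out = []
      for line in deck_text:
          if line.startswith('*CALL '):
              out.extend(expand(library[line.split()[1]], library))
          else:
              out.append(line)
      return out

(I recall common decks being able to *CALL further common decks, hence the
recursion, but don't hold me to the details.)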

A modset consists of its name, the name of the deck to modify, and
insertion or deletion directives, each followed by arbitrarily long
sections of text.  The text is inserted at the point the insertion or
deletion directives refer to.  The insertion points are designated as
modname.seqno or deckname.seqno, or just seqno if an original card in
the deck is desired.  Deletion points can specify a range of lines.

The resemblance to drawers of punched card decks betrays the obvious
ancestry.  The card model is probably obsolete, although lines are still
pretty pervasive as the unit of change in source control.

Modset creation generally means that you take a nice assembly/compiler
listing (made from the compile file with its sequence numbers) and go
off into a corner and mark it all up.  Then you figure out which lines
are the insertion and deletion points, and type up your new text in
between directives.  Then you reenter the edit-compile-debug loop.
Eventually the process was automated with tools analogous to diff -e.

If a source deck accumulated enough modsets that no more than some
arbitrary number of its lines were still original code, it was said to need
resequencing.  This is a fairly major event in the life of a deck, and
probably only happens twice.  It destroys all the modset information
while retaining the net changes.  The OS vendor would resequence decks
occasionally with new releases, and installation-specific modsets would
have to be converted because the line numbers changed.  Locally-
developed programs could suffer resequencing, too.
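
In other words, resequencing keeps the net result and discards the
history.  As a sketch (the same sort of toy line records as elsewhere in
this thread, not anybody's real implementation):

  def resequence(deck, deckname):
      # deck: list of {'id': (name, seqno), 'text': str, 'active': bool}.
      # Keep only the active lines, renumber them deckname.1, .2, ..., and
      # drop every modset identifier -- which is exactly why modsets written
      # against the old numbers no longer apply afterwards.
      survivors = [l for l in deck if l['active']]
      return [{'id': (deckname, n), 'text': l['text'], 'active': True}
              for n, l in enumerate(survivors, start=1)]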

A set of miscellaneous tools rounded things out with the ability to
extract everything back into source form, expunge modsets, and perform
other mundane operations.

>It appears that the main editing done by programmers using such a system is
>the creation of "modsets".  These, in general, specify that certain lines
>are to be inserted into a particular piece of source code.  (I gather some
>systems allow more than one piece of source (source file or "deck") to be
>updated with the same modset.  Maybe they all do?) 

For reference purposes, UPDATE/REVISE treat decks and modsets as the
same thing, whereas MODIFY distinguishes them.  This just dictates
whether there are "DECK" directives in your modset, and what the modset
sequence numbers are.

However, this ability of one named change to alter several related
source entities simultaneously and indivisibly is one of several
features I miss from the M/U/R paradigm.  Another feature is
independent, non-monotonic changes, any particular collection of which
may be selected to build a particular flavor of product (analogous to
#ifdef FEATURE).  The third is PL integrity (mentioned above).

>In addition, a modset
>may specify that certain lines of code be deleted (deactivated), or that
>certain lines which were previously deleted be restored.

Restoring deleted lines, though supported, was a no-no as far as OS
support was concerned.  A real good way to cause dependencies and other
conflicts.  You were to accomplish the same net effect by inserting new
copies of the originally-deleted lines.  COMPARE (diff -e) would do it
this way.

Joe User, in complete control of his own PLs, however, is perfectly
welcome to create conflicts and dependencies, as long as he can deal
with the results.

>I don't know how the lines to be inserted or deleted are identified.  I
>would guess that each line has a line number, and that new lines are
>inserted so that their line numbers remain in order.

By name.number.  "number" is always sequential, and there are no
fractions.  Each change (modset) has a unique name (modname).  The
number of a line never changes, but it could be deactivated, and an
identical line with a different modname and number could replace it.
This is one reason why resequencing is so traumatic.

>It appears that traditionally, programmers directly created modsets, and
>that it is a relatively new and far from universal thing for them to edit
>the entire source file, and create the modfile mechanically.

"relatively" is relative here.  This basically describes the state of
affairs four years ago and more at a large university running
mainframes.  Eventually each programmer would discover that COMPARE
could generate modsets.

>It appears that the compilation step (in the development cycle) with such a
>system is preceded by combining the programmer's modsets with the current
>state of the code control system to produce the actual source to be compiled.

Essentially correct.  It is a special case.  Even in the case of
product installation, a compile file is built.  The PL is really a list
of lines, each of which contains a pointer to each modset that
referenced it and how (active or not).  When applying a modset, 
another set of these pointers is built, and making the compile file
involves looking at the list of pointers to see the net effect.  If the
line is now active, it is copied to the compile file.
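
My mental picture of that bookkeeping, as a small sketch (the class name,
layout, and 'applied' parameter are my own guesses, not the actual PL
format): every line carries a history of which deck or modset activated or
deactivated it, and the compile file keeps whatever comes out active under
the set of changes chosen for this build.

  class PLLine:
      def __init__(self, ident, text, origin):
          self.ident, self.text = ident, text        # ident = (name, seqno)
          self.history = [(origin, 'activate')]      # creator: deck or modset

      def net_active(self, applied):
          # applied: the deck and modset names selected for this build,
          # considered in application order.
          state = False
          for name, action in self.history:
              if name in applied:
                  state = (action == 'activate')
          return state

  def compile_file(pl, applied):
      # One output record per active line: source text plus sequence number.
      return ['%-72s %s.%d' % (l.text, l.ident[0], l.ident[1])
              for l in pl if l.net_active(applied)]

Leaving a modset's name out of 'applied' gives you the flavor of the
product without that change, which is the #ifdef-FEATURE-like selection
mentioned above.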

>When a programmer is satisfied with his changes, he "signs in" his
>modfile(s).  There may(?) be a review by someone before this actually
>becomes part of the standard.

For the OS with local changes, yes.  The system programmers "submitted"
their modsets to a coordinator, who then figured out which (if any)
modsets conflicted with or depended upon one another or the same thing.
(This process was eventually automated.)  The submitters were then
asked to reconcile their modsets to resolve these problems.  Then the
whole mess was printed and circulated for code review.

>--------

>It seems to me that the main problem with this kind of system is sorting out
>simultaneous changes to the same piece of code.  It seems to me that the
>advocates of this approach have become so used to dealing with this, that
>they simply accept it as part of the system.  (In the comments I saw, there
>were several suggestions for *mitigating* the problem, but no hint that it
>was something one might want to *eliminate*.  It is a key advantage of the
>SCCS-style approach that it does avoid the problem.)

So what does a programmer do while waiting for the code to be
available?  Go to the listings on the shelf and work out the changes
based on that version.  Then when the source is free, apply the
changes.  Surprise, reality has changed.

But there are usually many ways to subvert the intentions of a source
code control system.

>I should note my own preference (not fully shared by my current co-workers).
>I prefer an SCCS-style code control system, in conjunction with a convention
>that there is only one entry point per file.  If, as is good for other
>reasons as well, the functions are all kept relatively small, then all the
>source files are small.  This means that two programmers trying to access
>the same file at the same time are really trying to change the same code,
>and one of them should wait for the other to finish.  (The inability for two
>people to modify the same module at the same time is the characteristic
>problem of this style of code control system, as integrating simultaneous
>changes is the characteristic problem of the other.)

Usually the integration is painless, if the two programmers are doing
different things and don't bump into each other.

>One *must* support this with some kind of include file system, so that
>declarations can be consistent across modules.  The inclusion process need
>not be regarded as the responsibility of the code control system, however.
>(The include files themselves are source files to be maintained; but that is
>another matter.)

Indeed.  It just so happened that M/U/R provided the include mechanism
because the included portions are part of the program library.  They could
also be included from another such library, if needed (usually the case
for the system common decks).
-- 
Scrape 'em off, Jim!

Jeff Woolsey  National Semiconductor
woolsey@nsc.NSC.COM  -or-  woolsey@umn-cs.cs.umn.EDU

smryan@garth.UUCP (Steven Ryan) (05/19/88)

Gee, I never thought anybody outside of CDC knew of MODIFY (NOS), UPDATE
(NOS/BE, NOS, VSOS), and now SCU (NOS/VE).

All of the source code is stored in what is called a program (or source)
library (called the PL). The program library is divided into decks (as in
punched cards), which are divided into individual lines.  A deck can be a
module, a data structure, or whatever conceptual chunk you wish to use.  Each
deck has a unique name.  MODIFY maintains the last modification date for each deck.

A line is a line of source text, an identifier, and a modification history.
UPDATE line identifiers are unique across the PL. MODIFY identifiers are only
unique within the deck. Lines in the original deck have identifiers like
deckname.1, deckname.2, ... Lines subsequently inserted have identifiers
like id.1, id.2, ... A line can be deactivated (deleted), activated, 
deactivated, et cetera, an arbitrary number of times as the result of a
series of idents (changes). The modification history attached to each line
refers to successive activations/deactivations and which ident did it.

Only lines which are currently activated are listed. All the deleted lines
are still there, though, which is useful in the whoops! mode of programming.
Once a group of idents is permanently added to the PL and the system built,
if you discover, whoops! ident F7B022 just broke the entire vectoriser, you
just do a *YANK F7B022 to magically erase F7B022 from the system. Actually,
it goes through the modification history and adds an activation to lines 
deleted by F7B022 and deactivation to lines inserted. As long as subsequent
idents do not refer to the same lines, in principle, F7B022 can be *YANKed
and *UNYANKed as many times as necessary to get it right.
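
If it helps, the mechanics read like this toy (my own rendering of the
paragraph above; the field names are made up):

  def yank(pl, ident, yank_id):
      # pl: list of {'text': ..., 'history': [(ident, action), ...]} where
      # action is 'activate' or 'deactivate'; a line is active when the
      # last entry in its history says 'activate'.
      for line in pl:
          acts = [a for name, a in line['history'] if name == ident]
          if not acts:
              continue                    # this ident never touched the line
          undo = 'deactivate' if acts[-1] == 'activate' else 'activate'
          line['history'].append((yank_id, undo))

*UNYANK is the same walk with the actions flipped back again, which is why
the whole business only stays clean while no later ident has touched the
same lines.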

Programmers have to hand-code the changes line by line, which is not very
pleasant unless SCOOP is working.  But having all the line changes in an ident has
the advantage of making the changes very visible, simplifying the code
reviews.  Idents contain both editing commands and comments, which carry the
programmer's name, other identification, and an explanation of the change and
its reason, all in one bundle.

Far from making collisions in the source more burdensome, it usually makes
them less so. Two separate programmers can modify the same deck without error
as long as they modify distinct lines.  Generally safe.  The project leader
is supposed to review all idents for interferences, but this includes
interferences that might span separate decks.

All idents for a PL are collected and applied en masse for each build cycle.
At this point, if two idents affect the same line, MODIFY/UPDATE squeals
loudly, and the project adjusts for the unexpected overlap.  This does
happen, but usually only about once a year, and the fix takes a five-minute change.

                                               Hafa an godne daege.
                                                            sm ryan

ps. Control Data/ETA has no access to this network that I know of.  For
    this reason, you may never get a response from CDC on this or any
    other subject.

jxh@cup.portal.com (05/20/88)

I just got my copy of Polytron's PVCS (Polytron Version Control System).
I will be launching myself into it shortly.  When I surface again, I'll
bring a full report.

-Jim Hickstein, VSAT Systems, Inc,  San Jose, CA
jxh@cup.portal.com   ...!sun!portal!cup.portal.com!jxh

dricej@drilex.UUCP (Craig Jackson) (05/21/88)

This is a discussion that is near and dear to my heart.  While the *CS
programs have many advantages, there are a few things which I sorely
miss from 'mainframe' source code control systems (of which MODIFY/UPDATE
is one).

The biggest thing that I miss in the Unix world is the ability to easily
have independent development on a common body of source by two sets
of programmers in two locations.  The most common case of this is
a vendor sending out source updates, and a local site making patches.
In the Unix world, each time AT&T comes out with a release, all of the
System V vendors need to re-do their port.  Now Unix is portable, but it
isn't so portable that unnecessary ports are to be desired.  In a 
system such as M/U, there is a unique identifier attached to each line
of the source file.  Two modsets can affect different regions of the file
with no conflict whatsoever.

In the Unix world, diff -c & patch attempt to provide the same utility.
However, if there isn't enough common context, things fall apart.  Also,
diff -c only comes from Berkeley; you're left to the net to pick up
diffc if you're on a USG system.  'patch' only comes from the net in
the first place.
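
For anyone who has not run into it, a context diff locates each hunk only
by the text around it.  For instance (produced here with Python's difflib
just to show the shape of the output; the real tools in question are
diff -c and patch):

  import difflib

  old = ['alpha\n', 'beta\n', 'gamma\n']
  new = ['alpha\n', 'BETA\n', 'gamma\n']
  print(''.join(difflib.context_diff(old, new, fromfile='old', tofile='new')))

If someone else's patch has already disturbed the lines around a hunk, that
context no longer matches, and patch has to guess or give up; a per-line
identifier survives such edits untouched.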

You may denigrate the need for 'line numbers' or 'line identifiers' in
systems such as M/U.  Yes, they are extra baggage.  Yes, they do go
along with such things as fixed-length source lines and source listings.
Yes, they do imply occasional resequencing.  However, by uniquely
identifying each line, it's possible to unambiguously talk about precise
regions of code.  I only wish I could give each token in the source file
a unique identifier, but it isn't really feasible.

-- 
Craig Jackson
UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej
BIX:  cjackson