[comp.software-eng] Soft-Eng-Dig-v5n10

soft-eng@MITRE.ARPA (06/07/88)

Soft-Eng Digest             Mon,  5 Jun 88       V5: Issue  10

Today's Topics:
                     Paradigms for source control
----------------------------------------------------------------------

Date: 10 May 88 21:54:30 GMT
From: portal!cup.portal.com!jxh@uunet.uu.net
Subject: Paradigms for source control

Some time ago, there was a discussion among a few friends of mine about
source-control programs.  I thought it was time to revive this issue, and
now that I have a news feed I thought that comp.software-eng was an
appropriate place for it to rage.  What follows is the entire text of the
discussion as I received it over email.  Beware of people's email addresses:
several have changed since then, notably mine.  Also, bear in mind that these
were personal messages originally, and were not composed for net consumption,
so first names are used freely.  Here is a list of the people involved, and
their real names and (I hope) current addresses:

Jim Hickstein (jxh)     jxh@cup.PORTAL.COM [me]
Jeff Woolsey (jlw)      woolsey@nsc.NSC.COM
Jeff Pomeroy (jlp)      jlp@CRAY.COM
Dan Germann (deg)       deg@kksys.UUCP
Mark Ransom (msr)       msr@kksys.UUCP
Alan Arndt (aga)        (not on the net just now: send to care of jxh)

I intended this discussion to revolve around issues of implementing our own
such program.  Now that I open this to the net, I would welcome thoughtful
insights about features of various (perhaps extant) programs; but still with
an eye toward implementing something.  Please, no flames about somebody-or-
other's lousy program that ate your file.  We should work toward specifying
a program that will solve problems for those involved.
============================================================================
-
>From tsec!nsc!amdahl!meccts!kksys!deg Fri Sep 11 08:13:22 1987
Received: by nsc.NSC.COM; Thu Sep 10 21:44:44 1987
Received: by amdahl.UUCP (4.12/UTS580_/\o-/\)
id AA07489; Thu, 10 Sep 87 16:36:09 PDT
Received: by meccts.MECC.MN.ORG (smail2.5)
id AA16368; 10 Sep 87 04:57:37 CDT (Thu)
Received: by kksys.UUCP (smail2.3)
id AA01692; 10 Sep 87 03:36:15 CDT (Thu)
To: deg, meccts!amdahl!nsc!tsec!nsc-nca!jxh, meccts!amdahl!nsc!woolsey, msr,
        umn-cs!cray!jlp
Subject: Welcome to library-people mailing list
Date: 10 Sep 87 03:36:15 CDT (Thu)
From: deg@kksys.UUCP
Message-Id: <8709100336.AA01692@kksys.UUCP>

A week or so ago, Jim and I were talking about Revise, and noted that
it would be nice to have all our discussions down on paper.  We also
thought it would be nice if our discussions were not just between the two of
us, given that there are several people who are interested in source library
maintenance.  Therefore, I have created this mailing list.  Our intention is
to discuss the functions and features desirable in tools which maintain
source programs.

Anyone care to kick things off?

-Dan


>From jxh Fri Sep 11 18:31:15 1987
To: jxh, tsec!nsc!deg@kksys, tsec!nsc!jlp@cray, tsec!nsc!msr@kksys, tsec!nsc!woolsey
Subject: library-people kickoff

Well, I suppose I'll start, but this is probably not the first kickoff, since
we're not synchronized.
Let me first propose that Alan Arndt be added to this group: Al works for
Ultron Labs in San Jose, and uses Polytron PVCS (Polytron Version Control
System).  He has quite a bit of experience with it.  I do not propose that he
be added to sell Polytron to us, but that his experience with its good *and*
bad points may serve us well.  He is not on the net as I speak, but I intend
to help him rectify this situation soon.  Tentatively, send to nsc-nca!aga.

For those of you who just tuned in, let me recap my discussions with Dan.
I have been sort of working on Revise, which is a descendant of UPDATE/MODIFY
from the Cybers, in that it has some of the same basic assumptions, namely that
there are Decks containing Lines (they got rid of the word Card at this point),
and that lines may be Active or Inactive, and that they may be inserted, and
deactivated, but not actually deleted, by a modset.  A modset can also
reactivate "deleted" lines.  It is different from UPDATE/MODIFY in that
it is a standard Pascal program designed for portability, and it operates on
simple Text files: the PL (program library) format is specified to be so
general as not to cause problems on any known Pascal.  Tradeoffs were made in
favor of portability over efficiency of time and space.  This is a very pure
example of the tenet that the more adaptable a system is, the less well-adapted
it is in each different circumstance.  It is also different from UPDATE/MODIFY
in that it operates on only one PL at a time.  Dan and I identified this is
a major departure from a quite useful feature, namely the *OPL directive,
which indicates that several OPL files be considered as "OldLib" for one
run.
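
[Illustration, not part of the original message.  A minimal Python sketch
of the deck/line model described above, assuming line identifiers of the
form MODSET.n and a per-line history list.  The class and method names are
hypothetical and are not taken from Revise or UPDATE/MODIFY.]

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    ident: str    # e.g. "BASE.2": inserting modset name plus a sequence number
    text: str
    history: list = field(default_factory=list)  # [(modset, action), ...]

    @property
    def active(self):
        # The most recent action decides whether the line appears in the output.
        return self.history[-1][1] != "deactivate"


class Deck:
    def __init__(self, name, texts, base="BASE"):
        self.name = name
        self.lines = [Line(f"{base}.{n}", text, [(base, "insert")])
                      for n, text in enumerate(texts, start=1)]

    def _find(self, ident):
        for pos, line in enumerate(self.lines):
            if line.ident == ident:
                return pos
        raise KeyError(ident)

    def insert_after(self, modset, ident, texts):
        pos = self._find(ident) + 1
        # Continue numbering after any lines this modset already inserted.
        used = sum(1 for line in self.lines if line.history[0][0] == modset)
        new = [Line(f"{modset}.{used + n}", text, [(modset, "insert")])
               for n, text in enumerate(texts, start=1)]
        self.lines[pos:pos] = new

    def deactivate(self, modset, ident):
        # The line stays in the deck; it is only marked inactive.
        self.lines[self._find(ident)].history.append((modset, "deactivate"))

    def reactivate(self, modset, ident):
        self.lines[self._find(ident)].history.append((modset, "reactivate"))

    def compile_lines(self):
        # Only active lines reach the compiler.
        return [line.text for line in self.lines if line.active]
```

[Because a deactivated line stays in the deck, a later modset can still
name it or bring it back.]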

There are those in Unix land who use a program called SCCS, and another called
RCS (not caps).  I am less familiar with these, but I think that PVCS is
related: They operate by controlling the "checking out" and subsequent
"checking in" of modules.  One checks out a module, edits it, and checks it
in, identifying a "version" as the label for this instance of that module.
Basically, these programs run COMPARE over the current and new modules, and
store the differences.  Also, they tend to store the most "recent" "version"
in a form most efficient to access: the stored changes are "inverted" so that
the program applies a change to "go back" to a "previous" version.  All
this is in quotes because, as we know, program modifications are not
necessarily monotonic with time.  An example is the distribution kit for
Revise, which gives a base PL, and two Modsets "CYBER" and "VAX", one of
which is applied to make Revise compile in each environment.
Neither CYBER nor VAX is more "recent" than the other, although one could
certainly say that they are "versions."  I often use the word "flavors" to
express this concept.
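
[Illustration, not part of the original message.  A minimal Python sketch
of the "inverted" storage described above: the newest revision is kept
whole, and older revisions are reached by applying stored changes
backwards.  It uses the standard difflib module in place of COMPARE.  The
Archive class and its methods are hypothetical and do not reproduce the
file format of any real tool.]

```python
import difflib


def reverse_delta(newer, older):
    """Return edit operations that rebuild `older` from `newer`."""
    ops = []
    matcher = difflib.SequenceMatcher(a=newer, b=older, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Replace newer[i1:i2] with the corresponding older lines.
            ops.append((i1, i2, older[j1:j2]))
    return ops


def apply_delta(lines, ops):
    result = list(lines)
    # Work from the end so that earlier indices remain valid.
    for i1, i2, replacement in reversed(ops):
        result[i1:i2] = replacement
    return result


class Archive:
    def __init__(self, first):
        self.head = list(first)   # newest revision, stored in full
        self.deltas = []          # deltas[k] turns revision k+2 into revision k+1

    def check_in(self, new):
        new = list(new)
        self.deltas.append(reverse_delta(new, self.head))
        self.head = new

    def check_out(self, revision):
        """Revisions are numbered from 1; the newest needs no delta at all."""
        lines = self.head
        for ops in reversed(self.deltas[revision - 1:]):
            lines = apply_delta(lines, ops)
        return list(lines)
```

[Note that this layout has exactly one "newest" revision, so it has no
natural place for two flavors such as CYBER and VAX that are neither older
nor newer than each other.]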

The differences between SCCS/RCS and UPDATE/MODIFY are radical.  They are
so different, that I say that each represents a "paradigm" of source
control generally.  The UPDATE/MODIFY paradigm is the one with which I am
most familiar, so I naturally want a program that embodies it to run in my
preferred (enforced?) environment.  Those "brought up" on RCS or SCCS cannot
see any merit in the other paradigm, at least those to whom I have talked.

I invite you to make your own observations about these paradigms, and about
specific programs.  Certainly, there are features conceivable that exist in
neither of these worlds: let us consider them, as well.  Our goal should be
to establish a new paradigm which embodies the best features of all others,
and can be implemented well.  We can then move to design questions about
specific implementations.

Fire away.

Jim Hickstein
...!nsc!tsec!nsc-nca!jxh
VSAT Systems, Inc.
San Jose, CA


>From tsec!nsc!nsc.NSC.COM!woolsey Tue Sep 15 20:23:20 1987
Received: by nsc.NSC.COM; Tue Sep 15 10:32:16 1987
Received: by pubs.nsc.com; Tue Sep 15 10:30:31 1987
Date: Tue, 15 Sep 87 10:30:31 PDT
From: Jeff Woolsey <woolsey@nsc.NSC.COM>
Message-Id: <8709151730.AA13546@pubs.nsc.com>
To: deg@kksys.UUCP, jlp@cray.COM, msr@kksys.UUCP, nsc!tsec!nsc-nca!jxh,
        tsec!nsc-nca!aga
Subject: library-people first down

>[Al] uses Polytron PVCS (Polytron Version Control System).

Department of Redundancy Department

I looked into a similar package while at TGC; I think Grady [Davis, now @ VSI]
got it for us to evaluate.  I can't remember the name of the package. That's
some indication of how impressed I was.  Perhaps Jim can remember its name.
It was of the general paradigm rampant among small machines for such packages.
(For these purposes a small machine is anything of VAX power or smaller, or
anything running Unix (sorry, Jeff).)

>it has some of the same basic assumptions, namely that
>there are Decks containing Lines (they got rid of the word Card at this point),

but not the word "Deck"?

>it is a standard Pascal program designed for portability, and it operates on
>simple Text files: the PL (program library) format is specified to be so
>general as not to cause problems on any known Pascal.  Tradeoffs were made in
>favor of portability over efficiency of time and space.

Weren't there some cases in the code for Revise where an order of magnitude
improvement could be had at no cost in portability?  Didn't you find some of
those, Jim?

>It is also different from UPDATE/MODIFY in that it operates on only
>one PL at a time.  Dan and I identified this is [sic]
>a major departure from a quite useful feature, namely the *OPL directive,
>which indicates that several OPL files be considered as "OldLib" for one
>run.

Strike one.  It could be difficult to restore this feature efficiently, judging
from present Revise performance with one PL on machines we can afford to
purchase.

>There are those in Unix land who use a program called SCCS, and another called
>RCS (not caps).

Huh?  RCS is still the name of RCS, even though RCS might be invoked as rcs.

>I am less familiar with these, but I think that PVCS is
>related: They operate by controlling the "checking out" and subsequent
>"checking in" of modules.  One checks out a module, edits it, and checks it
>in, identifying a "version" as the label for this instance of that module.

Can any of these programs operate correctly if the directory and files where
the "library" are are read-only?  The check-out process usually wants to note
somewhere that someone is doing something which could cause inconsistencies
in someone's view of the state of the "library".

M/U users used a property of the NOS file system, namely that a D/A file
was BUSY (write-locked by someone else) or unwritable (someone has it in
READ without ALLOW-MODIFY or ALLOW-APPEND).

>Basically, these programs run COMPARE over the current and new modules, and
>store the differences.  Also, they tend to store the most "recent" "version"
>in a form most efficient to access: the stored changes are "inverted" so that
>the program applies a change to "go back" to a "previous" version.  All
>this is in quotes because, as we know, program modifications are not
>necessarily monotonic with time.  An example is the distribution kit for
>Revise, which gives a base PL, and two Modsets "CYBER" and "VAX", one of
>which is applied to make Revise compile in each environment.

I think the importance of this concept cannot be overstated, but has been
overlooked by all of the "small system" [as above] library tools.  This is
feature code, a sort of conditional compilation/assembly pulled back one
level of processing.

>The differences between SCCS/RCS and UPDATE/MODIFY are radical.

That's putting it mildly.  The only thing in common is that they are
attempts to solve the same problem.  Well, almost.  I think MODIFY/UPDATE
recognized additional sub-problems and tried to solve those, too.

>They are
>so different, that I say that each represents a "paradigm" of source

No need for quotes here...

>control generally.  The UPDATE/MODIFY paradigm is the one with which I am
>most familiar, so I naturally want a program that embodies it to run in my
>preferred (enforced?) environment.  Those "brought up" on RCS or SCCS cannot
>see any merit in the other paradigm, at least those to whom I have talked.

Indeed.  Perhaps we can attack their character by saying that they never saw
the need because they never worked on a VERY large product (such as NOS)
requiring coordinated effort by sizable teams.

I see here also another major difference between M/U and *CS:  The former is
monolithic, while the latter is incremental.  Let me explain.  M/U run on
machines where (at least theoretically) there is enough power available that
it is no great drain on resources to keep editing a modset and creating a
COMPILE file every time you want to assemble/compile something.  You do not
notice how much (possibly redundant) work you are asking the machine to
perform, because small increments of work are not noticeable.  This is true
only up to a point, as I would sometimes go the *CS route: pull out a source
file, edit it for two days, THEN use compare to make a modset.

*CS run on machines without sufficient power to hide these small increments
of work.  Thus the paradigm changed to permit the elimination of most of
the MODSET -> COMPILE file operations in an edit cycle.  The time required
for the edit cycle remains on the not-enough-time-to-get-coffee side of the
line, whereas with M/U (and Revise, as we have seen) it retreats past coffee
and on into time-enough-to-read-War-and-Peace territory.  Nothing like
breaking a train of thought to introduce errors and reduce engineer
effectiveness.  So small increments of change are evaluated.  Other examples
of this technique are incremental compilers, and Chess 0.5 (as featured in BYTE
some time ago).  There was even some talk at UCC about a text editor that would
spit out a modset when you were done.

>I invite you to make your own observations about these paradigms, and about
>specific programs.  Certainly, there are features conceivable that exist in
>neither of these worlds: let us consider them, as well.  Our goal should be
>to establish a new paradigm which embodies the best features of all others.

Dream on, then?  OK.

I like named modsets.  I like independent modsets.  I like being protected
from disaster (significant effort required to make changes stick).  I like
to remove modsets.  I like to make virtual OPLs (*OPLFILE) (Often I'd use
four such PLs in building the Cray Station.).  I would like to minimize
work and maximize speed using incremental techniques.  I'm not completely
comfortable with the smallest unit of change being a line, as the significance
of lines diminishes in modern languages. For that matter, we aren't always
maintaining programs.  Soon library-people shall enter the world of the DBMS.

The biggest difference between MODIFY and REVISE/UPDATE is random access, and
everything you can do with it.

Your turn.



>From tsec!nsc!amdahl!meccts!kksys!deg Sat Oct  3 18:19:31 1987
Received: by nsc.NSC.COM; Tue Oct  6 00:28:21 1987
Received: by amdahl.UUCP (4.12/UTS580_/\o-/\)
id AA25752; Mon, 5 Oct 87 23:39:04 PDT
Received: by meccts.MECC.MN.ORG (smail2.5)
id AA06036; 6 Oct 87 00:04:28 CDT (Tue)
Received: by kksys.UUCP (smail2.3)
id AA10528; 5 Oct 87 22:30:41 CDT (Mon)
To: deg, meccts!amdahl!nsc!tsec!nsc-nca!aga,
        meccts!amdahl!nsc!tsec!nsc-nca!jxh, meccts!amdahl!nsc!woolsey, msr,
        umn-cs!cray!jlp
Subject: Ramblings from one of the guys "in the back"
Date: 5 Oct 87 22:30:41 CDT (Mon)
From: deg@kksys.UUCP
Message-Id: <8710052230.AA10528@kksys.UUCP>

>SCCS, RCS, etc...

SCCS attempted to enforce module integrity by allowing only one user to
gain access to the "source" of the module in "edit" mode at one time.
There were several commands in the SCCS package, "admin" to administrate
the SCCS files, "delta" to apply a change to a SCCS file, "get" to obtain
the source to a SCCS file, etc.  I believe that "get -e" (or something)
was used to declare that you intended to make changes to the file, rather
than just look at or compile it.  You were not allowed to do a "get -e" on
a file that someone else had interlocked by their "get -e".  The interlock
was cleared when the "delta" to the SCCS file was posted.  The "teeth"
in SCCS were due to the files being owned by a project leader, who created
and modified the SCCS files.  He (actually, SCCS probably did this by
default) would set the file permissions so that only he would be able
to write the SCCS versions of the files.  The project team members would
be in a unix group that had permission to read the SCCS files.
I believe that RCS works similarly, except that the commands are
"ci" (check in) [don't mistype your editor name or the file disappears!]
and "co" (check out).  I have no idea how they work.  In either case, if
the SCCS/RCS files were not protected BY THE OPERATING SYSTEM, any person
with write permission to the files could apply any change at any time.
In fact, they could trash the files completely.  What an interesting
implementation of "source code control".  Actually, this is no different
from Modify/Update/Revise/yournamehere; perhaps we will have to rely on
the file protection facilities available in the host operating systems
to ensure PL integrity.
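
[Illustration, not part of the original message.  A minimal Python sketch
of a one-editor-at-a-time interlock like the one described above, assuming
a lock file created with O_CREAT|O_EXCL beside the module.  This is not
how SCCS itself records an edit in progress; the function names and the
".lock" suffix are hypothetical.]

```python
import os


class Busy(Exception):
    """Raised when someone else already holds the edit interlock."""


def lock_path(module):
    return module + ".lock"   # hypothetical naming convention


def get_for_edit(module, user):
    """Claim the module for editing, or raise Busy if it is already claimed."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path(module), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        with open(lock_path(module)) as f:
            holder = f.read().strip()
        raise Busy(f"{module} is being edited by {holder}") from None
    with os.fdopen(fd, "w") as f:
        f.write(user + "\n")


def post_delta(module, user):
    """Release the interlock; only the user who claimed it may do so."""
    with open(lock_path(module)) as f:
        holder = f.read().strip()
    if holder != user:
        raise PermissionError(f"{module} is locked by {holder}, not {user}")
    os.remove(lock_path(module))
```

[As with the real tools, nothing here stops a person with write permission
from deleting the lock file by hand; the operating system has to supply
that protection.]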

>...non-monotonic program modifications...

Another advantage to the M/U paradigm is the ability to have a "debug"
or "test" modset that is not kept in the PL.  The Pascal Group at UCC
had a modset that introduced all sorts of good writelns into the compiler
source.  We used this whenever there was a bizarre code generation error.
All we had to do was apply the modset and recompile the compiler.
Unfortunately, when we made MAJOR changes to the compiler, we had to
resequence the modset.  Fortunately, this was an infrequent occurrence.
We could have made the modset a permanent part of the PL, but it would
have blurred the otherwise clear boundary between the compiler and the
debug code.  As part of the PL, the modset would have required updating
whenever a compiler change affected it.  If we put the debug modset
corrections in the compiler change, we could no longer simply "yank"
the debug modset to remove the debug code.  If we made the corrections in
a second compiler modset, we would have been unable to "yank" the compiler
changes without also yanking the debug code modset.  This sounds like
a good issue to address; perhaps what we need is an ability to group
modsets: when mod1 is yanked, also yank mod2, and when mod3 is yanked,
also yank mod2.  Then again, maybe this is a load of rubbish.
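
[Illustration, not part of the original message.  A minimal Python sketch
of the grouping idea above, assuming the groups are recorded as a table
mapping each modset to the modsets that must be yanked with it.  The
function name and the table are hypothetical.]

```python
def yank_set(requested, also_yank):
    """Return every modset that has to be yanked along with `requested`."""
    result = set()
    pending = [requested]
    while pending:
        mod = pending.pop()
        if mod in result:
            continue            # already handled; also guards against cycles
        result.add(mod)
        pending.extend(also_yank.get(mod, ()))
    return result


# The rule from the text: yanking mod1 or mod3 drags mod2 along.
ALSO_YANK = {"mod1": ["mod2"], "mod3": ["mod2"]}
```

[With this table, yank_set("mod1", ALSO_YANK) gives mod1 and mod2, while
yanking mod2 by itself takes nothing else with it.]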

I recently found myself longing for this capability here at CFA.  We have
two versions of a printer driver: production and test [I know, I know...
it sounds like an IBM shop.  sorry.].  In the beginning of this year, we
drastically changed the production version.  As we all know, test versions
of software hang around forever.  This printer driver is no exception.
If I had a modset that could flip between the two versions, I'd be ecstatic.
However, this is CFA...  stone knives and bearskins, remember?  At least we
take offsite dumps every now and then...  but source control?  Forget it...
we come as close to good source control as Bork comes to having a real beard.

>what else would you call the other Revise feature-vestige but "Common
>Deck?"

Associate(d) Deck?

>Observe, however, that if the logic of Revise changes too drastically,
>the source will not be recoverable.  (It also helps to take executable
>versions of Revise on field trips.)  This is the bootstrapping problem.

Gee, that's a good point.

>Virtual OPLs (*OPLFILE)...

Maybe we should reassess the need for this feature.  Why was it there
(in Modify) in the first place?  As near as I can tell, the only reason
for *OPLFILE was for an ABS assembly, where you had to *CALL all the
subroutines you used into the source fed to COMPASS so there would be
no external references.  Why did we have ABS assemblies?  Another good
question.  Answers include being able to create multiple entry point
programs, having the "good" loader tables (needed by Cyber loader if
you ran under RFL,0), and being able to fix inter-program communication
areas at specified addresses (did we REALLY do that?).  I'm not sure if
any of these reasons applies to what we're doing today.  We all use
language processors that generate relocatable object files, and don't
complain too much about having to link everything.  OK, so we use
Turbo Pascal, too.  But we've already complained about that.  Maybe they
will fix these annoyances in 518 ... er ... version 4.0.  At any rate,
do we really need to include subroutines with *CALL if we have pre-compiled
versions of them in a library somewhere?  Just think, if we don't have to
compile them every time, we can cut down the compile-test-edit cycle time.

>Revise, as Dan convinced me with great eloquence and amplitude, is in the
>past.  It is rather too fixed to bother spending much time reworking it...

>>Strange though this sounds, I think I now understand Revise
>>well enough to abandon it.  I embarked on my translation project
>>because I believed that Revise contained the Final Wisdom about
>>library management (well, sort of) if I could only get the oracle
>>to speak.  Having dug deep enough to understand Revise quite
>>completely, I can now see its shortcomings clearly.
>
>Abandoning it is perhaps a bit drastic.  Revise does embody some useful
>concepts.  It's just that they are in a form that is not abstract
>enough for our discussions.

There are many good things present in Revise; it's just that, along with
all the useful features and familiarity (with the M/U-like interface),
there's a lot of deadwood.  Jeff P. keeps telling me about the source
maintenance utilities they use at Cray.  I believe he said it was a collection
of three or four C-language programs, one to three pages each (how long is
Revise?).  He also said that most of the good features in Modify were
implemented.  Jeff, I'd sure like to hear more about those programs.

Other reasons for starting from scratch include having the freedom to
come up with a package (which, from the sounds of things, is heading toward
a DBMS...  there, now we've all said it) capable of performing the functions
we deem necessary, without having to work around an existing inadequate
framework, and being able to do with the resultant package as we please.

>The difference here is that we want transactions that can be backed out,
>and we want transactions that are independent of one another.

Well said, Jeff.  However, does this necessarily imply that a change to
the database, consisting of multiple interrelated transactions, will result
in a consistent update?  What I'm talking about here is the "classic" OPL
and JPL system used at UCC.

>>I think the edit-debug cycle followed by the
>>process of making a modset to describe what you've just been
>>through is valid.
>
>What that does is solidify (freeze) a set of changes.  By making them
>more permanent you're supposed to think about them more.  But the existence
>of COMPARE along with text editors subverts that intent.

I don't agree that Compare is responsible for sloppy work.  It does make
modset creation easier, but aren't we all looking for ways to make source
maintenance easier?  It is up to us to continue to be responsible and
professional about our changes; being able to easily generate a modset from
an edited source file is no excuse for failure to do a thorough job.

>The more I think about this, the less I know.

One thing is for certain: by the time we have a product, we'll all know
what the pitfalls of source maintenance are.  That, itself, may be the
most valuable thing each of us will carry away from this project.

>P.S.  How come I haven't heard anything from anyone but JLW?  Am
>I not getting through to faraway parts?  Can you HEAR ME IN THE
>BACK, THERE?  (This is the second note I have sent to
>library-people).

I don't think Mark has logged in yet, and I'm terminally disorganized.
Steve Oyanagi was mentioning a new Unicos release due out soon, so JLP
is probably up to his ears in testing.


>From tsec!nsc!nsc.NSC.COM!woolsey Mon Sep 28 22:44:45 1987
Received: by nsc.NSC.COM; Thu Oct  1 14:39:28 1987
Received: by pubs.nsc.com; Thu Oct  1 14:39:21 1987
Date: Thu, 1 Oct 87 14:39:21 PDT
From: Jeff Woolsey <woolsey@nsc.NSC.COM>
Message-Id: <8710012139.AA04751@pubs.nsc.com>
To: deg@kksys.UUCP, msr@kksys.UUCP, nsc!tsec!nsc-nca!jxh,
        pyramid!crayamid!cray!jlp, tsec!nsc-nca!aga
Subject: more ramblings

>To: jxh, tsec!nsc!amdahl!meccts!kksys!deg, tsec!nsc!amdahl!meccts!kksys!msr,
>        tsec!nsc!amdahl!meccts!umn-cs!cray!jlp, tsec!nsc!woolsey

Well, it ought to get to those other sites.  I just hope that tsec is smart
enough to realize that all the recipients for the copies it got are going
the same direction.

>Subject: Response to JLW's First Down

I see you learned how to do article quoting.  Usually that is the
majority of any article over 50 lines.  Not in this case.  How
unusual.

>what else would you call the other Revise feature-vestige but "Common
>Deck?"

Other extant names are "include file" and "header file", neither of
which I like very much.  Somewhere out there there is a gem of a term
for describing this concept ("subroutine"?  "macro"?) but it hasn't
presented itself yet.

>Not at *no* cost.  I adapted Revise (not to say hosed over) to
>take better advantage of my environment.  Your statement does
>not imply respect for the authors.  I must defend them (having
>seen their code most closely): they implemented tradeoffs in
>favor of portability, but they implemented them well.  Revise is
>a beautiful Pascal work-of-art. Alas, art is seldom utilitarian.

>Among other things, Pascal's character I/O was leaned on heavily;
>it is not implemented well in the Pascal compilers available to me.
>Borland's "product" (for all its good traits, it is *NOT* a Pascal compiler
>if it can't compile Revise) fails to implement character I/O *at all*,

Your statement does not imply respect for the authors.  I must defend
them (having used their code most closely): I contend that they DID
implement character I/O.  I base my statement on the presence of big,
ugly kludges like BLOCKREAD and the untyped FILE to provide block
I/O.  Granted, the character I/O of which you speak is characterized
[sorry] by file POINTERS, but that wasn't blindingly obvious in your
tirade.

As for the original authors of Revise, I intended not to impugn their
abilities as Pascal programmers--indeed, some, if not all, of them had
their fingers in the P-6000 pie.  I'd be rather surprised [why does
that word have two r's in it? -- never mind] if Revise was not already
fairly optimal from P-6000's point of view.  Rather, I meant to have
my recollection of Jim's efforts clarified, as was obviously needed.

Incidentally, I sure hope that our [NSC's] forthcoming Pascal compiler
has file pointers.  I think it does, owing to reports of efforts to pass
the Tasmanian test suite.

>>>...OPLFILE...
>
>>It could be difficult to restore this feature efficiently...
>
>It is not my intention to *restore* features of MODIFY to Revise.

It would be easier to add it to something else.

>Rather we should concentrate on specifying a new program which
>has these desirable features.  Revise, as Dan convinced me with
>great eloquence and amplitude, is the past.  It is rather too
>fixed to bother spending much time reworking it until it no
>longer resembles its former self.  That was just my approach
>during the early stages of my work in this realm precisely
>because I could not afford to change the logic of Revise for fear
>of destroying the only *documentation* of its behavior on other
>systems: the code itself.

There's a paradox here.  Fear of destroying this "documentation" can
be alleviated by making a copy.  Or by applying your changes as modsets.
Observe, however, that if the logic of Revise changes too drastically,
the source will not be recoverable.  (It also helps to take executable
versions of Revise on field trips.)  This is the bootstrapping problem.

>Strange though this sounds, I think I now understand Revise
>well enough to abandon it.  I embarked on my translation project
>because I believed that Revise contained the Final Wisdom about
>library management (well, sort of) if I could only get the oracle
>to speak.  Having dug deep enough to understand Revise quite
>completely, I can now see its shortcomings clearly.

Abandoning it is perhaps a bit drastic.  Revise does embody some useful
concepts.  It's just that they are in a form that is not abstract
enough for our discussions.

>>>... They operate by controlling the "checking out" and
>>>subsequent "checking in" of modules.
>
>>Can any of these programs operate correctly if the directory and
>>files where the "library" are are read-only?
>
>Perhaps you phrase your question too narrowly.  How about:  Can
>these programs utilize existing file protection mechanisms to
>protect the library from accidental or unauthorized modification?
>I believe that PVCS has a network *version* ( <-irony ) which
>can work with, e.g. PC-Network-compatible thingies.

I have a great deal of difficulty believing that the mere presence of
a network version of something like PVCS prevents me from using a non-
network version to modify the library, or from going at the library
with ordinary DOS commands.

Protection of the library is important, and UPDATE was able to take
advantage of the simple scheme available with 1/2" tape: write-rings.
(UPDATE (and I guess Revise, too) could process a library sequentially)
The trouble is that existing file protection mechanisms vary widely, and
may not even exist in some places.

>But even if some available program does this, it probably won't do it to our
>liking.  PVCS is awfully tightly coupled to the architecture of
>the "network", in that the application program interface for
>file- and record-locking and -sharing is quite specific to PC
>DOS.  I would hesitate to try to write a "portable" program which
>assumed that this interface existed everywhere.

Oh, you could go ahead and write such a program.  It would only be
portable to systems that supported that interface.  All you have to do,
then, is select an interface which is available everywhere that you
would want to take your program.  This is what standards committees are
for, and what they occasionally accomplish.  (e.g. POSIX. (P = Portable))

>I think we should simply say that our ideal program must have
>this capability.
      ^^^^^^^^^^
Property, please.  This should not be optional.

>Thinking further about this: MODIFY could use
>read-only PLs without telling anyone about it (i.e. writing in
>the PL some indication that I am making a modset); that
>information really just prevents two people making conflicting
>modsets;

Actually, it just keeps files consistent.  As applied to MODIFY,
this provides the prevention mentioned.

>in MODIFY, that conflict was resolved later by the
>authority responsible for actually applying changes to the
>system-wide PL.  I think it would be neat if we could somehow
>automatically avoid such conflicts or at least automatically
>resolve them.

Avoiding them is easier than resolving them.

>Perhaps modsets being applied at "the same time"
>can be sorted out by a program and adjusted.  One such conflict
>that seems a good candidate is that of one modset naming a line
>just deleted by another; a program that is "co-applying" these
>modsets would "know" about this line until all modsets were
>applied.  There.  I just opened THE big can of worms.  Come and
>get it!

What an appealing metaphor.  Yecch.

In the MODIFY universe, there are names for each type of
problematic combination of modsets.  I'll describe them here and let
you match them to your ideas above (and below, it turns out).

Dependent modsets relate such that one modset refers to a line that the
other inserted or deleted.  Modsets can refer to deleted lines; actually,
they are inactive rather than deleted.  This kind of reference generated
warnings from UPDATE or some such program.

The trivial case of conflicting modsets would be two identical modsets with
different names.  Applying the second one should have no effect other than
doubling the inserted groups.  These modsets may be yanked or deleted
independently of one another.  The generalized case is two modsets that do the
same thing to a line in a third modset or the original text.

Another kind of dependency [is that a word?] is two modsets that insert
after the same card (line).  Here the dependence is upon what order the
modsets are applied, with the newest modset's text appearing first.

Some or all of these conditions could be detected automatically, although
MODIFY never bothered.  UCC had a program called DEPEND to address some
of these issues.
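
As a rough illustration of how such detection might look, here is a
minimal sketch in Python.  It is not how MODIFY, UPDATE, or DEPEND worked;
the directive tuples, the "OWNER.SEQ" line ids, and the names classify,
owner, and lines_touched are all assumptions made up for the example.

```python
from itertools import combinations

# Hypothetical modset model: each directive is ("insert", after_id, [text, ...])
# or ("delete", line_id).  Line ids look like "OWNER.SEQ", where OWNER is the
# deck or modset that introduced the line.
def lines_touched(directives, kind):
    return {d[1] for d in directives if d[0] == kind}

def owner(line_id):
    return line_id.split(".", 1)[0]

def classify(modsets):
    """Return a list of (condition, first_modset, second_modset) tuples."""
    findings = []
    for (a, da), (b, db) in combinations(sorted(modsets.items()), 2):
        ins_a, del_a = lines_touched(da, "insert"), lines_touched(da, "delete")
        ins_b, del_b = lines_touched(db, "insert"), lines_touched(db, "delete")
        refs_a, refs_b = ins_a | del_a, ins_b | del_b
        # Dependent: one names a line the other inserted, or inserts after a
        # line the other deleted (deactivated).
        if (any(owner(r) == a for r in refs_b)
                or any(owner(r) == b for r in refs_a)
                or ins_b & del_a or ins_a & del_b):
            findings.append(("dependent", a, b))
        # Conflicting: both do the same thing (here, delete) to the same line.
        if del_a & del_b:
            findings.append(("conflicting", a, b))
        # Order-dependent: both insert after the same card.
        if ins_a & ins_b:
            findings.append(("same-insertion-point", a, b))
    return findings

modsets = {
    "FIX1": [("delete", "DECK.10"), ("insert", "DECK.20", ["NEW CARD A"])],
    "FIX2": [("insert", "DECK.10", ["NEW CARD B"]),
             ("insert", "DECK.20", ["NEW CARD C"])],
    "FIX3": [("delete", "FIX1.1")],
}
for finding in classify(modsets):
    print(finding)
```

Here FIX2 inserts after a card that FIX1 deactivated, both insert after
DECK.20, and FIX3 names a card that FIX1 introduced, so the sketch reports
two dependencies and one shared insertion point.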

Obviously, conflicting modsets are the product of disorganization among
programmers (or a deranged mind if the same programmer produced both
conflicting modsets).  Dependent modsets were solved by producing a
GEN modset whose sole purpose in life was to be depended upon.

>Perhaps we are asking too much in an age when many language
>processors have, indeed, conditional preprocessors.  But using
>them in lieu of the feature at hand doesn't solve any problems of
>source control: it merely takes them out of the hands of the
>library management program and puts them squarely back onto the
>broad shoulders of the programmer, which is precisely where we
>are trying to AVOID putting such things.  If a library management
>program did not introduce horrendous delays in the
>edit-compile-debug cycle, I think it could be relied upon to take
>over this task from the language processor.  There are two sides
>to this coin, though: 1) Although it makes such a feature uniform
>across all language processors in use (notably those which have
>no such facility already), 2) it probably will not have exactly
>those features of the preprocessor in use that the programmer
>likes and uses.  Discussion is indicated.

The conditional "pre"processing available with various languages ranges
from none, to simple (include files, *CALL) but unconditional, to conditional
( IF ELSE ENDIF ), to include macro processing, and to include kitchen
sink processing.  Typically your overgrown assembler is of this last
variety ( remember COMPASS' DUP, IRP, and other pseudo ops?)

Another avenue for complicated preprocessing is to divorce such activity
from any language definition.  This yields things like m4, cpp (used by
languages other than C, and things known at cpp-time are unknown at
compile-time), and things like sed, awk, & lex.  I may be wandering a bit,
as preprocessors should be transparent to anything that isn't directives.

I.E. I'm not sure such general preprocessing belongs in a library tool,
except as warranted for include files that are in the library.

Another point worth considering is that selecting between using a
preprocessor and a source-control tool to provide feature code depends
upon which is more widely available among the recipients.  Back in the
CDC days, everyone who received NOS had MODIFY, thus NOS was distributed
in source-control form.

>Perhaps MODIFY/UPDATE caused more problems, that then needed
>solving, but I think not.  M/U really did attempt to solve
>problems, but not in the normal fashion: that of so automating a
>task that the programmer could not hurt himself.  It would be
>nice if automation of this sort of thing took the form: the easy
>way is also the good way.

So what, then, is the easy way?

>What, then, is that problem that needs solving?  Seems to me that
>*the problem* can be simply stated as: multiple, simultaneous,
>dependent and independent modifications of a database, with the
      ^^^^^^^^
Yes, I think we are heading into that realm.  The more generalized
we want things, the more general must the tools be.  For example, the
notion of OPLFILE, which is really a virtual OPL, abstracts into creating
a virtual database by logically combining two or more smaller (an uncommon
word in the database world) ones.

>special case that the database is a set of source files subject
>to an iterative edit-compile-debug cycle.

Peanuts compared to transaction processing on a large database.

>"Dependent and independent" means that two kinds of conflicts can
>exist among any given set of modifications.  Independent mods
>(to the same file) are those which do not overlap; that is they
>do not modify the same part of a file.  They are tantamount to
>modsets to separate files, with the line between files being
>rather arbitrary.

The difference here is that we want transactions that can be backed out,
and we want transactions that are independent of one another.
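
One way to picture a transaction that can be backed out is the
inactive-card idea mentioned earlier: nothing is ever physically removed,
so yanking a modset is only a different view of the same cards.  The
sketch below assumes a made-up Library class and "NAME.SEQ" card ids; it
is an illustration of the idea, not a description of any real tool.

```python
class Library:
    """Cards are never removed; a modset only adds or deactivates them."""

    def __init__(self, deck, lines):
        self.cards = [{"id": f"{deck}.{n}", "text": t, "by": deck, "killed": None}
                      for n, t in enumerate(lines, 1)]

    def apply(self, name, deletes=(), inserts=()):
        # Deactivate the named cards, remembering which modset did it.
        for card in self.cards:
            if card["id"] in deletes and card["killed"] is None:
                card["killed"] = name
        # Insert new cards after the named card, active or not.
        for n, (after_id, text) in enumerate(inserts, 1):
            pos = next(i for i, c in enumerate(self.cards) if c["id"] == after_id)
            self.cards.insert(pos + 1, {"id": f"{name}.{n}", "text": text,
                                        "by": name, "killed": None})

    def text(self, yanked=()):
        # A yanked modset's insertions vanish and its deletions are undone.
        return [c["text"] for c in self.cards
                if c["by"] not in yanked
                and (c["killed"] is None or c["killed"] in yanked)]

lib = Library("DECK", ["A", "B", "C"])
lib.apply("FIX1", deletes={"DECK.2"}, inserts=[("DECK.2", "B2")])
print(lib.text())                  # ['A', 'B2', 'C']
print(lib.text(yanked={"FIX1"}))   # ['A', 'B', 'C']
```

Because the deleted card stays in place as an inactive card, FIX1 can
still insert after it, and backing FIX1 out restores the original text
exactly.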

I'm not yet sure whether the database model of program library
maintenance needs any extensions.  Every database that I can think of
is a model of some physical phenomena, such as the existence of people
or instruments.  People are unique (though their names might not be; I
don't think we are that interested in this problem) and financial
instruments must be unique or we quickly find ourselves in financial
ruin.  We can also think of these instruments as transactions, and one
would not want (in the interest of correctness) the same transaction
applied multiple (or zero) times!

>>There was even some talk at UCC about a text editor that would
>>spit out a modset when you were done.
>
>Well, let's not get into the holy war over "your favorite text
>editor" here.

That's not the point. Any editor could be encased in a procedure file
or shell script which would run COMPARE or diff between the before and
after files.
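
A minimal sketch of such a wrapper, in Python rather than a procedure
file or shell script.  It emits a unified diff from the standard difflib
module, not a MODIFY modset, and the names edit_and_diff and with_editor
are invented for the example.

```python
import difflib
import os
import shutil
import subprocess
import tempfile

def edit_and_diff(path, run_editor):
    """Snapshot `path`, let `run_editor(path)` change it, return a unified diff."""
    with tempfile.TemporaryDirectory() as tmp:
        before = os.path.join(tmp, "before")
        shutil.copyfile(path, before)      # keep the "before" file
        run_editor(path)                   # any editor at all
        with open(before) as f:
            old = f.readlines()
    with open(path) as f:
        new = f.readlines()
    return "".join(difflib.unified_diff(old, new, "before", "after"))

def with_editor(path):
    # Assumption: use $EDITOR if it is set, otherwise fall back to vi.
    subprocess.run([os.environ.get("EDITOR", "vi"), path], check=True)

# Usage (interactive, so not run here):
#     print(edit_and_diff("prog.pas", with_editor))
```

The editor is passed in as a function precisely so that the choice of
editor stays out of the argument.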

>I think the edit-debug cycle followed by the
>process of making a modset to describe what you've just been
>through is valid.

What that does is solidify (freeze) a set of changes.  By making them
more permanent you're supposed to think about them more.  But the existence
of COMPARE along with text editors subverts that intent.

>>I like to make virtual OPLs (*OPLFILE)...
>
>Let's generalize that: This is really an artifact of the line
>between one "file" and the next.  It may be that some environment
>blurs this line (PLATO comes to mind, where one "lesson" had many
>different types of "blocks" that were related by being part of
>the same lesson.  Sort of a subdirectory.)  An implementation
>must not allow arbitrary lines between "files" to define the
>boundary of the "library."

Paradoxically, on Cybers you can concatenate files full of REL records
and get one file of REL records (and the loader will like it), while on
UNIX you can concatenate two tar files and get one file that tar won't
like.

>>The biggest difference between MODIFY and REVISE/UPDATE is
>>random access, and everything you can do with it.

Even in an environment without random access, the library maintenance
tool can still maintain an index, for use if the library is ever taken
to an environment that has random access, or if such an environment
should arrive where the library is.
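
For instance, one sequential pass is enough to build such an index.  The
sketch below assumes a made-up library layout in which every deck starts
with a "*DECK name" header line; build_index is a hypothetical name.

```python
import io

def build_index(stream):
    """Map deck name -> byte offset of its '*DECK name' header line."""
    index = {}
    offset = 0
    for raw in stream:                     # one sequential pass, no seeking
        if raw.startswith(b"*DECK "):
            index[raw[6:].strip().decode("ascii")] = offset
        offset += len(raw)
    return index

library = io.BytesIO(b"*DECK ALPHA\ncard 1\ncard 2\n*DECK BETA\ncard 3\n")
index = build_index(library)
print(index)                               # {'ALPHA': 0, 'BETA': 26}

# Where random access exists, the index pays off immediately.
library.seek(index["BETA"])
print(library.readline())                  # b'*DECK BETA\n'
```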

This is far too long.  I think a summary is in order, next time. The
more I think about this, the less I know.


>From tsec!nsc!nsc.NSC.COM!woolsey Sun Oct  4 18:17:12 1987
Received: by nsc.NSC.COM; Tue Oct  6 16:28:22 1987
Received: by pubs.nsc.com; Tue Oct  6 16:27:57 1987
Date: Tue, 6 Oct 87 16:27:57 PDT
From: Jeff Woolsey <woolsey@nsc.NSC.COM>
Message-Id: <8710062327.AA20770@pubs.nsc.com>
To: deg@kksys.UUCP, msr@kksys.UUCP, nsc!tsec!nsc-nca!jxh,
        pyramid!crayamid!cray!jlp, tsec!nsc-nca!aga
Subject: good seats

(I say, these subject lines are getting pretty stupid.  We all know
what we're talking about:  general ramblings on the subject of source
control.  When (if?) we start getting more specific, they'll make more
sense.)

>He (actually, SCCS probably did this by
>default) would set the file permissions so that only he would be able
>to write the SCCS versions of the files.  The project team members would
>be in a unix group that had permission to read the SCCS files.

Boy, were we spoiled with the PERMIT command.  There was a discussion
raging in netnews many moons ago about grafting Access Control Lists
onto Unix.  Not much became of it, except lament for the systems that
people had left that had had them.

I mean, the method above sounds like a kludge, and I still don't know
whether the SCCS tools need to be setgid to the project group, or
setuid to the project leader, or what.  And if so, how does one use
it on multiple projects?

>I beileve [sic] that RCS works similarly, except that the commands are
>"ci" (check in) [don't mistype your editor name or the file disappears!]

In my case, ci is an alias for vi.  So there.  cu used to conflict, too,
but tip, and later telnet, came along.

>In either case, if
>the SCCS/RCS files were not protected BY THE OPERATING SYSTEM, any person
>with write permission to the files could apply any change at any time.

CDC manuals suggested that you access the system OPL with COMMON(OPL).
The system allowed certain people to create or access these files; a
deadstart (or judicious CM editing from the console followed by a DTKM)
was required to delete them.

(Incidentally, how many access word bits were required to make an ECS
file COMMON?  ECS files were an ancestor of RAM-disk, though we did not
know it back then.)

>perhaps we will have to rely on
>the file protection facilities available in the host operating systems
>to ensure PL integrity.

I think that that is a tautology.

>>...non-monotonic program modifications...
>
>Another advantage to the M/U pardigm [sic] is the ability to have a "debug"
>or "test" modset that is not kept in the PL.

I had such a modset for USERS/DSDSIM.  It used 1DS and QFM to do
dangerous things during system time.

>The Pascal Group at UCC
>had a modset that introduced all sorts of good writelns into the compiler
>source.  We used this whenever there was a bizarre code generation error.
>All we had to do was apply the modset and recompile the compiler.

And I bet that the reason it was done that way instead of

CONST DEBUG = TRUE;

IF DEBUG THEN
WRITELN('DAG NODE = ', P^.LEFT^.DAG[PI].STYPE^.TOKEN: 6 OCT);

was code size in the production compiler.  I don't generally count on
HLL compilers to recognize this as a form of conditional compilation,
though an increasing number do.  Besides, this way you still have to
recompile.  Using DEBUG as a variable that is turned on by a switch is
probably a good idea in early stages of compiler building, but as the
bug density diminishes, this gets to be a drag.

>This sounds like
>a good issue to address; perhaps what we need is an ability to group
>modsets: when mod1 is yanked, also yank mod2, and when mod3 is yanked,
>also yank mod2.  Then again, maybe this is a load of rubbish.

So you want to teach the source control system about change dependency.
Sounds like a directed graph to me.

What if you yank mod3, but not mod1?  What happens to mod2?
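
To make the question concrete, here is a small sketch of the directed
graph, assuming the simplest policy: a modset goes as soon as anything it
depends on goes.  The names depends_on and yank_closure are invented, and
other policies are certainly possible.

```python
def yank_closure(depends_on, yanked):
    """Return every modset that must go when the modsets in `yanked` go."""
    out = set(yanked)
    changed = True
    while changed:                         # repeat until nothing new is added
        changed = False
        for mod, needs in depends_on.items():
            if mod not in out and out & set(needs):
                out.add(mod)
                changed = True
    return out

# mod2 depends on both mod1 and mod3, as in the grouping proposed above.
depends_on = {"mod1": [], "mod2": ["mod1", "mod3"], "mod3": []}
print(sorted(yank_closure(depends_on, {"mod3"})))   # ['mod2', 'mod3']
```

Under this policy, yanking mod3 alone takes mod2 with it and leaves mod1
in place, which may or may not be what the programmer wanted.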

>However, this is CFA...  stone knives and bearskins, remember?  At least we
>take offside dumps every now and then...  but source control?  Forget it...
      ^^^^^^^
A five-yard penalty.  Hey, this football metaphor sure is pervasive!

>we come as close to good source control as Bork comes to having a real beard.

Have you been reading SU-BBOARD???  Nothing like turning a technical
discussion into a political one.  It's happened to SDI, and
disarmament, and 55 mph, and ....

Does the fact that half this list's members are bearded have anything
to do with that remark?

>>what else would you call the other Revise feature-vestige but "Common
>>Deck?"
>
>Associate(d) Deck?

But it's still a DECK!  Next thing you know, we'll have portholes,
gangways, bulkheads...

>>Observe, however, that if the logic of Revise changes too drastically,
>>the source will not be recoverable.  (It also helps to take executable
>>versions of Revise on field trips.)  This is the bootstrapping problem.
>
>Gee, that's a good point.

I thought so.  Has this been rubbed far enough in yet?

>>Virtual OPLs (*OPLFILE)...
>
>Maybe we should reassess the need for this feature.  Why was it there
>(in Modify) in the first place?  As near as I can tell, the only reason
>for *OPLFILE was for an ABS assembly, where you had to *CALL all the
>subroutines you used into the source fed to COMPASS so there would be
>no external references.

You're forgetting your history.  The thing that was *CALLed is called a
COMMON deck.  It was intended for use in FORTRAN programs, to contain
the COMMON declarations for the program.

>Why did we have ABS assemblies?  Another good
>question.  Answers include being able to create multiple entry point
>programs, having the "good" loader tables (needed by Cyber loader if
>you ran under RFL,0),

... to make sure that no unneeded trash (like CMM) got hauled in at
link-time ...

>and being able to fix inter-program communication
>areas at specified addresses (did we REALLY do that?).

Sure.  ARGR, CDDR, FWPR, uh...  Gee, this stuff evaporates quickly
without an Instant.  Actually, those didn't need ABS; the loader just
started the text high enough that these constants could be used for IPC.

>I'm not sure if
>any of these reasons applies to what we're doing today.  We all use
>language processors that generate relocatable object files, and don't
>complain too much about having to link everything.

Oh yeah?  I miss having 1AJ call the loader when it didn't know _what_
it was looking at.

>OK, so we use
>Turbo Pascal, too.  But we've already complained about that.  Maybe they
>will fix these annoyances in 518 ... er ... version 4.0.

I doubt it.

>At any rate,
>do we really need to include subroutines with *CALL if we have pre-compiled
>versions of them in a library somewhere?  Just think, if we don't have to
>compile them every time, we can cut down the compile-test-edit cycle time.

It becomes compile-link-test-edit cycle time.  Linkers can be slow, too.

I'll give you one very good reason for having done ABS assemblies.
Confidence.  ABSolute confidence.  If I know that it was I who
assembled (or compiled) every last line of code in a program, I'll have
all of the information available about how it was created and what's in
there, so that there are no surprises.  All of the binaries
(relocatable, even) that go into making something are built from
sources that I can actually see.  There's even a little motivation for
disassemblers here, too, in case you must use a library without source
(e.g. Wreckage Mangler, or CTI).

"If you want something done right, do it yourself."

Still, that can be a lot of work, mitigated by the machine speed...

>There are many good things present in Revise; it's just that, along with
>all the useful features and familiarity (with the M/U-like interface),
>there's a lot of deadwood.  Jeff P. keeps telling me about the source
>maintenance utilities they use at Cray.

Perhaps he should tell _us_.

>I believe he said it was a collection
>of three or four C-language programs, one to three pages each (how long is
>Revise?).  He also said that most of the good features in Modify were
>implemented.  Jeff, I'd sure like to hear more about those programs.

I wonder just how portable are sources maintained with these tools.
(Not that they needed to be; they're proprietary, aren't they?)

>Other reasons for starting from scratch include having the freedom to
>come up with a package (which, from the sounds of things, is heading toward
>a DBMS...  there, now we've all said it) capable of performing the functions
>we deem necessary, without having to work around an existing inadequate
>framework, and being able to do with the resultant package as we please.

but not anytime soon.

>>The difference here is that we want transactions that can be backed out,
>>and we want transactions that are independent of one another.
>
>Well said, Jeff.  However, does this necessarily imply that a change to
>the database, consisting of multiple interrelated transactions, will result
 must
>in a consistent update?  What I'm talking about here is the "classic" OPL
>and JPL system used at UCC.

"Consistency is the mother of strange hobgoblins."

There's a parallel here between updating a database and managing a semaphore.

>>>I think the edit-debug cycle followed by the
>>>process of making a modset to describe what you've just been
>>>through is valid.
>>
>>What that does is solidify (freeze) a set of changes.  By making them
>>more permanent you're supposed to think about them more.  But the existence
>>of COMPARE along with text editors subverts that intent.
>
>I don't agree that Compare is responsible for sloppy work.  It does make
>modset creation easier, but aren't we all looking for ways to make source
>maintenance easier?

Not only easier, but more reliable.

>>The more I think about this, the less I know.

...until pretty soon I know nothing about everything.

>One thing is for certain: by the time we have a product, we'll all know
>what the pitfalls of source maintenance are.  That, itself, may be the
>most valuable thing each of us will carry away from this project.

Platitudes, eh?  OK.  Source maintenance is an attempt to cure the
too-many-cooks disease.  (Sometimes one is too many.)

>>P.S.  How come I haven't heard anything from anyone but JLW?  Am
>>I not getting through to faraway parts?  Can you HEAR ME IN THE
>>BACK, THERE?  (This is the second note I have sent to
>>library-people).
>
>I don't think Mark has logged in yet, and I'm terminally disorganized.
>Steve Oyanagi was mentioning a new Unicos release due out soon, so JLP
>is probably up to his ears in testing.

I think someone's implying that Jim and I have little to do...  Too much
free time on our hands...

We can continue this discussion in person (Jim can debrief me) this weekend.
--
LERMINATING PREVIOUS SESSION.  PQEASE RETRY.

Jeff Woolsey  National Semiconductor  woolsey@nsc.UUCP  woolsey@umn-cs.EDU


>From tsec!nsc!pyramid!crayamid!cray!jlp Mon Oct  5 17:17:55 1987
Received: by nsc.NSC.COM; Thu Oct  8 04:58:58 1987
Received: by pyramid.UUCP (5.51/OSx4.0b-870424)
id AA20821; Thu, 8 Oct 87 04:34:57 PDT
Received:  by vax2.cray.uucp (4.12/25-eef)
id AA19316; Wed, 7 Oct 87 22:23:08 cdt
Date: Wed, 7 Oct 87 22:23:08 cdt
From: cray!jlp (Jeff Pomeroy)
Message-Id: <8710080323.AA19316@vax2.cray.uucp>
To: crayamid!!pyramid!nsc!tsec!nsc-nca!aga,
        crayamid!!pyramid!nsc!tsec!nsc-nca!jxh, crayamid!!pyramid!nsc!woolsey,
        umn-cs!meccts!kksys!deg, umn-cs!meccts!kksys!msr
Subject: Is it too late to buy a round trip ticket to ...


Hi gang.  I have been reading the messages as they go by.

I think that i should start off by explaining what i have been up to for the
past two and a half years.  I work at Cray handling the source code of the
UNICOS operating system.  Cray has something called UPDATE that was based on
CDC's UPDATE and has run under COS for years and UNICOS for about a year.
UPDATE is written in CFT, and thus runs only on the Cray mainframes.
Cray also has something called 'scm', which is a locally designed and written
thingy that was intended to put UPDATE out of business.  Scm runs on the Crays,
the VAX and the Suns.  Some people within Cray also use SCCS on the VAX.
I have to deal with all of them.

We have had many battles over source control, which has shown one thing:
there is no winner.

So, what can i add to this discussion?

I really do not want to start complaining about my day-to-day problems, and
have you guys waste your time trying to think up some solution(s) which i am
sure have little or no impact on what we do around here.  But, i could relate
to you some of the things that have happened around here in the past, just as
examples, to show how these things work in the real world.

I will start with SCCS.  It is owned and operated by AT&T.  They only let
people use it.  You may think i am kidding.  We have both Suns running 4.2
and VAXes running System V, side by side, on the same network. The Sun SCCS
was based on UNIX version 7 (or PWB).  Guess what?  AT&T changed SCCS around
System III so they are not compatible.  OK, so some things did stay the same.
Quick, what are they?
(Strike One)

SCCS has magic cookies.  (do i need to explain this?)  A magic cookie in this
case means in-band data. (That should wake jim up)  SCCS will do different
things depending on the content of the file you are putting under source
control.  Another way to say this is: "what goes in does not come out".  (if
this gives you a bad feeling deep in your gut, good)  The SCCS magic cookies
are of the form percent, letter, percent.  Like '%D%', which i think changes
to the current date.  Nice, eh?  What about:

printf(" %D%D ", i, j);

(I think that %D is the double precision decimal format...)  This is silently
destroyed by SCCS.  This really happened in the Berkeley 4.2 release where
under the user contributed software, some guy had developed a program without
SCCS, but Berkeley passed it through SCCS before releasing it.

At Cray, some people use SCCS to hold their assembly code.   What about:

        IDENT   SIN
        ENTRY   SIN
        ENTRY   SIN%
        ENTRY   SIN%R%

The entry for SIN saves and restores all registers.  The entry for SIN%
does not save or restore; this is used by the compiler when it knows it
is 'safe' to bash registers.  And lastly SIN%R% means that operands are
passed in registers and not on the stack.  As i recall, SCCS changed %R%
to the current release level of that module.
(Strike Two)
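
The effect is easy to reproduce with a toy model of in-band keyword
expansion.  This is only a simulation in Python, not SCCS's code; the
KEYWORDS table and its values (a date and a release number) are made up
for illustration.

```python
# Simulated keyword table; the values are invented for this example.
KEYWORDS = {"%D%": "88/06/05", "%R%": "4"}

def expand_keywords(line, table=KEYWORDS):
    """Blindly replace every percent-letter-percent keyword found in the line."""
    for key, value in table.items():
        line = line.replace(key, value)
    return line

print(expand_keywords('printf(" %D%D ", i, j);'))   # printf(" 88/06/05D ", i, j);
print(expand_keywords("ENTRY SIN%R%"))              # ENTRY SIN4
print(expand_keywords("ENTRY SIN%"))                # ENTRY SIN%  (left alone)
```

Nothing in the replacement step can tell a keyword from ordinary file
content, which is the whole problem with in-band data.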


SCCS is made up of many commands.  One of them is called 'help'.  This command
seems to be there for the sole purpose of almost but not quite helping the
user.  Here i will use proof by example...

VAX running AT&T System V:
Type:
% help
Output:
%


Cray running UNICOS:
Type:
% help
Output:

        The help command is unfortunately not what it would seem to be.
It provides a limited amount of help and only for SCCS commands.  Its
arguments are the small alpha-numeric strings that accompany the error
messages from those commands.  If you are looking for more general help
in using the rest of the system try the on-line manual that is available
through the man command.  Try: man man


Sun running Sun UNIX 4.2 Release 3.2:
Type:
% help
Output:
msg number or comd name?
Type return

Output:
ERROR:  not found (he1)
Type:
% help
Output:
msg number or comd name?
Type:
he1
Output:
he1:
"not found"
No helpful information associated with your argument was found.
If you're sure you've got it right, do a "help stuck".
Type:
help stuck
Output:
stuck:
First, if you know the value of the system error number (errno),
you can either look up a description of it in INTRO(II), or execute
"help err<number>" (e.g., if the error number is 1 execute "help err1").
If you don't know the error number, or you don't understand what's going on -
Try the following, in order:
1. Make sure the answer isn't in the documentation.
2. Try to write(I) to anyone logged in as "adm".
3. Contact your PWB/UNIX counsellor.
4. File an MR (see System Administrator for instructions).

You call this help?  Just for a laugh, i tried 'help he2' and got:
he2:
"argument too long"
Dost thou jest? Wilst thou mock HELP??
Please limit your blitherings in arguments to less than
fifty (50) characters.

(Strike Three, you're out!)


Lastly, here is a quote from the Sun man page for the SCCS get command:
(Let's see if anyone in the home audience can figure out why)

BUGS
     If the effective user has write  permission  (either  expli-
     citly  or  implicitly)  in the directory containing the SCCS
     files, but the real user doesn't, only one file may be named
     when the -e option is used.


The list goes on and on...  I will never put any of my own programs under SCCS.

As deg would say...  "SCCS is a 'FINE' piece of software"

It is getting late.

------------------------------

End of Soft-Eng Digest
******************************