[comp.sys.apollo] DSEE Benefits / Converting to DSEE

bobr@zeus.TEK.COM (Robert Reed) (06/23/87)

In article <3596c60e.b0a1@apollo.uucp> nazgul@apollo.UUCP (Kee Hinckley) writes:

        DSEE manages binaries.  This means that if I'm working on one part of
    a system, and someone else is working on another, we can share binaries.
    DSEE keeps track of what binaries have been built using what compiler
    options.  The model description tells it which options are critical and
    which are not, so it knows whether or not you can use the result of
    someone else's build.  This is very nice on large projects when multiple
    people are working on different branches of a few files, but need to
    share the rest.  Furthermore you can tell DSEE whether or not the tools
    are critical, so if you get a new compiler, or translator or whatever
    tool you are using, DSEE will know whether it should recompile
    everything.

We could certainly reduce our disk requirements by sharing common binaries
and/or source files.  It's never been a problem for us, though.  Each of us
on the design team has a 1-2 MB directory containing private copies of all
the sources and objects, and our make dependency graph connects the sources
to a common RCS directory.  We have the option of including updated sources
for a particular make or not, which allows us to isolate ourselves from the
changes of others.  I would be concerned with a system that uses common
binaries which did not provide this isolation.  If a member of the team
releases a change which involves several binaries, one or more of which I am
also making changes to, is there some way to automatically prevent the
incorporation of the rest of the binaries involved in this change?  

For example, if a system consists of modules a, b and c, suppose someone
else has checked out b and c when I proceed to check out b.  I begin to make
and test changes to b, but suddenly a new version of b and c are released to
the world.  If these foreign changes to b and c are dependent on one
another, is DSEE flexible enough not to include the new object for c in my
subsequent builds of the system?  My guess is that since I have not checked
out c, I would get the common version, which may mean that my bind will fail
(if for example, my version of b had a vestigial external reference,
formerly resolved by c).

Management of binaries may be a very economic feature, but there are
pitfalls (which DSEE may handle adequately).  I know that private working
directories with Make/RCS can do the right thing.  I am curious whether DSEE
can do the same.  This is not meant as a challenge, just a plea for
information.

        DSEE can automatically perform actions when something happens (such as
    the replacement of an item).  So you can tell it to watch a file for you
    and tell you when it changes.

Change notification is certainly a useful service in a multi-person project.
We handle change notification by requiring a release note whenever any set
of files is returned to the common directory.  This gives an opportunity for
the developer to summarize the changes as well as give notice of them.  The
release note is mailed to each of the developers, to be examined at their
leisure.  I expect that DSEE uses a similar mechanism.  Though I rarely have
need for an explicit notice on the release of a particular file, I'll not
deny that it may have some use.  Providing a server for that purpose does
seem a bit of overkill (I am assuming that this function is handled by the
DSEE server), so I hope the server has more purpose than simple notification.

        DSEE supports threads.  These allow you to tell DSEE what versions of
    what files to use when you build, and what options to use on what files
    (either specifically or by wildcard).

    Threads allow you to do things like (as I just did now) tell DSEE that I
    want a different version of a header file and have it promptly know that
    as a result it has to recompile five other programs that depend on that
    file.

This appears to provide some features not available in Make.  I would have
to get a better description of them to ascertain that.  I've felt for a long
time that the option selection features of Make are underpowered.  Regarding
the specific example, I would probably never use such a "thread."  We keep
two (actually more) versions of our program around, one with no debug and
one with debug in every module.  This avoids the hassle of chasing down a
bug, only to find that the path leads through a module which does not have
the debug symbol tables loaded.

        DSEE supports multiple builders.  For instance I have a version of
    the ICON programming language under DSEE.  It consists of 4 libraries
    (insert files, translator, linker, runtime) under control of a single
    dsee model.  There are about 80 files in all.  I'll start off a rebuild
    of everything right now and see how long it takes.  This will take a
    little longer than a normal build, since it purges the old builds in the
    process, but it should still give you an idea of the advantage.

I don't understand what you mean by "multiple builders."  Is it something
different from multiple Make targets?  You say that this should give us an
idea of the advantage.  I'm having trouble understanding what it is that
this gives an advantage over.  Are you comparing multiple builders versus a
single builder under DSEE?  Or, are you comparing features of DSEE and Make?
-- 
Robert Reed, Tektronix CAE Systems Division, bobr@zeus.TEK

Erstad@HI-MULTICS.ARPA (06/24/87)

This is mostly a response to Robert Reed's comments regarding DSEE and
software development.  Two things should probably be made clear.  First,
I have no direct experience with RCS and will not make a direct
comparison based on what little I know.  Second, I am not trying to say
anyone "should" need any of the DSEE capability.  What I am saying is
that it has proved useful in our environment.  I'm not out to sell
anyone on anything.

Shared binaries save us LOTS of disk space.  I did a rough estimate
based on our common binaries and the number of people which would/might
be replicating them and at a minimum our disk space requirements would
increase by 100 MB.  Even though we have 7+ Gig, this is a lot to us.
This also saves a lot of build time when new releases are made.

There are several ways to isolate a development effort from unwanted
changes.  One is to use a command which will identify all the current
versions of source modules at the start of an effort.  In one's thread,
a statement of the form

   [dave_wants_these_versions]

will automatically use only those versions.  In general, I don't bother
since we tend not to make non-upward compatible changes (in theory at
least...)

Threads are very useful if, for reasons beyond your control, you are
forced to field multiple versions of code either temporarily or
permanently.  To date, we have never wanted to provide build control we
have not been able to do within DSEE.  The naming capability mentioned
earlier is one example of a useful capability of threads.  After
performing a build which is to be released, a command like

   NAME VERSION build_name  this_build_went_to_freds_group

allows me to readily rebuild the same build by using a meaningful name.
Additionally, I can use all of these items EXCEPT grab one module from a
different version, and so forth.  Multiple lines of descent are also
supported by the threads.

The multiple builder capability refers to having multiple nodes compile
in parallel.  This hasn't been released to me yet, so I can't comment on
the implementation, but for us this is a nice-to-have rather than a
critical item.  I can see where it is useful for sites with millions of
lines of code (like Apollo) rather than 100s of thousands of lines (like
us)

Dave Erstad

Honeywell SSED

nazgul@apollo.uucp (Kee Hinckley) (06/25/87)

In article <1907@zeus.TEK.COM> bobr@zeus.UUCP (Robert Reed) writes:
> to a common RCS directory.  We have the option of including updated sources
> for a particular make or not, which allows us to isolate ourselves from the
> changes of others.  I would be concerned with a system that uses common
> binaries which did not provide this isolation.  If a member of the team
> releases a change which involves several binaries, one or more of which I am
> also making changes to, is there some way to automatically prevent the
> incorporation of the rest of the binaries involved in this change?  
> 
> For example, if a system consists of modules a, b and c, suppose someone
> else has checked out b and c when I proceed to check out b.  I begin to make
> and test changes to b, but suddenly a new version of b and c are released to
> the world.  If these foreign changes to b and c are dependent on one
> another, is DSEE flexible enough not to include the new object for c in my
> subsequent builds of the system?  My guess is that since I have not checked
> out c, I would get the common version, which may mean that my bind will fail
> (if for example, my version of b had a vestigial external reference,
> formerly resolved by c).

Good question.  I'm having a little trouble explaining things today, so
bear with me, this is probably a bit longer than it need be.

Presumably you checked out B by making a branch off the mainline (since
the other person already had the mainline reserved).  (Alternatively of course
they may have made the branch and you have the mainline, it doesn't really
make any difference to this issue.)

In your thread you can specify any number of constraints.  The default is
simply to take any reserved files it finds in the current directory, and
then to take the most recent version of the common stuff.  If this is the
case then you won't run into any problems until someone changes something
in the mainline that you depended on.  In the case you describe if the new
versions of B and C are off on a branch then you don't have to worry, you'll
never see the binaries.  On the other hand if they are in the mainline (or
have been merged back into it) you could have problems.

The way to avoid this problem is to specify in your thread what version of
the common files you want to use.  I think this is probably best explained
by an example:

Here's what exists now (bracketed items are version numbers):
        a[1]    -> named SR9.2
        a[2]    -> named SR9.5

        b[1]    -> named SR9.2 and SR9.5

        c[1]
        c[2]    -> named SR9.2
        c[3]
        c[4]
        c[5]    -> named SR9.5

Now, P1 and P2 are both doing development based on SR9.5.  So your threads look
like this (we try and make sure that everyone working on a project uses the
same base thread so we know exactly what's going into the things we release):
    -for ?*.c -reserved -use -g
    -reserved
    -for ?*.c -use -O
    [sr9.5] -when_exists
    []

Read in order that says that:
o   All reserved C files are compiled with "-g".
o   Any other files that are reserved are also used.  (If we took out these
    two lines then we wouldn't use any reserved versions at all, useful when
    you want to make a pure release without including the stuff you are
    currently working on.)
o   All non-reserved files are compiled with "-O".
o   If there's a non-reserved file that has a version named "SR9.5", use that one.
o   Otherwise take the most recent version of it.

Okay, now P1 reserves file "b" and file "c" from the mainline to do some work.

P2 comes along and wants to make a change to file "b".  Since it's already
reserved P2 has to create a branch, let's call it "/P2".

Now P1 replaces file "b" and file "c" with changes that are incompatible to
what P2 is doing.  As it happens P1 names these versions, but it doesn't really
matter for our purposes.  Now the state looks like this (ignoring file "a").
        b[1]    -> named SR9.2 and SR9.5
        b[2]    -> named SR9.6_BL1

        c[1]
        c[2]    -> named SR9.2
        c[3]
        c[4]
        c[5]    -> named SR9.5
        c[6]    -> named SR9.6_BL1

If P2's thread hadn't had the line "[SR9.5] -when_exists" then P2 would now
be getting the most recent version of "c" and would be in trouble.  However
as it stands P2 will continue getting version 5 (SR9.5) of "c" and will still be
getting the reserved copy of "b".  So everyone's happy.
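
As a rough sketch of the first-match rule at work in this scenario, the
following models how a thread might pick a version of one element.  It is an
illustration only: the function name, the data layout and the case-insensitive
name matching are assumptions, and the "-for ... -use" option lines are left
out.

```python
# Hypothetical model of first-match thread evaluation for one element.
# Names and data layout are illustrative, not DSEE's own.

def resolve(versions, names, reserved, thread):
    """Return what the thread selects for one element.

    versions: mainline version numbers, oldest first
    names:    version name (lowercase) -> version number
    reserved: True if the user holds a reserved copy of the element
    thread:   ordered rules; the first rule that applies wins
    """
    for rule in thread:
        if rule == "-reserved":
            if reserved:
                return "reserved copy"
        elif rule == "[]":
            return versions[-1]                      # most recent version
        elif rule.startswith("["):
            name = rule[1:rule.index("]")].lower()   # "[sr9.5] -when_exists"
            if name in names:                        # skip rule if name is absent
                return names[name]
    return None


# State of file "c" after P1's replace, as listed above.
c_versions = [1, 2, 3, 4, 5, 6]
c_names = {"sr9.2": 2, "sr9.5": 5, "sr9.6_bl1": 6}

pinned = ["-reserved", "[sr9.5] -when_exists", "[]"]
default = ["-reserved", "[]"]

print(resolve(c_versions, c_names, False, pinned))    # "c" with the SR9.5 line
print(resolve(c_versions, c_names, False, default))   # "c" without it
```

With the "[sr9.5] -when_exists" line the sketch keeps selecting version 5 of
"c"; without it, the same call falls through to "[]" and picks up version 6.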

Now let's take the scenario a little further.  P2 decides to check "b" back
in, but not to merge it into the mainline yet since it still needs some testing.
Now the state of "b" looks something like:
        b[1]    -> named SR9.2 and SR9.5
            b/P2
        b[2]    -> named SR9.6_BL1

Now if P2 wants to specify that in P2's compiles that branch is always used,
the thread has to be changed to look like:
    -for ?*.c -reserved -use -g
    -reserved
    -for ?*.c -use -O
    /P2 -when_exists
    [sr9.5] -when_exists
    []

Note that because of the order the "/P2" branch will be used whenever there
might be a conflict between "/P2" and "[sr9.5]".  And all throughout this
there will still be no confusion between the binaries.  In fact, since
DSEE knows what compile options were used, it won't even confuse binaries
that have different compile options, so if you specified that all non-reserved
binaries were also compiled with "-g" you would only get those common binaries
which had been compiled with that option.

> Change notification is certainly a useful service in a multi-person project.
...
> deny that it may have some use.  Providing a server for that purpose does
> seem a bit of overkill (I am assuming that this function is handled by the
> DSEE server), so I hope the server has more purpose than simple notification.

I had to run off and ask about this one.  Basically there are two kinds of
monitors.  One executes arbitrary commands when you perform a specified action 
(e.g. send mail to everyone telling them that a file has been checked in).  The
other adds "tasks" to a "task-list".  I'm not going to go into any detail on
that, since I've never used it.  The basic idea is that you can set up a task
list of things to do, and the act of checking a file in or some similar
action can be made to trigger a task which will remind you of what you have
to do next.  As it turns out neither of these use a server, both are executed
by DSEE at the time you perform the action.  (You can also specify whether
a monitor is to be executed for everyone, just for you, or just for everyone
else.  And obviously there are protections concerning *who* can create monitors.)

> I don't understand what you mean by "multiple builders."  Is it something

Sorry, I should have explained the terminology.  

A "builder" is the machine ("node") on which the compile ("build") actually takes
place.  What DSEE allows you to do is give it a list of up to 20 different
machines which can be used to build things.  Since DSEE has a clear idea of
who depends on what, it knows which things can be built in parallel and which
must wait for the results of some other build.  It then spawns off each 
task to a different machine.  Two things make this even more useful.  

First of all it checks the load on the machine and won't use it unless it's 
relatively idle.  That way you seldom notice that someone is using your machine 
for a build.  Occasionally you will notice if you start something while a build 
is underway, but DSEE checks for activity before each individual build, so once 
the current task is done it won't come back until your node is idle again.

Secondly it allows you to specify a reference pathname which the "/" directory
will resolve to.  In other words, if my model makes references to tools in
"/bin" and sources in "/usr/nazgul/src" I don't have to worry about whether
the tools are the same on other machines, or whether the pathname (or link)
"/usr/nazgul/src" exists on the remote machine.  As far as the remote process
is concerned the directory structure looks like my machine.  That means I
don't have to specify absolute network pathnames anywhere and then worry about
what happens if I change the name of my machine or move the sources somewhere
else.

All of this is done automatically, so long as you somewhere tell DSEE what
machines to use.  For instance, here's the line from my dsee startup file.

set builder -reference //morgul //anarres //morgul //faerie //caruso //northpeak //marvin

That tells it that the reference machine (to which "/" will refer) is mine
("//morgul") and that when doing builds it should look for CPU time on "//anarres",
then my node, and then the others, so on down the line.  (Actually, looking at
this I just realized that I only have 6 builders listed, I think I'm going to
go hunt down some more fast machines. :-)
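
The scheduling idea described above (build in parallel whatever has no
unfinished prerequisites, and only hand work to nodes that are idle) can be
sketched roughly as follows.  This is not how DSEE is implemented; the
function names, the target names "a.o", "b.o" and "prog", and the idle test
are all made up for illustration.

```python
# Hypothetical sketch of dependency-aware parallel build scheduling.

def build_waves(deps):
    """Group targets into waves; everything in one wave can build in parallel.

    deps maps each target to the set of targets it must wait for.
    """
    remaining = {target: set(needs) for target, needs in deps.items()}
    done, waves = set(), []
    while remaining:
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle")
        waves.append(ready)
        done.update(ready)
        for target in ready:
            del remaining[target]
    return waves


def assign(wave, builders, is_idle):
    """Pair ready targets with idle builders, in the order the builders are listed.

    Targets left without a builder simply wait for a later pass.
    """
    idle = [node for node in builders if is_idle(node)]
    return dict(zip(wave, idle))


# Made-up example: two objects that can compile in parallel, then a bind step.
deps = {"a.o": set(), "b.o": set(), "prog": {"a.o", "b.o"}}
builders = ["//anarres", "//morgul", "//faerie", "//caruso", "//northpeak", "//marvin"]

waves = build_waves(deps)
print(waves)
# Pretend //morgul is busy, so it is skipped for this pass.
print(assign(waves[0], builders, lambda node: node != "//morgul"))
```

Checking idleness again before every assignment, as assign() does, mirrors
the point that a node is left alone once its owner starts using it.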

....

And THAT is more than enough about DSEE for now.  Hope it helps.

                                                    Kee Hinckley
                                                    User Environment
-- 
UUCP: {mit-erl,yale,uw-beaver}!apollo!nazgul  ARPA: apollo!nazgul@eddie.mit.edu
I'm not sure which upsets me more; that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate 
everyone else's.

mishkin@apollo.uucp (Nathaniel Mishkin) (06/26/87)

I think Kee's message did a fine job of explaining the subtleties of
DSEE.  One important point about sources and binaries is getting lost
in the noise though:  DSEE users always deal with sources and source
specifications, not binaries.  DSEE completely manages binaries (and
assuming the user hasn't lied about the relationships among the pieces
of his system, DSEE doesn't ever get confused about what binaries
correspond to what sources and options.)  The DSEE user does not say
"use these binaries", s/he says "I want the thing that results from using
such-and-such versions of sources".  "such-and-such versions of sources"
is a function of the "thread" (described in Kee's message) and the "system
model", a block-structured description of the relationships among sources
and results of compilations.  A sample model fragment:

    aggregate ls =
        depends_result
            element ls.c =
                depends_source
                    sort.h;
                end;
        
            element sort.c =
                depends_source
                    sort.h;
                end;
        end;

This says that there's an aggregate (read "program") "ls", the building
of which depends on the result of building "ls.c" and "sort.c", and that
both "ls.c" and "sort.c" depend on the source (i.e. textually) of "sort.h".
(Note: there's more to the model than I've shown.) The model says nothing
about particular versions or binaries.

When a user asks to build "ls", the model plus the user's current thread
yields a tree-structured description (analogous to the model tree
structure, with the addition of particular version numbers for every
source element) of what the user wants.  DSEE then searches its "binary
pool" (a directory that the user never looks at where DSEE keeps binaries;
binaries are tagged with the tree-structured description that describes what
source versions were used to build the binaries).  If it finds matching
entries in the pool, it just uses them; if it doesn't, it creates them
(by running the correct "translate" rule from the model).
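
A minimal sketch of that lookup, assuming the pool behaves like a table keyed
by the resolved description.  The class, the function, the version numbers and
the "-O" option are hypothetical and stand in for whatever DSEE really stores.

```python
# Hypothetical sketch of a binary pool keyed by the resolved build description.

def resolved_key(target, model, versions, options):
    """Describe a target by its version, its options, and (recursively)
    the descriptions of everything it depends on."""
    parts = tuple(resolved_key(dep, model, versions, options)
                  for dep in model.get(target, ()))
    return (target, versions.get(target), options.get(target, ""), parts)


class BinaryPool:
    def __init__(self, translate):
        self._pool = {}
        self._translate = translate   # stands in for the model's "translate" rule
        self.builds = 0               # how many times we actually had to build

    def get(self, key):
        if key not in self._pool:     # no matching entry: create it
            self._pool[key] = self._translate(key)
            self.builds += 1
        return self._pool[key]        # matching entry: just use it


# Dependencies from the model fragment above; version numbers are invented.
model = {"ls": ["ls.c", "sort.c"], "ls.c": ["sort.h"], "sort.c": ["sort.h"]}
versions = {"ls.c": 3, "sort.c": 2, "sort.h": 1}
options = {"ls.c": "-O", "sort.c": "-O"}

pool = BinaryPool(lambda key: "binary for " + key[0])
pool.get(resolved_key("ls.c", model, versions, options))
pool.get(resolved_key("ls.c", model, versions, options))   # reused, no rebuild

newer = dict(versions, **{"sort.h": 2})                    # a different sort.h
pool.get(resolved_key("ls.c", model, newer, options))      # new key, so it builds
print(pool.builds)
```

Because the header's version is part of the key, selecting a different
"sort.h" yields a different key for "ls.c", and so a separate binary.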
-- 
                    -- Nat Mishkin
                       Apollo Computer Inc.
                       Chelmsford, MA
                       {wanginst,yale,mit-eddie}!apollo!mishkin

bobr@zeus.TEK.COM (Robert Reed) (06/30/87)

Thanks for the exposition about DSEE, Kee.  It's clear that DSEE can handle
the source parallax problem I previously described.  I still like the system
we have with Make and RCS better.  We buy convenience, and pay for it with
disk space.

From your explanation, I see that the difference between the two systems is
that where we have an implicit selection of version by choosing which
source/object pair is sitting in our individual development directory, the
canonical approach in DSEE is to maintain an explicit enumeration of all the
versions which comprise a particular "build" of the system.  Where we just
copy a new source file into our development directory, the DSEE approach is
to edit the "thread" to redefine the configuration through explicitly named
releases of system components.

Not being a DSEE user, I can only speculate on how well this works, but if
you have a complicated system with many (say a hundred) source files and
many (say more than 5) developers, either the threads must get quite
complicated or a lot of time is spent assigning new version numbers to the
tops of trunks of all the components.  Do you find either of these fit your
operating environment, and do either of them pose a problem?  Do you have
any mechanism which automatically updates your thread to adjust the build
for:

    1) Sealing your particular set of versions to prevent such side effects
	as have been described,

    2) Incorporating changes made by other developers, performing source
	level merges where necessary, while preserving your changes as
	nonreleased reservations to prevent possibly buggy development code
	from reaching other developers,

    3) Release your private changes to the world, making your next build be
	strictly "top of trunk".

I suspect such a utility should be possible, though it would probably
require a lot of global version number assignment to manage the bookkeeping.
All these functions are of course quite easy using our existing system.

We do pay other penalties for our approach.  Since we have separate copies
of binaries, we get a lot of duplication in compilation.  Include file
changes can propagate a lot of identical and unnecessary build steps.  Most
people have cheater scripts that do a simple compile and link for making
small changes to the system, and periodically these scripts must be updated
to reflect the new set of components and compile time options required to
build the system.

    Basically there are two kinds of monitors.  One executes arbitrary
    commands when you perform a specified action (e.g. send mail to everyone
    telling them that a file has been checked in).  The other adds "tasks"
    to a "task-list".  I'm not going to go into any detail on that, since
    I've never used it.  The basic idea is that you can set up a task list
    of things to do, and the act of checking a file in or some similar
    action can be made to trigger a task which will remind you of what you
    have to do next.  As it turns out neither of these use a server, both
    are executed by DSEE at the time you perform the action.

Okay, so the server is not used for notification.  I've heard that DSEE does
have a server.  Why?  What functions DOES it perform?

    A "builder" is the machine ("node") on which the compile ("build")
    actually takes place.  What DSEE allows you to do is give it a list of
    up to 20 different machines which can be used to build things.  Since
    DSEE has a clear idea of who depends on what, it knows which things can
    be built in parallel and which must wait for the results of some other
    build.  It then spawns off each task to a different machine.

I've been intrigued with this sort of facility since I first heard it
described a year or more ago.  A friend who works at Sequent described their
"parallel Make" which takes (mostly) a standard makefile and performs the
same function.  Unfortunately, we have no such facility here.  (As Apollo
migrates towards a more UNIX-like SR10, it would be great if they built a
similar function into the Make they provide :-)
-- 
Robert Reed, Tektronix CAE Systems Division, bobr@zeus.TEK

joelm@apollo.uucp (Joel Margolese) (07/01/87)

In article <1938@zeus.TEK.COM> bobr@zeus.UUCP (Robert Reed) writes:
>    From your explanation, I see that the difference between the two systems is
>    that where we have an implicit selection of version by choosing which
>    source/object pair is sitting in our individual development directory, the
>    canonical approach in DSEE is to maintain an explicit enumeration of all the
>    versions which comprise a particular "build" of the system.  Where we just
>    copy a new source file into our development directory, the DSEE approach is
>    to edit the "thread" to redefine the configuration through explicitly named
>    releases of system components.

You only need to edit a thread when you want to rebuild a particular version
of something or select (and lock) the versions that you use.  The "default"
thread is:
 -reserved
 []
Which means, any element that I have reserved or the most recent version of
all other elements.  Therefore, whenever any source files are replaced,
(or "put back into your development directory") they will be picked up.  Since
that behaviour is what most people (around here anyways) seem to want most
of the time, it just works.

If you are working on a specific project, the project may have a thread
such as:
-reserved
/sr9.5 -when_exists
[sr9.5] -when_exists
[sr9.0]

Which says: give me the latest version of anything on a branch named /sr9.5,
or give me a version named [sr9.5], which is a fixed name, or just give
me the version that was used in sr9.0.  The selection is done by first match.

i.e.
       element foo[1]       bar[1]                   fred[1]
               foo[2]       bar[2]->bar/sr9.5[1]     fred[2][sr9.5]
               foo[3][sr9.0]          bar/sr9.5[2]
                            bar[3]       

For element foo we select version 3, for bar we select bar/sr9.5[2]
and for fred we select version 2.
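
The selection above can be reproduced with a small sketch that also handles
branch rules.  The dictionary layout and the function name are assumptions
made for the illustration, and so is returning None when a final rule such as
"[sr9.0]" names a version the element does not have.

```python
# Hypothetical sketch: first-match selection with branch rules.

def select(element, thread):
    for rule in thread:
        if rule == "-reserved":
            if element.get("reserved"):
                return "reserved"
        elif rule.startswith("/"):
            branch = rule.split()[0]                 # e.g. "/sr9.5"
            tips = element.get("branches", {})
            if branch in tips:
                return f"{branch}[{tips[branch]}]"   # latest version on the branch
        elif rule.startswith("["):
            name = rule.split()[0].strip("[]")       # "" for the bare "[]" rule
            if name == "":
                return f"[{element['latest']}]"
            if name in element.get("names", {}):
                return f"[{element['names'][name]}]"
    return None


# The three elements from the diagram above.
foo = {"latest": 3, "names": {"sr9.0": 3}}
bar = {"latest": 3, "branches": {"/sr9.5": 2}}
fred = {"latest": 2, "names": {"sr9.5": 2}}

project_thread = ["-reserved", "/sr9.5 -when_exists",
                  "[sr9.5] -when_exists", "[sr9.0]"]

for label, element in (("foo", foo), ("bar", bar), ("fred", fred)):
    print(label, select(element, project_thread))
```

The sketch arrives at the same picks as the text: version 3 of foo, version 2
on the /sr9.5 branch of bar, and version 2 of fred.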

Note that I never change my thread again for the life of the project, nor
do I have to know what versions change or have to maintain separate
development directories.                

>Not being a DSEE user, I can only speculate on how well this works, but if
>you have a complicated system with many (say a hundred) source files and
>many (say more than 5) developers, either the threads must get quite
>
>    1) Sealing your particular set of versions to prevent such side effects
>	as have been described,

We periodically make baselevels.  To do this an administrator will
take a snapshot by building the object (assuming that it is tested, stable,
whatever) and creating a release.  The release contains a description of
what versions were used to build the object and all of its subcomponents.
(It can contain more, but that is all we bother to save.)  Note that
we do NOT snapshot binaries.  All you need is the description of how
to build anything and the binaries are regenerable if they happen to get
flushed out of the pools.  If the baselevel is used as a reference point,
the binaries stick around, if not, say as the baselevel ages and becomes
obsolete, the binaries get flushed out of the pool automatically.  Therefore
the cost of a baselevel (release) is very low.  (a few disk blocks.)
To build off of such a baselevel the thread just needs to say:
  -reserved     (if you want this)
   foo!/release/pathname -versions

>    2) Incorporating changes made by other developers, performing source
>	level merges where necessary, while preserving your changes as
>	nonreleased reservations to prevent possibly buggy development code
>	from reaching other developers,
>
DSEE also provides quite a good 3-way merge system (vastly improved, i.e. now
usable, at version 3.0) that does source level merges.  It does not work that
well if the changes are *really* extensive, but for routine changes, it
generally does the whole merge automatically.  (And usually gets it right!)
Because of the history files, DSEE can see the common ancestor and the two
changed files and figure out which file changed what.  It only asks for your
help if both versions changed the same code, then you get to pick which version
you like better, or edit it yourself.  It also keeps track of what has
been merged, so that if you change 30 modules you can say something like:
 show elements -missing -merge -with foobar
The changes are not replaced until the developer issues a replace command.  This
is useful for several developers who want to stay in sync.
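
The decision rule behind such a merge (take a change that only one side
made, and ask for help only when both sides changed the same code) can be
sketched on a deliberately simplified model in which each file is a mapping
from a chunk label to its text.  A real merge works from line differences
against the common ancestor; the chunk labels and function name here are
assumptions.

```python
# Hypothetical sketch of the three-way merge decision rule on labelled chunks.

def merge3(ancestor, mine, theirs):
    """Return (merged, conflicts) given the common ancestor and two descendants."""
    merged, conflicts = {}, []
    for key in sorted(set(ancestor) | set(mine) | set(theirs)):
        base, a, b = ancestor.get(key), mine.get(key), theirs.get(key)
        if a == b:               # both sides agree (or neither touched it)
            result = a
        elif a == base:          # only "theirs" changed this chunk
            result = b
        elif b == base:          # only "mine" changed this chunk
            result = a
        else:                    # both changed it differently: a person decides
            conflicts.append(key)
            continue
        if result is not None:   # None means the chunk was deleted
            merged[key] = result
    return merged, conflicts


ancestor = {"init": "v1", "parse": "v1", "emit": "v1"}
mine = {"init": "v2", "parse": "v1", "emit": "v2"}
theirs = {"init": "v1", "parse": "v3", "emit": "v4"}

print(merge3(ancestor, mine, theirs))
```

Here "init" and "parse" merge automatically, while "emit" is reported as a
conflict because both descendants changed it in different ways.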

>    3) Release your private changes to the world, making your next build be
>	strictly "top of trunk".
>                           
Answered above, I think.

>All these functions are of course quite easy using our existing system.

Our experience here (which does not necessarily apply everywhere) was that it was
no easier for just doing mainline development under dsee, but once you
started doing anything in parallel, or wanted to experiment with versions
DSEE became invaluable.  It is the old story: You don't miss it until you've
tried it and lost it.

>
>We do pay other penalties for our approach.  Since we have separate copies
>of binaries, we get a lot of duplication in compilation.  Include file
>changes can propagate a lot of identical and unnecessary build steps.  Most
>people have cheater scripts that do a simple compile and link for making
>small changes to the system, and periodically these scripts must be updated
>to reflect the new set of components and compile time options required to
>build the system.

This sounds like much more work than we do to keep track of the correct thread,
and a lot more error prone.  DSEE also gives us the ability to look at a
build in a DSEE pool and determine exactly what versions of each file were
used so that we can have a high degree of confidence in what we've built.
DSEE also allows you to declare "equivalences", which means saying: I've just
changed foo.ins.pas, and there are 72 modules that depend on it.  But I
know that only 3 need to be rebuilt.  For all the others, assume
that foo.ins.pas[3] is equivalent to foo.ins.pas[2].  Specifying that is
not very hard, DSEE will prompt you for everything.
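
One way to picture an equivalence is as a per-module note that a newer
version of an include file may be treated as an older one when deciding
whether to rebuild.  The sketch below assumes exactly that; the class, the
function and the module names "mod_a" and "mod_b" are hypothetical.

```python
# Hypothetical sketch of per-module "equivalences" between include file versions.

class Equivalences:
    def __init__(self):
        self._same_as = {}   # (module, header, newer version) -> older version

    def declare(self, module, header, newer, older):
        if older >= newer:
            raise ValueError("an equivalence must point back to an older version")
        self._same_as[(module, header, newer)] = older

    def effective(self, module, header, version):
        # Follow declarations back to the oldest equivalent version.
        while (module, header, version) in self._same_as:
            version = self._same_as[(module, header, version)]
        return version


def needs_rebuild(module, header, built_with, current, eq):
    """True if the module's existing binary cannot stand in for the current header."""
    return (eq.effective(module, header, current)
            != eq.effective(module, header, built_with))


eq = Equivalences()
# For mod_a, foo.ins.pas[3] is declared equivalent to foo.ins.pas[2].
eq.declare("mod_a", "foo.ins.pas", 3, 2)

print(needs_rebuild("mod_a", "foo.ins.pas", 2, 3, eq))   # covered by the declaration
print(needs_rebuild("mod_b", "foo.ins.pas", 2, 3, eq))   # no declaration for mod_b
```

Only modules without a declaration (mod_b here) would be rebuilt against the
new version of the include file.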

>
>    Basically there are two kinds of monitors.  One executes arbitrary
>    commands when you perform a specified action (e.g. send mail to everyone
>    telling them that a file has been checked in).  The other adds "tasks"
>    to a "task-list".  I'm not going to go into any detail on that, since
>    have to do next.  As it turns out neither of these use a server, both
>    are executed by DSEE at the time you perform the action.
>
>Okay, so the server is not used for notification.  I've heard that DSEE does
>have a server.  Why?  What functions DOES it perform?

DSEE does not (in general) have a server.  There is a d3m server which runs
on nodes that have DSEE libraries, d3m is the database manager.  DSEE also uses
the alarm_server (which runs on most nodes anyways) as a way of sending some
notifications.  (One of the possible monitor actions is to send an alarm to
someone that some element has been modified or whatever.)  DSEE itself can run
as a server and will then accept input via mailboxes.  This of course works
across the network the same as on individual nodes.  This allows you to
design a custom interface or a set of shell level commands or whatever
to communicate with DSEE.  Most DSEE functions are also available via a
released (*some* things get released! :-)) set of calls to the dseelib
global library, so you could write your own server if you really want to.

I am not a DSEE developer, just a heavy user who likes to tell people about
neato things they can do by moving up to newer tools.  (Ok, so I'm biased,
why not?)

                                                      Joel
-- 
Joel Margolese          UUCP:      ...{attunix,uw-beaver,decvax!wanginst}!apollo!joelm
Apollo Computer         ARPA:     apollo!joelm@eddie.mit.edu
                        INTERNET: joelm@apollo.com

bobr@zeus.TEK.COM (Robert Reed) (07/02/87)

In article <35cd8d38.86ca@apollo.uucp> joelm@apollo.UUCP (Joel Margolese) writes:

    > versions which comprise a particular "build" of the system.  Where we just
    > copy a new source file into our development directory, the DSEE approach is
    > to edit the "thread" to redefine the configuration through explicitly named
    > releases of system components.

    You only need to edit a thread when you want to rebuild a particular
    version of something or select (and lock) the versions that you use.
    The "default" thread is:
     -reserved
     []

    Which means, any element that I have reserved or the most recent version
    of all other elements.  Therefore, whenever any source files are
    replaced, (or "put back into your development directory") they will be
    picked up.  Since that behaviour is what most people (around here
    anyways) seem to want most of the time, it just works.

This simplified approach of course breaks the scheme that Kee so carefully
outlined to isolate simultaneous source modifiers from partial release side
effects, and so it doesn't answer the question I was posing.


    > 1) Sealing your particular set of versions to prevent such side effects
    >    as have been described,

    We periodically make baselevels.  To do this an administrator will
    take a snapshot by building the object (assuming that it is tested,
    stable whatever) and creating a release.

Perhaps I failed to explain myself adequately.  Perhaps you failed to read
the discourse between Kee and myself.  In any case, I don't believe you
understand my questions.  The members of our engineering group have
interchangeable responsibilities for code and so operate in an environment
where it is desirable to allow multiple simultaneous reservations for
sources.  One of the side effects of such an environment is a sort of update
parallax error, where the simple priority scheme you outline for determining
which version to use just won't work.

    > 2) Incorporating changes made by other developers, performing source
    >	level merges where necessary, while preserving your changes as
    >	nonreleased reservations to prevent possibly buggy development code
    >	from reaching other developers,

    DSEE also provides quite a good 3 way merge system (vastly improved,
    i.e. now usable, at version 3.0) that does source level merges.  It does
    not work that well if the changes are *really* extensive, but for
    routine changes, it generally does the whole merge automatically.  (And
    usually gets it right!) Because of the history files, DSEE can see the
    common ancestor and the two changed files and figure out which file
    changed what.  It only asks for your help if both versions changed the
    same code, then you get to pick which version you like better, or edit
    it yourself.  It also keeps track of what has been merged, so that if
    you change 30 modules you can say something like:

     show elements -missing -merge -with foobar

    The changes are not replaced until the developer issues a replace
    command.  This is useful for several developers who want to stay in
    sync.
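
To make the three-way idea above concrete, here is a minimal Python sketch.
It is not DSEE's algorithm: it assumes all three versions have the same
number of lines (real merge tools align lines with a diff first), and the
name merge3 is made up for illustration.

```python
def merge3(base, mine, theirs):
    """Line-wise three-way merge for equal-length versions (illustrative only).

    Returns (merged_lines, conflict_indexes).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles equal-length versions")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both sides agree (same change, or no change)
            merged.append(m)
        elif m == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "mine" changed this line
            merged.append(m)
        else:             # both changed it differently: a human must choose
            merged.append(m)
            conflicts.append(i)
    return merged, conflicts
```

The common ancestor is what lets the merge tell "they changed it" from "I
changed it"; only the last case needs a person to pick a side.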

These seem on par with the facilities available through RCS.

    > We do pay other penalties for our approach. ...

    This sounds like much more work than we do to keep track of the correct
    thread, and a lot more error prone.  DSEE also gives us the ability to
    look at a build in a DSEE pool and determine exactly what versions of
    each file were used so that we can have a high degree of confidence in
    what we've built.  DSEE also allows you to declare "equivalences", which
    means saying: I've just changed foo.ins.pas, and there are 72 modules
    that depend on it.  But I know that only 3 need to be rebuilt.  For
    all the others, assume that foo.ins.pas[3] is equivalent to
    foo.ins.pas[2].  Specifying that is not very hard; DSEE will prompt you
    for everything.
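
One way to picture the equivalence idea is as an extra test in the rebuild
decision.  The Python sketch below is an assumption about the general shape,
not DSEE's data model: equivalences are recorded per module as
(module, dependency, old_version, new_version), and needs_rebuild is a
hypothetical name.

```python
def needs_rebuild(module, built_with, wanted, equivalences):
    """Decide whether an existing derived object can be reused.

    built_with / wanted map dependency name -> version number.
    equivalences is a set of (module, dependency, old_version, new_version)
    tuples declared by the developer (hypothetical representation).
    """
    for dep, want in wanted.items():
        have = built_with.get(dep)
        if have == want:
            continue                      # exactly the version asked for
        if (module, dep, have, want) in equivalences:
            continue                      # declared "close enough" for this module
        return True                       # a real difference: rebuild
    return False


# a.pas was declared unaffected by the change from foo.ins.pas[2] to [3]
declared = {("a.pas", "foo.ins.pas", 2, 3)}
print(needs_rebuild("a.pas", {"foo.ins.pas": 2}, {"foo.ins.pas": 3}, declared))
print(needs_rebuild("b.pas", {"foo.ins.pas": 2}, {"foo.ins.pas": 3}, declared))
```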

No, there is very little work involved since the list of linker modules
changes very infrequently these days, and when it does, there is a file
which always has the proper list, so editing involves deleting the old list
and incorporating a new one.  I do that once every couple of months.  The
equivalency notion of DSEE sounds intriguing.  Do such equivalencies ever
get obsoleted?  Are they easy to keep track of?  Are they based on analysis
or hunch?

This has been a great discussion.  Unfortunately (or fortunately), I'm going
on vacation for a couple of weeks, and so will have to drop out now.  See
you when I get back.
-- 
Robert Reed, Tektronix CAE Systems Division, bobr@zeus.TEK

billj@zaphod.UUCP (07/04/87)

Although I've read the published papers on DSEE, and seen it demoed in
toy situations, I still have some uncertainty about the system's
capabilities.  Though the questions below deal with more demanding
situations, they are not intended as potshots, just requests for
information on DSEE's current practical limits.  All the situations
below, BTW, are everyday real-life concerns in our own environment.

- is there support for maintaining, as Bob Reed mentioned, parallel
  systems of derived objects, say one with debugging turned on and one
  not?  Or will most of the binaries have to be recreated on each build
  as the pool cache capacity is reached?

- in the above environment, what support is there for variant
  implementations of the same module?  Must the elements be named
  differently, and the system model preprocessed, to avoid conflict, or
  can they be distinguished automatically in the directory structure?

- can a bug-fix or feature branch to several elements be treated
  (e.g. named or merged) globally, or must the operations be repeated
  for each element affected?

- can a system be composed of subsystems each maintained in its own
  directory structure, or must there be a single system model (with
  absolute pathnames) for the whole?

The next few questions deal with evolutionary systems where not only
the module code, but the module structure itself may change over time.

- the extracts quoted from system models indicate that include
  dependencies must be specified manually.  I understand that there is
  a make_model program to determine these automatically when the model
  is first written, but has there been any support added to generate
  them dynamically at build time?

- if the structure of the system is changed at some point in a way
  which significantly affects the system model, will subsequent
  modifications to earlier configurations see the new or old model?

My thanks to the long-suffering correspondents at Apollo for any
answers they can provide.
-- 
Bill Jones, Develcon Electronics, 856 51 St E, Saskatoon S7K 5C7 Canada
uucp:  ...ihnp4!sask!zaphod!billj               phone:  +1 306 931 1504

oj@apollo.UUCP (07/07/87)

In article <1745@zaphod.UUCP> billj@zaphod.UUCP (Bill Jones) asked
a bunch of questions about DSEE.  I'm another Apollo in-house DSEE
user, and fan, but I'm not a DSEE developer.  Here are my answers
to Bill's questions (which I've trimmed a bit):

>- is there support for maintaining parallel
>  systems of derived objects...? Or will most of the binaries
>  have to be recreated on each build
>  as the pool cache capacity is reached?

Parallel systems are supported.  Different pools can
be used for different variants (like debugging on and off).
Options (typically, compiler options like CPU variants, debugging,
optimization) can be specified as either
 -- "critical" meaning if the binary pool doesn't contain
               an object that was made with the option, DSEE will
               remake it with the option.
 -- "noncritical" meaning that DSEE will accept the same source
                  file, but compiled with different options.
(Note that all dependencies, including insert files, can be
critical or noncritical.)
"Pool capacity" means three things:
  (1) disk storage capacity--limited by the disk a pool is on;
      however, pools can be split among disks and nodes.
  (2) a -limit option to set the number of versions of a derived object
      that are retained in a pool.  
  (3) an -age option to specify the number of hours that  a  derived 
      object  version  must  have  been  sitting in the pool to be a
      candidate for purging.  We keep this just large enough to give
      us a chance to actually use the objects.
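
A rough Python sketch of how a version limit and a minimum age could combine
when picking purge candidates.  How DSEE actually combines -limit and -age is
not stated above, so the rule here (purge only versions that are both beyond
the limit and old enough) is an assumption, as are the function and
parameter names.

```python
def purge_candidates(versions, limit, min_age_hours, now_hours):
    """Pick versions of one derived object that may be purged.

    versions: list of (name, created_at_hours) pairs.
    A version is a candidate only if it falls outside the newest `limit`
    versions AND has been in the pool for at least `min_age_hours`.
    """
    newest_first = sorted(versions, key=lambda v: v[1], reverse=True)
    beyond_limit = newest_first[limit:]
    return [name for name, created in beyond_limit
            if now_hours - created >= min_age_hours]


pool = [("v1", 0), ("v2", 10), ("v3", 20), ("v4", 23)]
print(purge_candidates(pool, limit=2, min_age_hours=12, now_hours=24))
```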

>- what support is there for variant
>  implementations of the same module?

Variant implementations are kept on branches in the source library.
For example, I might do the following:
   DSEE>  set library /gpr/src                  #to access the GPR source library
   DSEE>  create branch sixteen text.pas        #to make the branch "sixteen" for text
            .... edit....
   DSEE> replace text.pas/sixteen               #check in my branch
   DSEE> create branch kanji text.pas/sixteen   #make a branch off a branch
            .... edit....
   DSEE> replace text.pas/sixteen/kanji         #check in my kanji branch

>  Must the elements be named
>  differently...?

No, not really.  Once I create and check in my kanji branch, I can
read it using the path name /gpr/src/text.pas/sixteen/kanji 

>  and the system model preprocessed...?

No need to "preprocess" the system model explicitly.  The configuration
thread settings can be used to select which branch or version of each module
gets compiled when DSEE does a build.  For example, one might use this
configuration thread:

    -reserved
    -for vector.pas -use_options -opt 2
    -for '?* @ gpr' /sixteen/kanji -when_exists
    -for '?* @ gpr' /sixteen       -when_exists
    [sr9.6]                        -when_exists
    []

This means: 1.  Use my reserved version of any module if I have one.
                In this case a module is either a source or header
                (insert) file.   Compilations run under DSEE automatically
                locate the thread-selected versions and branches of header files.
            2.  Compile vector.pas with "-opt 2"
            3.  Use the /sixteen/kanji branch for any module for which
                that branch exists.
            4.  Use the /sixteen branch for any module for which that exists.
            5.  Use the version named sr9.6 for all modules that have such
                a version name.
            6.  For all other modules, just use the latest stuff.
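
The first-match-wins reading of the thread above can be sketched in Python.
This is only a model of the rule ordering, not DSEE code: the tuple encoding
of rules, the element dictionaries, and the output strings are all made up,
and the "-use_options" rule is left out because it sets build options rather
than selecting a version.

```python
# Hypothetical encoding of the thread shown above, in the same order.
THREAD = [
    ("reserved",),
    ("branch", "gpr", "/sixteen/kanji"),
    ("branch", "gpr", "/sixteen"),
    ("named", "sr9.6"),
    ("latest",),
]


def select_version(element, thread=THREAD):
    """Return a label for the version the first matching rule selects."""
    name = element["name"]
    for rule in thread:
        kind = rule[0]
        if kind == "reserved":
            if element.get("reserved_by_me"):
                return name + " (my reserved copy)"
        elif kind == "branch":
            library, branch = rule[1], rule[2]
            if (element.get("library") == library
                    and branch in element.get("branches", ())):
                return name + branch
        elif kind == "named":
            if rule[1] in element.get("version_names", ()):
                return f"{name}[{rule[1]}]"
        elif kind == "latest":
            return name + "[]"
    raise LookupError(f"no rule in the thread selects {name}")


text = {"name": "text.pas", "library": "gpr",
        "branches": {"/sixteen", "/sixteen/kanji"}}
print(select_version(text))
```

Because rules are tried top to bottom, putting /sixteen/kanji ahead of
/sixteen is what makes the more specific branch win where both exist.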

>  can they be distinguished automatically in the directory structure?

Yes.

>- can a bug-fix or feature branch to several elements be treated
>  (e.g. named or merged) globally, or must the operations be repeated
>  for each element affected?

I'm not sure whether branches can be created or merged globally.  The branch and
configuration thread system is flexible enough that there's no need to 
take branches of "everything," so I've never tried it.
Version names can be applied either globally or module-by-module.

>- can a system be composed of subsystems each maintained in its own
>  directory structure?

Yes.  This is standard practice here.  A system model can refer to several
DSEE source library areas.

>  Must there be a single system model (with
>  absolute pathnames) for the whole?

You can have as many system models as you like.  However, we
ordinarily use just one for each major component, and maintain it as a DSEE element, with
branches and version names.
As for absolute pathnames, what we do is create a standard set of links in the root
directory of each developer's node.  For example, here are some of the links in the root
structure of my node:

dl               "//id/dl"
dm               "//guess/what/dm"
gmr              "//ice/cream/gmr"
gpr              "//tribble/gpr"
gsr              "//ice/cream/gsr"
os               "//os/os"
uet              "//guess/what/uet"
us               "//us/us"
x10              "//morgul/x"

Most (not all) of these directories have a "src" and an "ins" DSEE structure under them.
Thus, I (and everybody else) can find the gpr file text.pas by using the path name
/gpr/src/text.pas ... this resolves, through my link, to //tribble/gpr/src/text.pas .

Things do change...
Once in a while, somebody sends around mail saying, for example
     "/dm" is moving from //ice/cream to //guess/what .
At the same time, they do a copy_tree ($ cpt or % cp -r) of the DSEE structures
to the new node / volume, and put a symbolic link in place of the old structure.
Thus, until developers make the change, they traverse an extra symbolic link.

So yes, you need absolute pathnames for multiple libraries, but using links
mitigates the problem.
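
The effect of the root-directory links can be mimicked with a small lookup
table.  The Python sketch below is just an illustration of the indirection,
using two of the links listed above; resolve is a made-up helper, not an
Apollo call, and it handles only a single level of link.

```python
ROOT_LINKS = {
    "gpr": "//tribble/gpr",
    "dm": "//guess/what/dm",
}


def resolve(path, links=ROOT_LINKS):
    """Expand the first component of a node-relative path through a link table.

    Paths that already start with "//" (network-absolute) are left alone.
    """
    if not path.startswith("/") or path.startswith("//"):
        return path
    first, _, rest = path[1:].partition("/")
    target = links.get(first)
    if target is None:
        return path                      # no link by that name
    return target + ("/" + rest if rest else "")


print(resolve("/gpr/src/text.pas"))
```

When a library moves, only the table entry changes; every model that says
/gpr/src/... keeps working.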

>The next few questions deal with evolutionary systems where not only
>the module code, but the module structure itself may change over time.
>- the extracts quoted from system models indicate that include
>  dependencies must be specified manually.  I understand that there is
>  a make_model program to determine these automatically when the model
>  is first written, but has there been any support added to generate
>  them dynamically at build time?

Not that I know of...  but builds do succeed, with a DSEE warning message
when you omit a dependency.  I use this feature to "debug" my system
model.  Here's a transcript from a DSEE session showing this.

Completed "gsr_vector.pas" on //VERMONT (6BAD)
Build gsr_vector.pas!6-Jul-1987.22:34:02
?(DSEE/Streams Manager) Element "//us/us/ins/gpr.ins.pas" was read but was not declared as an Element dependency.
?(DSEE/Streams Manager) Element "//us/us/sys/ins/gpr.ins.pas" was read but was not declared as an Element dependency.
No errors, no warnings, Pascal Rev 7.2891
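
The check behind warnings like these can be thought of as a set difference
between what the translator actually read and what the model declared.  A
minimal Python sketch, assuming the list of files read is available and that
library elements can be recognized by a path prefix (both are assumptions;
the function name and the sample source path are hypothetical):

```python
def undeclared_reads(files_read, declared, library_roots):
    """Return library elements that were read but not declared as dependencies."""
    prefixes = [root.rstrip("/") + "/" for root in library_roots]
    return sorted(
        path for path in files_read
        if path not in declared
        and any(path.startswith(prefix) for prefix in prefixes)
    )


read = {
    "//us/us/ins/gpr.ins.pas",
    "//us/us/src/gsr_vector.pas",   # hypothetical location of the source
    "/tmp/scratch",
}
declared = {"//us/us/src/gsr_vector.pas"}
for path in undeclared_reads(read, declared, ["//us/us"]):
    print(f'Element "{path}" was read but was not declared')
```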

>- if the structure of the system is changed at some point in a way
>  which significantly affects the system model, will subsequent
>  modifications to earlier configurations see the new or old model?

Models can be placed on branches, and snapshotted as part of
the release-creation process, so the answer is that modifications
see the appropriate model.

>My thanks to the long-suffering correspondents at Apollo for any
>answers they can provide.

Thank you for reading this far.  This turned out to be sort of long.

Ollie Jones.  Apollo Computer.   I'm responsible for my own statements
                                 and misstatements.  My knowledge of DSEE
                                 comes through (a) using it and (b) reading
                                 the published manuals.

joelm@apollo.UUCP (07/07/87)

In article <1745@zaphod.UUCP> billj@zaphod.UUCP (Bill Jones) writes:
>
>- is there support for maintaining, as Bob Reed mentioned, parallel
>  systems of derived objects, say one with debugging turned on and one
>  not?  Or will most of the binaries have to be recreated on each build
>  as the pool cache capacity is reached?

The whole point of the caching scheme is to maintain parallel systems of
derived objects.  Remember, DSEE views the world from the source point of
view.  It decides (forgive me some anthropomorphisms) what sources and
translator options are necessary and then tries to find a build in the
pools that matches the requested versions and options.  You can specify
some options as "critical" to the build, e.g. debugging information,
or as "non-critical", e.g. produce a map file.  If the pools are not
configured with enough depth (i.e. number of builds of each object)
then you can of course thrash, but that is a problem that is easily
solved and one that we almost never encounter.  The pool can be reconfigured
at any time.
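
The pool lookup described above amounts to matching on source versions plus
only the critical options.  A Python sketch under that reading; the record
layout and the name find_reusable are invented for illustration and say
nothing about how DSEE stores its pools.

```python
def find_reusable(pool, source_versions, options, critical):
    """Return the first pooled build that matches, or None.

    pool: list of dicts with "sources" (name -> version) and "options".
    Only options named in `critical` have to match; the rest are ignored.
    """
    wanted = {k: v for k, v in options.items() if k in critical}
    for build in pool:
        if build["sources"] != source_versions:
            continue
        have = {k: v for k, v in build["options"].items() if k in critical}
        if have == wanted:
            return build
    return None


pool = [{"id": 1, "sources": {"foo.pas": 3},
         "options": {"debug": True, "map": False}}]
# A different map-file setting is tolerated because "map" is non-critical.
hit = find_reusable(pool, {"foo.pas": 3}, {"debug": True, "map": True}, {"debug"})
print(hit["id"] if hit else "must rebuild")
```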

>- in the above environment, what support is there for variant
>  implementations of the same module?  Must the elements be named
>  differently, and the system model preprocessed, to avoid conflict, or
>  can they be distinguished automatically in the directory structure?

There is only one "element".  In fact, it is an extensible streams
object.  (From the Open Systems Toolkit.)  So the directory structure
appears to be consistent, that is to say the elements and all the
variants (branches) appear to exist.  The reality is that they are created
on the fly by the streams manager when they are requested.  This operation
is surprisingly quick.  One of my favorite demos is to open random versions
of case-history objects and show that it is about as fast as opening
plain ascii.  For example the library "src" contains element "pgm.c".
pgm.c could have several lines-of-descent (variations) going on
at once.  Each could be referred to by its branch name, e.g. pgm.c/bug_fix,
pgm.c/risky_changes, pgm.c/add_debug_statments.  Note that no special
version is necessary to compile it with debug information.  That is an
option that is specified at build time.

>
>- can a bug-fix or feature branch to several elements be treated
>  (e.g. named or merged) globally, or must the operations be repeated
>  for each element affected?

Each element must be merged independently.  I don't think that I would
trust any merge system that did the merging w/o my supervision.  Most
of the merge work is automatic, so very little intervention is required,
provided your changes on the branch do not conflict with changes on
the main-line of descent.  All elements (branches or main_line) can be
named globally.  Naming is pretty flexible; you can name specific
versions or just say: name all versions used in this build.

>
>- can a system be composed of subsystems each maintained in its own
>  directory structure, or must there be a single system model (with
>  absolute pathnames) for the whole?
>
The directory structure is independent of the models.  All "elements"
(programs, modules, insert files etc...) live in libraries.  Any system
model can refer to any library, and more than one system model can
refer to the same library.  Generally, you would group your sources
logically in similar libraries, i.e. src modules, insert files etc...
and then create a model to build what you need.  You can use smaller
models to build subcomponents, and then include them (literally with
a %include statement) in a larger model that incorporates many
smaller models.  Most of the time though, it is simpler just to make
one large model.  You can at build time specify which subcomponents
to build.  i.e. build foo.pas, build foo.system (which includes foo.pas)
or just "build" which builds everything in the model.  (Or finds 
appropriate builds in the pools.)

>- the extracts quoted from system models indicate that include
>  dependencies must be specified manually.  I understand that there is
>  a make_model program to determine these automatically when the model
>  is first written, but has there been any support added to generate
>  them dynamically at build time?

There is *sigh* no such support.  Though, you will get a warning if
your program reads a DSEE library element that was not declared.  In
general it is not a great idea to include non-DSEE elements.  It works
just fine, but you lose all of the history information.  Make_model
still needs a bit of work.

>
>- if the structure of the system is changed at some point in a way
>  which significantly affects the system model, will subsequent
>  modifications to earlier configurations see the new or old model?

Part of specifying which versions go into a build is specifying
a version of the model to use.  (Models are generally DSEE library
elements themselves.)  So, if you want to build an older version of
your modules, you would use an older version of the model.

>Bill Jones, Develcon Electronics, 856 51 St E, Saskatoon S7K 5C7 Canada
>uucp:  ...ihnp4!sask!zaphod!billj               phone:  +1 306 931 1504

                                             Joel

-- 
Joel Margolese          UUCP:      ...{attunix,uw-beaver,decvax!wanginst}!apollo!joelm
Apollo Computer         ARPA:     apollo!joelm@eddie.mit.edu
                        INTERNET: joelm@apollo.com