rh@smds.UUCP (Richard Harter) (11/15/90)
This is really a request for comment (and a digression from your regularly scheduled pointer wars.) Background: I am working on a language called Lakota. This is an interpreted language with procedures and functions. It is feasible and natural to write moderately large programs in this language. Insofar as is feasible the features of the language are very simple; i.e., the idea is that you don't have to be a language guru to use it. The Problem: The issue at hand is globals. In C there are three levels of scope -- program global, source file global, and block. Fortran also has three, blank common, labelled common, and subroutine/function. In many languages with block structure inner blocks inherit variables from outer blocks. And so on. The common (you should excuse the expression) thread is that one wants to share access to data across procedures. In the course of doing this one wants to avoid nasty things like name space pollution and ensure nice things like data hiding and restricted access. The Request: What I am looking for is ideas. What are some of the approaches that can be used, and what are the pros and cons of these approaches? Actually what I am fishing for is an approach that combines elegance and simplicity with the constraint that it should not be confusing or daunting to a naive user. It seems to me that this is a worthy topic for this group. -- Richard Harter, Software Maintenance and Development Systems, Inc. Net address: jjmhome!smds!rh Phone: 508-369-7398 US Mail: SMDS Inc., PO Box 555, Concord MA 01742 This sentence no verb. This sentence short. This signature done.
chl@cs.man.ac.uk (Charles Lindsey) (11/16/90)
In <242@smds.UUCP> rh@smds.UUCP (Richard Harter) writes: >The issue at hand is globals. In C there are three levels of scope -- >program global, source file global, and block. Fortran also has >three, blank common, labelled common, and subroutine/function. In >many languages with block structure inner blocks inherit variables >from outer blocks. And so on. The common (you should excuse the >expression) thread is that one wants to share access to data across >procedures. In the course of doing this one wants to avoid nasty >things like name space pollution and ensure nice things like data >hiding and restricted access. What you want is variables whose "extent" (i.e. lifetime) is forever, but whose "scope" is restricted to the bodies of the procedures which need to share them. What you want, therefore, is a modules facility, as in Modula-2 or Ada (where they are called packages) - not that those particular languages have necessarily made a perfect job of modules. My feeling is that, with a decent modules system, you can do away with classical block structure altogether.
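The distinction between extent and scope can be illustrated in C, which the thread already uses as a reference point. The following is only a sketch: the names counter_value, counter_bump and counter_read are hypothetical. A file-scope static variable lives for the whole run of the program, yet is visible only to the procedures in the same source file.

```c
#include <assert.h>

/* Hypothetical shared state: it exists for the whole run (static extent),
   but it is visible only inside this source file (file scope). */
static int counter_value = 0;

/* One of the two procedures that share counter_value. */
void counter_bump(int amount)
{
    assert(amount >= 0);        /* illustrative precondition only */
    counter_value += amount;
}

/* The other procedure that shares it; code in other files cannot
   name counter_value directly and must go through these two. */
int counter_read(void)
{
    return counter_value;
}
```

This is roughly what a module facility generalizes: the sharing set is chosen by the programmer rather than dictated by file boundaries.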
pcg@cs.aber.ac.uk (Piercarlo Grandi) (11/17/90)
On 15 Nov 90 08:43:27 GMT, rh@smds.UUCP (Richard Harter) said: [ ... on the issue of scoping rules ... ] rh> What I am looking for is ideas. What are some of the approaches that rh> can be used, and what are the pros and cons of these approaches. rh> Actually what I am fishing for is an approach that combines elegance rh> and simplicity with the constraint that it should not be confusing rh> or daunting to a naive user. It seems to me that this is a worthy rh> topic for this group. Oh yes. And I have decided to come out of the closet with a profound and utterly revolutionary secret that I have been holding in my conscience for so many years. Alas, no more. I have to pull this weight off my chest (I am also pulling your leg a bit here :->). Traditional Algol inspired scope rules are completely wrong. Most of the evils and difficulties in modularization and reuse are because of this catastrophic mistake. This is that lower level modules can be made to depend on the details of higher level ones (e.g. global variables) which is crazy. In other words, the mistake is that currently scopes nest in the wrong direction; the right idea is that a higher level entity should be able to see the names in a lower level one, not vice versa. In other words, the right model is not contours and visibility from inside to outside, but a tree and visibility from top to bottom. It must also be possible for a higher level module to manipulate a lower level one's name table. For example argument passing should be defined in these terms; but also this must include the ability to append naming subtrees under a lower level entity. In other words a module should be able to manipulate instances of lower level modules both as to the values that names assume in that module and to the shape of the naming tree beneath it. Modules should be able to be parametric with respect to not only the values of their names, but also those beneath them. 
Also, it must be possible to create several instances of a module, and bind names and subtrees differently in each instantiation (and this subsumes generators, closures, generics, overloading, polymorphism, ...). There is one possible objection to top bottom (instead of inside outside) scoping and it is that a module can then access uninitialized variables of lower level modules. Well, if it is correct, it will not -- it is an error to do so. Please note that top bottom is *not* exactly the reverse of inside outside! A top module can have many bottom modules at the same level of nesting, but an inside module can have only one outside module at the same level of nesting (multiple inheritance is a poor attempt to get around this). I do believe that this view of scoping is as "powerful" as the other one and is the most elegant and flexible one -- and it has obvious and large modularity advantages. In particular it subsumes inheritance etc... Similar research? That I know of, only BETA, which has a notion of 'pattern' with very similar properties, even if I don't think that the notion that their scopes nest top to bottom is explicit. Disclaimer: the above is an atrocious presentation of some very unusual ideas -- please make allowance for the extremely elliptical style of presentation... -- Piercarlo Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
peter@ficc.ferranti.com (Peter da Silva) (11/17/90)
I have been facing a similar problem in a series of extension packages for TCL. The technique used in Modula, where every "global" object actually has module scope and must be explicitly imported into another module seems attractive, but I've not had occasion to use Modula much to see how well it handles in practice (basically, different Modula compilers all seem to have unique and contradictory runtime libraries... so portable Modula code becomes pretty damn hard to write). I've been thinking of doing something like this (in TCL):

    module modulename {
        import othermodule symbol...
        ...
        export symbol...
    }

Because of compatibility considerations, all existing symbols are assumed to be in a root module "tcl" which is imported implicitly. But this brings up the question of what this means:

    module modulename {
        module newmodule {
            ...
        }
    }

My first reaction is that in this submodule all symbols in the outer module are defined. This means that the root module is just another example of a module. The downside is that symbol table lookups could become quite hairy, and because this is an interpreted language (though I've got ideas for compiled TCL) this would negatively impact runtime efficiency. Input? -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
forsyth@minster.york.ac.uk (11/17/90)
You might find the following paper interesting. I don't know whether it has been published elsewhere. It supports the view that modules and packages are redundant: use praiseworthy (first-class) procedures instead.

%T In Praise of Procedures
%A I. F. Currie
%I Royal Signals and Radar Establishment
%C Malvern
%M 3499
%D 1982
%K RSRE,flex
rh@smds.UUCP (Richard Harter) (11/23/90)
In article <Y+=69_G@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes:
> I have been facing a similar problem in a series of extension packages
> for TCL... I've been thinking of doing something like this (in TCL):
>     module modulename {
>         import othermodule symbol...
>         ...
>         export symbol...
>     }

So far, so good. However I don't know how you are using the word "module". Module is one of those words which is used in a number of different senses in different contexts. I like "procedure" and "subroutine" because there is no doubt about what is meant. In any case the principle here seems to be that names within a procedure are strictly local unless explicitly imported or exported. Upon reflection I think that this is a good idea in a typeless language. If one doesn't do this there is a horrible vagueness about what a procedure is doing.

> Because of compatibility considerations, all existing symbols are assumed
> to be in a root module "tcl" which is imported implicitly.

Ouch. Does this mean that there can only be one instance of a symbol? Or does it mean that there can only be one external instance?

> But this brings up the question of what this means:
>     module modulename {
>         module newmodule {
>             ...
>         }
>     }
> My first reaction is that in this submodule all symbols in the outer module
> are defined. This means that the root module is just another example of a
> module. The downside is that symbol table lookups could become quite hairy,
> and because this is an interpreted language (though I've got ideas for
> compiled TCL) this would negatively impact runtime efficiency.
> Input?

If I understand this correctly you are working with a single namespace. Each module can add symbols to the name space dynamically; the names vanish when the module exits. This is sort of like the context idea. Does your export verb mean that symbols are passed down or up? Symbol table lookup may not be all that bad. Here is what I do in Lakota.
Symbols are mapped into integers which are indices in a lookup table. When a raw symbol is processed it is hashed into an index into an array of list pointers which, in turn, point into the lookup table. The lookup table structure is something like this:

    struct symtab {
        struct symtab *hash_link;
        int hash_index;
        char *symtext;
        int length;
        int refcount;
    };

Procedures are memory resident in the form of arrays of integer lists. This means that the symbols in the procedure stay resident for the lifetime of the availability of the procedure whence the reference count stays positive. The result is that symbol table lookup is fast most of the time.
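The lookup scheme described above might be sketched as follows. This is not Lakota's actual code: only the struct symtab layout comes from the post, while the table sizes, the hash function, and the names sym_intern and sym_text are assumptions made for illustration. Releasing a symbol when its reference count drops to zero is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64     /* assumed number of hash chains */
#define MAXSYMS  256    /* assumed capacity of the lookup table */

struct symtab {                 /* layout as given in the post */
    struct symtab *hash_link;   /* next entry on the same hash chain */
    int hash_index;             /* which chain this entry is on */
    char *symtext;              /* the symbol's text */
    int length;                 /* length of symtext */
    int refcount;               /* number of live references */
};

static struct symtab table[MAXSYMS];      /* the lookup table */
static struct symtab *buckets[NBUCKETS];  /* array of list pointers */
static int nsyms = 0;

/* Assumed hash function; the post does not say which one is used. */
static int hash_text(const char *s, int len)
{
    unsigned h = 0;
    for (int i = 0; i < len; i++)
        h = h * 31u + (unsigned char)s[i];
    return (int)(h % NBUCKETS);
}

/* Map a raw symbol to its integer index, adding it if it is new.
   Each call counts as one more reference. Returns -1 on failure. */
int sym_intern(const char *text)
{
    int len = (int)strlen(text);
    int h = hash_text(text, len);

    for (struct symtab *p = buckets[h]; p != NULL; p = p->hash_link) {
        if (p->length == len && memcmp(p->symtext, text, (size_t)len) == 0) {
            p->refcount++;
            return (int)(p - table);
        }
    }
    if (nsyms == MAXSYMS)
        return -1;

    struct symtab *e = &table[nsyms];
    e->symtext = malloc((size_t)len + 1);
    if (e->symtext == NULL)
        return -1;
    memcpy(e->symtext, text, (size_t)len + 1);
    e->length = len;
    e->hash_index = h;
    e->refcount = 1;
    e->hash_link = buckets[h];      /* push onto the front of the chain */
    buckets[h] = e;
    return nsyms++;
}

/* Once a symbol is an integer, getting back to its text is one array access. */
const char *sym_text(int index)
{
    assert(index >= 0 && index < nsyms);
    return table[index].symtext;
}
```

The hashing cost is paid only when raw text is first seen; code that already holds the integer index never touches the hash chains, which fits the claim that lookup is fast most of the time.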
rh@smds.UUCP (Richard Harter) (11/23/90)
In article <PCG.90Nov16161830@odin.cs.aber.ac.uk>, pcg@cs.aber.ac.uk (Piercarlo Grandi) writes: [ ... on the issue of scoping rules ... ] > Oh yes. And I have decided to come out of the closet with a profound and > utterly revolutionary secret that I have been holding in my conscience > for so many years. Alas, no more. I have to pull this weight off my > chest (I am also pulling your leg a bit here :->). Pull on the right one, please. The left one is already 47 feet long. Piercarlo introduces the interesting notion of inverting the operation of scoping. The idea is, of course, utterly unsettling to those of us for whom the traditional rules are engrained. I am going to have to think about this one. I would be enchanted to see some further development of this idea.
rh@smds.UUCP (Richard Harter) (11/23/90)
In article <chl.658750215@m1>, chl@cs.man.ac.uk (Charles Lindsey) writes:
> What you want is variables whose "extent" (i.e. lifetime) is forever, but
> whose "scope" is restricted to the bodies of the procedures which need to
> share them.

Agreed. [Lifetime need not be forever.]

> What you want, therefore, is a modules facility, as in Modula-2 or Ada (where
> they are called packages) - not that those particular languages have
> necessarily made a perfect job of modules.

Well, yes. However I can't say I'm all that happy with what either language does (I am more familiar with Ada than Modula-2.) For example, suppose that in Fortran 2001++ we have the code

    package foobar
        public x y z proc1 proc2
        private a b c proc3
        from another_package d proc4
        ....

This sort of thing seems plausible. Now come some questions. Can we nest packages? If so, can nested packages at different locations in the nesting tree refer to each other? Can packages contain code which is not in procedures? If so, when is this code executed? What about this situation?

    package foobar

    program xyzzy
        ....
        use foobar
        ....

    program plugh
        ....
        use foobar

Are these two separate name spaces or do the two programs share the name space? Suppose we have two different libraries, each with a foobar package? One of the complications is that, for my purposes, I want to avoid additional syntax. Solutions that use qualified names with special characters to separate fields lose.

....
peter@ficc.ferranti.com (Peter da Silva) (11/24/90)
In article <251@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:
> In article <Y+=69_G@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes:
> > I have been facing a similar problem in a series of extension packages
> > for TCL... I've been thinking of doing something like this (in TCL):
> >     module modulename {
> >         import othermodule symbol...
> >         ...
> >         export symbol...
> >     }
> So far, so good. However I don't know how you are using the word "module".

Good point. "Module" here refers to a collection of symbols with a common scope. No control flow commonality is implied. Perhaps I should say "package"? Or borrow from Forth and call it a "vocabulary".

> > Because of compatibility considerations, all existing symbols are assumed
> > to be in a root module "tcl" which is imported implicitly.
> Ouch. Does this mean that there can only be one instance of a symbol?

No, it means that symbols already defined in existing TCL code (such as proc or ErrorInfo) are in the root vocabulary.

> If I understand this correctly you are working with a single namespace.

Existing TCL has a single namespace. I'm extending this to a group of namespaces. Control flow doesn't come into it.
pcg@cs.aber.ac.uk (Piercarlo Grandi) (11/27/90)
On 23 Nov 90 07:45:51 GMT, rh@smds.UUCP (Richard Harter) said:
rh> In article <PCG.90Nov16161830@odin.cs.aber.ac.uk>, pcg@cs.aber.ac.uk
rh> (Piercarlo Grandi) writes:
[ ... on the issue of scoping rules ... ]
pcg> Oh yes. And I have decided to come out of the closet with a profound and
pcg> utterly revolutionary secret that I have been holding in my conscience
pcg> for so many years. Alas, no more. I have to pull this weight off my
pcg> chest (I am also pulling your leg a bit here :->).
rh> Pull on the right one, please. The left one is already 47 feet
rh> long.
Ohhhh. I can imagine how badly you must limp (unless you keep it coiled
like a flamingo :->). Hazards of reading News...
rh> Piercarlo introduces the interesting notion of inverting the
rh> operation of scoping. [ ... ] I am going to have to think about
rh> this one. I would be enchanted to see some further development of
rh> this idea.
Ah, a customer! Welcome sir, here we have a full range of interesting
reasons for which top down scope rules are better than inside outside
ones.
Which one would you like to see first? Would you mind "reuse" to start?
If you use top down scope rules you have a tree (picture it as in
traditional CS fashion with the root at the top, like a genealogical
tree) of closures. Each piece of code sees only the closure in whose
context it executes and its descendants.
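As a rough illustration of "sees only the closure in whose context it executes and its descendants", here is a small C sketch. It is a deliberate simplification and not a design taken from the post: each node binds a single name, the structure is assumed to be a tree with no loops, and struct env, env_attach and env_lookup are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 4          /* assumed fan-out limit for the sketch */

/* One node of the environment tree; it binds a single name to a value. */
struct env {
    const char *name;
    int value;
    struct env *child[MAX_CHILDREN];
    int nchildren;
};

/* A higher level node appends a naming subtree beneath itself. */
void env_attach(struct env *parent, struct env *child)
{
    assert(parent != NULL && child != NULL);
    assert(parent->nchildren < MAX_CHILDREN);
    parent->child[parent->nchildren++] = child;
}

/* Resolve a name starting at node: the node itself and its descendants
   are searched, and nothing above node is ever consulted. */
const struct env *env_lookup(const struct env *node, const char *name)
{
    assert(name != NULL);
    if (node == NULL)
        return NULL;
    if (strcmp(node->name, name) == 0)
        return node;
    for (int i = 0; i < node->nchildren; i++) {
        const struct env *hit = env_lookup(node->child[i], name);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

Because struct env has no parent pointer, a subtree cannot come to depend on where it is attached, which is the property the argument for reuse rests on.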
This means that modules are perfectly reusable; the same subtree can be
linked in transparently in many places in the overall tree (which
becomes a DAG, or even a general directed graph, actually).
Consider the alternative with inside outside; modules that use globals
cannot be reused as easily. In particular, code that uses globals to
fake generators (subroutines that need to remember state between
invocations) is obnoxiously bad to reuse.
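The point about generators faked with globals can be seen in a few lines of C. This is only a sketch with hypothetical names (next_id_global, struct id_gen, id_gen_init, id_gen_next): the first version keeps its state in a file-level variable, so the whole program can have just one sequence; the second keeps the state in an instance owned by the caller, so any number of independent sequences can coexist.

```c
#include <assert.h>
#include <stddef.h>

/* Version 1: state remembered between calls lives in a global.
   Every caller shares the same single sequence. */
static int global_next = 0;

int next_id_global(void)
{
    return global_next++;
}

/* Version 2: the state is an explicit instance handed in from above. */
struct id_gen {
    int next;   /* value the next call will return */
    int step;   /* increment between successive values */
};

void id_gen_init(struct id_gen *g, int start, int step)
{
    assert(g != NULL);
    g->next = start;
    g->step = step;
}

int id_gen_next(struct id_gen *g)
{
    int v = g->next;
    g->next += g->step;
    return v;
}
```

With the second form, two callers can each hold their own struct id_gen and never interfere with one another, which is the kind of reuse the global version rules out.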
With top down scoping, two subtrees that need to communicate, instead of
being nested in the same scope, can just share subtrees of their
environment. In a top down scoping language a library is just a subtree
which contains the bindings for all the library entities; it is easy to
reuse the library everywhere without bothering about possible name
clashes at the global level (try to do that in C++ with three libraries
which all define an Object class!).
There is *no* global level. This does not mean that the library or any
functions in it needs to be stateless, because its naming subtree can
well contain bindings for persistent entities -- indeed all the
subroutines in the library will be lumps of code statically bound to
their identifiers.
rh> The idea is, of course, utterly unsettling to those of us
rh> for whom the traditional rules are engrained.
Actually some of the top down scope rules are not entirely new. Consider Ada
or many other modular languages; modules nest BOTH inside out and top
down; you can refer to the names of the enclosing module, but also the
enclosing module can refer to the entities in the enclosed ones (dot
notation). I maintain that only the latter should be allowed. If the
enclosing module wants to "show" some names to an enclosed module, this
should be done by the enclosing module appending the relevant naming
subtree under the enclosed one, not the latter importing those names
explicitly (and thus making itself dependent on where it is linked in
the environment tree).
This is the really different aspect of top-down naming. It also implies
that the modules need not be textually nested, and this means that each
module can be compiled on its own, and that shaping the required naming
subtrees for a module is the responsibility of the upper (but not
really enclosing) modules. In other words environment tree shaping is
totally decoupled from module implementations, allowing mix and match
between module interfaces and implementations, in a transparent way.
If you want another example in which things may be familiar, consider a
UNIX like filesystem tree in which each directory contains a single
module, which is linked against modules in lower directories, and where
you can use (symbolic) links to share subtrees (or even, and this may
make sense at times, loops in the naming structure).
Unfortunately current linkers enforce a linear name resolution system, so
that if you use a current linker you must produce a preorder traversal
of the tree of module directories, and this may greatly reduce the value
of the tree, because it makes for conflicts and ambiguities.
The above arrangement is often used in actual C programming practice,
because C mercifully essentially disallows nested (inside outside)
scoping (in a C source file the static entities are the module's
closure, the extern references are the roots of lower subtrees and the
extern definitions are the leaves of upper subtrees). What I am saying
is that this (incomplete) arrangement is the one that one wants to use,
not just at the intermodule level, but also throughout the language.
I can easily see possible implementations; they become easy to design
once one gets into the habit of visualizing environment trees (graphs,
actually), whether they are linearized on a stack or not, e.g. see
Baker's landmark paper on shallow binding in Lisp 1.5, CACM.
The fact that most algorithmic languages linearize both the environment
tree and the control tree on a stack has made most people regrettably
assume that it is the only possible way, blinding them to things like
closures, partial application, generators, i.e. fully general (and
orthogonal!) environment and control graph shapes.
As soon as you picture the environment tree you start to realize that
searching for names upwards is madness, because it makes subtrees
context dependent, and kills reuse.
Chris.Holt@newcastle.ac.uk (Chris Holt) (12/01/90)
In article <PCG.90Nov27153408@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes: [lots of interesting reasons why scoping should be changed from the standard way] Purists have long maintained that anything in one part of a program should be explicitly imported or exported if it is to be visible in another (remember Dijkstra's glovar declarations?). This has never taken off because in real life the number of things that need to be redeclared over and over again is enormous; and really important things can be remembered anyway (at least when writing the code :-). Well, I tend to view objects as 3D structures, e.g. spheres, that have windows in them. Two spheres can be linked together by connecting their windows; then they can see into each other, to a greater or lesser extent (since windows only allow certain things to be seen; they show restricted views). A window may be one-way, rather than two, so you can only see out, or only see in. That is, the user should be able to specify which way scoping works; there may be a default, but it should be under the programmer's control. Furthermore, windows don't have to be just on the surfaces of objects, linking them with their direct (3D) environments. If a module A is linked with B, and B is linked with C, then B should be able to introduce A and C so they each have a window directly to the other, without going through B (this is useful if B decides to terminate before A and C are done). So, once a module has been introduced to a given library object, they can talk without going through the entire hierarchy; they go through the "hyperwindow" connecting them. This approach encourages small interfaces; no object wants another object looking directly at its variables (unless they're in love :-), and each object only sees those windows of other objects that it has been introduced to (what's a nice object like you doing in an environment like this?). 
The scope of a method becomes a graph, with each method having a different scope. Oh well; just a thought. ----------------------------------------------------------------------------- Chris.Holt@newcastle.ac.uk Computing Lab, U of Newcastle upon Tyne, UK ----------------------------------------------------------------------------- "He either fears his fate too much, or his programming tools are small..."
artg@arnor.uucp (12/08/90)
In article <1990Nov30.191454.29030@newcastle.ac.uk> Chris.Holt@newcastle.ac.uk (Chris Holt) writes: >In article <PCG.90Nov27153408@odin.cs.aber.ac.uk> > pcg@cs.aber.ac.uk (Piercarlo Grandi) writes: >[lots of interesting reasons why scoping should be changed from >the standard way] > >Purists have long maintained that anything in one part of a >program should be explicitly imported or exported if it is to >be visible in another (remember Dijkstra's glovar declarations?). >This has never taken off because in real life the number of things >that need to be redeclared over and over again is enormous; and >really important things can be remembered anyway (at least when >writing the code :-). > >Well, I tend to view objects as 3D structures, e.g. spheres, that >have windows in them. Many good, practical reasons argue against global variables. Modules (objects or processes) sharing variables are constrained to run on the same machine unless the distributed system running under them implements shared memory. So sharing variables can prevent a distributed program from being reconfigured, and can make it impossible to do process migration. Sharing, and unconstrained module interface design, explains why computer software functionality and construction has progressed so slowly. My favorite analogy that suggests how software should be decomposed is electronic hardware. Electronics could not have made great strides without the clearly defined INTERFACES between components. Anyone who builds a memory chip that satisfies the interface specs -- physical dimensions (i.e., pinout positions), voltage levels, and logical behavior -- and improves on PERFORMANCE characteristics of the chip -- power consumption, failure rate, cost and response time -- will get others to use the chip. This lets loads of people and organizations specialize in memory chips and improve their quality. 
Digital logic's full of widespread and sensible interface requirements, from binary logic levels to addressing that uses a power of two bits. Software, meanwhile, seems to be ruled by the law "First interface to get market share wins", independent of quality. See MS-DOS, OS/360, kermit, Lotus-123, dBase, Unixes, etc. (If I haven't picked on you please don't feel left out. :-) ) My belief is that if programmers and system designers spent more time defining and PROGRAMMING the interfaces between separate modules, (and only shared data specified in the interfaces), then programs would plug together much more easily than they do today and programmers would benefit much more from each other's work. Languages therefore should be able to specify, independently of any program, the data that modules pass between themselves - its type, its degree of initialization, and other representation independent characteristics. Then modules should be composed, much in the way that chips are combined into ALUs, ALUs into boards, and boards into machines, etc., so that each composition can be defined by its set of interfaces with the outside, and the outside doesn't care about the implementation of a given composition. Arthur artg@ibm.com IBM Research Yorktown Heights
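One way to picture an interface that is specified independently of any implementation, in the spirit of the memory chip analogy, is a table of operations in C. Everything named here (struct memory_if, struct array_mem, array_mem_if, copy_cell) is hypothetical and chosen only for illustration; the post itself proposes no concrete notation.

```c
#include <assert.h>
#include <stddef.h>

/* The interface: everything a client is allowed to know about a memory part. */
struct memory_if {
    int  (*read)(void *self, size_t addr);
    void (*write)(void *self, size_t addr, int value);
};

/* One implementation that satisfies the interface: a plain array. */
#define ARRAY_MEM_CELLS 16

struct array_mem {
    int cells[ARRAY_MEM_CELLS];
};

static int array_read(void *self, size_t addr)
{
    struct array_mem *m = self;
    assert(addr < ARRAY_MEM_CELLS);
    return m->cells[addr];
}

static void array_write(void *self, size_t addr, int value)
{
    struct array_mem *m = self;
    assert(addr < ARRAY_MEM_CELLS);
    m->cells[addr] = value;
}

const struct memory_if array_mem_if = { array_read, array_write };

/* A client written purely against the interface. It shares no variables
   with the implementation and works with any conforming part. */
int copy_cell(const struct memory_if *mi, void *mem, size_t from, size_t to)
{
    int v = mi->read(mem, from);
    mi->write(mem, to, v);
    return v;
}
```

A second implementation (say, one that forwards each access to another machine) could be substituted without touching copy_cell, which is the kind of plug compatibility the hardware analogy is after.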
turner@lance.tis.llnl.gov (Michael Turner) (12/11/90)
In article <1990Dec7.195140.3022@arnor.uucp> artg@arnor.uucp writes: > >Many good, practical reasons argue against global variables. >Modules (objects or processes) sharing variables are constrained to >run on the same machine unless the distributed system running >under them implements shared memory. So sharing variables >can prevent a distributed program from being reconfigured, and >can make it impossible to do process migration. Sharing, and unconstrained >module interface design, explains why computer software functionality >and construction has progressed so slowly. I've explained to many programmers my view that global variables are to data structures what the GOTO is to control structures (worse, in many ways, if you ask me--I'll use the occasional GOTO, but my references to global variables tend to be to someone else's, not to any that I invented.) A lot of people give me this response: that they only use them when "necessary". When I go look at their code, there are always tons of gratuitous global variables. When is a global necessary? I think the answer is: almost never. I have used them when I needed to speed up code, but in almost every case, I was speeding up code that I hadn't written, and that used algorithms and data structures that I wouldn't have chosen. I never liked what I was doing to the code in the process. To me, the worst part of unconstrained use of global variables is the uncertainty: when reading the code, you find yourself looking at something that has no readily-available context or meaning; you don't know what is going to change it and when. Not only does variable-sharing "prevent a distributed program from being reconfigured", I've found that it can prevent ANY significant program from being reconfigured! It has probably prevented a great many programs from being understood by anyone except the original programmer. Forget about GOTO-phobia, how about "`extern' variables considered harmful"? 
---- Michael Turner turner@tis.llnl.gov
rang@cs.wisc.edu (Anton Rang) (12/11/90)
In article <1189@ncis.tis.llnl.gov> turner@lance.tis.llnl.gov (Michael Turner) writes: >I've explained to many programmers my view that global variables are to >data structures what the GOTO is to control structures (worse, in many >ways, if you ask me--I'll use the occasional GOTO, but my references to >global variables tend to be to someone else's, not to any that I invented.) Global variables : data structures :: LongJmp : control structures The problem with global variables is that they can affect anything, anywhere in the program. Thus, it's difficult when you see an assignment 'FLAG = 1' to figure out what in the world this might affect. (Comments help, too, but....) >When is a global necessary? I think the answer is: almost never. They're almost never necessary, but they can be convenient when working with languages which don't have nesting. If I read in N, M, and two N-by-M arrays, I'm likely to use a global variable to hold this, instead of passing an extra four parameters to 90% of my procedures. It's sloppy, but.... I prefer working with languages which have nested scope; in that case, I can pass parameters in once, and use them in subprocedures without needing to explicitly pass them. This can be abused, just as globals can, but often it's more clear, IMHO, than explicitly passing parameters--it may be more clear to call a function 'edge(X,Y)' in a test than 'edge(G,G2,N,M,X,Y).' >Forget about GOTO-phobia, how about "`extern' variables considered >harmful? And static variables! And global error statuses! <grin> Anton +---------------------------+------------------+-------------+ | Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison | +---------------------------+------------------+-------------+
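In a language without nested procedures, such as C, a middle path between globals and six-argument calls is to bundle the shared inputs into one context value. The sketch below is an assumption-laden illustration: the post never says what edge computes, so the body here (do the two arrays differ at a given cell?) is invented, as is struct grids.

```c
#include <assert.h>

/* The values read in once: N, M and the two N-by-M arrays. */
struct grids {
    int n, m;
    const int *g;     /* n*m cells, row-major */
    const int *g2;    /* n*m cells, row-major */
};

/* Hypothetical test: do the two arrays differ at row x, column y?
   The call site shrinks from edge(G,G2,N,M,X,Y) to edge(&ctx, X, Y). */
int edge(const struct grids *ctx, int x, int y)
{
    assert(x >= 0 && x < ctx->n);
    assert(y >= 0 && y < ctx->m);
    return ctx->g[x * ctx->m + y] != ctx->g2[x * ctx->m + y];
}
```

Unlike a global, the context is visible in every signature that uses it, so a reader can still see which procedures depend on the arrays.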
rhys@batserver.cs.uq.oz.au (Rhys Weatherley) (12/11/90)
In <RANG.90Dec10164012@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes: >>When is a global necessary? I think the answer is: almost never. > They're almost never necessary, but they can be convenient when [...] Also, in hardware and interrupt programming (pretty vertical area, but necessary) you can almost guarantee that the data you want to operate on will NOT be passed as a parameter to the interrupt handling function's entry point, but you have to get it from somewhere! Globals are convenient here. >>Forget about GOTO-phobia, how about "`extern' variables considered >>harmful? How about "current programming methodologies and programmer thought processes considered harmful" :-) . > And static variables! And global error statuses! <grin> And procedures and functions since they are also globally declared! Rhys. +===============================+==================================+ || Rhys Weatherley | The University of Queensland, || || rhys@batserver.cs.uq.oz.au | Australia. G'day!! || +===============================+==================================+
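The interrupt case has a close analogue in standard C signal handling, where the handler's signature is fixed and carries no user data. The following sketch uses only signal and sig_atomic_t from the C library; the names last_signal, watch_install and watch_take are hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* The handler receives only the signal number, so whatever it needs to
   communicate has to live somewhere it can reach: here, a file-level
   variable of a type that is safe to assign from a handler. */
static volatile sig_atomic_t last_signal = 0;

static void on_signal(int signo)
{
    last_signal = signo;
}

/* Install the handler for one signal. Returns 0 on success, -1 on failure. */
int watch_install(int signo)
{
    assert(signo > 0);
    return signal(signo, on_signal) == SIG_ERR ? -1 : 0;
}

/* Report the most recent signal seen (0 if none) and clear the record. */
int watch_take(void)
{
    int seen = last_signal;
    last_signal = 0;
    return seen;
}
```

Making the variable static keeps it out of the program-wide name space, so the global is at least confined to the file that owns the handler.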
new@ee.udel.edu (Darren New) (12/13/90)
In article <1189@ncis.tis.llnl.gov> turner@lance.tis.llnl.gov (Michael Turner) writes: >When is a global necessary? I think the answer is: almost never. >To me, the worst part of unconstrained use of global variables is the >uncertainty: when reading the code, you find yourself looking at something >that has no readily-available context or meaning; you don't know what is >going to change it and when. Actually, when you have no readily available context or meaning, a global can be most helpful when properly done. I was managing a few-person project in which globals were almost vital. One person was writing the "main" function and a few levels of nesting, another was writing the lowest levels of nesting, and others were writing various parts of the intermediate levels. It was done this way because the intermediate levels needed to be rewritten (to some extent) for each customer and software product, but the top-level menus would be the same for each type of software product and the bottom layers would be the same for all products. For example, the top level displayed the "print report" option on the main menu, the intermediate level calculated what was to go in the report and which columns went where, and the lowest level got the characters out to the printer. (There were actually four levels, but that's beyond the scope of my point.) Anyway, globals were needed because we could not afford to change the middle layers of every product when something needed to be passed from the top level to the bottom level. For example, if customer 17 needed two different printers supported, and customers 1 thru 16 only needed one, then getting programmers for 1 thru 16 to add the "which printer" parameter to their intermediate levels becomes a maintenance nightmare. My solution was to have all global variables actually be global; i.e., every programmer would know about them, and they would maintain only information that was truly global to all routines. 
To accomplish this, every global variable had to be in the "globals" header file of the appropriate layer (i.e., no "hidden" globals between only two modules). Also, and most importantly, each global had to have a comment describing what the variable "meant" independent of context. For example:

    BOOL file_is_open; /* client file is currently open */
    BOOL file_is_ro;   /* client file is open and read-only; false if not file_is_open */

Note that you don't *need* context to understand these globals (given, of course, that you understand our program model). Whenever you open the client file, you must set file_is_open to true and must set file_is_ro correctly also. Whenever you close the client file, you must set both file_is_open and file_is_ro to false.

Whenever a new global was proposed, the global and its comments were written up and distributed to all programmers. The comments were modified until all programmers thought them unambiguous. For example, the first comment for file_is_ro was /* client file is read only */ and one of the programmers said "What if it's closed?"

In conclusion, global variables are just as useful and dangerous as global goto labels. Properly commented and maintained, global variables are helpful in reusability, not harmful. The biggest restriction is that global variables should be *GLOBAL* and not just shared invisibly between some subset of modules. I rarely notice people complaining about the global nature of some truly global variables like file handles, process IDs, file names, userIDs, and so on; I feel that this supports the idea that GLOBAL variables are safer than "invisible" parameters.

Global gotos are helpful too, *IF* they are actually global, that is, if you can actually goto them at any time and have the desired effect. Witness, for example, interrupt vectors, the "main()" entry point, and software libraries.
Each of these global goto-like mechanisms can be useful but only if calling "sin()" always gives you the expected documented answer. There are problems with "errno", not because it is global, but because the data structure is too simple. -- Darren
--
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---
----- Network Protocols, Graphics, Programming Languages, Formal Description Techniques (esp. Estelle), Coffee, Amigas -----
=+=+=+ Let GROPE be an N-tuple where ... +=+=+=
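The documented-globals convention described in the post above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: it uses C99 `bool` in place of the post's `BOOL` type, and the helper names `client_file_opened`, `client_file_closed`, and `file_flags_consistent` are hypothetical, not taken from the original project. In a real multi-file program the two variables would be declared `extern` in the layer's "globals" header and defined in exactly one source file.

```c
#include <assert.h>
#include <stdbool.h>

/* Globals as they might appear in the layer's "globals" header.
 * Each comment states what the variable means independent of context. */
bool file_is_open = false; /* client file is currently open */
bool file_is_ro   = false; /* client file is open and read-only;
                              false if not file_is_open */

/* Hypothetical check: the relationship promised by the comments above. */
bool file_flags_consistent(void)
{
    return file_is_open || !file_is_ro;
}

/* Hypothetical helper: call after the client file has been opened. */
void client_file_opened(bool read_only)
{
    file_is_open = true;
    file_is_ro = read_only;
    assert(file_flags_consistent());
}

/* Hypothetical helper: call after the client file has been closed. */
void client_file_closed(void)
{
    file_is_open = false;
    file_is_ro = false;
    assert(file_flags_consistent());
}
```

Routing every change through two small helpers is one way to keep the documented meaning true; the post itself only requires that whoever opens or closes the file sets both flags correctly.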
rh@smds.UUCP (Richard Harter) (12/15/90)
In article <1189@ncis.tis.llnl.gov>, turner@lance.tis.llnl.gov (Michael Turner) writes:

[... reiteration of artg's arguments against globals ...]

> I've explained to many programmers my view that global variables are to
> data structures what the GOTO is to control structures (worse, in many
> ways, if you ask me--I'll use the occasional GOTO, but my references to
> global variables tend to be to someone else's, not to any that I invented.)
> A lot of people give me this response: that they only use them when
> "necessary". When I go look at their code, there are always tons of
> gratuitous global variables...
> To me, the worst part of unconstrained use of global variables is the
> uncertainty: when reading the code, you find yourself looking at something
> that has no readily-available context or meaning; you don't know what is
> going to change it and when.

I respectfully disagree with these views. Procedures (which are also externals) have the same demerits. Consider the following code:

    procedure foo
        local x
        global FLAG
        ......
        x = FLAG
        ......
        x = bar()

Do we know what FLAG is? No. Do we know what bar does? No. Both are externals supplied from outside the routine. In fact we might reasonably say that getting the value of the global, FLAG, at least does not have any side effects; no such guarantee is given for procedure calls in most languages.

A blanket condemnation of "globals" is overly simplistic and fails, in my view, to grasp the nature of the problem. If we return to the widely condemned "goto" for an analogy we (should) ask: What is wrong with using "goto"s? The practical answer is not that the goto is sinful but rather that, in most cases, it is too primitive -- it does not reflect the actual flow logic being used.

If one is going to discuss the merits and demerits of globals one should survey the uses that are made of them.
Right off hand I can list some:

(a) Environment (state) descriptors
(b) Data transfer between subsystems
(c) Private data within a subsystem

I'm sure that others can come up with a much more extensive list.

> Not only does variable-sharing "prevent a distributed program from being
> reconfigured", I've found that it can prevent ANY significant program from
> being reconfigured! It has probably prevented a great many programs from
> being understood by anyone except the original programmer. Forget about
> GOTO-phobia, how about "`extern' variables considered harmful"?

Color me a skeptic. The thing that keeps programs from being reconfigurable (to the extent that such a thing is desirable) is that they aren't designed with reconfigurability in mind. If that is one of the design objectives, a set of globals which define the configuration can be a very useful thing indeed.

Having said this, let me make a couple of points against globals. The first is that they disproportionately expand the name space of a program (as compared to procedures), since one can introduce more names with globals than with procedures for the same number of lines of code.

The second point against globals is the lack of natural structuring. Let me use C as an example. In C all externally visible globals are in the same single flat name space. It is conventional to break this name space up using include files. However, this convention is easily defeated -- one can always go in through the back door by using an "extern" declaration.

-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb. This sentence short. This signature done.
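The "back door" mentioned at the end of the post above can be illustrated with a short C sketch. Everything here is hypothetical: the names `printer_count`, `retry_limit`, `sneak_printer_count`, and `get_retry_limit` are invented for illustration, and the code is kept in one file so it stands alone, with comments marking where the pieces would live in a multi-file program.

```c
#include <assert.h>

/* Imagine this definition lives in printer.c and is meant to be
 * reached only by including a (hypothetical) printer.h. */
int printer_count = 1;

/* A file-scope 'static' has internal linkage: code in other source
 * files cannot name it, with or without an 'extern' declaration. */
static int retry_limit = 3;

/* Imagine this function lives in report.c, which never includes
 * printer.h. The block-scope 'extern' declaration still reaches the
 * global, so the include-file convention is bypassed. */
int sneak_printer_count(void)
{
    extern int printer_count; /* the back door */
    printer_count = 2;        /* modified without going through the header */
    return printer_count;
}

/* Access to the file-private variable goes through a function. */
int get_retry_limit(void)
{
    return retry_limit;
}
```

The contrast with `static` shows the one structuring tool C does offer at file level: a name with internal linkage stays out of the flat program-wide name space, while anything with external linkage remains reachable from any source file that declares it.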