xtbjh@levels.sait.edu.au (behoffski) (05/10/91)
Several people have asked me to post a code fragment of how code would look using adjectives and algebra. I have two answers: if you want a complete, fully-baked, watertight syntax and semantics, then I haven't got it. If you want some partially-baked ideas with advantages and disadvantages, then I have hundreds. What I will do here is post my current best guess of how to wrap up the ideas into a language, and I'll try to give the factors that I considered at each step.

Since I've started from a position of rejecting everything built upon algebra, I've given myself the near-impossible task of looking at more or less everything and deciding what I want to use and what to discard. The following discussion shows the point I reached before I posted the idea; the fact that I posted shows that I felt I would never be able to complete this project myself.

Incidentally, I'm very happy for anybody to use my ideas, and I put them in the public domain. If you use anything, a simple acknowledgement would be appreciated.

----------------------------------------------------------------------

The process of building systems out of the raw hardware is the process of building a series of languages -- this idea was described by Prof. Vlad Turski in a lecture I attended in 83 or 84. The lowest-level software languages build upon the language defined by the hardware; successive layers of languages move away from the hardware towards the specific application.

Other seed ideas included the C macro "is_leaf()" that I saw in the netlist comparator Gemini by Carl Ebeling; a few very neat ideas in COBOL (gasp, shock horror); and my own fervent belief that data structures were a critical technology, but were improperly rendered in existing languages.

On 16 Sept 1988 I tried to simplify the treatment of data structures by defining a standard set of nouns and verbs that would be common to *all* structures -- and of course saw that this was ridiculous.
And then I saw that implementing all the keywords defined by the structure would be brilliant -- but that this required adjectives and adverbs to be fully-fledged building blocks in the language.

So at any instant during development of the system, the designer is working at two levels:

	- the current environment, as defined by the set of nouns, verbs,
	  adjectives and adverbs that are active, and

	- the visible behaviour of each of the machines that are
	  currently active.

I claim that a fundamental problem of systems engineering -- trading off flexibility against efficiency -- is well captured in this model. Each machine has an interface consisting of nouns, verbs, adjectives and adverbs. There may be zero or more adjectives per noun; there may be zero or more adverbs per verb. Since flexibility is the goal of the interface, these language components should be totally orthogonal.

A crop of issues comes up here; here are some:

	- what is the minimal request -- a noun and a verb?
	- what order: [adj...] noun [adv...] verb?
	- is it worth distinguishing plural from singular nouns?
	- since a token seems to be a word surrounded by whitespace, how do
	  exceptional tokens (escape chars, strings etc.) get defined?
	- how do value adj/adv elements get handled (e.g. twin-cam vs. 2 litre)?
	- how do the results get returned? A stack?
	- what about mutually exclusive adjectives -- "leaf" vs. "non-leaf"?

Formal vs. actual parameters crops up here. You may be working with four machines active simultaneously: two separate instances of one machine, plus two other machines. All four instances use "leaf" and "node", so there is much ambiguity. However, "node" on one machine is really "file"; on another it is "sprite"; and "node" is "person" and "address" in the other two machines. So "leaf files" is unambiguous even where "leaf nodes" is ambiguous. If all the words in the request are ambiguous, then adding something like "of xyz_machine" will be needed to define who is the recipient of the request.
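To make the disambiguation rule above concrete, here is a small sketch (in Python, since Languish was never given a concrete syntax) of resolving a request against the vocabularies of the active machines. All of the class and function names are my own illustration, not part of any real design; requests use singular nouns for simplicity, side-stepping the plural/singular issue listed above.

```python
# Hypothetical sketch: a request such as "leaf file" is matched against the
# nouns and adjectives exported by each active machine. A request is well
# defined only if exactly one machine understands every word in it.

class Machine:
    def __init__(self, name, nouns, adjectives):
        self.name = name
        self.nouns = set(nouns)            # e.g. {"node", "file"}
        self.adjectives = set(adjectives)  # e.g. {"leaf", "non-leaf"}

    def understands(self, words):
        # A machine is a candidate if every word in the request is one of
        # its nouns or adjectives.
        return all(w in self.nouns or w in self.adjectives for w in words)

def resolve(request, active_machines):
    """Return the unique machine that understands the request, or raise."""
    words = request.split()
    candidates = [m for m in active_machines if m.understands(words)]
    if len(candidates) == 1:
        return candidates[0]
    raise LookupError(
        f"request {request!r} is ambiguous or unknown; "
        f"something like 'of xyz_machine' is needed to disambiguate")

filesystem = Machine("filesystem",
                     nouns={"node", "file"}, adjectives={"leaf", "non-leaf"})
sprites = Machine("sprites",
                  nouns={"node", "sprite"}, adjectives={"leaf", "non-leaf"})

# "leaf node" matches both machines and so is ambiguous;
# "leaf file" matches only the filesystem machine.
```

The point the sketch captures is that ambiguity is a property of the *currently active* set of machines, so the same request can be legal or illegal as machines are activated and closed.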
So far, I've said nothing about types. What is a type? A bit? An integer? A string? A real? A binary tree? A file system? A sparse matrix? A polynomial? A sprite? I would claim that types are a by-product of defining a machine. The notion "positive" is more accurate and useful than the noun-and-verb formula "x > 0".

Languish has no inbuilt types, except being able to reference objects via pointers (or some similar anonymous mechanism). Anything that you want to know about an object that you've been given, you need to refer back to the machine that handed you that object. So in the case of an "integer", for example, there is a single Languish machine that defines integers, and defines the primary language elements that are available to operate on integers. This might include "odd". Another machine might then inherit "integer" from the parent machine and add language elements, for example "prime". The user of this second machine might say "prime odd integers", without knowing where the individual language elements were derived. Later on, the implementor might choose to implement a single machine that knows about both "prime" and "odd", and has a special code fragment added to optimise the pairing "prime odd". This improved machine can be substituted without any disruption to the users of the interface.

As you can see, I'm a Software Revolutionary. I'm also Australian and one-legged, but I'm *definitely* not a programmer (mega - (-:). I would claim that continuing to bolt fixes onto algebraic languages -- C++, Ada and Modula-3 are cases in point -- is fundamentally flawed.

Another point about machines and the individual words that are in the machine interface: these words are *not* procedures or functions; they have no meaning outside of the interface.
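The "prime odd integers" inheritance story above can be sketched as ordinary class inheritance, with each adjective mapped to a predicate. This is my own rendering under assumed names (`IntegerMachine`, `adjectives`, `select`); the source never pins down how a machine's vocabulary would actually be represented.

```python
# Hypothetical sketch: a base "integer" machine defines the adjective "odd";
# a derived machine inherits that vocabulary and adds "prime". A caller can
# combine adjectives without knowing which machine supplied each one.

class IntegerMachine:
    def adjectives(self):
        # adjective name -> predicate over a single integer
        return {"odd": lambda n: n % 2 == 1}

    def select(self, adjs, candidates):
        # Apply every requested adjective as a filter over the candidates.
        preds = [self.adjectives()[a] for a in adjs]
        return [n for n in candidates if all(p(n) for p in preds)]

class PrimeIntegerMachine(IntegerMachine):
    def adjectives(self):
        def prime(n):
            return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))
        vocab = dict(super().adjectives())   # inherit "odd" from the parent
        vocab["prime"] = prime               # add a new language element
        return vocab

m = PrimeIntegerMachine()
# "prime odd integers" over 1..20 -> [3, 5, 7, 11, 13, 17, 19]
```

A later, optimised machine could override `select` to special-case the pairing `["prime", "odd"]` (e.g. stepping through odd numbers only); since callers only ever name adjectives, the substitution would be invisible at the interface, which is exactly the claim made above.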
I extend this point to say that allocating storage on a verb-by-verb basis -- which is how existing stack-oriented languages such as Pascal and C work -- is inefficient compared to allocating storage on an interface-by-interface basis. Each instance of an interface can be given a statically allocated interface area. The parameters handed through the interface, and the results and return control point, can be declared in a fixed piece of RAM. This saves the cost of continually grabbing and releasing space on stacks.

Since the requests handled through the interface tools provided by Languish are relatively few, it is worth the expense of maintaining the language set dynamically. As machines are activated and closed, the available set of operators changes accordingly.

The discussion now switches from what the interface has to what the underlying machinery looks like. behoffski very quickly gets very vague here. The Languish code fragment "leaf nodes get", when implemented on a sequential machine, might cause the following code to be generated and then executed:

	?? some initialisation as defined by "get"
	tree.node.StartEnumeration(thistree, context)
	while tree.node.GetNext(context, this_node) do
		?? some per-node operation as defined by "get"
		if tree.leaf.IsLeaf(this_node) then
			?? remember this node as defined by "get"
		end if
	end while

Since each noun, verb, adjective and adverb is intended to be independent, the natural choice is to define a fragment of code for each language element. In the case of adjectives, this is to give the sequential-verb version, i.e. the code fragment for "leaf" implements the test "IsLeaf()". In the case of all the pieces of a structure (like all the nodes of the tree, or all the bits of the integer, or all the files in the filesystem), there needs to be code to enumerate all the pieces. The order in which nodes are processed might be described by the verb or adverbs, e.g. "pre-traverse" vs. "in-traverse" vs. "post-traverse".
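The fragment-per-element idea in the pseudocode above can be shown executing: the noun contributes the enumeration, the adjective contributes its sequential test, and the verb contributes the initialise/per-node/remember skeleton. The following Python sketch mirrors that structure; the tree layout, the pre-order enumeration choice, and all names are my assumptions, not a real Languish implementation.

```python
# Hypothetical sketch of assembling "leaf nodes get" from independent fragments.

class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def enumerate_nodes(tree):
    # Noun fragment: enumerate all nodes (pre-order traversal here, but the
    # order could equally be selected by an adverb such as "post-traverse").
    stack = [tree]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children))

def is_leaf(node):
    # Adjective fragment: the sequential-verb version of "leaf",
    # i.e. the IsLeaf() test from the pseudocode above.
    return not node.children

def get(enumerate_fn, tests, tree):
    # Verb fragment: initialisation, per-node operation, and "remember".
    result = []                              # initialisation defined by "get"
    for node in enumerate_fn(tree):          # per-node operation
        if all(t(node) for t in tests):      # apply every adjective's test
            result.append(node.value)        # remember this node
    return result

tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
# "leaf nodes get" over this tree -> ["a1", "a2", "b"]
```

Because each fragment is independent, swapping the adjective (say, a hypothetical `is_non_leaf`) or the enumeration order changes the request's meaning without touching the verb's skeleton, which is the orthogonality the interface design is after.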
What language is used to implement the code fragments? Currently I'm restricting myself to two languages:

	- some sub-level machines with Languish interfaces, or
	- machine code (or possibly assembly code).

The second case is obviously needed when a machine, such as integer, operates directly off the hardware. The first case is the result of the "languages within languages" idea. I'm currently looking at quite aggressive register-based models for storing the noun and the loop context in the machine-code case; this has forced me to abandon most hope of reusing existing languages.

Hope the discussion above helps,

behoffski
--
Brenton Hoff (behoffski)  | Senior Software Engineer | My opinions are mine
xtbjh@Levels.UniSA.edu.au | AWA Transponder          | (and they're weird).