morrison@nucsrl.UUCP (Vance Morrison) (01/23/88)
Hello,

These are some of my recent thoughts on programming languages that I thought I would share with the net. I have been programming for many years now, and to the best of my abilities I have tried to write what I term 'good code', and have found that it has always been a struggle that I don't often win in traditional programming languages (C, Pascal, etc.). My pet project at the moment is to design a programming environment/language that would make writing 'good code' very easy, in fact easier than writing the equivalent bad code (if possible).

Now I realize that this is no small project I undertake, and that many very smart people have tried to solve this problem and had only limited success, so I actually don't expect success, but I do hope to add my insight to theirs, and thus come that much closer to the goal of a 'perfect language'.

So let's attack this systematically, by first describing what I mean by 'good code'. I think THE property that separates good code from bad code is the ease with which it can be understood and CHANGED. Anyone who has done any real programming will tell you that any software worth writing has to be continually upgraded and extended if it is to be a useful product. Good code allows these changes to be made easily.

The key tool we use to design good code is MODULARITY. Everyone knows (or should know) from their first programming courses how to break up a problem into parts with procedures. Unfortunately this is the extent of the tools that many programming languages have for supporting modularity, and it simply isn't enough.

What I mean by modularity is something much broader than what most people mean when they use the word. When a program is designed, many assumptions are made about the format and structure of the data being processed. These assumptions do not make the code bad; in fact, ANY piece of code makes SOME assumptions about the format and structure of the data.
The KEY is to try to make these assumptions as independent as possible (that is, any assumption can be changed with only minimal impact on any of the other assumptions).

I could give many examples of how good code has this 'assumption independence' and how bad code does not. I also have many rather radical ideas on how to design a language that makes achieving this to a high degree easy (or even possible!), but if I include them now this letter will take forever to write, so instead I would like to stop here and check on the response to what I have written so far.

First of all, does everyone agree with my definition of good code, that is, code that can be understood and changed easily?

Second, does everyone understand my concept of 'assumption independence', and agree that code that has this feature is 'good code'? (Now, I do not believe total 'assumption independence' is possible, but we should do the best we can; that is, we should make the features of a program that are most likely to change independent of one another, so that only small portions of the code need to be rewritten if a feature needs to be changed.)

Third, does everyone agree that present languages do not provide good tools for making design assumptions as independent as possible? I will come up with some examples if necessary, but I am far from familiar with many languages, so maybe someone knows of a language that provides good tools.

Finally, does anyone have any good ideas on what kind of tools are needed? I am particularly interested in the answer to this question, since I have not yet answered it to my satisfaction, but I do have some ideas. I am also interested in common problems a good language should solve cleanly (e.g. error handling, device independence, etc.), so let me hear from you.

I will write again on my ideas in trying to answer this last question in later postings if there is interest, but I will give you certain hints now.
May I say that I do NOT like the idea of a language with many features. It makes the language impossible to learn, and you will undoubtedly find that you forgot something. In some ways the language must be 'programmable', so that 'library implementors' can add features to the language as needed for a particular set of problems. This programmability has to be limited, however; otherwise every programmer programs his own variant and no one can read anyone else's code.

Well, that is enough of a brain teaser. I hope that there is interest in this question, because I think it is a VERY important and practical one. I also hope that I have given many intelligent people on the net some food for thought. So let's hear from you.

Vance Morrison
Northwestern Univ
morrison@northwestern.arpa
morrison@accuvax.nwu.edu
morrison@nuacc.bitnet
btb@ncoast.UUCP (Brad Banko) (02/01/88)
vance,

in response to your posting about 'good' code, i agree with your definition of good code being code that can be understood and changed easily, but there is an implicit assumption about a reader's experience and ability to understand code... clearly, a program doesn't have to document in complete detail the theory and nature of its algorithms, unless you would want to include chapters from Knuth in the header comment of every module.

i think that the modularity is important, and this is one place that pascal falls down, but apparently modula2 makes up for it... the really good aspect of pascal/modula2 is that complex data types are very easy to clearly declare... types of types, etc., and this is supported in the parameter definition syntax for subroutines, etc.

a major advantage of C is its 'economy of expression', which, when combined with modular programming habits (no module is larger than one page, top down design) makes it easy for a reader to see the 'big picture' of the code.

i don't think that these languages lack any tools to write good code, but the responsibility still lies with the programmer to practice modular programming. perhaps a first approximation to enforcing good code is to write a shell for writing source code that enforces a minimum documentation and modular programming standard... enforces a top down approach by limiting all code sections to one page, and enforces documentation by requiring the programmer to complete a minimum required header comment documenting all significant data structures.

i am not saying that this needs to be dictated to a programming team, but i am sure that any programming team familiar with large projects is familiar with these types of concepts and could decide on standards by committee.

--
Brad Banko
Columbus, Ohio
(formerly ...!decvax!cwruecmp!ncoast!btb)
btb%ncoast@mandrill.cwru.edu

"The only thing we have to fear on this planet is man." -- Carl Jung, 1875-1961
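
As a rough illustration of the kind of source-checking shell proposed in the post above, here is a minimal sketch in Python. It is not from the thread. Every name in it (`check_module`, `PAGE_LINES`, `MIN_HEADER_LINES`) is hypothetical, and it assumes that a "page" is 66 lines and that a header comment is a run of lines starting with `#` at the top of the module. A real tool would need language-specific rules for both.

```python
from typing import List

# Assumptions for illustration only: one "page" is 66 lines, and the
# header comment is the run of "#" lines at the very top of the module.
PAGE_LINES = 66
MIN_HEADER_LINES = 3


def check_module(source: str,
                 page_lines: int = PAGE_LINES,
                 min_header_lines: int = MIN_HEADER_LINES) -> List[str]:
    """Return a list of complaints about one module's source text."""
    complaints = []
    lines = source.splitlines()

    # Rule 1: no module is larger than one page.
    if len(lines) > page_lines:
        complaints.append(
            "module is %d lines long; the limit is %d"
            % (len(lines), page_lines))

    # Rule 2: the module must open with a minimum header comment.
    header = 0
    for line in lines:
        if not line.lstrip().startswith("#"):
            break
        header += 1
    if header < min_header_lines:
        complaints.append(
            "header comment has %d lines; at least %d are required"
            % (header, min_header_lines))

    return complaints
```

The function only reports problems and leaves it to the caller to decide whether a complaint is a warning or a hard error, so the standard itself can still be set by the team.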
morrison@nucsrl.UUCP (Vance Morrison) (02/08/88)
Hello Brad (et al.),

I am sure that I did not make myself clear when I described my concept of modularity. C, Modula-2, and Ada provide some tools that support modularity, and as you suggested, it is the programmer, not the language itself, that makes code modular. (A REAL PROGRAMMER can write FORTRAN in any language (:-) ) But I, as a programmer, can only make my code modular to the extent that the language lets me. Try to write a modular program in old versions of BASIC and you will see what I mean. Since BASIC does not have the concept of a procedure with parameter passing and local variables, it is quite painful to try to write modular programs in that language.

In the same way, I want to write code that is much more modular than common procedural languages (C, Ada, etc.) allow. I feel frustrated and believe that there must be a better way; the language just won't let me do it. That is why I am trying to design this next-generation language.

I believe an example is in order, because I am sure that most of you are thinking that the present tools are sufficient, if only they are used properly. Suppose I wanted to make a general-purpose matrix multiplication routine. My input is two matrices, and the routine will return the matrix product. The only things my routine needs to know about the matrices are how to find their dimensions, how to index them, and how to add and multiply the individual elements. I want this routine to work on ANY data structure that has these operations defined. I DON'T want to assume that the matrices are arrays (for sparse or symmetric matrices it may make more sense to store them in other ways). This is impossible in most present languages (although Ada can do this with generic packages, with Ada you run into problems with the strong typing).

Now some may say that this is a non-problem: since no one would ever need a module that generic, who needs a language that allows you to make such a module?
I would respond that I have talked to a few BASIC programmers who have no conception of how useful a procedure is.

My concept of modularity boils down to these two simple rules:

1) A module should contain in its interface definition ALL the information it needs. All assumptions about the data a module operates on should be stated in the interface definition. This rule makes the module independent of any other module, and allows the programmer to treat the module as a 'black box' where only the interface definition is visible.

2) A module should make ONLY those assumptions that it needs to do its job. In the above example, restricting the matrices to have only integer elements is a needless assumption, since the operation of matrix multiplication can operate on ANY data type that has multiplication and addition operations defined. (The matrix elements could be polynomials, and the SAME code could be used to multiply the matrices.)

There are other 'nice' features that my language should have. For instance, the interfaces should be designed to be easily modifiable and flexible, so that the human effort needed to design and hook modules together is minimized, but these are 'extra' features.

Present languages allow us to meet the first criterion IF THE PROGRAMMER DESIGNS HIS CODE WELL. The second requirement is where present languages are found lacking. I believe ANY 'flat typing' (that is, a typing that cannot be hierarchically ordered), like Ada's, is not flexible enough. Object-oriented languages do better, but become a problem when the hierarchy of the types needs to be changed.

My goal is to design a language that meets the above two goals (as well as several others). I hope to get some of you also thinking about these problems and to bounce my ideas off you.

-----------------------------------------------------------------------

This passage is unrelated to the above.

You mentioned some possibilities for trying to encourage programmers to design good code.
I would like to put in my 2 cents worth. I believe that the LANGUAGE should not FORCE any code design practices on the programmer (it may encourage them). That is the job of the programming environment, and ONLY at a human's request. For instance, you could implement an editor or a compiler switch that warns about or even prohibits code that does not meet certain quality standards. But the decision to use that feature should be in the hands of the programmer (or program team leader, etc.). The reason is that for most rules you propose, there is often some time when it is sensible to violate them. For instance, when I write small, one-shot programs, I do not comment them, and I would become quite mad at any editor/compiler that forced me to do it.

My philosophy is that computers should help people, not restrict them. I remember all the time I wasted trying to get around certain typing restrictions in Pascal. (I had a legitimate reason: I was interfacing with VMS system services.) I found it quite ironic that I was spending a lot of time getting around restrictions that the compiler programmers spent a lot of time putting in place. We have enough trouble overcoming TRUE restrictions; let's not create any more of our own making.

Vance Morrison
morrison@accuvax.nwu.edu
morrison@nuacc.bitnet
morrison@northwestern.arpa
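
Returning to the matrix multiplication example earlier in this post, here is a minimal sketch of such a routine in Python. It is an illustration, not the language the post proposes. The `Matrix` interface and the `DenseMatrix`, `SparseMatrix`, and `Poly` classes are hypothetical names invented for the sketch. The routine assumes only what the post lists: a way to find the dimensions, a way to index, and `+` and `*` on the elements.

```python
from typing import Any, List, Protocol, Tuple


class Matrix(Protocol):
    """Everything matmul assumes about its inputs (rule 1)."""
    def dims(self) -> Tuple[int, int]: ...
    def get(self, i: int, j: int) -> Any: ...


def matmul(a: Matrix, b: Matrix) -> List[List[Any]]:
    """Multiply a by b using only dims(), get(), and element + and *."""
    n, m = a.dims()
    m2, p = b.dims()
    if m != m2 or m == 0:
        raise ValueError("inner dimensions must match and be nonzero")
    product = []
    for i in range(n):
        row = []
        for j in range(p):
            # Start from the first term so no "zero" element is needed.
            total = a.get(i, 0) * b.get(0, j)
            for k in range(1, m):
                total = total + a.get(i, k) * b.get(k, j)
            row.append(total)
        product.append(row)
    return product


class DenseMatrix:
    """Stores every element in a list of rows."""
    def __init__(self, rows):
        self.rows = rows

    def dims(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def get(self, i, j):
        return self.rows[i][j]


class SparseMatrix:
    """Stores only the nonzero cells, keyed by (row, column)."""
    def __init__(self, nrows, ncols, cells, zero=0):
        self.nrows, self.ncols = nrows, ncols
        self.cells, self.zero = cells, zero

    def dims(self):
        return (self.nrows, self.ncols)

    def get(self, i, j):
        return self.cells.get((i, j), self.zero)


class Poly:
    """Polynomial given by its coefficients, lowest degree first."""
    def __init__(self, *coeffs):
        self.coeffs = list(coeffs)

    def __add__(self, other):
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return Poly(*(x + y for x, y in zip(a, b)))

    def __mul__(self, other):
        out = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, x in enumerate(self.coeffs):
            for j, y in enumerate(other.coeffs):
                out[i + j] += x * y
        return Poly(*out)

    def __eq__(self, other):
        return self.coeffs == other.coeffs
```

The same `matmul` accepts a `SparseMatrix` on one side and a `DenseMatrix` on the other, and it works unchanged when the elements are `Poly` objects instead of integers. To keep the sketch short it returns the product as a plain list of rows rather than as a matrix of the callers' own kind. Python gets this flexibility from dynamic typing, so it does not by itself settle the typing concerns raised in the post.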