smith%eri.DECnet@MGHCCC.HARVARD.EDU ("ERI::SMITH") (08/03/87)
> I am puzzled why VMS, a virtual memory operating system, imposes
> limits on the number of command lines that DCL remembers and on
> the length of the command prompt.

>Gee, this is just like asking "Why does Jake's program not read my file
>with 190 byte input lines?" when Jake's program was built to read 80
>character "card images" and has an 80 byte input buffer. There have
>to be limits someplace; the people designing/implementing DCL thought
>that 20 command lines and 1024 characters per command line were reasonable
>limits.

A whole area of software engineering, and one that I haven't seen discussed much, is the art of deciding whether there have to be limits and, if so, what they should be. This is an IMPORTANT area because it is a COMMON source of problems in porting programs. It is also a common limitation in the lifetime of computer systems!

There do not HAVE to be limits--you can use pointer structures to guarantee that all of available memory can be used for any construct, if necessary. For example, ANS Pascal specifies NO limits on identifier length, and ALL characters are significant. At least, Think Technologies says so, and calls out their own limitation to 255 characters as an exception. I dunno what VAX PASCAL does.

How far should an implementor go to try to meet the ANS requirement? Should identifiers of more than 255 characters be permitted, at the cost of using a word rather than a byte somewhere? How about more than 65K? Why shouldn't I be able to make use of the full VMS virtual memory address space? Why should I be limited to a measly 4 gigabytes or so--why can't my identifier name be as long as I have disk storage for?

Still, it's hard to believe that everyone will use arbitrary-length multi-precision integers for everything (to say nothing of dynamic storage allocation for everything). OK, so if that's silly (and perhaps it is) how do you make a rational decision as to where to draw the line?
Would it have been better for the authors of the PASCAL standard to pick a number like 256 or 8192 and get some vendors mad at them ("you DELIBERATELY made it 8192 just to make it hard on 12-bit PDP-8 implementors")?

Kernighan and Pike ("The Unix Programming Environment", p. 47) note that "There are implementation limitations with most programs that expect text as input. We tested a number of programs on a 30,000 byte text file containing no newlines, and surprisingly few behaved properly, because most programs make unadvertised assumptions about the maximum length of a line of text." In UNIX, of course, which uses what I call the paper-tape concept rather than the 80-column card concept, a 30,000-column line is SUPPOSED to be acceptable. (I wonder whether any of their programs that worked on 30,000 bytes would fail on 32,769? or 65,540?). Under VMS, of course, lots would break on 81 characters, lots more at 133, and most of the rest at 257...

It is interesting to note how conservative architects have been, in the face of memory technology that continues to deliver a factor of two every two years. It seems that typically, the ratio between the smallest memory configuration in the pioneer machine in a family and the place where the architecture hits a wall is in the range of 16 or so. That means that the family of machines starts to get kludgey and self-destructs from the weight of accumulated bandaids in just 8 or 10 years.

Examples:
PDP-8, minimum 4K, hits the wall at 32K.
PDP-11, minimum 16K, hits a wall at 64K or 256K depending on what kludges you tolerate.
IBM PC, minimum 16K (what? you don't remember?), hits the wall at 640K.
Macintosh, curiously enough, despite an obvious opportunity to score with the 68000: minimum 128K, hits a wall at 4 meg.
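The "8 or 10 years" figure can be checked with a little arithmetic. The sketch below assumes, as the post does, one doubling of memory every two years, and uses the minimum and wall figures quoted above (all in K, with "4 meg" taken as 4096K). The function name and the table layout are made up for this illustration.

```python
import math

DOUBLING_PERIOD_YEARS = 2  # assumption from the post: a factor of two every two years


def years_to_wall(minimum_k, wall_k, period=DOUBLING_PERIOD_YEARS):
    """Years for memory to grow from minimum_k to wall_k at one doubling per period."""
    return period * math.log2(wall_k / minimum_k)


# (machine, minimum K, wall K) as quoted in the post
families = [
    ("PDP-8", 4, 32),
    ("PDP-11", 16, 64),
    ("PDP-11 with kludges", 16, 256),
    ("IBM PC", 16, 640),
    ("Macintosh", 128, 4096),
]

for name, minimum_k, wall_k in families:
    ratio = wall_k / minimum_k
    print(f"{name}: ratio {ratio:g}, about {years_to_wall(minimum_k, wall_k):.1f} years")
```

A ratio of 16 is four doublings, which at two years per doubling gives the eight years mentioned above; a ratio of 32 gives ten.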
I don't know the right figures for the VAX, which comes off looking pretty good, but clearly the thing that will kill off 32-bit machines will be the existence of more than 4 gigabytes of memory and good reasons for wanting to address them cleanly. (Wanting to address that entire CD-ROM from FORTRAN as an array).

What everyone does, in a situation where they don't know how much of something will be needed, is to take a hot guess, based on what THEY need NOW, what's expedient, etc. Then you don't document it, because surely 2048 is more than ANYONE is ever going to need.

I'll betcha that a clean approach to managing this kind of issue would do as much for program portability and system longevity as goto-less code, or object-oriented programming, or whatever today's buzzword is...

--------------------------------------------------------------------
Daniel P. B. Smith          ARPA: smith%eri.decnet@mghccc.harvard.edu
Eye Research Institute      CompuServe: 74706,661
20 Staniford Street         Telephone (voice): 617 742-3140
Boston, MA 02114
--------------------------------------------------------------------
"We are in great haste to construct a magnetic telegraph from Maine to Texas; but Maine and Texas, it may be, have nothing important to communicate."--Thoreau
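One way to act on the complaint in this post is to make every limit explicit, documented, and loud when it is exceeded. The sketch below is illustrative only: `MAX_LINE_BYTES`, `LineTooLong`, and `read_lines` are hypothetical names, not part of DCL, VMS, or any UNIX tool. It reruns the Kernighan and Pike experiment on a 30,000-byte input containing no newlines.

```python
import io

MAX_LINE_BYTES = 1024  # a documented limit; the value is arbitrary for this sketch


class LineTooLong(Exception):
    """Raised instead of silently truncating a line."""


def read_lines(stream, limit=None):
    """Yield lines (without the trailing LF) from a binary stream.

    With limit=None the only bound is available memory. With an integer
    limit, a longer line raises LineTooLong rather than being cut short.
    """
    for line in stream:
        body = line.rstrip(b"\n")
        if limit is not None and len(body) > limit:
            raise LineTooLong(f"line of {len(body)} bytes exceeds limit of {limit}")
        yield body


long_input = b"x" * 30000  # 30,000 bytes, no newline

# Unlimited reader: the whole line comes back intact.
lines = list(read_lines(io.BytesIO(long_input)))
print(len(lines), len(lines[0]))

# Limited reader: the limit is reported, not hidden.
try:
    list(read_lines(io.BytesIO(long_input), limit=MAX_LINE_BYTES))
except LineTooLong as err:
    print("rejected:", err)
```

Whether the right default is "no limit" or "a stated limit" is exactly the judgment call the post is about; the sketch only shows that either choice can be advertised rather than discovered by accident.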
RALPH@UHHEPG.BITNET (08/06/87)
Date: 4-AUG-1987 21:22:20.01
From: Ralph Becker-Szendy RALPH AT UHHEPG
To: B_INFOVAX,RALPH
Subj: Re: Size limitations

Hi everyone

Even at the risk of creating another "metaphysical" discussion: (BTW, i apologize for even having my own point of view about hackers ...)

Daniel Smith is right IN PRINCIPLE: a well designed system should not impose artificial limitations just for the sake of ease of implementation. What the system can do (for you) should be limited only by its architecture. Most of the restrictions (like: 6 character identifiers in old FORTRAN, upper case source only for some languages, the infamous 19 continuation lines for IBM FORTRAN compilers) are caused by mentally retired software designers sticking to their old prejudices (where >90% of IBM as a whole is MENTALLY RETIRED, and DEC is on the way down the hill).

On the other hand, systems are implemented by people, who are a scarce resource. I agree, 20 commands in the recall-stack is a shame. But, on the other hand, a stack area for 20 commands of 255 bytes each is much easier to implement than a whole pointer structure with all the complications of virtual memory. When i write a program, i usually declare an 80-character string for command input. Yes, in principle i should just declare a varying-length string for it, but that's such a hassle in FORTRAN, and just not worth my time.

Think about it the following way: maybe the time the programmer saved by having only 20 commands in the stack went into useful features of the system.

Ralph Becker-Szendy
University of Hawaii / High Energy Physics Group

Disclaimer: The views expressed here are probably not endorsed by my employer. I hardly ever actually speak to my employer. Even our system manager stops smiling when i come by.
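The trade-off described in this post can be made concrete. The sketch below contrasts a fixed recall buffer with an unbounded one; `RecallBuffer` is a hypothetical class written for this illustration and says nothing about how DCL actually stores its recall stack. Only the figure of 20 commands comes from the discussion.

```python
from collections import deque


class RecallBuffer:
    """Command recall history (hypothetical; not DCL's implementation).

    max_commands=None grows until memory runs out; an integer gives a
    fixed-size buffer that silently forgets the oldest command.
    """

    def __init__(self, max_commands=20):
        self._commands = deque(maxlen=max_commands)

    def remember(self, command):
        self._commands.append(command)

    def recall(self, n=1):
        """Return the n-th most recent command (1 = the last one entered)."""
        if not 1 <= n <= len(self._commands):
            raise IndexError(f"only {len(self._commands)} commands remembered")
        return self._commands[-n]


fixed = RecallBuffer(max_commands=20)
unbounded = RecallBuffer(max_commands=None)
for i in range(1, 26):
    command = f"COMMAND {i}"
    fixed.remember(command)
    unbounded.remember(command)

print(fixed.recall(20))      # oldest command the fixed buffer still holds
print(unbounded.recall(25))  # the unbounded buffer still has the first one
```

In a language with built-in dynamic containers the two variants cost about the same to write, so how expensive it is to lift a limit depends heavily on the tools at hand.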
minow@decvax.UUCP (Martin Minow) (08/06/87)
In article <smith%eri.decnet@mghccc.harvard.edu> writes:
>...
>It is interesting to note how conservative architects have been, in the
>face of memory technology that continues to deliver a factor of two every
>two years. It seems that typically, the ratio between the smallest
>memory configuration in the pioneer machine in a family and the place where
>the architecture hits a wall is usually in the range of 16 or so.
>
>Examples: PDP-11, minimum 16K, hits a wall at 64 or 256 depending on what
>kludges you tolerate.

Ahh, how soon they forget. Quoting from the PDP-11 Programming Handbook (2nd edition, 1969):

"The PDP-11 is available in two versions designated as PDP-11/10 and PDP-11/20. The PDP-11/10 contains ... 1,024 words of 16-bit read-only memory, and 128 16-bit words of read-write memory. The basic PDP-11/20 contains ... 4,096 words of 16-bit read-write core memory, a programmer's console, and an ASR-33 Teletype."

Note that this was the original PDP-11/10 (I don't know if any were actually manufactured), not the built-like-a-tank model from 1973 or so. [And, yes, the manual is still useful.]

Back in 1969, Dec had 36 offices in the United States, 5 in Canada, and 17 in Europe, Japan, and Australia. We've grown a bit since then.

Martin Minow
decvax!minow
sommar@enea.UUCP (Erland Sommarskog) (08/12/87)
In a recent article "ERI::SMITH" <smith%eri.decnet@mghccc.harvard.edu> writes:
>Kernighan and Pike ("The Unix Programming Environment", p. 47) note that
>"There are implementation limitations with most programs that expect text
>as input. We tested a number of programs on a 30,000 byte text file
>containing no newlines, and surprisingly few behaved properly, because
>most programs make unadvertised assumptions about the maximum length of a
>line of text." In UNIX, of course, which uses what I call the paper-tape
>concept rather than the 80-column card concept, a 30,000-column line is
>SUPPOSED to be acceptable. (I wonder whether any of their programs that
>worked on 30,000 bytes would fail on 32,769? or 65,540?). Under VMS,
>of course, lots would break on 81 characters, lots more at 133, and most
>of the rest at 257...

But does that mean that Unix imposes no limitations on what you can do? No. Yes, you can have lines that are a million characters long, no problem. But what if you want a line-feed character inside a text string, without the character meaning "new line"? As far as I know, this is not possible.

The conclusion is that no matter what you do, you may get into problems. Since you need *some* convention to indicate new lines, you must introduce *some* restriction. VMS restricts you in size, but permits LFs in a text string. Unix does it the other way around. Which you prefer is a matter of taste.

(And by the way, VMS does know of the stream-LF format too. Unix knows only of stream-LF.)

--
Erland Sommarskog
ENEA Data, Stockholm
sommar@enea.UUCP
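The two conventions contrasted in this post can be put side by side. The sketch below is an illustration only: the 2-byte length prefix is an assumed format chosen for the example, not a description of the actual VMS record layout, and all function names are hypothetical.

```python
import struct

MAX_RECORD = 0xFFFF  # largest length a 2-byte unsigned prefix can hold


def write_counted(records):
    """Length-prefixed records: any byte may appear, but size is capped."""
    out = bytearray()
    for rec in records:
        if len(rec) > MAX_RECORD:
            raise ValueError(f"record of {len(rec)} bytes exceeds {MAX_RECORD}")
        out += struct.pack(">H", len(rec)) + rec
    return bytes(out)


def read_counted(data):
    """Decode records written by write_counted."""
    records, pos = [], 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        records.append(data[pos:pos + length])
        pos += length
    return records


def write_stream_lf(records):
    """LF-terminated records: any length, but LF cannot appear inside one."""
    for rec in records:
        if b"\n" in rec:
            raise ValueError("record contains LF, which is the terminator")
    return b"".join(rec + b"\n" for rec in records)


def read_stream_lf(data):
    """Decode records written by write_stream_lf."""
    return data.split(b"\n")[:-1]  # drop the empty piece after the final LF


with_lf = [b"first\nhalf", b"second"]
assert read_counted(write_counted(with_lf)) == with_lf  # embedded LF survives

very_long = [b"x" * 1_000_000]
assert read_stream_lf(write_stream_lf(very_long)) == very_long  # no size cap
```

Each writer rejects exactly the input the other one accepts, which is the trade described above. Escaping the terminator would lift the stream-LF restriction, but only by introducing another convention with restrictions of its own.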