CWatts@BNR.CA (Carl Watts) (06/27/91)
Gray Huggins of Texas Instruments recently sent me mail asking questions and expressing concerns about Smalltalk scalability. The example given was the limitation on the number of elements in an IdentityDictionary. My response (which follows) would, I think, be of interest to other members of comp.lang.smalltalk and comp.object:

Gray,

Smalltalk scales better than any other language I know of. The reason is simple: you can model everything from the Space Shuttle to an Integer the same way, as an object (that perhaps uses other objects) that provides an encapsulated interface to some behavior. A Space Shuttle object may use thousands of other objects to provide its behavior, and an Integer may use no other objects at all, but the ways in which you treat the two objects are pretty much the same. They are both Objects, and they both have a message interface allowing interaction with them.

Smalltalk was designed from the start to not have any problems scaling. What other language do you know of that can as easily handle 2 + 3 as it can evaluate:

958685745856745747465784756948574847584737384877584 +
9293458823458723745734875847574874875748874374373646

Smalltalk tells you the answer to the first is 5 and the answer to the second is 10252144569315469493200660604523449723333611759251230. Smalltalk was the first general-purpose language I know of that allowed infinite-precision Integer arithmetic. It's thanks to the classes Integer, LargePositiveInteger, and LargeNegativeInteger.
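For example, here is the sort of thing you can type into a workspace and print (a quick illustrative sketch; it assumes nothing beyond the standard SmallInteger maxVal and Integer factorial messages, with the answers shown as comments):

    2 + 3.
        "answer: 5, a SmallInteger"

    958685745856745747465784756948574847584737384877584
        + 9293458823458723745734875847574874875748874374373646.
        "answer: 10252144569315469493200660604523449723333611759251230,
         a LargePositiveInteger"

    (SmallInteger maxVal + 1) class.
        "answer: LargePositiveInteger; overflow quietly promotes the result"

    100 factorial printString size.
        "answer: 158; the 158-digit result is computed exactly"

You never have to ask for the promotion; the arithmetic classes arrange it among themselves.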
The IdentityDictionary problem is extremely easy to solve. Just get out of the mindset that the classes that come with Smalltalk are somehow holier-than-holy and by definition perfect and complete. They aren't. Every class in Smalltalk is implemented to fulfill a certain purpose, and certain implementation compromises and design choices are always made. When IdentityDictionary was implemented, some of the compromises and design choices were:

1) Speed over space. Hashing was used for fast access at the cost of additional space.

2) A single object with indexable instance variables, rather than some tree-based representation that costs more objects and whose benefits only become apparent when a large number of items are in the Dictionary.

The class was not meant to be the perfect implementation of IdentityDictionary for all possible uses.

The answer to your problem should be obvious now: just change IdentityDictionary, or make a new kind of IdentityDictionary. Simple. Now, of course, the question comes up: what new design compromises and design choices do you want to make for this new class?

a) Do you want to change the implementation of IdentityDictionary itself to use, say, a balanced-tree representation?

b) Do you want IdentityDictionarys to turn into LargeIdentityDictionarys when they reach a certain number of elements (much as a SmallInteger turns into a LargePositiveInteger when it gets big enough)? The LargeIdentityDictionary can then use a different representation for storage that is more appropriate when you have large numbers of elements.

Regardless of the first of these choices, what new representation do you want? Multiple Arrays? Trees of IdentityDictionarys? Self-balancing splay trees of node elements? With each decision you are making new design decisions for your new kind of IdentityDictionary. Your new class will serve some particular need. IdentityDictionary was constructed to serve the most common need for a class like this: a fast, relatively efficient IdentityDictionary. It didn't promise to be all things to all people.

We chose to write a new kind of IdentityDictionary called LargeIdentityDictionary. IdentityDictionarys turn themselves into LargeIdentityDictionarys when they have more than 16000 elements in them, and the converse also happens: if a LargeIdentityDictionary gets fewer than 8000 elements in it, it turns itself back into a normal IdentityDictionary. A LargeIdentityDictionary uses a different method of storing elements which has no limit on the number of elements. What you pay for this is slightly slower access: access time goes from O(1) to O(log n). Still very fast, but slower than the hashing scheme employed by normal IdentityDictionarys.

Hope this sheds some light.
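P.S. To make option (b) above a little more concrete, here is a rough sketch of what the promotion could look like. This is only an illustration, not our actual LargeIdentityDictionary code; the copy loop, the capacity hint, and the use of become: are assumptions about one possible implementation. A method like this would be sent from at:put: after each store:

    promoteIfTooLarge
        "Sketch only: if the receiver has grown past the threshold,
         copy its contents into a LargeIdentityDictionary and swap
         identities with it.  become: makes every existing reference
         to the receiver denote the large dictionary from now on."
        | large |
        self size > 16000 ifFalse: [^self].
        large := LargeIdentityDictionary new: self size.
        self associationsDo: [:assoc | large at: assoc key put: assoc value].
        self become: large

The shrinking direction (dropping back below 8000 elements) would be the mirror image, triggered from the removal methods of LargeIdentityDictionary.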
scott@coyote.trw.com (Scott Simpson) (06/28/91)
In article <1991Jun26.193441.28581@bqnes74.bnr.ca> CWatts@BNR.CA (Carl Watts) writes:
>Smalltalk scales better than any other language I know of. The
>reason is simple: you can model everything from the Space Shuttle to an
>Integer the same way, as an object (that perhaps uses other objects)
>[ More comments about how integers and dictionaries are unbounded in
> Smalltalk. ]

In that sense of scalability, Smalltalk fares well. But when I hear the term scalability, I think of bigger issues, such as how the Smalltalk *environment* scales up to large projects of hundreds of thousands or millions of lines. In that case, Smalltalk fares very poorly. Large projects require multiple-user concurrency, distributed data, schema evolution, versioning, support for persistent storage and manageability of non-code artifacts, support for testing and maintenance, and support for process. In all of these areas the Smalltalk environment performs poorly or not at all.
--
Scott Simpson        TRW        scott@coyote.trw.com
dlw@odi.com (Dan Weinreb) (06/28/91)
In article <1991Jun26.193441.28581@bqnes74.bnr.ca> CWatts@BNR.CA (Carl Watts) writes:
> What other language do you know of that can as easily handle 2 + 3
> as it can evaluate:
> 958685745856745747465784756948574847584737384877584 +
> 9293458823458723745734875847574874875748874374373646
Since you ask, Lisp. Whether Lisp or Smalltalk was doing arbitrary
precision integer arithmetic first would require some careful
research, but I don't think it's worth worrying about.
None of which has anything to do with the point you were making.
objtch@extro.ucc.su.OZ.AU (Peter Goodall) (06/28/91)
scott@coyote.trw.com (Scott Simpson) writes:
>In article <1991Jun26.193441.28581@bqnes74.bnr.ca> CWatts@BNR.CA (Carl Watts) writes:
>>Smalltalk scales better than any other language I know of. The
>>reason is simple: you can model everything from the Space Shuttle to an
>>Integer the same way, as an object (that perhaps uses other objects)
>>[ More comments about how integers and dictionaries are unbounded in
>> Smalltalk. ]
>In that sense of scalability, Smalltalk fares well. But when I hear
>the term scalability, I think of bigger issues, such as how the
>Smalltalk *environment* scales up to large projects of hundreds of
>thousands or millions of lines. In that case, Smalltalk fares very
>poorly. Large projects require multiple-user concurrency, distributed
>data, schema evolution, versioning, support for persistent storage and
>manageability of non-code artifacts, support for testing and
>maintenance, and support for process. In all of these areas the
>Smalltalk environment performs poorly or not at all.
>--
>Scott Simpson        TRW        scott@coyote.trw.com

As distributed, the Digitalk and ParcPlace Smalltalks certainly don't have the tools for large projects. There is, however, no intrinsic problem with multi-user development: extend the environment to provide the tools. Instantiations, OTI and SoftPert Systems all have Smalltalk environment extensions for team configuration management.
--
Peter Goodall - Smalltalk Systems Consultant - objtch@extro.ucc.su.oz.au
ObjecTech Pty. Ltd. - Software Tools, Training, and Advice
162 Burns Bay Rd, LANE COVE, NSW, AUSTRALIA. - Phone/Fax: +61 2 418-7433
Will@cup.portal.com (Will E Estes) (06/28/91)
<In that sense of scalability, Smalltalk fares well. But when I hear
<the term scalability, I think of bigger issues, such as how the
<Smalltalk *environment* scales up to large projects of hundreds of
<thousands or millions of lines. In that case, Smalltalk fares very
<poorly. Large projects require multiple-user concurrency, distributed
<data, schema evolution, versioning, support for persistent storage and
<manageability of non-code artifacts, support for testing and
<maintenance, and support for process. In all of these areas the
<Smalltalk environment performs poorly or not at all.
<--
<Scott Simpson        TRW        scott@coyote.trw.com

I think these are all extremely good points. I have yet to meet a programmer who has worked on a project of any size who would not agree that the image is far too large a granularity at which to save changes to a system. When developing C programs for MS-DOS or UNIX, would we accept any scheme that required us to save changes to code modules in a monolithic library with core operating system routines, and required us to save the entire OS to effect a change? Smalltalk is much like an operating system, and if we would not accept such a large granularity of change for another OS, why do we accept it for Smalltalk?

I for one am very concerned about this issue, and I feel it is the one and only impediment to wide-scale commercial use of the Smalltalk language. What concerns me most is that Digitalk and ParcPlace seem to be doing very little to address it. On the CompuServe forum, when I and others have raised this issue, it either goes without any response at all from Digitalk, or it gets the response "we can't fix these problems without changing the Smalltalk definition." I have two problems with that response: 1) it tells me that Digitalk is more concerned with strict adherence to a standard than with solving its customers' business problems; 2) it tells me that Digitalk doesn't have a proper sense of proportion about just how bad this problem really is, and just how much it cuts into their potential sales. I think if someone could prove to them that their sales might increase by a factor of five or more if they addressed this issue effectively, they might start to sing a different tune.

I know that there are third-party products to address version control and multi-user, networked use of an image by programmers making changes to that image, but frankly that just isn't good enough. This whole area points to problems with the language definition, and the solution needs to come from the language vendor and be integrated thoroughly into the core product. I do not know any responsible MIS manager who is going to risk a large-scale, long-term project on a third-party utility for a language which is itself considered leading-edge and somewhat risky.

What can we do to convince the Smalltalk vendors that this problem merits significant, and immediate, attention? I think the primary reason the vendors do not see the magnitude of the problem is that most of the early adopters of Smalltalk technology are single-person shops or small prototyping groups in larger companies that only require a tool one person can use effectively. But to define the customer as one person working on his own is to miss the real mass-market opportunity for Smalltalk: the hordes of large-team Cobol programmers who weigh down most MIS shops.
Smalltalk as it stands today is not a substitute for Cobol on very large projects, and that is a damn shame, because it could and should be.

Will Estes
Internet: Will@cup.portal.com
UUCP: apple!cup.portal.com!Will
marti@mint.inf.ethz.ch (Robert Marti) (06/28/91)
In article <1991Jun26.193441.28581@bqnes74.bnr.ca> CWatts@BNR.CA (Carl Watts) writes:
>Smalltalk was designed from the start to not have any problems
>scaling. What other language do you know of that can as easily
>handle 2 + 3 as it can evaluate:
>958685745856745747465784756948574847584737384877584 +
>9293458823458723745734875847574874875748874374373646

How about
- Lisp
- Scheme (OK, so Scheme is a Lisp dialect)
- some implementations of Prolog
- Mathematica
- Maple?

If you have a C++ library which includes something like class arbint in Tony L. Hansen's "The C++ Answer Book" (pp. 276-308), you could add the following member function to class arbint:

    arbint(char *string);   // convert string into a newly created arbint

so that you could at least write something like

    arbint("958685745856745747465784756948574847584737384877584") +
        arbint("9293458823458723745734875847574874875748874374373646")

(the explicit conversions are needed because adding two bare string literals is not defined in C++).

Robert Marti                        | Phone:   +41 1 254 72 60
Institut fur Informationssysteme    | FAX:     +41 1 262 39 73
ETH-Zentrum                         | E-Mail:  marti@inf.ethz.ch
CH-8092 Zurich, Switzerland         |
CWatts@BNR.CA (Carl Watts) (06/28/91)
"... the image is far too large a granularity at which to save changes to a system..."
"... one and only impediment to wide-scale commercial use of the Smalltalk language..."
"This whole area points to problems with the language definition ..."

My, my, my, Will, what an expert you've become on Smalltalk, its problems, and what needs to be done, considering that only two days ago I had to explain to you the difference between a String and a Symbol and how a Dictionary works. But the concerns you mention are very common among neophyte Smalltalk users. I had many of the same problems when I first started using Smalltalk 4 years ago.

One thing to remember is that Smalltalk is a fundamentally different kind of beast from languages like Fortran, COBOL, Pascal, C, C++, etc. Smalltalk has a different purpose. Comparing the two is like the proverbial comparing of Apples and Oranges, and then lamenting that Oranges have a thick outer covering you have to remove before you can eat them and that you can't make applesauce with them. Smalltalk has much more in common with Lisp and APL. Both of those languages have virtual machines and the idea of an image (called a Workspace in APL).

The problems you are lamenting about Smalltalk are not things that any other language (that I can think of) integrates solutions for into itself. C is perhaps the classic example. It has no integrated version control or multi-user development environment support inherent in the language. It defines no granularity of change inherent in the language. The language itself has no inherent development support at all. All of that must be provided by outside applications (typically in Unix, if at all).

Smalltalk was meant to be its own operating system, language, development environment, everything! That's why there is a "Virtual Machine". An image is not an inherent concept of the Smalltalk language; it is what is needed when operating under a foreign operating system. When Smalltalk was developed, no machine could provide the support needed, so a virtual machine was built to emulate the real machine. The virtual image represents a snapshot of what the memory of this virtual machine looked like when it was last running. This is common in other languages like Lisp, APL, etc.

It almost makes me want to cry when someone says they want to turn Smalltalk into something like C or Pascal, where you write canned applications that run under a foreign operating system like Unix or OS/2. That was not the point of Smalltalk, though it is the point of something like C. If that's what you want to do, then you can do it, but you will be pounding Smalltalk's round peg into a square hole. You can do it if you pound hard enough. And considering the beauty of Smalltalk compared to C, it's still usually better than using C's square peg.

This from someone whose department supported 50 developers working on the same application, all in Smalltalk, with fewer multiple-developer-conflict problems than I would have thought possible in any development environment. This is thanks to the encapsulation that Smalltalk classes support so well, and to some small development environment tools (based on ChangeSets) that we developed in Smalltalk to manage many concurrent developers.

As Peter Goodall pointed out, there are several excellent commercial products available that provide extremely sophisticated support for large multi-developer groups. And these are far more sophisticated than Unix and all the standard Unix development tools combined.