bmc@argus.UUCP (Bob Czech) (07/03/87)
I've been working on a variation of the smalltalk model and in studying the VM I've found that if you have a string constant in a method and assign it to an instance variable and somewhere down the line you make a modification to that string, that the constant in the method would hence change. Is this correct? And if so, I see it as a major flaw that you would be able to modify a constant!
rentsch@unc.cs.unc.edu (Tim Rentsch) (07/04/87)
In article <938@argus.UUCP> bmc@argus.UUCP (Bob Czech) writes:
> I've been working on a variation of the smalltalk model and in
> studying the VM I've found that if you have a string constant in a
> method and assign it to an instance variable and somewhere down the
> line you make a modification to that string, that the constant in the
> method would hence change.  Is this correct?  And if so, I see it as
> a major flaw that you would be able to modify a constant!

That is correct, even though somewhat surprising when first encountered.  The easiest way to correct the undesired behavior is to use

    iv <- 'foo' copy.

rather than

    iv <- 'foo'.

to do the assignment.

This feature may indeed be a flaw from the programming language design point of view, but it is not wrong from the programming language definition point of view.  The token between single quotes is a string *literal*, not a string *constant*.  The distinction is shown clearly as far back as System/360 assembly language.  Consider the code fragment

    LA   1,5             put the number 5 in register 1
    ST   1,=F'123456'    store the result over the literal
    L    2,=F'123456'    load the value in the literal cell
    ...

Pretty obviously this code will result in the value 5 ending up in register 2.  (No argument about whether this is good or bad coding style, the programmer obviously should be shot.)

As long as you remember that objects are actually out there for literals, this all makes perfect sense, and applies to other literals such as Symbols and Arrays.

All of the problems mentioned so far are basically with at:put:, either on the literal itself or its contained objects in the case of Arrays.  But this is not the only problem -- use of become: on a literal will also proceed as usual, and almost certainly wreak havoc of some sort, if not on the program then on the sanity of the person working on the code.

My conclusion: not an altogether satisfactory mechanism, but literals are still very useful, and I can't immediately suggest anything better.  Any ideas?

cheers,

Tim
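
A minimal sketch of the behavior described above, assuming a Smalltalk-80 style image in which literals are ordinary mutable objects. The selectors greeting, clobber and safeGreeting are made up for illustration and would be added to any convenient class; the code uses the thread's _ for assignment.

```smalltalk
"Three hypothetical methods, to be entered one at a time in a browser.
 Substitute := for _ in a dialect that requires it."

greeting
    "Answer the literal itself: the same String object on every call."
    ^'Hi'

clobber
    "Modify whatever greeting answers, then fetch it again."
    self greeting at: 2 put: $o.
    ^self greeting

safeGreeting
    "Answer a fresh copy, so callers never get hold of the literal."
    ^'Hi' copy
```

Where literals are mutable, the at:put: in clobber lands on the object stored with the compiled method for greeting, so later sends of greeting see the modified string even though the source still reads 'Hi'. With safeGreeting the same at:put: would touch only the copy. A dialect that marks literals read-only would be expected to signal an error at the at:put: instead.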
steele@unc.cs.unc.edu (Oliver Steele) (07/04/87)
From steele Sat Jul 4 10:29:54 EDT 1987

In article <938@argus.UUCP> bmc@argus.UUCP (Bob Czech) writes:
>
>	I've been working on a variation of the smalltalk model and in studying
>the VM I've found that if you have a string constant in a method and assign it
>to an instance variable and somewhere down the line you make a modification to
>that string, that the constant in the method would hence change.  Is this
>correct?  And if so, I see it as a major flaw that you would be able to modify
>a constant!

Except for the term "string _constant_", it is correct.  Strings and other arrays (you will get the same behavior if you use #(this is a test) in a method and later 'at: 1 put: #that' it) are not constants in Smalltalk.

This behavior is true of most languages that I can think of.  Try:

TRS-80 Disk BASIC:

    10 A$ = "Hi"
    20 LSET A$ = "Ho"
    RUN
    LIST

C on any machine without an MMU:

    char a[] = "Hi";
    main() { a[1] = 'o'; puts(a); }

PDP-11 FORTRASH:

          CHARACTER A(2)
          DATA A/'H','I'/
          A(2) = 'O'
          TYPE 10, A(1), A(2)
    10    FORMAT(2A1)

CSI FORTH:

    : t " Hi" ; 121 t 2+ ! t COUNT PRINT

Franz Lisp:

    (defun a () '(this is a test))
    (rplaca (a) 'that)
    (pp a)

It may be confusing, but it seems very much the norm among languages.  What you want is 'instance _ 'Hello' copy', or 'stream _ WriteStream on: String new'.

Many BASICs do this copy automatically for you in the case of an assignment unless the lvalue is an LSET, RSET, or INSTR, so strings really do act as constants for most purposes in that language except when you use an LSET.

------------------------------------------------------------------------------
Oliver Steele                       ...!{decvax,ihnp4}!mcnc!unc!steele
                                    steele%unc@mcnc.org

    "They're directly beneath us, Moriarty.  Release the piano!"
stevev@tekchips.UUCP (07/06/87)
In discussing the problems of a user modifying an object that is used as a string literal ...

In article <740@unc.cs.unc.edu>, rentsch@unc.cs.unc.edu (Tim Rentsch) writes:
>
> literals
> are still very useful, and I can't immediately suggest anything
> better.  Any ideas?

I see two possibilities:

  * Have the compiler generate code to make a copy of any string literal whenever it is accessed.  This obviously slows things down.

  * Define both mutable and immutable versions of String (and Array) and have the compiler produce immutable ones to represent literals.

The latter does not address the problem of a programmer using "instVarAt:put:" or "become:", but I see that as a different issue: the language does not have the facility to hide "dangerous" operations from the programmer.

Steve Vegdahl
Computer Research Lab
Tektronix Labs
Beaverton, Oregon
johnson@uiucdcsp.cs.uiuc.edu (07/07/87)
*	I've been working on a variation of the smalltalk model and in studying
*the VM I've found that if you have a string constant in a method and assign it
*to an instance variable and somewhere down the line you make a modification to
*that string, that the constant in the method would hence change.  Is this
*correct?  And if so, I see it as a major flaw that you would be able to modify
*a constant!

That is correct.  It is one of several flaws in the language design.

It is no real consolation, but this particular problem can be avoided by using symbols instead of strings, since symbols are not modifiable.  However, the problem resurfaces with "constant" arrays.  There probably need to be more unchangeable classes, and all constants need to be one of them.  If you want to use a constant as the initial state of an object then you should make a copy of it.

Note that constants like SmallIntegers cannot be modified.
jans@tekchips.UUCP (07/09/87)
>* I've been working on a variation of the smalltalk model and in studying
>* the VM I've found that if you have a string constant in a method and assign
>* it... I see it as a major flaw that you would be able to modify a constant!
>
>That is correct.  It is one of several flaws in the language design... Note that
>constants like SmallIntegers cannot be modified.

*That* is the flaw in the language!  SmallIntegers are not bona fide objects!

>...this particular problem can be avoided by using symbols instead of strings,
>since symbols are not modifiable.

Wrongo.  Try

    x _ #flubber.
    x basicAt: 3 put: $i.

Symbols are objects, and like all other objects, they may entertain requests to change their contents.

The flaw in this case is not with the language, but with the understanding of what is happening.  As someone else pointed out, there is no such thing as a constant object in Smalltalk.  (SmallIntegers aren't really objects.)

One could easily "demonstrate" that English is "flawed" with respect to certain Eskimo tongues, since English lacks 30 odd words for describing frozen water.  However, excepting Alaska, English has little use for such a facility, having instead thousands of words for technical terms, words that are borrowed by other languages around the globe.

Smalltalk suffers not from its lack of constants, but rather from biased notions of what a language should be.  You want a constant?  Subclass and override all the accessing protocol!
johnson@uiucdcsp.cs.uiuc.edu (07/11/87)
I claimed that it was a flaw in Smalltalk that "constant arrays" were not constants, and offered symbols as an example of a real constant.  jan@tekchips claimed that I was wrong, since one could change a symbol using basicAt:put:.

This is entirely beside the point.  If one mistakenly hands a symbol to an object that thinks it is a string, the object will send at:put: to it and find out that it is a "constant", since it doesn't understand at:put:.  basicAt:put: is really only for debuggers and the like.  If someone comes complaining to me that an object using basicAt:put: changed some other object (like a dictionary or symbol) in unforeseen ways, I will NOT be sympathetic.

jan@tekchips says that if I want constants, I should make them by subclasses.  That is exactly my complaint.  The flaw in Smalltalk is that I cannot.  So-called constant arrays are by default of class Array, which is not constant.  They should instead be in class ConstantArray, which has no at:put: message.  It will probably have a basicAt:put: message, but that won't bother me.

"Symbols are objects, and like all other objects, they may entertain requests to change their contents."  However, symbols are usually very reluctant to change their contents.  Other "constants" should be, too.
jans@tekchips.TEK.COM (Jan Steinman) (07/14/87)
johnson@uiucdcsp:
>jan@tekchips says that if I want constants, I should make them by subclasses.
>That is exactly my complaint.  The flaw in Smalltalk is that I cannot.
>So-called constant arrays are by default of class Array, which is not
>constant.  They should instead be in class ConstantArray, which has no
>at:put: message.  It will probably have a basicAt:put: message, but that
>won't bother me.

Again, the flaw is not in Smalltalk.  Adding this code to your image will add a new class that exhibits array accessing behavior similar to that of Symbols.  I don't know what "so-called constant arrays" are (literals?), but now you have "real" ConstantArrays!  (I still maintain that there is no such thing as a constant in Smalltalk!)

The compiler could be hacked to cause the "#()" notation to generate ConstantArrays, but that is left as an exercise for the reader.

This took less than one minute to write in Smalltalk.  After adding this code, try evaluating:

    c _ #('test' #foo 456 1.01) asConstantArray.
    c at: 2 put: #bar.

-------------- ConstantArrayHack.st --------------------
Array variableSubclass: #ConstantArray
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Collections-Hacked'!

ConstantArray comment: 'This class removes Array storage protocol for those who feel that Smalltalk constants are needed.'!

!ConstantArray methodsFor: 'accessing'!
at: index put: anObject
    "ConstantArrays do not allow modification of their contents."

    self error: self class name, 's cannot be modified!!'! !

!Array methodsFor: 'converting'!
asConstantArray
    "Return an unmodifiable copy of the receiver."

    | constant |
    constant _ ConstantArray new: self size.
    1 to: self size do: [:i| constant basicAt: i put: (self at: i)].
    ^constant! !
steele@unc.cs.unc.edu (Oliver Steele) (07/15/87)
In article <80500010@uiucdcsp> johnson@uiucdcsp.cs.uiuc.edu writes:
>
>I claimed that it was a flaw in Smalltalk that "constant arrays" were not
>constants, and offered symbols as an example of a real constant.

I agree.  In an earlier article I tried to show that Smalltalk acted the same as many other languages in this respect, but I think that this just shows that many other languages are flawed too.  An example of why it is a flaw is that code such as

    squares
        | stream between |
        stream _ WriteStream on: ''.
        between _ '('.
        1 to: 10 do: [:i |
            stream nextPutAll: between.
            i*i printOn: stream.
            between _ ', '. ].
        stream nextPutAll: ')'.
        ^stream contents

will only work correctly once, and that after it has been run once the source no longer reflects the compiled method.  It's obvious to an experienced Smalltalk programmer what is going on, but a run-time error would be preferable to a subtle piece of self-modifying code.

>jan@tekchips says that if I want constants, I should make them by subclasses.
>That is exactly my complaint.  The flaw in Smalltalk is that I cannot.
>So-called constant arrays are by default of class Array, which is not
>constant.  They should instead be in class ConstantArray, which has no
>at:put: message.  It will probably have a basicAt:put: message, but that
>won't bother me.

This isn't the problem.  I just created ConstantString and ConstantArray, which work as you would expect, and added conversion messages to go between Constant<x> and <x>.  Then I changed Scanner|xStringLit and Scanner|scanVector (I've got the names wrong, but they're pretty easy to find) to return their contents asConstantString and asConstantArray, and everything worked fine, up to a point.  Two points.

The first point was that some Smalltalk code on my system (MacPlus with version 0.3) assumed that it would be able to modify strings passed to it.  Evaluating strings such as

    FileStream fileNamed: 'foo'

would give an error when a message tried to modify 'foo'.  This is just sloppy programming, and could probably be eliminated without too much trouble (if it's even present in other systems).

The other point is that code that does

    fee _ #(fi fo fum) copy.

will expect fee to be mutable.  I suspect I would have run into this if I had recompiled much of the system after changing Scanner.  The only workaround I can see is to let Constant<x>|copy be the same as Constant<x>|as<x>, but this is very misleading and changes the semantics of copy (copy no longer always returns an object of the same class).  Comments?

------------------------------------------------------------------------------
Oliver Steele                       ...!{decvax,ihnp4}!mcnc!unc!steele
                                    steele%unc@mcnc.org

    "They're directly beneath us, Moriarty.  Release the piano!"
allenw@tekchips.TEK.COM (Brock) (07/17/87)
In article <805@unc.cs.unc.edu>, steele@unc.cs.unc.edu (Oliver Steele) writes:
> The other point is that code that does
>	fee _ #(fi fo fum) copy.
> will expect fee to be mutable.  I suspect I would have run into this if I
> had recompiled much of the system after changing Scanner.  The only
> workaround I can see is to let Constant<x>|copy be the same as
> Constant<x>|as<x>, but this is very misleading and changes the semantics
> of copy (copy no longer always returns an object of the same class).
> Comments?
>
> ------------------------------------------------------------------------------
> Oliver Steele				...!{decvax,ihnp4}!mcnc!unc!steele
>							steele%unc@mcnc.org
>
>	"They're directly beneath us, Moriarty.  Release the piano!"

The use of the "species" message to determine what the class of a copy of a ConstantArray should be might be appropriate.  Alternately, copy for a constant could be defined to return self (this is what "immutable" objects (Symbol and Character) in Smalltalk-80 currently do).  The latter would result in fee _#(fi ... above not doing what was intended.

The simplest answer to the issue raised above may be that Smalltalk already has a syntax for accomplishing what was intended by the "fee" expression.  It is:

    fee _ Array with: #fi with: #fo with: #fum.

Allen Wirfs-Brock
Software Productivity Technologies
Tektronix, Inc
allenw@spt.TEK.COM
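
One way the "species" suggestion could look, as a sketch only. It assumes the ConstantArray class from the ConstantArrayHack.st file-in posted earlier in the thread is already in the image; the 'copying' category and the decision to override copy explicitly are choices made here for illustration, not something any poster specified.

```smalltalk
!ConstantArray methodsFor: 'copying'!
species
    "Collections derived from a ConstantArray are ordinary Arrays."
    ^Array!

copy
    "Answer a mutable Array holding the same elements.
     The new object is built through species, so it is not a ConstantArray."
    | new |
    new _ self species new: self size.
    1 to: self size do: [:i | new at: i put: (self at: i)].
    ^new! !

"The alternative mentioned above would replace the body of copy
 with the single statement  ^self  so that copying a constant
 answers the constant itself."
```

With the first definition, fee _ #(fi fo fum) copy would again yield something mutable if the compiler produced ConstantArrays, at the cost Steele points out: copy no longer answers an object of the receiver's class. With the ^self alternative the class is preserved, but fee stays unmodifiable.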
franka@mmintl.UUCP (Frank Adams) (07/24/87)
In article <805@unc.cs.unc.edu> steele@unc.UUCP (Oliver Steele) writes:
>In article <80500010@uiucdcsp> johnson@uiucdcsp.cs.uiuc.edu writes:
>>jan@tekchips says that if I want constants, I should make them by subclasses.
>>That is exactly my complaint.  The flaw in Smalltalk is that I cannot.
>
>This isn't the problem.  I just created ConstantString and ConstantArray,
>....  Then I changed Scanner|xStringLit and Scanner|scanVector ...
>to return their contents asConstantString and asConstantArray ...

This is fine for Smalltalk/80[TM].  Unfortunately, the folks at Digitalk saw fit to hide their compiler.  This is the only serious complaint I have about Smalltalk/V[TM].

Not so incidentally, I have reverse-engineered their compiler, and now have source code for everything on my system.  At some point, I will try to separate this from all the other changes I have made, and maybe even post it (if there is sufficient demand).  Don't expect it any time real soon, though.

Some hints if you wish to try this yourself:

The compiler is in a group of classes whose names are all blank (e.g., "' ' asSymbol").  There are various places in the system which skip over all classes with such names -- notably the Class Hierarchy Browser; also a method in SystemDictionary called (if I remember right) "getSourceClasses".

The first 3 bytes of a CompiledMethod are header information.  The first byte is the primitive number.  The second byte represents the number of local variables, including block arguments.  This byte is ones-complemented if the method contains blocks which are not optimized away.  The third byte is the number of arguments to the method.

The last 3 bytes of a compiled method are the pointer to the source code.  They will be zero for those methods for which the source code is unavailable.  If the 4th byte from the end is less than 134, it is the size of the literal table immediately preceding; otherwise there is no literal table.  The literal table is used only by the memory manager; literals used by the method are encoded directly in the byte code stream as necessary.

The Smalltalk/V[TM] virtual machine is similar to that of Smalltalk/80[TM], but uses entirely different byte codes.  The byte code set is sparse.  Note that different byte codes are used for accessing local variables and for returning in methods which have blocks.

--
Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108
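
The layout hints above can be read off mechanically. The workspace sketch below does so for a made-up sequence of byte values; it is not taken from a real Smalltalk/V image and uses no Smalltalk/V-specific API. Two points are assumptions not stated in the post: that a ones-complemented local count can be recognized by its high bit (which only holds if real counts stay below 128), and that nothing needs to be said about the byte order of the source pointer, which is left here as three raw bytes.

```smalltalk
"bytes is a hypothetical stand-in for the bytes of a compiled method."
| bytes n primitive localsByte hasBlocks locals numArgs sourceBytes tag literalTableSize |
bytes _ #(0 253 1 16 32 48 0 0 0 0).
n _ bytes size.

"Header: primitive number, local count, argument count."
primitive _ bytes at: 1.
localsByte _ bytes at: 2.
"ASSUMPTION: a value above 127 means the count was ones-complemented."
hasBlocks _ localsByte > 127.
locals _ hasBlocks ifTrue: [255 - localsByte] ifFalse: [localsByte].
numArgs _ bytes at: 3.

"Trailer: the last three bytes are the source pointer (all zero when
 the source is unavailable); byte order is not given, so keep them raw."
sourceBytes _ bytes copyFrom: n - 2 to: n.

"Fourth byte from the end: literal table size if below 134, else no table."
tag _ bytes at: n - 3.
literalTableSize _ tag < 134 ifTrue: [tag] ifFalse: [nil].
```

Whether the literal table size counts bytes or entries is not stated in the post, so the sketch only extracts the value and does not try to locate the table itself.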