[comp.lang.smalltalk] Binary Object Storage in Smalltalk-80

eliot@cs.qmc.ac.uk (Eliot Miranda) (07/31/89)

Here is a Smalltalk-80 package to save + restore arbitrary objects to + from
files.  A description follows below.  But first some notes on the code.
The system was written for QMC's version of Smalltalk-80 2.3 which is very like
ParcPlace Smalltalk-80 V2.3 with some exceptions:

QMC's version
	a) permits temporaries to be declared in blocks
	b) has a class variable in Form called SystemIsBigEndian

If you're filing this in on a Smalltalk without block temporaries you'll need to
correct the code. e.g.

	store: anObject on: aStream
		| manager |
		Cursor wait showWhile: [manager _ (self new: 1024) initialize].
		Cursor write showWhile: [
			(aStream isKindOf: String)
				ifTrue: [
					| fileStream |
					fileStream _ FileStream fileNamed: aStream.

must be changed to

	store: anObject on: aStream
		| manager fileStream |
		Cursor wait showWhile: [manager _ (self new: 1024) initialize].
		Cursor write showWhile: [
			(aStream isKindOf: String)
				ifTrue: [
					fileStream _ FileStream fileNamed: aStream.


Hopefully someone will be able to adapt this for Smalltalk-V (if needed).
I place NO restrictions on the use of this code.

Here it is; Share And Enjoy!



------------------------------ Cut Here --------------------------------
| s |
s _ '

		Binary Structure Copying.

		E. Miranda. QMC 31 July 1989.

This package allows one to store and retreive arbitrary networks of
Smalltalk objects to and from files.  The package copes with circular
structures and with structures of arbitrary size.  Objects are stored
on files in a binary representation giving improved performance over
the textual ''structure copying'' system which it supercedes.

	To store an object (& its contents) use

		anObject storeBinary
		anObject storeBinaryOn: ''myfile.stbin''

	To retreive an object use

		BinaryInputManager readFrom: ''myfile.stbin''

Use of the file suffix ''.stbin'' is strongly recommended for binary files
created with this package.
'.
Transcript show: s.
NotifierView
	openContext: thisContext
	label: 'Binary IO Package. Proceed to continue'
	contents: s!


IdentityDictionary variableSubclass: #BinaryIOManager
	instanceVariableNames: ''
	classVariableNames: 'ClassType FalseType GlobalType IdType NilType ObjectType TrueType TypeTable '
	poolDictionaries: ''
	category: 'System-Support'!
BinaryIOManager comment:
'I am a shared superclass for the binary IO classes BinaryInputManager & BinaryOutputManager.
I define some class variables that define the types of descriptions in binary files, see BinaryIOManager class>>initialize'!


!BinaryIOManager methodsFor: 'accessing'!

codeForFalse
	^FalseType!

codeForNil
	^NilType!

codeForTrue
	^TrueType! !

!BinaryIOManager methodsFor: 'adding'!

grow
	"Must copy instance variables when growing"
	| instVars |
	instVars _ (self class superclass instSize + 1 to: self class instSize) collect: [:i|
					Association key: i value: (self instVarAt: i)].
	super grow.
	instVars do: [:assoc| self instVarAt: (assoc key) put: assoc value]! !

!BinaryIOManager methodsFor: 'private'!

rehash
	"Must copy instance variables when rehashing"
	| instVars |
	instVars _ (self class superclass instSize + 1 to: self class instSize) collect: [:i|
					Association key: i value: (self instVarAt: i)].
	super rehash.
	instVars do: [:assoc| self instVarAt: (assoc key) put: assoc value]! !
"-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- "!

BinaryIOManager class
	instanceVariableNames: ''!


!BinaryIOManager class methodsFor: 'class initialization'!

initialize
	"Initialize the types & type table for binary i/o"

	TypeTable _ #(	getObjectId
						getNil
						getTrue
						getFalse
						getObjectDefinition
						getClassDefinition
						getGlobalDefinition ).
	IdType _ 1.
	NilType _ 2.
	TrueType _ 3.
	FalseType _ 4.
	ObjectType _ 5.
	ClassType _ 6.
	GlobalType  _ 7

	"BinaryIOManager initialize"! !

BinaryIOManager initialize!



!SmallInteger methodsFor: 'binary storage'!

hasSpecialBinaryRepresentation
	^true!

storeBinaryOn: stream manager: manager
	"SmallIntegers are stored as their value with the 32nd bit set as a tag."

	stream
		nextPut: (((self bitShift: -24) bitAnd: 16rFF) bitOr: 16r80);
		nextPut: ((self bitShift: -16) bitAnd: 16rFF);
		nextPut: ((self bitShift: -8) bitAnd: 16rFF);
		nextPut: (self bitAnd: 16rFF)! !


!String methodsFor: 'binary storage'!

storeBinaryDefinitionOn: stream manager: manager
	manager putIdOf: self class on: stream.
	stream nextNumber: 4 put: self basicSize.
	stream nextPutAll: self asByteArray! !


!Set methodsFor: 'binary storage'!

readBinaryContentsFrom: stream manager: manager
	super readBinaryContentsFrom: stream manager: manager.
	self rehash! !


!TextStyle class methodsFor: 'binary storage'!

addGlobalsTo: globalDictionary manager: manager
	TextStyles do: [:style|
		style fontArray do: [:font|
			globalDictionary at: font put: self]]!

storeBinaryDefinitionOf: anObject on: stream manager: manager
	anObject class == StrikeFont ifTrue: [
		TextStyles associationsDo: [:assoc|
			| style |
			style _ assoc value.
			1 to: style fontArray size do: [:i|
				(style fontAt: i) == anObject ifTrue: [
					| string |
					string _ '(TextStyle styleNamed: ', assoc key storeString, ') fontAt: ', i printString.
					stream nextNumber: 2 put: string size.
					string do: [:char| stream nextPut: char asciiValue].
					^self]]]].
	^super storeBinaryDefinitionOf: anObject on: stream manager: manager! !


!ClassDescription methodsFor: 'binary storage'!

binaryDefinitionFrom: stream manager: manager
	| obj basicSize i |
	self isPointers ifTrue: [
		stream next. "skip instSize"
		^self isVariable
			ifTrue: [self basicNew: (stream nextNumber: 3)]
			ifFalse: [self basicNew]].

	obj _ self basicNew: (basicSize _ stream nextNumber: 4).
	i _ 0.
	self isBytes
		ifTrue: [
			[(i _ i + 1) <= basicSize] whileTrue: [
				obj basicAt: i put: stream next]]
		ifFalse: [
			[(i _ i + 1) <= basicSize] whileTrue: [
				obj basicAt: i put: stream nextWord]].
	^obj!

storeBinaryDefinitionOn: stream manager: manager
	| myName |
	stream
		nextWordPut: (format bitAnd: 16rFFFF);
		nextWordPut: (myName _ self name) size.
	myName do: [:c| stream nextPut: c asciiValue]! !


!String class methodsFor: 'binary storage'!

binaryDefinitionFrom: stream manager: manager
	^(stream next: (stream nextNumber: 4)) asString! !


!Object methodsFor: 'testing'!

isClass
	^false!

isFileStream
	^false! !

!Object methodsFor: 'public binary storage'!

storeBinary
	"Writes a description of the receiver into a file, in a way that allows
	 the object's structure to be reconstructed from the file's contents."

	| fileName |
	fileName _ FileDirectory
					requestFileName: 'Store binary on which file name?'
					default: (self class name, '.', self asOop printString, '.stbin')
					version: #any
					ifFail: [^nil].
	BinaryOutputManager store: self on: fileName!

storeBinaryOn: aStream
	"Writes a description of the receiver onto aStream, in a way that allows
	 the object's structure to be reconstructed from the stream's contents"

	BinaryOutputManager store: self on: aStream! !

!Object methodsFor: 'binary storage'!

hasSpecialBinaryRepresentation
	^false!

readBinaryContentsFrom: stream manager: manager
	| size i |
	size _ self class instSize.
	i _ 0.
	[(i _ i + 1) <= size] whileTrue: [
		self instVarAt: i put: manager nextObject].
	size _ self basicSize.
	i _ 0.
	[(i _ i + 1) <= size] whileTrue: [
		self basicAt: i put: manager nextObject]!

storeBinaryDefinitionOn: stream manager: manager

	| i basicSize |
	manager putIdOf: self class on: stream.
	i _ 0.
	self class isPointers
		ifTrue: [
			| instSize |
			stream nextPut: (instSize _ self class instSize).
			self class isVariable
				ifTrue: [stream nextNumber: 3 put: (basicSize _ self basicSize)]
				ifFalse: [basicSize _ 0].

			[(i _ i + 1) <= instSize] whileTrue: [
				manager putIdOf: (self instVarAt: i) on: stream].

			i _ 0.
			[(i _ i + 1) <= basicSize] whileTrue: [
				manager putIdOf: (self basicAt: i) on: stream]]
		ifFalse: [
			stream nextNumber: 4 put: (basicSize _ self basicSize).
			self class isBytes
				ifTrue: [
					[(i _ i + 1) <= basicSize] whileTrue: [
						stream nextPut: (self basicAt: i)]]
				ifFalse: [
					[(i _ i + 1) <= basicSize] whileTrue: [
						stream nextWordPut: (self basicAt: i)]]]!

storeBinaryOn: stream manager: manager
	manager putIdOf: self on: stream! !


!ExternalStream methodsFor: 'nonhomogeneous accessing'!

nextNumber: n 
	"Answer the next n bytes as a positive Integer or LargePositiveInteger."

	| s i |
	n <= 4 ifTrue: 
		[s _ 0.
		i _ 0.
		[(i _ i + 1) <= n] whileTrue: [s _ ((s bitShift: 8) bitOr: self next)].
		^s].
	s _ LargePositiveInteger new: n.
	1 to: n do: [:j | s at: n + 1 - j put: self next].
	"reverse order of significance"
	^s truncated!

nextNumber: n put: v 
	"Append to the receiver the argument, v, which is a positive SmallInteger or
	a LargePositiveInteger, as the next n bytes.  Possibly pad with leading zeros."

	| vlen i |
	n < (vlen _ v digitLength) ifTrue: [self error: 'number too big'].

	"pad with leading zeros"
	i _ n.
	[i > vlen] whileTrue: [self nextPut: 0. i _ i - 1].
	i = 1 ifTrue: [^self nextPut: v].
	[i > 0] whileTrue: [self nextPut: (v digitAt: i). i _ i - 1]! !


!SmallInteger class methodsFor: 'binary storage'!

binaryDefinitionFrom: stream manager: manager
	| value |
	(value _ (stream next bitAnd: 16r7F)) > 16r3F
		ifTrue: [value _ value - 16r80].
	value _ (value bitShift: 8) bitOr: stream next.
	value _ (value bitShift: 8) bitOr: stream next.
	value _ (value bitShift: 8) bitOr: stream next.
	^value! !


!Class methodsFor: 'testing'!

isClass
	^true! !

!Class methodsFor: 'binary storage'!

addGlobalsTo: globalDictionary manager: manager
	classPool == nil ifFalse: [
		classPool associationsDo: [:assoc|
			globalDictionary at: assoc put: self]]!

storeBinaryDefinitionOf: anAssociation on: stream manager: manager
	| string | 
	string _ self name, ' classPool at: ', anAssociation key storeString.
	stream nextNumber: 2 put: string size.
	string do: [:char| stream nextPut: char asciiValue]! !


!Symbol class methodsFor: 'binary storage'!

binaryDefinitionFrom: stream manager: manager
	^self intern: (super binaryDefinitionFrom: stream manager: manager)! !


!Dictionary methodsFor: 'binary storage'!

addGlobalsTo: globalDictionary manager: manager
	self associationsDo: [:assoc| globalDictionary at: assoc put: self]!

storeBinaryDefinitionOf: anObject on: stream manager: manager
	| string | 
	string _ (Smalltalk keyAtValue: self), ' associationAt: ', anObject key storeString.
	stream nextNumber: 2 put: string size.
	string do: [:char| stream nextPut: char asciiValue]! !


!Boolean methodsFor: 'binary storage'!

hasSpecialBinaryRepresentation
	^true! !


!Form methodsFor: 'binary storage'!

readBinaryContentsFrom: stream manager: manager
	"read the trailing byte containing flags to define system dependent information about the form
	 and respond accordingly."

	| flags |
	super readBinaryContentsFrom: stream manager: manager.
	flags _ stream next.

	(flags allMask: 1) ~= SystemIsBigEndian ifTrue: [	"Reverse the bits in form"
		depth = 1 ifTrue: [
			1 to: bits size do: [:i| bits at: i put: (bits at: i) wordReversed]]]!

storeBinaryDefinitionOn: stream manager: manager
	"append a byte containing flags to define system dependent information about the form.
	 Currently the bits are:
		bit 1:	is the system bigendian+bigbittian"

	super storeBinaryDefinitionOn: stream manager: manager.
	stream nextPut: (SystemIsBigEndian ifTrue: [1] ifFalse: [0])! !


!UndefinedObject methodsFor: 'binary storage'!

hasSpecialBinaryRepresentation
	^true!

storeBinaryOn: stream manager: manager
	stream nextPut: manager codeForNil! !

BinaryIOManager variableSubclass: #BinaryInputManager
	instanceVariableNames: 'stream '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'System-Support'!
BinaryInputManager comment:
'I read binary files of the format created by BinaryOutputManager.  See the comment there for details of the format.  Use

	BinaryInputManager readFrom: ''filename.stbin''

to recreate the objects stored on such files.'!


!BinaryInputManager methodsFor: 'public access'!

readFrom: aStream
	(stream _ aStream) isFileStream
		ifTrue: [stream binary].
	^self nextObject! !

!BinaryInputManager methodsFor: 'structure reading'!

getClassDefinition
	| id format nameLength name class |
	id _ stream nextNumber: 3.
	format _ stream nextNumber: 2.
	nameLength _ stream nextNumber: 2.
	name _ (stream next: nameLength) asString.
	(Symbol hasInterned: name ifTrue: [:sym| name _ sym])
		ifFalse: [self error: 'Unknown class name: ', name].
	class _ Smalltalk at: name ifAbsent: [self error: 'Non-existant class: ', name].
	(class format bitAnd: 16rFFFF) ~= format
		ifTrue: [self error: 'Class format has changed'].
	self at: id put: class.
	^class!

getFalse
	^false!

getGlobalDefinition
	| id nameLength object |
	id _ stream nextNumber: 3.
	nameLength _ stream nextNumber: 2.
	object _ Cursor execute showWhile: [
					Compiler evaluate: (stream next: nameLength) asString for: nil logged: false].
	^self at: id put: object!

getNil
	^nil!

getObjectDefinition
	| id class obj |
	self
		at: (id _ stream nextNumber: 3)
		put: (obj _ (class _ self nextObject)
						binaryDefinitionFrom: stream manager: self).
	"Must add the object to the table BEFORE reading the rest of its definition
	 because it may (even indirectly) refer to itself"
	class isPointers ifTrue: [obj readBinaryContentsFrom: stream manager: self].
	^obj!

getObjectId
	| id |
	^self at: (id _ stream nextNumber: 3) ifAbsent: [self error: 'non-existant object id']!

getTrue
	^true!

nextObject
	| typeByte |
	(typeByte _ stream next) > 127 ifTrue: [
		stream skip: -1.
		^SmallInteger binaryDefinitionFrom: stream manager: self].

	^self perform: (TypeTable at: typeByte)! !
"-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- "!

BinaryInputManager class
	instanceVariableNames: ''!


!BinaryInputManager class methodsFor: 'structure reading'!

readFrom: streamOrFileName

	"Reads an object's structure from the stream streamOrFileName
	 or the file named streamOrFileName"

	(streamOrFileName isKindOf: String)
		ifTrue:
			[^Cursor read showWhile: [(self new: 1024) readFrom: (FileStream fileNamed: streamOrFileName)]].
	^(self new: 1024) readFrom: streamOrFileName! !


!FileStream methodsFor: 'testing'!

isFileStream
	^true! !


!True methodsFor: 'binary storage'!

storeBinaryOn: stream manager: manager
	stream nextPut: manager codeForTrue! !


!SystemDictionary methodsFor: 'binary storage'!

addGlobalsTo: globalDictionary manager: manager
	| pools |
	pools _ Set new.
	self associationsDo: [:assoc|
		assoc value isClass
			ifTrue: [
				assoc value addGlobalsTo: globalDictionary manager: manager.
				pools addAll: assoc value sharedPools]
			ifFalse: [
				globalDictionary at: assoc put: self].
		globalDictionary at: assoc value put: self].

	pools do: [:poolDictionary|
		poolDictionary addGlobalsTo: globalDictionary manager: manager]!

storeBinaryDefinitionOf: anObject on: stream manager: manager
	| string | 
	string _ anObject class == Association
				ifTrue: ['Smalltalk associationAt: ', anObject key storeString]
				ifFalse: ['Smalltalk at: ', (self keyAtValue: anObject) storeString].
	stream nextNumber: 2 put: string size.
	string do: [:char| stream nextPut: char asciiValue]! !

BinaryIOManager variableSubclass: #BinaryOutputManager
	instanceVariableNames: 'lastIndex globals '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'System-Support'!
BinaryOutputManager comment:
'Binary storage consists of a sequence of Object IDs

Object IDs are identified by 4 byte words.
First byte defines type:

byte between: 128 and: 255
		small integer in 31 bits

byte = 0		object id in next 3 bytes
byte = 1		nil
byte = 2		true
byte = 3		false
byte = 4		object id in next 3 bytes; object definition follows
byte = 5		class id in next 3 bytes; class definition follows
byte = 6		global id in next 3 bytes; global definition follows

Object Definitions are
	class id
	followed by
		non-indexable
			inst size in next byte
			''inst size'' ids follow
		indexable
			inst size in next byte
			variable size in next 3 bytes
			''inst size'' ids follow
			''variable size'' elements follow

	see implementors of storeBinaryDefinitionOn:manager: & readBinaryContentsFrom:manager:

Class Definitions are
		format in next 2 bytes
		name length in next 2 bytes
		name length bytes of name

Global Definitions are
	expression length in next two bytes
	''expression'' characters follow


The objects stored as global definitions are collected during BinaryOutputManager>>initialize using the addGlobalsTo:manager: message.  It is possible (hopefully easily) to customize this to add your own globals to the set.'!


!BinaryOutputManager methodsFor: 'initialize-release'!

initialize
	"Initialize my self for subsequent binary output of some object."
	lastIndex _ 0.
	globals _ IdentityDictionary new: 2048.

	"Get the system (Smalltalk) to register all objects it considers 'global'
	 to the globals table.  Such objects will not be stored; instead an expression
	 is stored which (when evaluated) references the global.
	 Arbitrary objects may be defined as globals. (use the messages menu item &
	 look for implementors of addGlobalsTo:manager:).
	The default is to define as global
		globals in Smalltalk,
		classes,
		class variables & pool variables.
	Collecting the globals takes about 2 seconds. If this is too much time per object
	a default set of globals could be maintained in a class variable"

	Smalltalk addGlobalsTo: globals manager: self

	"MessageTally spyOn: [(BinaryOutputManager new: 2) initialize]

	 Time millisecondsToRun: [(BinaryOutputManager new: 2) initialize]"! !

!BinaryOutputManager methodsFor: 'accessing'!

putIdOf: anObject on: aStream
	| objectId owner |
	anObject hasSpecialBinaryRepresentation ifTrue: [
		^anObject storeBinaryOn: aStream manager: self].
	nil == (objectId _ self findValueOrNil: anObject)
		ifFalse: [^aStream nextPut: IdType; nextNumber: 3 put: objectId].
	(owner _ globals at: anObject ifAbsent: []) == nil
		ifTrue: [
			self at: anObject put: (lastIndex _ lastIndex + 1).
			aStream
				nextPut: ObjectType;
				nextNumber: 3 put: lastIndex.
			anObject storeBinaryDefinitionOn: aStream manager: self]
		ifFalse: [
			anObject isClass ifTrue: [^self putIdOfClass: anObject on: aStream].
			self at: anObject put: (lastIndex _ lastIndex + 1).	
			aStream
				nextPut: GlobalType;
				nextNumber: 3 put: lastIndex.
			owner storeBinaryDefinitionOf: anObject on: aStream manager: self]!

putIdOfClass: anObject on: aStream
	| classId |
	nil == (classId _ self findValueOrNil: anObject)
		ifFalse: [^aStream nextPut: IdType; nextNumber: 3 put: classId].
	self at: anObject put: (lastIndex _ lastIndex + 1).
	aStream
		nextPut: ClassType;
		nextNumber: 3 put: lastIndex.
	anObject storeBinaryDefinitionOn: aStream manager: self! !

!BinaryOutputManager methodsFor: 'private'!

findValueOrNil: key  
	"Look for the key in the receiver.  If it is found, answer
	the value corresponding to the key, otherwise answer nil."

	| index length probe pass |
	length _ self basicSize.
	pass _ 1.
	index _ key identityHash \\ length + 1.
	[(probe _ self basicAt: index) == nil ifTrue: [^nil].
	probe == key]
		whileFalse: [
			(index _ index + 1) > length ifTrue: 
				[index _ 1.
				pass _ pass + 1.
				pass > 2 ifTrue: [^nil]]].
	^(valueArray basicAt: index)! !
"-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- "!

BinaryOutputManager class
	instanceVariableNames: ''!


!BinaryOutputManager class methodsFor: 'binary storage'!

store: anObject on: aStream
	| manager |
	Cursor wait showWhile: [manager _ (self new: 1024) initialize].
	Cursor write showWhile: [
		(aStream isKindOf: String)
			ifTrue: [
				| fileStream |
				fileStream _ FileStream fileNamed: aStream.
				fileStream binary.
				anObject storeBinaryOn: fileStream manager: manager.
				fileStream close]
			ifFalse: [
				anObject storeBinaryOn: aStream manager: manager]]! !


!False methodsFor: 'binary storage'!

storeBinaryOn: stream manager: manager
	stream nextPut: manager codeForFalse! !
-- 
Eliot Miranda				email:		eliot@cs.qmc.ac.uk
Dept of Computer Science		Tel:		01 975 5220
Queen Mary College			International:	+44 1 975 5220
Mile End Road
LONDON E1 4NS