[comp.sys.handhelds] HP 48 Data Storage

jimd@hpcvra.CV.HP.COM (Jim Donnelly) (08/12/90)
	What follows is one of many possible solutions to a fairly
	common data storage problem on the HP 48:

		"How do I store fields of variable length string
		 data in a compact, rapidly accessible manner that
		 does not require the overhead of storing strings
		 in lists?"

	I hope this will be of use to someone.  I'm sure that
	many people have different "programming styles", so I'll
	look forward to suggestions/code-packs/improvements/etc.

	-- Jim Donnelly
	jimd@cv.hp.com


---------------------------------------------------------------------


			Compact Data Storage

	A simple length-encoding technique can be put to use for a
	free-format, very compact multi-field data storage system.
	Two tiny programs, SUBNUM and STRCON, help with the process
	and are listed near the end of this note.  At the end of the
	note is a directory containing the examples that may be
	downloaded into the HP 48.

	The principle is to store starting indices in the beginning
	of a string that point to fields stored subsequently in the
	string.  The indices are stored in field order, with an
	additional index at the end to accommodate the last field.
	There are several small points worth mentioning:

	1) Fields may be 0-length using this technique.
	2) The execution time is uniform across all fields.
	3) This technique saves about 4 bytes per field after
	   the first field, because the string prolog and length
	   are omitted for fields 2 -> n.
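
	As an illustration of point 1, here is a sketch that assumes
	the ENCODE and DECODE programs from the directory at the end
	of this note are stored in the current directory.  An empty
	string is encoded as the middle of three fields and read back:

	    "C" "" "A" 3 ENCODE		; 6 character string
	    2 DECODE			; "" (empty field)

	The four index characters have codes 5, 6, 6, and 7.  Fields
	2 and 3 share the starting index 6, so field 2 has an ending
	position of 5, one less than its start, and SUB returns an
	empty string.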


	EXAMPLE:
	--------

	                 Indices  |          Fields
          Character               |     1 11111111 12222222222
          Position :   1  2  3  4 |567890 12345678 90123456789
	              +--+--+--+--+------+--------+-----------+
          String :    | 5|11|19|30|Field1| Field2 |  Field 3  |
                      +--+--+--+--+------+--------+-----------+

	This is a string that contains 3 fields, and therefore 4
	index entries.  The first field begins at character 5, the
	second field begins at character 11, and the third field
	begins at character 19.  To keep the pattern consistent,
	the fourth index is 30, which is one more than the length
	of the 29 character data string, so the end of the last
	field is found the same way as the others.

	To extract the second field, place the string on the stack,
	use SUBNUM on character 2 to extract the starting position,
	use SUBNUM on character 3 to extract the (ending position +1),
	subtract 1 from the (ending position+1), then do a SUB to
	get the field data.  NOTE: The index for field 1 is stored
	as character code 5, NOT "5"!  To place the field index for
	field 1 in the string, you would execute "data" 1 5 CHR REPL.
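
	A side effect of this layout, offered as a sketch rather than
	as part of the original programs: the first index always
	points just past the n+1 index characters, so it equals n+2,
	and the number of fields can be read back from the string.
	The name NFIELDS is made up for this illustration.

	NFIELDS		"data"  -->  n

	<< 1 SUBNUM 2 - >>

	For the example string above, the first index is 5, so
	NFIELDS returns 3.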


	PROGRAM:
	--------

	The following program accepts an encoded data string in
	level 2 and a field number in level 1:

	DECODE	 "data"  field#  -->  "field"

	<<  --> f
	  <<
	    DUP f SUBNUM		; "data" start -->
	    OVER f 1 + SUBNUM		; "data" start end+1 -->
	    1 -				; "data" start end -->
	    SUB				; "field" -->
	  >>
	>>
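
	A usage sketch, assuming DECODE, ENCODE, SUBNUM and STRCON
	from the directory at the end of this note are stored in the
	current directory.  It rebuilds the example string from its
	three fields and pulls out the second:

	    "  Field 3  " " Field2 " "Field1" 3 ENCODE
					; 29 character string
	    2 DECODE			; " Field2 "

	The intermediate values inside DECODE are 11 (start) and
	19 (end+1), so SUB is called with positions 11 and 18.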


	DATA ENCODING
	-------------

	The following program expects a series of 'n' strings on
	the stack, followed by the count 'n' in level 1, and encodes
	them into a data string suitable for reading by DECODE above.

	The programs SUBNUM and STRCON are used to assemble the
	indices.

	ENCODE      field n  ...  field 1   n   -->  "data"

	<< DUP 2 + DUP 1 - STRCON --> n  data
	  <<
	    1 n
	    FOR i
	      data i SUBNUM OVER SIZE	; ... field index fieldsize
	      + data SWAP		; ... field "data" index'
	      i 1 + SWAP CHR REPL	; ... field "data"'
	      SWAP + 'data' STO		; ...
	    NEXT
	    data			; "data"
          >>
	>>

	In this example, four strings are encoded:

	Input:  5: "String"
		4: "Str"
		3: "STR"
		2: "STRING"
		1:         4

	Output: "xxxxxSTRINGSTRStrString"      (23 character string)
	(The first five characters have codes 6, 12, 15, 18, and 24)
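
	Reading the fields back with DECODE shows the round trip
	(a sketch; the variable name 'D' is an assumption, used here
	only to hold the output string):

	    'D' STO			; save the 23 character string
	    D 1 DECODE			; "STRING"
	    D 3 DECODE			; "Str"
	    D 4 DECODE			; "String"

	Field 1 is the string that was in level 2 when ENCODE ran,
	which is why "STRING" comes first.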



	VARIATION:
	----------

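	The technique above has a practical limit of 254 characters
	for the whole string (indices plus data), because each index
	must fit in one character code (0-255) and the last index is
	one more than the string length.  To overcome this, just
	allocate two characters for each index.  The code to extract
	an index becomes a little busier.  In the decoder below,
	index f occupies characters 2f-1 and 2f, and its value is 16
	times the code of the left character plus the code of the
	right character.  The diagram shows each pair as two hex
	digits.  With the right character limited to 0-15, this
	allows index values up to 4095.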

	              Indices  |          Fields
       Character               | 11111 11111222 22222223333
       Position :   12 34 56 78|901234 56789012 34567890123
                   +--+--+--+--+------+--------+-----------+
       String :    |09|0F|17|21|Field1| Field2 |  Field 3  |
                   +--+--+--+--+------+--------+-----------+

	   <<  --> f
	     <<
		DUP f 2 * 1 -      	; "data" "data" indx1 -->
		SUBNUM 16 *		; "data" 16*start_left_byte  -->
		OVER f 2 * SUBNUM + 	; "data" start
		OVER f 2 * 1 + SUBNUM	; "data" start end_left_byte -->
		16 * 3 PICK f 1 + 2 *
		SUBNUM + 1 -		; "data" start end -->
		SUB			; "field"  -->
	     >>
	   >>
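
	The note gives only the decoder for this variation.  What
	follows is one possible matching encoder, a sketch that is
	not part of the downloadable directory.  The names PUTIX,
	GETIX and ENCODE2 are made up for this illustration, and the
	filler code 32 is arbitrary since every index character is
	overwritten.  STRCON and SUBNUM are the helpers listed in
	this note.

	PUTIX		"data"  position  index  -->  "data'"

	<< --> v
	  << v 16 / IP CHR		; "data" pos left
	     v 16 MOD CHR +		; "data" pos "left+right"
	     REPL			; "data'"
	  >>
	>>

	GETIX		"data"  index#  -->  index

	<< 2 * --> s p
	  << s p 1 - SUBNUM 16 *	; 16*left
	     s p SUBNUM +		; index
	  >>
	>>

	ENCODE2     field n  ...  field 1   n   -->  "data"

	<< DUP 1 + 2 * 32 SWAP STRCON	; ... n "filler"
	   1 3 PICK 2 * 3 + PUTIX	; ... n "data"
	   --> n data
	  <<
	    1 n
	    FOR i
	      data i GETIX OVER SIZE +	; ... field index'
	      data i 2 * 1 + ROT PUTIX	; ... field "data"'
	      SWAP + 'data' STO		; ...
	    NEXT
	    data			; "data"
	  >>
	>>

	With the three fields from the diagram, the index values
	come out as 9, 15, 23 and 34, stored as the character code
	pairs (0,9), (0,15), (1,7) and (2,1).  These match the hex
	pairs 09, 0F, 17 and 21 shown above.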



	TWO VERY TINY HELPFUL PROGRAMS
	------------------------------

	SUBNUM		"string"  position  -->  code

	<< DUP SUB NUM >>



	STRCON		code  count  -->  "repeated string"

	<< -->  code count
	  << "" code CHR 'code' STO
	     1 count START code + NEXT
          >>
	>>
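
	Two quick checks of these helpers (a sketch; the results
	follow from the ASCII codes for "B" and "A"):

	    "ABC" 2 SUBNUM		; 66
	    65 3 STRCON			; "AAA"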


	A DIRECTORY YOU CAN DOWNLOAD
	----------------------------

	This is a directory object.  Cut from the line after the
	==== rule to the end of the file and download it to your
	HP 48 using the ASCII transfer.

========================================================================
%%HP: T(3)A(D)F(.);
DIR
  DECODE
    \<< \-> f
      \<< DUP f
SUBNUM OVER f 1 +
SUBNUM 1 - SUB
      \>>
    \>>
  ENCODE
    \<< DUP 2 + DUP 1
- STRCON \-> n data
      \<< 1 n
        FOR i data
i SUBNUM OVER SIZE
+ data SWAP i 1 +
SWAP CHR REPL SWAP
+ 'data' STO
        NEXT data
      \>>
    \>>
  STRCON
    \<< \-> code count
      \<< "" code CHR
'code' STO 1 count
        START code
+
        NEXT
      \>>
    \>>
  SUBNUM
    \<< DUP SUB NUM
    \>>
END