[comp.sys.handhelds] SAD 1.03 [README] - Saturn Disassembler Version 1.03

bson@rice-chex.ai.mit.edu (Jan Brittenson) (11/17/90)

   Alright. Here is SAD 1.03, which has been in the works for quite a
while. It works fairly well; I've been using it myself for a couple of
weeks now, and can't say I've found any anomalies. But then I'm an
unsophisticated user, and don't need to press all buttons at once. :-)
Still no documentation.


Anyway, here is a brief list of changes:

	* Multiple passes:
	  - Local symbols are generated during pass 1, as well as
	    cross referencing info gathered.
	  - An intermediate `pass F' collects formatting info, and is
	    repeated until no further info can be collected. This may,
	    in grotesquely misused cases, mean indefinitely.
	  - The final pass 2 generates the final output.
	* GNU Emacs mode additions
	* The infamous br/ret and several other bugs are fixed.
	* Cosmetic changes (indentation, code objects, etc)
	* Partially recoded to improve robustness.


Two anomalies that have _not_ been fixed:

	* .formats MUST contain an explicit statement for address 0.
	  (`0:c' is recommended.)
	* xcom problems with major comments.


   SAD 1.03 is available from rice-chex.ai.mit.edu [128.52.38.46] by
anonymous FTP; the file is ~/pub/sad-1.03.tar.Z.

   As with the announcements of 1.01 and 1.02, I have included the
README file. SAD is distributed in the hope that it will be useful,
but with ABSOLUTELY NO WARRANTY; without even implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

						-- Jan Brittenson
						   bson@ai.mit.edu
O  /
 \/
 /\  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
O  \


* SYNOPSIS, SAD 1.03

   This is the README file for SAD, the Saturn Disassembler package.
SAD comes with no documentation at this time, other than this file.

   When the documentation is completely finished, it will be included
in the distribution. No partial documentation is included because it
would only serve to confuse and cause complaints. Documentation is
under way, and will be available as a GNU Emacs Info tree. For DOS
users, the info tree will be included in TeX format.

   SAD is a package currently consisting of sad (the disassembler),
xsym (the symbol extractor), xcom (the comment extractor), sadfmt
(formats tool) and sad.el (GNU Emacs SAD mode). The purpose of SAD is
to let you disassemble Saturn Machine Language (ML) and RPL code, edit
it, and maintain databases of symbols, comments, formats and macros.
The formats database contains information directing the disassembler
to either ML, RPL, or Data, the latter of which may be complex nested
structures. The Macros database contains nibble patterns for various
common idioms. You may, for instance, declare that the sequence -

		84e20201424

is to be printed as -

		84e20201424  GLOBAL "AB"

by defining the macro -

		5,2e48:GLOBAL "%2S"
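
   To make the example concrete, here is a minimal Python sketch of
how that nibble sequence maps to the printed text. It is not taken
from sad itself; the function names are made up for illustration, and
it assumes the nibbles are listed in memory order, low nibble first,
which is what the example above implies.

```python
def nibbles_to_int(nibs):
    """Combine a string of hex-digit nibbles, least significant first."""
    return int(nibs[::-1], 16)


def decode_global(stream, prolog=0x02E48):
    """Decode <5-nibble tag><2-nibble length><length characters>.

    Mirrors the macro `5,2e48:GLOBAL "%2S"': the tag is 5 nibbles and
    %2S reads a 2-nibble word giving the string length in characters.
    """
    if nibbles_to_int(stream[:5]) != prolog:
        raise ValueError("pattern tag does not match")
    count = nibbles_to_int(stream[5:7])
    chars = stream[7:7 + 2 * count]          # two nibbles per character
    name = "".join(chr(nibbles_to_int(chars[i:i + 2]))
                   for i in range(0, len(chars), 2))
    return 'GLOBAL "%s"' % name


print(decode_global("84e20201424"))   # GLOBAL "AB"
```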

The formats database lets you declare structures such as

		x5,2(d2,a5)

to instruct the disassembler to consider this to be Data consisting
of a 5-nibble hex integer, and two 5-character strings each preceded
by a 2-nibble decimal integer. A formats entry will remain active and
repeated until a new formats entry is applicable. The main purpose
is to specify synchronization points and data formatting.

Note:

   In SAD 1.03, macros are used only during RPL disassembly, and are
restricted to 5-nibble sequences. You may, however, define macros with
pattern tags of any length up to 8 nibbles; those of other lengths
will merely be ignored.


* INSTALLATION

   Typing `make' should be sufficient. Since execution speed is
crucial, you may want to turn on all macho speed optimizations
available. To do this, edit Makefile. DOS users may have to convert it
to some other format, or compile manually.


* REPORTING BUGS

   If you find a bug in SAD, you should report it. But first, you
should make sure that it really is a bug, and that it appears in the
latest version of SAD that you have.

   Once you have ascertained that a bug really exists, please mail me
a bug report. If you have a fix, please mail that as well!
Suggestions and `philosophical' bugs are equally welcome.

Please include the following:

	* The version number of SAD
	* A description of the bug behaviour
	* A short script or `recipe' which exercises the bug

And mail it to bson@ai.mit.edu.


* USAGE

Note for DOS users:

   Please read through these instructions first, as several file names
are not compatible with DOS file names. To change the names, edit
sad.h, dump2core.c, and scan2core.

Dump alternative 1:

   On your HP48, KGET (or type in) the DUMP program included in
Dump.RPL. The checksum should be # 5149h and the size 72. This program
takes two arguments: start and end addresses. It will dump memory,
using the PEEK program posted Mar 16, 1990 by Alonzo Gariepy. Make
sure to set the word size to 64 first, with `64 STWS'. Direct I/O to
WIRE, and make sure your computer is set to capture the dump. Hook up your
HP48, and type in #0h #6FFF0h DUMP. DUMP will continually display the
current dump address in the top left corner of the display, which will
otherwise remain blank apart from the menu.

   DUMP will take a long time. The entire ROM dump is about 450
kilobytes, so try to use as high a speed as possible.

   The utility dump2core will convert your dump to a core file named
.core, which is what the disassembler will be looking for. It reads
the dump from standard input, and overwrites .core if it exists;
otherwise a new one will be created.

[Note: the dump consists of records of two lines each. The first is
the address, the second the data as returned by PEEK. Dump2core
ignores the address part; it's included only to serve as a reference
for you, to allow you to retransmit smaller portions, should it prove
necessary. You are recommended to verify that the dump is correct; the
following command will list all clobbered lines, if any, along with
their line numbers:

		egrep -vn '^# [0-9A-F]+h$' romdump                      ]
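
   For those without grep at hand (DOS users, say), the same check can
be sketched in Python. This is only an illustration, not part of the
SAD package; the function name is hypothetical, and it assumes every
good line of the dump has the form `# <hex digits>h'.

```python
import re

# A good dump line: "# ", one or more upper-case hex digits, then "h".
RECORD_LINE = re.compile(r"# [0-9A-F]+h")


def clobbered_lines(lines):
    """Return (line_number, text) for every line that is not well formed."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not RECORD_LINE.fullmatch(text):
            bad.append((number, text))
    return bad


sample = ["# 0h\n", "# 0123456789ABCDEFh\n", "# 10h\n", "# 01234G\n"]
for number, text in clobbered_lines(sample):
    print("%d:%s" % (number, text))
```

   To check a real capture, pass an open file instead of the sample
list, for instance clobbered_lines(open("romdump")).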


Dump alternative 2:

   Enter memory scanner mode. Hook up your HP48 to your computer, and
make sure the HP48 output is captured in a file. Use the scanner to
continuously dump 00000-6FFFF by first pressing ENTER followed by /
and then keep pressing SPC until done.


   Copy the provided set of standard symbols, formats, and macros to
.symbols, .formats, and .macros respectively:

		cp stdsymbols .symbols
		cp stdformats .formats
		cp stdmacros .macros

   Use cp, not mv. Keep the standard set of databases right by the
source code to facilitate recovering the standard table should you
happen to wreck things.


Disassembly is done with the `sad' command:

		sad [flags] start end

where		      are
	start,end	Hex addresses of first and last instructions.

	flags, 		A set of flags, always bundled up as one argument.
	   -acsdf1gCxz
		a	Assembler format, i.e. PC and opcode fields are
			suppressed.
	
		c	Suppression of disassembler comments.

		s	Symbolic addresses are moved to the comments.

		d	The supplementary definition of symbols known,
			referenced, but not otherwise defined in the
			output, is suppressed.

		f	Keep repeating the F pass until no further
			formatting information can be collected. Write
			output to formats.out. (See sadfmt -j.)

		1	One pass only. Skip local symbols. Independent
			of the f flag.

		g	Don't generate local symbols (globals only).

		C	Don't output any code. Useful if all you want
			is a cross reference (see -x below), or collect
			formatting information.

		x	A cross reference is added at the end, as
			comments with symbols and addresses in the
			disassembly where they are referenced.
		 	
		z	Alonzo mode. PC and opcode fields are printed
			slightly differently. The initial org instruction
			is suppressed.


   Two of the maintenance tools are xsym and xcom. Both take as arguments
a collection of bundled-up flags (`xsym -sr' for instance).

		s	Supersede contents of database with information
			extracted from a listing on standard input.

		m	Merge contents of database with information
			extracted from a listing on standard input.

		l	Include source line numbers along with any
			errors or warnings.

		r	Overwrite the database file instead of sending
			the superseded/merged result to standard
			output.


   The third maintenance tool is sadfmt. It takes as its first
argument the optional flag "-r", similar to the -r flag of xsym and
xcom, as its second (or first when no flags are present) argument an
address in hex, and as an optional last argument a new format. If no
new format is supplied, the previous format is displayed. Sadfmt also
takes these additional command lines:


	-[r]j [joinfile]	Join `joinfile' with .formats.
	-[r]d addr		Remove format, if any, at `addr.'

Note:
Do not use this form, as it will clobber your file:

		xsym >.symbols
Instead, use:
		xsym -r

The same applies to xcom and sadfmt.


* FILES

   `.core' consists of binary raw data, where each byte corresponds to
one nibble. The upper half is reserved for other purposes, but
currently unused. Address 0 corresponds to offset 0.
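
   As an illustration of this layout, the following Python sketch
reads nibbles and multi-nibble words out of a core image. The function
names are hypothetical. It assumes the nibble sits in the lower half
of each byte (the upper half being the reserved part) and that words
are stored low nibble first, as is usual on the Saturn.

```python
import io


def read_nibbles(core, address, count):
    """Read `count' nibbles starting at `address' (offset == address)."""
    core.seek(address)
    data = core.read(count)
    if len(data) != count:
        raise EOFError("core image too short")
    return [byte & 0x0F for byte in data]    # ignore the reserved half


def read_word(core, address, width=5):
    """Assemble a `width'-nibble word, least significant nibble first."""
    value = 0
    for shift, nib in enumerate(read_nibbles(core, address, width)):
        value |= nib << (4 * shift)
    return value


# A tiny in-memory stand-in for .core, one byte per nibble.
core = io.BytesIO(bytes([0x8, 0x4, 0xE, 0x2, 0x0, 0x2, 0x0]))
print("%05x" % read_word(core, 0))   # 02e48
```

   With a real image, open it in binary mode, as in open(".core",
"rb"), and pass the file object in place of the BytesIO stand-in.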

`.symbols' consists of lines of the following format:

		<value>:<symbol>
		<value>=<symbol>

Example:	70579=TOS

   The presence of either ":" or "=" reflects whether the symbol was
defined with a "symbol=val" or "val symbol:" statement. It is
currently not used, but may be in the future, especially in
conjunction with Formats and Macros. The value is the symbol value,
and the symbol is the symbol. The file is not ordered.
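
   A minimal Python sketch of a reader for this file follows. It is
not how xsym or sad do it; the function name is made up, the value is
assumed to be hexadecimal as in the example, and quoted or otherwise
funny names get no special treatment.

```python
import re

# <value> in hex, then ":" or "=", then the symbol name.
SYMBOL_LINE = re.compile(r"([0-9A-Fa-f]+)([:=])(.+)")


def parse_symbols(lines):
    """Map each symbol to (value, separator); order does not matter."""
    table = {}
    for line in lines:
        text = line.rstrip("\r\n")
        if not text:
            continue
        match = SYMBOL_LINE.fullmatch(text)
        if match is None:
            raise ValueError("bad symbol line: %r" % text)
        value, separator, symbol = match.groups()
        table[symbol] = (int(value, 16), separator)
    return table


print(parse_symbols(["70579=TOS\n"]))
```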


`.comments' is similar to .symbols:

		<address>=<comment string>
		<address>:<comment string>

Example:	5b79=Allocate a string

   Several comments may be bound to the same address, in which case
they appear in the specified order. Here "=" and ":" reflect whether
the comment is considered a `major comment' or a mere `minor' one.
Major comments are comments put on a line of their own, whereas minor
comments are appended to the right of the code. "=" signals a major
comment, and ":" a minor. The semicolon is implicit, and not included.

   During disassembly, at any given address, all major comments are
output first, followed by any symbol definitions, and then code with
minor comments appended to their right. The file is not ordered.
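
   The ordering rule can be sketched in Python as below. This is an
illustration only: the function names and the tuple layout are
invented here, and the actual text sad prints is not reproduced.

```python
import re
from collections import defaultdict

COMMENT_LINE = re.compile(r"([0-9A-Fa-f]+)([:=])(.*)")


def parse_comments(lines):
    """Split .comments entries into major ("=") and minor (":") lists,
    keyed by address and kept in the order they appear in the file."""
    majors, minors = defaultdict(list), defaultdict(list)
    for line in lines:
        match = COMMENT_LINE.fullmatch(line.rstrip("\r\n"))
        if match is None:
            continue
        address, separator, text = match.groups()
        target = majors if separator == "=" else minors
        target[int(address, 16)].append(text)
    return majors, minors


def emit_order(address, majors, minors, symbols, code):
    """Major comments first, then symbol definitions, then the code
    together with its minor comments."""
    out = [("major", text) for text in majors.get(address, [])]
    out += [("symbol", name) for name in symbols.get(address, [])]
    out.append(("code", code, list(minors.get(address, []))))
    return out


majors, minors = parse_comments(["5b79=Allocate a string\n",
                                 "5b79:size in C.A\n"])
for entry in emit_order(0x5B79, majors, minors, {0x5B79: ["ALLOC"]}, "..."):
    print(entry)
```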


   `.formats' contains disassembly formatting information, mostly
related to correctly decoding data and synchronization. The directives
can be divided into three categories: Machine Language (ML), RPL, and
Data. The file consists of entries of the form:

		<address>:<format>

specifying that from <address> and on, <format> is to be active. If
during disassembly <address> is about to be passed, the disassembler
will back up to <address>. This behaviour is called `synchronization,'
and is performed even if an identical format was previously in effect.
For RPL and ML, <format> is either `r' or `c' respectively, and may
not be nested or combined with or within Data format specifications.
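
   One way to picture the `active from <address> and on' rule is as a
lookup of the nearest entry at or below an address. The Python sketch
below does just that; the names are hypothetical and it says nothing
about how sad stores its table. It also shows why an entry for
address 0 is needed: without one, low addresses have no format at all.

```python
import bisect


def load_formats(lines):
    """Read <address>:<format> entries into sorted parallel lists."""
    entries = {}
    for line in lines:
        text = line.strip()
        if not text:
            continue
        address, _, fmt = text.partition(":")
        # A repeated address simply overwrites here; avoid duplicates.
        entries[int(address, 16)] = fmt
    addresses = sorted(entries)
    if not addresses or addresses[0] != 0:
        raise ValueError(".formats needs an explicit entry for address 0")
    return addresses, [entries[a] for a in addresses]


def active_format(table, address):
    """Return the format of the last entry at or below `address'."""
    addresses, formats = table
    return formats[bisect.bisect_right(addresses, address) - 1]


table = load_formats(["0:c", "2a2b4:r", "5b79:c"])
print(active_format(table, 0x2A2B4))   # r
```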

The syntax for Data format specifications is:

		[<repeat>]<formatchar>[<width>]
	or	<format>[,<format>]
	or	[<repeat>](<format>)

   Where <repeat> and <width> are decimal integers. Commas (,) are
used to separate sequences of formats to be used sequentially.

   The format character <formatchar> is one of the following. `R'
refers to the repeat count, and `W' to the width.

   x	Hex	R words of W nibbles in hexadecimal.
   d	Dec	R words of W nibbles in decimal.
   o	Oct	R words of W nibbles in octal.
   a	Ascii	R sequences of W characters.
   s	String	R sequences of characters whose lengths are determined by
		a W-nibble word preceding the sequence, minus W.
   v	Vector	R sequences of nibbles presented in hex, whose lengths
		are determined by a W-nibble word preceding the sequence,
		minus W.
   w	Word	R 64-bit words presented in floating point, RPL style.
	
Examples:

		5b79:c
		2a2b4:r
		2a2b4:x5,w

[Note: if the example above were actually used, the format effective at
2a2b4 would unpredictably be either one of the two conflicting ones.]
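
   The Data format syntax is small enough to parse by recursive
descent. The Python sketch below is one possible reading of the
grammar, not the code in formats.c; the function name and the tuple
layout are made up. A missing repeat count is taken as 1 and a missing
width is reported as None. The `r' and `c' formats are not Data
formats and are rejected.

```python
def parse_format(spec):
    """Parse a Data format string into nested tuples.

    A plain item becomes (formatchar, repeat, width); a parenthesized
    group becomes ("group", repeat, [items...]).
    """
    pos = 0

    def number():
        nonlocal pos
        start = pos
        while pos < len(spec) and spec[pos] in "0123456789":
            pos += 1
        return int(spec[start:pos]) if pos > start else None

    def item():
        nonlocal pos
        repeat = number()
        if repeat is None:
            repeat = 1
        if pos >= len(spec):
            raise ValueError("format ends early: %r" % spec)
        ch = spec[pos]
        pos += 1
        if ch == "(":
            body = sequence()
            if pos >= len(spec) or spec[pos] != ")":
                raise ValueError("missing ')' in %r" % spec)
            pos += 1
            return ("group", repeat, body)
        if ch not in "xdoasvw":
            raise ValueError("unknown format character %r" % ch)
        return (ch, repeat, number())

    def sequence():
        nonlocal pos
        items = [item()]
        while pos < len(spec) and spec[pos] == ",":
            pos += 1
            items.append(item())
        return items

    result = sequence()
    if pos != len(spec):
        raise ValueError("trailing text in %r" % spec)
    return result


print(parse_format("x5,2(d2,a5)"))
```

   For `x5,2(d2,a5)' this yields one `x' item of width 5 followed by a
group repeated twice, holding a `d' item of width 2 and an `a' item of
width 5.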


   `.macros' contains pairs of patterns and macro definitions. The file
consists of entries of the form:

		<length>,<pattern>:<definition>

   Where <length> is the length of the pattern, <pattern> the pattern
data, and <definition> a string to be expanded. The left-hand side of
the colon (:) is referred to as the `tag.' The definition is the
resultant strings, possibly with embedded expansion directives. These
start with a percent sign (%), are optionally followed by width (W)
and adjustment (A) terms, and end in a directive character. The
interpretation, if any, of W and A is directive dependent.

General:
		%[<w>[,<a>]]<d>

Directives:
   x	Hex	W (default 5) nibbles as hex digits, or as a symbol.
   d	Dec	W (default 5) nibbles as a decimal word.
   o	Oct	W (default 5) nibbles as an octal word.
   b	Bin	W (default 5) nibbles as a binary word.
   w	Word	64-bit word as a floating-point word, RPL fashion.
   l	Long	84-bit word as a long floating-point word, RPL fashion.
   a	Ascii	W (default 1) characters.
   s	String  W (default 2) nibble word specifying the string length
		in nibbles, minus A (default 0).
   S	String  W (default 2) nibble word specifying the string length
		in characters, minus A (default 0).
   v	Vector	W (default 2) nibble word specifying the vector length,
		minus A (default 0), presented as hex digits.
   i	Instr.	W nibble (default 5) word minus 4 minus A specifies a length
		in nibbles to be disassembled as ML. Expands to the
		word content minus A, in decimal. Returns to previous
		format after ML of the given length has been disassembled.
   I	Instr	Override current format with ML (format `c').

   z	Skip	Skip (advance) W nibbles.

   +	Begin	Designate beginning of new block.
   -	End	Designate end of block.
   e	End	Same.
   
   =	Equal	Assert that the following W nibbles are A.

Examples:

		5,2a2c:STRING "%5,5s"
		5,2933:REAL %w
		5,2d9d:PROGRAM%+
		5,312b:END%-
		5,2e48:GLOBAL "%2S"
		5,2dcc:CODE %5,1i
		5,28fc:TYPE%I
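
   To show how such entries break down, here is a small Python sketch
that splits an entry into its tag and definition and picks out the
expansion directives. It is an illustration under stated assumptions,
not the code in macros.c: the function name is invented, <length> is
taken to be decimal and <pattern> hexadecimal, and a literal percent
sign in a definition is not handled.

```python
import re

# %[<w>[,<a>]]<d>, with <d> one of the directive characters listed above.
DIRECTIVE = re.compile(r"%(?:(\d+)(?:,(\d+))?)?([xdobwlasSviIz+\-e=])")


def parse_macro(line):
    """Return (length, pattern, definition, directives) for one entry.

    Each directive is (width, adjustment, character); width and
    adjustment are None when omitted, leaving the default to the
    directive itself.
    """
    tag, _, definition = line.rstrip("\r\n").partition(":")
    length, _, pattern = tag.partition(",")
    directives = [(int(w) if w else None, int(a) if a else None, d)
                  for w, a, d in DIRECTIVE.findall(definition)]
    return int(length), int(pattern, 16), definition, directives


for entry in ['5,2a2c:STRING "%5,5s"', "5,2d9d:PROGRAM%+", "5,2dcc:CODE %5,1i"]:
    print(parse_macro(entry))
```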


   The general idea is to put the programs in a bin directory and set
up a separate directory for each disassembly project.


* GNU Emacs AND sad.el

   The sad-mode facilitates interactive exploration of a core. First
edit sad.el and the runfile variables to point to sad, xsym, and xcom
as appropriate. (Default is according to the current search path.)
Load sad.el and do M-x sad. Emacs will first prompt for a range before
setting up a new buffer and disassembling. The range format is


		<from>-<to>

	   where	     are
		from,to		addresses in hexadecimal.



   While in a SAD buffer, the following key bindings are in effect.
C-c is the conventional "special mode prefix."

	C-c d		Redisassemble.
	C-c r		Set new range and redisassemble.
	C-c C-c		Call on xsym and xcom to extract information,
			and redisassemble. Any errors or warnings
			go into the *SAD Output* buffer.
	C-c q		Quit current buffer.
	C-c n		Set up new buffer with new range.
	C-c o		Set up new buffer with new range in a
			different window.
	C-c v		View format.
	C-c f		Change format.
	C-c C-d		Remove format.
	C-c j		Join (see sadfmt -j) format file.
	C-c m		Edit macros database.
	C-c e	-or-	Move to line of next error in *SAD Output*.
	C-x `
	C-c .	-or-	Move to symbol definition. Will currently search
	M-.		the current buffer only.
	C-c ,	-or-	Move to next definition of same symbol, if any.
	M-,

	C-c s		View value of symbol


	M-;		Add comment, or reindent current comment, as
			appropriate.
	M-LF		Continue comment on next line.


   After C-c C-c an attempt is made at approximately preserving the
current position, so don't be too surprised if the cursor moves a
couple of lines. The window is also recentered around the new point.


   The range is indicated in the mode line, and also makes the default
file name. Should you prefer some other file name, you can change the
variable *sad-default-file-name* in sad.el.



* CODE NOTES

   Sad.el, xcom.c, xsym.c, sadfmt.c, dump2core.c, and scan2core are
pretty straightforward, while sad.c does a lot of hairy stuff related
to Saturn disassembly. Sad.c, xcom.c, and xsym.c have some common
code in misc.c. Formats.c contains most formats-related code, while
macros.c contains what pertains to the implementation of macros. The
code ain't pretty, but it does work.


* MS-DOS

   I haven't used MS-DOS for, eh, 6 years now. Hopefully someone will
make whatever changes are necessary, and repackage SAD with zip/zoo.
This should be fairly trivial for anyone with an MS-DOS system.


* A FINAL WORD

   This is SAD 1.01. Don't expect it to be bug free. It will
eventually be succeeded by 1.02, but nothing prevents you from using
1.01 as no major changes will occur in the database format. You will
be able to reuse your old data with 1.02. Also, by 1.02 a New Syntax
Order may rule - simply redisassemble using your old database.

   This is now 1.02. Several bugs have been fixed, and formats and
macros added. Existing databases are compatible. Sad no longer crashes
without a symbols database. Quoted names work with xsym. The code is
generally more robust. Funny names no longer pose any problems.

   This is SAD 1.03. A number of features have been added: formats
joining, full indentation regardless of format, multiple passes, an
intermediate F pass, local symbol support. Also, the GNU Emacs mode
has been improved, and several bugs fixed. Funny names are still a
little awkward; use them sparingly. NOTICE: the .formats file MUST
contain the line `0:c'.


* DISTRIBUTION AND COPYRIGHT

SAD 1.01 and 1.02 are no longer available from rice-chex.ai.mit.edu.

   SAD 1.03 can be picked up by anonymous FTP from
rice-chex.ai.mit.edu [128.52.38.46] as `~/pub/sad-1.03.tar.Z'. It is
not in the Public Domain as the author retains all copyrights, but it
is free software covered by the GNU General Public License. The file
COPYING describes this license in great detail. If you find it to be
sheer legalese, don't despair: in short, you can do whatever you want
with SAD except sell it for anything beyond copying costs, hide the
source code, or distribute it or any modifications you've made without
the original copyright notices and the file COPYING.

   SAD is distributed in the hope that it will be useful, but with
ABSOLUTELY NO WARRANTY; without even implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Enjoy,
						-- Jan Brittenson 
						   bson@ai.mit.edu