[comp.sys.handhelds] Announcement: SAD 1.02

bson@rice-chex.ai.mit.edu (Jan Brittenson) (10/19/90)

A new release of SAD can be found in:

	 @rice-chex.ai.mit.edu:~/pub/sad-1.02.tar.Z


Differences from 1.01:

	* All bugs reported in 1.01 fixed.

	* RPL disassembly.

	* Data formatting. (Aka data disassembly.)

	* Formats database.

	* Macros.


   Macros are currently limited to 5-nibble sequences, and are only
used during RPL disassembly. Formats direct the disassembler to choose
between RPL and ML and, in the case of Data, tell it how to decode it.
Formats are bound to specific addresses, and are
considered to be in effect until a new format entry is encountered
during disassembly. The disassembler will always synchronize with
format entries.

Included below is README.

Enjoy!


O  /
 \/
 /\ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
O  \


* SYNOPSIS, SAD 1.02

   This is the README file for SAD, the Saturn Disassembler package.
SAD comes with no documentation at this time, other than this file.

   When the documentation is completely finished, it will be included
in the distribution. No partial documentation is included because it
would only serve to confuse and cause complaints. Documentation is
under way, and will be available as a GNU Emacs Info tree. For DOS
users, the info tree will be included in TeX format.

   SAD is a package currently consisting of sad (the disassembler),
xsym (the symbol extractor), xcom (the comment extractor), sadfmt
(formats tool) and sad.el (GNU Emacs SAD mode). The purpose of SAD is
to let you disassemble Saturn Machine Language (ML) and RPL code, edit
it, and maintain databases of symbols, comments, formats and macros.
The formats database contains information directing the disassembler
to either ML, RPL, or Data, the latter of which may be complex nested
structures. The Macros database contains nibble patterns for various
common idioms. You may, for instance, declare that the sequence -

		84e20201424

is to be printed as -

		84e20201424  GLOBAL "AB"

by defining the macro -

		5,2e48:GLOBAL "%2S"
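
   To make the example concrete, here is a minimal Python sketch of how
the tag `5,2e48' and the `%2S' directive could apply to the nibble
sequence above. It assumes, as the example suggests, that nibbles are
stored least-significant first (so 84e20 reads as the word 02e48) and
that each character occupies two nibbles. The function names are made up
for illustration and are not taken from the SAD sources.

```python
def word(nibbles):
    """Value of a hex nibble string stored least-significant nibble first."""
    return int(nibbles[::-1], 16)

def expand_global(seq):
    """Expand the macro 5,2e48:GLOBAL "%2S" against a hex nibble string."""
    if word(seq[:5]) != 0x2e48:          # the 5-nibble tag must match
        return None
    count = word(seq[5:7])               # %2S: 2-nibble length, in characters
    body = seq[7:7 + 2 * count]          # assumed: two nibbles per character
    chars = [chr(word(body[i:i + 2])) for i in range(0, len(body), 2)]
    return 'GLOBAL "%s"' % "".join(chars)

print(expand_global("84e20201424"))      # GLOBAL "AB"
```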

The formats database lets you declare structures such as

		x5,2(d2,a5)

to instruct the disassembler to consider this to be Data consisting
of a 5-nibble hex integer, and two 5-character strings each preceded
by a 2-nibble decimal integer. A formats entry will remain active and
repeated until a new formats entry is applicable. The main purpose
is to specify synchronization points and data formatting.

Note:

   In SAD 1.02, macros are only used during RPL disassembly, and are
restricted to 5-nibble sequences. You may, however, define macros with
pattern tags of any length up to 8 nibbles; those that are not 5
nibbles long will merely be ignored.


* INSTALLATION

   Typing `make' should be sufficient. Since execution speed is
crucial, you may want to turn on all macho speed optimizations
available. To do this, edit Makefile. DOS users may have to convert it
to some other format, or compile manually.


* REPORTING BUGS

   If you find a bug in SAD, you should report it. But first, you
should make sure that it really is a bug, and that it appears in the
latest version of SAD that you have.

   Once you have ascertained that a bug really exists, please mail me
a bug report. If you have a fix, please mail that as well!
Suggestions and `philosophical' bugs are equally welcome.

Please include the following:

	* The version number of SAD
	* A description of the bug behaviour
	* A short script or `recipe' which exercises the bug

And mail it to bson@ai.mit.edu.


* USAGE

Note for DOS users:

   Please read through these instructions first, as several file names
are not compatible with DOS file names. To change the names, edit
sad.h and dump2core.c.

   First KGET (or type in) the DUMP program included in Dump.RPL. The
checksum should be # 5149h and the size 72. This program takes two
arguments: start and end addresses. It will dump memory, using the
PEEK program posted Mar 16, 1990 by Alonzo Gariepy. Make sure to set
the word size to 64 first, with `64 STWS'. Direct I/O to WIRE, and make
sure your computer is set to capture the dump. Hook your 48 up, and
type in #0h #6FFF0h DUMP. DUMP will continually display the current
dump address in the top left corner of the display, which will
otherwise remain blank apart from the menu.

   DUMP will take a long time. The entire ROM dump is about 450
kilobytes - so try and use 9600 bps if possible. Alternatively, you
can ask for an FTP site on the net where a dump of your specific ROM
revision can be found.

   The utility dump2core will convert your dump to a core file named
.core, which is what the disassembler will be looking for. It reads
the dump from standard input, and overwrites .core if it exists;
otherwise a new one will be created. 

[Note: the dump consists of records of two lines each. The first is
the address, the second the data as returned by PEEK. Dump2core
ignores the address part; it is included only to serve as a reference
for you, to allow you to retransmit smaller portions, should it prove
necessary. You are recommended to verify that the dump is correct; the
following command will list all clobbered lines, if any, along with
their line numbers:

		grep -vn '^# [0-9A-F][0-9A-F]*h$' romdump               ]
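
   Where grep is not at hand (DOS, for instance), the same check can
be sketched in Python. This is only an illustration of the test the
grep command performs; the sample lines are made up and the function
name is hypothetical.

```python
import re

# Same pattern as the grep command: "# ", hex digits, a trailing "h".
RECORD = re.compile(r"^# [0-9A-F]+h$")

def clobbered(lines):
    """Return (line number, text) for every line that is not a clean record."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not RECORD.match(text):
            bad.append((number, text))
    return bad

# Made-up sample: line 3 has the letter O for a zero, line 4 is empty.
sample = ["# 0h\n", "# 2369B108DA91h\n", "# 1Oh\n", "\n"]
print(clobbered(sample))
```

For a real dump, pass the open file instead: clobbered(open("romdump")).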

   Copy the provided set of standard symbols, formats, and macros to
.symbols, .formats, and .macros respectively:

		cp stdsymbols .symbols
		cp stdformats .formats
		cp stdmacros .macros

   Use cp, not mv. Keep the standard set of databases right by the
source code to facilitate recreating the standard symbol table should
you happen to wreck something.


Disassembling is done with the `sad' command:

		sad [flags] start end

where		      are
	start,end	Hex addresses of first and last instructions.

	flags, 		A set of flags, always bundled up as one argument.
	   -acsdxz	
		a	Assembler format, i.e. PC and opcode fields are
			suppressed.
	
		c	Suppression of disassembler comments.

		s	Symbolic addresses are moved to the comments.

		d	The supplementary definition of symbols known,
			referenced, but not otherwise defined in the
			output, is suppressed.

		x	A cross reference is added at the end, as
			comments with symbols and addresses in the
			disassembly where they are referenced.
		 	
		z	Alonzo mode. PC and opcode fields are printed
			slightly differently. The initial org instruction
			is suppressed.


   Two of the maintenance tools are xsym and xcom. Both take as arguments
a collection of bundled-up flags (`xsym -sr' for instance).

		s	Supersede contents of database with information
			extracted from a listing on standard input.

		m	Merge contents of database with information
			extracted from a listing on standard input.

		l	Include source line numbers along with any
			errors or warnings.

		r	Overwrite the database file instead of sending
			the superseded/merged result to standard
			output.


   The third maintenance tool is sadfmt. It takes as its first
argument the optional flag "-r", similar to the -r flag of xsym and
xcom, as its second (or first when no flags are present) argument an
address in hex, and as an optional last argument a new format. If no
new format is supplied, the previous format is displayed.

Note:
Do not use this form, as it will clobber your file:

		xsym >.symbols
Instead, use:
		xsym -r

The same applies to xcom and sadfmt.


* FILES

   `.core' consists of binary raw data, where each byte corresponds to
one nibble. The upper half is reserved for other purposes, but
currently unused. Address 0 corresponds to offset 0.
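
   A minimal Python sketch of reading such a file follows. The layout
(one nibble per byte, upper half masked off, address equal to offset)
is as described above; combining nibbles least-significant first is an
assumption, and the function names and the tiny core image are made up
for the example.

```python
def read_nibbles(core, address, count):
    """Return `count` nibble values starting at `address`.

    `core` is the contents of a .core file as bytes: one nibble per
    byte, address 0 at offset 0. The reserved upper half is masked off.
    """
    return [b & 0x0F for b in core[address:address + count]]

def read_word(core, address, count=5):
    """Combine nibbles into an integer, least-significant first (assumed)."""
    value = 0
    for shift, nibble in enumerate(read_nibbles(core, address, count)):
        value |= nibble << (4 * shift)
    return value

# Made-up core image: the nibbles 8 4 e 2 0 at addresses 0..4.
core = bytes([0x8, 0x4, 0xE, 0x2, 0x0])
print(hex(read_word(core, 0)))   # 0x2e48
```

For a real core, read it with open(".core", "rb").read().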

`.symbols' consists of lines of the following format:

		<value>:<symbol>
		<value>=<symbol>

Example:	70579=TOS

   The presence of either ":" or "=" reflects whether the symbol was
defined with a "symbol=val" or "val symbol:" statement. It is
currently not used, but may be in the future, especially in
conjunction with Formats and Macros. The value is the symbol value,
and the symbol is the symbol. The file is not ordered.
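
   As an illustration, a `.symbols' line could be picked apart as in
the Python sketch below. It assumes the value is written in hex, as in
the example, and that the first ":" or "=" on the line ends the value;
the function names are hypothetical.

```python
def parse_symbol_line(line):
    """Split '<value>:<symbol>' or '<value>=<symbol>' into its three parts."""
    line = line.rstrip("\n")
    # The value holds hex digits only, so the first ':' or '=' ends it.
    for i, ch in enumerate(line):
        if ch in ":=":
            return int(line[:i], 16), ch, line[i + 1:]
    raise ValueError("no ':' or '=' in %r" % line)

def load_symbols(lines):
    """Build a value -> [symbols] table; the file itself is unordered."""
    table = {}
    for line in lines:
        if line.strip():
            value, _kind, name = parse_symbol_line(line)
            table.setdefault(value, []).append(name)
    return table

value, kind, name = parse_symbol_line("70579=TOS")
print("%x %s %s" % (value, kind, name))   # 70579 = TOS
```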


`.comments' is similar to .symbols:

		<address>=<comment string>
		<address>:<comment string>

Example:	5b79=Allocate a string

   Several comments may be bound to the same address, in which case
they appear in the specified order. Here "=" and ":" reflect whether
the comment is considered a `major comment' or a mere `minor' one.
Major comments are comments put on a line of their own, whereas minor
comments are appended to the right of the code. "=" signals a major
comment, and ":" a minor. The semicolon is implicit, and not included.

   During disassembly, at any given address, all major comments are
output first, followed by any symbol definitions, and then code with
minor comments appended to their right. The file is not ordered.


   `.formats' contains disassembly formatting information, mostly
related to correctly decoding data and synchronization. The directives
can be divided into three categories: Machine Language (ML), RPL, and
Data. The file consists of entries of the form:

		<address>:<format>

specifying that from <address> and on, <format> is to be active. If
during disassembly <address> is about to be passed, the disassembler
will back up to <address>. This behaviour is called `synchronization,'
and is performed even if an identical format was previously in effect.
For RPL and ML, <format> is either `r' or `c' respectively, and may
not be nested or combined with or within Data format specifications.
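
   The lookup and the synchronization rule can be sketched in Python
as below. This is an illustration under assumptions, not the code in
formats.c: the class and method names are made up, and returning None
where no entry applies is a placeholder for whatever default the
disassembler really uses.

```python
import bisect

class Formats:
    def __init__(self, entries):
        entries = sorted(entries)              # (address, format) pairs
        self.addresses = [a for a, _ in entries]
        self.formats = [f for _, f in entries]

    def active(self, address):
        """Format in effect at `address`, or None before the first entry."""
        i = bisect.bisect_right(self.addresses, address) - 1
        return self.formats[i] if i >= 0 else None

    def advance(self, pc, length):
        """Next address to decode after an item of `length` nibbles at `pc`."""
        i = bisect.bisect_right(self.addresses, pc)
        end = pc + length
        if i < len(self.addresses) and self.addresses[i] < end:
            return self.addresses[i]           # synchronize: back up to entry
        return end

fmts = Formats([(0x5b79, "c"), (0x2a2b4, "r")])
print(fmts.active(0x6000), hex(fmts.advance(0x2a2b0, 7)))   # c 0x2a2b4
```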

The syntax for Data format specifications is:

		[<repeat>]<formatchar>[<width>]
	or	<format>[,<format>]
	or	[<repeat>](<format>)

   Where <repeat> and <width> are decimal integers. Commas (,) are
used to separate sequences of formats to be used sequentially.

   The format character <formatchar> is one of the following. `R'
refers to the repeat count, and `W' to the width.

   x	Hex	R words of W nibbles in hexadecimal.
   d	Dec	R words of W nibbles in decimal.
   o	Oct	R words of W nibbles in octal.
   a	Ascii	R sequences of W characters.
   s	String	R sequences of characters whose lengths are determined by
		a W-nibble word preceding the sequence, minus W.
   v	Vector	R sequences of nibbles presented in hex, whose lengths
		are determined by a W-nibble word preceding the sequence,
		minus W.
   w	Word	R 64-bit words presented in floating point, RPL style.
	
Examples:

		5b79:c
		2a2b4:r
		2a2b4:x5,w

[Note: if the example above were actually used, the format effective at
2a2b4 would unpredictably be either one of the two conflicting ones.]
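
   The Data format syntax is small enough to parse by hand. The Python
sketch below follows the three syntax rules given earlier; it is an
illustration only, the function names are made up, and it does not
handle the non-nestable `r' and `c' formats.

```python
DIGITS = "0123456789"

def parse_format(spec):
    """Parse a Data format into a nested list.

    A simple item becomes (repeat, char, width); a parenthesized group
    becomes (repeat, [items]). A missing repeat or width is None.
    """
    items, pos = _sequence(spec, 0)
    if pos != len(spec):
        raise ValueError("unexpected %r at %d" % (spec[pos], pos))
    return items

def _number(spec, pos):
    end = pos
    while end < len(spec) and spec[end] in DIGITS:
        end += 1
    return (int(spec[pos:end]) if end > pos else None), end

def _sequence(spec, pos):
    items = []
    while True:
        repeat, pos = _number(spec, pos)
        if pos < len(spec) and spec[pos] == "(":
            group, pos = _sequence(spec, pos + 1)
            if pos >= len(spec) or spec[pos] != ")":
                raise ValueError("missing ')'")
            items.append((repeat, group))
            pos += 1
        elif pos < len(spec) and spec[pos] in "xdoasvw":
            char = spec[pos]
            width, pos = _number(spec, pos + 1)
            items.append((repeat, char, width))
        else:
            raise ValueError("bad format at %d" % pos)
        if pos < len(spec) and spec[pos] == ",":
            pos += 1                           # another format follows
        else:
            return items, pos

print(parse_format("x5,2(d2,a5)"))
# [(None, 'x', 5), (2, [(None, 'd', 2), (None, 'a', 5)])]
```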


   `.macros' contains pairs of patterns and macro definitions. The file
consists of entries of the form:

		<length>,<pattern>:<definition>

   Where <length> is the length of the pattern, <pattern> the pattern
data, and <definition> a string to be expanded. The left-hand side of
the colon (:) is referred to as the `tag.' The definition is the
resulting string, possibly with embedded expansion directives. These
start with a percent sign (%), are optionally followed by width (W)
and adjustment (A) terms, and end in a directive character. The
interpretation, if any, of W and A is directive dependent.

General:
		%[<w>[,<a>]]<d>

Directives:
   x	Hex	W (default 5) nibbles as hex digits, or as a symbol.
   d	Dec	W (default 5) nibbles as a decimal word.
   o	Oct	W (default 5) nibbles as an octal word.
   b	Bin	W (default 5) nibbles as a binary word.
   w	Word	64-bit word as a floating-point word, RPL fashion.
   l	Long	84-bit word as a long floating-point word, RPL fashion.
   a	Ascii	W (default 1) characters.
   s	String  W (default 2) nibble word specifying the string length
		in nibbles, minus A (default 0).
   S	String  W (default 2) nibble word specifying the string length
		in characters, minus A (default 0).
   v	Vector	W (default 2) nibble word specifying the vector length,
		minus A (default 0), presented as hex digits.
   z	Skip	Skip W nibbles.

   +	Begin	Designate beginning of new block.
   -	End	Designate end of block.
   e	End	Same.
   
   =	Equal	Assert that the following W nibbles are A.

Examples:

		5,2a2c:STRING "%5,5s"
		5,2933:REAL %w
		5,2d9d:PROGRAM%+
		5,312b:END%-
		5,2e48:GLOBAL "%2S"
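
   A `.macros' entry can be taken apart with a regular expression, as
in the Python sketch below. It assumes that W and A are written in
decimal, which the text does not state; the names are hypothetical and
the sketch only tokenizes directives, it does not expand them.

```python
import re

# %[<w>[,<a>]]<d>, with the directive characters listed above.
DIRECTIVE = re.compile(r"%(?:(\d+)(?:,(\d+))?)?([xdobwlasSvz+\-e=])")

def parse_macro(line):
    """Split '<length>,<pattern>:<definition>' and list its directives."""
    tag, definition = line.rstrip("\n").split(":", 1)
    length, pattern = tag.split(",", 1)
    directives = [
        (int(w) if w else None, int(a) if a else None, d)
        for w, a, d in DIRECTIVE.findall(definition)
    ]
    return int(length), pattern, definition, directives

print(parse_macro('5,2a2c:STRING "%5,5s"'))
# (5, '2a2c', 'STRING "%5,5s"', [(5, 5, 's')])
```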



   The general idea is to put the programs in a bin directory and set
up a separate directory for each disassembly project.


* GNU Emacs AND sad.el

   The sad-mode facilitates interactive exploration of a core. First
edit sad.el and set the runfile variables to point to sad, xsym, and
xcom as appropriate. Load sad.el and do M-x sad. Emacs will first prompt
for a range before setting up a new buffer and disassembling. The range
format is

		<from>-<to>

	   where	     are
		from,to		addresses in hexadecimal.



   While in a SAD buffer, the following key bindings are in effect.
C-c is the conventional "special mode prefix."

	C-c d		Redisassemble.
	C-c r		Set new range and redisassemble.
	C-c C-c		Call on xsym and xcom to extract information,
			and redisassemble. Any errors or warnings
			go into the *SAD Output* buffer.
	C-c q		Quit current buffer.
	C-c n		Set up new buffer with new range.
	C-c o		Set up new buffer with new range in a
			different window.
	C-c v		View format.
	C-c f		Change format.
	C-c m		Edit macros database.
	C-c e	-or-	Move to line of next error in *SAD Output*.
	C-x `
	C-c .	-or-	Move to symbol definition. Will currently search
	M-.		the current buffer only.
	C-c ,	-or-	Move to next definition of same symbol, if any.
	M-,

	M-;		Add comment, or reindent current comment, as
			appropriate.
	M-LF		Continue comment on next line.

   At C-c C-c an attempt is made at approximately preserving the
current position, so don't be too surprised if the cursor moves a
couple of lines.


   The range is indicated in the mode line, and also makes the default
file name. Should you prefer some other file name, you can change the
variable *sad-default-file-name* in sad.el.



* CODE NOTES

   Sad.el, xcom.c, xsym.c, and dump2core.c are pretty straightforward,
while sad.c does a lot of hairy stuff related to Saturn disassembly.
Sad.c, xcom.c, and xsym.c have some common code in misc.c. Formats.c
contains most formats-related code, while macros.c contains what
pertains to the implementation of macros.


* MS-DOS

   I haven't used MS-DOS for, eh, 6 years now. Hopefully someone will
make whatever changes are necessary, and repackage SAD with zip/zoo.
This should be trivial for anyone with an MS-DOS system.


* A FINAL WORD

   This is SAD 1.01. Don't expect it to be bug free. It will
eventually be succeeded by 1.02, but nothing prevents you from using
1.01 as no major changes will occur in the database format. You will
be able to reuse your old data with 1.02. Also, by 1.02 a New Syntax
Order may rule - simply redisassemble using your old database.

   This is now 1.02. Several bugs have been fixed, and formats and
macros added. Existing databases are compatible. Sad no longer crashes
without a symbols database. Quoted names work with xsym. The code is
generally more robust. Funny names no longer pose any problems.


* DISTRIBUTION AND COPYRIGHT

SAD 1.01 is no longer available at rice-chex.ai.mit.edu.

   SAD 1.02 can be picked up with anonymous FTP at
rice-chex.ai.mit.edu [128.52.38.46] as `~/pub/sad-1.02.tar.Z'. It is
not in the Public Domain as it is copyrighted, but it is free software
currently covered by the GNU General Public License. The file COPYING
describes this License in great detail. If you find it to be sheer
legalese, don't despair: in short, you can do whatever you want with
SAD except sell it for anything beyond copying costs, hide the source
code, or distribute it or any modifications you've made without the
original copyright notices and the file COPYING.

   SAD is distributed in the hope that it will be useful, but with
ABSOLUTELY NO WARRANTY; without even implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Enjoy,

	Jan Brittenson <bson@ai.mit.edu>