[comp.sys.mac.programmer] XFCN/XCMD string in LSC C v3.0

oster@dewey.soe.berkeley.edu (David Phillip Oster) (04/09/89)

In article <12964@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
>     Using A4 as a base for global variables is a hack, anyway.  There
>are situations I know of which will invalidate A4, making your global
>variables useless.  (Dialog filter procs, for example.)

But Earle, that is exactly what the routines SetUpA4() and RestoreA4()
fix:

The compiler puts the data in the same resource as the code, after the
code. The routine RememberA0() stashes away A0, which on entry is a
pointer to the resource itself.  It stashes it in the resource, but you
are going to be writing to the data area of the resource anyway, so
this is no worse.  SetUpA4() saves the old value of A4, and sets A4 to
point at the remembered value. After that, you have access to globals
until RestoreA4().  Things like dialog filter procs, and other
procedures that will be called by the operating system need to use
SetUpA4() RestoreA4() to get at the code resource's globals.  This is
no kludge, this is elegance compared to passing a pointer to a record
to every call.
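
The save/restore flow described above can be followed in a small
portable sketch.  This is a toy model, not the LSC implementation: the
real SetUpA4() and RestoreA4() manipulate a 68000 register, whereas
here reg_a4, stashed_base, struct toy_globals and toy_filter_proc are
all hypothetical stand-ins made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the 68000's A4 register. */
static void *reg_a4 = NULL;

/* Hypothetical stand-in for the slot inside the code resource where
   the base address is remembered. */
static void *stashed_base = NULL;

/* The "globals" that live after the code in the same resource. */
struct toy_globals {
    int filter_calls;
};

/* Analogue of RememberA0(): called once on entry, while the address
   of the resource is still known. */
static void remember_a0(void *resource_base)
{
    stashed_base = resource_base;
}

/* Analogue of SetUpA4(): load our base, hand back the caller's value. */
static void *set_up_a4(void)
{
    void *old = reg_a4;
    reg_a4 = stashed_base;
    return old;
}

/* Analogue of RestoreA4(). */
static void restore_a4(void *old)
{
    reg_a4 = old;
}

/* A callback of the dialog-filter kind: invoked with A4 holding
   whatever the caller left there, and with no argument that leads
   back to our globals. */
static int toy_filter_proc(void)
{
    void *saved = set_up_a4();
    struct toy_globals *g = (struct toy_globals *)reg_a4;
    int count;

    assert(g != NULL);          /* remember_a0() must have run first */
    count = ++g->filter_calls;
    restore_a4(saved);          /* the caller gets its own A4 back */
    return count;
}
```

The point of the pairing is that the callback leaves the "register"
exactly as it found it, while still reaching its own data in between.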

One surprise: even if you have no globals, you still need to use
SetUpA4() and RestoreA4() to get at string constants and float
constants. Now that is a kludge!

earleh@eleazar.dartmouth.edu (Earle R. Horton) (04/09/89)

In article <28737@ucbvax.BERKELEY.EDU> oster@dewey.soe.berkeley.edu.UUCP 
	(David Phillip Oster) writes:
>In article <12964@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu 
	(Earle R. Horton) writes:
>>     Using A4 as a base for global variables is a hack, anyway.  There
>>are situations I know of which will invalidate A4, making your global
>>variables useless.  (Dialog filter procs, for example.)

>But Earle, that is exactly what the routines SetUpA4() and RestoreA4()
>fix:
>
>The compiler puts the data in the same resource as the code, after the
>code. The routine RememberA0() stashes away A0, which on entry is a
>pointer to the resource itself.  It stashes it in the resource, but you
>are going to be writing to the data area of the resource anyway, so
>this is no worse.  SetUpA4() saves the old value of A4, and sets A4 to
>point at the remembered value. After that, you have access to globals
>until RestoreA4().  Things like dialog filter procs, and other
>procedures that will be called by the operating system need to use
>SetUpA4() RestoreA4() to get at the code resource's globals.  This is
>no kludge, this is elegance compared to passing a pointer to a record
>to every call.

     Several problems exist with this approach.  The chief one is
writing to the code resource.  This is bad.  It can lead to strange
bugs on 68020 machines.

     Exactly where are the A4 values stored?  If in the code resource,
see previous paragraph.  If on the stack, see the latest TechNote on
SetUpA5() and RestoreA5().

     Saying that something is "no worse" is not as good as taking the
trouble to avoid the problem altogether.

     There are places where stand-alone code resources can implement
"global" variables, and not risk problems with processor caches.
Inside of the resource containing the code is not one of them.
Furthermore, these places vary from application to application, making
it impossible to do this right at the compiler level.  Sure, passing a
pointer to a data structure to every call may not be elegant, but it
is the only method which I can think of that:

     a)  Works for every conceivable application.
     b)  Does not write to the code resource sooner or later.
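
As a rough illustration of the pointer-passing alternative, the sketch
below keeps all state in a record that lives in the entry point's
stack frame and hands its address to each helper.  The names
(struct xcmd_state, accumulate, average, toy_entry) are hypothetical
and not taken from any real XCMD.

```c
#include <assert.h>

/* Hypothetical state that would otherwise be A4-relative globals. */
struct xcmd_state {
    int  calls;
    long total;
};

/* Every helper receives the record explicitly. */
static void accumulate(struct xcmd_state *st, long value)
{
    st->calls += 1;
    st->total += value;
}

static long average(const struct xcmd_state *st)
{
    return st->calls ? st->total / st->calls : 0;
}

/* Entry point: the record lives in this stack frame, so nothing is
   ever written into the block that holds the code. */
long toy_entry(const long *values, int n)
{
    struct xcmd_state st = {0, 0};
    int i;

    for (i = 0; i < n; i++)
        accumulate(&st, values[i]);
    assert(st.calls == n);
    return average(&st);
}
```

The cost is the extra argument on every call; the benefit is that the
code makes no assumption about any base register.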

     The worst thing about development-system-specific tricks like
this is that they are not portable.  Remember portability?  When using
A4-relative data in a driver or code resource, you are perhaps saving
yourself some work, but you are writing code which you can only
expect to be compatible with the development system you are using
right now.  The original poster wanted to convert some XCMDs from
LightSpeed C to MPW C.  He was in trouble because MPW C does not
provide A4-relative data.

     Coding for portability is being friendly to other programmers.
Using A4-relative data is not portable.

>One surprise: even if you have no globals, you still need to use
>SetUpA4() and RestoreA4() to get at string constants and float
>constants. Now that is a kludge!

     This is a compiler problem, for sure.  I recommend using Aztec C
as a fix to this particular problem.  Aztec C will put both string
constants and floating point constants in the code for you.

Earle R. Horton

Graduate Student.  Programmer.  God to my cats.

tim@hoptoad.uucp (Tim Maroney) (04/09/89)

In article <12968@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle
R. Horton) writes:
>     Several problems exist with this approach.  The chief one is
>writing to the code resource.  This is bad.  It can lead to strange
>bugs on 68020 machines.

Sorry, Earle, you're generally one of the better informed voices here
and one well worth reading, but you seem to be having a bad luck streak
this week.  There would only be a problem if the address where A4 is
stashed were executed.  That's the only way the address could find
its way into the instruction cache and cause problems: by being executed.
Since it's just being handled as data, there's no problem.

>     There are places where stand-alone code resources can implement
>"global" variables, and not risk problems with processor caches.
>Inside of the resource containing the code is not one of them.

The processor doesn't know resources from racehorses.  What it knows is
that there are lots of numbered memory locations, and it's ordered to
execute some of them.  These it may cache.  Others it won't, at least
not in the instruction cache.  Stashing data in code space is *not*
self-modifying code.

>Sure, passing a
>pointer to a data structure to every call may not be elegant, but it
>is the only method which I can think of that:
>
>     a)  Works for every conceivable application.
>     b)  Does not write to the code resource sooner or later.

Here we're pretty much in agreement.  In fact, I think it *is* elegant
to keep passing data structure handles through most levels of your
code, though if a routine only uses one or two fields of the data
structure, it's better just to pass those fields.  I've seen too many
programs ruined by indiscriminate use of globals, and I think data
structure arguments lead to far cleaner code in the long run.  However,
I still have not been able to write a serious program with no globals
at all (though I'd think you could do it in a small code chunk like an
XCMD) and mucking about with globals registers remains a necessary
evil in the real world.  Globals should be minimized as a matter of
good programming practice, but it's unrealistic to expect them to
vanish altogether from any large piece of code.

One more note.  This whole thing got started because of a discussion of
string constants.  Uh, guys, you're not supposed to be using those on
the Mac, ya know.  Sure it would be nice if every compiler let you put
them in code space, but for internationalization, you're supposed to
put all of them into 'STR ' and 'STR#' resources.  I have not found
this to be a real burden as long as you run MultiFinder; you can use
ResEdit while your development system stays up.
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"Something was badly amiss with the spiritual life of the planet, thought
 Gibreel Farishta.  Too many demons inside people claiming to believe in
 God." -- Salman Rushdie, THE SATANIC VERSES

earleh@eleazar.dartmouth.edu (Earle R. Horton) (04/10/89)

In article <6944@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>In article <12968@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu 
	(Earle R. Horton) writes:
>>     Several problems exist with this approach.  The chief one is
>>writing to the code resource.  This is bad.  It can lead to strange
>>bugs on 68020 machines.
>
>Sorry, Earle, you're generally one of the better informed voices here
>and one well worth reading, but you seem to be having a bad luck streak
>this week.  There would only be a problem if the address where A4 is
>stashed were executed.  That's the only way the address could find
>its way into the instruction cache and cause problems: by being executed.
>Since it's just being handled as data, there's no problem.

>...Stashing data in code space is *not* self-modifying code.

  The second group is code that changes the block that the code is
  stored in.  Keeping variables in the CODE segment itself is an example
  of this.  This is uncommon with high-level languages, but it is easy
  to do in assembly language (using the DC directive).  Variables
  defined in the code itself should be read-only (constants).  Code that
  modifies itself has signed a tacit agreement that says "I'm being
  tricky, if I die, I'll revise it."
  ...
  If you choose to abuse, you also agree to personal visits from the
  Apple thought police, who will be hired as soon as we find out.

     Technical Note 117, "Compatibility: Why and How." Bo3b Johnson.

     Well, Apple thinks that this *is* self-modifying code.  To be
honest, I don't know if this kind of thing gets into the code cache or
not, but it might someday.  If you want to write straight-up code
then writing to logical code blocks is to be avoided.  If you want to
write tricky code, then you may do so.  The best advice seems to be to
avoid being tricky if you want your programs to last.

     Programming the Macintosh is difficult.  Relying on a compiler
which uses questionable tricks to make that task appear easier is
risky at best.  MPW compilers do not implement global read/write
variables for non-application code resources, presumably because Apple
has not found a safe place to stash the base register.

     Conclusion:  Using global variables in non-application code
resources is at present risky business.  Even though the compiler
writer says "Go ahead and do it," the choice to be tricky or not is
yours.

Earle R. Horton

Graduate Student.  Programmer.  God to my cats.

dg@okra.sybase.com (David Gould) (04/10/89)

In article <6944@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>In article <12968@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle
>R. Horton) writes:
>>     Several problems exist with this approach.  The chief one is
>>writing to the code resource.  This is bad.  It can lead to strange
>>bugs on 68020 machines.
>
>Sorry, Earle, you're generally one of the better informed voices here
>and one well worth reading, but you seem to be having a bad luck streak
>this week.  There would only be a problem if the address where A4 is
>stashed were executed.  That's the only way the address could find
>its way into the instruction cache and cause problems: by being executed.
>Since it's just being handled as data, there's no problem.
 [ other cogent discussion omitted...]
>Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim

Sorry, Tim, but Earle is right, even if for the wrong reasons.  In a protected
or virtual memory environment, it is very common to use the MMU to map 'pure'
code and constant data pages as read only.

 - Pure code and data pages can be shared by all programs without using extra
   physical memory.
 - The paging mechanism knows in advance that pure pages will never be written
   to the pagefile, so it doesn't need pagefile space for them.
 - It helps system reliability to have OS enforced memory access protection.
   This helps catch pointer runaways and other system subversion (e.g. viruses).

It is quite likely that in Apple's future multitasking virtual memory system,
code resources will be mapped read only.  Thus, under system N.0 (for some
large N) any code that writes on code resources will break. 
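
What "mapped read only" would mean for a store into code space can be
sketched with a toy page table.  This is a simulation under made-up
assumptions (a 256-byte page, four pages, and the hypothetical names
struct toy_mmu, toy_store and toy_load); it says nothing about how any
real Apple system does or will do its mapping.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 256
#define NUM_PAGES 4

/* Toy address space: a flat byte array plus one flag per page. */
struct toy_mmu {
    unsigned char mem[NUM_PAGES * PAGE_SIZE];
    int writable[NUM_PAGES];    /* 0 = page is mapped read-only */
};

enum { STORE_OK = 0, STORE_FAULT = -1 };

/* A store is checked against the page's protection before it lands.
   A real MMU would raise an exception; here a fault code is returned. */
static int toy_store(struct toy_mmu *m, size_t addr, unsigned char value)
{
    assert(addr < sizeof m->mem);
    if (!m->writable[addr / PAGE_SIZE])
        return STORE_FAULT;
    m->mem[addr] = value;
    return STORE_OK;
}

/* Loads are allowed from any mapped page. */
static unsigned char toy_load(const struct toy_mmu *m, size_t addr)
{
    assert(addr < sizeof m->mem);
    return m->mem[addr];
}
```

With pages 0 and 1 marked read-only to stand for code, a store of a
saved base register into either of them fails, while the same store
into page 2 or 3 succeeds.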

					...dg
 David Gould                           Sybase, Incorporated      (415) 596-3414
 sybase!dg@sun.com                     6475 Christie Ave.  Emeryville, CA 94608
 {sun,lll-tis,pyramid,pacbell}!sybase!dg

tim@hoptoad.uucp (Tim Maroney) (04/11/89)

Recap to avoid excessive quotation:  We're discussing the problems of using
globals in non-application code, specifically with respect to stashing
globals in code space and providing globals in non-apps under MPW C.

In article <12970@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle
R. Horton) writes:
>
>     Well, Apple thinks that this *is* self-modifying code.  To be
>honest, I don't know if this kind of thing gets into the code cache or
>not, but it might someday.  If you want to write straight-up code
>then writing to logical code blocks is to be avoided.  If you want to
>write tricky code, then you may do so.  The best advice seems to be to
>avoid being tricky if you want your programs to last.

("This" being putting globals in code space.)  In my opinion, the Tech
Note is being loose in its definitions.  Most people reserve the phrase
"self-modifying code" for software that rewrites its INSTRUCTIONS.  If
you review the Tech Note, you will also find that no rationale is given
for not putting globals into code space; he just says that since it's
"self-modifying code" (it isn't) it's tricky and should be avoided.

The fact remains that such non-executable and non-executed data is not
stored in the instruction cache and is very unlikely to be on any
future processor.  Motorola understands that its processors need to
make the minimum of assumptions about the operating environment, and
when it has an approach in place that works in every conceivable
situation (only caching instructions that it has executed, and only
consulting the instruction cache during instruction fetch) it is
inconceivable to me that they would change it to a less general
approach that has no apparent advantages.

It's always best to avoid trickiness.  I don't consider this technique
particularly tricky.  I do think the Tech Note you cited was employing
a tricky definition, however.

>     Programming the Macintosh is difficult.  Relying on a compiler
>which uses questionable tricks to make that task appear easier is
>risky at best.  MPW compilers do not implement global read/write
>variables for non-application code resources, presumably because Apple
>has not found a safe place to stash the base register.

It doesn't implement them in the sense of holding your hand at every
step of the process.  However, it doesn't put up much of a fuss if you
decide to do it yourself.  The compiler and linker will happily churn
out a consistent set of offsets from A5.  As for where to stash the
data register, newer Technical Notes explicitly state that if you must,
you can stash your globals register in code-relative space.  (This is
primarily for interrupt-driven code, but it works fine for maintaining
your own globals register.)

>     Conclusion:  Using global variables in non-application code
>resources is at present risky business.  Even though the compiler
>writer says "Go ahead and do it," the choice to be tricky or not is
>yours.

Conclusion based on what?  At present, it's perfectly safe.  In the
future, it is remotely possible that this could break if Motorola does
something tremendously stupid, or if Apple implements a memory-protected
OS that destroys many of its major third-party products.

As I've said, it is practically impossible to write medium-and-larger
programs that completely eschew globals.  Something of the magnitude of
TOPS never could.  TOPS is a user-level protocol; programmer protocols
like TCP also have a need to maintain connection tables and access them
when a packet comes in.  It might seem that a protocol could avoid
globals by letting the protocol caller maintain a copy of a connection
record, but the software needs to access the tables on the fly, not
just when the caller accesses the protocol implementation.  And so on.

A small code resource like a definition routine or XCMD can probably
get by with no globals, and ought to do so if possible.  But as I said
previously, globals are a necessary evil in modern programming
languages.  Cavalier dismissals of their necessity are not too likely
to convince someone who has been involved in large projects and lots of
large code resources.
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"The above opinions and suggestions have absolutely nothing to do with
 the little, fat man putting crisp, $100 bills in my pocket."
    -- Alan Vymetalik

mjm@eleazar.dartmouth.edu (Michael McClennen) (04/11/89)

If (as I read in current trade publications) the 68040 (presumably the wave
of the future as far as such things go) has separate data and instruction
caches, why would storing data in code space be a problem for a program running
on that processor?  Being rather ignorant about low-level architecture, my
initial perception is that instruction fetches would use one cache, while data
fetches would use the other.  Now, since you obviously have to be able to write
to data-cached locations (a write-through cache, correct?) why wouldn't data
fetches from an address also coincidentally stored in the instruction cache go
through the data cache and not mind that that memory cell had been written to?
This is quite a different matter from self-modifying code, where subsequent
fetches from a modified location go through a read-only cache.

If there is a flaw in my understanding, I would be grateful if someone would
explain this business more thoroughly; I have had the question for quite a while
now.

-- Michael McClennen

thecloud@dhw68k.cts.com (Ken McLeod) (04/11/89)

In article <6944@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>In article <12968@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle
>R. Horton) writes:
>>     Several problems exist with this approach.  The chief one is
>>writing to the code resource.  This is bad.  It can lead to strange
>>bugs on 68020 machines.
>>
>>     There are places where stand-alone code resources can implement
>>"global" variables, and not risk problems with processor caches.
>>Inside of the resource containing the code is not one of them.

>The processor doesn't know resources from racehorses.  What it knows is
>that there are lots of numbered memory locations, and it's ordered to
>execute some of them.  These it may cache.  Others it won't, at least
>not in the instruction cache.  Stashing data in code space is *not*
>self-modifying code.

  What happens on 68030 machines, which have both instruction and data
caches? I've written an INIT in LSC, using A4-relative global data which
I update (read: write to) when certain parameters are changed by the user
through a 'cdev' interface to my patch. As far as I can tell, when my INIT
code goes to fetch the value stored in one of my 'globals', the processor
will use the value in the data cache if it exists there; otherwise, it will
get it from the actual address. But won't altering data at a particular
address cause the cached 'copy' of the data to become invalid, and force
the processor to re-read the data? Or will the "old" value stored in the
cache stay there, and continue to be used? Perhaps there's a way to flush
the data cache? How exactly does processor data caching work?

 (Insert "inquiring minds..." cliche here)

-- 
==========     .......     =============================================
Ken McLeod    :.     .:    UUCP: ...{spsd,zardoz,felix}!dhw68k!thecloud
==========   :::.. ..:::   INTERNET: thecloud@dhw68k.cts.com
                ////       =============================================

tim@hoptoad.uucp (Tim Maroney) (04/12/89)

In article <3756@sybase.sybase.com> dg@okra.UUCP (David Gould) writes:
>Sorry, Tim, but Earle is right, even if for the wrong reasons.  In a protected
>or virtual memory environment, it is very common to use the MMU to map 'pure'
>code and constant data pages as read only.

Sure, and I expect that when Apple comes out with a memory protected
OS, probably in 1990, application CODE resources and possibly some
definition routines (WDEFs, MDEFs, CDEFs, LDEFs) will be mapped into
read-only space.  It is also possible that this will be done with DRVRs
and CDEVs, though I doubt it -- too many of them stash things in code
space so they can access them at the interrupt level without wasting
time searching through system queues.  Possibly even XCMDs/XFCNs will
be mapped read-only.

However, there are some resources that won't ever be marked as
read-only unless they conspire to do it themselves, and that's the kind
I usually write.  For instance, INITs that install themselves in the
system heap or BufPtr and patch traps won't be marked read-only because
the system doesn't know what's going on -- as far as it's concerned,
this is data.  Once this kind of code is installed, it's not even a
resource any more.  Or, utility libraries that provide database services
or implement a help system, stored as a code resource with a "routine
selector" type dispatch routine at the start of the resource.  These
may remain resources, but of a non-standard type that the system has
no way of recognizing.

>It is quite likely that in Apple's future multitasking virtual memory system,
>code resources will be mapped read only.  Thus, under system N.0 (for some
>large N) any code that writes on code resources will break. 

Not "any", for the reasons cited above.  Some.

Furthermore, I expect that Apple will have to provide a way to override
this protection.  For instance, suppose that your application
implements a network protocol that installs a socket listener or
protocol handler.  There's no way around it -- that listener *must*
stash its globals register (at least) in code space.  If it doesn't, it
has no way to communicate packets received to the rest of the
application.  Furthermore, it must be linked with the rest of the
application, not as a separate code resource, because otherwise it
can't know what offset to use for the "packets received queue" global.
Other code that can't get around running at the interrupt level will
also need to use this trick.
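
The listener's predicament can be shown in miniature: a callback whose
signature carries no context has nowhere to find the application's
data except a location fixed in advance.  In this sketch a static
variable stands in for the slot in code space, and every name
(struct app_globals, install_listener_globals, socket_listener,
toy_deliver_packets) is hypothetical; a real socket listener is
entered at interrupt level with its own register conventions.

```c
#include <assert.h>
#include <stddef.h>

/* The application's data, including the "packets received" count. */
struct app_globals {
    int packets_received;
};

/* Stands in for the slot in code-relative space.  It is written once,
   at install time, and only read afterwards. */
static struct app_globals *stashed_globals = NULL;

/* The callback type offers no parameter through which the application
   could be handed its own state. */
typedef void (*listener_fn)(void);

static void install_listener_globals(struct app_globals *g)
{
    stashed_globals = g;
}

/* The listener reaches the application's data through the stash. */
static void socket_listener(void)
{
    assert(stashed_globals != NULL);
    stashed_globals->packets_received += 1;
}

/* Toy "network driver": calls the listener once per packet. */
static void toy_deliver_packets(listener_fn fn, int count)
{
    while (count-- > 0)
        fn();
}
```

If the callback did take a context pointer, the stash would be
unnecessary; the sketch assumes, as the text does, that it does not.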

Apple may or may not do this in a backwards-compatible way -- I hope
they do.  MultiFinder is backwards compatible because "no MultiFinder
resource" means "not MultiFinder friendly".  The same could be
accomplished by a resource indicating "make my code resources
read-only".  If it's missing, they're read-write; otherwise, they're
read-only.  The same approach could be used with CDEVs and DRVRs, using
an owned "make me read-only" resource.  Or Apple could just use the
reserved resource flags bit for that purpose (bit 0; see IM I-111.)
Nobody's got it set now, so this would work correctly.

On the other hand, a clean but incompatible way to do this would be to
provide a new trap that temporarily turns off memory protection, or a
version of BlockMove that ignores memory protection.  Notice that the
application socket listener only needs to write into code space once.
This could easily be accomplished using these traps, albeit breaking
existing code in the process.  Apple's usually pretty good about
compatibility, since computers get sold by their software base, so I
expect they will probably use some variant on the "read-only resource",
rather than requiring calls to new traps.

So why don't older OS's like UNIX have to do this?  Typically, they use
absolute globals rather than globals relative to an offset, because
they had some form of MMU built in from the start.  If the Mac used
this approach, then the issue would be irrelevant -- the socket
listener would just stash it at global location 0x2500 or something.
The OS is also statically linked -- the system maintainer has to relink
the system whenever something is added, barring special hacks for
run-time drivers that have been made in some UNIXes this decade.

If Apple had done this to globals, they could never have expanded the
system heap, nor provided Switcher or MultiFinder, on their existing
machines.  With register-relative globals, relocatability does not
require an MMU.  Good design decision for a low-end computer.

Semi-related PS:  John Gilmore wrote to inform me that non-executed
data *can* find its way into the instruction cache because of prefetch.
However, since the cache is only consulted during instruction fetch,
not data fetch, there's still not a problem with PC-relative data.
The instruction cache may have an invalid copy, but the processor
won't look there unless it's executing that location.
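
The argument in this postscript can be traced with a toy model in
which the instruction cache is filled and consulted only by
instruction fetches.  The organization here (one cache entry per
word, sixteen words of memory) and the names struct toy_cpu,
fetch_instruction, load_data and store_data are assumptions chosen
for brevity, not a description of the 68020's actual cache.

```c
#include <assert.h>

#define MEM_WORDS 16

struct toy_cpu {
    unsigned short ram[MEM_WORDS];
    unsigned short icache[MEM_WORDS];       /* one entry per word */
    int            icache_valid[MEM_WORDS];
};

/* Instruction fetch (or prefetch): the only path that fills or
   consults the instruction cache. */
static unsigned short fetch_instruction(struct toy_cpu *c, int addr)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    if (!c->icache_valid[addr]) {
        c->icache[addr] = c->ram[addr];
        c->icache_valid[addr] = 1;
    }
    return c->icache[addr];
}

/* Data accesses go straight to memory and never look in the
   instruction cache. */
static unsigned short load_data(const struct toy_cpu *c, int addr)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    return c->ram[addr];
}

static void store_data(struct toy_cpu *c, int addr, unsigned short v)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    c->ram[addr] = v;
}
```

If a word is prefetched and then overwritten as data, load_data()
returns the new value while fetch_instruction() still returns the old
one.  The stale copy matters only if that location is executed.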
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"Mere opinion without supporting argument is no more than the American
 Bandstand school of literary evaluation."  -- Tom Maddox

tim@hoptoad.uucp (Tim Maroney) (04/14/89)

In article <22027@dhw68k.cts.com> thecloud@dhw68k.cts.com (Ken McLeod) writes:
>  What happens on 68030 machines, which have both instruction and data
>caches?

Simple; when you change the value at an address, the value in the data
cache changes.  Data caching would be pretty useless otherwise, if you
think about it.  What use would there be in having a data cache that
didn't stay synched with RAM?  This is not a problem.

(Note -- I don't have a 68030 manual, and it's possible that instead of
immediately updating the cache value, it just invalidates it on write.
As far as the programmer is concerned, the effect is the same.)
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"Every institution I've ever been associated with has tried to screw me."
	-- Stephen Wolfram

ech@pegasus.ATT.COM (Edward C Horvath) (04/14/89)

From article <6944@hoptoad.uucp>, by tim@hoptoad.uucp (Tim Maroney):

> One more note.  This whole thing got started because of a discussion of
> string constants.  Uh, guys, you're not supposed to be using those on
> the Mac, ya know.  Sure it would be nice if every compiler let you put
> them in code space, but for internationalization, you're supposed to
> put all of them into 'STR ' and 'STR#' resources.  I have not found
> this to be a real burden as long as you run MultiFinder; you can use
> ResEdit while your development system stays up.

If all you develop are applications, which own all resource IDs from
128 through 2^15-1, or DRVRs (each of which owns 32 IDs of every conceivable
resource) or one of the other "blessed" code types, this is a reasonable
approach.  What STR and STR# resource IDs should an XCMD use?  Or should
I use GetNamedResource and HOPE nobody else uses a STR# named "godzilla"?
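
For reference, the "32 IDs" a DRVR owns can be worked out
arithmetically.  The sketch assumes the owned-resource bit layout as
I recall it from the Resource Manager chapter of Inside Macintosh
Volume I (bits 15-14 set, bits 13-11 the owner type with DRVR = 0,
bits 10-5 the owner's ID, bits 4-0 the sub-ID); check it against the
book before relying on it.  The function name owned_resource_id is
hypothetical.

```c
#include <assert.h>

enum { OWNER_TYPE_DRVR = 0 };   /* assumed type code for DRVR owners */

/* Compose a 16-bit owned resource ID and return it as a signed value. */
static int owned_resource_id(int owner_type, int owner_id, int sub_id)
{
    unsigned int bits;

    assert(owner_type >= 0 && owner_type <= 7);
    assert(owner_id >= 0 && owner_id <= 63);
    assert(sub_id >= 0 && sub_id <= 31);    /* five bits: 32 IDs */

    bits = 0xC000u | ((unsigned int)owner_type << 11)
                   | ((unsigned int)owner_id << 5)
                   | (unsigned int)sub_id;

    /* bits is always >= 0xC000, so the 16-bit two's-complement
       reading is bits - 65536, a value from -16384 to -1. */
    return (int)((long)bits - 0x10000L);
}
```

Under that layout a DRVR with ID 12 owns IDs -16000 through -15969 of
each resource type.  Nothing comparable is defined for an XCMD, which
is the gap being pointed out.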

I'm not challenging the value of Tim's approach where it's cleanly
applicable, but what to do where it isn't?  DTS?

=Ned Horvath=

ech@pegasus.ATT.COM (Edward C Horvath) (04/14/89)

In article <6944@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>The processor doesn't know resources from racehorses.  What it knows is
>that there are lots of numbered memory locations, and it's ordered to
>execute some of them.  These it may cache.  Others it won't, at least
>not in the instruction cache.  Stashing data in code space is *not*
>self-modifying code.

From article <22027@dhw68k.cts.com>, by thecloud@dhw68k.cts.com (Ken McLeod):
>   What happens on 68030 machines, which have both instruction and data
> caches?

What SHOULD happen, naturally!  Listen carefully, and re-read as necessary:
The PC is used to fetch instructions, from the cache if there's a hit, from
RAM otherwise.  ALL other memory references are to/from the data cache if
there is one, and all writes to the data cache are write-thru.

Thus, for example, when you save an A5 or A4 or whathaveyou value, it is
written to the data cache (if any) and RAM.

This is so simple that it is hard to screw up.
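
A toy write-through data cache makes the same point in code.  It also
covers the two write policies mentioned earlier in the thread (update
the cached copy, or invalidate it); the program sees the same value
either way.  The sizes and the names struct toy_dcache, dc_load and
dc_store are hypothetical, and the model is not the 68030's actual
cache organization.

```c
#include <assert.h>

#define MEM_LONGS 8

enum write_policy { UPDATE_ON_WRITE, INVALIDATE_ON_WRITE };

struct toy_dcache {
    long ram[MEM_LONGS];
    long line[MEM_LONGS];       /* cached copy, one entry per long */
    int  valid[MEM_LONGS];
    enum write_policy policy;
};

/* Load: serve from the cache on a hit, otherwise fill from RAM. */
static long dc_load(struct toy_dcache *d, int addr)
{
    assert(addr >= 0 && addr < MEM_LONGS);
    if (!d->valid[addr]) {
        d->line[addr] = d->ram[addr];
        d->valid[addr] = 1;
    }
    return d->line[addr];
}

/* Store: write-through, so RAM is always updated.  A cached copy is
   either refreshed or dropped, depending on the policy. */
static void dc_store(struct toy_dcache *d, int addr, long value)
{
    assert(addr >= 0 && addr < MEM_LONGS);
    d->ram[addr] = value;
    if (d->policy == UPDATE_ON_WRITE) {
        if (d->valid[addr])
            d->line[addr] = value;
    } else {
        d->valid[addr] = 0;
    }
}
```

Saving a base register value through dc_store() and reading it back
through dc_load() returns the saved value under both policies, and RAM
holds it as well.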

That doesn't mean that, at some point, you won't run under an OS that
write-protects what it believes to be code.  When that happens, code from
Aztec C or MPW C with the -b will still be fine: only constants (which you
aren't going to modify, right?) are in codespace, and they can be write
protected.  An attempt to store an Ax reg in codespace will cause a memory
exception and termination with extreme prejudice.  But, as Tim Maroney
observed, there are things like IO Completion routines that CAN'T operate
any other way under the present Mac regime: they MUST store the global
base reg in a pc-relative place.

I suspect that Tim has correctly predicted the future as well: there will
be a bit in the BNDL or SIZE resource that says the app will deal properly
with being loaded at (virtual) address 0 (like the Unix model).  Since
data is always at the same (virtual) address with this model, all that
remains is for the app to be able to tell how it's loaded, say by a flag
in SysEnvirons.

=Ned Horvath=

tim@hoptoad.uucp (Tim Maroney) (04/14/89)

In article <2784@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:
>If all you develop are applications, which own all resource IDs from
>128 through 2^15-1, or DRVRs (each of which owns 32 IDs of every conceivable
>resource) or one of the other "blessed" code types, this is a reasonable
>approach.  What STR and STR# resource IDs should an XCMD use?  Or should
>I use GetNamedResource and HOPE nobody else uses a STR# named "godzilla"?

Very interesting point.  How about just saying that you use the same
resource id as the XCMD uses?  The programmer ought to be able to find
some suitable ID for both an XCMD and a STR# (and you shouldn't need
more than one of the latter).  The XCMD can discover its own id by
getting a pointer to its first address (either the address of the entry
point routine, or the A0 that the default LSC header passes you), doing
a RecoverHandle, then doing a GetResInfo.
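
The lookup chain just described (start address, to resource entry, to
ID, to the STR# with the same ID) can be sketched against a toy
resource map.  This does not call the Toolbox: toy_recover and toy_get
are hypothetical stand-ins for the roles RecoverHandle/GetResInfo and
a by-ID fetch would play, and struct toy_resource is invented for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy resource map entry: a type, an ID, and where the data lives. */
struct toy_resource {
    const char *type;
    int         id;
    const void *data;
};

/* Stand-in for "RecoverHandle, then GetResInfo": find the entry whose
   data starts at the given address. */
static const struct toy_resource *
toy_recover(const struct toy_resource *map, size_t n, const void *data)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (map[i].data == data)
            return &map[i];
    return NULL;
}

/* Stand-in for fetching a resource by type and ID. */
static const struct toy_resource *
toy_get(const struct toy_resource *map, size_t n, const char *type, int id)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (map[i].id == id && strcmp(map[i].type, type) == 0)
            return &map[i];
    return NULL;
}

/* Given the code's own start address, find the STR# sharing its ID. */
static const void *
strings_for_code(const struct toy_resource *map, size_t n,
                 const void *code_start)
{
    const struct toy_resource *self, *strings;

    assert(map != NULL);
    self = toy_recover(map, n, code_start);
    if (self == NULL)
        return NULL;            /* not a resource we know about */
    strings = toy_get(map, n, "STR#", self->id);
    return strings ? strings->data : NULL;
}
```

The scheme only works while the code and its strings keep the same
ID, so anything that renumbers one without the other breaks it.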

If you use GetNamedResource, you're back to using strings in your code
-- while you're still easier to internationalize, the technical
problems with strings in code resources remain.
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"The government of the United States is not, in any sense, founded
 on the Christian religion." -- George Washington

alexis@ccnysci.UUCP (Alexis Rosen) (04/14/89)

In article <6944@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>In article <12968@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle
>R. Horton) writes:
>>     Several problems exist with this approach.  The chief one is
>>writing to the code resource.  This is bad.  It can lead to strange
>>bugs on 68020 machines.
>
>[...]  There would only be a problem if the address where A4 is
>stashed were executed.  That's the only way the address could find
>its way into the instruction cache and cause problems: by being executed.
>Since it's just being handled as data, there's no problem.

This was my initial reaction as well. But I'm not so sure.

First of all, is code stashed in the cache on a byte-by-byte basis? Maybe,
in the 020/030, it is. I'm not sure. (I can't see it reading ahead in
32-word chunks, really. Too slow.)

But in the 040 there are 4K bytes of instruction cache. Does the CPU tag
each word? If not, data could wind up being sucked into the cache along
with instructions. Not much, but some.

On the other hand, this probably wouldn't affect things anyway, since there
wouldn't be a cache hit in the i-cache when it's looking for data, would
there?

confusion reigns...

---
Alexis Rosen
alexis@ccnysci.{uucp,bitnet}

p.s. Why haven't there been any discussions of the '040, either here or in
comp.arch? Wasn't the information announced enough to start a half-dozen
flame wars?

alexis@ccnysci.UUCP (Alexis Rosen) (04/14/89)

Perhaps I'm being dense (it's 5AM), but why can't compilers just see that
they're creating a stand-alone resource and put the "globals" in a stack
frame?

---
Alexis Rosen
alexis@ccnysci.{uucp,bitnet}

duggie@Jessica.stanford.edu (Doug Felt) (04/15/89)

The problem with using the resource id of the XCMD/XFCN resource as
the key to a corresponding STR# (or other) resource is that generally
when you copy resources the resource id gets reassigned.  Thus if you
distribute an XCMD in a stack with an "install" option you need an
installer XCMD that 1) lets you detect id conflicts and 2) lets you
assign ids explicitly.  The most common installers don't, and I had to
write my own in order to handle this problem.
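
A minimal sketch of the conflict check such an installer needs, written
as plain C over an array of ids rather than against the Resource
Manager.  The names id_in_use, choose_install_id, sample_ids and
NO_FREE_ID are hypothetical, and the renumbering policy (take the next
free id above the one asked for) is only one possible choice:

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel outside the 16-bit resource id range: no free id was found. */
#define NO_FREE_ID 65536L

/* Illustrative data only: ids already present in the destination stack. */
static const short sample_ids[] = { 128, 129, 131 };
#define SAMPLE_COUNT (sizeof sample_ids / sizeof sample_ids[0])

/* Nonzero if id already appears in the destination's id list. */
static int id_in_use(const short *ids, size_t count, short id)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (ids[i] == id)
            return 1;
    return 0;
}

/* Return wanted if it is free, otherwise the next free id up to max_id,
 * or NO_FREE_ID if every id in that range is taken. */
static long choose_install_id(const short *ids, size_t count,
                              short wanted, short max_id)
{
    long id;
    assert(ids != NULL || count == 0);
    for (id = wanted; id <= max_id; id++)
        if (!id_in_use(ids, count, (short)id))
            return id;
    return NO_FREE_ID;
}
```

If the XCMD uses its own id as the key to a companion STR#, whatever id
comes back has to be applied to both resources together, or the key no
longer matches.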

If you don't plan on letting the user copy the XCMD then you have
complete control over the resource numbering and naming, so can use
whatever scheme suits you (as long as you use the 1 level deep
resource manager calls).

Multiple resources are also a problem if you plan on letting the user
switch stacks, which causes resource files to be closed.  You must
take care not to create data structures containing resource handles
that might persist over the closing of a stack.  Most XCMDs don't
do this so few people hit this problem.

I say go ahead and use embedded strings.  Hypercard is not really
script-manager friendly, and many stacks do lots of string munging
that is not fully language-independent.  When you get right down to
it, XCMDs are basically a hack. Let's not worry too hard about whether
we're following all the compatibility rules.

Doug Felt
Courseware Authoring Tools Project

ech@pegasus.ATT.COM (Edward C Horvath) (04/15/89)

From article <1583@ccnysci.UUCP>, by alexis@ccnysci.UUCP (Alexis Rosen):

> Perhaps I'm being dense (it's 5AM), but why can't compilers just see that
> they're creating a stand-alone resource and put the "globals" in a stack
> frame?

You're not dense: the easiest way for a pascal programmer to end-run the
globals problem is to declare a "main" PROCEDURE and nest all the
subroutines within it.  If the codeRes is small enough, no worries.

C programmers, lacking nested procedures, are underwhelmed by this
solution.  Thus Aztec C and LSC both provide for globals in code resources.
MPW C does not.

Hmm, I wonder who drives the MPW bus...

=Ned Horvath=

bayes@hpfcdc.HP.COM (Scott Bayes) (04/15/89)

As far as I know the cache-line is 4 longwords in the 68030 d and i-caches.
This is 128 bits, and is aligned to quad-longword boundaries. If there's a
cache miss, the address you wanted is loaded first, then the remaining
data in the cache line is loaded into the cache while the CPU messes with the
data just loaded. Of course, all flushes also happen on quad-longword
boundaries as well. You really want a 32-bit wide data bus in the machine
to take advantage of this.
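
The alignment arithmetic, as a rough sketch that takes the 16-byte line
size above as given.  The function names are made up for illustration;
nothing here touches the real cache:

```c
#include <assert.h>
#include <stdint.h>

/* Line size as described above: 4 longwords of 4 bytes each. */
#define CACHE_LINE_BYTES 16u

/* Address of the first byte of the line that contains addr. */
static uint32_t cache_line_base(uint32_t addr)
{
    return addr & ~(uint32_t)(CACHE_LINE_BYTES - 1u);
}

/* Nonzero if both addresses fall within the same 16-byte line. */
static int same_cache_line(uint32_t a, uint32_t b)
{
    return cache_line_base(a) == cache_line_base(b);
}

/* Which of the four longwords in its line an aligned address occupies (0..3). */
static unsigned longword_slot(uint32_t addr)
{
    assert((addr & 3u) == 0);   /* caller must pass a longword-aligned address */
    return (unsigned)((addr & (CACHE_LINE_BYTES - 1u)) >> 2);
}
```

So a longword stashed within 16 bytes of the last instruction in a
resource can share a line with that instruction, which is the case the
earlier posts were asking about.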

Scott Bayes

pratt@boulder.Colorado.EDU (Jonathan Pratt) (04/15/89)

In article <2788@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:
>You're not dense: the easiest way for a pascal programmer to end-run the
>globals problem is to declare a "main" PROCEDURE and nest all the
>subroutines within it.
>
I just wanted to point out that C programmers can emulate this nested
globals behavior by doing manually what the Pascal compiler does auto-
magically: simply define the globals as a struct var in the procedure
designated as "main" and pass a pointer to this struct to each "nested"
procedure.  Pascal does this with hidden arguments.  Yes, it does get
a bit ugly if the nesting grows deep.  I noticed that LSP does a pretty
nice job of this for single nesting;  Address register A4 picks up the
hidden globals pointer, so there isn't much overhead.
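
A bare-bones sketch of the idea in portable C.  The struct, its fields
and the routine names are invented for illustration, and resource_main
only stands in for whatever entry point the code resource really has:

```c
#include <assert.h>

/* Hypothetical "globals" for a code resource, gathered into one record. */
struct Globals {
    int  callCount;
    long total;
};

/* A "nested" routine: it gets the globals through an explicit pointer,
 * where Pascal would pass the enclosing frame as a hidden argument. */
static void accumulate(struct Globals *g, long value)
{
    assert(g != 0);
    g->callCount += 1;
    g->total += value;
}

/* Stand-in for the procedure designated as "main": the record lives in
 * its stack frame, so nothing is written into the code resource. */
static long resource_main(long first, long second)
{
    struct Globals g = { 0, 0 };
    accumulate(&g, first);
    accumulate(&g, second);
    return g.total;
}
```

The cost is one extra pointer argument on every call that needs the
record.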

Jonathan

/* Jonathan Pratt          Internet: pratt@boulder.colorado.edu     *
 * Campus Box 525              uucp: ..!{ncar|nbires}!boulder!pratt *
 * University of Colorado                                           *
 * Boulder, CO 80309          Phone: (303) 492-4293                 */

paul@taniwha.UUCP (Paul Campbell) (04/16/89)

In article <2786@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:
>
>What SHOULD happen, naturally!  Listen carefully, and re-read as necessary:
>The PC is used to fetch instructions, from the cache if there's a hit, from
>RAM otherwise.  ALL other memory references are to/from the data cache if
>there is one, and all writes to the data cache are write-thru.
>
>This is so simple that it is hard to screw up.

Well, not quite that simple.  The tags in the caches are 'virtual'
addresses, and the MacOS (24 bit) aliases the same memory locations with
different addresses (yes .... the Handle tags in the high byte! :-).  On
the '030 they get away with this in the data cache (it doesn't matter in
the instruction cache because you don't write to it) by setting the WA
bit in the cache control register.

They also have to do an instruction cache invalidate when doing a _LoadSeg.
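
To make the aliasing concrete, a small sketch in portable C.  The helper
names are hypothetical (strip_address here is not the Toolbox call), and
the flag value in the usage note is only an example of a tag in the high
byte:

```c
#include <assert.h>
#include <stdint.h>

/* In 24-bit mode only the low 24 bits of an address select memory. */
#define ADDR24_MASK 0x00FFFFFFu

/* Drop the high byte, leaving the part that actually selects memory. */
static uint32_t strip_address(uint32_t p)
{
    return p & ADDR24_MASK;
}

/* Put tag bits in the high byte of a clean 24-bit address. */
static uint32_t tag_pointer(uint32_t addr, uint8_t flags)
{
    assert((addr & ~ADDR24_MASK) == 0);   /* address must fit in 24 bits */
    return ((uint32_t)flags << 24) | addr;
}

/* Nonzero if two 32-bit values name the same byte in 24-bit mode. */
static int aliases_same_memory(uint32_t a, uint32_t b)
{
    return strip_address(a) == strip_address(b);
}
```

For example, tag_pointer(0x00123456, 0x80) gives 0x80123456, which names
the same byte as 0x00123456 in 24-bit mode even though a cache tagged
with the full 32-bit value treats the two as different addresses.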

	Paul

-- 
Paul Campbell
Taniwha Systems Design			UUCP:		..!mtxinu!taniwha!paul 
Oakland CA				AppleLink:	D3213

tim@hoptoad.uucp (Tim Maroney) (04/17/89)

In article <1567@Portia.Stanford.EDU> duggie@Jessica.stanford.edu (Doug Felt)
writes:
>The problem with using the resource id of the XCMD/XFCN resource as
>the key to a corresponding STR# (or other) resource is that generally
>when you copy resources the resource id gets reassigned.  Thus if you
>distribute an XCMD in a stack with an "install" option you need an
>installer XCMD that 1) lets you detect id conflicts and 2) lets you
>assign ids explicitly.  The most common installers don't, and I had to
>write my own in order to handle this problem.

Well, but, a "user" should never see an XCMD.  It's a developer tool
which is (or should be) only visible to the highest level of HC users,
those who've clicked user level scripting.  Once you get to that
level, there shouldn't be any problem with instructing the developer
to use ResEdit.  You don't need to idiot-proof things by providing
an HC installer.

>I say go ahead and use embedded strings.  Hypercard is not really
>script-manager friendly, and many stacks do lots of string munging
>that is not fully language-independent.  When you get right down to
>it, XCMDs are basically a hack. Let's not worry too hard about whether
>we're following all the compatibility rules.

Gag!  Choke!

The last sentence is too obviously screwy to bother refuting, but there
is an interesting issue earlier on.  You can be internationalizable
without being Script Manager friendly -- there are really two levels of
international portability.  One is when you're compatible with other
Indo-European languages that use one-byte-per-character alphabets and
separate words using spaces; the other is where you're compatible with
any language that can be crammed into the Script Manager.  The claimed
HC incompatibility with the Script Manager may be real, but on the
other hand, HC *is* pretty Europe-friendly.  Any developer who doesn't
want to break this would be well advised to use separate strings.
-- 
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"Conversion, fastidious Goddess, loves blood better than brick, and feasts
 most subtly on the human will." - Virginia Woolf, "Mrs. Dalloway"

duggie@Jessica.stanford.edu (Doug Felt) (04/17/89)

In article <7020@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>In article <1567@Portia.Stanford.EDU> duggie@Jessica.stanford.edu (Doug Felt)
>writes:
>>The problem with using the resource id of the XCMD/XFCN resource as
>>the key to a corresponding STR# (or other) resource is that generally
>>when you copy resources the resource id gets reassigned.  Thus if you
>>...
>
>Well, but, a "user" should never see an XCMD.  It's a developer tool
>which is (or should be) only visible to the highest level of HC users,
>those who've clicked user level scripting.  Once you get to that
>level, there shouldn't be any problem with instructing the developer
>to use ResEdit.  You don't need to idiot-proof things by providing
>an HC installer.

I disagree.  First, there are users who do no appreciable amount of
scripting but who may still create simple stacks for their own use, by
cutting and pasting buttons, laying out fields, and the like.  They
interact with XCMDs by pushing buttons.  A simple example would be an
"Open" button which brings up the open file dialog and lets the user
launch an application.  A user who does no scripting, and does not
know how to use or even own Resedit, may yet wish to have this in her
or his Home or other stack.  An "Install" button is the best means of
getting it there.  Second, there are Hypercard "applications" that use
multiple stacks, and that share resources between stacks.  In this
case the user may never have so much as created a button or typed
"find" into the message window.  Again, such resources must be
installed into Hypercard or the Home stack, so an installer is
necessary.  And of course the installer must check for collisions with
existing resource names or ids.

>>I say go ahead and use embedded strings.  Hypercard is not really
>>script-manager friendly, and many stacks do lots of string munging
>>that is not fully language-independent.  When you get right down to
>>it, XCMDs are basically a hack. Let's not worry too hard about whether
>>we're following all the compatibility rules.

(I forgot the :-))

>The last sentence is too obviously screwy to bother refuting, but there
>is an interesting issue earlier on.  You can be internationalizable
>without being Script Manager friendly -- there are really two levels of
>international portability.  One is when you're compatible with other
>Indo-European languages that use one-byte-per-character alphabets and
>separate words using spaces; the other is where you're compatible with
>any language that can be crammed into the Script Manager.  The claimed
>HC incompatibility with the Script Manager may be real, but on the
>other hand, HC *is* pretty Europe-friendly.  Any developer who doesn't
>want to break this would be well advised to use separate strings.

I agree, although as one who works with languages "crammed into the
Script manager," I find partial internationalization unsatisfactory.
Rather like having an application 3/4 debugged :-).  But it may be the
best you can do, and if you anticipate users with native languages
different from your own, by all means use separate strings (and watch
out for number, date, time, and monetary formats, which might switch on
you).  On the other hand, if you don't plan on wide distribution and
want some string constants that only programmers will see--such as
returning the keyword "Error" at the start of an XFCN result string,
or named parameters to an XCMD--then embedded constants may be the
most expedient, if not the most professional, solution.

>Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim

Doug Felt
Courseware Authoring Tools Project
duggie@jessica.stanford.edu

duggie@Jessica.stanford.edu (Doug Felt) (04/22/89)

In article <2810@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:

>The problem is more generic: it is not just strings, but other resources
>as well (PICT, MENU, DLOG, ICN#, etc) that one might want to tie to an
>XCMD.  I'm not necessarily advocating all that stuff, nor would I be likely
>to use it all in a single XCMD or group thereof.  In the absence of an
>"owned resource" definition, I have to find some way to embed EVERYTHING in
>the XCMD resource itself.

Yes.  Or make up your own definition of owned resources for this
purpose, and write an installer that recognizes it.

(quote of my heretical statement omitted)

>... XCMDs provide a way to EXTEND a powerful user-
>interface engine.  The present mechanism is inadequate; it needs better
>engineering, it is not an excuse for user-interface anarchy.

Oh, a little anarchy helps grease the wheels a bit... :-) 

I would not call Hypercard a powerful user-interface engine.  Rather,
it is a bit of "interesting anarchy" that happens to be distributed to
hundreds of thousands of people for free by a large computer company.
Like the 128K Mac, its uses are limited.  Like the 128K Mac, it is
essentially an enormous beta test (Will users learn Hypertalk?  What
will they do with cards and buttons?  Let's find out!).  And, like the
128K Mac, we will be struggling to get out from under its quirks for
many years to come.  That said, it is undeniably a "seminal" product,
(we like to use that word in the educational community) and I hope
people take a look at the various ideas and run in different
directions with them.

I have long been an advocate of "applicationless environments" where
instead of applications there are just chunks of code that perform
different functions. The granularity would be smaller than an
application but larger than a routine (XCMD).  These chunks, plus
other resources, would be managed by a database in the operating
system.  With a number of object-oriented systems starting to
stabilize on user interface classes, it starts to look possible to
partition out the user interface from the code.  And perhaps when we
get rid of applications we can get rid of files (as user-visible
entities with fixed data formats) too.  Ditto for directories.  (Users
do need to organize their work, but hierarchical directories are not
the best way to do it).

In short, we need more anarchy, as long as it's interesting.  These
string (resource) problems have to be solved, for the moment, but
they're not interesting, and "anarchy" here is just randomness,
neither useful nor, in my opinion, terribly harmful.  I say develop a
nested resource format and a replacement for the resource manager.
Now that would be anarchy!

>=Ned Horvath=

Doug ("overthrow toolbox tyranny") Felt
Courseware Authoring Tools Project
duggie@jessica.stanford.edu

afoster@ogccse.ogc.edu (Allan Foster) (04/24/89)

In article <2810@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:

> ...When you get right down to
> it, XCMDs are basically a hack. Let's not worry too hard about whether
> we're following all the compatibility rules.
>
>=Ned Horvath=

Sorry Ned, but here I have to disagree!

XCMDs in their current implementation are a Bit of a hack, but they
are still very useful and important to extending not only HyperCard
but an increasing number of other applications as well.  

I see this as a good thing since other developers are going to force
Apple to come up with a good way to implement XCMD bundles.  If Apple
doesn't then WE WILL!  Just because HyperCard started this whole thing
does not mean that they OWN the whole Idea!

My suggestion to this problem is to have a new resource type (maybe
XBDL?) that is the description of an XCMD bundle.  This is similar
to the BNDL resource in an application, but can also contain arg descriptors
for reminders to developers.  They can be removed if space is tight,
but at least XCMD users would have an idea of what params to give it!

Anyway nice to hear from you again!

Regards

Allan Foster -- GURU

-- 
Allan Foster      UUCP  : tektronix!ogcvax!afoster
CSNet : afoster@cse.ogc.edu      GEnie  : A.FOSTER
AppleLink : UG0035                 MacNet : FOSTER

ech@pegasus.ATT.COM (Edward C Horvath) (04/25/89)

In article <2810@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:

> ...When you get right down to
> it, XCMDs are basically a hack. Let's not worry too hard about whether
> we're following all the compatibility rules.
>
>=Ned Horvath=

From article <2442@ogccse.ogc.edu>, by afoster@ogccse.ogc.edu (Allan Foster):
> Sorry Ned, but here I have to disagree!
> 
> XCMDs in their current implementation are a Bit of a hack, but they
> are still very useful and important to extending not only HyperCard
> but an increasing number of other applications as well.  

AAAAARRRRGGGHHHH!!!  I have been DEFAMED!!! and by a FRIEND (or so I
thought) to boot!

Allan, I quoted some poor slob (who has been flamed enough) -- I CONDEMNED
this position, I didn't endorse it!

I assume this was an accident.  Else, I hope Maggie gives birth to an IBM
PC lover!  So there!

Some of the above is :-)  Love ya, Allan...

=Ned=

earleh@eleazar.dartmouth.edu (Earle R. Horton) (04/25/89)

     I have a nifty way to solve this problem once and for all, but it
requires the cooperation of the host application developer.  This
won't help you at all if you want to write 'XCMD' resources for the
present version of HyperCard, but may help if you are writing a host
application for which you are designing the code resource interface.

Application writer:

     Arrange to have in your A5-relative global space an unused area
of storage, located at the top of the global area and just below
register A5.  You can do this with the MPW and Aztec linkers by having
an array of whatever size which is (a) declared in your last-mentioned
object file on the link line and (b) referenced but not modified in
your source code.  While linking your program, arrange to have a link
map produced, and verify that the unused area of storage is, indeed,
at the top of the global data area.  

     Include in the specifications for your code resource type the
size of the free area below A5.  Custom code resources produced for
use with your application will use the area below A5 as their own
private global area, and will reference it normally with negative
short offsets off of A5.  These may initialize their global data in
the "normal" manner, i.e. the same method used for application code,
when you call them.  Make it extremely clear that anyone who uses more
A5-relative data space than this will crash your application.  The
idea here is that the APPLICATION WRITER, when he designs the program
at the start, specifically allows for an area of memory which can be
used by custom code resources to implement A5-relative global data.
(Your application uses the area below this for its own globals.)

     The code resource globals are even available from within Dialog
FilterProcs!  (Yes, I have this working in a program, and I am using
global variables, A5-relative string constants, A5-relative floating
point constants, and all kinds of stuff like that from within my
custom code resources.)
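
     The layout can be modelled in portable C, as a simulation only:
an ordinary array stands in for the application's global area and a
pointer just past its end stands in for A5.  The sizes and all the names
below are invented for the illustration; a real host would publish its
own figure in the code resource specification.

```c
#include <assert.h>
#include <stddef.h>

#define APP_GLOBALS_BYTES 64   /* hypothetical size of the host's own globals */
#define RESERVED_BYTES    32   /* hypothetical size published for code resources */

/* Stand-in for the A5-relative global area.  The host's globals occupy
 * the low part; the reserved area sits at the top, just below "A5". */
static unsigned char global_area[APP_GLOBALS_BYTES + RESERVED_BYTES];

/* Simulated A5: points just past the end of the global area. */
static unsigned char *fake_a5(void)
{
    return global_area + sizeof global_area;
}

/* Address of a code-resource global at a negative offset from "A5".
 * Offsets outside -RESERVED_BYTES..-1 would land in the host's globals. */
static unsigned char *resource_global(long offset)
{
    assert(offset < 0 && offset >= -(long)RESERVED_BYTES);
    return fake_a5() + offset;
}

/* Check a code resource's data requirement against the published size. */
static int fits_reserved_area(size_t needed)
{
    return needed <= RESERVED_BYTES;
}
```

Here the assert plays the part of the warning above: a code resource
that reaches past the reserved size is stepping on the application's own
globals.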


Earle R. Horton

Graduate Student.  Programmer.  God to my cats.

afoster@ogccse.ogc.edu (Allan Foster) (04/28/89)

In article <2823@pegasus.ATT.COM> ech@pegasus.ATT.COM (Edward C Horvath) writes:
>AAAAARRRRGGGHHHH!!!  I have been DEFAMED!!! and by a FRIEND (or so I
>thought) to boot!
>
>Allan, I quoted some poor slob (who has been flamed enough) -- I CONDEMNED
>this position, I didn't endorse it!
>
>I assume this was an accident.  Else, I hope Maggie gives birth to an IBM
>PC lover!  So there!
>
>Some of the above is :-)  Love ya, Allan...
>
>=Ned=
AAARRRGGGHHHH  No Ned P L E A S E Not a PC LOVER!!!!!!!!!!!!!!

Sorry I thought you were on the other side,  my mistake!!
Glad to see you sticking up for poor old XCMDs, and the 
HIG Guidelines....

Sorry again, please take back the CURSE about the PC thing!

( I am hoping the first words it says are in 68000 assembly!)

Regards

Allan Foster - GURU




-- 
Allan Foster      UUCP  : tektronix!ogcvax!afoster
CSNet : afoster@cse.ogc.edu      GEnie  : A.FOSTER
AppleLink : UG0035                 MacNet : FOSTER