[comp.sys.amiga.programmer] Is OS supposed to zero the TOD clock registers intermittently?

pochron@cat17.cs.wisc.edu (David Pochron) (03/22/91)

In my (apparent) never-ending quest to figure out why my clock keeps going
haywire when I drag windows around, I decided to look at the 8520 registers
$BFD800, $BFD900, and $BFDA00 (CIAA-eventLSB, eventMid, and eventMSB for
those who remember mnemonics)  and see what happens to them when this "blessed"
event occurs.

What I found is that when I start banging madly on the drag bar, depth gadgets, and
resizer gadget before the window finishes refreshing, something (Intuition?
GfxLib? Layers? Timer.device?)  resets these registers to zero.  Why?


Now I did notice that the OS (Kickstart 1.2) resets these registers
whenever they reach a count of 65536.  Funny thing is, the OS designers
used a VBI to check when the registers reach that count and didn't use the
built-in 8520 TOD alarm interrupt.  Any reason for this?

This still doesn't explain why banging on the windows should reset this
count before it reaches the 65536 mark.


It looks as if the OS commands to read the time and date get it directly
from these registers, since you can POKE them and see the clock (CBM v2.22)
respond immediately.  There must be a software structure as well, though,
as I assume that when the VBI occurs and finds the count to have expired, it
updates the software structure and resets these hardware registers to zero.
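
The scheme guessed at above can be modelled on any machine. This is only a
sketch of that guess, not Kickstart code: the struct, the function names and
the 24-bit mask are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; none of these names come from Kickstart. */
#define TOD_ROLLOVER 65536u   /* count at which the reset was observed */

struct soft_clock {
    uint32_t hw_ticks;    /* stands in for the 24-bit event counter */
    uint32_t base_ticks;  /* ticks already folded into the software copy */
};

/* One hardware tick of the event counter. */
void tod_tick(struct soft_clock *c)
{
    assert(c != NULL);
    c->hw_ticks = (c->hw_ticks + 1u) & 0xFFFFFFu;
}

/* What the vertical-blank handler is assumed to do. */
void vbi_check(struct soft_clock *c)
{
    assert(c != NULL);
    if (c->hw_ticks >= TOD_ROLLOVER) {
        c->base_ticks += c->hw_ticks;  /* fold the count into software */
        c->hw_ticks = 0;               /* and restart the hardware count */
    }
}

/* Time as a reader sees it: software base plus live hardware count. */
uint32_t read_ticks(const struct soft_clock *c)
{
    assert(c != NULL);
    return c->base_ticks + c->hw_ticks;
}
```

In this model, a stray write that zeroes hw_ticks makes read_ticks() jump
backwards, and a stray write that sets bit 16 makes the next vbi_check() fold
an extra 65536 ticks into the base.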


The "haywire" effect doesn't happen too often with just Workbench icon windows,
but the Clock program (v.2.22) that comes on the Workbench disk seems to cause
it to happen much more often. (Esp. in analog mode.)  SID 1.06 causes it all
the time, but I tend to discount SID since Enforcer tells me it bangs on just
about every illegal memory location in the system! (Talk about bugs!)
It happens often while dragging the VLT review and CONMan console windows also.


So what does everyone think the problem is?

1) Programs that are stomping on the registers by accident?
   (Except that it has happened to me while using a right-out-of-the box
   KS 1.3 update disk, with no other programs running except Workbench.)

2) A real OS bug - probably a race condition when reading and updating the
   hardware/software structure clock values?

   Or perhaps gfx or layers is accidentally banging on the eventMSB register,
   causing the VBI to update the software clock structure at the wrong time.
   (POKEing eventMSB manually creates a VERY similar "haywire" effect!)

3) A hardware problem - A2630 timing problem, faulty 8520?
   I don't think it is a hardware problem though, since the 8520 seems to work
   fine, and the error is easily reproducible on my system.  Don't want to
   think about a problem with the A2630...


I am thinking of writing a "protector" for those locations, as it is very
easy for any program to accidentally write to them, and mess up the TOD clock.
If someone from Commodore responds to this message and says that window drags,
etc, etc. should NEVER reset these registers, then I can go ahead and write
a VBI to keep spare values for these registers and fix them if necessary.
If not, then patching the problem becomes much more complex...
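
A rough sketch of the "protector" idea, written as plain C so it can be tried
off the Amiga. Everything here is hypothetical: the struct, the names and the
tolerance are invented, and a real version would run from a VBI server and
read and write the 8520 registers instead of taking a value as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard state; all names are invented for illustration. */
struct tod_guard {
    uint32_t shadow;    /* last counter value believed to be good */
    uint32_t max_step;  /* largest advance accepted between two checks */
    unsigned repairs;   /* how many times a bad value was replaced */
};

/* Call once per vertical blank with the value just read from the counter.
 * Returns the value the counter should hold: the observed one if it is a
 * plausible advance, otherwise the shadow copy moved forward by one. */
uint32_t tod_guard_check(struct tod_guard *g, uint32_t observed)
{
    assert(g != NULL);
    uint32_t step = (observed - g->shadow) & 0xFFFFFFu;  /* 24-bit wrap */
    if (step <= g->max_step) {
        g->shadow = observed;
    } else {
        g->shadow = (g->shadow + 1u) & 0xFFFFFFu;
        g->repairs++;
    }
    return g->shadow;
}
```

As written, this would also "repair" the reset at 65536 described earlier, so
a real guard would need a special case for it.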

This TOD clock thing is really messing up "make" and other time-dependent
utilities!

Thanks in advance...

-- 

       -- David M. Pochron   |
                             | Canada: One of the world's greatest mysteries..
pochron@garfield.cs.wisc.edu |

ccplumb@rose.uwaterloo.ca (Colin Plumb) (03/22/91)

pochron@cat17.cs.wisc.edu (David Pochron) wrote:
>In my (apparent) never-ending quest to figure out why my clock keeps going
>haywire when I drag windows around, I decided to look at the 8520 registers
>$BFD800, $BFD900, and $BFDA00 (CIAA-eventLSB, eventMid, and eventMSB for
>those who remember mnemonics)  and see what happens to them when this "blessed"
>event occurs.

Well, my 1.3 Hardware manual says that CIA A is BFEx01 and CIA B is
BFDx00, so you're looking at the CIA B TOD register, which is used
by the graphics.library to synchronise events to the video beam,
i.e. QBSBlit().
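
To make the two address patterns concrete, here is a small sketch that derives
the event-counter addresses for both chips. The register numbers 8, 9 and A
for the three counter bytes come from the 8520 register map rather than from
this thread, and the helper name is made up.

```c
#include <assert.h>
#include <stdint.h>

#define CIAA_BASE 0xBFE001u   /* CIA A: registers at $BFEx01 */
#define CIAB_BASE 0xBFD000u   /* CIA B: registers at $BFDx00 */

/* 8520 register numbers of the three event-counter (TOD) bytes. */
enum { TOD_LSB = 0x8, TOD_MID = 0x9, TOD_MSB = 0xA };

/* Address of register `reg` (0..15) of the CIA based at `base`;
 * consecutive registers are $100 bytes apart. */
uint32_t cia_reg_addr(uint32_t base, unsigned reg)
{
    assert(reg < 16);
    return base + ((uint32_t)reg << 8);
}
```

With these definitions, cia_reg_addr(CIAB_BASE, TOD_LSB) gives $BFD800 and
cia_reg_addr(CIAA_BASE, TOD_LSB) gives $BFE801.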

I can see why heavy graphics activity could use this counter heavily.

>This TOD clock thing is really messing up "make" and other time-dependent
>utilities!

If you're getting erratic time from the timer.device, then, yes, there's
a bug somewhere, but if you're doing something weird with the CIA directly,
then the A/B confusion might be causing some problems...

>                             | Canada: One of the world's greatest mysteries..

If you're going to use a literary metaphor, then I guess the U.S. is
a slasher flick...
-- 
	-Colin

pochron@rt5.cs.wisc.edu (David Pochron) (03/23/91)

In my message 1572, I didn't give the correct address values for the TOD
clock registers...I guess that is what happens when you try to commit these
things to memory!  :-)

I meant to say registers at $BFE801, $BFE901, $BFEA01, the 60-hz event timer
registers, and not the Hbeam sync registers.

In any case, I have been using the above registers, but I just didn't
post the addresses here correctly.


I went ahead last night and wrote a program called "ClockGuard" and it catches
many of the times that something bangs a random value into the TOD clock
registers, tries to fix it, and flashes the screen yellow for a second.
(You wouldn't believe the number of programs that trash the clock registers!)

Problem is, even though it is detecting all the "bangs", it doesn't always
correct the problem.  This weekend I will post the source, and maybe some kind
soul can look it over.



mykes@sega0.SF-Bay.ORG (Mike Schwartz) (03/24/91)

In article <1991Mar22.011550.23658@watdragon.waterloo.edu> ccplumb@rose.uwaterloo.ca (Colin Plumb) writes:
>pochron@cat17.cs.wisc.edu (David Pochron) wrote:
>>In my (apparent) never-ending quest to figure out why my clock keeps going
>>haywire when I drag windows around, I decided to look at the 8520 registers
>>$BFD800, $BFD900, and $BFDA00 (CIAA-eventLSB, eventMid, and eventMSB for
>>those who remember mnemonics)  and see what happens to them when this "blessed"
>>event occurs.
>
>Well, my 1.3 Hardware manual says that CIA A is BFEx01 and CIA B is
>BFDx00, so you're looking at the CIA B TOD register, which is used
>by the graphics.library to synchronise events to the video beam,
>i.e. QBSBlit().
>
>I can see why heavy graphics activity could use this counter heavily.
>

I wonder why they don't just use Copper interrupts at the desired scan
line?  This would be exactly in sync with the beam and would free up the
CIA timers for programs to use.
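
For illustration, a Copper list that raises the Copper interrupt at a chosen
scan line could be built like this. It is only a sketch: the function name is
made up, lines above 255 are not handled, and on a running system the display
Copper list belongs to the OS, so this shows the instruction words only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COP_INTREQ 0x009Cu  /* custom-chip register offset of INTREQ */
#define INT_SETCLR 0x8000u  /* set (rather than clear) the bits given */
#define INT_COPER  0x0010u  /* Copper interrupt request bit */

/* Writes a three-instruction Copper list into out[0..5]:
 * WAIT for scan line `line`, MOVE to INTREQ, end of list.
 * Returns the number of 16-bit words written. */
size_t build_copper_irq(uint16_t out[6], unsigned line)
{
    assert(line < 256);  /* later lines need an extra WAIT; not handled */
    out[0] = (uint16_t)((line << 8) | 0x01u);  /* WAIT: VP=line, HP=0 */
    out[1] = 0xFFFEu;                          /* compare all position bits */
    out[2] = COP_INTREQ;                       /* MOVE: destination register */
    out[3] = INT_SETCLR | INT_COPER;           /* MOVE: raise the Copper IRQ */
    out[4] = 0xFFFFu;                          /* WAIT for a position that */
    out[5] = 0xFFFEu;                          /* never comes: end of list */
    return 6;
}
```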


--
mykes

*******************************************************
* Assembler Language separates the men from the boys. *
*******************************************************

jesup@cbmvax.commodore.com (Randell Jesup) (03/24/91)

In article <1991Mar21.175806.23729@daffy.cs.wisc.edu> pochron@cat17.cs.wisc.edu (David Pochron) writes:
>
>In my (apparent) never-ending quest to figure out why my clock keeps going
>haywire when I drag windows around, I decided to look at the 8520 registers

>3) A hardware problem - A2630 timing problem, faulty 8520?
>   I don't think it is a hardware problem though, since the 8520 seems to work
>   fine, and the error is easily reproducible on my system.  Don't want to
>   think about a problem with the A2630...

	In the past, most cases I've heard of like this (things go haywire
when dragging things) have been caused by blown CIA chips.  I suspect that
if you were to swap them you'd see different behavior.

	It sounds very much like a CIA hardware bug to me (but remember, I'm
a software guy...).

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)

pochron@cat52.cs.wisc.edu (David Pochron) (03/25/91)

In article <20076@cbmvax.commodore.com> jesup@cbmvax.commodore.com (Randell Jesup) writes:
>	In the past, most cases I've heard of like this (things go haywire
>when dragging things) have been caused by blown CIA chips.  I suspect that
>if you were to swap them you'd see different behavior.
>	It sounds very much like a CIA hardware bug to me (but remember, I'm
>a software guy...).

Well, I have found the source of the problem - and it is indeed hardware-
related.  Unfortunately, it is not as simple as replacing the 8520's - I took
my machine apart and swapped them this weekend, and it made no difference.

I decided to take the machine apart because I did some more tests and
wrote a little "diagnostic" program, and it was pretty obvious the
OS was off the hook.  (And any other software, for that matter.)

Here's what I did:  (I feel like an expert system!)

1) On a whim, I decided to try the old "Boxes" program that came with the
   1.2 KS release, and lo and behold, the Clock went crazy as soon as it
   started running!  Either there was a bug in "Boxes", or a bug in RectFill,
   or a Blitter problem.

2) To see if "Boxes" had the problem, I wrote a simple program in "C" that
   simply opened a window and forever called RectFill() with the x1,y1,x2,y2
   coords. in fixed positions.  Same thing happened - clock went crazy!
   "Boxes" was off the hook.

3) To see if it was the RectFill() call, I wrote my diagnostic program - in
   assembly, with direct access to the Blitter.  I opened a screen and
   set the blitter to do rectangular fills (source A data set to $ffff, with
   it and the other sources (B,C) turned off, and D pointed at my screen
   bitplanes).
       Oh, yes, and everything was done very legally - Own/Disown blitter
   calls were made and WaitBlit calls as well, and my own wait blit function
   which just loops around waiting for the blitter activity to finish.
       At first, the clock was fine...But, I thought, "This is not really
   a good test - the OS uses one of the other sources to combine rectangular
   fills with the existing screen data."  So I set up source B to read from
   some screen data I had on another screen at address 85000 (decimal).
   POW!  The clock started going crazy.

RectFill() was off the hook.
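
For readers who want to reproduce test 3, the two blitter setups differ in the
BLTCON0 control word. The sketch below builds that word. The bit positions are
the documented ones, but the minterms are assumptions, since the post does not
say which logic function was used.

```c
#include <assert.h>
#include <stdint.h>

/* BLTCON0 channel-enable bits and two example minterm (LF) bytes. */
#define BLT_USEA   0x0800u
#define BLT_USEB   0x0400u
#define BLT_USEC   0x0200u
#define BLT_USED   0x0100u
#define LF_A       0xF0u   /* D = A (assumed for the first test) */
#define LF_A_AND_B 0xC0u   /* D = A AND B (assumed for the second test) */

/* Combine channel enables and a minterm byte into a BLTCON0 word,
 * leaving the shift field at zero. */
uint16_t bltcon0(unsigned channels, unsigned minterm)
{
    assert((channels & ~0x0F00u) == 0 && minterm <= 0xFFu);
    return (uint16_t)(channels | minterm);
}
```

The first setup (A fed from its preloaded data register, only D fetching)
would be bltcon0(BLT_USED, LF_A), and the second, with B reading memory, would
be bltcon0(BLT_USEB | BLT_USED, LF_A_AND_B).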


And what is stranger is this:

a) The more bits that are set in the SrcA blitter data register, the more
   likely memory locations in the CIA chips are to get trashed.
   I.e., a "%1010101010101010" would be fine, but a "%1111101111111111" would
   make it go haywire.  The combination of bits does not matter; as long
   as 13-15 bits are set, it will go on a banging spree.

b) When setting the source B read address ptr, only certain address ranges
   would cause problems.  I could read from locations like 2048 with no
   problems, but ranges like 65000-85000 banged on the CIA registers.

c) How the heck is it possible for the blitter to be banging on memory way
   up at $BFE001 anyway?  It should be impossible, yet it happens!

d) Again, none of this happens when the A2630 card is disabled.
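
To put numbers on item (a), a small bit counter shows where the two quoted
patterns fall relative to the 13-bit mark. The helper name is made up for this
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 1 bits in a 16-bit word. */
unsigned bits_set16(uint16_t w)
{
    unsigned n = 0;
    while (w) {
        w &= (uint16_t)(w - 1u);  /* clear the lowest set bit */
        n++;
    }
    assert(n <= 16);
    return n;
}
```

%1010101010101010 ($AAAA) has 8 bits set, while %1111101111111111 ($FBFF) has
15, and the $ffff used in the original test has all 16.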

Which leads me to believe the problem is:

1) The Agnus chip (original, 512K fat lady) is flaky.
2) The A2630 68030 card is flaky.  All of this started when I got the card
   last year.  I wish this were not the case, as it could be very expensive
   to track down the exact problem and get it fixed.  I'd much rather just
   replace an Agnus chip (with a nice 1meg one, too!)

My motherboard is still only rev. 4.2, but I don't know if it and the A2630
card are having a problem.

One thing I did notice though, is that the resistor between and just above the
two 8520's is burnt brown and bubbly.  It looks like a 47-something ohm
resistor.  (I can't remember the exact R-number on the PC board.)
I will probably replace it today.

I also tried reseating the Agnus chip - didn't make any difference.



darren@cbmvax.commodore.com (Darren Greenwald) (03/25/91)

I am somewhat unclear regarding this -

1.) You referred to "your" clock program - is this a problem with all clock
programs on your machine, one that you've written, or one specific one that
you like to use?

2.) Does the clock program look at the CIA hardware registers directly
and assume that a specific set of registers is == to TOD?

3.) BFDxxx is on CIAB, yet you referred to CIAA?

If #2, then the clock program is fundamentally flawed ... you can't look
at any hardware resource, and make assumptions about how it is being
used if your program isn't the owner of that resource.

If other clock programs work fine (e.g., the one that comes with the
system software), then it's pretty clear that the system isn't completely
haywire, just the one piece of software.

If all clock programs exhibit the same bug, then I'd guess it to
be a hardware problem.


--------------------------------------------------------------
Darren M. Greenwald | Commodore-Amiga Software Engineering   
                    | USENET: uunet!cbmvax!darren                       
--------------------------------------------------------------
Quote: "It would be impossible to discuss the subject without
        a common frame of reference." - Spock

daveh@cbmvax.commodore.com (Dave Haynie) (03/26/91)

In article <1991Mar24.204244.6123@daffy.cs.wisc.edu> pochron@cat52.cs.wisc.edu (David Pochron) writes:

>And what is stranger is this:

>c) How the heck is it possible for the blitter to be banging on memory way
>   up at $BFE001 anyway?  It should be impossible, yet it happens!

It is totally impossible.  The CPU can point over to the chip bus, but the
chip bus can't point toward the CPU bus, the lines just plain don't point
that way.  I'm not disputing your claim that it looks like this, just that
that isn't what's really happening.

>d) Again, none of this happens when the A2630 card is disabled.

Do you have a way to try another A2630 card?

>Which leads me to believe the problem is:

>1) The Agnus chip (original, 512K fat lady) is flaky.

Possible, but not likely.  There isn't a whole lot Agnus could do to mess
with the CIA chips, and still work at all.  You'd basically have to have
inputs magically turned around as outputs for that to happen.  More likely
a problem with the Gary chip, if anything.  I doubt either is at fault.

>2) The A2630 68030 card is flaky.  All of this started when I got the card
>   last year.  I wish this were not the case, as it could be very expensive
>   to track down the exact problem and get it fixed.  I'd much rather just
>   replace an Agnus chip (with a nice 1meg one, too!)

Well, the A2630 card is a better target than Agnus.  If it were, for example, 
to get some of the CIA interface signals confused, you might have this kind of
problem.  I have only known of one A2630 failure personally, and that was one 
of the TTL parts responsible for preconfiguration setup.  Most of the A2630 CIA 
interface is contained in a pair of PAL chips.  Of course, the A2630 does 
depend on lots of signals from the motherboard, so any real good motherboard 
mess up would confuse the A2630's operation.  While the A2630's CIA timing 
might differ ever so slightly from a real 68000's, they're both correct by
the 68000 spec.  If you partially zapped both 8520s, you might just be in
that range.  8520s do die in interesting ways.

>My motherboard is still only rev. 4.2, but I don't know if it and the A2630
>card are having a problem.

Shouldn't be any problem.  Most of the boards around here, including the
system I'm typing this on (A2000 w/A2630) were around that vintage when I was
developing the A2630.

>One thing I did notice though, is that the resistor between and just above the
>two 8520's is burnt brown and bubbly.  It looks like a 47-something ohm
>resistor.  (I can't remember the exact R-number on the PC board.)
>I will probably replace it today.

Sounds like R310, which is a 47 ohm 1/2 watt resistor.  That's the one between
the "+5 USER" supply and the +5 on the parallel port.  How the heck did you
zap that one?  Even at a dead short to ground it shouldn't burn up like that.
Once it's replaced, check to see that you still have +5 on pin 14 of the 
parallel port.  If that's gone, you could be having all kinds of problems (I
doubt it is, or you'd lose your mouse too).  Whatever did that might have had
an adverse effect on the rest of the system, especially the CIA which drives
the parallel port.  Which is, incidentally, the one with the TOD clock in it.
Well, actually I guess both do talk to the parallel port -- U300 drives the
parallel line and receives the STROBE* handshake, while U301 takes care of the
Paper out, busy, select, and acknowledge lines.  
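
As a rough check on the dead-short remark, Ohm's law gives the worst case for
a 47 ohm resistor with the full 5 V across it. This is a back-of-envelope
sketch with made-up function names, and it assumes nothing but the 5 V supply
is driving the resistor.

```c
#include <assert.h>

/* Current in amperes through `ohms` with `volts` across it. */
double resistor_current(double volts, double ohms)
{
    assert(ohms > 0.0);
    return volts / ohms;
}

/* Power in watts dissipated by the same resistor. */
double resistor_power(double volts, double ohms)
{
    assert(ohms > 0.0);
    return volts * volts / ohms;
}
```

resistor_current(5.0, 47.0) comes to about 0.11 A and resistor_power(5.0, 47.0)
to about 0.53 W, which is close to the half-watt rating rather than far beyond
it.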

>I also tried reseating the Agnus chip - didn't make any difference.

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
      "That's me in the corner, that's me in the spotlight" -R.E.M.