ken@gvax.cs.cornell.edu (Ken Birman) (08/14/90)
The following will probably be more common under V2.1, so I am posting this for the newsgroup as a whole...

> From: rfinch@locke.water.ca.gov (Ralph Finch)
> Date: Mon, 13 Aug 1990 20:48:53 PDT
> Subject: Odd time warps
>
> While running a client and server on the same machine, I get this
> message occasionally:
>
> ISIS client pid 794: time warp (40.000 secs)!
>
> Seems odd because it's the same machine. I *think* that every time
> the warp message was getting bigger.

A "time warp" means that the process spent 40 seconds doing some sort of uninterruptible activity (see the ISIS manual section on when a new task can be scheduled). That is, it sat and thought for 40 seconds, or it waited for input from a user who sat and thought for 40 seconds, or it did file IO for 40 seconds, etc.

Specifically, this message means that ISIS had an action to schedule at time <t> but ended up scheduling it at time <t+w secs>. If <w> is large enough, ISIS prints this message.

This can also happen when ISIS does a select with a timeout and wakes up much later than expected. That is, it tells UNIX "block me, but not longer than 3 seconds", but the select wakes up 43 seconds later.

In this case, a strong possibility is that something is leaking memory. For example, in V2.0 protos has a memory leak that loses 44 bytes per 50 cbcasts sent. This makes protos gradually grow until things congest, but in the meantime your UNIX may begin to page heavily, leading to long delays in application software. Perhaps your application has a much faster leak (e.g. forgetting to free a message you create, or malloc-ed memory you are obtaining).

In particular, if your leak causes the total virtual memory in use on your machine to exceed 16Mbytes, SUN OS 4.1 starts to thrash (there is a bug fix from SUN). So, you suddenly see these huge delays both in the big process and in anything else on the same machine. To check for this, run "top" or "vmstat" or "ps l".

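To make the <t> versus <t+w secs> check described above concrete, here is a minimal sketch in Python. It is not the ISIS implementation: the function names, the socket pair, and the 5-second threshold are all assumptions made for illustration (the post only says the message appears when <w> is "large enough"). It times a select call against its own timeout and reports how late the wakeup was.

```python
import select
import socket
import time

# Hypothetical threshold; the real value ISIS uses is not given in the post.
WARP_THRESHOLD_SECS = 5.0


def warp_seconds(scheduled_at, ran_at, threshold=WARP_THRESHOLD_SECS):
    """Return the lateness w in seconds if it reaches the threshold, else None."""
    w = ran_at - scheduled_at
    return w if w >= threshold else None


def wait_with_timeout(sock, timeout_secs, threshold=WARP_THRESHOLD_SECS):
    """Block in select() for at most timeout_secs; report a warp if we overslept."""
    start = time.monotonic()
    select.select([sock], [], [], timeout_secs)
    # If input arrives early, select returns before the deadline, w is
    # negative, and no warp is reported.
    return warp_seconds(start + timeout_secs, time.monotonic(), threshold)


if __name__ == "__main__":
    a, b = socket.socketpair()  # stand-in for whatever the process waits on
    w = wait_with_timeout(a, 0.1)
    if w is not None:
        print(f"time warp ({w:.3f} secs)!")
    a.close()
    b.close()
```

On a healthy machine the sketch prints nothing. On a machine that is paging heavily, the wakeup can come long after the timeout expires, which is the situation the warp message reports.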
To check for leaks of messages or memory managed by ISIS, use the "cmd snap" or "kill -USR2" approach to generate dumps from possibly affected processes and look at the first few lines, which give memory statistics. ISIS rarely has many messages in use and rarely uses more than 100k of memory. If you see 300 messages in use or 875k of memory allocated, you are on the trail of a leak... (Probably in your code -- mine is pretty leak-proof by now, even in V2.0).

Ken