inst182@tuvie (Inst.f.Techn.Informatik) (07/17/90)
In article <628@philica.ica.philips.nl> adrie@beitel.ica.philips.nl (Adrie Koolen) writes:
>	The best solution to this problem would be to re-implement
>the FS as a 'stated' FS which returns to the main FS loop to send or
>receive messages. This approach requires quite complicated changes in the
>FS.

It would change the FS a lot, but I don't think it would be very
complicated. FS would look like this:

	message_list waiting_requests, requests_being_serviced;
	message messagein, messageout, messageaux;

	for (;;) {
		receive (ANY, messagein);
		if (from_hardware (messagein)) {
			/* must be answer to previous request */
			messageaux = find_by_device (requests_being_serviced,
						     messagein.from);
			messageout = hardware2userprocess (messagein);
			send (messageaux.from, messageout);
			dequeue (requests_being_serviced, messageaux);
			/* device is free now, so send next request */
			if ((messageaux = find_by_device (waiting_requests,
					messagein.from)) != NOTFOUND) {
				messageout = userprocess2hardware (messageaux);
				send (messagein.from, messageout);
				dequeue (waiting_requests, messageaux);
				queue (requests_being_serviced, messageaux);
			}
		} else {
			if ((messageaux = find_by_device (requests_being_serviced,
					messagein.to)) != NOTFOUND) {
				/* device busy */
				queue (waiting_requests, messagein);
			} else {
				messageout = userprocess2hardware (messagein);
				send (messageout.to, messageout);
				queue (requests_being_serviced, messagein);
			}
		}
	}

>FS. An easier method would be to create e.g. 3 FS threads. A FS, which
>waits in the main loop for a message, is said to be free. User processes
>send messages to the FS, not knowing that there are 3 instances of the FS.
>A free FS is selected, which handles the request. When it sends a message
>to a task, the task replies to EXACTLY that FS thread. A FS thread cannot
>be interrupted by another FS thread, only by tasks. This feature can be
>used to guarantee the consistency of the FS tables. This way, several
>processes can use the FS to access different tasks concurrently.
Yes that's possible, but I think one stated FS with queues is more elegant
than a fixed number of FS threads. And you could schedule the requests to
the same device (via an elevator algorithm, or something similar) more
easily.

>Adrie Koolen (adrie@ica.philips.nl),
>Philips Innovation Centre Aachen

Peter J. Holzer (hp@vmars.tuwien.ac.at)

DISCLAIMER: It has been a while since I looked at the Minix sources, so
the above code segment is only a rough sketch, not something you could
use. But when I manage to get 1.5 I will look at the problem more closely.
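[Editor's note: the elevator scheduling suggested above could be bolted onto the stated FS's waiting_requests queue. The sketch below is illustrative only; the fixed-size queue and all names (pending, npending, next_request) are invented, not Minix code.]

```c
/* Hypothetical pending-request queue for one device: block numbers
 * waiting for service, in arrival order.
 */
#define MAXREQ 16
static int pending[MAXREQ];
static int npending = 0;

/* Elevator pick: the closest block at or beyond the current head
 * position; if nothing lies ahead, jump back to the lowest waiting
 * block and start a new sweep.  Returns an index into pending[],
 * or -1 if the queue is empty.
 */
int next_request(int head_pos)
{
    int i, best = -1;

    for (i = 0; i < npending; i++)
        if (pending[i] >= head_pos &&
            (best == -1 || pending[i] < pending[best]))
            best = i;
    if (best == -1)                     /* nothing ahead of the head */
        for (i = 0; i < npending; i++)
            if (best == -1 || pending[i] < pending[best])
                best = i;
    return best;
}
```

Each time a device becomes free, the FS would service pending[next_request(head)], remember the new head position, and remove the entry; variants (C-SCAN and friends) differ only in the fallback when nothing lies ahead.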
adrie@philica.ica.philips.nl (Adrie Koolen) (07/18/90)
In article <1685@tuvie> inst182@tuvie.UUCP (Inst.f.Techn.Informatik) writes:
>In article <628@philica.ica.philips.nl> adrie@beitel.ica.philips.nl (Adrie Koolen) writes:
>>	The best solution to this problem would be to re-implement
>>the FS as a 'stated' FS which returns to the main FS loop to send or
>>receive messages. This approach requires quite complicated changes in the
>>FS.
>
>It would change the FS a lot, but I don't think it would be very complicated.
>...
>Peter J. Holzer (hp@vmars.tuwien.ac.at)

I agree with you that the stated FS is more elegant than a threaded FS.
But both solutions have to solve two main problems. The first is that many
messages, sent from the FS to a task, are hidden in a deeply nested
function. For instance, when a process issues an ioctl(), it arrives at
do_ioctl() in fs/device.c, where a message is prepared, sent to the task
via rw_dev(), and the results are stored in the reply message for the user
process. After the message has been sent, the FS has to return to the main
loop, and when the task replies, the FS has to finish the request. I think
it's best that every time a message is sent to a task, the address of a
function to call when the reply message is received is stored in some
table. That way, the FS can properly handle reply messages received from
tasks.

The second problem is that the FS also uses the tasks for its own
purposes. When reading a file, the FS has to read an i-node and possibly
some indirect blocks to access the data in the file. Each time the FS
calls a task, you must decide where to continue after the reply is
received. And believe me, this requires a lot of rewriting.

When Minix was young, the FS contained quite a lot of small bugs. They
didn't occur often, so they were very hard to find. When you rewrite the
FS, be it into a stated or a threaded FS, you will again create (mostly
synchronisation) bugs that are hard to find.
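[Editor's note: the continuation-table idea described above might look roughly like this. Every name here (message layout, save_continuation, handle_reply, finish_ioctl) is invented for illustration; this is a sketch, not actual Minix FS code.]

```c
#include <stddef.h>

/* Simplified stand-in for a Minix message. */
typedef struct { int m_type; int rep_status; } message;

/* One entry per outstanding request: when the task with number
 * task_nr replies, call cont() with the reply message.
 */
struct continuation {
    int task_nr;
    void (*cont)(message *reply);
};

#define NR_CONTS 8
static struct continuation cont_table[NR_CONTS];

/* Record a continuation at the moment a request is sent to a task. */
int save_continuation(int task_nr, void (*cont)(message *))
{
    int i;
    for (i = 0; i < NR_CONTS; i++)
        if (cont_table[i].cont == NULL) {
            cont_table[i].task_nr = task_nr;
            cont_table[i].cont = cont;
            return 0;
        }
    return -1;                          /* table full */
}

/* In the main loop: a reply arrived from task_nr; finish the request. */
void handle_reply(int task_nr, message *reply)
{
    int i;
    for (i = 0; i < NR_CONTS; i++)
        if (cont_table[i].cont != NULL && cont_table[i].task_nr == task_nr) {
            void (*f)(message *) = cont_table[i].cont;
            cont_table[i].cont = NULL;  /* free the slot before calling */
            f(reply);
            return;
        }
}

/* Example continuation: finish an ioctl when the device task replies. */
static int ioctl_status;
static void finish_ioctl(message *reply) { ioctl_status = reply->rep_status; }
```

do_ioctl() would then call save_continuation(task_nr, finish_ioctl) instead of blocking in rw_dev(), and return to the main loop.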
I say we should go for it, but I think that a restructuring of the FS will
be quite a big job and Andy will not like this (but he will have to,
because Minix will continue to grow and with faster hardware, the
deficiencies of the FS will become more and more apparent).

Adrie Koolen (adrie@ica.philips.nl)
Philips Innovation Centre Aachen
steve@cs.su.oz (Stephen Russell) (07/19/90)
In article <628@philica.ica.philips.nl> adrie@beitel.ica.philips.nl (Adrie Koolen) writes:
>	The best solution to this problem would be to re-implement
>the FS as a 'stated' FS which returns to the main FS loop to send or
>receive messages. [...] An easier method would be to create e.g. 3 FS threads.

In article <1685@tuvie> inst182@tuvie.UUCP (Inst.f.Techn.Informatik) replied:
>Yes that's possible, but I think one stated FS with queues is more elegant
>than a fixed number of FS threads.

The main loop of the stated FS is elegant, but the rest of the code is
rather messy. The problem is that a single user transaction may result in
multiple transactions with the device task, and these transactions often
occur in `inconvenient' places.

For example, consider scanning a directory for a matching file name (the
Unix namei() function). This contains a loop that reads blocks. Somehow we
must break out of this loop to get back to the main state machine, and
return to the search loop with the context (the dir offset, the search
string, etc.) restored after the IO is completed. I'm sure the Minix FS
contains many other examples of deeply nested IO.

Making this work with a single-threaded FS is complicated. Basically, the
_whole_ FS must be rewritten to `flatten' it out, with the context of each
transaction kept in (static) structures. This is a major rewrite. The
multi-threaded approach keeps almost all of the existing code, and lets
the OS look after keeping track of the context of each transaction. The
cost is extra stack space for each thread.

In article <628@philica.ica.philips.nl> adrie@beitel.ica.philips.nl (Adrie Koolen) also wrote:
>A FS thread cannot
>be interrupted by another FS thread, only by tasks. This feature can be
>used to guarantee the consistency of the FS tables.

I'm not sure this is always true, though. It is possible that while one
thread is blocked, the FS state may be changed by another thread in a way
that causes problems.
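[Editor's note: to make the cost of `flattening' concrete, here is roughly what a resumable directory search might look like once the namei() loop is broken at each block read. The directory layout and all names (namei_ctx, namei_resume) are invented for illustration, not a proposal for actual code.]

```c
#include <string.h>

#define NAME_MAX 14
#define ENTRIES_PER_BLOCK 4

struct dir_entry { int d_ino; char d_name[NAME_MAX + 1]; };

enum namei_state { NAMEI_SEARCHING, NAMEI_FOUND, NAMEI_NOTFOUND };

/* The context that must be saved in a (static) structure between
 * block reads: the search string, the progress through the
 * directory, and where to put the answer.
 */
struct namei_ctx {
    enum namei_state state;
    char component[NAME_MAX + 1];
    long dir_offset;
    int  result_inode;
};

/* Called from the main loop each time a directory block arrives from
 * the disk task; resumes the search where it left off.  `last' is
 * nonzero for the final block of the directory.
 */
void namei_resume(struct namei_ctx *c, struct dir_entry *block, int last)
{
    int i;
    for (i = 0; i < ENTRIES_PER_BLOCK; i++, c->dir_offset++)
        if (block[i].d_ino != 0 &&
            strcmp(block[i].d_name, c->component) == 0) {
            c->state = NAMEI_FOUND;
            c->result_inode = block[i].d_ino;
            return;
        }
    if (last)
        c->state = NAMEI_NOTFOUND;      /* directory exhausted */
    /* else remain NAMEI_SEARCHING: main loop requests the next block */
}
```

Every other IO-performing loop in the FS would need a similar state structure and resume function, which is why this is a major rewrite.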
A new question: can the Minix IO tasks support a multi-threaded FS? The
major requirement is the ability to accept all pending request messages,
pick the `best' one to service, and reply to that request when it is
finished. If not, the whole issue is moot.

Cheers
brucee@runxtsa.runx.oz.au (Bruce Evans) (07/21/90)
In article <633@philica.ica.philips.nl> adrie@beitel.ica.philips.nl (Adrie Koolen) writes:
>I agree with you that the stated FS is more elegant than a threaded FS.
>But both solutions have to solve two main problems. The first is that many
>messages, sent from the FS to a task, are hidden in a deeply nested function.

I prefer threads. They neatly solve this first problem. The nested context
is automatically preserved with a separate stack for each thread, at least
after many global variables are made local or put in the fproc table. (The
defines in fs/param.h hide an alarming number of globals.)

Switching *from* a thread can easily be handled in rw_dev (note all
physical i/o goes through there):

	while (waiting_for_reply_from(task_nr)
	       || (r = send(task_nr, mess_ptr)) == E_LOCKED)
		/* Don't send if FS is sure to block. Try later if the send
		 * failed due to deadlock prevention.
		 */
		unready_current_thread_and_run_another();
	if (r != OK) panic("blah", r);
	unready_current_thread_and_run_another();

This assumes a dispatcher thread that is always ready to run. It just
loops waiting for replies. When one arrives, it readies the thread that
sent the corresponding request and any others in the wait loop with
task_nr == replying_task_nr, and picks one to run.

>The second problem is that the FS also uses the tasks for its own purposes.
>When reading a file, the FS has to read an i-node and possibly some indirect
>blocks to access the data in the file. Each time the FS calls a task, you
>must decide where to continue after the reply is received. And believe me,

Deciding where to continue is automatic with the above approach.

Note that the restriction to at most one thread sending to a driver task
simplifies many things. It is necessary because the device tasks are now
single-threaded like FS. (This is not so harmful as for FS. It penalizes
mainly devices with independent minors, and makes it impractical to put
anything but short-term strategies in the drivers.)
It probably makes competition for resources like disk buffers less of a problem.
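[Editor's note: the dispatcher's job of readying waiters can be modelled with plain data rather than real threads. All names here (waiting_for, ready_waiters, NO_TASK) are invented for illustration.]

```c
/* Each FS thread is either runnable, or blocked waiting for a reply
 * from a particular task.
 */
#define NR_THREADS 4
#define NO_TASK   (-1)

static int waiting_for[NR_THREADS];     /* task_nr, or NO_TASK */

/* Dispatcher step: a reply arrived from replying_task_nr; ready
 * every thread that was blocked on that task.  Returns how many
 * threads were woken.
 */
int ready_waiters(int replying_task_nr)
{
    int t, n = 0;

    for (t = 0; t < NR_THREADS; t++)
        if (waiting_for[t] == replying_task_nr) {
            waiting_for[t] = NO_TASK;   /* runnable again */
            n++;
        }
    return n;
}
```

A woken thread re-enters the while loop in rw_dev above, so readying "too many" waiters is harmless: those that still cannot send simply suspend themselves again.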
graham@sce.carleton.ca (Doug Graham) (07/23/90)
In article <2013@runxtsa.runx.oz.au> brucee@runxtsa.runx.oz.au (Bruce Evans) writes:
>
>Switching *from* a thread can easily be handled in rw_dev (note all physical
>i/o goes through there):
>
>	while (waiting_for_reply_from(task_nr)
>	       || (r = send(task_nr, mess_ptr)) == E_LOCKED)
>		/* Don't send if FS is sure to block. Try later if the send
>		 * failed due to deadlock prevention.
>		 */
>		unready_current_thread_and_run_another();
>	if (r != OK) panic("blah", r);
>	unready_current_thread_and_run_another();
>
>This assumes a dispatcher thread that is always ready to run. It just loops
>waiting for replies. When one arrives, it readies the thread that sent the
>corresponding request and any others in the wait loop with task_nr ==
>replying_task_nr, and picks one to run.

If FS must be left as a task, or pool of tasks, this is probably the best
way to go. But I think you misspelled the name of the call that switches
between threads. Rather than "unready_current_thread_and_run_another()",
it should be "sleep(task_nr)". The call that the dispatcher thread makes
is "wakeup(task_nr)". :-) In any case, it sounds like you're proposing a
synchronization mechanism that goes beyond message passing. Is this the
case, or would you implement these new "primitives" via message passing in
some way?

I don't think it's this simple though. I mentioned in an earlier message
that it's quite possible that two user processes, and thus two FS threads,
may be trying to read the same disk block. In this case, I don't think the
second request should get as far as rw_dev. Higher level software would
have to notice that the disk block was already being read in, and block
the second thread until the disk block became available. Inodes and things
would need to be locked as well. If this is not done, an FS thread that
blocked for one reason or another would not be able to make any
assumptions about the state of the system after returning from the blocked
state.
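[Editor's note: the per-buffer locking described above is essentially what getblk() does in the classic UNIX buffer cache. A minimal sketch, with invented names (lock_buf, unlock_buf, b_wanted); the sleep/wakeup calls are indicated only as comments.]

```c
/* Per-buffer state: a thread that finds the wanted block already
 * locked must sleep until it is released.
 */
struct buf {
    int b_blocknr;
    int b_locked;       /* set while a thread reads or owns the block */
    int b_wanted;       /* someone went to sleep waiting for it */
};

/* Try to claim the buffer; returns 1 on success, 0 if the caller
 * must sleep and retry after being woken.
 */
int lock_buf(struct buf *bp)
{
    if (bp->b_locked) {
        bp->b_wanted = 1;       /* caller would sleep(bp) here */
        return 0;
    }
    bp->b_locked = 1;
    return 1;
}

void unlock_buf(struct buf *bp)
{
    bp->b_locked = 0;
    if (bp->b_wanted) {
        bp->b_wanted = 0;       /* caller would wakeup(bp) here */
    }
}
```

Note the retry: a woken thread must call lock_buf() again rather than assume the block is still there, which is exactly the "no assumptions after blocking" problem.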
It might be possible to write FS like that, but it sure would be
difficult.

----
Doug.
brucee@runxtsa.runx.oz.au (Bruce Evans) (07/31/90)
In article <883@sce.carleton.ca> graham@sce.carleton.ca (Doug Graham) writes:
>"wakeup(task_nr)". :-) In any case, it sounds like you're proposing a
>synchronization mechanism that goes beyond message passing. Is this the case,
>or would you implement these new "primitives" via message passing in some way?

I was hoping to use a minor variant of a method I use in some real-time
code. The only primitive is suspend(). That wakes up the next active
thread on a circular queue. The thread either does some real work or tests
a flag to see if it can proceed.

>I don't think it's this simple though. I mentioned in an earlier message
>that it's quite possible that two user processes, and thus two FS threads,
>may be trying to read the same disk block. In this case, I don't think the
>second request should get as far as rw_dev. Higher

You are right, there would have to be synchronization loops in a lot of
places. They might as well be made into library routines with names like
sleep() or wait() :-). The cache routines might end up looking like the
ones described in Bach's book instead of the current simple ones :-(.

What is there to worry about besides blocks and inodes? Probably
everything that eventually calls rw_dev, i.e. everything.
--
Bruce Evans	Internet: brucee@runxtsa.runx.oz.au
		UUCP: uunet!runxtsa.runx.oz.au!brucee
(My other address (evans@ditsyda.oz.au) no longer works)
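[Editor's note: the suspend()-only discipline can be modelled without real threads. Control simply rotates around a circular queue of active threads; a blocked thread re-tests its flag each time its turn comes around. The names (active, current, suspend) are invented for this toy model.]

```c
#define NR_THREADS 4

/* Which threads are on the circular queue.  At least one thread
 * (e.g. a dispatcher) must always remain active, or suspend()
 * would spin forever.
 */
static int active[NR_THREADS] = {1, 1, 1, 1};
static int current = 0;

/* The only primitive: pass control to the next active thread on the
 * circular queue.  The resumed thread then either does real work or
 * tests a flag to see whether it can proceed.
 */
void suspend(void)
{
    do
        current = (current + 1) % NR_THREADS;
    while (!active[current]);
}
```

A synchronization loop in this scheme is just `while (!flag) suspend();`, which is why such loops might as well be wrapped up as sleep()/wait() library routines.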