Leisner.Henr@xerox.com (Marty) (01/21/88)
What follows is a summary of what I've been doing with my Minix system for the last 6 months on a PC-AT. I welcome comments and opinions. I feel a number of the things I changed make Minix a more robust system. I'm not sure how the following fits into the official Minix strategy. It kinda assumes you're reasonably familiar with the 286 architecture and know:

    TSS = task state segment
    LDT = local descriptor table
    GDT = global descriptor table, etc.

Enjoy....

Here is a summary of the steps I took to get a protected mode Minix system up and running. I made a number of changes to the system in real mode in order to prepare for a protected mode system. This is an outline of what I did. The scope of what I did grew into a rewrite of a number of sections. Whenever possible, I moved processor dependent code into separate files.

I haven't done much with it in the last 6 weeks. I have the system running on a separate AT in my office and use it regularly to unshar archives, untar various tar files, etc. I've had the system up and running for several weeks at a time with no major problems -- occasionally I get a protection violation (which causes a core dump) and very occasionally I get a double fault. (I don't attempt any error recovery except a core dump on an exception -- if the system produces the exception, as opposed to a user process, my system is dead and can't recover.)

I've made extensive changes to the kernel and the memory manager -- the file system hasn't been touched (well, almost -- I had to change something from int to unsigned to get around an Aztec bug (my pipes leaked ;-)).

I used Bach's Design of the Unix Operating System for reference at times (I wanted to see how Sys V did some things).

I'm pretty much running a quasi-1.2 system. I haven't implemented the termcap stuff yet or the boot from hard disk patches (I'm gonna get back to this soon).

Oh, by the way, I'm using an HP1631D Logic State Analyzer with a pod for the 80286 and an 80286 disassembler. Don't leave home without one!!

DESIGN GOALS

I did a lot of kludgey things to get it to work. I wanted to do a number of things via the C preprocessor, but this often didn't work out the way I planned.

My basic design included these goals:

1) fork would share text.

2) The memory manager would start to think in terms of attaching regions to processes (where a region is a contiguous block of memory).

3) The memory manager would allocate memory in 256 byte pages (so a 16 bit number would span 16 Mbytes of physical memory).

4) Run two hole lists to get at extended memory (one below 640K, one above 1 Mbyte).

5) Be able to run more complicated memory models (at a minimum 1 data segment, 1 code segment and 1 stack segment).

6) Take advantage of the processor architecture (i.e. perhaps use call gates instead of always passing messages).

7) Start using 32 bit virtual addresses instead of T, D or S + 16 bit offset.

8) Be able to implement some reasonable form of shared memory.

9) Be able to use it as a real-time system (one task running on a 1kHz interrupt for process-control type things).

10) Try to cleanly split processor/system dependent code from independent code. This isn't easy. It turned out the modularity of the system didn't always buy me that much, because I often had to change the MM and kernel together. Or put special code in the MM to be compatible with the way the kernel worked. Or ... (you get the picture)...

11) When appropriate, rewrite rather than patch.

12) Let the system structure dictate the data structures. I didn't want to go through translation between internal data structures and processor dependent data structures. The segmented architecture of the x286 throws too much stuff out the window (I think).

13) Write in C whenever possible.
    If possible, use canned assembler routines to take advantage of architectural features C compilers generally won't employ (i.e. string instructions on the x286 for block moves).

14) Maintain binary compatibility with older Minix versions.

15) Be able to add server tasks easily.

REAL MODE ENHANCEMENTS/CHANGES

This is a list of the changes I made which were tested in real mode:

Kernel:

1) Used the lidt instruction to define a new interrupt vector table. Reprogrammed the interrupt controllers to put the hardware vectors at 40H-4FH (interrupts 8H-FH are reserved for internal x286 exceptions).

2) Took out the reboot code (in order to reboot, I have to power down).

3) Started using disable/restore like Xinu (instead of lock/unlock).

4) Used the Aztec port subroutines (inportb(port) returns the result of the in instruction). I found port_in(number, &result) a little strange -- I prefer something like result = port_in(number).

5) I generally have separate subroutines to read/write from/to user/kernel space. I found this a little cleaner than doing two umaps followed by a phys_copy. I am also using the Aztec routine movblock(char far *src, char far *dst, int num_bytes) and limiting block moves to 64K. Since my process memory space is no longer contiguous, this appears to be ok.

6) My kernel relocates dynamically. I'm booting off DOS first; the kernel gets its code segment and treats that as its base address (instead of hardcoding the boot address at 0x600). I found this easier, since I don't have to format a floppy disk each time I build a system.

7) Trapped a few exceptions in real mode (i.e. illegal opcode: interrupt 6; segment overrun: interrupt 13). This makes the system surprisingly more robust.

8) Initialize date/time off the CMOS clock.

9) Changed head.asm to give fs, mm and init a stack after the end of BSS.

10) Don't sort partitions in the winchester driver (so the hd numbers agree with my DOS fdisk). Besides, the partition sorting was kinda dead code.

11) I wanted the kernel, mm and fs to use stacks at the end of their bss area (a patch to the a.out file was sufficient). The kernel assumes the stack starts at the end of the bss. This way the kernel could set up separate stack segments if desired. (I wasn't thrilled with how kernel, mm, fs and init have their stacks embedded in their data area.)

12) Removed some if(pc_at) code, since I only use PC-ATs and my enhancements are generally PC-AT specific.

Memory Manager:

1) Memory manager allocates regions for code, text and data. How fork works becomes memory model dependent. For shared I&D, no sharing takes place between processes, and text, data and stack all share 1 segment (maximum 64K). For split I&D, text is shared (not copied on fork()) and a new data/stack region is created.

2) Since the memory manager knows where everything is located, it does its own copying (rather than passing this on to the system task).

3) Brk is a problem. It doesn't apply much to 8086 architectures. The best you can hope to do is brk individual segments when more complicated memory models exist. I'm kludging brk for now. I've communicated with others who've implemented Unix on segmented architectures, and the agreement is that brk is pretty useless.

4) Changed the memory allocation size from 16 byte units to 256 byte units.

5) I kinda took out stack checking for the interim. I have enough other protection violations that I can tell when the stack causes problems.

PROTECTED MODE IMPLEMENTATION

Oh boy. This gets complicated.

1) kernel, mm and fs run code and data/stack out of the GDT at privilege level 0.

2) Each user level process runs out of an LDT with 2 selectors (currently) -- text and data/stack. User level processes run at level 3.

3) Each selector in each process's LDT is aliased as a GDT entry (read/write access) at level 0. This solves such mundane problems as: how does the FS load code on exec into an executable segment?
    MM also knows where these GDT entries are (essentially each region allocated maps to 1 GDT entry).

4) Added some new messages to the system task: MAKE_REGION and DESTROY_REGION. When the MM makes/destroys regions, it needs to construct/delete GDT entries. It's somewhat kludgey but seemed reasonable at the time.

5) Certain initialization of the TTY console driver is necessary. Since we are now using virtual addressing, we don't have to be concerned about where the video ram is located once we have a virtual address. Video ram is accessed at level 0.

6) Interrupt initialization becomes somewhat more involved. All interrupts are currently task gates. I didn't see much of a reason to use trap and interrupt gates anywhere, since in Minix all context is saved on the way into an interrupt anyway. We may as well finish the task switch. This means the kernel now becomes a task (in unprotected Minix, it seems the kernel just assumes the identity of the caller).

7) System call server and task switching. After I drop into protected mode, I execute the following subroutine:

    /* this acts as a system call server -- it hangs the kernel task
     * in the while loop
     */
    static void startup_protected_mode()
    {
        set_task_register(KERNEL_TSS_SELECTOR);
        disable();
        while(1) {
            restart();      /* run new proc */
            /* can only get here from context of previous scheduled
             * process -- calling semantics put function in CX,
             * src/dest in ax and message pointer in bx.
             */
            sys_call(proc_ptr->proc_context.cx_image, cur_proc,
                     proc_ptr->proc_context.ax_image,
                     proc_ptr->proc_context.bx_image);
        }
    }

    I've totally munged the proc structure. I put the TSS in the proc table. The LDTs also currently sit in the proc table.

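    For readers who don't have the 286 manuals handy, here is a rough sketch of what a proc table entry carrying its own TSS and LDT might look like. The names proc_context, proc_tss_selector, ax_image, cx_image and bx_image come from the code above; everything else (the struct names, proc_ldt, build_selector, set_descriptor, check_layout) is made up for illustration and is not the actual source. The layouts follow the 286 hardware formats, and I'm assuming a compiler with fixed-width integer types and no padding between 16 bit fields.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte 80286 segment descriptor (a GDT or LDT entry). */
struct descriptor286 {
    uint16_t limit;      /* segment size in bytes, minus 1 */
    uint16_t base_low;   /* bits 0-15 of the 24 bit physical base */
    uint8_t  base_high;  /* bits 16-23 of the physical base */
    uint8_t  access;     /* present bit, privilege level, segment type */
    uint16_t reserved;   /* must be zero on the 286 */
};

/* The 44-byte 80286 task state segment, in hardware order. */
struct tss286 {
    uint16_t back_link;                     /* selector of the previous task */
    uint16_t sp0, ss0, sp1, ss1, sp2, ss2;  /* stacks for levels 0, 1, 2 */
    uint16_t ip_image, flags_image;
    uint16_t ax_image, cx_image, dx_image, bx_image;
    uint16_t sp_image, bp_image, si_image, di_image;
    uint16_t es_image, cs_image, ss_image, ds_image;
    uint16_t ldt_selector;                  /* GDT selector of this task's LDT */
};

/* Hypothetical proc table entry: TSS and LDT live inside it. */
struct proc_sketch {
    struct tss286        proc_context;      /* saved state, filled by hardware */
    uint16_t             proc_tss_selector; /* far jump target for a task switch */
    struct descriptor286 proc_ldt[2];       /* [0] = text, [1] = data/stack */
};

enum { GDT_TABLE = 0, LDT_TABLE = 1 };

/* A selector is: table index in bits 3-15, table bit in bit 2, RPL in bits 0-1. */
uint16_t build_selector(unsigned index, unsigned table, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((table & 1u) << 2) | (rpl & 3u));
}

/* Fill in a descriptor from a 24 bit physical base and a 16 bit limit. */
void set_descriptor(struct descriptor286 *d, uint32_t base,
                    uint16_t limit, uint8_t access)
{
    d->limit     = limit;
    d->base_low  = (uint16_t)(base & 0xFFFFu);
    d->base_high = (uint8_t)((base >> 16) & 0xFFu);
    d->access    = access;
    d->reserved  = 0;
}

/* Sanity check that the compiler laid the structs out as the hardware expects. */
void check_layout(void)
{
    assert(sizeof(struct descriptor286) == 8);
    assert(sizeof(struct tss286) == 44);
}
```

    With these definitions, build_selector(1, LDT_TABLE, 3) gives 0x0F, which would be a level 3 selector for LDT slot 1. Keeping the TSS inside the proc entry means the hardware task switch itself saves the registers that sys_call later reads back as cx_image, ax_image and bx_image.
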
    Restart looks like this now:

    PUBLIC void restart()
    {
        short ps;

        ps = disable();     /* no interrupts while task switching */
        clear_out_backlink_chain();
        if(cur_proc == IDLE)
            far_jump(0, BUILD_SELECTOR(IDLE_TSS_INDEX, GDT, 0));
        else
            far_jump(0, proc_ptr->proc_tss_selector);   /* do a task switch */
        restore(ps);        /* put interrupts back */
    }

    I hope the above two code examples give a feel for what I'm doing. An interrupt 32 will cause a task switch into the kernel. A far jump to a Task State Segment starts a new task running and saves the old state. Pretty neat (by the way, a task switch on the 286 takes about 185 clock cycles).

8) Had to supply the necessary 286 opcodes via codemacros for the Aztec assembler. Supplied a simple subroutine library to access these special opcodes from C.

9) Had to build the GDT and IDT in real mode before switching into protected mode.

10) Removed some address space checks in the kernel. Replaced the umap mechanism with the following function:

    /* Checks to see if the selected virtual addresses are legal for the
     * selected process.
     * If it returns TRUE, the access will not generate exceptions.
     * If it returns FALSE, the access will cause problems (exceptions)
     * if attempted.
     * Runs with current ldt of calling process.
     *
     * It basically takes the place of umap in a virtual addressed machine.
     */
    int check_proc_addr_space(rp, address, selector, size)
    register PROC *rp;
    int selector;       /* segment (to become selector) */
    char *address;      /* address within segment */
    unsigned size;      /* size of block within segment */

OBSERVATIONS

The protection mechanisms make things much easier to debug when there are major malfunctions in software. Generally, the offending CS:IP is displayed, so it is possible to see where the failure occurred. This was also of use in bringing up the system. Once I got I/O going to the screen, a large amount of the system became self-diagnostic.

I did have one bad bug which required having a separate idle task. Also, kernel mode must run with interrupts disabled.
I initially idled in the kernel by halting (wait for interrupt), but eventually it caused glitches. I understood the problem well enough to know how to fix it (make idle a separate task) but didn't do a thorough analysis of it.

I totally changed dmp around. F1 reports virtual addresses, F2 reports the memory map. Kinda neat to see all my user tasks running in the same virtual address space (LDT selector 0 for code, LDT selector 1 for data, both at level 3).

With some optimization, certain things should be much faster (i.e. context switching). I'm not sure of performance with respect to the old kernel -- but I know the protection mechanisms buy much more reliability. Minor performance degradation is definitely worth it.

DISTRIBUTION

I'm not sure at the moment that I have the facilities to release this. Everything I'm doing is based on Aztec C, and I use a good percentage of the Aztec library for processor dependent stuff. I use a few simple Aztec subroutines (i.e. index, movblock, movmem, port access instructions) + some Xinu-like stuff (disable()/restore() instead of lock()/unlock()).

When I recompiled the kernel with the Aztec 4.1 compiler, it crashed. There were some new (and innovative) bugs in the 4.1 compiler which my kernel tripped over. So I'm compiling the kernel with the 3.4 compiler and the memory manager with the 4.1 compiler. If you have 3.4, you should have no problems.

I suppose I could supply specifications for the interested programmer to rewrite the Aztec supplied subroutines (they're really trivial), or I could rewrite them myself (a few hours of work), or see if Manx will allow me to distribute a few hundred lines of copyrighted assembler.

In addition, the Manx assembler supports some nice assembler features in a macro package (i.e. a procdef macro to automatically pull stuff off the stack). I generally find stuff like:

    mov ax,16[bp]

kinda impossible to follow.

The changes I've made were so extensive, I'm not sure it would be reasonable to post diffs.
I'd be willing to put out source and binary versions of what I'm working with if it's okay with Andy.

To the kernel I've added the following files:

    286opcod.asm -- provides C interfaces to 286 opcodes
    intr286.c    -- interrupt support routines for managing the IDT and
                    handling exceptions
    mpx286.asm   --
    misc286.c    -- assorted code (generally support routines to manage
                    descriptor tables)
    klib286.asm  -- 286 dependent assembler support (i.e. dma_read/write,
                    read cmos ram)
    286info.h    -- defines template structures for 286 descriptors/TSS/etc.
                    Provides a number of macros for a large number of
                    predefined segments.
    80286.h      -- code macros for the assembler for 80286 special opcodes

The above accounts for about 2000 lines of new, 286 specific code, in addition to all the changes sprinkled in with the base system.

To the memory manager, I added a file called region.c, which starts to allocate memory in regions. I also made extensive changes to the fork/exec implementation as outlined above.

FURTHER WORK

The following is a list of some things I'm going to be doing (in no particular order):

1) Start replacing some system task messages with call gates (i.e. let mm or fs treat them as normal subroutine calls).

2) Develop a separate interface between user procs and the kernel that treats all pointers as 32 bit virtual addresses. This would be in addition to the way the current kernel interface works. The kernel, mm and fs would start to treat addresses as 32 bit virtual addresses, and the Interrupt 32 handler would have to repackage the 16 bit addresses into 32 bit virtual addresses.

3) Let kernel, mm and fs map in the user proc address space (in the LDT) to access user memory. Prior to a move to/from user/supervisor space, privilege has to be examined to make sure the OS won't cause a protection exception.

4) Encode the initial IP in the a.out files (instead of defaulting to 0). I find it a pain in the ass to have to link in crtso in front of every project. This looks like a simple enough enhancement.

5) Develop a scheme for installable device drivers. Or at least a reasonable scheme to include device drivers with code, data and stack space separate from the kernel. This may be handy for certain uses (ethernet?).

6) My boot loader sometimes hangs. Don't know why. Sometimes I need to power down a few times before I can finally boot.

7) Look into bringing up ethernet capability using the 3com ethernet board. I'd want to do most reasonable things (ftp, login, file server) via XNS and TCP/IP. I'd also think it would be spiffy to bring up a subset of Cornell Bridge on Minix (Bridge allows XDE to act as a windowed front end for a BSD 4.3 system running XNS).

8) Start using extended memory.

marty
ARPA: leisner.henr@xerox.com
GV:   leisner.henr
NS:   martin leisner:henr801c:xerox