[comp.os.minix] Announcement II - An Experimental 32-bit Kernel For The i386

mcm@maple.ucsb.edu (Marcelo Mourier) (11/13/90)

	About a month ago, I posted a message in which I announced the work
I'd been doing on an alternative 32-bit Minix for the i386.  To avoid confusion
with Bruce Evans' Minix-386, I'll refer to my version as "MEK".  Well, after
fighting the nastiest bugs, and after modifying several things, I finally
got it up and running...!!!

	The purpose of this mail is to let the PC Minix community know that MEK
is available to whoever would like to experiment with it.  However, I don't know
what the best way to distribute it would be.  Like Minix-386, MEK is based on PC
Minix 1.5; however, there have been so many changes in the source files
(especially in the kernel and memory manager) that posting cdiff files is not a
viable alternative.  The most practical way to do it would be for me to send all
the stuff in a tar file to one of the Minix archive sites (like
plains.nodak.edu), but this would most probably violate PH's copyright on the
Minix sources :-(

	For those of you who missed the previous announcement, here are some
excerpts from it (updated to reflect the new changes), that will give you an
idea of MEK's features.

-------------------------------- begin ------------------------------------

	As I don't have Bruce's kernel installed, I'm not very familiar with
it.  However, I think (correct me if I'm wrong) that even though he uses paging,
the memory management strategy is still the same as in old Minix; i.e.,
it is basically segment oriented.  There's one code and one data segment
(now up to 4GB long) both starting at virtual address zero.  These virtual
addresses are mapped into linear addresses by the 386's segmentation mechanism,
based on the memory map information in the code and data segment descriptors
stored in the process' LDT.  With paging enabled, these linear addresses are
finally mapped into physical addresses by the 386's paging mechanism. However,
this mapping is "constant", in that it never changes after being set up by the
kernel initialization routine.  Basically, it is used for skipping the various
holes in the PC's physical memory layout, thus presenting a "neat" physical
address space into which the linear addresses can be mapped.  In this way, the
relocation and protection of the various virtual address spaces is done by the
segmentation unit of the 386's MMU.

	In my experimental kernel (MEK) the memory management strategy is quite
different.  The story starts with the way virtual memory is laid out in a
process.  Each process has its own 4GB virtual address space, shared by its
three logical segments (text, data, stack) and by the kernel.  The kernel
occupies the top-most 8MB of each process' virtual address space; the remaining
4GB-8MB are left for the process' text, data, and stack segments.  The text
segment starts at virtual address zero, the data segment starts at the first
available 4MB boundary after the end of the text segment (e.g., at virtual
address 0x00400000 in most cases), and the stack segment starts (ends?) at 
virtual address 0xFF800000.  As I said before, the last 8MB of the process'
virtual address space are reserved for mapping the kernel.  This is how the
kernel is shared among all processes in the system.  Addresses 0xFF800000 to
0xFFBFFFFF hold the kernel's text segment, and addresses 0xFFC00000 to
0xFFFFFFFF hold the kernel's data segment.  The reason for starting the data
segment at the first 4MB boundary after the end of the text segment has to do
with code sharing.  By having the data segment start at that address, we are
separating the code page table and the data page table.  Therefore, code can be
shared among several processes (by sharing the PDE that points to the code page
table), without having to share any piece of the data segment.
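
	To make the layout above concrete, here is a minimal sketch in C.  It
is not taken from MEK's sources: the constant and function names are invented
for illustration, and the only facts it relies on are the addresses given above
and the 386's 4KB pages (so that one page table, i.e. one PDE, covers 4MB).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values come from the layout described in the post. */
#define PDE_SPAN     0x00400000u  /* 4MB: address range covered by one PDE  */
#define KERNEL_TEXT  0xFF800000u  /* kernel text: 0xFF800000 .. 0xFFBFFFFF  */
#define KERNEL_DATA  0xFFC00000u  /* kernel data: 0xFFC00000 .. 0xFFFFFFFF  */
#define USER_TOP     KERNEL_TEXT  /* user stack sits just below the kernel  */

/* Page directory index of a virtual address (top 10 bits). */
uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* First 4MB boundary after the end of a text segment that starts at
 * virtual address zero and is text_size bytes long. */
uint32_t data_start(uint32_t text_size)
{
    assert(text_size > 0 && text_size <= USER_TOP - PDE_SPAN);
    return (text_size + PDE_SPAN - 1) & ~(PDE_SPAN - 1);
}
```

Because data_start() always lands on a PDE boundary, text and data never share
a page table, so pde_index() of any text address differs from that of any data
address.  That is what lets two processes share the PDE for the code page table
without sharing any data.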

	MEK doesn't use LDT's for defining a process' address space.  It uses
only the GDT, which contains six segment descriptors: null, kernel code,
kernel data, user code, user data, and TSS.  The kernel (tasks included) runs
at CPL=0, and user processes (MM and FS included) run at CPL=3.  The kernel
descriptors define segments that span the whole 4GB virtual address space, and
that are based at linear address zero.  In this way, the kernel has access to
the whole address space of a process.  The user descriptors define segments 
that start at linear address zero and end at linear address 0xFF7FFFFF, thus
limiting the user's virtual address space size to 4GB-8MB.  As all segments are
based at linear address zero, there's no distinction between virtual addresses
and linear addresses.  The relocation of the different virtual address spaces
is done by the paging unit of the 386's MMU.  In MEK each process has its own
set of page directory and page tables.  These pages are an integral part of 
the process's memory map information.  The set consists of at
least four pages: the page directory table (PDT), one code page table (CPT),
one data page table (DPT), and one stack page table (SPT).  In addition, there
are the kernel code page table and the kernel data page table, which are shared
by all processes.  The proc structure has a new entry, p_pdt, which contains
the physical address of the PDT of the process.  When a process is restarted by
restart(), register CR3 in the 386 is reloaded with the value stored in the
p_pdt field of the process table entry of the process being restarted.  In this
way, the virtual -> physical mapping is switched to that of the new process.
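
	The flat code and data descriptors described above could be encoded
roughly as follows.  This is only a sketch: the bit layout is the standard 386
segment descriptor format, but the function name, the access-byte values and
the choice of page granularity (G=1) with 32-bit default size (D=1) are
assumptions of mine, not something stated about MEK.  The TSS descriptor is
left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed access bytes (present, non-system segment). */
#define ACC_KCODE 0x9Au  /* DPL=0, code, execute/read */
#define ACC_KDATA 0x92u  /* DPL=0, data, read/write   */
#define ACC_UCODE 0xFAu  /* DPL=3, code, execute/read */
#define ACC_UDATA 0xF2u  /* DPL=3, data, read/write   */
#define FLAGS_G_D 0xCu   /* G=1 (4KB units), D=1 (32-bit segment) */

/* Build an 8-byte descriptor for a segment starting at linear address
 * `base` whose last valid offset is `last_offset`. */
uint64_t make_desc(uint32_t base, uint32_t last_offset, uint32_t access)
{
    /* With G=1 the low 12 bits of the limit are implied to be 0xFFF. */
    assert((last_offset & 0xFFFu) == 0xFFFu);
    uint64_t limit = last_offset >> 12;      /* 20-bit limit in pages */

    return (limit & 0xFFFFu)                       /* limit 15:0   */
         | ((uint64_t)(base & 0xFFFFFFu) << 16)    /* base 23:0    */
         | ((uint64_t)(access & 0xFFu) << 40)      /* access byte  */
         | (((limit >> 16) & 0xFu) << 48)          /* limit 19:16  */
         | ((uint64_t)FLAGS_G_D << 52)             /* G, D flags   */
         | ((uint64_t)(base >> 24) << 56);         /* base 31:24   */
}
```

With these assumptions, the kernel segments would be built with
make_desc(0, 0xFFFFFFFF, ...) and the user segments with
make_desc(0, 0xFF7FFFFF, ...), so the only difference between them besides the
privilege level is the 20-bit limit field (0xFFFFF versus 0xFF7FF).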

	Managing address translations in this way has some interesting 
consequences.  For one, at any given time the kernel knows about only one
memory map (the one of the currently active process), which means that any
virtual address from a different process is meaningless to it.  Secondly, the
kernel ONLY knows about the physical memory used by itself and by the currently
active process; any other piece of memory is not directly addressable by it.
This has some nasty consequences when a block of physical memory needs to be
copied to some arbitrary physical address (as during the fork sys call).  In
order to be able to access ANY page of physical memory, the kernel uses the
following trick.  Two pages in the kernel's data address space are reserved
for use as source and destination "window-pages".  The kernel dynamically maps
these pages to the corresponding physical pages it wants to access.  After the
mapping is set up, the kernel can copy the data by reading from the source
window and writing to the destination window.
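
	The window-page trick can be illustrated with a small user-space
simulation.  Everything here is hypothetical: a byte array stands in for
physical memory, and an array of frame numbers stands in for the two page table
entries of the windows.  In a real kernel, map_window() would instead rewrite
the PTE and flush the stale translation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define NFRAMES   8u                        /* toy RAM: 8 page frames */

uint8_t phys_mem[NFRAMES * PAGE_SIZE];      /* stands in for physical RAM */

enum { SRC_WIN, DST_WIN, NWINDOWS };
uint32_t window_pte[NWINDOWS];              /* frame mapped by each window */

/* Point a window at a physical page frame. */
void map_window(int win, uint32_t frame)
{
    assert(win >= 0 && win < NWINDOWS && frame < NFRAMES);
    window_pte[win] = frame;
}

/* Address at which the kernel "sees" the frame mapped by a window. */
uint8_t *window_addr(int win)
{
    return phys_mem + window_pte[win] * PAGE_SIZE;
}

/* Copy `bytes` bytes between two arbitrary physical addresses, remapping
 * the windows each time the source or destination crosses a page. */
void phys_copy(uint32_t src, uint32_t dst, uint32_t bytes)
{
    assert(bytes <= NFRAMES * PAGE_SIZE);
    assert(src <= NFRAMES * PAGE_SIZE - bytes);
    assert(dst <= NFRAMES * PAGE_SIZE - bytes);

    while (bytes > 0) {
        uint32_t soff = src % PAGE_SIZE;
        uint32_t doff = dst % PAGE_SIZE;
        /* Largest chunk that stays inside both current pages. */
        uint32_t n = PAGE_SIZE - (soff > doff ? soff : doff);
        if (n > bytes)
            n = bytes;

        map_window(SRC_WIN, src / PAGE_SIZE);
        map_window(DST_WIN, dst / PAGE_SIZE);
        memmove(window_addr(DST_WIN) + doff, window_addr(SRC_WIN) + soff, n);

        src += n;
        dst += n;
        bytes -= n;
    }
}
```

The point of the sketch is that the copy loop never touches a physical address
directly; it only reads and writes through the two windows, which is all a
kernel with paging enabled can do.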

	Accessing video memory is done in the same way.  A third page in the
kernel's data address space is reserved for the "video window", which is
mapped into the video RAM in the video adapter during system initialization.
The same goes for the BIOS vector table and BIOS data area, which are located
in the first page of physical memory.  This page frame is mapped into the
kernel's BIOS window-page, also during initialization.

-------------------------------- end ----------------------------------------

	At this point I'm looking for someone who's familiar with the GNU tools
(gcc, gas, gld, etc.) and who would like to help in the process of porting them
to MEK, in order to make it self-supporting.  So far I've been working on MEK
by cross-developing on SCO Unix V/386, using the Microsoft C compiler and
assembler that come with it.  Doing it this way is a real pain, so I'd like to
have MEK have its own compiler as soon as possible...

	I've made a three-floppy MEK system (boot disk, RFS disk, UFS disk)
that contains most of the binaries needed to manage files, edit them, etc.
This system should boot on any PC-AT/386 with at least 2MB of RAM.  You won't
believe how _fast_ the system is when you have a 384-block buffer cache..!!!

	If any of you has experience cross-compiling the GNU tools, and some
extra time to help me port them to MEK, please contact me.  We'll then see
the best way for me to send you a copy of the system.


--
Marcelo Mourier (mcm@cs.ucsb.edu)



HBO043%DJUKFA11.BITNET@cunyvm.cuny.edu (Christoph van Wuellen) (11/16/90)

I think it is NOT against copyright to post cdiffs only.

This is what is done every day here.

C.v.W.