[comp.sys.ibm.pc] Unix emulation alternatives for IBM compatibles

hedrick@athos.rutgers.edu (Charles Hedrick) (02/10/88)

I have recently had occasion to explore various approaches to Unix
emulation on an AT clone.  I thought I'd summarize my experiences for
the benefit of any other Unix hackers who find themselves on an IBM
PC.  (Let me say that this should not be interpreted as an endorsement
of the IBM PC.  I still think there is a special place in the nether
regions reserved for the person who decided to use an 8088 instead of a
68000 in the IBM PC.  Right beside the guy who designed the Ethernet
connector.)

I've tried 3 levels of compatibility: PICNIX, the MKS Toolkit, and
Microport System V.  (I used each of them for enough time to get a
feeling for them, but my permanent system is going to be System V.) They
all seem to be nice pieces of work, and they all have their place.  It
all depends upon how much of Unix you need, and how much you can afford.

PICNIX has recently been posted to one of the IBM newsgroups.  It
supplies the basic Unix file-manipulation commands: mv, rm, cp, etc. 
There are about a dozen commands.  In effect it supplies about the same
set of commands that are built into DOS for dealing with files.  It
doesn't give you much capability that you wouldn't have under DOS
(though "strings" can be useful for certain kinds of hacking).  Here's a
list: cat, chlabel (changing volume labels), chmod, cp, cpdir, df, diff,
du, fgrep, grep, ls, more, mv, mvdir, ncd (cd renamed because DOS has cd
as a builtin), ndate (ditto), necho (ditto), nset (ditto), unset, ntime
(ditto), pwd, rm, show (BSD's which), strings, switchar, tee, touch, wc. 
The commands all do a reasonable emulation of the shell's command line
processing: they handle wildcards and even backquotes (` `).  Switchar is the most
interesting.  It lets you switch the character used for command options. 
None of my MS-DOS documentation describes this, but MS-DOS actually
allows either / or \ to be used in path names.  Normally you can't use /
because that gets interpreted as the beginning of an option.  With
switchar you can change the character used for options to -, as it is on
Unix.  At that point / is usable in paths.  Many of the MS-DOS commands
seem to realize that when your option character is -, you are a Unix
user, and so they display path names using / instead of \.  Even those
that don't will accept / in names that you type.  The most serious
limitation is that diff can't be used to compare even moderately large
files.  Apparently the algorithm they use requires both files to be in
memory, and the program uses the small model.  The -h option helps, but
even with -h there are files I couldn't handle.  Note by the way that
PICNIX is shareware, but $15 has to be a great bargain. 

By the way, if you are going to use any of the Unix stuff, I recommend
setting buffers in config.sys to a nice large value.  Unix is notorious
for being disk-intensive, and even PICNIX and MKS toolkit are close
enough to Unix that you want to do as much as you can to speed up
disk I/O.  I am currently using buffers=50.

There are two other public or shareware programs that would seem to go
nicely with picnix: cshell and ndmake.  (Both are available on Simtel
somewhere in the MSDOS file area.) Ndmake looks quite good.  It appears
to be a recent Unix make (more recent than the versions of make on most
of the Rutgers Unix systems).  The only problem I've seen with it isn't
really its fault.  Makefiles tend to have long lists of files.  MSDOS
limits arguments to 128 characters, so you have to arrange your
makefiles to pass lists of files on separate lines, or put them into
indirect command files.

Cshell is intended as an MSDOS shell modelled
loosely after csh.  It does provide some moderately useful extensions. 
I found the ability to have aliases nice.  Unfortunately, the syntax
used for them isn't the same as under csh, and it doesn't seem to be
possible to pass arguments.  But the really fatal flaw isn't something
the author is likely to be able to do much about.  As in Unix, cshell
expands wildcards, and passes the list of files to the program.  With
the 128 character limit, this causes problems.  Like most Unix
programmers, I tend to like to grep through files looking for things. 
When I do "grep *.c", the resulting list of files is often longer than
128 characters.  The PICNIX grep has no problem when called from
COMMAND.COM, since grep does the expansion itself.  But if you use
cshell, the shell does the expansion, and then the list gets truncated. 
I spent several hours scratching my head about why certain routines
never seemed to be defined, until I realized that my greps weren't
seeing all the files.  In my opinion, any shell that expands the
argument list and then tries to pass it in the normal MS-DOS way isn't
going to be usable.

The next level of compatibility is the MKS toolkit.  This is a product
of Mortice Kern Systems, in Waterloo, Ontario (519-884-2251).  It costs
something like $150 (US).  They say it includes 110 Unix commands.  Note
that this is counting the ksh builtins.  But it is still a lot of
software for the money.  It has a Korn shell, and more or less all the
Unix utilities that you'd use in writing shell scripts, including awk
(the new one, with functions).  It is not all of Unix.  It doesn't have
my favorite dc (desk calculator), nor development tools such as make. 
It does have vi, and the vi looks like it has had a lot of work put into
it.  (I don't know vi, so I didn't use it -- I use Freemacs when I run
under MS-DOS.) They have done some very clever things with ksh.  They
handle the 128-character limit on argument lists correctly.  They expand
wildcards, as a shell is supposed to do, but they pass the result in a
piece of memory that they allocate out of high core.  They put a pointer
to it in the PSP.  This lets them pass the whole list.  Obviously this
only works with their software, but they supply most of the programs
that would be likely to want to use wildcards.  They also supply a way
for your C program to invoke their expander.  The other nice trick is
that they implement part of Berkeley job control.  You can ^Z out of
their vi, and then continue it using fg.  Here is how it works.
When you type ^Z to vi, it does a terminate-and-stay-resident.  It sets
a magic word in the PSP so that fg can tell it is continuable.  It also
puts a pointer in the PSP to a register save area.  This contains the
information needed for fg to restart the program.  I am a die-hard Emacs
fanatic, so this wasn't enough to convert me to vi.  At the end of this
message, I have included code that defines a primitive in Freemacs that
will suspend it in such a way that the MKS fg command can continue it. 
(Note that this only defines the primitive.  There are some tables you
need to add entries to in order to call this.) 

I don't have any dramatic bugs to report in the MKS toolkit, though
there are a couple of minor oddities.  Initially I had hoped that ksh
would give me an Emacs editing mode much like mencsh or tcsh.  It
probably isn't MKS's fault that it doesn't.  But I find it hard to get
used to a mode where ^U and ^W don't do delete line and word. 
Unfortunately ksh doesn't let you rebind keys.  Also, in mencsh and
tcsh, there are two commands that look at the directory.  One is
designed to "complete" a file name.  If there is only one name beginning
with what you have typed so far, it fills in the rest of the name.  If
not, it beeps.  The other is designed for when you can't remember a file
name.  It lists all the files beginning with what you have typed so far,
and then reprints the command up to the current character.  The MKS ksh
combines these into one command that ideally should do both but actually
does neither.  It supplies the whole list of files matching what you
have typed as arguments on the command line.  The problem is that
because it goes onto the command line, it isn't very readable.  The
result is typically a command line longer than the screen.  (Tcsh goes
to a new line and gives you something like "ls -C".) Altogether, I found
myself disabling the Emacs mode and using a keyboard enhancer that
supplies a few Emacs editing commands.

The other odd thing is that I
had a performance problem with their "ls".  I like ls to produce
columnated output, as it does under BSD.  They supply a -C option, so I
figured I'd just alias ls to ls -C.  That works, but is very slow. 
Whenever I typed "ls", the disk would go clunk several times.  I use ls
a lot, and it was annoying.  It turns out that they implement -C by
putting the output into a temporary file and calling "c" to columnate
it.  This means they are writing a temp file, reading a temp file, and
starting a program in a subfork.  I tried putting everything in a RAM
disk, and that didn't help much.  It turns out that the solution is
to use ls -x.  That sorts horizontally instead of vertically.  I don't
much care which way the output runs.  They are able to do that
sorting within ls, so it is fast.

Finally, if you want the real thing, there are several actual Unix ports
available for the PC.  The MKS toolkit is probably enough for most
people (indeed PICNIX may be).  However I am a Unix system programmer,
and I want to be able to move over standard Unix software that we get
off the net, and generally write Unix code.  I decided to use Microport
rather than SCO Xenix, mostly because it is cheaper.  For the AT
version, from Programmer's Connection, it is $169 for the operating
system and utilities and another $209 for C, Fortran, and the other
things you need to do program development (libraries, make, etc.).  As
far as I can tell, this is a full System V release 2.  Indeed it is
sometimes humorously close to the original VAX.  Some of the
documentation seems to have been produced by taking the VAX
documentation and replacing VAX with Intel 286.  I particularly like the
statement that I should consider using a separate disk for the root file
system if I run more than 20 users.  I haven't used it enough to know
how reliable it is going to be.  The major limitation as a program
development system that I have noticed is that they support only the
small and large memory models, not huge.  This could make certain
programs hard to do (a Fortran that is limited to 64Kbyte arrays seems
somewhat limited).  I had little trouble porting the System V version of
MicroEmacs using the large model.  (I had to change two things.  One was
where two include files were in the wrong order in the source.  As far
as I can see that would have failed anywhere.  The other was that when
you exit from Emacs, the terminal modes are not properly restored.  I'm
not sure whose fault this is, but I was able to fix it by doing a couple
of obvious things, and I haven't taken time to figure out exactly which
one fixed it.) Note that the runtime system is a great bargain.  There's
a huge amount of software for that price.  And the software development
option is certainly competitive if you compare it with other C
compilers.

However, you're going to need a bit more of a machine to run
it than you do for the other options I have mentioned.  First, you have
to have an AT (not an XT).  Second, it has to have a real 16-bit disk
controller.  (I have a Leading Edge D2.  I was not pleased to discover
that it wouldn't run System V because they cheaped out and included an
XT disk controller.  I ended up having to replace my disk and
controller.  This made the project much more expensive than it should
have been.) Third, you need more memory.  I have 1M.  This is enough for
what I use it for, which is generally terminal emulation (they supply
kermit, which with their terminal driver works fine on our Unix systems
as a VT102) and editing moderate size files, plus now and then some C
hacking.  When I read a file over about 30KB into Emacs, things start to
slow down.  (To be honest, it behaves like it starts paging.  I guess
that can't be, since System V release 2 isn't supposed to page, but it sure
performs like that's what is happening.) Compilations are also somewhat
slower than they might be, though C is certainly still usable.  I have
some reason to think that both of these are because I don't really have
as much memory as I should.  Both Microport and Programmer's Connection
told me I should really have at least 1.5M of memory.  I probably will
add some more shortly.

One very cute feature of Microport is the
"virtual consoles".  You have N (4 by default) different consoles
available on your PC console.  To switch to console 2 type ALT-F2, etc.
These each have their own getty running, so you log in on each.
They are independent jobs.  It's almost as good as multiple windows.
(With the relatively low resolution used on IBM-compatible PCs, a
real window system would be sort of marginal.  This is probably more
useful.)

Now as promised, the hack to Freemacs to make it continuable under
MKS ksh.

testout	db	'f','o','o','$'

ps_savess	dw	?
ps_savesp	dw	?
ps_saveds	dw	?
ps_savees	dw	?
ps_size		dw	4000h
;cs is handled by the shell
psp_retseg	equ	06ch
psp_retadd	equ	06eh
psp_magic	equ	070h
ps_retaddr	dw	ps_retadd
ps_retadd	dw	contad
ps_retseg	dw	?

ps_prim:
	call	uninit_exit

	call	compact_buffers		;make room for the program.

	mov	dx,phd_seg		;subtract off the allocated segment.
	sub	bx,dx
	mov	ps_size,bx		;save size for tsr
;	mov	es,dx			;get es=allocated segment.
;	assume	es:nothing

;	mov	ah,4ah			;reduce ourself in size.
;	int	21h

	pusha
	mov	ax,sp
	mov	ps_savesp,ax
	mov	ax,ss
	mov	ps_savess,ax
	mov	ax,ds
	mov	ps_saveds,ax
	mov	ax,es
	mov	ps_savees,ax
	mov	ax,cs
	mov	ps_retseg,ax
	mov	bx,phd_seg
	mov	ds,bx			;establish addressability to the PSP
	mov	ax,cs			;save cs:ret addr
	mov	ds:psp_retseg,ax
	mov	ax,ps_retaddr
	mov	ds:psp_retadd,ax
	mov	ax,4321h		;save magic
	mov	ds:psp_magic,ax
	mov	ah,31h			;tsr code
	mov	al,0			;normal return
	mov	dx,ps_size		;size we now want in para
	int	21h

;here if continued

contad:	mov	ax,ps_savess
	mov	ss,ax
	mov	ax,ps_savesp
	mov	sp,ax
	mov	ax,ps_saveds
	mov	ds,ax
	popa

	mov	bx,0ffffh		;now grab all of memory again.
	mov	es,phd_seg
	mov	ah,4ah			;see how much is available.
	int	21h
	mov	ah,4ah			;grab all of it.
	int	21h

	push	cs			;reset the fatal error address.
	pop	ds
	mov	dx,offset abort_fatal
	mov	ax,2524h
	int	21h

	mov	ax,33h*256+1		;turn break checking back off.
	mov	dl,0			;  in case someone turned it on.
	int	21h

	mov	ax,ps_savees
	mov	es,ax
	mov	ax,ps_saveds
	mov	ds,ax

;	assume	ds:data, es:data

;	mov	dx,offset two_crlfs
;	mov	ah,9
;	int	21h

	call	init_entry
	call	paint_screen
	jmp	return_null

brianc@cognos.uucp (Brian Campbell) (02/24/88)

In article <811@athos.rutgers.edu> hedrick@athos.rutgers.edu (Charles Hedrick) writes:
> It doesn't have my favorite dc (desk calculator),
> nor development tools such as make.

If anyone is interested, I have a copy of dc that runs on PCs. I will mail
it to anyone that is interested, or post it if enough interest is
expressed. I also have the source, but will have to check with the author
to see if posting it is a possibility.

> I don't have any dramatic bugs to report in the MKS toolkit, though
> there are a couple of minor oddities.  Initially I had hoped that ksh
> would give me an Emacs editing mode much like mencsh or tcsh.  It
> probably isn't MKS's fault that it doesn't.  But I find it hard to get
> used to a mode where ^U and ^W don't do delete line and word.

Emacs was the first editing mode I used also (I don't like vi), but I now
use the vi editing mode simply because it allows me to use ^U and ^W as I'm
used to.

I haven't found any dramatic bugs in the MKS toolkit either. I do have one
particular gripe though -- many fullscreen programs, such as freemacs and
list, exit without clearing the screen, leaving the cursor on the
last line. When these programs exit under sh the cursor is invariably
placed on the 24th line (which just happens to be the status line of the
above mentioned programs). Any text typed in at this point overwrites the
previous contents of line 24, making things ugly and confusing.

> The other odd thing is that I
> had a performance problem with their "ls".  I like ls to produce
> columnated output, as it does under BSD.  They supply a -C option, so I
> figured I'd just alias ls to ls -C.  That works, but is very slow.

I noticed this too, although I never took the time to figure out why. I
have also resorted to using "ls -x" (although I would much prefer the
sorted-down format). Thanks for the explanation.
-- 
Brian Campbell        uucp: decvax!utzoo!dciem!nrcaer!cognos!brianc
Cognos Incorporated   mail: POB 9707, 3755 Riverside Drive, Ottawa, K1G 3Z4
(613) 738-1440        fido: (613) 731-2945 300/1200, sysop@1:163/8