avr@mtgzz.UUCP (XMRP50000[jcm]-a.v.reed) (05/04/88)
In article <631@vsi.UUCP>, friedl@vsi.UUCP (Stephen J. Friedl) writes:
> Note that porting ksh is not at all a task for the novice; it is
> not (to put it politely) "maximally portable".

What experience is that comment based on? My personal toolkit is
based on ksh, and so I've brought ksh to every UNIX box I've worked
on. It was NEVER more than one day's work; in most cases a simple
make is enough to bring it up and have it work to the man page. Once,
I put it on a one-of-a-kind laboratory box with a very hybrid but
mostly SVID-conforming system. After cpio'ing it in, I just typed make
and got a working executable that performed flawlessly for over two
years. In my book, ksh is THE paradigm of maximum portability.

Either you have had an experience you ought to tell us more about, or
you owe Dave Korn a public apology.

Adam Reed (mtgzz!avr)
wesommer@athena.mit.edu (William Sommerfeld) (05/06/88)
In article <4063@mtgzz.UUCP> avr@mtgzz.UUCP (XMRP50000[jcm]-a.v.reed) writes:
>In article <631@vsi.UUCP>, friedl@vsi.UUCP (Stephen J. Friedl) writes:
>> Note that porting ksh is not at all a task for the novice; it is
>> not (to put it politely) "maximally portable".
>
>What experience is that comment based on?

I don't know about his experience, but I heard an interesting story
from someone at Apollo.

When they did their UNIX emulation for AEGIS, one of the things they
wrote was a version of the stdio library. They worked from the
published interface specifications for the library (the manual pages),
not from existing source code. As a result, their definition for what
a FILE * looked like internally was not exactly what is found in the
SysV distribution in <stdio.h>. They put this into a release, and
shipped it.

Some time later, they got an irate phone call from Korn, who complained
that his shell didn't compile on Apollos. Why? Portions of ksh went
"around" the published interface to stdio, and mucked with the elements
of the FILE * directly. Apollo reluctantly re-implemented stdio with a
"compatible" header file.

Korn should have known better.

- Bill
guy@gorodish.Sun.COM (Guy Harris) (05/06/88)
> > Note that porting ksh is not at all a task for the novice; it is
> > not (to put it politely) "maximally portable".
>
> What experience is that comment based on? My personal toolkit is
> based on ksh, and so I've brought ksh to every UNIX box I've worked
> on. It was NEVER more than one day's work; in most cases a simple
> make is enough to bring it up and have it work to the man page.

What version of the Korn shell is your experience based on? The latest
"ksh-i" may have fixed some of the problems; however:

1) The previous version assumes that you can catch SIGSEGV and have
   the SIGSEGV handler grow the data space and return, causing the
   faulting instruction to be reexecuted. This is *not* true on a
   number of machines. The 68000 can't support this in general, and
   the 68010/68020/etc. make it a bit of work to support it.

   This one isn't Dave Korn's fault; a while ago, I think John Mashey
   claimed that he pushed Steve Bourne to use this "feature" in the
   Bourne shell, and the Korn shell inherited it from there.

   We fixed this a while ago in the Sun version of the Bourne shell;
   the fixes generally apply to the Korn shell as well. I tried
   running a fixed and non-fixed version of the S5R2 Bourne shell on
   a 3B2 (on which the aforementioned trick does work); I did
   something such as

	echo `cat *.c`

   in the Bourne shell source directory, and found that the changes,
   which consisted largely of explicit checks to see if the data space
   needed to be expanded, didn't make any noticeable performance
   difference.

   I have heard that this is fixed in "ksh-i".

2) The previous version does not use "getpwnam" to get home
   directories when expanding "~username" constructs. It avoids
   "getpwnam" so as not to do "malloc"s in certain places; "malloc" in
   those places doesn't work because of the, umm, *quirky* way the
   Bourne (and Korn) shells manage the heap.

   This doesn't work very well on systems that have Yellow Pages, or
   in fact on any system where a simple-minded scan of "/etc/passwd"
   won't necessarily find all the entries that "getpwent", "getpwnam",
   etc. would find.

3) The previous version assumed that the "_iob" structures for standard
   I/O were in one long contiguous block. The 4.3BSD standard I/O
   library puts the first 30 or so into such a block, and "malloc"s
   additional ones.
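
To make point 2 concrete, here is a minimal sketch of "~username"
expansion that goes through "getpwnam" rather than scanning
"/etc/passwd" by hand. It is not taken from any version of ksh;
expand_tilde() and its buffer convention are made up for illustration,
and it assumes "getpwnam" can be called safely at that point, which is
the very thing the shell's heap management made difficult.

```c
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Expand a leading "~username" using getpwnam(), so that entries served by
 * a network name service are found as well as those in /etc/passwd.
 * expand_tilde() is a hypothetical helper, not a ksh function.
 * A bare "~" (no user name) is not handled here.
 * Returns 0 on success, -1 if the user is unknown or the result won't fit.
 */
int expand_tilde(const char *word, char *out, size_t outsize)
{
    char name[64];
    const char *rest;
    size_t len;
    struct passwd *pw;
    int n;

    if (word[0] != '~')
        return -1;
    rest = strchr(word, '/');             /* path after the user name, if any */
    len = rest ? (size_t)(rest - word - 1) : strlen(word + 1);
    if (len == 0 || len >= sizeof name)
        return -1;
    memcpy(name, word + 1, len);
    name[len] = '\0';

    pw = getpwnam(name);                  /* consults whatever the system consults */
    if (pw == NULL)
        return -1;
    n = snprintf(out, outsize, "%s%s", pw->pw_dir, rest ? rest : "");
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

A caller would pass the word and a buffer, for example
expand_tilde("~guy/src", buf, sizeof buf), and fall back to the literal
word when the function returns -1.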
wolfgang@mgm.mit.edu (Wolfgang Rupprecht) (05/06/88)
In article <52159@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
> 3) The previous version assumed that the "_iob" structures for standard
>    I/O were in one long contiguous block. The 4.3BSD standard I/O
>    library puts the first 30 or so into such a block, and "malloc"s
>    additional ones.

Ksh and Kcl (Kyoto Common Lisp) both have/had problems with assuming
that all _iob's are always contiguous. What *is* the "approved"
method of finding all _iob's? I have used the internal 4.3BSD libc
function findiop:_fwalk(function) to walk over and close all but the
first 3 _iob's. Is there a *legit* way to do this?

---
Wolfgang Rupprecht      ARPA: wolfgang@mgm.mit.edu (IP 18.82.0.114)
326 Commonwealth Ave.   UUCP: mit-eddie!mgm.mit.edu!wolfgang
Boston, Ma. 02115       TEL:  (617) 267-4365
guy@gorodish.Sun.COM (Guy Harris) (05/07/88)
> Ksh and Kcl (Kyoto Common Lisp) both have/had problems with assuming
> that all _iob's are always contiguous. What *is* the "approved"
> method of finding all _iob's? I have used the internal 4.3BSD libc
> function findiop:_fwalk(function) to walk over and close all but the
> first 3 _iob's. Is there a *legit* way to do this?

No.
gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/07/88)
In article <5146@bloom-beacon.MIT.EDU> wolfgang@mgm.mit.edu (Wolfgang Rupprecht) writes:
>What *is* the "approved" method of finding all _iob's?

There isn't one. This is an extremely implementation-dependent
matter, subject to change even between C releases for the same system.

The point of using <stdio.h> facilities is to encapsulate I/O streams
as an abstract data type. If you really need to do something not
supported by the standard interface, you should either avoid using
<stdio.h> facilities or else work to get the standard interface
extended to provide whatever it is that you really need to do.

The last example of this that I know of is an extension of fflush()
so that a NULL argument indicates a request to flush all open output
streams. I got this adopted into the proposed ANSI C standard as
probably the last "invention" to make it into the standard (although
I proposed it over a year ago, it got overlooked until last meeting).
Buffer flushing before a fork() has been the only time I have needed
to cheat on the standard interface, and with this change to fflush()
EVENTUALLY that cheat won't be necessary. Of course it will be a
while before I can rely on fflush() supporting this new feature, so
there will be an #if __STDC__ in the rare places where I need this
capability.

If your need to peek at the FILE implementation differs from this,
I'm curious to hear what it is.
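
For readers who have not met the fflush() extension described above,
the following is a small sketch of the "flush everything, then fork"
pattern on a system whose library accepts fflush(NULL). The helper
name run_child() and the choice to have the child simply exit are
assumptions made for the example; on a pre-ANSI library the fflush()
call would need the #if __STDC__ guard the post mentions.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Flush every open output stream, then fork.  Without the flush, text still
 * sitting in a stdio buffer would be copied into the child and could be
 * written twice.  run_child() is a hypothetical helper.
 * Returns the child's exit status, or -1 on error.
 */
int run_child(int child_status)
{
    pid_t pid;
    int status;

    if (fflush(NULL) == EOF)              /* NULL means "all output streams" */
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_status);              /* _exit() skips the child's stdio cleanup */

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than exit() so that it does not flush
inherited buffers a second time on its way out.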
friedl@vsi.UUCP (Stephen J. Friedl) (05/07/88)
In article <4063@mtgzz.UUCP>, avr@mtgzz.UUCP (XMRP50000[jcm]-a.v.reed) writes:
> In article <631@vsi.UUCP>, friedl@vsi.UUCP (Stephen J. Friedl) writes:
> > Note that porting ksh is not at all a task for the novice; it is
> > not (to put it politely) "maximally portable".
>
> What experience is that comment based on? My personal toolkit is
> based on ksh, and so I've brought ksh to every UNIX box I've worked
> on. [...] In my book, ksh is THE paradigm of maximum portability.
> Either you have had an experience you ought to tell us more about, or
> you owe Dave Korn a public apology.

The version I worked with was ksh from the Toolchest in Fall 1986
on a friend's 3B5. We were going through it in the hopes of
adding some features, so we took a lint pass over the code as a
start; I usually do this with code I'm about to tear into.

Lint did not paint a pretty picture. Uncast NULL pointer
arguments to functions, functions that both "return(e)" and "return",
pointer-returning functions not declared before their use, and
expressions like (vague recollection here):

	*p = movstr(*q++, *p++);	/* undefined eval order */

It did not take us very long to realize that we really didn't
want to tackle this and later we bought Aspen's product for uport
on an AT rather than even try it ourselves; I wonder if anybody
from Aspen could comment on this port?

I've not seen ksh source in a long while, and things may have
changed for the better by now. It strikes me that a cleanup of
this version of ksh would really be a big job (but then ksh-i was
probably a big job too).

Software quality can be measured in a lot of ways, and the
outside view and the inside view may be radically different. I
use ksh every day and cannot imagine living without it, but that
version of ksh was not the paradigm of maximum portability.

-- 
Steve Friedl    V-Systems, Inc. (714) 545-6442    3B2-kind-of-guy
friedl@vsi.com    {backbones}!vsi.com!friedl    attmail!vsi!friedl
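
The expression quoted from memory above is undefined because p is both
modified (p++) and read (*p) with nothing to order the two. A sketch
of one well-defined reading follows. The movstr() here is only a
stand-in that copies a string and returns a pointer to the end of the
copy; its argument order and return value are assumptions, and
append_each() is a made-up wrapper, not ksh code.

```c
#include <stddef.h>

/* Hypothetical stand-in: copy src into dst and return a pointer to the
 * terminating NUL in dst.  The real movstr() in ksh may differ. */
char *movstr(const char *src, char *dst)
{
    while ((*dst = *src++) != '\0')
        dst++;
    return dst;
}

/*
 * One well-defined reading of  *p = movstr(*q++, *p++);
 * For each of n slots, copy q[i] into the buffer p[i] points at, then leave
 * p[i] pointing at the end of the copied text.
 */
void append_each(char **p, char **q, size_t n)
{
    while (n-- > 0) {
        char **slot = p++;                /* advance p in its own statement */
        *slot = movstr(*q++, *slot);      /* slot itself is never modified here */
    }
}
```

Splitting the increment out costs one temporary and removes the
dependence on the order in which a compiler evaluates the operands.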
lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (05/07/88)
In article <645@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>In article <4063@mtgzz.UUCP>, avr@mtgzz.UUCP (XMRP50000[jcm]-a.v.reed) writes:
>The version I worked with was ksh from the Toolchest in Fall 1986
>on a friend's 3B5. We were going through it in the hopes of
>adding some features, so we took a lint pass over the code as a

That's some wealthy friend!

>start; I usually do this with code I'm about to tear into.

Likewise.

>Lint did not paint a pretty picture.

I had a conversation with Korn about this once. He said that
if he made ksh lint-free it would lose portability! I
was pretty surprised, but I'll believe him.

>It did not take us very long to realize that we really didn't
>want to tackle this ...

I made two changes to ksh. One was a simple change for the default
TMOUT parameter. The other was to reset the IFS to " \t\n" on startup
and to ignore the value inherited from the environment (I did the same
for sh and that was easy to find). I got the answer from Korn, since I
couldn't figure it out. Turned out to be a one liner.

Back to the stdio hassle with ksh: Korn also told me once that
one of the biggest problems in making sh portable was that the
stdio *implementations* differed so much. I'm sure he would
have done the "portable thing" if it were possible, but it must
not have been possible.

Isn't all the world System V :-)
guy@gorodish.Sun.COM (Guy Harris) (05/08/88)
> I had a conversation with Korn about this once. He said that
> if he made ksh lint free that it would lose portability! I
> was pretty surprised, but I'll believe him.

I'm pretty surprised, too, but I'm not sure *I* believe him. The
complaints listed (null-pointer arguments not properly cast, functions
returning pointers not properly declared, etc.) reduce portability
when fixed only if "portability" means "portability to systems with
horribly broken C compilers". They *increase* portability to systems
with *valid* C implementations where:

1) pointers and "int"s are not the same size

2) null pointers aren't represented by all-zero bit patterns

3) functions returning pointers, and functions returning "int"s,
   return their results differently

There may be cases where making it completely free of "lint"
complaints may cause problems; however, all the problems listed above
can be fixed without impairing portability, and the first of them, at
least, is a *real* problem on *many* implementations.

> Back to the stdio hassle with ksh, Korn also told me once that
> one of the biggest problems in making sh portable was that the
> stdio *implementations* differed so much. I'm sure he would
> have done the "portable thing" if it were possible, but it must
> not have been possible. Isn't all the world System V :-)

The only way that making something portable is made difficult by
differences in *implementations* is if you're depending on particular
details of the implementation. If you want to make code portable, you
avoid doing that wherever possible; you code to the specification, not
the implementation.

If you absolutely *must* depend on those details, you had better be
prepared to give up portability. Period. (And, frankly, I don't think
that Korn had to depend on those details; I redid the Korn shell's use
of standard I/O to depend far less on the implementation just so I
*could* get it to work on a system where not all "FILE" structures
came from a single array.)
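
The uncast-null-pointer complaint bites hardest in calls to functions
that take a variable number of arguments, where the compiler has no
prototype to convert a bare 0 to a pointer. The sketch below uses a
made-up concat() helper (not from ksh or any library) whose argument
list ends with a null "char *"; on a machine where pointers are wider
than "int", passing a plain 0 instead of (char *)NULL could hand
va_arg() the wrong number of bytes.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical variadic helper: concatenate strings into out until a null
 * char pointer is seen.  Returns out, truncating if outsize is too small.
 */
char *concat(char *out, size_t outsize, ...)
{
    va_list ap;
    const char *s;
    size_t used = 0;

    if (outsize == 0)
        return out;
    va_start(ap, outsize);
    while ((s = va_arg(ap, char *)) != NULL) {   /* expects a char *, not an int */
        size_t len = strlen(s);
        if (len > outsize - 1 - used)
            len = outsize - 1 - used;            /* truncate to what still fits */
        memcpy(out + used, s, len);
        used += len;
    }
    va_end(ap);
    out[used] = '\0';
    return out;
}
```

A portable call spells out the terminator's type:
concat(buf, sizeof buf, "/usr", "/bin", (char *)NULL). The same
applies to real interfaces of this shape, such as execl().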
fox@alice.marlow.reuters.co.uk (Paul Fox) (05/10/88)
In article <4063@mtgzz.UUCP> avr@mtgzz.UUCP (XMRP50000[jcm]-a.v.reed) writes:
>In article <631@vsi.UUCP>, friedl@vsi.UUCP (Stephen J. Friedl) writes:
>> Note that porting ksh is not at all a task for the novice; it is
>> not (to put it politely) "maximally portable".
>
>Either you have had an experience you ought to tell us more about, or
>you owe Dave Korn a public apology.
>
> Adam Reed (mtgzz!avr)

Well, I managed to get it running on Xenix/386 very easily. Xenix/286
was a different kettle of fish. In fact I couldn't spare the time to do
the work. I was pretty much appalled that the korn-shell would not
work. It core dumps on start-up, and I strongly suspect that there is
some dependency in there expecting pointers and ints to be the same
size. Could have been due to the malloc routines.

Anyway, I do not expect these sorts of problems to be present in such
a widely used and distributed piece of software. The
shell-script/makefile is an ab*rtion if ever I saw one.

Thumbs down for this one, I'm afraid.

=====================
All opinions are my own. The powers that be ...
UUCP: fox@alice.marlow.reuters.co.uk
rdavis@convex.UUCP (Ray Davis) (05/10/88)
I just finished tracking down why people couldn't get ksh to work
correctly under Convex UNIX 6.1. There were two symptoms. First,
history was garbaged up. History entries looked like single
characters. Second, command line editing didn't work at all.

Convex UNIX is mostly BSD 4.2 with some 4.3 stuff, SUN NFS stuff, and
our own support for huge memory, disk and process models.

The first problem came about because Convex supports 256 file
descriptors. ksh thinks _NFILE is going to fit in an unsigned char. I
explicitly defined it to be 20 as a test, and this fixed the history
file problem.

The second problem happened because ksh has its own _filbuf() function
instead of using the one in the stdio library. Well, in support of 256
file descriptors, Convex decided to allocate stdio buffers dynamically
*when they are actually used*. The Convex _filbuf checks file->_base
to see if it is null, and if so calls another function to get a
buffer. After adding these two lines to the ksh _filbuf, command line
editing works like a champ.

Ray Davis
Convex Computer Corp, Richardson TX
{uunet, allegra, ihnp4, sun, uiucdcs}!convex!rdavis, 214/952-0521
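
The second fix is easier to picture with a toy stream in front of you.
What follows is a sketch of a fill routine that allocates its buffer
on first use, in the spirit of the check described above. It is not
the Convex or ksh code: struct toy_stream, toy_filbuf(), toy_getc()
and TOY_BUFSIZ are invented names, and the layout only loosely follows
the classic FILE structure.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BUFSIZ 512

/* Toy stream, loosely modelled on the classic FILE layout.  Every name here
 * is hypothetical; real stdio internals differ between systems. */
struct toy_stream {
    int fd;
    int cnt;                 /* bytes left in the buffer */
    unsigned char *ptr;      /* next byte to hand out */
    unsigned char *base;     /* buffer start; NULL until first use */
};

/* Refill the buffer and return the next byte, or -1 at EOF or on error. */
int toy_filbuf(struct toy_stream *s)
{
    ssize_t n;

    if (s->base == NULL) {                /* buffer not allocated yet: get one */
        s->base = malloc(TOY_BUFSIZ);
        if (s->base == NULL)
            return -1;
    }
    n = read(s->fd, s->base, TOY_BUFSIZ);
    if (n <= 0) {
        s->cnt = 0;
        return -1;
    }
    s->ptr = s->base;
    s->cnt = (int)n - 1;                  /* one byte is returned right away */
    return *s->ptr++;
}

/* getc-style macro: take a byte from the buffer, refilling when it is empty. */
#define toy_getc(s) ((s)->cnt > 0 ? ((s)->cnt--, *(s)->ptr++) : toy_filbuf(s))
```

A private fill routine that skips the null check works only as long as
every stream arrives with a buffer already attached, which is exactly
the assumption the dynamic allocation broke.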
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/12/88)
In article <341@alice.marlow.reuters.co.uk> fox@alice.UUCP (Paul Fox) writes:
...
>Well, I managed to get it running on Xenix/386 very easily. Xenix/286 was
>a different kettle of fish. In fact I couldn't spare the time to do the work.

I ported it to Xenix/286 (2.1.3) by typing "make" and reading a
magazine for a while. I don't recall any problems at all (this is the
ksh-i version; the older versions took some doing).

I have ported it to Sun, Ultrix, Alliant, 3B1, Convex, and Encore. All
were relatively easy, but the Convex version still doesn't have
working line editing, due to a bug I don't have time to chase.

It doesn't ALWAYS self install so smoothly, but the changes are
usually in the makefiles (and scripts) rather than the code.

-- 
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (05/12/88)
Guy Harris writes:
>There may be cases where making it completely free of "lint" complaints may
>cause problems; however, all the problems listed above can be fixed without
>impairing portability, and the first of them, at least, is a *real* problem on
>*many* implementations.

Okay, if there is another ksh release and if I get to beta test it, I
will lint the code and fix as many of the bugs as I can.

>The only way that making something portable is made difficult by differences
>in *implementations* is if you're depending on particular details of the
>implementation. If you want to make code portable, you avoid doing that
>wherever possible; you code to the specification, not the implementation.

I think Korn placed a big emphasis on ksh's performance. I would gladly
give up a bit of portability to gain performance, Korn may feel the same.
I would rather have portability and high performance but I'll give up
portability if that was necessary. To me, it is more important to make ksh
users happy than ksh hackers.

-- 
Larry Cipriani, AT&T Network Systems and Ohio State University
Domain: lvc@tut.cis.ohio-state.edu
Path: ...!cbosgd!osu-cis!tut.cis.ohio-state.edu!lvc (weird but right)
guy@gorodish.Sun.COM (Guy Harris) (05/12/88)
> I think Korn placed a big emphasis on ksh's performance. I would gladly
> give up a bit of portability to gain performance, Korn may feel the same.
> I would rather have portability and high performance but I'll give up
> portability if that was necessary. To me, it is more important to make ksh
> users happy than ksh hackers.

People often toss portability in favor of "performance" when they
really don't gain much performance by doing so. I have yet to see any
evidence that on a modern machine, at least, the SIGSEGV trick for
growing a process's data space on demand buys you anything. (I tried

	echo `cat *.c`

in the Bourne shell source directory, both with the vanilla S5R2
Bourne shell and one modified to explicitly test whether the data
segment had to be grown, on a 3B2/400; the performance was the same
for both shells.)

Given that I have worked on a variety of machines, with different
standard I/O implementations, different byte orders, different sizes
for pointers and integers, different levels of support for catching
SIGSEGV and returning from the signal handler, etc., *I'd* gladly give
up a bit of performance to gain portability. You can't make "ksh"
users very happy if "ksh" doesn't work at all....
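
To show how cheap an explicit test can be, here is a sketch of a
growable string area that checks its capacity before every append
instead of waiting for a fault. It is an illustration under stated
assumptions, not shell source: the shells discussed here grow their
data space themselves, while this sketch swaps in realloc() to stay
short, and struct stak, stak_reserve() and stak_append() are
hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string area.  realloc() stands in for the shell's
 * own data-space management only to keep the sketch short and portable. */
struct stak {
    char *base;
    size_t used;
    size_t cap;
};

/* Make sure at least need more bytes fit; returns 0 on success, -1 on failure. */
int stak_reserve(struct stak *s, size_t need)
{
    size_t cap;
    char *p;

    if (need <= s->cap - s->used)         /* the explicit check: one compare */
        return 0;
    cap = s->cap ? s->cap : 64;
    while (cap - s->used < need)
        cap *= 2;
    p = realloc(s->base, cap);
    if (p == NULL)
        return -1;
    s->base = p;
    s->cap = cap;
    return 0;
}

/* Append text, keeping the area NUL-terminated; returns 0 or -1. */
int stak_append(struct stak *s, const char *text)
{
    size_t len = strlen(text);

    if (stak_reserve(s, len + 1) < 0)
        return -1;
    memcpy(s->base + s->used, text, len + 1);   /* copies the NUL, doesn't count it */
    s->used += len;
    return 0;
}
```

In the common case the check is a subtraction and a comparison, which
fits the observation above that the explicit tests made no measurable
difference.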
avr@mtgzz.UUCP (XMRP50000[jcm]-a.v.reed) (05/13/88)
In article <10806@steinmetz.ge.com>, davidsen@steinmetz.ge.com (William E. Davidsen Jr) writes:
> In article <341@alice.marlow.reuters.co.uk> fox@alice.UUCP (Paul Fox) writes:
> >Well, I managed to get it running on Xenix/386 very easily. Xenix/286 was
> >a different kettle of fish. In fact I couldn't spare the time to do the work.
> I ported it to Xenix/286 (2.1.3) by typing "make" and reading a magazine
> for a while. I don't recall any problems at all (this is the ksh-i
> versions, the older versions took some doing).

When ksh was first being ported to the 286 here at AT&T, we found
enough bugs in the Intel/Microsoft C compiler to fill a 7-page
document. Ksh-i includes work-arounds for the known compiler bugs.
However, porting difficulties that result from buggy compilers are
definitely not ksh's fault.

Adam Reed (mtgzz!avr)
dgk@ulysses.homer.nj.att.com (David Korn[eww]) (05/20/88)
There have been some recent articles concerning the portability of ksh.
Since I wrote ksh, let me comment about portability in general and
about ksh specifically.
Portability is a complex issue; enough to write a book about. In fact,
Mark Horton is writing a book about portability called "How to Write
Portable Software in C". The book should be published by Prentice Hall
sometime next year.
I have worked on many UNIX and UNIX look-alike systems during the last
twelve years. One of the most annoying things to me is how difficult
it is to port software to each of these environments. Ironically,
low-level languages such as C tend to be more portable than
higher-level languages such as shell. Makefiles tend to be least
portable of all.
There are two distinct considerations concerning portability. One
concern is how portable the code is across different systems. The
second is how portable the code is with respect to the various
compilers or interpreters that it may run with. The first is a
design goal, which I refer to as design portability; the latter is an
implementation issue, which I refer to as implementation portability.
To write a portable program, you must design it to be as system
independent as possible and then implement it in a way that the code
ports to as many environments as possible. The importance of
design portability is that if the program becomes universally available,
then users will not have to be concerned with implementation
portability issues when they use the program.
To achieve a very high degree of implementation portability in C
requires doing things in less than ideal ways. For example, some
implementations of C allow only externals of no more than six case
insensitive characters. To be portable, you have to constrain the
name space for external routines. You also can't use many of the
recent (not very recent) features of the language. I have found
problems with using void, structure assignment, non-unique structure
names, and enumerations. There are some C pre-processors that allow
#ifdef but do not allow #if constructs.
To write portable shell scripts, you cannot use shell functions,
# for comment, a colon within expansions (for example ${name:-bar}),
pattern classes which use ! for negation, and a number of other features
that we often take for granted. The Bourne shell was not designed to be
system portable and relies on the underlying system for carrying out
basic tasks. For example, the echo command is not part of the Bourne
shell on some systems and since the behavior is incompatible on
different systems, any script that uses echo may not be portable.
Another prime example is test.
One of the goals of ksh is to be able to write shell scripts that
are portable across environments. The full benefit to this can
only be realized when ksh is readily available everywhere. To meet the
design goal, the shell had to have enough built-in capability that
useful scripts could be written without relying on the host environment.
This is one reason why test is a built-in command and why the print
built-in was added. The echo command, while widely used, varies on
different systems. I know of at least four variants around.
To become readily available, the code and makefiles for ksh must
port easily to other environments.
I decided to base the shell I/O on standard I/O several years ago.
This is a decision I have frequently regretted because it has caused
many portability problems and been the source of many of the
bugs that have been reported. I chose to use standard I/O
for two reasons. First of all, I wanted ksh to port to non-UNIX
systems. Secondly, I wanted to make it easy to add built-in commands.
I did not feel that it should be necessary to rewrite a command to make
it a built-in command. I envision being able to add built-ins at run time.
When I changed the code to use standard I/O there was no clear
description of what was the interface and what was private
to the implementation. There were and still are holes in the
specification of this interface that make it necessary to
muck around with the interface in order for it to be used by the
shell. Let me list a few of the problems that I encountered:
1. How do I find the stream, given the file number? This
is required in order to implement the construct 3<&4.
2. How can I use a stream after a fork()? How do the
parent and child synchronize? For example, the program
read line;cat < file
should behead one line of the file. Does the C program
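#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char buffer[80];
    /* Read one line; stdio may buffer more than that line. */
    fgets(buffer, sizeof buffer, stdin);
    /* cat sees only what stdio left unread on descriptor 0. */
    system("cat");
    return 0;
}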
work correctly on your system when the input comes from a file?
3. What is the state of stdio after a longjmp()?
4. Can I duplicate a stream? How can I move a stream to another
file descriptor?
5. Is it legal to use the same buffer for more than one stream?
Copying buffers each time the shell forks can be expensive.
Why not use one buffer for all the output?
6. How can I create a stream from a string? After all, I can
print to a string, why shouldn't I be able to read from one?
I should be able to implement eval by calling the parser with
input from a stream corresponding to a string.
7. How do I hook the stdio library to the shell editing code?
8. How can I tell whether a buffer has been written or not? How
can I tell what the last character written was?
To use standard I/O within ksh, I had to make some decisions.
Some of the decisions that I made were wise ones, some were not.
At one point I wrote a stdio library that conformed with an early
ANSI C draft and added extensions so that it could be used by ksh
without mucking with its internals. I decided not to use it because
it is not .o compatible and thus might conflict with a routine
that happened to get loaded in by a weak dependency. Fortunately,
I was able to get ksh working with the native stdio on most
configurations. The list of operating systems that ksh runs with
is quite impressive. It has even ported to some rather non-UNIX
like targets such as OS-9.
One reason that ksh has been so widely ported is that the makefiles
automatically configure in most instances. My experience is that
most people who are responsible for bringing up software do not
know how to configure them. If the software does not get built on first
try, it often does not get built at all. I know that I have trouble
figuring out how to answer questions when I get software to install.
Often, a month or so after using some software package, it dawns on me that
the reason that some feature isn't working is because I misunderstood
a question that was asked when I built the software.
One of the comments about ksh is that while externally it presents
a portable interface, internally, the code is hard to follow because
of all the conditional compilation. Overall, this is a valid
criticism and one that I have been trying to rectify for the next
release. In the past, there were two basic flavors of UNIX, BSD and
System V. The way ksh configures itself is to look at its environment
and see what files are there and then to deduce what flavor of UNIX
is running. For example, /vmunix on BSD and /unix on System V.
Because of the number of hybrid systems, this strategy no longer
works. POSIX and ANSI haven't helped. The result is to have even
more variants. The next version of ksh uses a different strategy, which
I summarize here:
1. Use POSIX 1003.1 as the standard in writing the code. Define
macros as needed to map POSIX into each implementation.
2. Generate an include file that defines the feature variants.
For example, ksh needs to know whether signals are automatically
reset when caught. At compile time a test program is run
that tests this feature and then appends a define constant to
this file indicating whether signals get reset or not.
3. Use a shell script to build ksh, not make. The shell script
uses only features that were in V7 Bourne shell and is
highly portable. Dependency checking is needed to maintain
a tool, not to build it initially. I use nmake (4th generation
make) to maintain ksh since it handles dependency checking and
conditional compilation automatically.
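
A probe of the kind described in item 2 might look something like the
sketch below. It is a guess at the general shape, not the actual ksh
build code: the function names and the SIG_RESET_WHEN_CAUGHT macro are
invented, and SIGINT is used only because it is available in any
standard C library.

```c
#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t caught;

static void on_signal(int signo)
{
    (void)signo;
    caught = 1;
}

/*
 * Returns 1 if a handler installed with signal() is reset to SIG_DFL once it
 * has been called (V7 and System V behaviour), 0 if it stays installed (BSD
 * behaviour), and -1 if the probe itself failed.
 */
int signals_reset_when_caught(void)
{
    void (*previous)(int);

    caught = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0 || !caught)
        return -1;
    /* Installing SIG_DFL reports what was in place just after the handler ran. */
    previous = signal(SIGINT, SIG_DFL);
    if (previous == SIG_ERR)
        return -1;
    return previous == SIG_DFL;
}

/* Emit the line a build script could append to a generated header.
 * SIG_RESET_WHEN_CAUGHT is a made-up macro name. */
int emit_signal_feature(FILE *out)
{
    int reset = signals_reset_when_caught();

    if (reset < 0)
        return -1;
    fprintf(out, "#define SIG_RESET_WHEN_CAUGHT %d\n", reset);
    return 0;
}
```

A build script would compile a small program around
emit_signal_feature(stdout), run it on the target, and append its
output to the generated include file. The answer comes from observed
behaviour rather than from guessing the flavor of UNIX.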
Some of the comments about ksh refer to bugs, such as referencing through
location zero. These are bugs that were discovered after ksh was put
into the UNIX Toolchest over two years ago. As I become aware of bugs,
I fix them. However, the UNIX Toolchest is for non-supported software
and therefore bug fixes do not find their way into Toolchest.
To achieve maximum benefit from the design portability of ksh requires
it to be available on all machines. This would eliminate users having
to be concerned about implementation portability when writing shell
scripts. I have done as much as I can to achieve this. I have gone
to great lengths to make ksh port easily to new systems. I distribute
ksh throughout AT&T. I am nearly finished writing a book that specifies
the ksh language.
However, I have no control over the release of ksh outside of AT&T.
Personally, I think that ksh should come with the UNIX system,
just as HyperCard comes with Macintosh systems.
David Korn
ulysses!dgk