housel@ea.ecn.purdue.edu (Peter S. Housel) (01/16/88)
-Or- How I Spent My Christmas Vacation

Over the Christmas vacation I had plenty of time for messing around
with Minix, and did. I installed all of the 1.2 updates, and most of the
subsequent official ones. In addition, I put in the fixes for the only
two system-level bugs I knew about: the "line 6617" fix to mm/signal.c
and Bing Bang's fix for fs/pipe.c. It was deemed wisest, however, to
limit the kernel changes to the officially posted versions, since
anything else would make future fixes a pain to apply. (This meant no
process groups fixes or serial tty driver, but I decided I could live
with this for the present.)

Because of this, and the fact that I definitely didn't want to be
making changes to mm, fs, and kernel without some sort of version
control system, I concentrated on user-level utilities and libraries.
The results were the fixes below, a cheap revision control program, a
replacement set of timezone routines, and a working copy of "smail".
All in all, it was a lot of fun (the most enjoyment I've gotten out of a
micro in a long while), and I hope other people can benefit from the
work.

Some of these topics may have been covered before, but
	1) I wasn't able to read news over last summer
	2) I can't be expected to remember everything I have read
	3) they haven't been "officially" fixed.

TAIL

There is a minor bug in tail.c. When you use "tail +n" to select
lines n through the end of the file, it counts the first line as line
"0" instead of line "1". The fix for this is simple:

112c112
<
---
> 	++count;	/* (we start on line 1) */

LS

If you're used to BSD or SysV, I'm sure you consider the lack of a
columnar ls a major pain. I'm sure several other people have made the
same sort of changes to ls that I have: adding columnar display, making
dot-files list only if -a or -A is specified (or the user is the
superuser), and adding a "-F" option. My resulting code is nothing to be
proud of, however; it breaks down in a few cases that I'm willing to
live with.
But, if nobody posts anything better, I'll post mine.

LOGIN

login.c has two problems. The first is an annoyance - it doesn't set
up the environment properly. Variables like HOME, SHELL, and USER (and
TERM, if you're using the ANSI sequences) should be set in login and
propagated from there, instead of letting the shell set them
incorrectly. The code below sets HOME, USER, and TERM.

The second involves signals. The init process, which forks off
logins, works like this:
	a) fork off a login for each terminal in /etc/rc
	b) ignore all signals and get down to work
	c) whenever a login process dies, start up a new one

The result of this is that your first login starts out with all
signals having the default actions, but subsequent ones end up ignoring
all of the ones that the shell doesn't mess with - SIGQUIT and SIGKILL.
This is probably best fixed in login, by doing a SIG_DFL on all of the
signals immediately before the exec. (It took a lot of kernel printf's
to find this one.)

The following diffs are for the 1.2 login.c (which, by the way, was
posted on 11 Aug 87 as article 1570@botter.cs.vu.nl).
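
Separately from the actual patch to login.c, the two ideas described
above (build the environment strings in login, and put every signal back
to its default action just before the exec) can be sketched in isolation.
This is only an illustration under stated assumptions: make_env_entry(),
reset_signals(), and MAX_SIG are hypothetical names that do not appear in
login.c, and MAX_SIG merely stands in for Minix's NR_SIGS. Unlike the
strcat() calls in the patch, the sketch checks that the string fits.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

#define MAX_SIG 31	/* assumption: stand-in for Minix's NR_SIGS */

/* Build "NAME=value" in buf.  Returns 0 on success, or -1 if the
 * result (including the terminating null) would not fit in size bytes. */
int make_env_entry(char *buf, size_t size, const char *name, const char *value)
{
	assert(buf != NULL && name != NULL && value != NULL);
	if (strlen(name) + 1 + strlen(value) + 1 > size)
		return -1;
	strcpy(buf, name);
	strcat(buf, "=");
	strcat(buf, value);
	return 0;
}

/* Give every signal its default action, so a shell started afterwards
 * does not inherit the "ignore" settings of its parent.  Failures for
 * signals whose action cannot be changed are deliberately ignored. */
void reset_signals(void)
{
	int n;

	for (n = 1; n <= MAX_SIG; ++n)
		signal(n, SIG_DFL);
}
```

A login-like program would call make_env_entry() for USER and HOME,
collect the results in a null-terminated array of strings, call
reset_signals(), and then hand that array to execle() as its last
argument.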
--------------------------cut----------------------------------------------
6a7,16
> char user[15] = "USER=";
> char home[30] = "HOME=";
>
> char *env[] =
> {user,
>  home,
>  "TERM=minix",
>  (char *)0
> };
>
53,60c63,74
< 		chdir (pwd->pw_dir);
< 		if (pwd->pw_shell) {
< 			execl(pwd->pw_shell, "-", 0);
< 		}
< 		execl("/bin/sh", "-", 0);
< 		write(1,"exec failure\n",13);
< 	}
< }
---
> 		strcat(user, buf);
> 		strcat(home, pwd->pw_dir);
> 		chdir (pwd->pw_dir);
> 		for(n = 1; n <= NR_SIGS; ++n)
> 			signal(n, SIG_DFL);
> 		if (pwd->pw_shell) {
> 			execle(pwd->pw_shell, "-", (char *) 0, env);
> 		}
> 		execle("/bin/sh", "-", (char *) 0, env);
> 		write(1,"exec failure\n",13);
> 	}
> }
--------------------------cut----------------------------------------------

EXEC

The V7 manual isn't specific either way, but it seems to me that the
exec() routines should pass the current environment on to child
processes instead of a null environment. (i.e. execl(...,0) should
behave like execle(..., 0, environ).) The BSD 4.3 manual DOES say
specifically that this is what is done. I haven't put this in yet (for
fear of possible undesirable side effects); what does everybody else
think?

CC/CEM/CG

Like most people, I'm still using the 1.1 C compiler. I don't know
if these are fixed for 1.2, but...

The first problem is that constructs like:

char months[12][3] = {
	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

include the nulls ("\0") at the end of the initializer strings, which
is incorrect.

The second problem is more subtle. The following simple program:

main()
{
	int x = 10;
	int y = ((x + 1) + 15) / 16;
}

produces:
	Error: bad argument type 249
from within the code generator (/usr/lib/cg). (This arose while trying
to compile bison. Don't try it; it needs an assembler with more than
eight significant symbol characters. I have my doubts that it will fit
in 128K separate I/D anyway.)

LIBC

libc.a has been the source of several pains.
The first was that with all of the sources and ".s" files in
/usr/src/lib, there were too many files for ls to handle. I solved this
by moving libc files into /usr/src/lib/libc, and then splitting those up
into subdirectories: aux (C compiler library routines like xor.s and
rmu4.s), stdio (the usual stdio routines), sys (system calls), kernel
(kernel/mm/fs specific library routines), gen (generic calls like
strcpy), and zone (newctime routines - see my next posting).

The second was with recompiling the library. Makefiles for each of
the subdirectories help considerably, but the main problem is finding
the right order for the ".s" files within the library. (I was tempted to
invest some time in writing lorder and tsort, but laziness and the
problem of invoking lorder with more than 1024 bytes of arguments
stopped me.)

GETC/FSEEK/FTELL

The way getc() is currently implemented makes it impossible for
ftell() to work correctly. As it is now, ftell() returns zero both
before and after the first character has been read from a stream. A
semi-fix seems to have been placed in fseek(), but it doesn't do much
good. The following diffs make everything OK, so far as I can tell.

----------------------------getc.c.diff----------------------------------
16c16
< 	if (--iop->_count <= 0){
---
> 	if (iop->_count <= 0){
38,40c38,40
< 	return (*iop->_ptr++ & CMASK);
< }
<
---
> 	return (--iop->_count, *iop->_ptr++ & CMASK);
> }
>
---------------------end-of-getc.c.diff----------------------------------
----------------------------fseek.c.diff---------------------------------
21,23c21
< 		pos += count - lseek(fileno(iop), 0L,1) - 1;
< 		/*^^^ This caused the problem : - 1 corrected
< 		it */
---
> 		pos += count - lseek(fileno(iop), 0L, 1);
---------------------end-of-fseek.c.diff---------------------------------

PRINTF

A new doprintf.c was recently posted; this posting added '%%' and
'*' argument fields.
The only thing I know of that it lacks is the '%ld', '%lx', and '%lo'
constructs - '%D', '%X', and '%O' have to be used instead. A lot of
real-world programs don't do this, however. Here's yet another simple
patch.

----------------------------doprintf.c.diff-------------------------------
93c93,96
< 		switch (*format) {
---
> 		if ( (c = *format) == 'l')
> 			c = *++format + ('A' - 'a');
>
> 		switch (c) {
---------------------end-of-doprintf.c.diff-------------------------------

SCANF

Page 291 of my 7th Edition Unix manual says "The scanf functions
return the number of successfully matched and assigned input items", as
well as "The success of literal matches and suppressed assignments is
not directly determinable." The Minix scanf() function doesn't agree
with this; it increments its return value for each literal character
match and suppressed assignment. To make it compatible with the
standard, apply the following diff:

----------------------------scanf.c.diff---------------------------------
165d164
< 			++done;
237,238c236,238
< 			if (done_some)
< 				++done;
---
> 			if (done_some) {
> 				if(do_assign) ++done;
> 			}
253c253
< 			if (done_some)
---
> 			if (done_some && do_assign)
267c267
< 			if (done_some)
---
> 			if (done_some && do_assign)
302c302
< 			if (done_some)
---
> 			if (done_some && do_assign)
---------------------end-of-scanf.c.diff---------------------------------

DIVISION(!!!)

There are two problems with division, the first being in the kernel
and the second in the "aux" libraries.

Minix 1.1 and beyond have a routine called div_trap() which is
called when the processor detects a division overflow during a DIV or
IDIV instruction. Unfortunately, all it does is print a message on the
console; on return from div_trap() the instruction is executed again and
an endless stream of "Division overflow" messages results.
If it is a foreground process you can hit DEL and kill it; if it is a
background process you probably won't be able to see what you're typing
well enough to find out the pid and kill it.

It is easy enough to put some teeth into this trap, however. Just
add '#include "../include/signal.h"' to the end of the #include
statements in kernel/main.c, and put these lines immediately after the
printf() inside div_trap():

	if(cur_proc > LOW_USER)
	  {
	   cause_sig(cur_proc, SIGFPE);
	   unready(proc_ptr);		/* probably */
	  }

(Thanks to Marty Leisner, martyl@rocksvax.UUCP, for inspiring this in
one of his postings.)

Now division-by-zero errors will cause those core dumps that
everybody knows and loves. The core file itself will be of limited
usefulness, since there's no "adb" and the register values aren't
included - but the endless stream of messages goes away and you know
when something has gone wrong.

There's definitely something wrong with the long remainder routines.
For the unsigned case, the following program should get a division
overflow trap:

main()
{
	unsigned long x = 86401L;
	unsigned long y = x % 86400L;
}

I can't remember any examples for the signed remainder case offhand,
but it doesn't work much better.

Hopefully, somebody who understands these routines can fix them
better. What I did was to make a couple of changes to dvu4.s to make it
work as rmu4.s, and wrap sign-changing code around .rmu4 to make it into
rmi4.s. They're a bit ugly, and don't make as effective use of the
division instructions as they could, but they work - a major point in
their favor. Strip comments, libpack, and install these in libc.
(8086/8088 mnemonics Copyright 1981 by Intel Corp. :-)

--------------------------------rmu4.s-------------------------------------
.define .rmu4

yl=2		| divisor LSW
yh=4		| divisor MSW
xl=6		| dividend LSW
xh=8		| dividend MSW

.rmu4:
	mov	si,sp		| set 'frame pointer'
	mov	bx,yl(si)	| bx=divisor LSW
	mov	ax,yh(si)	| ax=divisor MSW
	or	ax,ax		| is divisor one significant word?
	jne	L7		| no, do hard case
	xor	dx,dx		| dividend for first division - MSW=0
	mov	cx,xl(si)	| cx=dividend LSW
	mov	ax,xh(si)	| ax=dividend MSW (LSW for first division)
	div	bx		| divide - dx=remainder, ax=result
	xchg	ax,cx		| prep for 2nd division: ax=dividend LSW, cx=MSW result
	div	bx		| divide - dx=overall remainder, ax=result LSB
	xor	bx,bx
L9:	ret	8		| return with bx=remainder MSB, dx=remainder LSB
L7:	mov	di,ax
	xor	bx,bx
	mov	ax,xl(si)
	mov	dx,xh(si)
	mov	cx,#16		| 16 bit loop index
L1:	shl	ax,#1
	rcl	dx,#1
	rcl	bx,#1
	cmp	di,bx
	ja	L3
	jb	L2
	cmp	yl(si),dx
	jbe	L2
L3:	loop	L1
	jmp	L9
L2:	sub	dx,yl(si)
	sbb	bx,di
	inc	ax
	loop	L1		| loop over 16 bits
	jmp	L9
-----------------------end-of-rmu4.s----------------------------------------
------------------------------rmi4.s----------------------------------------
.define .rmi4

yl=4		| divisor LSW
yh=6		| divisor MSW
xl=8		| dividend LSW
xh=10		| dividend MSW

.rmi4:
	push	bp		| save frame pointer
	mov	bp,sp		| set for this frame
	sub	si,si		| clear negation flag
	mov	ax,yh(bp)	| get divisor MSW
	test	ax,ax
	je	L4		| is it zero?
	inc	ax		| or -1?
	jne	L7		| neither; do hard case
	mov	bx,yl(bp)
	neg	bx
	jmp	L4a
L4:	mov	bx,yl(bp)	| bx=divisor LSW
L4a:	mov	cx,xl(bp)	| cx=dividend LSW
	mov	ax,xh(bp)	| ax=dividend MSW (LSW for first division)
	test	ax,ax		| check dividend
	jge	L5a		| need to negate?
	neg	ax
	neg	cx
	sbb	ax,*0
	inc	si
L5a:	xor	dx,dx		| high order of first dividend = 0
	div	bx		| divide - dx=remainder, ax=result
	xchg	ax,cx		| prep for 2nd division: ax=dividend LSW, cx=MSW result
	div	bx		| divide - dx=overall remainder, ax=result LSB
	xor	bx,bx
L5:	test	si,si		| check negation flag
	je	L9		| none done, return
	neg	bx
	neg	dx
	sbb	bx,*0
L9:	pop	bp		| restore framepointer
	ret	8		| return with bx=remainder MSB, dx=remainder LSB
L7:	dec	ax		| correct for increment
	jge	L6		| need to complement?
	neg	ax		| yes, do it
	neg	yl(bp)
	sbb	ax,*0
L6:	mov	di,ax
	xor	bx,bx
	mov	ax,xl(bp)
	mov	dx,xh(bp)
	mov	cx,#16		| 16 bit loop index
	test	dx,dx
	jge	L1		| need negation?
	neg	dx		| yes, do it,
	neg	ax
	sbb	dx,*0
	inc	si		| and set the flag
L1:	shl	ax,#1
	rcl	dx,#1
	rcl	bx,#1
	cmp	di,bx
	ja	L3
	jb	L2
	cmp	yl(bp),dx
	jbe	L2
L3:	loop	L1
	jmp	L5
L2:	sub	dx,yl(bp)
	sbb	bx,di
	inc	ax
	loop	L1		| loop over 16 bits
	jmp	L5
-----------------------end-of-rmi4.s----------------------------------------

FOPEN/FDOPEN

Two problems. fopen() doesn't have the correct behavior for the "a"
mode - it requires that the file already exist, and won't creat() it if
it does not. This also disagrees with the V7 manual. To fix, change the
case 'a': to

	case 'a':
		if (( fd = open(name,1)) < 0 )
			if(errno != ENOENT || (fd = creat(name, PMODE)) < 0)
				return(NULL);
		lseek(fd,0L,2);
		break;

The second problem is that there is no fdopen() in the official
distributions or diffs. One has been posted to this newsgroup, however.

=========================================================================
Peter S. Housel		housel@ei.ecn.purdue.edu
			...!{inuxc,decvax,...}!pur-ee!housel
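
On the fopen() "a" mode fix in the post above: the same logic can be
tried outside of stdio as a small standalone function. This is only a
sketch under stated assumptions, not the library code itself. The name
open_append() is hypothetical, and the PMODE value of 0644 is a guess
(the posting does not give the library's definition).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define PMODE 0644	/* assumption: creation mode, not taken from the post */

/* Open name for appending.  If the file does not exist, create it;
 * any other open() failure is reported as an error.  Returns a file
 * descriptor positioned at end of file, or -1 on failure. */
int open_append(const char *name)
{
	int fd;

	assert(name != NULL);
	fd = open(name, O_WRONLY);
	if (fd < 0) {
		if (errno != ENOENT)
			return -1;	/* exists but cannot be opened */
		fd = creat(name, PMODE);
		if (fd < 0)
			return -1;
	}
	if (lseek(fd, 0L, SEEK_END) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

On systems that have it, passing O_APPEND to open() is the more robust
choice, since every write then goes to the current end of file even when
several processes append at once; the sketch sticks to the
open/creat/lseek sequence because that is what the fix uses.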
kent@tifsie.UUCP (Russell Kent) (01/19/88)
I compliment Peter on his thorough and precise postings!! (Don't we all
wish that all postings were so jam-packed with info!! :-).

Unfortunately, I must take issue with this particular statement:

> CC/CEM/CG
> 	Like most people, I'm still using the 1.1 C compiler. I don't know
> if these are fixed for 1.2, but...
> 	The first problem is that constructs like:
>
> char months[12][3] = {
> 	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
> 	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
> };
>
> include the nulls ("\0") at the end of the initializer strings, which
> is incorrect.
> Peter S. Housel		housel@ei.ecn.purdue.edu
> 			...!{inuxc,decvax,...}!pur-ee!housel

I refer you to the C bible (Kernighan & Ritchie) page 84:

	"It is an error to have too many initializers."
	...
	"Character arrays are a special case of initialization; a string
	may be used instead of the braces and commas notation:
		char pattern[] = "the";
	This is shorthand for the longer but equivalent
		char pattern[] = { 't', 'h', 'e', '\0' };"

Based on the above statements, the compiler, when faced with Peter's
"months" multi-dimensional array, should have belched something to the
effect of:

	###:Invalid initialization: too many initializers

In all other respects, Peter's postings are accurate to the best of my
knowledge.

-- 
Russell Kent			Phone: +1 214 995 3501
Texas Instruments		UUCP address:
P.O. Box 655012 MS 3635		...!convex!smu!tifsie!kent
Dallas, TX 75265		...!ut-sally!im4u!ti-csl!tifsie!kent
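
For anyone who would rather sidestep the disagreement above, one
portable approach is to leave room for the terminating null in the
table and copy out only the three letters where a fixed-width field is
needed. The sketch below assumes nothing about the Minix compiler;
month_field() is a hypothetical helper name. (As far as I know, ANSI C
accepts a string initializer that exactly fills a char array and simply
does not store the null, but compilers of this vintage clearly differ.)

```c
#include <assert.h>
#include <string.h>

/* Four bytes per entry: three letters plus the terminating '\0', so no
 * compiler has to choose between dropping the null and rejecting the
 * initializer. */
static const char months[12][4] = {
	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* Copy the abbreviation for month m (0 = January) into a 3-byte field
 * with no terminator, as a packed [12][3] table would hold it. */
void month_field(int m, char field[3])
{
	assert(m >= 0 && m < 12);
	memcpy(field, months[m], 3);
}
```

The cost is twelve extra bytes of data, which seems a fair trade for
code that compiles the same way everywhere.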