jeffrey@algor2.UUCP (Jeffrey Kegler) (04/01/89)
In article <354@cbnewsc.ATT.COM> isaac@cbnewsc.ATT.COM (isaac.j.champagne) writes: > It seems like the "how to" of programming may be more important than > lots of theories. To become a "great" C programmer, I feel the best way is to study the works of the masters. I am very interested in the answers others have to the following: What ts the best C code you have ever read? Let me impose these restrictions: 1.) The code should be published, public domain or freerighted (in the GNU style), so that anyone wishing to study it has reasonable access to it. 2.) Of course, we are all too modest to start posting our own code. 3.) It should be large and complex enough so that someone who has waded through it and understood can feel he now "knows" C. A collection of small programs qualifies, but not a 3 page programming pearl. 4.) Please nominate only stuff you have closely read from beginning to end. I have worked for some time on code for which I cannot make this claim (not that I would nominate it anyway). This requirement in combination with the previous means the best of us will only know of a handful of such examples, at most. 5.) I want nominations, but not votes. One or two authoritative and reasoned postings from some of our net mavens (we know who they are) will carry a lot more weight with me, at least, that a lot of people saying "I have heard John Doe's _The_Nerd's_Complete_Database_Library_in_C_ is good." -- Jeffrey Kegler, President, Algorists, jeffrey@algor2.UU.NET or uunet!algor2!jeffrey 1788 Wainwright DR, Reston VA 22090
hascall@atanasoff.cs.iastate.edu (John Hascall) (04/02/89)
In article <433@algor2.UUCP> jeffrey@algor2.UUCP (Jeffrey Kegler) writes: >In article <354@cbnewsc.ATT.COM> isaac@cbnewsc.ATT.COM (isaac.j.champagne) writes: >> It seems like the "how to" of programming may be more important than >> lots of theories. >To become a "great" C programmer, I feel the best way is to study >the works of the masters. >I am very interested in the answers others have to the following: What ts >the best C code you have ever read? Let me impose these restrictions: I'd like to cast a negative vote for the 4.2 BSD Un*x source, wholly unreadable. :-) John Hascall ISU Comp Center
libes@cme.nbs.gov (Don Libes) (04/04/89)
>>I am very interested in the answers others have to the following: What ts >>the best C code you have ever read? > > I'd like to cast a negative vote for > the 4.2 BSD Un*x source, wholly unreadable. :-) I'd like to cast both positive and negative votes for Gosling's emacs. And who could forget display.c? I have reproduced the beginning of it here for your amusement. /********************************************************\ * * * Ultra-hot screen management package * * * \********************************************************/ /**************************************************************** /-------------\ / \ / \ / \ | XXXX XXXX | | XXXX XXXX | | XXX XXX | \ X / --\ XXX /-- | | XXX | | | | | | | I I I I I I I | | I I I I I I | \ / -- -- \-------/ XXX XXX XXXXX XXXXX XXXXXXXXX XXXXXXXXXX XXXXX XXXXX XXXXXXX XXXXX XXXXX XXXXXXXXX XXXXXXXXXX XXXXX XXXXX XXX XXX ************** * BEWARE!! * ************** All ye who enter here: Most of the code in this module is twisted beyond belief! Tread carefully. If you think you understand it, You Don't, So Look Again. ****************************************************************/ Needless to say, the comment was not an exageration. Don Libes libes@cme.nbs.gov ...!uunet!cme-durer!libes
djones@megatest.UUCP (Dave Jones) (04/04/89)
From article <433@algor2.UUCP>, by jeffrey@algor2.UUCP (Jeffrey Kegler): > > To become a "great" C programmer, I feel the best way is to study > the works of the masters. > > I am very interested in the answers others have to the following: What ts > the best C code you have ever read? Let me impose these restrictions: > [...] > > 2.) Of course, we are all too modest to start posting our own code. > Whhhh??? Have you been reading this group very long? I would be surprised if there were ONE contributer to this group whom modesty would prevent from informing you that he wrote the best code he had ever read -- the best damn code ANYBODY ever read, dammit! I mean, one simply can't allow unrestrained modesty to hinder the furthering the One True Way, can one? What vanity that would be! If you allowed people to submit their own code, you would be overwhelmed with entries. By restricting it to OPC (Other People's Code) you will only get entries from cliques. (I'll show him yours if you'll show him mine.) Now if you really want to see an example of the best code possible, send me a private email request. You see, I have some things in my personal library... Dave J.
mouse@mcgill-vision.UUCP (der Mouse) (04/14/89)
In article <1115@muffin.cme.nbs.gov>, libes@cme.nbs.gov (Don Libes) writes: [the >>> lines are someone else; I forget who] >>> I am very interested in the answers others have to the following: >>> What ts the best C code you have ever read? Very hard question. I've seen so little good code, and I remember the bad code for longer. On thought, I'm not sure I've seen any really good code on the size scale you're looking for (and that includes my own code, too :-). > I'd like to cast both positive and negative votes for Gosling's > emacs. And who could forget display.c? I have reproduced the > beginning of it here for your amusement. > [skull-and-crossbones comment from display.c. Following words have > been reformatted to save lines. Orignal was much prettier.] > All ye who enter here: Most of the code in this module is twisted > beyond belief! Tread carefully. If you think you understand it, > You Don't, > So Look Again. He wasn't kidding, either. I once tried to rewrite it preparatory to, I think, making the cost formulas conform more closely to reality for the Sun console (where, for example, insert and delete operations have high overhead and slightly *negative* per-count cost). All I succeeded in doing was slowing it down by about a factor of four, and introducing some bugs to boot. I ended up retrofitting the new cost formulas into the old code, and fudging to pretend that insert/delete operations were purely overhead. Someday when I have a spare month or two, perhaps.... der Mouse old: mcgill-vision!mouse new: mouse@larry.mcrcim.mcgill.edu
chris@mimsy.UUCP (Chris Torek) (04/16/89)
In article <1500@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse) writes: [re jag's display.c skull-and-crossbones:] >>All ye who enter here: Most of the code in this module is twisted >>beyond belief! Tread carefully. If you think you understand it, >> You Don't, >> So Look Again. >He wasn't kidding, either. I once tried to rewrite it preparatory to, >I think, making the cost formulas conform more closely to reality for >the Sun console (where, for example, insert and delete operations have >high overhead and slightly *negative* per-count cost). All I succeeded >in doing was slowing it down by about a factor of four, and introducing >some bugs to boot. Part of the problem was that the description of the algorithm was contained in a CACM article; display.c had only a few hints in comments as to what was going on. But it was not actually all that hard to work out. My version of his display.c does have separate overhead and per-count costs; it turns out, though, that on the Sun console, if you have direct access to the frame buffer, it is faster just to rewrite. If anyone is curious: what is going on with the `M' matrix is surprisingly simple. You fill the thing with a bunch of arrows pointing either to the left, up, or backwards diagonally. Each arrow thus points at a neighbour arrow, except for those along the left edge and top row. To fill the matrix, set up the arrows along the left edge to point up; set those along the top to point left. Set the one at (0,0) to `done': * < < < < ^ . . . . ^ . . . . ^ . . . . ^ . . . . (This represents a four-line screen.) An up arrow represents a line deletion; a left arrow is an insert; and a back arrow is a rewrite in place. The row index (called `i') represents the contents of the current screen at line `i', and the column index (called `j') represents the contents of the desired screen at line `j'. Then fill in each `.' with the arrow that gives the least cost for transforming the screen according to an insert, delete, or rewrite given i and j. If the filled-in matrix looks like this: * < < < < ^ \ < \ \ ^ ^ \ ^ \ ^ \ ^ ^ \ ^ ^ ^ \ \ then we would (start at the bottom right) move back, up, up, back, left, left, and done, rewriting line 4 in place, deleting line 3, deleting line 2, writing line 1 in place, inserting (moving 1 down to 2), writing 1 in place again, inserting again (moving 1 and 2 to 2 and 3), and finally writing line 1 in place again. All those other unused arrows are there only because we did not know, before we finished, whether they would be used. The inserts and deletes we gather together along the way, since some terminals can do multi-line operations, so we really get `rewrite 4, goto 2, 2*delete, goto 1, rewrite, goto 2, 2*insert, rewrite 2, goto 3, rewrite 3, done'. (This is actually a stupid sequence and would never occur in practise.) What makes it hard is getting the right arrows into the cells. Each cell also has a cost (in terms of `characters sent to the terminal') attached to it. The cost for a delete is simply the cost of that delete: if an ESC-M deletes a line, the cost for delete is 2. (This ignores padding, which complicates matters; more on that later.) The cost for an insert is the cost of that insert (2 for, e.g., ESC-L) plus the cost for drawing line j (the one you want to show on the screen). The cost for moving straight back (diagonally upward) is just the cost for rewriting line i into line j. If the text on line i matches that on line j, this is free; otherwise it should be the number of characters required, but as an optimisation, the code just uses the number of characters on line j (otherwise it might have to compare n^2 lines; as it is, the comparison is `if the hashes match, the lines are the same and the rewrite is free'). As an optimisation, instead of having each `up' or `left' arrow point to its neighbour cell, if that cell is also an up, or is also a left, we have it point to where its neighbour points (leapfrogging over all the same sort of arrows, so to speak). This gives a nice way to see how many inserts or deletes are to be used. On some terminals (ANS X3.64), insert or delete operations have a fixed overhead no matter how many lines are inserted or deleted, but have a variable additional factor. For instance, ESC [ 4 L might insert four lines, with essentially the same cost as ESC [ 1 L (inserts 1 line), except for padding. The padding cost depends on how far above the bottom of the screen the cursor is at the time of the operation: to insert four lines, the terminal must move (rows-4) lines downward, and then must clear the four new lines. On a 30-line screen, this affects at least eight lines (23,24,25,26 move down to 27,28,29,30; then clear 23,24,25,26) and at most 30 (1,2,3,...,26 move to 5,6,7,...,30; then clear 1,2,3,4). The same rule applies to padding for deletion. On some stupidly-coded terminals (H19 in ANSI mode), padding is remarkably significant: the H19 inserts 6 lines by inserting 1 line 6 times! This can take quite a while and must be accounted for. It is simplest to define a fixed cost, a variable (per n lines) cost, with each split into `overhead' and `padding due to being above the bottom of the screen': let c_f = <fixed cost> c_n = <per-operation cost> n = <number of lines to insert or delete> k = <rows on screen> + 1 - <position of operation> then cost = n * (c_n.overhead + k * c_n.padding) + c_f.overhead + k * c_f.padding der Mouse's Sun costs are then c_f = (overhead = <large>, padding = ?) c_n = (overhead = <slightly negative>, padding = 0) while those on a fast terminal that uses ESC-L and ESC-M for each insert and delete are c_f = (overhead = 0, padding = 0) c_n = (overhead = 2, padding = 0) Within Emacs, the padding cost represents actual NULs sent to the terminal to keep it from sending ^S/^Q flow control; but even if flow control is available, the pad costs should still be counted, as they represent time taken by the terminal to do the operation. So: a comment in my display.c: The algorithm used is a dynamic programming one, where M[i,j] = min (M[i-1,j]+dcost, # UP, implies delete M[i,j-1]+icost+redraw cost, # LEFT, implies ins+redraw M[i-1,j-1]+rewrite cost) # BACK, implies rewrite (It is not necessary to count a redraw cost for UPs as they force an equivalent number of LEFTs at some point.) The rewrite cost for a line is its redraw cost, unless line i matches line j, where it is free. The UP and LEFT cases must also add the cost for doing an insert or delete. This is actually rather tricky: the cost for an ID varies with the position above the bottom of the screen, and sometimes with the number of line IDs. As it turns out we can just look at the UP'th or LEFT'th cell to see whether this is the first of a sequence. After we have finished putting UPs, LEFTs, and BACKs in the matrix, we start at the lower right corner and work backward, following the arrows up, left, or back, eventually reaching the top left. Going up implies a deletion; going left implies insertion and rewrite; and every back implies a rewrite of old line i to new line j. There is an exception: UPs along the right edge of the matrix, and LEFTs along the bottom, are merely establishing a starting position and do not cause I/D operations (doing I/D there would be pointless anyway). (LEFTs do still cause redraws.) Note that the elements along the left edge (j=0) are constant, and that the elements along the top (i=0) are constants plus the redraw sum for all lines <= j. As it happens, we can precompute a great deal of these factors. The code to fill in the M cost matrix is order n^2, so getting the constant factor down helps (n is typically between 24 and 60). A new algorithm would help more, but this one is provably optimal if the cost factors are correct (which, alas, they are not). -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris