bturner@hpcvlx.HP.COM (Bill Turner) (07/20/89)
> ...The experiments showing
> the superiority of the Dvorak keyboard are mainly old, were done by people who
> were actively trying to prove that superiority, and are not up to modern
> experimental standards.  Recent data - I've been told the references by
> someone who works in the field, but I'm afraid I don't have them - show that
> there is actually only a small difference in typing speed between Dvorak and
> Sholes typists.

One study that I know of was done in part by Donald Norman (author of
"The Psychology of Everyday Things" and professor in the Department of
Cognitive Science at UCSD).  The paper is "Why Alphabetic Keyboards Are
Not Easy to Use: Keyboard Layout Doesn't Much Matter" by Donald A. Norman
and Diane Fisher (Human Factors, vol. 24, no. 5, 1982, pp. 509-519).

--Bill Turner (bturner@hp-pcd.hp.com)
HP Corvallis Information Systems
kolstad@prisma (07/22/89)
I wasn't able to mail this to Mr. Leichter:
-----------------------------------------------------------------------
In comp.text, you say:

> ...But it turns out that that model is just
> plain wrong! ...
> A side-effect of the Sholes layout is to place many
> of the common "units" on alternating hands, which makes typing them easier.
> Dvorak, on the other hand, tends to place many units under the SAME hand,
> which interferes with typing.

I am not a real fan of the Dvorak keyboard but knew someone who could
hit 160 WPM (Andrew Shapira: shapira@docsun.rpi.edu).  Because he could
type a couple per cent faster than I, my ego was bruised and I tried the
keyboard for a bit (at the behest of Dan Kopetzky, I believe).

At any rate, while I never became proficient at all, in the limited
number of tests I did (i.e., writing letters like this one), I found
that your thesis that the units are under the same hand is not borne
out.  It is understood that one will always have a few combinations that
turn out that way (witness the word `recede' on the QWERTY keyboard);
nevertheless, the number of digrams and trigrams that were true
alternation appeared to me to be very high on the Dvorak keyboard.

If one divides the keyboard like this (I copied this keyboard from an
earlier article and split it as best I could) and runs /usr/dict/words
through a trivial script:

           left                right
        / , . P Y    ---    F G C R L
        A O E U I    ---    D H T N S
      ; ' Q J K X    ---    B M W V Z

    tr "'PYAOEUIQJKXFGCRLDHTNSBMWVZpyaoeuiqjkxfgcrldhtnsbmwvz" \
        llllllllllllrrrrrrrrrrrrrrrlllllllllllrrrrrrrrrrrrrrr < /usr/dict/words

Then we have a file which tells which hand gets used for each letter
(here's an excerpt):

    l
    lll         <-- obviously bad
    lllr
    llrrlr
    llrlr
    lrl         <-- the best we can do
    lrlrl       <-- the best we can do
    lrlrl       <-- the best we can do
    lrlrlr      <-- the best we can do
    lrlrlrl     <-- the best we can do

Now if we count the transitions, we should be able to measure the
`goodness' of a keyset.
(I'm doing this in real time as I type, and I have to think about this
for a moment.  For you, it will appear to be an instant cuz you'll get
this all at once!)  Let's make a chart:

    length         number words w/n `alternations'
    of word      0   1   2   3   4   5  ...
       2         x   x
       3         x   x   x
       4         x   x   x   x
       5         x   x   x   x   x
       6         x   x   x   x   x   x
       7         and so on...

The program appears as Appendix A below.

    tr script < /usr/dict/words | program

yields:

             0    1    2    3    4    5    6    7    8    9   10   11
 2 n= 131   48   83
 3 n= 775   60  517  198
 4 n=2152   29  864 1163   96
 5 n=3093   16  462 1902  679   34
 6 n=3794    3  130 1698 1619  329   15
 7 n=3929    0   23  913 2013  896   82    2
 8 n=3484    0    7  366 1299 1441  347   22    2
 9 n=2970    0    1  121  735 1292  694  121    6    0
10 n=1883    0    0   22  287  680  673  195   26    0    0
11 n=1052    0    0    0   66  248  429  266   42    1    0    0
12 n= 542    0    0    1   13   70  169  203   76   10    0    0    0
13 n= 260    0    0    0    2   20   55   98   66   15    4    0    0    0
14 n= 102    0    0    0    1    5   15   27   34   17    3    0    0    0    0
15 n=  39    0    0    0    0    3    1   10   12    9    4    0    0    0    0    0
             0    1    2    3    4    5    6    7    8    9   10   11
[29 uninteresting cases of >16 character words omitted]

Now, it doesn't look too bad.  I can't think of a quick metric that says
`oh it's obvious this is great'.  Let's quickly write another tr script
for qwerty to have some raw data to compare:

    tr "qwertyasdfgzxcvbyuiophjklnm.'&QWERTYASDFGZXCVBYUIOPHJKLNM" \
        llllllllllllllllrrrrrrrrrrrrrrllllllllllllllllrrrrrrrrrrr \
        < /usr/dict/words > /tmp/alternates

Now how does QWERTY do?
             0    1    2    3    4    5    6    7    8    9   10   11
 2 n= 131   68   63
 3 n= 775  186  388  201
 4 n=2152  266  837  790  259
 5 n=3093  211  795 1168  749  170
 6 n=3794  142  696 1217 1164  482   93
 7 n=3929   85  398  983 1249  850  322   42
 8 n=3484   39  231  616  979  929  505  164   21
 9 n=2970   18  135  421  691  782  565  280   71    7
10 n=1883    7   46  154  339  453  460  283  111   29    1
11 n=1052    3   17   56  135  238  270  173  109   42    9    0
12 n= 542    0    1    8   50   92  142  111   90   36    9    3    0
13 n= 260    0    0    5   11   36   39   68   58   32    8    3    0    0
14 n= 102    0    0    0    6    6   16   25   29   10    6    4    0    0    0
15 n=  39    0    0    0    1    1    5    7   11   10    2    1    1    0    0    0
             0    1    2    3    4    5    6    7    8    9   10   11
[29 uninteresting cases of >16 character words omitted]

Unfortunately, I must admit that there doesn't seem to be a tremendous
obvious difference in the alternating behavior.  There is some, but it's
not just overwhelming.  Consider the most common words, those of 7
letters:

                     0    1    2    3    4    5    6
dvorak:  7 n=3929    0   23  913 2013  896   82    2
qwerty:  7 n=3929   85  398  983 1249  850  322   42

Now qwerty has a few more perfect words, a bunch more almost perfect but
also has dramatically more `poor' words (0 and 1 alternations).  Let's
calculate the average alternations:

dvorak: ( 0*0+  23*1+ 913*2+ 2013*3+ 896*4+  82*5+  2*6) / 3929 = 3.02723
qwerty: (85*0+ 398*1+ 983*2+ 1249*3+ 850*4+ 322*5+ 42*6) / 3929 = 2.89462

This shows a very slight improvement for dvorak (about 4.6% over the
qwerty figure, or 4.38% measured against dvorak's).  I'll go back and
modify the program to calculate this for us (see Appendix C):

 l  n        qwerty  dvorak
 2 n= 131    0.4809  0.6336
 3 n= 775    1.0194  1.1781
 4 n=2152    1.4842  1.6162
 5 n=3093    1.9586  2.0818
 6 n=3794    2.3761  2.5762
 7 n=3929    2.8946  3.0272
 8 n=3484    3.3789  3.5250
 9 n=2970    3.7832  3.9912
10 n=1883    4.3542  4.4302
11 n=1052    4.8042  4.9743
12 n= 542    5.4244  5.5277
13 n= 260    5.9769  6.0269
14 n= 102    6.3627  6.4804
15 n=  39    6.9231  6.8974

Well, the dvorak keyboard wins at every length except 15 -- but not by
much!  Maybe what we REALLY want to know is how much time we spend off
the home row ... could that be the REALLY important metric?
Let's translate into keyboard row numbers for dvorak:

    tr "/,.PYFGCRLAOEUIDHTNS;'QJKXBMWVZ&pyfgcrlaoeuidhtnsqjkxbmwvz" \
        1111111111222222222223333333333411111112222222222333333333 \
        < /usr/dict/words > /tmp/alternates

And let's modify the program to calculate on-home-row -vs-
off-home-row (see Appendix D).

[My buddy just pointed out to me that few people type the dictionary and
we should use more realistic text like a book or a newsgroup/notesfile.
OOPS.  I'll just continue on this tack for now.]

OK, that done, let's also make a tr script for the qwerty keyboard for
rows:

    tr "qwertyasdfgzxcvbyuiophjklnm.'&QWERTYASDFGZXCVBYUIOPHJKLNM" \
        111111222223333311111122233324111111222223333311111122233 \
        < /usr/dict/words > /tmp/alternates

Running the home key calculation program yields (with a bit of text
editing for ease of reading):

                   qwerty                    dvorak
 2 n= 131   nhome=   75 = 28.63%    nhome=  157 = 59.92%
 3 n= 775   nhome=  747 = 32.13%    nhome= 1367 = 58.80%
 4 n=2152   nhome= 2914 = 33.85%    nhome= 5196 = 60.36%
 5 n=3093   nhome= 4892 = 31.63%    nhome= 9328 = 60.32%
 6 n=3794   nhome= 6778 = 29.78%    nhome=14335 = 62.97%
 7 n=3929   nhome= 8080 = 29.38%    nhome=17297 = 62.89%
 8 n=3484   nhome= 8279 = 29.70%    nhome=18078 = 64.86%
 9 n=2970   nhome= 7376 = 27.59%    nhome=17359 = 64.94%
10 n=1883   nhome= 4841 = 25.71%    nhome=12474 = 66.25%
11 n=1052   nhome= 2827 = 24.43%    nhome= 7604 = 65.71%
12 n= 542   nhome= 1579 = 24.28%    nhome= 4323 = 66.47%
13 n= 260   nhome=  796 = 23.55%    nhome= 2260 = 66.86%
14 n= 102   nhome=  335 = 23.46%    nhome=  933 = 65.34%
15 n=  39   nhome=  126 = 21.54%    nhome=  389 = 66.50%
16 n=  15   nhome=   50 = 20.83%    nhome=  157 = 65.42%
17 n=   6   nhome=   20 = 19.61%    nhome=   70 = 68.63%
18 n=   4   nhome=   19 = 26.39%    nhome=   46 = 63.89%
20 n=   1   nhome=    5 = 25.00%    nhome=   11 = 55.00%
21 n=   2   nhome=    8 = 19.05%    nhome=   26 = 61.90%
22 n=   1   nhome=    5 = 22.73%    nhome=   12 = 54.55%

Well, it appears that the dvorak keyboard stays on the home row about
60-65% of the time and that the qwerty keyboard stays on the home row
about 20-30% of the time (for the most
part).  That would be a factor of 2x improvement of home row keys.  Not
bad.  I'll bet that's the big difference.  [electroencephalography is
the 22 letter word, by the way].

So, in summary:

  * Alternation is just a bit better (pretty much always)
  * Home row keys are phenomenally better placed

Now we know.  Thanks for providing fodder for this interesting exercise.

ps: Re-reading your note and this one, I find that I might have been a
bit more clever about my treatment of common digrams and trigrams.  Oh
well.

============================= program listings (appendices) follow =======
APPENDIX A
-----------------------------------
tr script < /usr/dict/words | program

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int nlengths[40];
    int nalternates[40][40];

    int main (void)
    {
        char buf[512];
        int l;              /* length of this word */
        int n;              /* counter of alternates */
        int i, j;
        char *p;
        char thishand;

        /* fgets replaces the unbounded gets of the posted listing */
        while (fgets (buf, sizeof buf, stdin) != NULL) {
            buf[strcspn (buf, "\n")] = '\0';    /* drop the newline */
            l = (int) strlen (buf);
            if (l < 2 || l >= 40)               /* stay inside the tables */
                continue;
            nlengths[l]++;
            p = buf;
            thishand = *p++;
            for (n = 0; *p; p++)
                if (*p != thishand) {
                    /* Track the hand of the previous letter.  As posted,
                       this line read "*p = thishand;", which compares
                       every letter with the first letter's hand; the
                       tables in the post may have been produced that way. */
                    thishand = *p;
                    n++;
                }
            nalternates[l][n]++;
        }
        for (i = 2; i < 40; i++) {
            if (nlengths[i] == 0)
                continue;
            printf ("%2d n=%4d ", i, nlengths[i]);
            for (j = 0; j < i; j++)
                printf ("%4d ", nalternates[i][j]);
            printf ("\n");
        }
        return 0;
    }

-------------------------------------------------------
APPENDIX B

The actual tr script for dvorak:

    tr ".'PYAOEUIQJKXFGCRLDHTNSBMWVZ&pyaoeuiqjkxfgcrldhtnsbmwvz" \
        lllllllllllllrrrrrrrrrrrrrrrrlllllllllllrrrrrrrrrrrrrrr \
        < /usr/dict/words > /tmp/alternates

The actual tr script for qwerty:

    tr "qwertyasdfgzxcvbyuiophjklnm.'&QWERTYASDFGZXCVBYUIOPHJKLNM" \
        llllllllllllllllrrrrrrrrrrrrrrllllllllllllllllrrrrrrrrrrr \
        < /usr/dict/words > /tmp/alternates

-------------------------------------------------------
APPENDIX C

The program which computes average alternations:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int nlengths[40];
    int nalternates[40][40];

    int main (void)
    {
        char buf[512];
        int l;              /* length of this word */
        int n;              /* counter of alternates */
        int i, j;
        char *p;
        char thishand;

        while (fgets (buf, sizeof buf, stdin) != NULL) {
            buf[strcspn (buf, "\n")] = '\0';    /* drop the newline */
            l = (int) strlen (buf);
            if (l < 2 || l >= 40)               /* stay inside the tables */
                continue;
            nlengths[l]++;
            p = buf;
            thishand = *p++;
            for (n = 0; *p; p++)
                if (*p != thishand) {
                    thishand = *p;      /* posted as "*p = thishand;",
                                           see the note in Appendix A */
                    n++;
                }
            nalternates[l][n]++;
        }
        for (i = 2; i < 40; i++) {
            double sum;

            if (nlengths[i] == 0)
                continue;
            sum = 0;
            printf ("%2d n=%4d ", i, nlengths[i]);
            /* weighted mean: sum of (count * words with that count) / words */
            for (j = 0; j < i; j++)
                sum += j * nalternates[i][j];
            printf ("%6.4f\n", sum / nlengths[i]);
        }
        return 0;
    }

-------------------------------------------------------
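Appendix D (the home-row program) is referenced in the post but does not
appear in the listings.  What follows is only a minimal sketch of how such
a count could be done, not the author's program.  It assumes each word has
already been through one of the row-number tr scripts, so every letter is a
digit, and that '2' marks the home row as in those scripts.  The names
count_home_row and home_row_percent are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count home-row keystrokes in one translated word.
 * Assumes tr has already replaced each letter with its row digit and
 * that '2' stands for the home row. */
size_t count_home_row(const char *rows)
{
    size_t nhome = 0;

    assert(rows != NULL);
    for (; *rows != '\0'; rows++)
        if (*rows == '2')
            nhome++;
    return nhome;
}

/* Percentage of keystrokes on the home row, given the home-row total
 * accumulated over all words of one length. */
double home_row_percent(size_t nhome, size_t nwords, size_t length)
{
    size_t total = nwords * length;     /* all keystrokes at this length */

    return total == 0 ? 0.0 : 100.0 * (double) nhome / (double) total;
}
```

With the length-2 qwerty figures from the table (75 home-row keystrokes
over 131 two-letter words), home_row_percent(75, 131, 2) comes to about
28.63, which agrees with the posted value.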
cosell@bbn.com (Bernie Cosell) (07/29/89)
In article <10500004@prisma> kolstad@prisma writes:
}If one divides the keyboard like this (I copied this keyboard from an
}earlier article and split it as best I could) and runs /usr/dict/words
}through a trivial script:
}
} ...
}
}Now if we count the transitions, we should be able to measure the
}`goodness' of a keyset.  (I'm doing this in real time as I type, and I
}have to think about this for a moment.  For you, it will appear to
}be an instant cuz you'll get this all at once!)  Let's make a chart:

This is the right kind of analysis, but absolutely the *wrong* way to
compute it.  The problem is that every word appears in /usr/dict/words
with equal probability (that is, once), but the probabilities in normal
English are nothing of the like [e.g., a scan of /usr/dict/words will
not balance the fact that 'the' occurs a LOT more than cwm, although
both have now contributed the same 'weight' to your stats].

Try rerunning your results over English, instead of a dictionary.  A
reasonable and easy way to do this is to pick a mostly-text newsgroup
(one that doesn't have a lot of "tty graphics" and acronyms and odd words
and such), and run your program over the bodies of the messages in it
(e.g., talk.politics.misc would be pretty good, but comp.dcom.telecom is
all filled with NXX's and ISDNx ahd LATAs and such that'll skew the
stats).

Beyond that, your numbers, while well intentioned, really aren't very
useful even as a rough comparison vehicle because the underlying
distribution of words from which they gather their statistics is so
utterly wrong.

/Bernie\
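One way to act on this suggestion, offered as a rough sketch under stated
assumptions rather than anyone's actual program: scan running text
directly, so that a word like 'the' is counted every time it occurs.  The
left-hand letter sets below are taken from the tr scripts earlier in the
thread, with qwerty `y' assigned to the right hand (an assumption, since
the script lists it on both sides).  The names scan_text and hand_stats
are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Assumed left-hand letters for each layout; all other letters are
 * treated as right-hand. */
#define DVORAK_LEFT "pyaoeuiqjkx"
#define QWERTY_LEFT "qwertasdfgzxcvb"

/* Hypothetical result type: within-word letter pairs seen, and how many
 * of those pairs switched hands. */
struct hand_stats {
    long pairs;
    long changes;
};

/* Scan running text.  Non-letters end the current word, so a pair never
 * spans a space or punctuation.  Repeated words are counted each time,
 * which is what weights the result by word frequency. */
struct hand_stats scan_text(const char *text, const char *left_keys)
{
    struct hand_stats s = {0, 0};
    int prev = -1;              /* -1: no previous letter in this word */

    assert(text != NULL && left_keys != NULL);
    for (; *text != '\0'; text++) {
        unsigned char c = (unsigned char) *text;
        int hand;

        if (!isalpha(c)) {
            prev = -1;
            continue;
        }
        hand = strchr(left_keys, tolower(c)) != NULL;   /* 1 = left, 0 = right */
        if (prev != -1) {
            s.pairs++;
            if (hand != prev)
                s.changes++;
        }
        prev = hand;
    }
    return s;
}
```

Dividing changes by pairs gives a hand-change rate per adjacent letter
pair.  Fed the bodies of a text-heavy newsgroup instead of a word list,
it lets common words dominate the total in proportion to how often they
are typed.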
jwright@atanasoff.cs.iastate.edu (Jim Wright) (07/29/89)
In article <43460@bbn.COM> cosell@BBN.COM (Bernie Cosell) writes:
| Try rerunning your results over English, instead of a dictionary.  A
| reasonable and easy way to do this is to pick a mostly-text newsgroup (one
| that doesn't have a lot of "tty graphics" and acronyms and odd words and
| such), and run your program over the bodies of the messages in it (e.g.,
| talk.politics.misc would be pretty good, but comp.dcom.telecom is all filled
| with NXX's and ISDNx ahd LATAs and such that'll skew the stats).
                       ^^^ :-)

I would suggest that Usenet is not the place to look for good examples
of the English language.  Just one supporting argument...  The word "ahd"
would yield different results than "and".  Since the QWERTY keyboard is
used by most of the population under study, this is an obvious bias.

(I don't claim innocence in this matter! :-)

--
Jim Wright
jwright@atanasoff.cs.iastate.edu
albaugh@dms.UUCP (Mike Albaugh) (08/15/89)
From article <66814@yale-celray.yale.UUCP>, by Horne-Scott@cs.yale.edu (Scott Horne):
> Yes, QWERTY was designed to slow down typists to prevent them from jamming
> the machines.  (It doesn't fulfill that purpose, though: I can jam a manual
> typewriter.)  Oddly enough, some keys very often pressed in sequence are
> located next to each other (_e.g._, `e' and `r', `i' and `o').

So where did this particular urban legend start?  That is, can anyone
post a specific reference by someone who _knows_ (Sholes or one of his
contemporaries) that explicitly states that the QWERTY keyboard was
designed to slow down typists?  I saw a review of a book by someone at
Bell Labs that mentioned, offhand, debunking this particular myth, but it
seems to be as resilient as the one about the poodle in the microwave.

> So why aren't we all typing on Dvorak keyboards today, now that we have
> computer terminals that don't ``jam''?  Old habits die hard....  Alas,
> alack....

You have obviously never used CP/M on a low-budget system, if you think
computer keyboards don't "jam" :-)

Mike

| Mike Albaugh (albaugh@dms.UUCP || {...decwrl!turtlevax!}weitek!dms!albaugh)
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035   voice: (408)434-1709
| The opinions expressed are my own (Boy, are they ever)