packer@amarna.gsfc.nasa.gov (Charles Packer) (02/23/90)
The latter part of this message consists of a list of the 120
most popular newsgroups, ordered so that similar newsgroups are
adjacent to each other. Similarity here means the likelihood
that they have the same people posting to them. This list was
created as follows:
1. Define popularity of each newsgroup to be the number of
individuals posting to it during some interval (in this case, a
two-week period last December). Sort the list of all newsgroups
in descending order of this count. Select the top 120 newsgroups.
2. Make a list of all individuals posting to the selected
newsgroups. Sort this list in descending order of the number of
newsgroups to which each individual posted. Select the top 1000
"most widely posting" individuals.
3. Build a matrix in which columns are newsgroups and rows are
individuals. A one in cell (j,k) means that individual j
has posted to newsgroup k. A zero means no posting occurred.
The matrix will have 120 columns, 1000 rows.
4. Reorder the columns and rows to bring as many of the non-zero cells
close to the diagonal as possible. This has the effect of bringing
similar columns close to each other, and similar rows close to
each other.
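
The four steps above can be sketched in Python as follows. This is
only an illustration under stated assumptions, not the program that
produced the list below: the function names (build_matrix,
barycenter_order), the (person, newsgroup) input format, and the
tie-breaking by name are all hypothetical. Step 4 does not say which
reordering method was used, so the sketch substitutes a simple
barycenter heuristic: rows and columns are alternately sorted by the
mean position of their non-zero cells.

```python
from collections import defaultdict


def build_matrix(postings, n_groups=120, n_people=1000):
    """Steps 1-3. `postings` is an iterable of (person, newsgroup) pairs."""
    posters = defaultdict(set)              # newsgroup -> set of people
    for person, group in postings:
        posters[group].add(person)

    # Step 1: rank newsgroups by number of distinct posters, keep the top ones.
    # Ties are broken by name (an assumption; the post does not say how).
    groups = sorted(posters, key=lambda g: (-len(posters[g]), g))[:n_groups]

    # Step 2: rank people by how many of the selected groups they posted to.
    reach = defaultdict(set)                # person -> set of selected groups
    for g in groups:
        for p in posters[g]:
            reach[p].add(g)
    people = sorted(reach, key=lambda p: (-len(reach[p]), p))[:n_people]

    # Step 3: 0/1 matrix, rows are people, columns are newsgroups.
    matrix = [[1 if g in reach[p] else 0 for g in groups] for p in people]
    return people, groups, matrix


def _mean_position(positions):
    """Average of a list of positions; empty lists sort last."""
    return sum(positions) / len(positions) if positions else float("inf")


def barycenter_order(matrix, sweeps=20):
    """Step 4 (assumed heuristic): alternately sort rows and columns by the
    mean position of their non-zero cells. Returns (row_order, col_order)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    rows, cols = list(range(n_rows)), list(range(n_cols))
    for _ in range(sweeps):
        col_pos = {c: i for i, c in enumerate(cols)}
        rows.sort(key=lambda r: _mean_position(
            [col_pos[c] for c in range(n_cols) if matrix[r][c]]))
        row_pos = {r: i for i, r in enumerate(rows)}
        cols.sort(key=lambda c: _mean_position(
            [row_pos[r] for r in range(n_rows) if matrix[r][c]]))
    return rows, cols


# Toy data: two pairs of posters with disjoint interests.
postings = [("ann", "a.x"), ("ann", "c.x"), ("bob", "a.x"), ("bob", "c.x"),
            ("cat", "b.y"), ("cat", "d.y"), ("dan", "b.y"), ("dan", "d.y")]
people, groups, matrix = build_matrix(postings)
rows, cols = barycenter_order(matrix)
print([groups[c] for c in cols])   # ['a.x', 'c.x', 'b.y', 'd.y']
```

In the toy data the two ".x" groups share posters, as do the two ".y"
groups, so the reordering moves each pair into adjacent columns. A
heuristic like this is not guaranteed to find the best possible
ordering.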
The list follows. The number to the left of a newsgroup is its
column position. The number to the right is how many of the 1000
individuals posted to it. The matrix itself, aggregated into a
120-column print file, 100 lines long, is available from me by
e-mail.
  1 soc.culture.korean   18    61 alt.activism         40
  2 soc.culture.china    22    62 alt.drugs            39
  3 soc.culture.japan    28    63 sci.environment      46
  4 soc.culture.indian   16    64 rec.backcountry      25
  5 soc.culture.taiwan   12    65 rec.games.frp        29
  6 rec.music.gdead      16    66 rec.video            49
  7 rec.sport.baseball   31    67 misc.legal           94
  8 rec.sport.basketbal  39    68 rec.motorcycles      29
  9 rec.sport.football   70    69 rec.autos           125
 10 rec.sport.misc       26    70 misc.consumers      102
 11 rec.sport.hockey     18    71 misc.misc            25
 12 rec.bicycles         17    72 rec.autos.tech       56
 13 rec.gambling         15    73 sci.med              36
 14 rec.skiing           25    74 alt.folklore.comput 123
 15 talk.politics.midea  21    75 misc.consumers.hous  24
 16 rec.music.bluenote   18    76 news.groups         100
 17 rec.arts.tv.soaps    27    77 alt.callahans        23
 18 rec.puzzles          19    78 rec.audio            69
 19 alt.rock-n-roll.met  15    79 rec.photo            23
 20 rec.music.folk       19    80 misc.invest          26
 21 rec.arts.tv.uk       26    81 rec.aviation         34
 22 rec.arts.drwho       31    82 sci.math             18
 23 rec.games.misc       26    83 sci.electronics      40
 24 alt.rock-n-roll      36    84 rec.ham-radio        53
 25 rec.music.misc       72    85 rec.radio.shortwave  28
 26 rec.music.cd         49    86 sci.space.shuttle    27
 27 rec.arts.anime       25    87 sci.space            43
 28 rec.arts.tv         113    88 sci.physics          33
 29 alt.cult-movies      26    89 sci.astro            24
 30 rec.arts.movies     129    90 news.admin           42
 31 rec.arts.comics      57    91 news.misc            28
 32 alt.romance          24    92 comp.protocols.tcp-  29
 33 alt.peeves           27    93 comp.misc            51
 34 rec.arts.sf-lovers  115    94 comp.sys.amiga       79
 35 rec.arts.books       45    95 comp.sys.mac        147
 36 talk.abortion        14    96 comp.sys.ibm.pc     148
 37 rec.music.classical  34    97 misc.wanted          28
 38 rec.games.video      36    98 comp.sys.next        39
 39 rec.arts.startrek    91    99 rec.music.makers     29
 40 alt.sex.bondage      23   100 alt.religion.comput  34
 41 rec.humor           112   101 comp.sys.mac.progra  28
 42 alt.sex             108   102 misc.forsale         50
 43 soc.motss            52   103 comp.sys.mac.hardwa  35
 44 misc.kids            29   104 rec.music.synth      28
 45 rec.pets             38   105 gnu.misc.discuss     31
 46 soc.singles          58   106 comp.arch            24
 47 rec.food.cooking     43   107 comp.sys.atari.st    31
 48 soc.men              61   108 comp.lang.c          63
 49 soc.women            84   109 comp.unix.questions  56
 50 rec.food.veg         36   110 comp.unix.wizards    47
 51 talk.religion.newag  25   111 comp.binaries.ibm.p  24
 52 talk.religion.misc   39   112 comp.sources.wanted  47
 53 alt.flame            51   113 comp.os.vms          19
 54 talk.bizarre         56   114 comp.unix.xenix      34
 55 sci.skeptic          30   115 comp.unix.i386       44
 56 talk.politics.theor  23   116 comp.windows.x       34
 57 rec.travel           36   117 comp.text            22
 58 talk.politics.misc  136   118 comp.sys.amiga.tech  16
 59 misc.headlines       98   119 comp.sys.hp          20
 60 talk.politics.guns   50   120 comp.sys.apple       13