[comp.sys.next] Indexing in the Digital Librarian

SEB@pucc.Princeton.EDU (Scott E. Barron) (08/03/89)

I am trying to add a folder of large text files to the Digital Librarian on
a NeXT.  I cannot find any sufficient documentation for doing this, so the
process I used is as follows:

-drag the folder icon from the browser to an empty window in the DL
-select the new entry as the target
-select targets from the DL menu
-select index info... from the targets menu
-click on create index

The files I am trying to index are very large, but not as big as the works
of Shakespeare.  Yet, the DL locked up the first time I tried this, and the
second time it completed the index but the index exceeded 18MB!!  The index
on Shakespeare is less than 4MB.  What am I doing wrong?

Furthermore, the first time I tried to do a search on the newly created index,
the system locked up.  I used the control-control-~ keys to see what msgs
were being sent.  The following was repeated many times:

  >IO error on pageout: error=28
  >vnode_pageout: failed!

Any hints or suggestions as to what is happening?

I have also ventured into the realm of creating icons.  This is no easy
task since icons for the DL must be created upside down and inverted.  What
is the reasoning behind this?  Is this some kind of joke?  I was able to
create an icon and assign it to the entry I added to the DL by adding a line
to the .targets file.  That was simple enough, but not very intuitive.

My biggest complaint about the NeXT is the lack of thorough documentation and
the need for non-UNIX ways to handle problems/errors.  I know, I know--this is
still beta system software and 1.0 will be the answer to all problems.  I
guess I'll have to wait and see....  Another problem is that there is no
indicator light to show when the printer is receiving data or when the hard
disk is being accessed.

Onward...

Scott E. Barron                                                      IXTHUS
Advanced Technology
Princeton University

jgreely@oz.cis.ohio-state.edu (J Greely) (08/04/89)

In article <9187@pucc.Princeton.EDU> SEB@pucc.Princeton.EDU (Scott E. Barron) writes:
>I am trying to add a folder of large text files to the Digital Librarian on
>a NeXT.  I cannot find any sufficient documentation for doing this, so the
>process I used is as follows:

Known problem.  I tried to index the Internet RFCs (which run to about
19 meg), and never could get them in.  Your procedure is correct, but
I don't think you're going to get results under 0.9.  If you want to
experiment further, try the shell-level program index(1).  It's
not terribly stable, but it gives a bit more flexibility in indexing.

>The files I am trying to index are very large, but not as big as the works
>of Shakespeare.

Do you mean that the total file size is smaller than 8 meg, or that
the individual files are no larger than any in Shakespeare?  The
largest directory I have successfully indexed had only a total of 2
meg of text, with file sizes ranging from 45 bytes to 250 Kbytes.  The
index is a respectable 800K.

>Yet, the DL locked up the first time I tried this, and the
>second time it completed the index but the index exceeded 18MB!!  The index
>on Shakespeare is less than 4MB.  What am I doing wrong?

It's possible that the text contains a great many words that seem
important enough to index (see pword(1)), or that the indexing options
were oddly set.  Another possibility is that the indexing program
failed, but claimed to succeed.

>Furthermore, the first time I tried to do a search on the newly created index,
>the system locked up.  

Sounds like the third option!  Was this the same copy of the Librarian
that you created the index under?  Had any other aggressively
memory-hungry programs been running? 
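
A note on the "IO error on pageout: error=28" messages quoted in the
original post: on BSD-derived systems errno 28 is ENOSPC ("No space left
on device"), which would be consistent with the pager running out of disk
space for swap, possibly helped along by an 18MB index.  That reading is
an inference, not something confirmed in this thread.  A minimal Python
sketch for decoding such a number, assuming the kernel's "error=" value is
a standard errno and that the machine running the sketch uses the same
numbering (describe_errno is an illustrative helper, not a NeXT tool):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as "ENOSPC".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives this platform's human-readable message.
    return name, os.strerror(code)


if __name__ == "__main__":
    # 28 is the value from the "IO error on pageout" console message.
    name, message = describe_errno(28)
    print(f"error=28: {name} ({message})")
```

If the disk really was full, freeing space (or deleting the oversized
index) before searching again would be the first thing to try.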

>My biggest complaint about the NeXT is the lack of thorough documentation and
>the need for non-UNIX ways to handle problems/errors.

I've found the existing documentation to be quite reasonable, despite
its very pre-release nature.  Meanwhile, parsing the second half of
your sentence (with some difficulty), I gather you want more visual
administration tools.  They are, as far as I know, in the process of
being written/debugged.  The goal is to have as few tasks as possible
that require "old-fashioned" Unix administration.  My back-and-forth
with the bug handling group about the console (excuse me, the "Mach
Window") has convinced me that they want *everything* done from the
Workspace, preferably in a visual fashion.

>Another problem is that there is no indicator light to show when the
>printer is receiving data or when the hard disk is being accessed.

Funny, I thought both of these were obvious.  When the cube is making
coffee, the hard disk is being accessed.  When the coffee beans are
being ground, the optical disk is in use.  When the screen freezes,
it's preparing to send data to the printer, and when the cargo jet
takes off, it's about to begin printing.  Simple.


				NeXT Support Guru for sale
				or rent.  Inquire within.
-=-
J Greely (jgreely@cis.ohio-state.edu; osu-cis!jgreely)

SEB@pucc.Princeton.EDU (Scott E. Barron) (08/04/89)

In article <JGREELY.89Aug3152612@oz.cis.ohio-state.edu>, jgreely@oz.cis.ohio-state.edu (J Greely) writes:

>>The files I am trying to index are very large, but not as big as the works
>>of Shakespeare.
>
>Do you mean that the total file size is smaller than 8 meg, or that
>the individual files are no larger than any in Shakespeare?  The
>largest directory I have successfully indexed had only a total of 2
>meg of text, with file sizes ranging from 45 bytes to 250 Kbytes.  The
>index is a respectable 800K.

The folder that I dragged to the DL contained 65 folders, and those
folders contained files ranging from 2K to 15K.  The total size of the
files does not exceed the size of the files in the Shakespeare folder.
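
One way to check that comparison is to total the file sizes directly and
set the result against the size of the index.  A minimal sketch in Python,
assuming the folder path is passed on the command line; folder_stats is a
hypothetical helper written for illustration, not part of the Librarian
or of index(1):

```python
import os
import sys


def folder_stats(root):
    """Walk root recursively; return (file_count, total_bytes, smallest, largest)."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so nothing outside the tree is counted.
            if os.path.islink(path):
                continue
            sizes.append(os.path.getsize(path))
    if not sizes:
        return 0, 0, 0, 0
    return len(sizes), sum(sizes), min(sizes), max(sizes)


if __name__ == "__main__" and len(sys.argv) > 1:
    count, total, smallest, largest = folder_stats(sys.argv[1])
    print(f"{count} files, {total} bytes total, "
          f"smallest {smallest}, largest {largest}")
```

If the total comes out well under 18MB, the index is larger than the text
it covers, which points at the indexer rather than at the files.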

>
>>Yet, the DL locked up the first time I tried this, and the
>>second time it completed the index but the index exceeded 18MB!!  The index
>>on Shakespeare is less than 4MB.  What am I doing wrong?
>
>It's possible that the text contains a great many words that seem
>important enough to index (see pword(1)), or that the indexing options
>were oddly set.  Another possibility is that the indexing program
>failed, but claimed to succeed.

Are there indexing options that can be set from within the DL?  Or are
those options available only when using index(1)?


>
>>Furthermore, the first time I tried to do a search on the newly created index,
>>the system locked up.
>
>Sounds like the third option!  Was this the same copy of the Librarian
>that you created the index under?  Had any other aggressively
>memory-hungry programs been running?

Yes, it was the same copy of the Librarian.
No other programs were running.

>>My biggest complaint about the NeXT is the lack of thorough documentation and
>>the need for non-UNIX ways to handle problems/errors.
>
>I've found the existing documentation to be quite reasonable, despite
>its very pre-release nature.

Then could you please tell me where I could find documentation on the
DL and specifically about indexing?  How about creating icons
for the DL?  I find the documentation to be geared more for UNIX
programmers/developers (such as yourself), rather than for general
users.  Also note that the documentation should not be "very pre-release"
considering that 1.0 is supposed to be available shortly, and that the
NeXT is (or will soon be) commercially available.

>I gather you want more visual
>administration tools.  They are, as far as I know, in the process of
>being written/debugged.  The goal is to have as few tasks as possible
>that require "old-fashioned" Unix administration.  My back-and-forth
>with the bug handling group about the console (excuse me, the "Mach
>Window") has convinced me that they want *everything* done from the
>Workspace, preferably in a visual fashion.

Sounds promising....  They will most definitely have to implement such
tools if this machine is to be successful in the business place.  I
would be very surprised if BusinessLand sells many NeXTs in the near
future.  Businesses tend to be more conservative and will be reluctant
to invest in an as-yet unstable machine.

>
>>Another problem is that there is no indicator light to show when the
>>printer is receiving data or when the hard disk is being accessed.
>
>Funny, I thought both of these were obvious.  When the cube is making
>coffee, the hard disk is being accessed.  When the coffee beans are
>being ground, the optical disk is in use.  When the screen freezes,
>it's preparing to send data to the printer, and when the cargo jet
>takes off, it's about to begin printing.  Simple.

This is a cute analogy to a real problem.  I wrote a 3-page document
in WriteNow and then tried to print it.  The print dialog box disappeared
with no message, so I assumed it was going to print.  Well, it never did!
I later discovered that the toner was low.  Of course, there is no
indicator light to tell me when it's low, and no way to know if the
printer ever received any data to print.  I wasted a lot of time and
energy trying to figure out what was wrong.  That's a problem!!

While trying to index a target in the DL, the machine hung.  The cursor
indicated that the disk was being accessed and I heard some "grinding"
every now and then, but the program had locked up.  There was no way
to tell what was going on!  I had to call in the nearest UNIX guru to
help me figure out what was going on.  It would be nice to have a light
on the hard disk so that the user knows what is happening at such times.
That's a problem!!

Are there other non-UNIX folk out there who have used the NeXT
and have general user comments about the documentation, interface,
programs, browser, etc?


Scott Barron                                                IXTHUS
Advanced Technology
Princeton University

chari@nueces.UUCP (Christopher M. Whatley) (08/05/89)

In article <9204@pucc.Princeton.EDU> SEB@pucc.Princeton.EDU writes:
| In article <JGREELY.89Aug3152612@oz.cis.ohio-state.edu>,
| jgreely@oz.cis.ohio-state.edu (J Greely) writes: 
| Are there indexing options that can be set from within the DL?  Or are
| those options available only when using index(1)?

Actually, the Librarian just runs index. Do a ps while your machine is
at the mercy of the Librarian in "indexing-processor-hog-mode".

| >>My biggest complaint about the NeXT is the lack of thorough documentation and
| >>the need for non-UNIX ways to handle problems/errors.

It would be great to have more "non-Unix" ways to handle things, but I
hope they don't go too far, since the Unix ways have such wonderful
flexibility that would surely be lost in a point and click interface.
Right now, I love the schizo interface. I can be lazy with the mouse
or I can be fast and efficient with the keys.

| Then could you please tell me where I could find documentation on the
| DL and specifically about indexing?  

Search for "index" in the Librarian. There is an interesting document
about indexing from the developer's point of view.

|				How about creating icons
| for the DL?  

For your own personal lib, look in ~/.NeXT/Targets/targets for a
setup file of each lib entry. You can change the icon name there
after you make your EPS icon in that directory. For the system, you
do the same in /NextApps/Librarian.app/targets.

|	I find the documentation to be geared more for UNIX
| programmers/developers (such as yourself), rather than for general
| users.  

Good observation. The machine is beta "for developers and aggressive
end users". Apparently, you and I fall into the latter category. Right
now, you can expect no more than to have to fend for yourself. I suggest
you take these problems to the campus support people, since they are
the ones who have direct access to NeXT via the ask_next e-mail address.

Chris
-- 
Chris Whatley			chari@nueces.cactus.org
P.O. Box 50254			!nueces!chari@cs.utexas.edu
Austin, TX 78763		chari@walt.cc.utexas.edu
512/499-0475