[comp.lang.perl] archfs -- a virtual file system based on comp.archives data

rodney@dali.ipl.rpi.edu (Rodney Peck II) (02/08/91)

  I've been playing with perl some more and I came across an idea for
something that I think is pretty neat.

  Basically, it takes the data from the headers of the messages in 
comp.archives and generates a virtual filesystem.  It's sort of like
restore's interactive mode, where you can cd and ls the directories.  At
the leaf nodes are files named after the date they were posted to the
net.
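
  For example, a session might go something like this (the archive names
here are made up for illustration -- yours will depend on what's in your
spool):

    % archfs
    Getting header info from 123 articles.
    122.. 121.. 120.. (and so on)
    Comp.archives filesystem ready.
    %ls
    gnuplot        kermit         zoo
    %cd zoo
    %ls
    part01
    %article part01
    (the original posting comes up in more)
    %exit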

  There are a couple of things you can do with these files.  The 'article'
command will take a file and fetch the original message from the
usenet spool and hand it to more so you can read it.
  The other thing you can do is give the 'ftp' command.  This will try to
read the header, find the hostname, pathname, etc., and execute an ftp
command to get the file and put it in your /tmp directory.  It doesn't quite
work yet because I haven't gotten the interaction with ftp down properly.  I'm
planning to use Randal's expect-like perl code to handle this.  Ironically,
I lost the pointer to it, so I'll have to go poke around until I find it or
someone mails me a message telling me where I can find it.
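
  For now, ftpget just shoves a canned command script down a one-way pipe,
roughly like this (a sketch of what the script does today, with $addr,
$path, and $file already parsed out of the Archive: header and $myhost
taken from `hostname`):

    open (FTP, "|ftp -n $addr") || die "can't run ftp: $!\n";
    print FTP "user anonymous $myhost\n";   # hostname as the password
    print FTP "binary\n";
    print FTP "get $path /tmp/$file\n";
    print FTP "bye\n";
    close (FTP);

The trouble is that the pipe only goes one way: if the login or the get
fails, the script never hears about it.  That's what the expect-like code
is for -- reading ftp's responses and reacting to them.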

  Anyway, go ahead and try this out.  Just unshar it and change the
first bit to point to your spool directory (mine's /usenet/spool).
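That is, edit the $archive line near the top of the script, for example:

    $archive = '/usr/spool/news/comp/archives';   # wherever your spool lives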

  The real question is: who's brave enough to take this idea and expand
it into an actual virtual filesystem that the regular command line can
work on?  It would be really nice to be able to say something like
"cp /ftp/hackers/words/file ~/jargonfile.txt".

Let me know about any good changes, additions, or perl code suggestions.
I'm still new to this perl stuff.

Rodney

--------------cut here---------------------------------------------
#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  archfs
# Wrapped by rodney@dali.ipl.rpi.edu on Thu Feb  7 21:57:22 1991
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'archfs' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'archfs'\"
else
echo shar: Extracting \"'archfs'\" \(8053 characters\)
sed "s/^X//" >'archfs' <<'END_OF_FILE'
X#!/usr/local/bin/perl
X'di';
X'ig00';
X#
X# $Header: /h1/rodney/bin/RCS/archfs,v 1.4 91/02/07 21:56:00 rodney RelNet $
X#
X# $Log:	archfs,v $
X# Revision 1.4  91/02/07  21:56:00  rodney
X# released to the net
X# 
X# Revision 1.3  91/02/07  21:51:30  rodney
X# before adding the expect part
X# 
X# Revision 1.2  91/02/07  21:22:25  rodney
X# added usenet reading directly
X# 
X# Revision 1.1  91/02/07  20:46:46  rodney
X# Initial revision
X# 
X
X# archfs looks at the archive names from the usenet articles and makes
X# a mythical filesystem that you can cd and ls, sort of like restore -i.
X# A special command 'article' lets you get the article from the news
X# spool and look at it with more.
X
X$archive = '/usenet/spool/comp/archives';  # point this at your news spool
X$parse++;       # unset to read the saved 'processed' file instead
X#
X$inodenum = 0;
X$dir[0] = "/:-1";
X
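X# Each entry of @dir is a colon-separated record.  Directories look like
X# "name:parent-inode[:child-inode ...]"; files look like
X# "name:parent-inode:/article-filename".  Inode 0 is the root.
X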
Xif ($parse) {
X  opendir(DIR,$archive);
X  @articles = grep (!/^\./,readdir(DIR));
X  closedir(DIR);
X
X  $count = $#articles+1;
X  print "Getting header info from ",$count," articles.\n";
X  $|++;
X  for (@articles) {
X    $file = $archive . '/' . $_;
X    print --$count,".. ";
X    $head = 0;
X    $base = '/'.$_;
X    open (ARCHIVE, $file) || die "$0: can't open $file: $!\n";
X    while (<ARCHIVE>)
X    {
X      last if (/^$/ && $head++);  # second blank line: past both headers, no Archive-name found
X      next unless /^Archive-name: (.*)$/;
X      &create($1,$base);
X      last;  # that's all we needed
X    }
X    close (ARCHIVE);
X  }
X  $|--;
X}
Xelse
X{
X  open (ARCHIVE,'processed') || die "$0: couldn't open processed:$!\n";
X  while (<ARCHIVE>)
X  {
X    $dir[$inodenum++] = $_;
X  }
X  close (ARCHIVE);
X}
X
X$dot = 0;
X
Xprint "Comp.archives filesystem ready.\n%";
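X# command loop: ls, cd DIR, article [FILE], ftp [FILE], save, exit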
Xwhile (chop($_ = <STDIN>))
X{
X  last if /^exit/;
X  &save if /^save/;
X  &ls if /^ls/;
X  &cd($1) if /^cd (.*)/;
X  &article($1) if /^article *(.*)/;
X  &ftpget($1) if /^ftp *(.*)/;
X  print "%";
X}
X
X
X# return the data field of a file
Xsub getdata {
Xlocal ($art) = @_;
Xlocal ($node) = &getdir($art);
X  if ($node == -1) 
X  {
X    print "file not found.\n";
X    return;
X  }
X  local ($name,$dotdot,$data) = split(/:/, $dir[$node]);
X  unless ($data =~ m+^/.*$+)
X  {
X    print "$name is not a file.\n";
X    return;
X  }
X  $data;
X}
X
X# parse the archive header and try to ftp the file to the tmp
X# directory.
Xsub ftpget {
X  local ($dir) = @_;
X  local ($data) = &getdata($dir);
X  if ($data eq '')
X  {
X    print "can't ftp.\n";
X    return;
X  }
X  local ($head) = 0;
X  open (ART,$archive.$data) || die "couldn't open $data: $!\n";
X  local($archiveline);
X  while (<ART>)
X  {
X    last if /^$/ && $head++; # end of header  
X    if (/^Archive: (.*)$/)
X    {
X      $archiveline=$1;
X      last;
X    }
X  }
X  close(ART);
X  $_ = $archiveline;
X  print $_,"\n";
X  unless (/(.*):(.*) \[(.*)\]/)
X  {
X    print "No readable information in the header.  Sorry.\n";
X    return;
X  }
X  local ($host) = $1;
X  local ($path) = $2; 
X  local ($addr) = $3;
X
X  local ($file) = $path;               # basename of the remote path
X  $file = $1 if $path =~ m+.*/(.*)+;
X  
X  local ($myhost) = `hostname`; chop $myhost;
X  # one-way pipe: we can't read ftp's responses yet
X  open (FTP,"|ftp -n $addr") || die "couldn't run ftp: $!\n";
X  print FTP "user anonymous $myhost\n";   # hostname as the password
X  print FTP "binary\n";
X  print FTP "get $path /tmp/$file\n";
X  print FTP "bye\n";
X  close (FTP);
X  print "The file is in /tmp/$file\n";
X}
X
X# fetch the article for this node and give it to more
Xsub article {
X  local ($dir) = @_;
X  local($data) = &getdata($dir);
X  system ("more ${archive}$data") if ($data ne '');
X}
X
X# return the name of the inode
Xsub name {
Xlocal ($node) = @_;
Xlocal ($nodename, @foo);
X ($nodename,@foo) = split(/:/, $dir[$node]);
X $nodename;
X}
X
X# get inode for file in this dir
Xsub getdir {
Xlocal ($dir) = @_;
X# break up this dir's inode list
X    local($nodes) = $dir[$dot];
X    local(@nodes) = split(/:/, $nodes);
X
X# special case for '..'
Xif ($dir eq '..') {
X  return $nodes[1];
X}
X
X# special case for ''
Xif ($dir eq '') {
X  return $nodes[2] if $#nodes == 2;
X  print "There is more than one file here, please specify.\n";
X  return -1;
X}
X  
X# look up the dir in the current dir's inodes
X    local(@ls) = @nodes[2..$#nodes]; 
X  for (@ls) {
X    return $_ if (&name($_) eq $dir);
X  }
X  return -1;  # file not found
X}
X
X# cd to the directory
Xsub cd {
Xlocal ($dir) = @_;
X
Xif ($dir =~ m+(.*)/(.*)+)
X{ # it's a pathname
X  local ($first) = $1;
X  local ($rest) = $2;
X  if ($first eq '') 
X  { # absolute pathname
X    $dot = 0;
X  } else
X  { # relative pathname
X    &cd($first);
X  }
X  &cd($rest);
X  return;
X}
Xlocal($foo) = &getdir ($dir);
Xif ($foo == -1) {
X  print "$dir not found\n";
X} else {
X  $dot = $foo;
X}
X}
X# list the contents of the current directory
Xsub ls
X{
X    local($nodes) = $dir[$dot];
X    local(@nodes) = split(/:/, $nodes);
X    local(@ls) = @nodes[2..$#nodes]; 
X    local($max) = 0;
X    local($len);
X    local(@files);
X
X    for (@ls)
X    {
X      ($nodename,@foo) = split(/:/, $dir[$_]);
X      unshift (@files,$nodename);
X      $len = length $nodename;
X      $max = $len if ($len > $max);
X    }
X
X    @files = sort @files;
X    local ($cols) = int(80 / ($max + 2) - 1 + .5);
X    local ($line, $c);
X    local ($rows) = int(($#ls+1)/$cols + .5);
X    $rows++ if ($rows * $cols < $#ls + 1);
X    for ($line=0;$line < $rows; $line++)
X    {
X      for ($c=0; $c < ($cols); $c++)
X      {
X        $nodename = $files[$line + $rows * $c];
X        last if $nodename eq '';          # past the end of the list
X        print $nodename, ' ' x ($max + 2 - length $nodename);
X      }
X      print "\n";
X    }
X}
X
X# save the directory tree to a 'processed'
Xsub save {
X  open (SAVE,'>processed') || die "$0: couldn't open processed:$!\n";
X  for (@dir) {
X    print SAVE $_,"\n";
X  }
X  close (SAVE);
X  print "wrote ", $#dir+1, " nodes.\n";
X}
X
X
X# returns the next inode
Xsub newnode {
X  ++$inodenum;
X}
X
X# create a file, making the directory tree as needed
X# call as '&create(pathname,filedata)'
Xsub create {
Xlocal ($path, $data, $dot) = @_;  # $dot is an internal arg which holds the
X			   # current inode -- it's omitted (hence 0, the
X			   # root) when called from the top level.
X  $dot = 0 if ($dot == 0);   # turn an omitted third arg into inode 0
X  if ($path =~ m-(\w+)/(.*)-)
X  { # it has directories to descend
X    local ($dirname) = $1;
X    local ($restpath) = $2;
X    local ($in);
X    @nodes = split(/:/, $dir[$dot]);
X    for (@nodes[2..$#nodes])
X    {
X      ($nodename,@foo) = split(/:/, $dir[$_]);
X      ($in = $_, last) if ($nodename eq $dirname);
X    }
X    if ($in == 0) 
X    {
X      $in = &newnode;
X      $dir[$in] = "$dirname:$dot";  #make new directory inode
X      $dir[$dot] .= ":$in";         #append to current dir's children
X    }
X    &create($restpath,$data,$in);
X  }
X    else
X  { # it's a file
X    $in = &newnode;
X    $dir[$in] = "$path:$dot:$data";     #make new file inode
X    $dir[$dot] .= ":$in";               #append to current dir's children
X  }
X}
X    
X###############################################################
X
X    # These next few lines are legal in both Perl and nroff.
X
X.00;                       # finish .ig
X 
X'di           \" finish diversion--previous line must be blank
X.nr nl 0-1    \" fake up transition to first page again
X.nr % 0         \" start at page 1
X'; __END__ ##### From here on it's a standard manual page #####
X
X.TH ARCHFS 1 "February 7, 1991"
X.AT 3
X.SH NAME
Xarchfs \- browse the comp.archives filesystem
X.SH SYNOPSIS
X.B archfs
X.SH DESCRIPTION
X.I Archfs
Xtakes the articles in your usenet spool directory for comp.archives and
Xcreates a sort of filesystem from the Archive-name headers provided by
XEd.
X
XYou can cd and ls this filesystem as if it were a normal one.  In this
Xway it is modeled after the restore -i command.
X
XIn addition, you can use the
X.B article
Xcommand to apply the pager to the article from the spool.  Presumably,
Xyou can then read the message and ftp it if you like.
X
XThe
X.B save
Xcommand writes the contents of the database into the file called 
Xprocessed in the current directory.  You can save a lot of time in
Xparsing if you use this once after reading the spool.  Unset the $parse
Xvariable to have it read the processed file instead of looking through
Xcomp.archives.
X
XFinally, there is the start of the code to handle automatic ftp fetching
Xof files.  It does everything up to where it has to talk to ftp itself.
XFor that, I need Randal's expect-in-perl thing, and I've misplaced my
Xpointer to it.
X
X.SH ENVIRONMENT
XNo environment variables are used.
X.SH FILES
XNone.
X.SH AUTHOR
XRodney Peck II
X.SH "SEE ALSO"
Xcomp.archives
X.SH DIAGNOSTICS
X
X.SH BUGS
X
END_OF_FILE
if test 8053 -ne `wc -c <'archfs'`; then
    echo shar: \"'archfs'\" unpacked with wrong size!
fi
chmod +x 'archfs'
# end of 'archfs'
fi
echo shar: End of shell archive.
exit 0


-- 
Rodney

daniel@world.std.com (Daniel Smith - you know, that West Coast one...) (02/09/91)

	This sort of reminds me of something I recently posted to alt.sources
called "ls2dir".  It takes an ls-lR format file and spits out mkdir and
cat > file commands to create a representation of the original ls-lR tree.
The files created contain one line of data: the portion of the ls-lR line
minus the filename.  This makes it easy to browse around a "filesystem",
and perhaps a useful idea would be to combine it with archfs.  It would
make a neat tool where archive listings, ls-lR listings (and possibly some
intelligent guesswork on paragraphs specifying ftp sites and file names?)
could be used as a basis for generating "go fetch this" commands (uucp,
ftp, or a mail archive server).  Hmmm... perhaps a few specific scripts
tied together with a nice front end could make for a "one stop shopping
for files/packages" tool.  Just thinking out loud :-)  I'd post ls2dir
here (it's only ~25 lines of shell script) but I don't happen to have it
at this site (world.std.com).  It's about 1-2 months old.
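
	The idea, roughly, goes like this (a from-memory sketch in perl,
not the actual script, which is sh; symlinks and subdirectory entries
are glossed over):

    #!/usr/local/bin/perl
    # ls2dir sketch: turn an ls -lR listing into mkdir / cat commands
    $dir = '.';
    while (<>) {
        chop;
        next if $_ eq '' || /^total /;
        if (/^(.+):$/) {               # "subdir:" line starts a new directory
            $dir = $1;
            print "mkdir $dir\n";      # parents come first in ls -lR output
        }
        elsif (/^-/ && /^(.*\S)\s+(\S+)$/) {   # a plain-file line
            # the file gets one line of data: the ls line minus the name
            print "cat > $dir/$2 <<'EOF'\n$1\nEOF\n";
        }
    }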

				Daniel
-- 
daniel@island.com  .....Daniel Smith, Island Graphics, (415) 491 0765 x 250(w)
daniel@world.std.com ...4000 CivicCenterDrive SanRafael MarinCounty CA 94903
dansmith@well.sf.ca.us .I must write this, or Island will take away my coffee.
Could we continue with the petty bickering? I find it most intriguing-Data:TNG