dkb@cs.brown.edu (Dilip Barman) (10/02/90)
What I'm trying to do seems fraught with problems and I thought I'd
ask my question again with more details.  Thanks to those who suggested
fill-pointers and treating a string as an array of characters, but I
am still having problems.  What I want to be able to do is parse sentences
from an input file, delimited by periods.  Other than periods, only a-z, A-Z,
slash (/), and dash (-) are considered non-white space.  Words are delimited
by white space.  What I want to create is not a string but a list component
(I discovered setting *print-escape* to nil to disable double quotes and
this should help).  So, in reading
    "This is sentence 1.
     This is * sentence2."
I am trying to create:

    ( (THIS IS SENTENCE (NUMBER 1))
      (THIS IS SENTENCE2)
    )

How can I coerce the string into being a list component??  This has
me stumped!  Thanks in advance.

Dilip Barman   dkb@cs.brown.edu

U.S. mail: Brown University
           Dept. of Computer Science, Box 1910
           Providence, RI 02912
           (401)863-7666
Home:      40 Everett Avenue
           Providence, RI 02906
           (401)521-9731
miller@cam.nist.gov (Bruce R. Miller) (10/02/90)
In article <51809@brunix.UUCP>, Dilip Barman writes:
> What I'm trying to do seems fraught with problems and I thought I'd
> ask my question again with more details.  Thanks to those who suggested
> fill-pointers and treating a string as an array of characters, but I
> am still having problems.  What I want to be able to do is parse sentences
> from an input file, delimited by periods.  Other than periods, only a-z, A-Z,
> slash (/), and dash (-) are considered non-white space.  Words are delimited
> by white space.  What I want to create is not a string but a list component
> (I discovered setting *print-escape* to nil to disable double quotes and
> this should help).  So, in reading
>     "This is sentence 1.
>      This is * sentence2."
> I am trying to create:
>
>     ( (THIS IS SENTENCE (NUMBER 1))
>       (THIS IS SENTENCE2)
>     )
>
> How can I coerce the string into being a list component??  This has
> me stumped!  Thanks in advance.

Why, CONS it onto something, of course!  In Lisp ANYTHING can be a
`list component':
    (cons "FOO" NIL) -> ("FOO")
    (cons 1 (cons "FOO" NIL)) -> (1 "FOO")
etc.

So, read characters and put them into a string, using the functions
mentioned; stop when you get to anything you consider a delimiter, cons
that onto a result, and keep going till you get a period; then return
the reverse of the result.  This gives
    ("THIS" "IS" "SENTENCE2")
And your homework will be done in no time.

If you really want symbols instead of strings, use INTERN.  And don't
bother with *print-escape*; that only affects how things print, not how
they are read.

bruce
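The recipe above (accumulate characters, cons each finished word onto a
result, reverse at the period) might be sketched roughly as follows.
This is only an illustration, not code from the thread: the names
WORD-CHAR-P, READ-SENTENCE and READ-SENTENCES are made up here, digits
are assumed to count as word characters (the desired output contains
SENTENCE2), and the (NUMBER 1) wrapping from the question is not
attempted.

```lisp
;;; Sketch only.  WORD-CHAR-P, READ-SENTENCE and READ-SENTENCES are
;;; hypothetical names, not part of Common Lisp or of the thread.

(defun word-char-p (char)
  "True for letters, slash and dash; digits are included as an assumption."
  (or (alphanumericp char) (char= char #\/) (char= char #\-)))

(defun read-sentence (stream)
  "Read words from STREAM up to the next period or end of file.
Returns a list of symbols, or NIL if no word was seen."
  (let ((word (make-array 16 :element-type 'character
                             :adjustable t :fill-pointer 0))
        (words '()))
    (flet ((finish-word ()
             ;; Cons the accumulated word (if any) onto the result,
             ;; then empty the buffer by resetting its fill pointer.
             (when (plusp (fill-pointer word))
               (push (intern (nstring-upcase (copy-seq word))) words)
               (setf (fill-pointer word) 0))))
      (loop
        (let ((char (read-char stream nil nil)))
          (cond ((or (null char) (char= char #\.))
                 (finish-word)
                 (return (nreverse words)))
                ((word-char-p char)
                 (vector-push-extend char word))
                (t                      ; anything else is a delimiter
                 (finish-word))))))))

(defun read-sentences (stream)
  "Collect sentences until READ-SENTENCE returns NIL.
Note that an empty sentence (two periods in a row) also stops the loop."
  (loop for sentence = (read-sentence stream)
        while sentence
        collect sentence))
```

For instance,
(with-input-from-string (s "Read the input. Split on white-space and/or periods.")
  (read-sentences s))
should return ((READ THE INPUT) (SPLIT ON WHITE-SPACE AND/OR PERIODS)).
The symbols are interned in whatever *PACKAGE* is current at the time.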
eliot@phoenix.Princeton.EDU (Eliot Handelman) (10/02/90)
In article <51809@brunix.UUCP> dkb@cs.brown.edu (Dilip Barman) writes:
;
;How can I coerce the string into being a list component??  This has
;me stumped!  Thanks in advance.

If READ-FROM-STRING acted reasonably, you could use it like this:

(defun string->list (string)
  (let ((words '()) (index 0))
    (loop
      (multiple-value-bind (word next-index)
          (read-from-string string nil nil :start index)
        (setq index next-index)
        (if word
            (push word words)
            (return (nreverse words)))))))

> (string->list "It was a dark and stormy night")
(IT WAS A DARK AND STORMY NIGHT)

Unfortunately, READ-FROM-STRING throws an error if it sees read-macros,
like commas.  The solution is to write your own version, which reads
characters from a string (using SCHAR), preprocesses special characters
(like comma), detects the end of the word, then hands the string and
indices to SUBSEQ, which operates on the string.  Intern the result,
push on a list, NREVERSE when done and voila!

It really is a pain to do this in CL.  It was so much easier in the
old Franz Lisp, because strings and atoms were identical.

--eliot
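A hand-rolled version along the lines just described could look
something like the sketch below.  It is an illustration under
assumptions, not Eliot's code: STRING->WORDS and TOKEN-CHAR-P are
invented names, digits are assumed to be word characters, every other
character (commas and periods included) is simply skipped, and CHAR is
used in place of SCHAR so that non-simple strings are accepted too.

```lisp
;;; Sketch only.  STRING->WORDS and TOKEN-CHAR-P are hypothetical names.

(defun token-char-p (char)
  "True for letters, slash and dash; digits are included as an assumption."
  (or (alphanumericp char) (char= char #\/) (char= char #\-)))

(defun string->words (string)
  "Split STRING into a list of symbols, skipping every non-token character."
  (let ((words '())
        (index 0))
    (loop
      ;; Skip delimiters (white space, commas, periods, ...).
      (let ((start (position-if #'token-char-p string :start index)))
        (when (null start)
          (return (nreverse words)))
        ;; Walk forward one character at a time to find the word's end.
        (let ((end start))
          (loop while (and (< end (length string))
                           (token-char-p (char string end)))
                do (incf end))
          ;; Hand the string and indices to SUBSEQ, then intern.
          (push (intern (string-upcase (subseq string start end))) words)
          (setq index end))))))
```

With this, (string->words "Alas poor Yorick, I knew him well Horatio.")
should give (ALAS POOR YORICK I KNEW HIM WELL HORATIO); since the reader
is never involved, the comma cannot trigger a read-macro error.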
moore%cdr.utah.edu@cs.utah.edu (Tim Moore) (10/02/90)
In article <2990@idunno.Princeton.EDU> eliot@phoenix.Princeton.EDU (Eliot Handelman) writes:
>In article <51809@brunix.UUCP> dkb@cs.brown.edu (Dilip Barman) writes:
>> [How do I turn a string sentence into a list?]
>If READ-FROM-STRING acted reasonably, you could use it like this:
>
>(defun string->list (string)
>  (let ((words '()) (index 0))
>    (loop
>      (multiple-value-bind (word next-index)
>          (read-from-string string nil nil :start index)
>        (setq index next-index)
>        (if word
>            (push word words)
>            (return (nreverse words)))))))
>Unfortunately, READ-FROM-STRING throws an error if it sees read-macros,
>like commas.  The solution is to write your own version, which reads
>characters from a string (using SCHAR), preprocesses special characters
>(like comma), detects the end of the word, then hands the string and
>indices to SUBSEQ, which operates on the string.  Intern the result,
>push on a list, NREVERSE when done and voila!

Rather than rewrite the reader, the solution is to do some readtable
hacking.  For example:

(defvar sentence-read-table (copy-readtable))

;;; Make punctuation read as a symbol of its own instead of being
;;; treated as a dot or as a comma outside backquote.
(set-macro-character #\. #'(lambda (stream char)
                             (declare (ignore stream char))
                             '|.|)
                     nil sentence-read-table)
(set-macro-character #\, #'(lambda (stream char)
                             (declare (ignore stream char))
                             '|,|)
                     nil sentence-read-table)
;;; ... and so on.

(defun string->list (string)
  (let ((words '())
        (index 0)
        (*readtable* sentence-read-table))
    (loop
      (multiple-value-bind (word next-index)
          (read-from-string string nil nil :start index)
        (setq index next-index)
        (if word
            (push word words)
            (return (nreverse words)))))))

(string->list "Alas poor Yorick, I knew him well Horatio.")
(ALAS POOR YORICK |,| I KNEW HIM WELL HORATIO |.|)

If you are willing to sacrifice a character, a 4 line hack (plus
readtable initialization) that does the same thing is:

(defun string->list2 (string)
  (let ((*readtable* sentence-read-table))
    (with-input-from-string (s (concatenate 'simple-string string "`"))
      (read-delimited-list #\` s))))

>It really is a pain to do this in CL.  It was so much easier in the
>old Franz Lisp, because strings and atoms were identical.
>
>--eliot

It's not that hard in Common Lisp.  CL's extensive macro character
syntax can get in the way, but a one-time setup of a new read table gets
around this.  In some sense strings and symbols are equivalent, as many
CL string functions will take a symbol argument and coerce it to a
string.

Tim Moore                    moore@cs.utah.edu {bellcore,hplabs}!utah-cs!moore
"Ah, youth. Ah, statute of limitations."
		-John Waters
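To get from a flat token list like the one above to the
list-of-sentences shape asked for in the original question, one
possible follow-up step is to split at the |.| marker.  The sketch
below is an assumption-laden illustration, not part of Tim's post:
GROUP-SENTENCES is an invented name, and it assumes the period marker is
the symbol |.| interned in the same package as the default argument.

```lisp
;;; Sketch only.  GROUP-SENTENCES is a hypothetical helper name.

(defun group-sentences (tokens &optional (terminator '|.|))
  "Split TOKENS into sublists, ending each at TERMINATOR (which is dropped)."
  (let ((sentences '())
        (current '()))
    (dolist (token tokens)
      (if (eq token terminator)
          ;; End of a sentence: store it and start a fresh one.
          (progn (push (nreverse current) sentences)
                 (setq current '()))
          (push token current)))
    ;; Keep any trailing words that were not followed by a terminator.
    (when current
      (push (nreverse current) sentences))
    (nreverse sentences)))
```

For example, (group-sentences '(this is one |.| this is two |.|)) should
return ((THIS IS ONE) (THIS IS TWO)).  Other punctuation symbols such as
|,| are left in place and could be filtered out with REMOVE beforehand.
As for symbols standing in for strings, (string= 'horatio "HORATIO") is
true in a standard readtable because STRING= accepts a symbol and
compares its name.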