calvin@sequent.UUCP (Calvin Goodrich) (09/25/90)
...for the unix.gods out there. i have a file that has a whole mess of null characters in it ('bout 1/2 a meg). is there any way (preferably a shell script) to strip them off? thanx, calvin.
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/26/90)
In article <42900@sequent.UUCP> calvin@sequent.UUCP (Calvin Goodrich) writes:
: ...for the unix.gods out there. i have a file that has a whole mess of
: null characters in it ('bout 1/2 a meg). is there any way (preferably
: a shell script) to strip them off?
If your tr works like mine, you can just say
tr '' '' <foo >bar
Other possibilities:
sed '' <foo >bar
perl -pe 's/\0//g' <foo >bar
Larry Wall
lwall@jpl-devvax.jpl.nasa.gov
karl_kleinpaste@cis.ohio-state.edu (09/26/90)
calvin@sequent.uucp writes:
i have a file that has a whole mess of
null characters in it ('bout 1/2 a meg). is there any way (preferably
a shell script) to strip them off?
tr -d '\0'
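A minimal sketch of the explicit deletion suggested above, for anyone who wants to try it on a small sample first. The function name strip_nuls is made up for illustration, and the escape is written as the three-digit octal '\000' on the assumption that some tr implementations are pickier about '\0'.

```shell
# Hypothetical helper: copy stdin to stdout with every NUL byte removed.
# Deleting the byte explicitly avoids relying on a tr or sed quirk.
strip_nuls() {
    tr -d '\000'
}

# Sample input: "foo", three NUL bytes, "bar", newline.
printf 'foo\000\000\000bar\n' | strip_nuls    # prints: foobar
```

On a real file this would be run as strip_nuls <foo >bar, matching the redirections used elsewhere in the thread.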
ted@nmsu.edu (Ted Dunning) (09/26/90)
In article <42900@sequent.UUCP> calvin@sequent.UUCP (Calvin Goodrich) writes:
   ...for the unix.gods out there. i have a file that has a whole mess of
   null characters in it ('bout 1/2 a meg). is there any way (preferably
   a shell script) to strip them off?
tr without any options will do it.
(it's a bug ... it's a feature ...)
--
ted@nmsu.edu
+---------+
| In this |
|  style  |
|__10/6___|
jik@athena.mit.edu (Jonathan I. Kamens) (09/26/90)
In article <42900@sequent.UUCP>, calvin@sequent.UUCP (Calvin Goodrich) writes:
|> ...for the unix.gods out there. i have a file that has a whole mess of
|> null characters in it ('bout 1/2 a meg). is there any way (preferably
|> a shell script) to strip them off?
Well, since "tr" deletes NULLs from its input, you could do "tr '' '' < filename > filename.nonulls". Then again, "sed" apparently also deletes NULLs, so you could do something similar with it: "sed -n p < filename > filename.nonulls".
Both of those solutions are pretty much just hacks that rely on the fact that tr and sed delete NULLs. There's probably a more correct (i.e. it's doing what it's doing explicitly, rather than relying on a fluke in a program) solution in perl, but I'm religiously against posting perl scripts to the net, since so many other people do it so much better than I :-).
--
Jonathan Kamens                  USnail:
MIT Project Athena               11 Ashford Terrace
jik@Athena.MIT.EDU               Allston, MA 02134
Office: 617-253-8495             Home: 617-782-0710
brister@decwrl.dec.com (James Brister) (09/26/90)
On 25 Sep 90 18:59:26 GMT, lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) said:
> In article <42900@sequent.UUCP> calvin@sequent.UUCP (Calvin Goodrich) writes:
> : ...for the unix.gods out there. i have a file that has a whole mess of
> : null characters in it ('bout 1/2 a meg). is there any way (preferably
> : a shell script) to strip them off?
> Other possibilities:
>     sed '' <foo >bar
I try to avoid this unless I know there'll be a newline fairly frequently (which in a file full of nulls is *usually* unlikely). sed (at least my version of it) has a line length limit that can cause problems here.
>     perl -pe 's/\0//g' <foo >bar
Of course :-)
James
--
James Brister                                           brister@decwrl.dec.com
DEC Western Software Lab., Palo Alto, CA    {uunet,sun,pyramid}!decwrl!brister
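One rough way to judge whether a line-oriented tool is at risk from the limit described above is to compare the byte count with the newline count before running it. This is only a sketch: the name line_stats is hypothetical, and it assumes mktemp is available, which is common but not guaranteed on every system.

```shell
# Hypothetical helper: report how many bytes and how many newlines stdin holds.
# Many bytes with few newlines means very long "lines" for sed or awk.
line_stats() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    bytes=$(wc -c < "$tmp" | tr -d ' ')   # strip the padding some wc versions add
    lines=$(wc -l < "$tmp" | tr -d ' ')
    rm -f "$tmp"
    echo "$bytes bytes, $lines newlines"
}

printf 'foo\000\000\000bar\n' | line_stats    # prints: 10 bytes, 1 newlines
```

tr works a byte at a time and has no notion of lines, which is presumably why it does not run into the same limit.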
lugnut@sequent.UUCP (Don Bolton) (09/26/90)
In article <9651@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <42900@sequent.UUCP> calvin@sequent.UUCP (Calvin Goodrich) writes:
>: ...for the unix.gods out there. i have a file that has a whole mess of
>: null characters in it ('bout 1/2 a meg). is there any way (preferably
>: a shell script) to strip them off?
>
>If your tr works like mine, you can just say
>
>    tr '' '' <foo >bar
>
>Other possibilities:
>
>    sed '' <foo >bar
>    perl -pe 's/\0//g' <foo >bar
AWK AWK ACKKKK :-)
awk -f filebelow <oldlist >newlist
{ for (i = 1; i <= NF; i = i + 1)
    { if (i >= NF)
        printf("%s",$i)
      else
        printf("%s ", $i)
    }
  printf("\n")
}
course I assume the "null" characters are just blanks here
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/27/90)
In article <42947@sequent.UUCP> lugnut@sequent.UUCP (Don Bolton) writes:
: In article <9651@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
: >In article <42900@sequent.UUCP> calvin@sequent.UUCP (Calvin Goodrich) writes:
: >: ...for the unix.gods out there. i have a file that has a whole mess of
: >: null characters in it ('bout 1/2 a meg). is there any way (preferably
: >: a shell script) to strip them off?
: >
: >If your tr works like mine, you can just say
: >
: >    tr '' '' <foo >bar
: >
: >Other possibilities:
: >
: >    sed '' <foo >bar
: >    perl -pe 's/\0//g' <foo >bar
:
: AWK AWK ACKKKK :-)
:
: awk -f filebelow <oldlist >newlist
:
: { for (i = 1; i <= NF; i = i + 1)
:     { if (i >= NF)
:         printf("%s",$i)
:       else
:         printf("%s ", $i)
:     }
:   printf("\n")
: }
ACKKKK is right.
This simply dumps core on my machine. Probably line length limitation.
The sed solution apparently works because nulls are weeded out on input
and never put into the pattern buffer. No source handy, alas...
Larry
lugnut@sequent.UUCP (Don Bolton) (09/27/90)
In article <9677@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <42947@sequent.UUCP> lugnut@sequent.UUCP (Don Bolton) writes:
>: In article <9651@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>: >In article <42900@sequent.UUCP> calvin@sequent.UUCP (Calvin Goodrich) writes:
>: >: ...for the unix.gods out there. i have a file that has a whole mess of
>: >: null characters in it ('bout 1/2 a meg). is there any way (preferably
>: >: a shell script) to strip them off?
>: >
>: >If your tr works like mine, you can just say
>: >
>: >    tr '' '' <foo >bar
>: >
>: >Other possibilities:
>: >
>: >    sed '' <foo >bar
>: >    perl -pe 's/\0//g' <foo >bar
>:
>: AWK AWK ACKKKK :-)
>:
>: awk -f filebelow <oldlist >newlist
>:
>: { for (i = 1; i <= NF; i = i + 1)
>:     { if (i >= NF)
>:         printf("%s",$i)
>:       else
>:         printf("%s ", $i)
>:     }
>:   printf("\n")
>: }
>
>ACKKKK is right.
>
>This simply dumps core on my machine. Probably line length limitation.
Hmmmm.. did you try cat oldlist | awk -f filebelow > newlist ? That's the way I've been running it.
Also, on line 2, >= NF can be changed to == NF (this was my first venture into deeper awk actions). That shouldn't be the cause of the core dump though.
>The sed solution apparently works because nulls are weeded out on input
>and never put into the pattern buffer. No source handy, alas...
>
>Larry
jik@athena.mit.edu (Jonathan I. Kamens) (09/28/90)
In article <42947@sequent.UUCP>, lugnut@sequent.UUCP (Don Bolton) writes:
|> awk -f filebelow <oldlist >newlist
|>
|> { for (i = 1; i <= NF; i = i + 1)
|>     { if (i >= NF)
|>         printf("%s",$i)
|>       else
|>         printf("%s ", $i)
|>     }
|>   printf("\n")
|> }
|>
|> course I assume the "null" characters are just blanks here
First of all, the assumption that the nulls are supposed to represent blanks in the text is faulty, and is (as far as I can tell) in no way a valid assumption given the data that was provided by the original poster. Furthermore, there is no reason to make that assumption, since other posters have posted solutions which do not.
Note that the original poster did not say that he wanted to replace the nulls with spaces (which is what your solution does), he said that he wanted to remove them altogether.
Second, as Larry Wall already pointed out, your solution will coredump on a lot of systems.
Third, your solution deletes extra space between words. If I have a line which appears as "foo     bar" in the input, it will appear as "foo bar" in the output.
Fifth, the awk on my system (4.3BSD) loses anything on the line after the first null. Therefore, "foo^@^@^@bar" turns into "foo". Presumably, your version doesn't do this, else you wouldn't have posted your solution, so you have portability concerns. There are still other versions of awk (e.g. GNU awk) that keep nulls intact.
Sixth, the awk code you posted is suboptimal in at least three different ways. For example, if your loop runs from 1 to NF, how can i ever be greater than NF inside the body of the loop? Here's a piece of code that does the same thing (although, like I've said, I don't think it's the right thing to do):
{
    for (i = 1; i < NF; i++)
        printf("%s ", $i)
    printf("%s\n", $NF)
}
--
Jonathan Kamens                  USnail:
MIT Project Athena               11 Ashford Terrace
jik@Athena.MIT.EDU               Allston, MA 02134
Office: 617-253-8495             Home: 617-782-0710
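To make the whitespace point above concrete, here is a small sketch that wraps the shorter awk program in a shell function and feeds it a line with several blanks. The function name squeeze_fields is made up for illustration, and the sample input deliberately contains no NUL bytes, since the thread reports that awk's handling of them varies by implementation.

```shell
# Hypothetical wrapper around the shorter awk program quoted above.
# awk splits each line into fields on runs of blanks, so rejoining the
# fields with single spaces collapses any extra whitespace.
squeeze_fields() {
    awk '{
        for (i = 1; i < NF; i++)
            printf("%s ", $i)
        printf("%s\n", $NF)
    }'
}

printf 'foo     bar\n' | squeeze_fields    # prints: foo bar
```

That is a different job from deleting NUL bytes: the blanks in the input are squeezed, not preserved, which is why it does not answer the original question.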
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/28/90)
In article <43048@sequent.UUCP> lugnut@sequent.UUCP (Don Bolton) writes:
: Hmmmm.. did you try cat oldlist | awk -f filebelow > newlist ? That's
: the way I've been running it.
Still dumps. As I said, probably a line length limitation, which cat
would have no effect on. Perhaps you're running gawk? Dave is doing
a good job with that. Nawk (the version I have, anyway) complains
about "input record `...' too long".
Tsk, tsk. Well, at least it checks. Give it half credit.
Larry
lugnut@sequent.UUCP (Don Bolton) (09/28/90)
In article <1990Sep27.170227.5257@athena.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes:
>In article <42947@sequent.UUCP>, lugnut@sequent.UUCP (Don Bolton) writes:
>|> awk -f filebelow <oldlist >newlist
>|>
>|> { for (i = 1; i <= NF; i = i + 1)
>|>     { if (i >= NF)
>|>         printf("%s",$i)
>|>       else
>|>         printf("%s ", $i)
>|>     }
>|>   printf("\n")
>|> }
>|>
>|> course I assume the "null" characters are just blanks here
>
> First of all, the assumption that the nulls are supposed to represent blanks
>in the text is faulty, and is (as far as I can tell) in no way a valid
>assumption given the data that was provided by the original poster.
>Furthermore, there is no reason to make that assumption, since other posters
>have posted solutions which do not.
>
This is true, alas. I work with RDBMS products such as Oracle and Informix and am used to seeing nulls represented as blank spaces.
> Note that the original poster did not say that he wanted to replace the
>nulls with spaces (which is what your solution does), he said that he wanted
>to remove them altogether.
>
Actually, what my program does is strip out multiple blanks and replace them with one blank space.
> Second, as Larry Wall already pointed out, your solution will coredump on a
>lot of systems.
>
This is not a point I would have considered, as it runs fine on my machine and is really merely a modified example from the awk programming language book I have.
> Third, your solution deletes extra space between words. If I have a line
>which appears as "foo     bar" in the input, it will appear as "foo bar"
>in the output.
>
Which was my intent. (I did do *something* right) :-)
> Fifth, the awk on my system (4.3BSD) loses anything on the line after the
>first null. Therefore, "foo^@^@^@bar" turns into "foo". Presumably, your
>version doesn't do this, else you wouldn't have posted your solution, so you
>have portability concerns. There are still other versions of awk (e.g. GNU
>awk) that keep nulls intact.
>
Don't know 'bout this one....
> Sixth, the awk code you posted is suboptimal in at least three different
>ways. For example, if your loop runs from 1 to NF, how can i ever be greater
>than NF inside the body of the loop? Here's a piece of code that does the
>same thing (although, like I've said, I don't think it's the right thing to
>do):
>
i cannot be greater than NF. This bozo hosehead here tried to use an assignment operator as an equality operator; in a fit of "what the fu**", I tossed in the > and bingo, it ran.
16 months ago I was a telemonkey (read telemarketer) with <NULL programming experience. Because of the application generators associated with the RDBMS packages I found RDBMS programming to be easy and a LOT more enjoyable than dialing for dollars.
I'm still bumping my way through shell programming; though not an expert, I can do whatever I need to with it. awk is something I just recently started playing with, and the program you saw was my first foray beyond {print "some text", $1} usage.. I'll learn, thanks for the pointers..
> {
>     for (i = 1; i < NF; i++)
>         printf("%s ", $i)
>     printf("%s\n", $NF)
> }
>
>--
>Jonathan Kamens                  USnail:
>MIT Project Athena               11 Ashford Terrace
>jik@Athena.MIT.EDU               Allston, MA 02134
>Office: 617-253-8495             Home: 617-782-0710
Half lug, half nut