vwa0201@marst2 (Larry Baca) (02/28/91)
I want to do a quick generic search of a rather large data base and I want
to do the search based on certain record cols. If I have records that look
like this:

abc defghi j klmnop qrst uvw x y z
a bc defg hij lmnop qrst uvw x yz
ab c defg hi j klmn op qrst u vwxyz

And say I want to find only the records with (a) in col1, (nop) in col19-21,
(v) in col29 and (y) in col34.

I want to do this in a script where the record cols and params are left to
the user's choice.

I tried doing this with:

--
--
while true
do
    a=`line <file` || break
    do cuts of $a and compare to given params.....
--
--

But this was slower than Saddam's SCUDS. Maybe 'C' is a better way to go, but
if it can be done with AWK and is still reasonably fast, I would like to know
about it.

Thank you for any ideas you may have.
--
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
LARRY BACA, marst2!lbaca
DAASO-VWA AIS, DEFENSE AUTOMATIC ADDRESSING OFFICE, WESTERN DIVISION
DDTC TRACY, TRACY CA. 95376-5057
AUTOVON 462-9391 COMMERCIAL 832-9391
jik@athena.mit.edu (Jonathan I. Kamens) (02/28/91)
It seems to me that if your database contains simply lines with certain
characters in each column, and you want to search for lines matching a
specified pattern of characters, the simplest thing to use is grep.

You gave an example of finding "only the records with (a) in col1, (nop) in
col19-21, (v) in col29 and (y) in col34." How about:

grep "^a.................nop.......v....y" database-filename

Read the man page for grep if you don't understand the periods in the
regular expression above.
--
Jonathan Kamens                 USnail:
MIT Project Athena              11 Ashford Terrace
jik@Athena.MIT.EDU              Allston, MA 02134
Office: 617-253-8085            Home: 617-782-0710
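
Each period in a pattern like this stands for exactly one column whose
contents do not matter, so the number of periods between two literals is the
gap between the columns they occupy. The sketch below only illustrates that
counting; the dots helper, the pattern variable and the two sample records
are made up for the illustration and are not part of the original post.

```shell
# dots N: print N periods, one per column whose contents do not matter.
dots() {
    printf "%${1}s" '' | tr ' ' '.'
}

# col1 = a, cols 2-18 = anything, cols 19-21 = nop,
# cols 22-28 = anything, col29 = v, cols 30-33 = anything, col34 = y
pattern="^a$(dots 17)nop$(dots 7)v$(dots 4)y"

# Two made-up records: the first meets the specification, the second
# has its "v" in column 30 instead of column 29.
sample_records() {
    printf '%-18snop%7sv%4sy\n' a '' ''
    printf '%-18snop%8sv%3sy\n' a '' ''
}

# Only the first record should be printed.
sample_records | grep "$pattern"
```

Because the pattern is anchored with ^, a literal that sits even one column
off its required position makes the whole line fail to match.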
tchrist@convex.COM (Tom Christiansen) (02/28/91)
Sounds like a job for perl.

If you don't like this, enjoy your C program.

--tom
--
"UNIX was not designed to stop you from doing stupid things, because
that would also stop you from doing clever things." -- Doug Gwyn

Tom Christiansen tchrist@convex.com convex!tchrist
haroldt@paralandra.yorku.ca (Harold Tomlinson) (02/28/91)
Sorry to post this to the net, but I could not reach the above address
via email.
In article <354@marst2> vwa0201@marst2 (Larry Baca) writes:
:> I want to do a quick generic search of a rather large data base and I want
:> to do the search based on certain record cols. If I have records that look
:> like this:
:>
:> abc defghi j klmnop qrst uvw x y z
:> a bc defg hij lmnop qrst uvw x yz
:> ab c defg hi j klmn op qrst u vwxyz
:>
:> And say I want to find only the records with (a) in col1, (nop) in col19-21,
:> (v) in col29 and (y) in col34.
:>
:> I want to do this in a script where the record cols and params are left to
:> to the users choice.
:>
----- Transcript of session follows -----
550 <vwa0201@marst2>... Host unknown
----- Unsent message follows -----
Received: by paralandra.yorku.ca (5.57/Ultrix3.0-C)
id AA17172; Thu, 28 Feb 91 08:55:21 EST
To: vwa0201@marst2
Subject: Awk db search question.
Date: Thu, 28 Feb 91 08:55:18 -0500
From: haroldt@paralandra.yorku.ca
I don't think I fully understood what you were asking. Did you want
column input (as in SAS column input) or variable columns?
You wrote:
abc defghi j klmnop qrst uvw x y z
a bc defg hij lmnop qrst uvw x yz
ab c defg hi j klmn op qrst u vwxyz
Let's say there is a col1 (a string). What did you want in col1 for
each of these rows? abc, a, ab? or a,a,a?
May I suggest that you look into the substring function (substr) in AWK.
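
A minimal sketch of how substr could drive such a search, assuming
fixed-column records and search strings that contain no spaces. The function
name colsearch and its COL=STRING argument form are hypothetical, chosen only
for this illustration.

```shell
# colsearch FILE COL=STRING ...
# Print the records of FILE in which every STRING starts at its COL.
# Assumptions: columns count from 1; STRING contains no spaces.
colsearch() {
    file=$1
    shift
    awk -v specs="$*" '
        BEGIN {
            # Split "1=a 19=nop ..." into column numbers and wanted strings.
            n = split(specs, pair, " ")
            for (i = 1; i <= n; i++) {
                eq = index(pair[i], "=")
                col[i] = substr(pair[i], 1, eq - 1) + 0
                want[i] = substr(pair[i], eq + 1)
            }
        }
        {
            # Skip the record as soon as one column test fails.
            for (i = 1; i <= n; i++)
                if (substr($0, col[i], length(want[i])) != want[i])
                    next
            print
        }
    ' "$file"
}
```

The search from the original question would then be written as
colsearch database-filename 1=a 19=nop 29=v 34=y, and the columns and strings
can be changed on each call without editing the script.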
======================================================================
=== Harold Tomlinson ===
== Computing and Communications Services ==
= YORK UNIVERSITY =
= haroldt@paralandra.yorku.ca =
= 416- 736-5257-33802 =
======================================================================
campbell@lotus.com (Jim Campbell) (03/09/91)
Hi, I am posting this response to the net because "marst2" is unknown any
way I try it ......

In article <354@marst2> you write:
>I want to do a quick generic search of a rather large data base and I want
>to do the search based on certain record cols. If I have records that look
>like this:
>
>abc defghi j klmnop qrst uvw x y z
>a bc defg hij lmnop qrst uvw x yz
>ab c defg hi j klmn op qrst u vwxyz
>
>And say I want to find only the records with (a) in col1, (nop) in col19-21,
>(v) in col29 and (y) in col34.
>
>I want to do this in a script where the record cols and params are left to
>to the users choice.
>
>I tried doing this with:
>
>--
>--
>while true
>do
> a=`line <file` || break
> do cuts of $a and compare to given params.....
>

You don't need awk, or any Scuds, to do this.... Try this:

grep '^a.\{17\}nop.\{7\}v.\{4\}y' <file>

Note that none of the lines you gave fall into the column specifications you
gave. I modified the second line by removing one of the spaces preceding
the "u", which made the second line conform to your specification, and
this grep command matched that line. Here is my input file:

1234567890123456789012345678901234567
abc defghi j klmnop qrst uvw x y z
a bc defg hij lmnop qrst uvw x yz
ab c defg hi j klmn op qrst u vwxyz

Bonne chance!
--
Jim Campbell, Lotus Development Corporation | harvard!ima \
1 Rogers St., Cambridge, MA 02142           |              >!lotus!campbell
617/693-5652                                | uunet       /
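
Since the original question asked for the columns and strings to be left to
the user, one possible way to get there is to generate an interval pattern of
this kind from COL=STRING arguments. This is only a sketch: the name
mkpattern and its argument form are hypothetical, the arguments must be given
in ascending, non-overlapping column order, and each STRING is copied into
the pattern unescaped, so regular-expression metacharacters in it keep their
special meaning.

```shell
# mkpattern COL=STRING ...
# Print an anchored basic regular expression that requires each STRING
# to start at its COL (columns count from 1).
mkpattern() {
    pat='^'
    pos=1                       # next column the pattern will describe
    for spec in "$@"; do
        col=${spec%%=*}         # text before the first "="
        str=${spec#*=}          # text after the first "="
        gap=$((col - pos))      # number of "don't care" columns to skip
        if [ "$gap" -lt 0 ]; then
            echo "mkpattern: columns must ascend without overlap" >&2
            return 1
        fi
        if [ "$gap" -gt 0 ]; then
            pat="${pat}.\\{${gap}\\}"
        fi
        pat="${pat}${str}"
        pos=$((col + ${#str}))
    done
    printf '%s\n' "$pat"
}
```

A search would then look like
grep "$(mkpattern 1=a 19=nop 29=v 34=y)" database-filename, which leaves the
actual scanning to a single grep process rather than a shell loop.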