andrewl@olivea.olivetti.com (Andrew Law) (07/01/88)
Hello, I have two files:
file a                    file b
with 'string 1' #n        with 'string 1' #1
     'string 5' #n             'string 2' #2
                               'string 3' #3
                               'string 4' #4
                               'string 5' #5
What I would like to do is read files a and b. If a string in a
matches a string in b, the #n in a should be replaced by the
corresponding # number from file b.
Any suggestions on how to do this with shell scripts, awk, or sed?
Please send mail to andrewl@olivea.olivetti.com
Thank you very much.

smileyf@ucscb.UCSC.EDU (Shutoku Shia) (07/03/88)
Here's one way to do it using awk. I had to make several assumptions
about the input format in order to do it:
(1) The two input files are named "file.a" and "file.b" (double quotes
not included), so the awk script is invoked like this:
% awk -f fun.awk file.a file.b
(2) The field delimiter in awk is set to "' " (double quotes not
included), so a string cannot contain "' ".
(*) A newer version of awk exists, but I tested the following awk
script with the old version on BSD UNIX 4.3 running on a
VAX 11/750.
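A note on assumption (2) for anyone running this with a newer awk: in a POSIX awk, a separator longer than one character is treated as a regular expression, so "' " matches only a quote followed by a space. The sample output further down (": #1", with a space after the colon) suggests that the old awk used here split on the single quote alone, though that is an inference from the output rather than something tested. The sketch below, with a made-up helper name show_fields, prints the first three fields of one sample line under each separator so the difference can be checked locally.

```shell
# Hypothetical helper: show_fields SEPARATOR
# Splits one sample line on SEPARATOR and prints the first three fields,
# each wrapped in brackets so leading and trailing blanks stay visible.
show_fields() {
    printf "with 'string 1' #1\n" |
        awk -F"$1" '{ print "[" $1 "] [" $2 "] [" $3 "]" }'
}

show_fields "'"     # separator is the single quote only
show_fields "' "    # separator is a quote followed by a space
```

With a POSIX awk the first call gives [with ] [string 1] [ #1] and the second gives [with 'string 1] [#1] [], so on such an awk the script needs FS set to the single quote for $2 to hold the string.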
% cat fun.awk
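BEGIN {
    file_a = "file.a"
    file_b = "file.b"
    FS = "' "
    list_a_size = 1        # next free index in list_a
    list_b_size = 1        # next free index in list_b and number
}
# Lines from file.a: remember each quoted string.
FILENAME == file_a {
    list_a[list_a_size++] = $2
}
# Lines from file.b: remember each quoted string and the number after it.
FILENAME == file_b {
    list_b[list_b_size] = $2
    number[list_b_size++] = $3
}
# Print "string:number" for every string in file.a that is also in file.b.
END {
    for (a = 1; a < list_a_size; ++a) {
        for (b = 1; b < list_b_size; ++b)
            if (list_a[a] == list_b[b])
                print list_a[a] ":" number[b]
    }
}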
% cat file.a
with 'string 1' #5
'string 5' #1
% cat file.b
with 'string 1' #1
'string 2' #2
'string 3' #3
'string 4' #4
'string 5' #5
% awk -f fun.awk file.{a,b}
string 1: #1
string 5: #5
%
%
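The script above reports the matching pairs but does not rewrite file a, which is what the original question asked for. A minimal sketch of the substitution itself follows. It assumes a POSIX awk (for the "in" operator), exactly one quoted string per line, and a non-empty file b; the function name fill_numbers is made up for illustration.

```shell
# Hypothetical helper: fill_numbers FILE_A FILE_B
# Prints FILE_A with the text after each quoted string replaced by the
# text that FILE_B lists after the same quoted string.
fill_numbers() {
    awk -F"'" '
        # First file on the command line (file b): store number by string.
        NR == FNR { if (NF >= 3) num[$2] = $3; next }
        # Second file (file a): swap in the stored number for a known string.
        NF >= 3 && ($2 in num) { print $1 FS $2 FS num[$2]; next }
        # No quoted string or no match: pass the line through unchanged.
        { print }
    ' "$2" "$1"
}
```

Calling fill_numbers file.a file.b writes the updated copy of file.a to standard output; redirect it to a new file rather than onto file.a itself. Reading file b first lets a single lookup table replace the nested loops.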
Shutoku Shia
-----------------------------------------------------------------------
| Bitnet: smileyf@ucscf.bitnet | formerly in |
| Internet: smileyf@ucscf.UCSC.EDU | Dept. of Cmp. & Info. Sci. |
| Arpanet: smileyf@ucscf.UCSC.EDU | Univ. of Calif., Santa Cruz |
| Uucp: ...!ucbvax!ucscc!ucscf!smileyf | |
-----------------------------------------------------------------------