Re: [Question] Comparing strings in a file
※ Quoting cp3cp3 (侵掠如火、不動如山):
: Suppose I have a file with data in this format:
: AB EF CCA,XDE,PPC ACE,DDE
: AC DG ACE ACE,DDE,CCA
: DC AS CCA,XDE,PPC,FDS,JKL CCA,XDE,PPC,FDS
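: The object in the first column corresponds to the third column.
: The object in the second column corresponds to the fourth column.
: The number of objects in the third and fourth columns is not fixed, but it is always at least 1.
: I want to use the third and fourth columns to compute the size of the intersection and the union for each record.
: What is a good way to write this?
: Thanks!
Here is my attempt. I'm not sure whether I misunderstood the question @_@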
======================Code=========================
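#!/usr/bin/env perl
use strict;
use warnings;

# Map each name (columns 1 and 2) to its list of objects
# (columns 3 and 4 respectively).
open my $filehandle, '<', 'testfile.txt'
    or die "Cannot open testfile.txt: $!";
my %records;
while (<$filehandle>) {
    next unless /\S/;        # skip blank lines
    my @cols = split ' ';    # split on whitespace, ignoring leading blanks
    $records{$cols[0]} = [split /,/, $cols[2]];
    $records{$cols[1]} = [split /,/, $cols[3]];
}
close $filehandle;

generate_data(%records);

# Print the union and intersection sizes for every pair of names.
sub generate_data {
    my %records = @_;
    my @names = sort keys %records;
    print "Record A\tRecord B\tUnion\tIntersection\n";
    for my $first (0 .. $#names - 1) {
        for my $second ($first + 1 .. $#names) {
            my %count;
            my $firstname  = $names[$first];
            my $secondname = $names[$second];
            # Count how many of the two lists each object appears in
            # (assumes no object is repeated within a single list).
            for (@{ $records{$firstname} }, @{ $records{$secondname} }) {
                $count{$_}++;
            }
            printf "%8s\t%8s\t%5d\t%12d\n",
                $firstname, $secondname,
                scalar(keys %count),                           # union
                scalar(grep { $count{$_} > 1 } keys %count);   # intersection
        }
    }
}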
==================Sample Output=========================
Record A  Record B  Union  Intersection
AB        AC            4             0
AB        AS            4             3
AB        DC            5             3
AB        DG            5             1
AB        EF            5             0
AC        AS            5             0
AC        DC            6             0
AC        DG            3             1
AC        EF            2             1
AS        DC            5             4
AS        DG            6             1
AS        EF            6             0
DC        DG            7             1
DC        EF            7             0
DG        EF            3             2
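
In case the question was meant per line instead (each record compares only its own third and fourth columns), here is a minimal sketch under that reading. The helper name `union_and_intersection` is my own invention for illustration, and the three lines are hard-coded from the quoted sample rather than read from a file. It builds one lookup hash per list, so a repeated object within a list is only counted once.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): takes two
# comma-separated lists and returns (union size, intersection size).
sub union_and_intersection {
    my ($list_a, $list_b) = @_;
    my %in_a  = map { $_ => 1 } split /,/, $list_a;
    my %in_b  = map { $_ => 1 } split /,/, $list_b;
    my %union = (%in_a, %in_b);                        # keys merge, duplicates collapse
    my $intersection = grep { $in_b{$_} } keys %in_a;  # grep in scalar context returns a count
    return (scalar(keys %union), $intersection);
}

# Sample lines copied from the quoted question.
my @lines = (
    'AB EF CCA,XDE,PPC ACE,DDE',
    'AC DG ACE ACE,DDE,CCA',
    'DC AS CCA,XDE,PPC,FDS,JKL CCA,XDE,PPC,FDS',
);

print "First\tSecond\tUnion\tIntersection\n";
for my $line (@lines) {
    my ($first, $second, $list_a, $list_b) = split ' ', $line;
    my ($union, $intersection) = union_and_intersection($list_a, $list_b);
    print join("\t", $first, $second, $union, $intersection), "\n";
}
```

Under this reading only three rows come out (AB/EF, AC/DG, DC/AS), and their numbers should agree with the matching pairs in the sample output above.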
--
※ Posted from: 批踢踢實業坊 (ptt.cc)
◆ From: 140.112.214.6