I'm basically writing a better version of sed (I need it for some ETL jobs).
cat file.txt | transform [regex] [replaceor] [regex] [replaceor] [regex replaceor]...
my %transforms = @ARGV; # convert array (regex,replace,regex,replace) pairs into hash
# want something like this:
# my %transforms = map { qr/$_/ => $_[ ]} @ARGV; # grab TWO element of @ARGV at a time
while (my $content = <STDIN>) {
while (my ($scan, $print) = each(%transforms)) {
# $print could be code. Still deciding on that.
my $scan = qr/$scan/; # WANT TO AVOID re-compiling the regex every time
my $transformed = $content =~ s/$scan/$print/re; #
print $transformed;
}
}
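Yes, I could do this by brute force and in many other ways, but grabbing multiple items from an array at a time has come up several times, and I'm wondering whether there is a trick to it. Hmm. What about a double map?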
Answer from a uj5u.com user:
pairmap takes two items from the array at a time:
use List::Util qw(pairmap);
my %transforms = pairmap { qr/$a/, $b } @ARGV;
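To make the pairing behaviour concrete, here is a minimal sketch that runs pairmap over a hard-coded list instead of @ARGV. The sample arguments and the variable names (@args, @via_pairmap, @via_splice, @rest) are made up for illustration. It also shows a core-only alternative using splice, for cases where List::Util's pairmap is not available.

```perl
use strict;
use warnings;
use List::Util qw(pairmap);

# Hypothetical sample arguments standing in for @ARGV.
my @args = ('foo', 'bar', '\d+', 'N');

# pairmap sets $a and $b to each consecutive pair of elements.
my @via_pairmap = pairmap { "$a=>$b" } @args;

# Core-only alternative: splice two elements at a time off a copy.
# The loop ends when splice returns an empty list.
my @rest = @args;
my @via_splice;
while ( my ( $regex, $replacement ) = splice( @rest, 0, 2 ) ) {
    push @via_splice, "$regex=>$replacement";
}

print "$_\n" for @via_pairmap;
```

Both lists end up with one entry per (regex, replacement) pair. Note that splice is destructive, which is why the sketch works on a copy.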
Answer from a uj5u.com user:
You can use
use List::Util qw( pairmap );
my %transforms = pairmap { qr/$a/ => $b } @ARGV;
or
my %paired_args = @ARGV;
my %transforms = map { qr/$_/ => $paired_args{$_} } keys( %paired_args );
But hash keys are always strings. The above is equivalent to
use List::Util qw( pairmap );
my %transforms = pairmap { my $re = qr/$a/; "$re" => $b } @ARGV;
which is effectively equivalent to the following (or something similar):
my %transforms = pairmap { "(?^u:$a)" => $b } @ARGV;
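You want each pattern to be compiled once, but this doesn't achieve that. The compiled regex is turned back into a string when it is used as a hash key, so the qr// only adds an extra compilation, and the pattern still has to be compiled again when it is used in the loop.
You don't actually need to look up values by key, so an array of arrays will do the trick.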
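The stringification can be observed directly. The following is a small sketch (the pattern and variable names are made up for illustration) that stores a qr// object as a hash key and then inspects what comes back out.

```perl
use strict;
use warnings;

# A compiled regex object.
my $re = qr/ab+c/;

# Using it as a hash key stores its string form, not the object.
my %by_pattern = ( $re => 'replacement' );
my ($key) = keys %by_pattern;

print ref($re), "\n";                                  # Regexp
print ref($key) eq '' ? "plain string\n" : "reference\n";  # plain string
```

The key still matches the same text when interpolated into a pattern, but Perl has to compile it again to do so.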
use List::Util qw( pairmap );
my @transforms = pairmap { [ qr/$a/, $b ] } @ARGV;
while ( my $content = <STDIN> ) {
for ( @transforms ) {
my ( $scan, $print ) = @$_;
$content =~ s/$scan/$print/;
}
print $content;
}
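As a worked example of the loop above, here is a sketch that wraps the substitution step in a helper so it can be tried on a fixed string rather than STDIN. The helper name apply_transforms and the sample arguments are assumptions made for illustration, not part of the original answer.

```perl
use strict;
use warnings;
use List::Util qw(pairmap);

# Hypothetical helper: applies every [regex, replacement] pair to one line.
sub apply_transforms {
    my ( $transforms, $line ) = @_;
    for my $pair (@$transforms) {
        my ( $scan, $print ) = @$pair;
        $line =~ s/$scan/$print/;    # first match only, as in the answer
    }
    return $line;
}

# Hypothetical sample arguments standing in for @ARGV.
my @sample_args = ( 'cat', 'dog', '\d+', 'N' );

# Each pattern is compiled once here and kept as a regex object.
my @transforms = pairmap { [ qr/$a/, $b ] } @sample_args;

my $result = apply_transforms( \@transforms, "cat 42 cat 7\n" );
print $result;    # dog N cat 7
```

Because there is no /g modifier, only the first match of each pattern is replaced. Adding /g would make it behave more like sed's g flag.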
Note that I didn't use /e. If you were using /e to get $1 to work, that's not the right approach. Use String::Substitution instead.
For example, you could replace
# Doesn't support $1 and such. Doesn't require `\` to be escaped.
$content =~ s/$scan/$print/;
with
use String::Substitution qw( sub_modify );
# Supports $1 and such. Requires `\` to be escaped.
sub_modify( $content, $scan, $print );
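To see why the plain substitution does not support $1, consider this minimal sketch using core Perl only. The sample pattern, replacement and input are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical inputs: a pattern with two captures and a replacement
# string that refers to them, as it would arrive from the command line.
my $scan    = qr/(\w+) (\w+)/;
my $print   = '$2 $1';
my $content = 'John Smith';

# The value of $print is inserted as-is. It is not interpolated a second
# time, so the capture references stay literal text.
( my $plain = $content ) =~ s/$scan/$print/;

print "$plain\n";    # prints the literal text: $2 $1
```

With sub_modify from String::Substitution, the same $scan and $print would be expected to swap the two words instead, without evaluating the replacement as Perl code the way /ee would.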
