I want to add spell checking to my Perl program. Text::Aspell looks like it should do what I need, but it only offers a function for checking a single word.
use strict;
use warnings;
use Text::Aspell;
my $input = "This doesn't look too bad. Me&you. with/without. 1..2..3..go!";
my $aspell = Text::Aspell->new();
$aspell->set_option('lang', 'en');
print "$input: ", $aspell->check($input), "\n";
This prints:
This doesn't look too bad. Me&you. with/without. 1..2..3..go!: 0
Clearly it only accepts single words, so how do I split the text into words? A simple split on whitespace:
foreach my $word (split /\s/, $input) {
next unless($word =~ /\w/);
print "$word: ", $aspell->check($word), "\n";
}
This runs into trouble wherever punctuation is not surrounded by whitespace:
This: 1
doesn't: 1
look: 1
too: 1
bad.: 0
Me&you.: 0
with/without.: 0
1..2..3..go!: 0
I suppose I could list the punctuation characters explicitly:
foreach my $word (split qr{[,.;!:\s#"\?&%@\(\)\[\]/\d]}, $input) {
next unless($word =~ /\w/);
print "$word: ", $aspell->check($word), "\n";
}
This produces reasonable output:
This: 1
doesn't: 1
look: 1
too: 1
bad: 1
Me: 1
you: 1
with: 1
without: 1
go: 1
But it seems clumsy, and I wonder whether there is an easier way (less code for me to write, and less brittle).
How do I spell-check text?
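Answer from a uj5u.com user:
Text::Aspell has no option to check a whole string; it only checks single words. Rather than splitting the string yourself, I suggest using a module that already does this for you, such as Text::SpellChecker. For example: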
use strict;
use warnings;
use Text::SpellChecker;
use feature 'say';
my $input = "This doesn't look too bad. Me&you. with/without. 1..2..3..go!";
my $checker = Text::SpellChecker->new(text => $input);
$checker->set_options(aspell => { 'lang' => 'en' });
while (my $word = $checker->next_word) {
say "Invalid word: $word";
}
Or, to test only whether the string contains any misspelled word:
my $checker = Text::SpellChecker->new(text => $input);
$checker->set_options(aspell => { 'lang' => 'en' });
if ($checker->next_word) {
say "The string is not valid.";
} else {
say "The string is valid.";
}
The module's documentation shows how to replace misspelled words interactively:
while (my $word = $checker->next_word) {
print $checker->highlighted_text,
"\n",
"$word : ",
(join "\t", @{$checker->suggestions}),
"\nChoose a new word : ";
chomp (my $new_word = <STDIN>);
$checker->replace(new_word => $new_word) if $new_word;
}
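If you would rather check each word of the input string yourself, you can look at how Text::SpellChecker splits a string into words (this is done by the next_word function). It uses the following regular expression: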
while ($self->{text} =~ m/\b(\p{L}+(?:'\p{L}+)?)/g) {
...
}
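As a minimal sketch of doing this yourself, the pattern above can be applied in list context to collect every word, and each word can then be passed to a checker. The helper name misspelled_words and the stand-in dictionary %known are hypothetical and exist only so the example runs without a dictionary installed; the pattern assumes the `+` quantifiers shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: extract words with a letters-plus-optional-apostrophe
# pattern and return the ones that the supplied checker rejects.
sub misspelled_words {
    my ($text, $is_correct) = @_;
    # In list context, a /g match returns every captured word.
    my @words = $text =~ /\b(\p{L}+(?:'\p{L}+)?)/g;
    return grep { !$is_correct->($_) } @words;
}

# Stand-in checker for illustration only (an assumption, not a real dictionary).
my %known = map { $_ => 1 } qw(this doesn't look too bad);
my @bad = misspelled_words("This doesn't look too bda.",
                           sub { $known{ lc $_[0] } });
print "Misspelled: @bad\n";
```

With Text::Aspell, the callback would presumably be sub { $aspell->check($_[0]) }, using the $aspell object from the question.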
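Answer from a uj5u.com user:
The following snippet splits the sentence into words using a regular expression that matches runs of characters that are neither letters nor an apostrophe (').
You can extend the regular expression as you see fit.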
use strict;
use warnings;
use Text::Aspell;
my $regex = qr/[^'a-z]+/i;
my $input = "This doesn't look too bad. Me&you. with/without. 1..2..3..go!";
my $aspell = Text::Aspell->new();
$aspell->set_option('lang', 'en');
printf "%s: %d\n", $_, $aspell->check($_) for split($regex, $input);
Output:
This: 1
doesn't: 1
look: 1
too: 1
bad: 1
Me: 1
you: 1
with: 1
without: 1
go: 1
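One thing to watch for with a split-based approach: when the text begins with a separator (a quote or a digit, say), split returns an empty leading field, which would then be passed to the spell checker. The sketch below filters those out and uses \p{L} rather than a-z so that non-ASCII letters are not treated as separators. The helper name split_words is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split on runs of characters that are neither
# Unicode letters nor apostrophes, then drop any empty fields.
sub split_words {
    my ($text) = @_;
    return grep { length } split /[^'\p{L}]+/, $text;
}

# The leading double quote would otherwise yield an empty first field.
my @words = split_words(q{"Hello," she said. It's fine!});
print join('|', @words), "\n";
```

Each element of the returned list can be handed to $aspell->check in the same way as in the snippet above.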
