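I have a PHP script that uses the preg_match_all function to return all matches from a text file. However, I want the function to check each line only for a match that starts at position 3 and is 11 characters long (so it ends at position 13), rather than looking for a match anywhere in the line, because searching the whole line returns wrong results.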

Script:

<?php
$file = 'masterfile.out';
$searchfor = '02354098780';

// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');

// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*\$/m";

// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
   echo "Found matches:\n";
   echo substr(implode("\n", $matches[0]),2,11);
   echo substr(implode("\n", $matches[0]),166,11);
}
else{
   echo "No matches found";
}
?>

Sample data from the text file:

I0023540987805R01  ABC                         GHI                       OLirrt                 000000000000000100EA 0812160070451700   1098833   1990041300000001086000000000108600000000000996000000000032100000000000000000000000000000000000000000000000000000000000000000000006589000000000000000                                     P0012B    
 0000002032902R01  DEF                         JKL                       KLijuI                 000000000000000100EA 0812160070451700   1029132   1997010800000002396000000000239600000120002326000000000000000000000000000000000000000000000000000000000000004560000000000000000000000000987600000000                                     A203SD

Reply from a uj5u.com user:

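For a small number of characters, you can anchor the regular expression to the start of the line:

'#^..([0-9]{11})#'

This searches for 11 digits, skipping the first two characters (the two dots) after the start of the line (^), so the match begins at the third character.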

In this case:

<?php
$file      = 'masterfile.out';
// $pattern   = '#^..([0-9]{11})#m'; // Any 11 digits
$pattern   = '#^..(02354098780)#m';   // Exactly these 11

// the following line prevents the browser from parsing this
// as HTML.
header('Content-Type: text/plain');

// get the file contents, assuming the file to be readable (and exists)
$contents = file_get_contents($file);

if (preg_match_all($pattern, $contents, $matches, PREG_PATTERN_ORDER)){
   echo "Found matches:\n";
   echo implode("\n", $matches[1]);
   echo "\n";
} else {
   echo "No matches found\n";
}

Update

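I just noticed that your sequence starts at the third character when counting from 1. Some conventions (and my earlier example) count from 0. So if you count from 1, you need only two dots, not three. In other words, when you say "starting at position 3" you probably mean skipping the first two characters, whereas, as you can see from the other answers, almost everyone assumed you wanted to skip three characters.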
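To make the off-by-one point concrete, here is a small sketch comparing the two readings of "position 3". The test string is just the first 17 characters of the first sample line from the question, and the variable names are made up for illustration.

```php
<?php
// First 17 characters of the first sample line from the question.
$line = 'I0023540987805R01';

// Counting from 1: position 3 is the third character, so skip two.
preg_match('#^..([0-9]{11})#', $line, $skipTwo);

// Counting from 0: position 3 is the fourth character, so skip three.
preg_match('#^...([0-9]{11})#', $line, $skipThree);

echo $skipTwo[1], "\n";   // 02354098780
echo $skipThree[1], "\n"; // 23540987805
```

Only the two-dot version yields the value the question searches for.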

Reply from a uj5u.com user:

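If your example is close to your intended use, you are really searching for an exact substring match, but doing it with preg_match_all. Iterating over the file line by line should use less memory, and a strict substring comparison for exact equality costs less CPU than preg_match_all.

So that is what I suggest. It can be done with fgets or stream_get_line; the latter may be slightly faster (although in most cases this will not matter).

It can be implemented as follows: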

$searchString = 'someFixedString';
$posOffset = 2;
$matchLength = mb_strlen($searchString);
$filePath = '/some/file.path';
$fileHandle = @fopen($filePath, 'r');
$checkedLines = 0;
$matches = [];
$foundMatches = false;

//Depending on what you wish to output
$capturePosOffset = 0;
$captureLength = $matchLength + $posOffset + 3;

// if lines are  no longer than 8192 bytes,
// otherwise set to a value above the byte-length of your lines
$maxBytesToReadPerLine = 0; 

// if file line-terminator is as in PHP, 
// otherwise set to file's line-terminator
$lineTerminator = PHP_EOL;

if ($fileHandle) {
    while (!feof($fileHandle)) {
       $checkedLines++;
       // or just use fgets, which requires no further arguments
       $line = stream_get_line($fileHandle, $maxBytesToReadPerLine, $lineTerminator);
       if (mb_substr($line, $posOffset, $matchLength) === $searchString) { 
           $foundMatches = true;
           $matches[] = $line;
           // Or, if you want to capture a field with a fixed length instead,
           // replace the line above with the following
           // (and adjust the offset and length arguments above):
           // $matches[] = mb_substr($line, $capturePosOffset, $captureLength);
       }
    }
}
if ($foundMatches) {
    echo "Found " . count($matches) . " matches among $checkedLines lines:" . PHP_EOL;
    foreach ($matches as $matchedValue) {
        // I'm not sure what you intend to do here.
        // - In your example code, it appears you
        // implode the array, but then only output
        // 11 characters of the first line starting at position 3.
        // If you want the whole line, you can capture it above
        // and echo it here.

        // Or if you want, you can capture and output the first field
        // by modifying $capturePosOffset and $captureLength
        // by merely echoing the value (and a newline)
        echo '  ' . $matchedValue . PHP_EOL;
    }
} else {
    echo "No matches found!" . PHP_EOL;
}

We use mb_strlen and mb_substr in case the encoding allows multibyte characters; strlen and substr are safe to use only if you know for certain that this is never the case.
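A small sketch of why this matters, assuming UTF-8 input and the mbstring extension. The line below is hypothetical (it is not from the sample file): its first two characters are "é", which takes two bytes each in UTF-8.

```php
<?php
// Hypothetical line: two multibyte characters, then the 11-digit field.
$line = "\u{00E9}\u{00E9}02354098780R01";

// substr counts bytes: offset 2 lands right after the first "é".
$byBytes = substr($line, 2, 11);

// mb_substr counts characters: offset 2 lands after both "é".
$byChars = mb_substr($line, 2, 11, 'UTF-8');

var_dump($byChars === '02354098780'); // bool(true)
var_dump($byBytes === '02354098780'); // bool(false)
```

With purely single-byte data, such as the sample records in the question, both calls return the same field.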

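One should not fall into the trap of premature optimization, but note that which solution is best depends heavily on the file size and the match length.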
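For comparison, a minimal sketch of the same idea using fgets, which the code comment above mentions as an alternative. It assumes single-byte data (so plain substr and strlen are used) and writes a throwaway temp file so that it runs on its own; the two lines in it are shortened stand-ins for the records in the question.

```php
<?php
// Build a small throwaway file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, "I0023540987805R01  ABC\n 0000002032902R01  DEF\n");

$searchString = '02354098780';
$posOffset    = 2; // zero-based offset of "position 3"
$found        = [];

$handle = fopen($path, 'r');
if ($handle !== false) {
    // fgets returns false at end of file, which ends the loop cleanly.
    while (($line = fgets($handle)) !== false) {
        // Compare only the fixed-position field; the rest of the line is ignored.
        if (substr($line, $posOffset, strlen($searchString)) === $searchString) {
            $found[] = rtrim($line, "\r\n");
        }
    }
    fclose($handle);
}
unlink($path);

print_r($found); // contains only the first line
```

Testing the return value of fgets in the loop condition avoids processing a spurious empty read at end of file, which a feof-based loop can produce.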

Reply from a uj5u.com user:

The regular expression below skips the first 3 characters of each line and captures the 11 characters that follow:

https://regex101.com/r/MEaB67/1

/^.{3}(.{11})/gm

Edit

Here is some sample PHP code to test the regular expression:

<pre>
<?php
$pattern = '/^.{3}(.{11})/m';
$subject = '
I0023540987805R01  ABC                         GHI                       OLirrt                 000000000000000100EA 0812160070451700   1098833   1990041300000001086000000000108600000000000996000000000032100000000000000000000000000000000000000000000000000000000000000000000006589000000000000000                                     P0012B    
 0000002032902R01  DEF                         JKL                       KLijuI                 000000000000000100EA 0812160070451700   1029132   1997010800000002396000000000239600000120002326000000000000000000000000000000000000000000000000000000000000004560000000000000000000000000987600000000                                     A203SD   
';
$matches = null;
preg_match_all($pattern, $subject, $matches);
var_dump($matches);
?>
</pre>

Fabio

Reply from a uj5u.com user:

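Here is an approach that differs from yours: since we are looking for a string in a specific part of the line, we can cut away the rest and check whether the string occurs in that part.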

    <?php


$text = "I0023540987805R01  ABC                         GHI                       OLirrt                 000000000000000100EA 0812160070451700   1098833   1990041300000001086000000000108600000000000996000000000032100000000000000000000000000000000000000000000000000000000000000000000006589000000000000000                                     P0012B    
0000002032902R01  DEF                         JKL                       KLijuI                 000000000000000100EA 0812160070451700   1029132   1997010800000002396000000000239600000120002326000000000000000000000000000000000000000000000000000000000000004560000000000000000000000000987600000000                                     A203SD   ";

echo '<pre>';
$txt = explode("\n",$text);

echo '<pre>';
print_r($txt);

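$matches = [];      // matched values, keyed by line index
$matchesLine = [];  // full matching lines, keyed by line index
foreach($txt as $key => $line){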
    $subbedString = substr($line,2,11);

    $searchfor = '02354098780';
    //echo strpos($subbedString,$searchfor); 
    if(strpos($subbedString,$searchfor) === 0){
        $matches[$key] = $searchfor;
        $matchesLine[$key] = $line; /**Save the whole line when match is found. */
        echo "Found in line : $key";
    }

    
}

echo '<pre>';
print_r($matches);

echo '<pre>';
print_r($matchesLine);

This returns:

  Array
(
    [0] => I0023540987805R01  ABC                         GHI                       OLirrt                 000000000000000100EA 0812160070451700   1098833   1990041300000001086000000000108600000000000996000000000032100000000000000000000000000000000000000000000000000000000000000000000006589000000000000000                                     P0012B    
    [1] => 0000002032902R01  DEF                         JKL                       KLijuI                 000000000000000100EA 0812160070451700   1029132   1997010800000002396000000000239600000120002326000000000000000000000000000000000000000000000000000000000000004560000000000000000000000000987600000000                                     A203SD   
)
Found in line : 0
Array
(
    [0] => 02354098780
)
Array
(
    [0] => I0023540987805R01  ABC                         GHI                       OLirrt                 000000000000000100EA 0812160070451700   1098833   1990041300000001086000000000108600000000000996000000000032100000000000000000000000000000000000000000000000000000000000000000000006589000000000000000                                     P0012B    
)

Reply from a uj5u.com user:

You can match 3 characters, then use \K to forget what has been matched so far, and then match 11 digits.

^...\K\d{11}

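^ Start of the string (start of each line when the /m flag is set)
... Match any character except a newline, 3 times
\K Clear the current match buffer
\d{11} Match 11 digits

You can omit preg_quote, because there is nothing to escape in the current pattern.

Because the pattern uses the anchor ^, you must specify the multiline flag /m to get all results.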

$file = 'masterfile.out';
$contents = file_get_contents($file);
$pattern = "/^...\K\d{11}/m";

if (preg_match_all($pattern, $contents, $matches)) {
    echo "Found matches:" . PHP_EOL;
    foreach ($matches[0] as $m) {
        echo $m . PHP_EOL;
    }
} else {
    echo "No matches found";
}

Output

Found matches:
23540987805
00002032902
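
The effect of the /m flag mentioned above can be checked with a short sketch. The subject string here is a shortened stand-in for the two records in the question, not the real file, and the variable names are made up for illustration.

```php
<?php
// Shortened stand-ins for the two records in the question.
$contents = "I0023540987805R01  ABC\n 0000002032902R01  DEF\n";

// Without /m, ^ matches only at the very start of the subject string.
$withoutM = preg_match_all('/^...\K\d{11}/', $contents, $a);

// With /m, ^ also matches right after each internal newline.
$withM = preg_match_all('/^...\K\d{11}/m', $contents, $b);

echo $withoutM, "\n"; // 1
echo $withM, "\n";    // 2

// Because of \K, the three skipped characters are not part of $b[0].
print_r($b[0]);
```

Without the flag only the first record is found, which is why the flag is needed when the whole file is read into one string.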
