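I'm on an Ubuntu system, in a Bash shell, trying to use grep to find all occurrences of a substring, say engineBreakdown(), in a log file with the .tra extension, say my_log_16.tra, and save the results to a file, say results_16.txt.
So I run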
cat /path/to/my_log_16.tra | grep "engineBreakdown()" > results_16.txt
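When I run less results_16.txt, I do see some lines containing the substring saved in it, but they are not all the lines I expect.
In fact, when I manually search my_log_16.tra for occurrences of engineBreakdown(), I see other lines containing the substring that were not saved to results_16.txt. So it seems my command only saves the first occurrences of the substring.
I think grep might be failing because my_log_16.tra is a very large file (about 100 MB).
If that is the reason, is there a more reliable way to find all occurrences of a substring in a very large file?
grep version and alias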
$ grep --version
grep (GNU grep) 2.25
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
$ type -a grep
grep is aliased to `grep --color=auto'
grep is /bin/grep
Example lines from my_log_16.tra
Lines correctly detected and saved to results_16.txt
[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.846 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.848 Rservice:75] engineBreakdown()
A part of the file where the substring occurs, but which was not saved to results_16.txt
[I 2022-10-16 11:32:48.039 web:2064] 200 GET /static/ui-src/default/img/Customer.png?v=0.9702853857687699 (127.0.0.1) 10.49ms
[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.123 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:55.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.134 Rservice:75] engineBreakdown()
Another part of the file where the substring occurs, but which was not saved to results_16.txt
[I 2022-10-17 04:00:35.127 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:35.138 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:39.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:39.220 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:39.228 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.233 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.237 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.243 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:40.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:40.128 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:40.133 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:44.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:44.221 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:44.227 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.232 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.234 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.237 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:45.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:45.126 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:45.128 Rservice:75] engineBreakdown()
Update 1
I also tried
grep "engineBreakdown()" /path/to/my_log_16.tra > results_16.txt
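but the result is the same.
Update 2
As suggested, double quotes may not be enough to handle the parentheses correctly, so I removed the parentheses from the search substring and also tried single quotes instead of double quotes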
grep "engineBreakdown" /path/to/my_log_16.tra > results_16.txt
grep 'engineBreakdown' /path/to/my_log_16.tra > results_16.txt
but the result is the same.
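For what it's worth, the parentheses are unlikely to be the problem: in grep's default basic regular expressions an unescaped ( or ) is a literal character, and -F goes one step further by treating the whole pattern as a fixed string. A minimal sketch, using a hypothetical temporary sample file rather than the real log:

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for my_log_16.tra.
sample=$(mktemp)
printf '%s\n' \
  '[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()' \
  '[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws' \
  '[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()' > "$sample"

# -F: match the pattern as a fixed string; -c: print the number of matching lines.
fixed_count=$(grep -F -c 'engineBreakdown()' "$sample")
# Default (basic regex) mode: the parentheses are literal here as well.
bre_count=$(grep -c 'engineBreakdown()' "$sample")

echo "fixed-string: $fixed_count, basic regex: $bre_count"
rm -f "$sample"
```

Both counts agree on this sample, which suggests the quoting style and the parentheses do not explain the missing lines.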
A uj5u.com user replied:
You can try awk, if that helps.
Data
$ cat file
engineBreakdown()
engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()
engineBreakdown()
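$ awk -v var="engineBreakdown()" '
# Only process lines that match the pattern
$0~var{
printf "%s", NR
# Count the whitespace-separated fields that match
for(i=1;i<=NF;i++){
if($i~var){x++}
}
print " # matches: " x
# Reset the counter for the next line
x=0
}' file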
1 # matches: 1
2 # matches: 4
3 # matches: 1
If you just want to print the matching lines without counting the substrings (as grep does), this is enough
$ awk -v var="engineBreakdown()" '$0~var{ print }' file
engineBreakdown()
engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()
engineBreakdown()
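The loop above counts whitespace-separated fields, so it relies on occurrences being separated by spaces. As a variation (a sketch of my own, not part of the answer), gsub() can count occurrences directly because it returns the number of replacements it made. The sample file below is a hypothetical copy of the data shown above.

```shell
#!/usr/bin/env bash
# Hypothetical sample mirroring the "file" used in the answer.
sample=$(mktemp)
printf '%s\n' \
  'engineBreakdown()' \
  'engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()' \
  'engineBreakdown()' > "$sample"

# gsub() returns the number of replacements; the replacement text & puts each
# match back unchanged, so n is simply the number of occurrences on the line.
counts=$(awk '{ n = gsub(/engineBreakdown\(\)/, "&") } n { print NR " # matches: " n }' "$sample")

echo "$counts"
rm -f "$sample"
```

Here the parentheses are escaped, so the regex matches the literal text engineBreakdown() rather than treating () as an empty group.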
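A uj5u.com user replied:
It seems your grep command is misbehaving (possibly because the old version you are using has a bug that was fixed later).
Here is an alternative using sed: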
sed -n '/engineBreakdown()/p' /path/to/my_log_16.tra > op.txt
I suggest updating your grep installation. ripgrep is another option.
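Before replacing grep, it may be worth comparing line counts from the tools mentioned here. The sketch below also tries grep -a (--text). That part is an assumption on my side and is not confirmed anywhere in the question: GNU grep suppresses further output and reports that a binary file matches once it decides the input is binary, for example after a NUL byte, and -a tells it to treat the input as text. The sample file is hypothetical and deliberately contains a NUL byte between two matches.

```shell
#!/usr/bin/env bash
# Hypothetical sample: two matching lines with a NUL byte in between.
sample=$(mktemp)
printf 'a engineBreakdown()\n\0junk\nb engineBreakdown()\n' > "$sample"

# -a: process the file as text even if it looks binary; -F: fixed string; -c: count lines.
text_count=$(grep -a -F -c 'engineBreakdown()' "$sample")
# The sed command from the answer, with its output counted by wc -l.
sed_count=$(sed -n '/engineBreakdown()/p' "$sample" | wc -l)

echo "grep -a: $text_count, sed: $sed_count"
rm -f "$sample"
```

If the counts on the real log differ between plain grep and grep -a, stray binary data in the file would be a likely explanation; if they agree, this assumption can be ruled out.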
