Split a file based on the first character of a word - 有解無憂

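I want to split a file based on the first character of each line and create output files named after that character. This is what I am doing: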

awk '{print > substr($0, 1, 1)}' "$File"

But awk fails with 'fatal: expression for `>' redirection has null string value'. The file contains some blank lines. How can I ignore blank lines while splitting?
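
To see where the error comes from, it can help to print what substr($0, 1, 1) returns for each line. The sketch below uses a made-up three-line input (not the file from the question); the empty middle line yields an empty string, and an empty string is what awk refuses to use as a redirection target.

```shell
# Hypothetical three-line input with an empty line in the middle.
first_chars=$(printf 'abc\n\nxyz\n' | awk '{ printf "[%s]\n", substr($0, 1, 1) }')
printf '%s\n' "$first_chars"
# prints:
# [a]
# []
# [x]
```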

The content of $File is:

100009-01  -- This should go in file named 1
200009-01  -- This should go in file named 2
300009-01  -- This should go in file named 3
400009-01
500009-01
600037-01
700037-01
800037-01
900037-01
100037-01  -- This should go in file named 1
A0037-02_  -- This should go in file named A
a00037-02  -- This should go in file named a
c00037-02
B00037-02
200037-02

It should generate a file named "1", and all lines starting with 1 should go into that file.

Thanks

Reply from a uj5u.com user:

With your shown samples, please try the following awk code.

sort -k1.1 Input_file | 
awk '
!NF{ next }
{
  currentFile=substr($1,1,1)
}
prev!=currentFile{
  close(prev)
}
{
  print > (currentFile)
  prev=currentFile
}
'

Explanation: a detailed walkthrough of the code above.

sort -k1.1 Input_file |         ## Sort Input_file so lines with the same first character are adjacent.
awk '                           ## Pipe the sorted lines into awk.
!NF{ next }                     ## Skip empty (or whitespace-only) lines.
{
  currentFile=substr($1,1,1)    ## Set currentFile to the first character of the first field.
}
prev!=currentFile{              ## If the first character changed since the previous line:
  close(prev)                   ## close the previous output file so open files do not pile up.
}
{
  print > (currentFile)         ## Write the current line to the file named by currentFile.
  prev=currentFile              ## Remember currentFile as prev for the next line.
}
'
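
One caveat worth checking with this approach: because each output file is closed and would later be reopened with `>`, a file gets truncated if the same first character shows up again after a different one. That should not happen when lines are grouped by first byte, but in some locales the collation used by sort can interleave upper-case and lower-case keys. A minimal sketch that forces byte order with LC_ALL=C, run in a scratch directory on a few made-up lines (the directory, the sample data, and the extra `prev != ""` guard are assumptions for illustration, not part of the answer above):

```shell
# Work in a throwaway directory so the generated files do not clutter anything.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Made-up sample with an empty line and mixed-case first characters.
printf '%s\n' '100009-01' 'a00037-02' '' 'A0037-02_' '100037-01' 'a00099-03' > Input_file

# LC_ALL=C makes sort compare bytes, so equal first characters stay adjacent.
LC_ALL=C sort -k1.1 Input_file |
awk '
!NF { next }                                        # skip empty lines
{ currentFile = substr($1, 1, 1) }                  # first character of the first field
prev != "" && prev != currentFile { close(prev) }   # new group: close the old file
{ print > (currentFile); prev = currentFile }
'

cat 1
# prints:
# 100009-01
# 100037-01
```

The sort step also means that lines within each output file end up in sorted order rather than in their original order.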

Reply from a uj5u.com user:

Here is how to do it with bash:

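while read -r line; do
    [ -n "$line" ] || continue                 # skip blank lines (an empty file name is an error)
    printf '%s\n' "$line" >> "${line:0:1}"     # append the line to the file named by its first character
done < "$File"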

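Reply from a uj5u.com user:

The file contains some blank lines. How can I ignore blank lines while splitting?

If that is the only problem, you can simply fix your code like this: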

awk '$0!=""{print > substr($0, 1, 1)}' "$File"

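Explanation: I added a condition to your action. It is true when the whole line ($0) is not equal (!=) to the empty string (""), so empty lines are ignored.

Reply from a uj5u.com user:

Here is a small update to the original code: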

awk 'NF{print > substr($1, 1, 1)}' "$File"

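Because awk works with (pattern){action} rules, the action is taken when the pattern evaluates to a non-zero or non-empty value. NF holds the number of fields in the current record (line). By using NF as the pattern, awk performs the action only when the current line contains at least one non-whitespace character.

In addition, we use $1 instead of $0. This covers lines that start with whitespace: we take the first character of the first field rather than of the raw line.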
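
The difference between the two conditions shows up on lines that contain only whitespace: $0 != "" still matches them, while NF does not. A small sketch on made-up input (line 2 is empty, line 3 holds two spaces, line 4 is indented):

```shell
# Made-up input: a normal line, an empty line, a whitespace-only line, an indented line.
input=$(printf 'abc\n\n  \n  xyz\n')

# Collect the line numbers accepted by each condition.
nonempty=$(printf '%s\n' "$input" |
  awk '$0 != "" { out = out sep NR; sep = " " } END { print out }')
has_fields=$(printf '%s\n' "$input" |
  awk 'NF { out = out sep NR; sep = " " } END { print out }')

echo "\$0 != \"\" matches lines: $nonempty"     # 1 3 4
echo "NF matches lines: $has_fields"           # 1 4

# On an indented line, $0 and $1 give different first characters.
indented=$(printf '  xyz\n' | awk '{ print "[" substr($0, 1, 1) "] [" substr($1, 1, 1) "]" }')
echo "$indented"                               # [ ] [x]
```

With substr($0, 1, 1), the whitespace-only and indented lines would be written to a file whose name is a single space, which is rarely what is wanted.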

Reply from a uj5u.com user:

I don't know how to put this into a single shell script, but you can build on the following:

cut -c 1 test.txt | sort | uniq

This gives the list of first characters that occur in the file (which is also the list of file names you will create).

grep "^1" test.txt

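This gives you all lines of the file that start with "1".

Note: do not use a>file, because that always deletes and recreates your file. I suggest a>>file instead, which creates the file if it does not exist and appends to it otherwise.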
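
A quick way to see the difference between the two redirections in the shell, using two hypothetical file names in a scratch directory. Note that this describes the shell; inside a single awk run, print > file truncates only when the file is first opened and keeps appending while it stays open.

```shell
# Work in a throwaway directory; the file names below are made up for the demo.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

echo first  >  truncated.txt
echo second >  truncated.txt   # ">" empties the file before writing

echo first  >> appended.txt    # ">>" creates the file if it is missing
echo second >> appended.txt    # and adds to the end otherwise

cat truncated.txt   # prints: second
cat appended.txt    # prints: first, then second
```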

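So, as a shell loop, you should get something like:

# Loop over each distinct first character; empty lines contribute nothing.
# Assumes these characters are not regex or glob metacharacters.
for a in $(cut -c 1 test.txt | sort -u); do
  grep "^$a" test.txt >> "$a"
done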

Reply from a uj5u.com user:

A simple awk approach is:

awk '/^[^[:blank:]]/{
    fn=substr($0,1,1)
    if (fn in seen)
        print >>(fn)
    else
        print >(fn)
    close(fn)
    seen[fn]
}' file

You can test it:

a=( $(cut -c 1 file | sort | uniq) )
head $(printf "%s " "${a[@]}")
==> 1 <==
100009-01  -- This should go in file named 1
100037-01  -- This should go in file named 1

==> 2 <==
200009-01  -- This should go in file named 2
200037-02

==> 3 <==
300009-01  -- This should go in file named 3

==> 4 <==
400009-01

==> 5 <==
500009-01

==> 6 <==
600037-01

==> 7 <==
700037-01

==> 8 <==
800037-01

==> 9 <==
900037-01

==> A <==
a00037-02  -- This should go in file named a

==> B <==
B00037-02

==> a <==
a00037-02  -- This should go in file named a

==> c <==
c00037-02

Note: this was run on a Mac, where the filesystem is case-insensitive by default, so file A is the same file as a...
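
If it is unclear whether a given system behaves this way, a small probe can tell. This is only a sketch: the variable names are made up, and mktemp -d uses the system temporary location, which may sit on a different filesystem than the directory that will hold the output files, so for a definitive answer the probe directory should be created there instead.

```shell
# Create a scratch directory and a file with a lower-case name in it.
probe_dir=$(mktemp -d) || exit 1
: > "$probe_dir/a"

# If the upper-case name resolves too, "a" and "A" are the same file here.
if [ -e "$probe_dir/A" ]; then
    fs_case="case-insensitive"
else
    fs_case="case-sensitive"
fi
rm -r "$probe_dir"

echo "The probed filesystem is $fs_case"
```

On a case-insensitive filesystem, lines starting with "A" and "a" cannot be kept in separate files named after those characters, so a different naming scheme would be needed.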

Please credit the source when reposting. Link to this article: https://www.uj5u.com/caozuo/447733.html

Tags: bash, shell, awk