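I have a .csv in which each row corresponds to a person (first column), followed by attributes and the values available for that person. I want to extract the name and value of one specific attribute for every person who has that attribute. The file is structured as follows: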

name,attribute1,value1,attribute2,value2,attribute3,value3
joe,height,5.2,weight,178,hair,
james,,,,,,
jesse,weight,165,height,5.3,hair,brown
jerome,hair,black,breakfast,donuts,height,6.8

I would like a file that looks like this:

name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8

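Using this earlier post, I tried several different awk approaches, but I still can't get both the first column and whichever column holds the desired attribute and its value (height, say). For example, the following returns everything.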

awk -F "height," '{print $1 "," FS$2}' file.csv

I could grep for only the lines that contain height first, but I would prefer to do it all in one line if possible.

A uj5u.com user replied:

You can use this awk:

cat attrib.awk

BEGIN {
   FS=OFS=","
   print "name,attribute,value"
}
NR > 1 && match($0, k "[^,]+") {
   print $1, substr($0, RSTART+1, RLENGTH-1)
}

# then run it as
awk -v k=',height,' -f attrib.awk file

name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8

# or this one
awk -v k=',weight,' -f attrib.awk file

name,attribute,value
joe,weight,178
jesse,weight,165
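
The script relies on match() setting RSTART and RLENGTH, then slicing off the leading comma with substr(). A minimal sketch of what those values look like for one row from the question (the variable names line and match_demo are only illustrative):

```shell
# One sample row from the question's file
line='joe,height,5.2,weight,178,hair,'

match_demo=$(printf '%s\n' "$line" | awk -v k=',height,' '
    match($0, k "[^,]+") {
        # RSTART is the 1-based position of the match, RLENGTH is its length
        print RSTART, RLENGTH, substr($0, RSTART + 1, RLENGTH - 1)
    }')

echo "$match_demo"
```

Here the match is ",height,5.2", so starting one character later and taking one character fewer drops the leading comma and leaves "height,5.2".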

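A uj5u.com user replied:

With your shown samples, please try the following awk code. Written and tested in GNU awk. A brief explanation: with GNU awk, set RS (the record separator) to ^[^,]*,height,[^,]* and then print RT, as required, to get the expected output.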

awk -v RS='^[^,]*,height,[^,]*' 'RT{print RT}' Input_file

A uj5u.com user replied:

I suggest a sed one-liner:

sed -n 's/^\([^,]*\).*\(,height,[^,]*\).*/\1\2/p' file.csv

A uj5u.com user replied:

One awk idea:

awk -v attr="height" '
BEGIN  { FS=OFS="," }
FNR==1 { print "name", "attribute", "value"; next }
       { for (i=2;i<=NF;i+=2)                        # loop through even-numbered fields
             if ($i == attr) {                        # if field value is an exact match to the "attr" variable then ...
                print $1,$i,$(i+1)                    # print current name, current field and next field to stdout
                next                                  # no need to check rest of current line; skip to next input line
             }
       }
' file.csv

Note: this assumes the input value (height in this example) matches the field in the file exactly (including the same case).

This produces:

name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8
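
If the exact-match assumption noted above is too strict, one possible variation is to lower-case both sides before comparing. The sketch below is not part of the original answer: the helper name extract_attr and the file name sample_attrs.csv are made up for illustration, and skipping attributes whose value is empty is an added assumption based on the question asking only for people who have the attribute.

```shell
# Hypothetical sample file, copied from the data in the question
cat > sample_attrs.csv <<'EOF'
name,attribute1,value1,attribute2,value2,attribute3,value3
joe,height,5.2,weight,178,hair,
james,,,,,,
jesse,weight,165,height,5.3,hair,brown
jerome,hair,black,breakfast,donuts,height,6.8
EOF

# Hypothetical helper: extract_attr ATTRIBUTE FILE
extract_attr() {
    awk -v attr="$1" '
    BEGIN  { FS = OFS = ","; want = tolower(attr) }       # normalise the requested name once
    FNR==1 { print "name", "attribute", "value"; next }
           { for (i = 2; i <= NF; i += 2)                 # attribute names sit in even-numbered fields
                 if (tolower($i) == want && $(i+1) != "") {   # assumption: ignore empty values
                     print $1, $i, $(i+1)
                     next                                 # first hit per row is enough
                 }
           }
    ' "$2"
}

extract_attr HEIGHT sample_attrs.csv
```

Called with HAIR instead, this version leaves out joe, whose hair value is empty in the sample data.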

A uj5u.com user replied:

With a perl one-liner:

$ perl -lne '
    print "name,attribute,value" if $.==1;
    print "$1,$2" if /^(\w+).*(height,\d+\.\d+)/
' file

Output

name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8

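A uj5u.com user replied:

awk accepts variable=value arguments after a -v flag placed before the script. The name of the desired attribute can therefore be passed into an awk script using this general pattern: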

awk -v attr=attribute1 ' {} ' file.csv

Inside the script, the value of the passed variable is referenced by the variable's name, attr in this case.
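
One detail worth seeing in isolation: a bare name such as attr refers to the variable, whereas /attr/ is a regex literal that matches the text "attr" and ignores the variable. A small sketch on a made-up one-line input (v_demo is only an illustrative shell variable):

```shell
v_demo=$(printf 'attr,height\n' | awk -F, -v attr=height '
    /attr/     { print "regex literal matched:", $1 }   # matches the literal text "attr" in the line
    $2 == attr { print "variable matched:", $2 }        # compares field 2 with the value passed via -v
')

echo "$v_demo"
```

Both rules fire on this input, the first because the line contains the characters "attr", the second because field 2 equals the value height.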

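Your criteria are to print column 1 (the first column, which contains the name), the column corresponding to the desired header value, and the column immediately after it (which contains the matching value).

So the following script lets you find the column headed "attribute1" together with its next neighbour: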

awk -v attr=attribute1 ' BEGIN {FS=","} /attr/{for (i=1;i<=NF;i++) if($i == attr) col=i;} {print $1","$col","$(col+1)} ' data.txt

Result:

name,attribute1,value1
joe,height,5.2
james,,
jesse,weight,165
jerome,hair,black

Another column (attribute3):

awk -v attr=attribute3 ' BEGIN {FS=","} /attr/{for (i=1;i<=NF;i++) if($i == attr) col=i;} {print $1","$col","$(col+1)} ' awkNames.txt

Result:

name,attribute3,value3
joe,hair,
james,,
jesse,hair,brown
jerome,height,6.8

Simply change the value of the -v attr= argument to the desired column.
