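How can I sort paragraph-like blocks of lines in bash by the content of their first line?

I want to sort the paragraphs in a file alphabetically by their first line: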

Hampton  
this is good  
(mind the mail)

Burlington  
I'm fine

Greater Yukonshire Fields  
(empty)

These text blocks may consist of one or more lines, and are separated from each other by one or more blank lines.

Desired result:

Burlington 
I'm fine

Greater Yukonshire Fields 
(empty)


Hampton 
this is good 
(mind the mail)

Answer from a uj5u.com user:

One GNU awk idea:

awk 'BEGIN { RS="" } 
           { a[FNR]=$0 }
     END   { PROCINFO["sorted_in"]="@val_str_asc"
             for (i in a)
                 print a[i] ORS
           }
' paragraphs

Note: PROCINFO["sorted_in"] requires GNU awk.

This produces:

Burlington
I'm fine

Greater Yukonshire Fields
(empty)

Hampton
this is good
(mind the mail)
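
The answer above relies on RS="" to make awk read one paragraph per record. As a minimal sketch of that mechanism alone (the function name first_lines is made up for illustration and is not part of the answer), the following lists the first line of each paragraph. It uses only features that POSIX awk provides, so it should not need GNU awk:

```shell
# Hypothetical helper: print "<record number>: <first line>" for every paragraph.
first_lines() {
    # RS= (empty) selects paragraph mode: records are separated by one or
    # more blank lines. With -F'\n', each line of the paragraph is one field,
    # so $1 is the paragraph's first line.
    awk -v RS= -F'\n' '{ print NR ": " $1 }' "$1"
}
```

Running first_lines paragraphs on the sample file is a quick way to check that the blocks are being split where you expect before sorting them.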

Answer from a uj5u.com user:

Try msort, which is available for most Linux distributions:

msort -bwq file

Output:

Burlington  
I'm fine

Greater Yukonshire Fields  
(empty)

Hampton  
this is good  
(mind the mail)

Options:

-b  a record is terminated by two or more newlines
-w  sort on the whole text of the record
-q  be quiet: do not print chatter while working

Answer from a uj5u.com user:

Using perl:

$ perl -00 -lne '
  push @paras, [ (split /\n/, $_, 2)[0], $_ ];   # key on the first line, also for one-line paragraphs
  END {
    for my $para (sort { $a->[0] cmp $b->[0] } @paras) {
      print $para->[1]
    }
  }' input.txt
Burlington
I'm fine

Greater Yukonshire Fields
(empty)

Hampton
this is good
(mind the mail)

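The -00 option reads the input in "paragraph mode" instead of line by line, so that a run of multiple newlines separates one paragraph from the next. For each paragraph, the script extracts the first line and stores it together with the full paragraph in a list; once the whole file has been read, it sorts the list by first line and prints the paragraphs.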
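
The same "store a key with each paragraph, sort by the key, print the paragraph" idea can be sketched with awk, sort and cut. This is only an illustration under assumptions that the original answer does not make: the function name is hypothetical, the data must contain no \r characters, first lines must contain no tabs, and sort -s (stable sort) is used, which is widely available but not required by POSIX.

```shell
# Hypothetical helper: sort paragraphs strictly by their first line.
sort_by_first_line() {
    # 1. Decorate: emit "<first line><TAB><paragraph joined with \r>" per paragraph.
    awk -v RS= -F'\n' -v OFS='\r' '{ key = $1; $1 = $1; print key "\t" $0 }' "$1" |
        # 2. Sort on the key only; -s keeps equal keys in input order.
        LC_ALL=C sort -s -t "$(printf '\t')" -k1,1 |
        # 3. Undecorate: drop the key column.
        cut -f2- |
        # 4. Turn the \r separators back into newlines, one blank line between paragraphs.
        awk -F'\r' -v OFS='\n' -v ORS='\n\n' '{ $1 = $1; print }'
}
```

Because only the first field is used as the sort key, two paragraphs with the same first line keep their original relative order instead of being ordered by their remaining lines.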

Answer from a uj5u.com user:

Using awk:

One way, reading line by line:

awk '
  {if (NF) a[p]=(a[p] $0 ORS); else p++ }         # Collect
  END {n=asort(a); for (i=1; i<=n; i++) print a[i]}  # Sort and output (asort needs GNU awk)
' input.txt

Another way, reading paragraph by paragraph:

awk -v RS='\n{2,}' '
  {a[FNR]=$0}                                      # Collect
  END {n=asort(a); for (i=1; i<=n; i++) print a[i] ORS}  # Sort and output (asort needs GNU awk)
' input.txt

Output:

Burlington  
I'm fine

Greater Yukonshire Fields  
(empty)

Hampton  
this is good  
(mind the mail)

Both variants collect the joined lines of each paragraph in an array, which is then sorted and printed.
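
asort() is a GNU awk extension. If only a POSIX awk is available, the same collect-then-sort approach can be sketched with a hand-written insertion sort. This is an illustrative variant, not the answerer's code; the function name is made up, and the resulting order follows whatever string comparison your awk and locale use.

```shell
# Hypothetical helper: collect paragraphs, sort them inside awk, print them.
sort_paragraphs_portable() {
    awk -v RS= '
        { a[NR] = $0 }                      # Collect one paragraph per record
        END {
            # Insertion sort; appending "" forces string comparison
            for (i = 2; i <= NR; i++) {
                v = a[i]
                for (j = i - 1; j >= 1 && (a[j] "") > (v ""); j--)
                    a[j + 1] = a[j]
                a[j + 1] = v
            }
            for (i = 1; i <= NR; i++)
                printf "%s\n\n", a[i]       # Paragraph plus one blank line
        }
    ' "$1"
}
```

Insertion sort is quadratic in the number of paragraphs, so for large files piping through sort(1) is likely the better choice.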

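Answer from a uj5u.com user:

One way using ruby.

First, initialize a counter i and a two-dimensional array arr, then append the current line $_ to the current paragraph
If the line is empty, increment the counter
Append an empty string to the last paragraph so that it ends with a blank line like the others
Finally, print the sorted array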

% ruby -ne 'i ||= 0; arr ||= []; arr[i] ||= []; arr[i] << $_
            i += 1 if $_.length == 1
            END{ arr[i] << "" 
                 puts arr.sort }' file      
Burlington  
I'm fine

Greater Yukonshire Fields  
(empty)

Hampton  
this is good  
(mind the mail)

Answer from a uj5u.com user:

Using any awk plus sort, and assuming your data contains no \r characters:

$ awk -v RS= -F'\n' -v OFS='\r' '{$1=$1}1' file |
    sort |
    awk -v ORS='\n\n' -F'\r' -v OFS='\n' '{$1=$1}1'
Burlington
I'm fine

Greater Yukonshire Fields
(empty)

Hampton
this is good
(mind the mail)

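The first awk simply joins the lines of each paragraph into a single line, sort then orders those lines, and the second awk splits them back into separate lines: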

$ awk -v RS= -F'\n' -v OFS='\r' '{$1=$1}1' file | cat -Ev
Hampton  ^Mthis is good  ^M(mind the mail)$
Burlington  ^MI'm fine$
Greater Yukonshire Fields  ^M(empty)$

$ awk -v RS= -F'\n' -v OFS='\r' '{$1=$1}1' file | sort | cat -Ev
Burlington  ^MI'm fine$
Greater Yukonshire Fields  ^M(empty)$
Hampton  ^Mthis is good  ^M(mind the mail)$

The pipe to cat -Ev is there only so that you can see the otherwise invisible CRs (aka \r, aka ^M).
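
Since the pipeline depends on the data containing no \r characters, it may be worth checking that assumption before running it. The wrapper below is a sketch only; the function name and the error message are made up for illustration, and the pipeline inside is the one from the answer.

```shell
# Hypothetical wrapper: refuse to run if the input already contains CRs.
sort_paragraphs_checked() {
    # printf '\r' yields a literal CR; grep -q succeeds if any line contains one.
    if LC_ALL=C grep -q "$(printf '\r')" "$1"; then
        echo "sort_paragraphs_checked: $1 contains CR characters" >&2
        return 1
    fi
    awk -v RS= -F'\n' -v OFS='\r' '{$1=$1}1' "$1" |
        sort |
        awk -v ORS='\n\n' -F'\r' -v OFS='\n' '{$1=$1}1'
}
```

For files with Windows line endings, stripping the CRs first (for example with tr -d '\r') and then running the pipeline is one possible workaround.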

