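I searched thoroughly for a solution to this but had no luck, so I'm hoping to find one quickly here. I have some migrated files in S3 and now need to determine the number of folders involved in a given path. Suppose I have the files shown below.
If I run aws s3 ls s3://my-bucket/foo1 --recursive >> file_op.txt
then "cat file_op.txt" shows the following: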
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file1.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file2.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/file1.pdf
my-bucket/foo1/foo2/foo3/foo4/foo6/file2.txt
my-bucket/foo1/foo2/foo3/file3.txt
my-bucket/foo1/foo8/file1.txt
my-bucket/foo1/foo9/foo10/file4.csv
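I stored the output in a file and processed it with wc -l to find the number of files,
but I can't work out the number of folders involved in the path.
I need output like this: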
number of files : 7
number of folders : 9
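Edit 1: corrected the expected number of folders.
(my-bucket and foo1 are not counted)
(foo6 appears under both the foo5 and foo4 directories)
Below is my code, which fails when calculating the directory count: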
#!/bin/bash
if [[ "$#" -ne 1 ]] ; then
echo "Usage: $0 \"s3 folder path\" <eg. \"my-bucket/foo1\"> "
exit 1
else
start=$SECONDS
input=$1
input_code=$(echo $input | awk -F'/' '{print $1 "_" $3}')
#input_length=$(echo $input | awk -F'/' '{print NF}' )
s3bucket=$(echo $input | awk -F'/' '{print $1}')
db_name=$(echo $input | awk -F'/' '{print $3}')
pathfinder=$(echo $input | awk 'BEGIN{FS=OFS="/"} {first = $1; $1=""; print}'|sed 's#^/##g'|sed 's#$#/#g')
myn=$(whoami)
cdt=$(date +%Y%m%d%H%M%S)
filename=$0_${myn}_${cdt}_${input_code}
folders=${filename}_folders
dcountfile=${filename}_dir_cnt
aws s3 ls s3://${input} --recursive | awk '{print $4}' > $filename
cat $filename |awk -F"$pathfinder" '{print $2}'| awk 'BEGIN{FS=OFS="/"}{NF--; print}'| sort -n | uniq > $folders
#grep -oP '(?<="$input_code" ).*'
fcount=`cat ${filename} | wc -l`
awk 'BEGIN{FS="/"}
{ if (NF > maxNF)
  {
    for (i = maxNF + 1; i <= NF; i++)
      count[i] = 1;
    maxNF = NF;
  }
  for (i = 1; i <= NF; i++)
  {
    if (col[i] != "" && $i != col[i])
      count[i]++;
    col[i] = $i;
  }
}
END {
  for (i = 1; i <= maxNF; i++)
    print count[i];
}' $folders > $dcountfile
dcount=$(cat $dcountfile | xargs | awk '{for(i=t=0;i<NF;) t+=$++i; $0=t}1' )
printf "Bucket name : \e[1;31m $s3bucket \e[0m\n" | tee -a ${filename}.out
printf "DB name : \e[1;31m $db_name \e[0m\n" | tee -a ${filename}.out
printf "Given folder path : \e[1;31m $input \e[0m\n" | tee -a ${filename}.out
printf "The number of folders in the given directory are\e[1;31m $dcount \e[0m\n" | tee -a ${filename}.out
printf "The number of files in the given directory are\e[1;31m $fcount \e[0m\n" | tee -a ${filename}.out
end=$SECONDS
elapsed=$((end - start))
printf '\n*** Script completed in %d:%02d:%02d - Elapsed %d:%02d:%02d ***\n' \
$((end / 3600)) $((end / 60 % 60)) $((end % 60)) \
$((elapsed / 3600)) $((elapsed / 60 % 60)) $((elapsed % 60)) | tee -a ${filename}.out
exit 0
fi
Answer from a uj5u.com user:
Your question is unclear.
If we count the unique parent folder paths in the list you provided, there are 12:
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6
my-bucket/foo1/foo2/foo3/foo4/foo6
my-bucket/foo1/foo2/foo3/foo4/foo5
my-bucket/foo1/foo2/foo3/foo4
my-bucket/foo1/foo2/foo3
my-bucket/foo1/foo2
my-bucket/foo1/foo8
my-bucket/foo1/foo9/foo10
my-bucket/foo1/foo9
my-bucket/foo1
my-bucket
The awk script that computes this is:
BEGIN {FS = "/";} # set the field separator to "/"
{ # for each input line
    commulativePath = OFS = ""; # reset commulativePath and OFS (Output Field Separator) to ""
    for (i = 1; i < NF; i++) { # loop over all folders, stopping before the file name
        if (i > 1) OFS = FS; # set OFS to "/" from the second path component onward
        commulativePath = commulativePath OFS $i; # append the current field to commulativePath
        dirs[commulativePath] = 0; # insert commulativePath into the associative array dirs
    }
}
END {
    print NR " " length(dirs); # print the record count and the size of the dirs array
}
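To see how this path-based count changes once the first two levels are left out, here is a minimal sketch. The helper name count_unique_paths and its skip parameter are made up for illustration, and the sample is the listing from the question. It counts entries with a for-in loop rather than length(dirs), on the assumption that not every awk supports length() on arrays.

```shell
#!/bin/bash
# Hypothetical helper: count unique folder paths, ignoring the first `skip` levels.
count_unique_paths() {
    local skip=$1
    awk -F/ -v skip="$skip" '
        {
            path = ""
            # Visit folder components only: skip the leading levels and the file name.
            for (i = skip + 1; i < NF; i++) {
                path = path "/" $i
                dirs[path] = 1
            }
        }
        END {
            n = 0
            for (d in dirs) n++    # portable alternative to length(dirs)
            print n
        }'
}

# Sample listing taken from the question.
sample='my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file1.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file2.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/file1.pdf
my-bucket/foo1/foo2/foo3/foo4/foo6/file2.txt
my-bucket/foo1/foo2/foo3/file3.txt
my-bucket/foo1/foo8/file1.txt
my-bucket/foo1/foo9/foo10/file4.csv'

printf '%s\n' "$sample" | count_unique_paths 0    # all levels, including my-bucket and foo1
printf '%s\n' "$sample" | count_unique_paths 2    # only levels below my-bucket/foo1
```

On the sample this prints 12 and then 10. The second figure is not 9 because foo6 exists under both foo4 and foo5, so it is counted as two distinct paths.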
If we count unique folder names, there are 11:
my-bucket
foo1
foo2
foo3
foo4
foo5
foo6
foo7
foo8
foo9
foo10
The awk script that computes this is:
awk -F'/' '{for(i=1;i<NF;i++)dirs[$i]=1;}END{print NR " " length(dirs)}' input.txt
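The same name-based count can be adapted to the output format asked for in the question. The sketch below assumes the bucket and the first folder always occupy fields 1 and 2, so the loop starts at field 3. The function name count_files_and_folders is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count files and unique folder names below the first two levels.
# Reads the named files, or standard input when no file is given.
count_files_and_folders() {
    awk -F/ '
        NF {                                   # ignore blank lines
            files++
            # Fields 3 .. NF-1 are folder names; field NF is the file name.
            for (i = 3; i < NF; i++) names[$i] = 1
        }
        END {
            n = 0
            for (name in names) n++            # portable alternative to length(names)
            printf "number of files : %d\n", files
            printf "number of folders : %d\n", n
        }' "$@"
}

# Sample listing taken from the question.
sample='my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file1.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file2.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/file1.pdf
my-bucket/foo1/foo2/foo3/foo4/foo6/file2.txt
my-bucket/foo1/foo2/foo3/file3.txt
my-bucket/foo1/foo8/file1.txt
my-bucket/foo1/foo9/foo10/file4.csv'

printf '%s\n' "$sample" | count_files_and_folders
```

For the sample this reports 7 files and 9 folders (foo2 through foo10). A folder name that occurs at several depths, such as foo6, is counted once.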
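Answer from a uj5u.com user:
You have clarified that you want to count unique names, ignoring the first two levels (my-bucket and foo1) and the last level (the file name).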
perl -F/ -lane'
   ++$f;
   ++$d{ $F[$_] } for 2 .. $#F - 1;
   END {
      print "Number of files: ".( $f // 0 );
      print "Number of dirs: ".( keys(%d) // 0 );
   }
'
Output:
Number of files: 7
Number of dirs: 9
The file to process is passed to the Perl one-liner as an argument (or on standard input).
Answer from a uj5u.com user:
If you don't mind using a pipe and calling awk twice, it's fairly clean:
mawk 'BEGIN {OFS=ORS;FS="/";_^=_}_+_<NF && --NF~($_="")' file \
\
| mawk 'NF {_[$__]} END { print length(_) }'
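For readers who find the one-liner above hard to follow, here is a more explicit sketch of the same two-stage idea: the first stage prints one folder name per line, and the second stage de-duplicates and counts. It is not a transcription of the mawk command. It starts at field 3 so that both my-bucket and foo1 are skipped, and the name count_unique_names is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: stage 1 emits one folder name per line,
# stage 2 removes duplicates and counts the remaining lines.
count_unique_names() {
    awk -F/ '{ for (i = 3; i < NF; i++) print $i }' "$@" | sort -u | wc -l
}

# Sample listing taken from the question.
sample='my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file1.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/foo7/file2.txt
my-bucket/foo1/foo2/foo3/foo4/foo5/foo6/file1.pdf
my-bucket/foo1/foo2/foo3/foo4/foo6/file2.txt
my-bucket/foo1/foo2/foo3/file3.txt
my-bucket/foo1/foo8/file1.txt
my-bucket/foo1/foo9/foo10/file4.csv'

printf '%s\n' "$sample" | count_unique_names
```

On the sample this yields 9. Some wc implementations pad the number with leading spaces, so compare it numerically if you use it in a script.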
