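If I run git fetch origin and then git checkout <revision> for a series of consecutive commits, I get a relatively small repo directory.
But if I run git fetch origin <revision> and then git checkout FETCH_HEAD on the same series of commits, the directory is considerably more bloated. Specifically, there seem to be a bunch of large packfiles.
The behavior appears to be the same whether all the commits already exist on the remote at the time of the first fetch or each one is pushed immediately before the corresponding fetch.
The following examples use a public repository, so you can reproduce the behavior.
Why is the directory in example 2 so much larger?
Example 1 (small):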
mkdir argo-cd
cd argo-cd/
git init
git remote add origin https://github.com/argoproj/argo-cd.git
git fetch origin
git checkout 497e53b0203638409e3083fa2ffac7d8fb3cce14
git fetch origin
git checkout 32be020af0f8bf6438201ee79b4d2b8037c57154
git fetch origin
git checkout 32d33dedcc70d94177384b235891b99d89497273
git fetch origin
git checkout 2e65b42f05bcc1401d1489e751993ec197f6942c
git fetch origin
git checkout b1ff9dbe1e3e3b2520e94eefc77d0322c765cd75
ls .git/objects/pack # shows two files
du -sh . # current directory is 96M
Example 2 (large):
cd ..
mkdir argo-cd-fetch
cd argo-cd-fetch/
git init
git remote add origin https://github.com/argoproj/argo-cd.git
git fetch origin 497e53b0203638409e3083fa2ffac7d8fb3cce14
git checkout FETCH_HEAD
git fetch origin 32be020af0f8bf6438201ee79b4d2b8037c57154
git checkout FETCH_HEAD
git fetch origin 32d33dedcc70d94177384b235891b99d89497273
git checkout FETCH_HEAD
git fetch origin 2e65b42f05bcc1401d1489e751993ec197f6942c
git checkout FETCH_HEAD
git fetch origin b1ff9dbe1e3e3b2520e94eefc77d0322c765cd75
git checkout FETCH_HEAD
ls .git/objects/pack # shows ten files
du -sh . # current directory is 244M
Note: I'm using git 2.32.0.
Note: The question is inspired by an apparent bug in Argo CD (https://github.com/argoproj/argo-cd/pull/8897). That's why I don't just git gc to clean up the waste.
Update / Clarification:
Below are the full logs of each example. But in this case, I pushed each commit to my fork immediately before running the next git fetch. So in this case we know that the initial fetch isn't "fetching everything," leaving the subsequent steps with basically nothing left to do.
Example 1 (small):
$ mkdir argo-cd-fork
~ $ cd argo-cd-fork/
~/argo-cd-fork $ git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /Users/mcrenshaw/argo-cd-fork/.git/
~/argo-cd-fork (master|?) $ git remote add origin https://github.com/crenshaw-dev/argo-cd.git
# Fetch 1
~/argo-cd-fork (master|?) $ git fetch origin
remote: Enumerating objects: 83781, done.
remote: Counting objects: 100% (89/89), done.
remote: Compressing objects: 100% (62/62), done.
remote: Total 83781 (delta 60), reused 45 (delta 25), pack-reused 83692
Receiving objects: 100% (83781/83781), 60.99 MiB | 22.12 MiB/s, done.
Resolving deltas: 100% (52061/52061), done.
From https://github.com/crenshaw-dev/argo-cd
* [new branch] add-chart-field-to-application-yaml -> origin/add-chart-field-to-application-yaml
... removed a bunch of branches and tags for brevity ...
* [new tag] v2.1.4 -> v2.1.4
~/argo-cd-fork (master|?) $ du -sh .
65M .
~/argo-cd-fork (master|?) $ git checkout afb1fe635ff7f5c435c5780ba665c72d5bc3c557
Note: switching to 'afb1fe635ff7f5c435c5780ba665c72d5bc3c557'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at afb1fe635 chore: fix unit test
# Fetch 2
~/argo-cd-fork ((afb1fe63…)|?) $ git fetch origin
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
Unpacking objects: 100% (1/1), 161 bytes | 161.00 KiB/s, done.
From https://github.com/crenshaw-dev/argo-cd
afb1fe635..f8fe71ab8 master -> origin/master
~/argo-cd-fork ((afb1fe63…)|?) $ git checkout f8fe71ab8f38095e296932b73f929bfbaf24f110
Previous HEAD position was afb1fe635 chore: fix unit test
HEAD is now at f8fe71ab8 test
# Fetch 3
~/argo-cd-fork ((f8fe71ab…)|?) $ git fetch origin
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
Unpacking objects: 100% (1/1), 162 bytes | 81.00 KiB/s, done.
From https://github.com/crenshaw-dev/argo-cd
f8fe71ab8..0363d622c master -> origin/master
~/argo-cd-fork ((f8fe71ab…)|?) $ git checkout 0363d622c391947349689904f6b40209ff3123cd
Previous HEAD position was f8fe71ab8 test
HEAD is now at 0363d622c test
# Fetch 4
~/argo-cd-fork ((0363d622…)|?) $ git fetch origin
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
Unpacking objects: 100% (1/1), 161 bytes | 161.00 KiB/s, done.
From https://github.com/crenshaw-dev/argo-cd
0363d622c..4115a8c12 master -> origin/master
~/argo-cd-fork ((0363d622…)|?) $ git checkout 4115a8c1221751b1586caaf9871a0be12b5ce891
Previous HEAD position was 0363d622c test
HEAD is now at 4115a8c12 test
# Fetch 5
~/argo-cd-fork ((4115a8c1…)|?) $ git fetch origin
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
Unpacking objects: 100% (1/1), 161 bytes | 161.00 KiB/s, done.
From https://github.com/crenshaw-dev/argo-cd
4115a8c12..8f01aaddb master -> origin/master
~/argo-cd-fork ((4115a8c1…)|?) $ git checkout 8f01aaddbaf4350217dcc84866275493b19308eb
Previous HEAD position was 4115a8c12 test
HEAD is now at 8f01aaddb test
~/argo-cd-fork ((8f01aadd…)|?) $ du -sh .
96M .
Example 2 (large):
~/argo-cd-fork ((8f01aadd…)|?) $ cd ..
~ $ mkdir argo-cd-fork-2
~ $ cd argo-cd-fork-2
~/argo-cd-fork-2 [128]$ git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /Users/mcrenshaw/argo-cd-fork-2/.git/
~/argo-cd-fork-2 (master|?) $ git remote add origin https://github.com/crenshaw-dev/argo-cd.git
# Fetch 1
~/argo-cd-fork-2 (master|?) $ git fetch origin 8f01aaddbaf4350217dcc84866275493b19308eb
remote: Enumerating objects: 47713, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 47713 (delta 3), reused 1 (delta 0), pack-reused 47709
Receiving objects: 100% (47713/47713), 40.90 MiB | 26.40 MiB/s, done.
Resolving deltas: 100% (31970/31970), done.
From https://github.com/crenshaw-dev/argo-cd
* branch 8f01aaddbaf4350217dcc84866275493b19308eb -> FETCH_HEAD
~/argo-cd-fork-2 (master|?) $ git checkout FETCH_HEAD
Note: switching to 'FETCH_HEAD'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at 8f01aadd test
# Fetch 2
~/argo-cd-fork-2 ((8f01aadd…)|?) $ git fetch origin 3fad137f5dcd8ebdb504a8b8de0138fb92d76458
remote: Enumerating objects: 47714, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 47714 (delta 4), reused 1 (delta 0), pack-reused 47709
Receiving objects: 100% (47714/47714), 40.90 MiB | 19.89 MiB/s, done.
Resolving deltas: 100% (31971/31971), done.
From https://github.com/crenshaw-dev/argo-cd
* branch 3fad137f5dcd8ebdb504a8b8de0138fb92d76458 -> FETCH_HEAD
~/argo-cd-fork-2 ((8f01aadd…)|?) $ git checkout FETCH_HEAD
Previous HEAD position was 8f01aaddb test
HEAD is now at 3fad137f5 test
# Fetch 3
~/argo-cd-fork-2 ((3fad137f…)|?) $ git fetch origin a94ab16b0964c2b583f8b923ad5a84b2a6b2b716
remote: Enumerating objects: 47715, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 47715 (delta 5), reused 1 (delta 0), pack-reused 47709
Receiving objects: 100% (47715/47715), 40.90 MiB | 5.89 MiB/s, done.
Resolving deltas: 100% (31972/31972), done.
From https://github.com/crenshaw-dev/argo-cd
* branch a94ab16b0964c2b583f8b923ad5a84b2a6b2b716 -> FETCH_HEAD
~/argo-cd-fork-2 ((3fad137f…)|?) $ git checkout FETCH_HEAD
Previous HEAD position was 3fad137f5 test
HEAD is now at a94ab16b0 test
# Fetch 4
~/argo-cd-fork-2 ((a94ab16b…)|?) $ git fetch origin bf651bfc6653b6cf13a522d590a8779fc3b66a77
remote: Enumerating objects: 47716, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 47716 (delta 6), reused 1 (delta 0), pack-reused 47709
Receiving objects: 100% (47716/47716), 40.90 MiB | 7.31 MiB/s, done.
Resolving deltas: 100% (31973/31973), done.
From https://github.com/crenshaw-dev/argo-cd
* branch bf651bfc6653b6cf13a522d590a8779fc3b66a77 -> FETCH_HEAD
~/argo-cd-fork-2 ((a94ab16b…)|?) $ git checkout FETCH_HEAD
Previous HEAD position was a94ab16b0 test
HEAD is now at bf651bfc6 test
# Fetch 5
~/argo-cd-fork-2 ((bf651bfc…)|?) $ git fetch origin 81895cf2a3f6e030aef7ddadc390b7a7743af03d
remote: Enumerating objects: 47717, done.
remote: Counting objects: 100% (8/8), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 47717 (delta 7), reused 1 (delta 0), pack-reused 47709
Receiving objects: 100% (47717/47717), 41.00 MiB | 9.17 MiB/s, done.
Resolving deltas: 100% (32005/32005), done.
From https://github.com/crenshaw-dev/argo-cd
* branch 81895cf2a3f6e030aef7ddadc390b7a7743af03d -> FETCH_HEAD
~/argo-cd-fork-2 ((bf651bfc…)|?) $ git checkout FETCH_HEAD
Previous HEAD position was bf651bfc6 test
HEAD is now at 81895cf2a test
~/argo-cd-fork-2 ((81895cf2…)|?) $ du -sh .
242M .
Answer:
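Because each fetch produces its own packfile, and one packfile is more efficient than multiple packfiles. A lot more efficient. How?
First, the checkouts are a red herring. They don't affect the size of the .git/ directory.
Second, in the first example only the first git fetch origin does anything. The rest fetch nothing (unless something changed on origin).
Why are multiple packfiles less efficient?
Compression works by finding common long sequences in the data and reducing them to very short sequences. If <div>long block of legal mumbo jumbo</div> appears dozens of times, each occurrence can be replaced with a few bytes. But the original long string must still be stored. If there's a single packfile, it only has to be stored once. If there are multiple packfiles, it has to be stored multiple times. You are, effectively, storing the whole history of changes up to that point in each packfile.
We can see in the example below that the first packfile is 113M, the second is 161M, the third is 177M, and the one from the final fetch is 209M. The final packfile is roughly the same size as the single garbage-collected packfile.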
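The effect of storing shared data once versus several times can be sketched with an ordinary compressor. This is only an analogy under stated assumptions: gzip stands in for "a compressor" and is not how Git builds packfiles, and the file names and line counts are made up for illustration. The same 200 lines are compressed once as a single stream and once as four independent streams.

```shell
#!/bin/sh
# Analogy only: gzip is a stand-in compressor, not Git's packfile format.
set -e
tmp=$(mktemp -d)

# 200 lines that all share the same long block of text.
block='<div>long block of legal mumbo jumbo</div>'
i=1
while [ "$i" -le 200 ]; do
  printf '%s %s\n' "$block" "$i"
  i=$((i + 1))
done > "$tmp/all.txt"

# Split the same data into four 50-line pieces.
split -l 50 "$tmp/all.txt" "$tmp/part."

# Compressed size of the data as a single stream.
one=$(( $(gzip -c "$tmp/all.txt" | wc -c) ))

# Total compressed size when each piece is compressed on its own.
many=0
for f in "$tmp"/part.*; do
  many=$(( many + $(gzip -c "$f" | wc -c) ))
done

echo "one stream:   $one bytes"
echo "four streams: $many bytes"
rm -rf "$tmp"
```

Each independent stream has to carry its own copy of the repeated block plus its own header, so the four-stream total should come out larger than the single stream.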
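Why do multiple fetches result in multiple packfiles?
git fetch is very efficient. It only fetches objects you don't already have. Sending individual object files is inefficient, so a smart Git server sends them as a single packfile.
When you do a single git fetch on a fresh repository, Git asks the server for every object. The remote sends it a packfile containing every object.
When you do git fetch ABC and then git fetch DEF, Git tells the server "I already have everything up to ABC, give me all the objects up to DEF", so the server makes a new packfile of everything from ABC to DEF and sends it.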
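A way to watch packfiles accumulate without touching the network is to fetch tags one at a time from a throwaway local repository. This is a minimal sketch: the src and dst directories, the v1 to v3 tags, and the demo identity are all hypothetical. It also sets fetch.unpackLimit to 1, because by default Git unpacks small fetches into loose objects instead of keeping the pack, which would hide the effect in a repository this tiny.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
start=$PWD

# Hypothetical throwaway "remote" with three tagged commits.
git init -q "$tmp/src"
cd "$tmp/src"
for i in 1 2 3; do
  echo "content $i" > file.txt
  git add file.txt
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
  git tag "v$i"
done

# Fresh client repository pointing at the local "remote".
git init -q "$tmp/dst"
cd "$tmp/dst"
git remote add origin "$tmp/src"

# Assumption for the demo: keep even tiny fetches as packfiles.
git config fetch.unpackLimit 1

# Fetch one tag at a time and count the packs after each fetch.
for t in v1 v2 v3; do
  git fetch -q origin "$t"
  packs=$(git count-objects -v | awk '/^packs:/ {print $2}')
  echo "after fetching $t: $packs pack(s)"
done

cd "$start"
rm -rf "$tmp"
```

git count-objects -v reports the number of packs along with their total size, which makes it a convenient check when comparing the two fetch styles from the question.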
Eventually your repository will run an automatic garbage collection and repack these into a single packfile.
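The consolidation step can be tried in isolation. The sketch below is an assumption-laden stand-in rather than a reproduction of the fetch scenario: it builds three packs in a hypothetical scratch repository by running git repack after each commit, then merges them with git repack -a -d, which is the kind of repacking that git gc performs among its other housekeeping.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
start=$PWD
git init -q "$tmp/repo"
cd "$tmp/repo"

# Helper: number of packfiles currently in the repository.
count_packs() {
  git count-objects -v | awk '/^packs:/ {print $2}'
}

# Create three commits and pack the new loose objects after each one,
# leaving three separate packfiles (a stand-in for three fetches).
for i in 1 2 3; do
  echo "content $i" > file.txt
  git add file.txt
  git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
  git repack -q
done
before=$(count_packs)

# Pack everything into a single pack and delete the now-redundant ones.
git repack -a -d -q
after=$(count_packs)

echo "packs before: $before, after: $after"

cd "$start"
rm -rf "$tmp"
```

Where running a full git gc is undesirable, as in the Argo CD case mentioned in the question, comparing the pack count before and after a repack at least shows how much of the directory size is due to redundant packs.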
We can reduce the example. I'm going to use Rails to illustrate because it has clearly defined tags to fetch.
git init
git remote add origin https://github.com/rails/rails.git
git fetch origin
du -sh .git/objects/pack/*
22M .git/objects/pack/pack-ef0a91833c4774a28a21c814a26e04043621512d.idx
209M .git/objects/pack/pack-ef0a91833c4774a28a21c814a26e04043621512d.pack
And:
git init
git remote add origin https://github.com/rails/rails.git
git fetch origin v5.0.0
du -sh .git/objects/pack/*
13M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.idx
113M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.pack
git fetch origin v6.0.0
du -sh .git/objects/pack/*
13M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.idx
113M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.pack
16M .git/objects/pack/pack-c81c5343636211ffcc9ffdfeeb3bb65b9cba75df.idx
161M .git/objects/pack/pack-c81c5343636211ffcc9ffdfeeb3bb65b9cba75df.pack
git fetch origin v7.0.0
du -sh .git/objects/pack/*
18M .git/objects/pack/pack-2d2066f04670f137265fed0f382ad0d6f0dd9f3e.idx
177M .git/objects/pack/pack-2d2066f04670f137265fed0f382ad0d6f0dd9f3e.pack
13M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.idx
113M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.pack
16M .git/objects/pack/pack-c81c5343636211ffcc9ffdfeeb3bb65b9cba75df.idx
161M .git/objects/pack/pack-c81c5343636211ffcc9ffdfeeb3bb65b9cba75df.pack
git fetch origin
du -sh .git/objects/pack/*
18M .git/objects/pack/pack-2d2066f04670f137265fed0f382ad0d6f0dd9f3e.idx
177M .git/objects/pack/pack-2d2066f04670f137265fed0f382ad0d6f0dd9f3e.pack
13M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.idx
113M .git/objects/pack/pack-7be7f8792d634f63a623e50165a11983e7cdaeef.pack
22M .git/objects/pack/pack-b28e1368cf8e1ee0152e7dd7b328760c5b589c40.idx
209M .git/objects/pack/pack-b28e1368cf8e1ee0152e7dd7b328760c5b589c40.pack
16M .git/objects/pack/pack-c81c5343636211ffcc9ffdfeeb3bb65b9cba75df.idx
161M .git/objects/pack/pack-c81c5343636211ffcc9ffdfeeb3bb65b9cba75df.pack
And after garbage collection this is all collected into a single packfile roughly the same size as the single fetch.
git gc
du -sh .git/objects/pack/*
22M .git/objects/pack/pack-7f1d7066fb6c5bd6a47749b215c020fab5ca416b.idx
212M .git/objects/pack/pack-7f1d7066fb6c5bd6a47749b215c020fab5ca416b.pack
