When I ran git log --all, I found an interesting commit in the log:
commit 3a1a6bfbd936ea441ecf1f071e82f89c7e8bbf6c (replaced, origin/main)
What does the replaced keyword in the parentheses mean, and how is it triggered?
Answer from a uj5u.com user:
It means that someone used git replace.
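What git replace does is let you tell future Git operations that, instead of some original object, they should look at some replacement object. That sentence describes the mechanism by which replacement works, but it does not tell you what it all means. The problem is that, at this level, the meaning does not exist yet. It is like saying that neutron capture causes a U-235 nucleus to fission into two lighter nuclei, emitting two or three neutrons. True, but so what? Well, nuclear reactors, or atomic bombs. We have gone from dry nuclear physics to serious consequences.
Fortunately, Git replacements are not that dramatic. But a simple replacement can still have big consequences. What consequences it will have in your repository is not something we can determine in advance. All we can do is describe the idea behind replacements.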
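The idea behind replacements
Any Git object, once created, is read-only, and it persists in the repository for as long as someone or something is using it. The reason for this read-only quality is that each object is found (or addressed, to use a fancy term) by its hash ID in a key-value database whose keys are hash IDs and whose values are the hashed objects. When Git extracts an object from the database, Git re-computes the hash and verifies that the hash of the retrieved object matches the key used to retrieve it. This guarantees that the object data are not corrupted.[1]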
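The key-value behavior described above can be tried out directly. The following is a minimal sketch, assuming a throwaway repository made with mktemp; the variable names key and rehash are only illustrative. It stores a blob, looks it up by its hash ID, and re-hashes the retrieved content by hand.

```shell
set -eu
repo=$(mktemp -d)            # throwaway repository (assumption: any empty directory works)
cd "$repo"
git init -q

# Store a blob; Git prints the hash ID that becomes its key in the database.
key=$(printf 'hello\n' | git hash-object -w --stdin)

# Look the object up again by that key.
git cat-file -t "$key"       # prints the object type
git cat-file -p "$key"       # prints the object content

# Re-hash the retrieved content: the result equals the lookup key.
rehash=$(git cat-file -p "$key" | git hash-object --stdin)
echo "key:    $key"
echo "rehash: $rehash"
```

Because the key is derived from the content, changing even one byte of the content would produce a different key, which is why an existing object can never be edited in place.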
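If we make a mistake in a new commit that nobody else is using yet, and we catch the mistake quickly, we can correct it by replacing the original commit with a new one. Our original commit can only be found through the hash ID stored in some branch name. If we make a new replacement commit for it, with the mistake corrected, the new commit will have some other, different hash ID. We store the replacement commit's hash ID in the branch name (which is writable), and we are done: the "bad" commit still exists, but it is unused. With nobody using it, Git will eventually drop it entirely.[2]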
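Here is a small sketch of that tip-commit correction, assuming a scratch repository and a hypothetical file named file.txt. It shows that git commit --amend produces a commit with a different hash ID while the original commit object is still sitting in the database.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # placeholder identity for the demo
git config user.email "example@example.com"

# Make the "bad" commit and remember its hash ID.
echo "typo" > file.txt
git add file.txt
git commit -q -m "add file"
old=$(git rev-parse HEAD)

# Correct the mistake; the branch name now stores a new hash ID.
echo "fixed" > file.txt
git add file.txt
git commit -q --amend -m "add file (fixed)"
new=$(git rev-parse HEAD)

echo "old: $old"
echo "new: $new"
git cat-file -t "$old"    # the original commit object still exists
```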
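This works fine for a new commit whose hash ID is stored only in a single branch name. But what if the commit is not so new? In particular, commit hash IDs are also stored in later commits. If the "bad" commit is part of a chain of commits, we have a problem.
Remember that commits form backwards-looking chains, found through a branch name that points to what Git calls the tip commit: the last commit in the chain. That is, given a series of commits, each with its own hash ID, we can draw them using single uppercase letters to stand in for the hash IDs: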
... <-F <-G <-H <--main
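The name main points to the tip commit, whose hash ID is H. That commit points backwards to the earlier commit G. Commit G points backwards to the still-earlier commit F, and so on.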
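The backwards-pointing chain can be inspected in a scratch repository. This sketch assumes three empty commits whose messages F, G and H stand in for the letters in the drawing; the variable names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # name the first branch main
git config user.name "Example"
git config user.email "example@example.com"

# Build the chain F <-G <-H.
for name in F G H; do
  git commit -q --allow-empty -m "$name"
done

# The branch name stores only the hash ID of the tip commit H.
tip=$(git rev-parse main)

# The commit object for H literally contains the hash ID of G.
git cat-file -p main | grep '^parent '
parent_of_H=$(git rev-parse 'main^')

# Walk the chain backwards from the tip.
git log --format='%h %s (parent: %p)' main
```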
If commit F has a mistake, we could try doing what git commit --amend does: make a new and improved F' and shove F aside:
    F   ...
   /
... <-F'
But when we do this, the existing commit G, which literally contains the hash ID of existing commit F and cannot be changed, still points to F:
    F <-G <-H   <--main
   /
... <-F'
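Our simple attempt at amending F does not work, because main points not to F but to H. H points to G, and will forever do so. G points to F, and will forever do so. We can copy G and H to new and improved G' and H':
    F <-G <-H   <--main
   /
... <-F' <-G' <-H'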
and, having made three copies, we can now re-point the branch name main:
    F <-G <-H
   /
... <-F' <-G' <-H'   <--main
This is what git rebase does. But it has the drawback that every commit after F must also be copied. If there are complicated chains:
             I--J   <-- br1
            /
...--F--G--H   <-- main
         \
          K--L   <-- br2
the whole thing rapidly becomes a nightmare of history rewriting, with the need to move multiple branch names. You can do this using git filter-branch or git filter-repo, but it's painful and not something you want to do frequently. This is where git replace comes in.
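For the simple single-branch case drawn earlier, the copy-and-repoint step can be sketched with git rebase --onto. This is one possible way to do it under stated assumptions: a scratch repository, commits named F, G and H that each add their own file, and illustrative variable names.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

# Build the chain F <-G <-H, each commit adding its own file.
for name in F G H; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done
F=$(git rev-parse main~2)
old_H=$(git rev-parse main)

# Make F' by amending a detached copy of F.
git checkout -q --detach "$F"
echo "F, corrected" > F.txt
git add F.txt
git commit -q --amend -m "F (fixed)"
F_new=$(git rev-parse HEAD)

# Copy G and H on top of F' and re-point main at H'.
git rebase -q --onto "$F_new" "$F" main

git log --format='%h %s' main
```

With additional branches such as br1 and br2, each one would need its own rebase and its own re-pointing, which is where the approach stops scaling.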
[1] If the key used to retrieve the object, compared to the hash of the object, does not match, something happened to the data since they were originally written. The hash function is of no help in correcting the erroneous data, so at this point we're stuck with finding a good copy, presumably in another clone or a backup. That's why disk drives use, e.g., Reed-Solomon codes rather than cryptographic checksums. Git's job here is only to find corruption, not to fix it.
[2] This "eventually" is a maintenance operation. The newfangled git maintenance command can be used to tune this stuff (that is the future direction for Git), but the actual dropping is done via git gc or git gc --auto, in existing Git usage. That works as follows:
- git gc runs git reflog expire.
- git reflog scans reflogs, which contain reflog entries.
- The reflog entries each have a date-and-time stamp, and a status ("reachable" or "unreachable") implied by the current hash ID stored in the corresponding ref.
- The status leads git reflog expire to one of two "expiry" values: reachable, for commits reachable from the current ref value, and unreachable, for commits not reachable this way.
- If the age of the entry exceeds the expiry value (30 days for "unreachable", by default), the reflog entry is deleted.
This drops the last actual reference to the internal Git commit object, which can now be deleted via git prune, which git gc runs after git reflog expire. So, running git commit --amend right after git commit pushes the "amended" commit off to the side, where it lingers for a minimum of 30 days thanks to reflog entries: one in the HEAD reflog and one in the branch reflog. Once the reflog entries are gone, there really is no reference to the commit, and git prune will prune it.
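The reflog-then-prune sequence can be forced by hand in a scratch repository. This sketch assumes you do not want to wait 30 days: it expires every reflog entry immediately with --expire=now, which is far more aggressive than the defaults, and should not be run in a repository you care about.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "original"
old=$(git rev-parse HEAD)
git commit -q --allow-empty --amend -m "amended"
new=$(git rev-parse HEAD)

# The pushed-aside commit is still referenced from the reflogs.
git --no-pager reflog
git cat-file -e "$old" && echo "before gc: old commit still present"

# Drop all reflog entries, then let git gc prune unreferenced objects.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$old" 2>/dev/null; then
  echo "after gc: old commit still present"
else
  echo "after gc: old commit pruned"
fi
```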
Replacements
The mechanism Git uses for replacements is simple. There's a relatively low level routine in Git to obtain an object from the objects database—that key-value store I mentioned earlier, where the keys are hash IDs and the values are objects. You give the key to the database lookup code and it fishes out the value.
Now, if you allow replacements—there are control knobs for this, at this level—then when you call the "get me an object, I have its hash ID" function, the lookup function will check to see if the object's hash ID exists as a name in the refs/replace/ namespace.
So: we can make a replacement commit F' that is a new and improved version of F. This commit has a hash ID, once we've written it to the object database. Let's say F had hash ID aaaaaaa, and F' has hash ID bbbbbbb (I've shortened them from 40 characters to 7 to make them easier to deal with, and real hash IDs are of course random looking).
We now store the hash ID bbbbbbb under the name refs/replace/aaaaaaa. That is, the hash ID of commit F, whatever it is, becomes a refs/replace/ name. In that name we store the hash ID of the replacement commit, here bbbbbbb.
When some other piece of Git software calls the "look up object" function with hash ID aaaaaaa, that software notices that refs/replace/aaaaaaa exists. That software reads the hash ID stored in refs/replace/aaaaaaa and, instead of looking up (and error-checking) aaaaaaa, it looks up (and error-checks) bbbbbbb instead. It then returns the replacement object's content, instead of the original object's content.
This means that when git log or git checkout or any other Git command goes to use commit F, it gets commit F' instead. Hence we've successfully replaced commit F without actually changing commit F.[3] The git log command in particular makes sure to notice that this happened (the lookup routine will set a flag for git log to see) and adds the replaced notation that you saw.
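The whole mechanism fits in a short scratch-repository session. This is a sketch under stated assumptions: empty commits named F, G and H, a corrected copy F' made by amending a detached checkout of F, and illustrative variable names F and F_new.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

for name in F G H; do
  git commit -q --allow-empty -m "$name"
done
F=$(git rev-parse main~2)

# Make F' off to the side, then return to main. G and H are untouched.
git checkout -q --detach "$F"
git commit -q --allow-empty --amend -m "F (fixed)"
F_new=$(git rev-parse HEAD)
git checkout -q main

# Record the replacement: refs/replace/<hash of F> now holds the hash of F'.
git replace "$F" "$F_new"
git rev-parse "refs/replace/$F"

# Asking for F now yields the content of F', decorated as replaced.
git --no-pager log --decorate --format='%h%d %s' main
```

Note that main, G and H keep their original hash IDs; only the lookup of F is redirected.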
[3] Note that this makes git gc and git prune have to work harder, because object F is still referenced "for real", while F' is referenced via the refs/replace/ name. Fortunately it suffices for git gc to run with replacements disabled.
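Seeing reality, and why this matters
If you want to see what is really in the database, without replacements, you can run git --no-replace-objects log. This makes git log call the "get object" function with replacements disabled. You will see the original history, not the replaced one.
To view the replacements themselves, use git replace --list (or git replace with no arguments, which implies --list), or, in software, use git for-each-ref refs/replace.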
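The commands just mentioned can be compared side by side. This sketch assumes the same kind of scratch repository as before, with one replacement of F by a hypothetical corrected copy; the variable names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

for name in F G H; do
  git commit -q --allow-empty -m "$name"
done
F=$(git rev-parse main~2)
git checkout -q --detach "$F"
git commit -q --allow-empty --amend -m "F (fixed)"
git replace "$F" "$(git rev-parse HEAD)"
git checkout -q main

# List the replaced object names, for people and for scripts.
listed=$(git replace --list)
echo "$listed"
git for-each-ref refs/replace

# Replaced view versus the real contents of the database.
with_replace=$(git log --format=%s main)
without_replace=$(git --no-replace-objects log --format=%s main)
echo "$with_replace"
echo "$without_replace"
```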
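Note that when you clone a repository, the cloning process normally does not copy the refs/replace/ namespace. By default, git push does not copy refs/replace/ names either. So when you use git replace to build an illusory history in your repository, this affects only your repository.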
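The local-only behavior can be checked with two scratch directories. This sketch assumes a plain local clone and shows one way to bring the replacement over deliberately, with an explicit refspec; the directory and variable names are illustrative.

```shell
set -eu
work=$(mktemp -d)
origin_repo="$work/origin"
clone_repo="$work/clone"

git init -q "$origin_repo"
cd "$origin_repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"
for name in F G H; do
  git commit -q --allow-empty -m "$name"
done
F=$(git rev-parse main~2)
git checkout -q --detach "$F"
git commit -q --allow-empty --amend -m "F (fixed)"
git replace "$F" "$(git rev-parse HEAD)"
git checkout -q main

# A normal clone does not bring refs/replace/ along.
git clone -q "$origin_repo" "$clone_repo"
before=$(git -C "$clone_repo" for-each-ref refs/replace | wc -l | tr -d ' ')

# Fetching the namespace explicitly copies the replacement refs.
git -C "$clone_repo" fetch -q origin 'refs/replace/*:refs/replace/*'
after=$(git -C "$clone_repo" for-each-ref refs/replace | wc -l | tr -d ' ')

echo "replace refs in clone before fetch: $before, after fetch: $after"
```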
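You can replace non-commit objects as well. Because replacement is such a low-level operation, you can use it for all sorts of interesting effects. But it is always local, unless you take special steps to get the refs/replace/ references into another repository too.
Note that git filter-branch and git filter-repo will build the new repository with the replacements respected (though git --no-replace-objects filter-branch won't, and there is probably something similar for filter-repo). So one use of git replace is to edit history until it looks the way you want others to see it. Then you run an otherwise no-op filter operation, which "cements the new history in place" with no replacements needed (they are now built in, and the originals are gone). Then you publish this new, different repository instead of the original one.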
Please credit the source when reposting. Link to this article: https://www.uj5u.com/caozuo/368847.html
