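I have a feature/mine branch that I want to merge into feature/other. Somewhere during development, however, changes were made in specific folders that are still in feature/mine and should not be in feature/other. So the diff in my PR shows changes in src/some-folder-I-dont-want-to-change, but these changes were made deep down the commit tree. I can't simply revert some commits.

What I want is simply to set my src/some-folder-I-dont-want-to-change to be feature/other's src/some-folder-I-dont-want-to-change. Is there a way to do that?

I tried git checkout feature/other -- src/some-folder-I-dont-want-to-change, but that only adds files from feature/other; it doesn't delete files that are on my branch but shouldn't be.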

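Answer from a uj5u.com user:

It sounds like you are looking for git restore; see, for example, https://stackoverflow.com/a/15404733/6060876.

I believe that for your use case you would need to write something like this: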

git restore --source=feature/other --staged --worktree -- src/some-folder-I-dont-want-to-change
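
As a rough check of that command, here is a minimal sketch in a throwaway repository. The file names kept.txt and unwanted.txt and the demo identity are assumptions made up for illustration. It relies on git restore defaulting to no-overlay mode, in which tracked files missing from the source tree are removed.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository, not a real project
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/feature/other   # name the first branch feature/other
git config user.name "Demo"
git config user.email "demo@example.com"

dir=src/some-folder-I-dont-want-to-change
mkdir -p "$dir"
echo original > "$dir/kept.txt"
git add .
git commit -q -m "state of feature/other"

git checkout -q -b feature/mine
echo changed > "$dir/kept.txt"        # a modification that should not be merged
echo extra > "$dir/unwanted.txt"      # a file that exists only on feature/mine
git add .
git commit -q -m "changes that should not reach feature/other"

# Make the folder match feature/other in both the index and the working tree.
git restore --source=feature/other --staged --worktree -- "$dir"
git commit -q -m "reset folder to feature/other's version"

ls "$dir"
```

After the final commit, the folder on feature/mine should be identical to the one on feature/other, including the removal of the extra file.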

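Answer from a uj5u.com user:

TL;DR

You may just want to check out some files from a specific commit and then commit again. The diff from the merge base to the tip of your branch will then either include no changes to those files, or include the same changes to those files, depending on your exact needs. This can set up future problems, so you need to be aware of that.

Alternatively, you may want to use git rebase to replace your existing commits with new and improved ones. This is more complicated and may cause some immediate problems, but it is likely to head off the future ones.

Long

You are thinking of your project as if it were a simple collection of files, but you are using Git. Git is not about files; Git is about commits. A commit contains files (in fact, every commit has a full snapshot of every file), but it is a commit, not a set of files. Some people then think that Git is about branches, but that is not right either. Branches help us (and Git) find commits, but Git is about commits. So you need to think in terms of commits here, not files or folders.

That doesn't solve your problem by itself, but you need all of this background to solve it. So read on.

What to know about commits

In Git, a commit is a numbered entity. Every commit has a unique number, which Git calls a hash ID or sometimes an object ID. These are not simple counting numbers, though: they don't go commit #1, #2, #3, and so on. Instead, each commit's unique number is huge (between 1 and 2¹⁶⁰-1), random-looking, and normally expressed in hexadecimal, for example cefe983a320c03d7843ac78e73bd513a27806845.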
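
To see one of these hash IDs, a small sketch along the following lines should work. The repository is a throwaway one and the demo identity is a placeholder. The 40-digit length assumes a repository using SHA-1, which is the usual case.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first commit"

hash=$(git rev-parse HEAD)            # full hash ID of the current commit
short=$(git rev-parse --short HEAD)   # abbreviated form that Git also accepts
echo "full:  $hash"
echo "short: $short"
```

The exact digits differ on every run, since the hash depends on the commit's contents, including its timestamp.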

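These numbers are useless to humans (so mostly we don't use them; when needed, you will occasionally cut and paste one), but they are how Git actually finds commits, so Git needs them. We'll see in a moment how you can avoid typing them in.

Each of these numbered entities (these commits) contains two things:

Every commit has a full snapshot of every file. The snapshot is stored in a special, read-only, Git-only, compressed and de-duplicated form. Nothing except Git itself can use the files in a commit.
Separate from the snapshot, every commit has some metadata, or information about the commit itself: who made it (name and email address) and when, for instance.

Crucially for Git itself, each commit stores the raw hash ID of some earlier commit or commits. Most commits (the ones Git calls ordinary commits) hold the hash ID of exactly one previous commit.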
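
Both parts of a commit, plus the stored parent hash ID, can be inspected with git cat-file and git ls-tree. This is a minimal sketch in a throwaway repository; file.txt and the demo identity are made up for illustration.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "first"
echo world >> file.txt
git add file.txt
git commit -q -m "second"

git cat-file -p HEAD      # metadata: tree, parent, author, committer, then the message
git ls-tree -r HEAD       # snapshot: every file stored in this commit

# Pull the parent's hash ID out of the metadata.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "parent of HEAD: $parent"
```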

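What all this means for you is that you only need to tell Git the hash ID of the latest commit. Let's draw this, using uppercase letters to stand in for the actual commit hash IDs, which are too ugly. We'll use H to stand for the latest hash ID.

In the metadata for H, Git has stored the raw hash ID of some earlier commit. Let's call that commit G: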

          G <-H

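By reading H, Git can obtain the hash ID of G, and is therefore able to read G. So we say that H points to G.

But G is a commit too, so it holds the hash ID of some still-earlier commit F, which in turn holds the hash ID of another, even earlier commit: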

... <-F <-G <-H

Hence, given the hash ID of the latest commit, Git can easily find every previous commit, all the way back in time. Git simply follows the backwards-pointing internal arrows, one step at a time.
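
The backwards walk can be imitated by hand. The loop below is only a sketch of the idea (Git's own traversal, as used by git log, is more general and handles commits with several parents). The three empty commits and the demo identity are placeholders.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
for n in 1 2 3; do
    git commit -q --allow-empty -m "commit $n"
done

commit=$(git rev-parse HEAD)      # start from the latest commit
count=0
while [ -n "$commit" ]; do
    git log -1 --format='%h %s' "$commit"
    count=$((count + 1))
    # "$commit^" names the first parent; with -q nothing is printed when there is none
    commit=$(git rev-parse --verify -q "$commit^" || true)
done
echo "visited $count commits"
```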

Since H and G both have complete snapshots—in Git's special de-duplicated format, no less—Git can easily retrieve both G and H and compare them, to see what files changed, if any. That lets Git figure out what you did, if you made H, and show you changes. Git didn't store the changes, it just computes them when you ask it to show H, on the theory that showing the changes since G is more interesting than showing the raw contents of H.

In fact, it's not just the snapshot that's read-only: all parts of any commit are completely read-only. So these internal arrows, once a commit is made, are set that way forever.

Branch and other names

To do all this, though, Git needs to know the hash ID of that latest commit. This is where branch names enter the picture. We get a little lazy about drawing the arrows from commit to commit, and draw them like this:

...--F--G--H   <-- main

Here the name main points to commit H, the way H points to G. Unlike the arrows inside commits, though, the arrows coming from a branch name can change: we can make main point to G if we really want to. (That's a lot of what git reset is about, though I'm not going to cover it here.)

We can make more branch names. The only limit Git makes on our branch names is that they must point to actual commits. So we can have two names that both point to H, like this:

...--G--H   <-- develop, main
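
A short sketch of names as movable pointers, using a throwaway repository with two empty commits standing in for G and H (the demo identity is a placeholder):

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m G
git commit -q --allow-empty -m H

git branch develop                # a second name for the same commit
before=$(git rev-parse develop)
git rev-parse main develop        # prints the same hash ID twice

git branch -f develop main~1      # move the name develop back to G
after=$(git rev-parse develop)
git log -1 --format='develop now points to %s' develop
```

Moving develop here changes no commit; only the name's target changes.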

Now that we have more than one name, we need a way to remember which name we're using. Right now both names find commit H, so in some sense it doesn't matter which name we're using, but we are about to change that. So, to remember the name we're using, we'll attach the special name HEAD, written in all uppercase like this, to one branch name:

...--G--H   <-- develop, main (HEAD)

If we now use git switch develop or git checkout develop, we get:

...--G--H   <-- develop (HEAD), main

We are still using commit H, we're just doing that through the name develop now.
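
The attachment of HEAD can be read back with git symbolic-ref. A minimal sketch, assuming a throwaway repository with a single empty commit standing in for H:

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m H
git branch develop

before=$(git symbolic-ref --short HEAD)   # branch name HEAD is attached to
git switch -q develop
after=$(git symbolic-ref --short HEAD)
echo "HEAD was attached to $before, now to $after"
git rev-parse HEAD main                   # same commit either way
```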

Besides branch names, Git lets us have tag names and remote-tracking names. These will generally also point to commits (though tag names often do so through an extra Git object, so as to be able to store annotations). I'll come back to this in a while.

Adding new commits: Git's index or staging area

When we make a new commit, we:

use git checkout or git switch to pick a starting commit;
modify some files and run git add on them;
run git commit.

You might have wondered why we have to run git add over and over again: didn't Git get the memo about those files the first time?

The answer here has to do with Git's index. This part of Git is quite central and important, and Git forces you to know about it. Other version control systems may have something like the index, but keep it well hidden so that you don't have to care, but Git is very insistent here. (It's so important that it actually has three names: it's not just the index or staging area: Git sometimes calls it the cache. This is probably the worst of the three names, and is now mostly just found in flags like git rm --cached.)

Now, we already know that the snapshot inside a commit is read-only, and in fact only Git can read these files. So Git is going to have to copy the files from the commit into a more usable form. These usable copies live in what Git calls your working tree: the place you do your work.

That's pretty straightforward: "checking out" some commit means extract its files. What's not straightforward is that the way Git does this is in two steps:

First, Git "copies" the files to its index, removing from its index the files that are there from the previously checked out commit.
Then Git copies the files from the index to your working tree.

The word "copies" in the first step is in quotes because what Git puts in the index is in Git's compressed and de-duplicated format. Since these files all came out of some commit, they're automatically duplicates. That means they take no space.¹ But still, they act like copies.

The copies in your working tree are ordinary files, in your computer's ordinary format, de-compressed, so they really are copies. And that's why you have to run git add so often.

What git add does is read the working tree version, compress and Git-ify it, and check to see if it's a duplicate. If it is a duplicate, there's already a copy of the data in the repository, and Git re-uses that copy. If it's not a duplicate, the compressed data are now ready to be the first copy, and Git uses that copy. Either way, Git now updates the index entry for the file and the file is ready to be committed.

So git add really means update my proposed next commit, and the index is really your proposed next commit. It starts out matching the current commit and as you git add files to it, you update the proposed commit.
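
The three copies (current commit, index, working tree) can be compared directly. This sketch uses a made-up file.txt in a throwaway repository. git diff --cached compares the index with the current commit, and plain git diff compares the working tree with the index.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo v1 > file.txt
git add file.txt
git commit -q -m "first"

echo v2 > file.txt                                # edit the working-tree copy only
staged_before=$(git diff --cached --name-only)    # index vs. current commit
unstaged_before=$(git diff --name-only)           # working tree vs. index

git add file.txt                                  # update the proposed next commit
staged_after=$(git diff --cached --name-only)

echo "staged before add:   '$staged_before'"
echo "unstaged before add: '$unstaged_before'"
echo "staged after add:    '$staged_after'"
git ls-files -s file.txt                          # index entry: mode, blob hash, stage, name
```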

This makes git commit's job faster and easier. When you run git commit:

1. Git gathers any needed metadata, such as your name and email address and a log message, for the new commit. Git uses the current commit, as found by reading HEAD to see which branch name is the current branch, and reading the branch name to find out the commit's hash ID, as the parent for the new commit.
2. Git turns the proposed commit, in the index, into an actual commit, using the metadata from step 1. Git writes all this out as a new commit object, into the big commit-objects-database. Since this is a new commit, it gets a new, unique, random-looking (but not random) hash ID: this is the first time Git actually finds the hash ID. Since the hash ID depends on all of the data (including not just your name and email address, but also the exact second at which you make the commit), there's no way to know in advance what the hash ID would be.
3. As its last trick, git commit writes the new commit's hash ID into the current branch name.

The result is that we go from:

...--G--H   <-- develop (HEAD), main

to:

...--G--H   <-- main
         \
          I   <-- develop (HEAD)

where commit I is our new commit.
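
The same before-and-after picture can be reproduced in a throwaway repository. In this sketch the file name and demo identity are placeholders, and the commits carry the messages H and I to match the drawing.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m H

git checkout -q -b develop
old_tip=$(git rev-parse HEAD)     # commit H

echo new > file.txt
git add file.txt
git commit -q -m I
new_tip=$(git rev-parse develop)  # the name develop now holds I's hash ID

git log --oneline --graph --all   # develop is one commit ahead of main
```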

If we now run git checkout main, Git will remove, from its index and our working tree, all the files that are in commit I—they're safely saved forever² in that commit—and fill in its index and our working tree from the files saved in commit H.³

¹There's some space per index entry for the file's name, mode, cache data, and hash ID. The actual amount varies, but averages a bit under 100 bytes per file in a lot of cases.

²Forever, or as long as the commit itself exists, that is.

³Git plays a bunch of speed and cleverness tricks here: it can tell, easily, which files are different in H and I, and can skip the remove-and-replace for any files that aren't different. When taken to the extreme of switching between branch names that all point to the same commit, this makes git checkout or git switch really fast, as nothing actually changes except the binding of HEAD. It also means that we can switch branches with uncommitted work lying around, since Git won't have to swap out the files.

How branches grow; merging

This shows us how branches grow. We might start with:

...--G--H   <-- main

From here, we create two new branch names, feature1 and feature2 or feature/tall and feature/short or whatever. I'll just use br1 and br2:

...--G--H   <-- main, br1, br2

We pick br1 to be "on", check it out, and make two new commits:

          I--J   <-- br1 (HEAD)
         /
...--G--H   <-- main, br2

Now we check out br2 and make two other new commits:

          I--J   <-- br1
         /
...--G--H   <-- main
         \
          K--L   <-- br2 (HEAD)

We'll stop bothering to draw in the name main in a moment: it's not important, since what matters to Git are the commits. Note that commits up through H are on all three branches, while commits I-J are only on br1 and commits K-L are only on br2.

Let's now check out br1 and run git merge br2, and look very quickly at how git merge works:

          I--J   <-- br1 (HEAD)
         /
...--G--H
         \
          K--L   <-- br2

The git merge command needs to combine work. To do so, it has to find a common starting point: the best commit that's on both branches. Git will do the usual work-backwards-from-the-end thing to do this. (Technically Git uses the Lowest Common Ancestor algorithm for this, using the extension to DAGs.) In this case the best shared commit is obvious by eyeball though: it's commit H.
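
You can ask Git for this commit yourself with git merge-base. The sketch below rebuilds the drawing with empty commits in a throwaway repository; the demo identity is a placeholder.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m H
git branch br1
git branch br2

git checkout -q br1
git commit -q --allow-empty -m I
git commit -q --allow-empty -m J
git checkout -q br2
git commit -q --allow-empty -m K
git commit -q --allow-empty -m L

base=$(git merge-base br1 br2)    # best commit that is on both branches
git log -1 --format='merge base: %s' "$base"
```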

In order to combine work, Git has to figure out changes. To do that, Git will run git diff on two commits at a time. We start with:

git diff --find-renames <hash-of-H> <hash-of-J>   # what we changed

By comparing the snapshots in H and J, Git can figure out what files we modified, and what we did to those files.

Next, Git repeats this diff, this time from H to L:

git diff --find-renames <hash-of-H> <hash-of-L>   # what they changed

Again Git gets a list of files that were changed, and what happened to them.
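
Both diffs can be run by hand. This sketch uses one commit per branch instead of two, and the file names ours.txt, theirs.txt and untouched.txt are made up for illustration.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > ours.txt
echo base > theirs.txt
echo base > untouched.txt
git add .
git commit -q -m H

git checkout -q -b br1
echo ours-change > ours.txt
git commit -q -am J
git checkout -q -b br2 main
echo theirs-change > theirs.txt
git commit -q -am L

base=$(git merge-base br1 br2)
ours=$(git diff --find-renames --name-only "$base" br1)     # what we changed
theirs=$(git diff --find-renames --name-only "$base" br2)   # what they changed
echo "we changed:   $ours"
echo "they changed: $theirs"
```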

The job of git merge is now to combine these changes. Git should then apply the combined changes to the base version.

For a file that we changed, and they didn't, that's easy: take our changes—or our version of that file from our commit. In other words, "file from H, plus our changes, equals file from J" so Git can just take the file from J. This applies even if we create a new file from whole cloth, for instance: if we added a file, and they didn't, Git can just take ours.

For a file that nobody changed, Git can take any version, as they're all the same.

For a file that they changed and we didn't, Git can just take their version. This also applies to files they deleted, that we didn't, for instance: Git can just delete the file.

Only when both of us changed the same file does it get a little tricky. Git now really does have to combine the two sets of changes. This can produce a merge conflict, if we both touched the same original lines. (I'm skipping over a lot of finicky details here too.)

Assuming all goes well, though, Git simply combines all of our and their changes, applies those combined changes to the files from H, and makes a new merge commit from the result. This new merge commit has a single snapshot—just like any commit—but instead of having a single parent, Git adds, to the usual single parent, a second parent. New merge commit M points back to commit J, like a regular commit would, but then also points to L:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

Then, having written out merge commit M, Git writes its hash ID into the current branch name (br1, where HEAD is attached) as usual, and our merge is done.
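
The two parents of a merge commit can be checked with the ^1 and ^2 suffixes. This is a sketch with one commit per branch and made-up file names, in a throwaway repository.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > ours.txt
echo base > theirs.txt
git add .
git commit -q -m H

git checkout -q -b br1
echo ours > ours.txt
git commit -q -am J
git checkout -q -b br2 main
echo theirs > theirs.txt
git commit -q -am L

git checkout -q br1
j=$(git rev-parse br1)            # tip of br1 before the merge
l=$(git rev-parse br2)
git merge -q -m M br2             # creates merge commit M on br1

git rev-list --parents -n 1 HEAD  # prints M followed by its two parents
git log --oneline --graph
```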

Your situation

I have a feature/mine branch that I want to merge into feature/other, however somewhere during development there were changes made in specific folders that are still in feature/mine and should not be in feature/other. So, the diff in my PR shows changes in src/some-folder-I-dont-want-to-change, but these changes were made deep down the commit tree. I can't simply revert some commits.

At this point you're talking about a GitHub pull request, rather than a Git merge. GitHub PRs are specific to GitHub. Bitbucket also have Pull Requests, and I'm the one who added github to your post, so perhaps you're actually talking about a Bitbucket pull request. Fortunately, while some details are different, the overall setup is the same here. (GitLab call theirs merge requests, and again, there are some differences in detail, but overall the idea is the same.)

To make a GitHub PR, you:

send your new commits to some repository on GitHub: either a GitHub fork, or a shared repository (it doesn't really matter which); then
use a GitHub web page clicky button, or the gh CLI script, to create the pull request: this generates a test merge on GitHub to see if the merge will work, and if not, they tell you about merge conflicts.

To make that test merge, GitHub:

found a merge base commit;
did a test merge, which presumably worked;
is now showing that you've changed some files in src/some-folder-I-dont-want-to-change

Assuming that you didn't change those files yourself, what that means is that the merge base that GitHub are using here is such that the common starting point comes before someone else changed those files.

That is, suppose you have:

         J   <-- feature/mine (HEAD)
        /
       I   <-- feature/third
      /
...--H
      \
       K--L   <-- origin/feature/other

where you made one commit J that changes files that aren't in this src/some-folder-I-dont-want-to-change location. The comparison between H and J, however, includes commit I.

Meanwhile, the comparison from H to L doesn't include any changes to src/some-folder-I-dont-want-to-change.

Your PR therefore shows changes to src/some-folder-I-dont-want-to-change.
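
This situation can be reproduced locally. The sketch below makes several assumptions: a local branch feature/other stands in for origin/feature/other, a single commit L stands in for K--L, and the files f.txt, app.txt and other.txt are made up. The three-dot form of git diff compares the merge base with the second branch, which is roughly the comparison a pull request page shows.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

dir=src/some-folder-I-dont-want-to-change
mkdir -p "$dir"
echo base > "$dir/f.txt"
echo base > app.txt
git add .
git commit -q -m H

git checkout -q -b feature/other
echo other > other.txt
git add .
git commit -q -m L

git checkout -q -b feature/third main
echo third > "$dir/f.txt"         # someone else's change to the folder
git commit -q -am I

git checkout -q -b feature/mine
echo mine > app.txt               # your own change, outside the folder
git commit -q -am J

# Merge base (H) compared with the tip of feature/mine (J): commit I's change shows up too.
git diff --name-only feature/other...feature/mine
```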

You can, if you wish, obtain commit L—that's their most recent commit—and extract from it all the files that are in src/some-folder-I-dont-want-to-change, in that form, and commit the result:

         J--M   <-- feature/mine (HEAD)
        /
       I   <-- feature/third
      /
...--H
      \
       K--L   <-- origin/feature/other

Now, a comparison from H to M shows no changes to src/some-folder-I-dont-want-to-change. The problem is, your branch name feature/mine now means commit M, which—by going back one hop at a time—includes commit I, which means you have now backed out the changes from feature/third. That is, your commit M might as well be a revert.
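
A sketch of creating commit M, under the same assumptions as before (local feature/other in place of origin/feature/other, one commit L in place of K--L, made-up file names). It uses git restore to take the folder from L; checking out the files from L by other means would serve as well.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

dir=src/some-folder-I-dont-want-to-change
mkdir -p "$dir"
echo base > "$dir/f.txt"
echo base > app.txt
git add .
git commit -q -m H
git checkout -q -b feature/other
echo other > other.txt
git add .
git commit -q -m L
git checkout -q -b feature/third main
echo third > "$dir/f.txt"
git commit -q -am I
git checkout -q -b feature/mine
echo mine > app.txt
git commit -q -am J

# Commit M: take the folder exactly as it is in L and commit the result.
git restore --source=feature/other --staged --worktree -- "$dir"
git commit -q -m M

pr_files=$(git diff --name-only feature/other...feature/mine)
echo "files in the PR diff: $pr_files"
git log --oneline feature/mine    # M, J, I, H: commit I is still in the history
```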

If they (whoever they are) accept your updated PR, with its proposed merge of M into L, the result will include commit I, provided they use the MERGE button on GitHub. GitHub have three different clicky web buttons here:

MERGE: this does an actual Git merge. You've now set things up so that Git will believe that commit I is properly incorporated. This means whoever did make commit I has to re-do their work. That's the future time-bomb.
REBASE AND MERGE: this makes the Git software on GitHub copy each commit in the PR to a new-and-supposedly-improved commit. This will have a similar effect, though it changes how whoever did make commit I has to handle things.
SQUASH AND MERGE: this doesn't create a merge at all. This prevents this particular time-bomb. They replace all of "your" commits—including the commit I that you inherited from someone else—with a single commit. The effect is that whoever did make commit I doesn't have to re-make it, because nobody will ever see your commits J and M in the end (and you'll have to discard your branch, as with any squash).

Understanding how and why this works is the key here. If you're going to set up a future time-bomb, whoever made the commit that you are in effect reverting needs to know about this. If you're using the squash-and-merge method (which has other drawbacks), that has the beneficial side effect of defusing this entirely.
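
The time-bomb can be observed locally. In this sketch a plain git merge stands in for the MERGE button, a local feature/other stands in for origin/feature/other, and the file names are made up. After feature/mine (with its commit M) is merged, a later merge of feature/third brings in nothing.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

dir=src/some-folder-I-dont-want-to-change
mkdir -p "$dir"
echo base > "$dir/f.txt"
echo base > app.txt
git add .
git commit -q -m H
git checkout -q -b feature/other
echo other > other.txt
git add .
git commit -q -m L
git checkout -q -b feature/third main
echo third > "$dir/f.txt"
git commit -q -am I
git checkout -q -b feature/mine
echo mine > app.txt
git commit -q -am J
git restore --source=feature/other --staged --worktree -- "$dir"
git commit -q -m M

git checkout -q feature/other
git merge -q -m "merge feature/mine" feature/mine   # the MERGE case
git merge feature/third                             # Git sees nothing left to merge
cat "$dir/f.txt"                                    # I's change is not here
```

Git treats commit I as already merged, so whoever made it would have to redo the work in new commits.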

Side note: origin/feature/other

The name origin/feature/other is a remote-tracking name. This has the same effect as a branch name, in terms of finding commits. The key difference between a branch name and a remote-tracking name is that the latter is something your Git creates to remember some other Git repository's branch name. You cannot switch to a remote-tracking name (git checkout produces a detached HEAD, and git switch produces an error).

When you run git fetch to obtain commits from some other Git repository, your Git reads that other Git's branch names. To keep your branch names yours—to avoid overwriting yours with theirs—your Git software renames their branch names. Your Git then creates or updates these remote-tracking names instead. The remote-tracking name, such as origin/main or origin/develop or whatever, has the name of the remote origin stuck in front of it: hence the term remote-tracking name. (The Git documentation calls these remote-tracking branch names but I find the word branch here to have negative value: removing it produces an improved term.)
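
A small sketch of this renaming, using two throwaway repositories on the local disk. The directory names upstream and clone and the demo identity are placeholders, and a local path stands in for a hosted repository.

```shell
set -e
work=$(mktemp -d)      # holds both throwaway repositories
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/main
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m H

git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git branch -r                         # remote-tracking names such as origin/main
before=$(git rev-parse origin/main)

# The other repository gains a commit; our names do not move until we fetch.
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m I
git fetch -q origin
after=$(git rev-parse origin/main)    # updated to their new commit

git log -1 --format='origin/main: %s' origin/main
git log -1 --format='main:        %s' main   # our own branch name is untouched
```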

Rebasing

Assuming this diagram, or a similar one:

         J--M--N   <-- feature/mine (HEAD)
        /
       I   <-- feature/third
      /
...--H
      \
       K--L   <-- origin/feature/other

captures the real problem, you have another alternative. You can replace your existing commits J-M-N with new-and-improved commits that start from H, rather than from I. The git rebase command assists with this.

To do this, you would want to run:

git switch feature/mine
git rebase --onto <hash-of-H> feature/third

If the name main points to commit H, you could use:

git switch feature/mine
git rebase --onto main feature/third

You could use the raw hash ID of commit I rather than the name feature/third here as well. The general idea is that we want to tell git rebase two things:

Put the copies after commit H: that's why we need the hash of commit H, or the name main. Anything that locates the right commit will work here.
Using the current branch, copy commits, but don't copy commit I or anything earlier than I. Anything that locates commit I suffices here: the name feature/third does it, but so does the raw hash ID of commit I itself.

The rebase command will start at the current commit, here N, and work backwards until it reaches commits that are to be excluded from the copying: in this case, commit I and earlier commits. These are the candidates for copying.⁴ It then puts the to-be-copied commits into the right order, saving their actual hash IDs in a work-list. (An interactive rebase, git rebase -i, allows you to edit the work list, but you probably don't want that here.)

Once the list is ready, git rebase uses Git's detached HEAD mode to copy each to-be-copied commit, one at a time. Rather than detail how this works—though I will say here that it uses git cherry-pick or equivalent—I'll just draw the final result that you get if all goes well:

         J--M--N   [abandoned]
        /
       I   <-- feature/third
      /
...--H--J'-M'-N'   <-- feature/mine (HEAD)
      \
       K--L   <-- origin/feature/other

This final result has copies of your commits, but omits the not-copied I commit. So you no longer have any changes to the files you didn't change: your three—or however many—commits start from the same base commit as origin/feature/other. So if you now use git push --force to update your GitHub PR, the apparent changes to these other files will vanish.
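
The whole rebase can be rehearsed in a throwaway repository before touching the real one. This sketch assumes the name main points to commit H, uses a local feature/other in place of origin/feature/other, has two of your commits (J and N) instead of three, and uses made-up file names. It stops short of the git push --force step.

```shell
set -e
repo=$(mktemp -d)      # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

dir=src/some-folder-I-dont-want-to-change
mkdir -p "$dir"
echo base > "$dir/f.txt"
echo base > app.txt
git add .
git commit -q -m H
git checkout -q -b feature/other
echo other > other.txt
git add .
git commit -q -m L
git checkout -q -b feature/third main
echo third > "$dir/f.txt"
git commit -q -am I
git checkout -q -b feature/mine
echo mine > app.txt
git commit -q -am J
echo notes > notes.txt
git add notes.txt
git commit -q -m N

# Copy the commits after feature/third (J and N) so that they come after main (H).
git rebase -q --onto main feature/third

git log --oneline feature/mine                        # N', J', H
git diff --name-only feature/other...feature/mine     # no files from the folder
```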

No matter what merge strategy someone uses, the merge won't include any changes to these other files, and there is no time-bomb set up. So this method is often superior. This is what git rebase is for: to set up new-and-improved commits that do only what you want done.

⁴This candidates list is normally further winnowed by:

removing any merge commits;
removing from the list commits whose patch IDs match those in the upstream; and
using the fork-point trick if --fork-point is enabled.

In this case, none of these should have much effect. If you have merge commits in the list, though, things get much more complicated.


Git: how do I set a folder in my project to be the same as the folder in a remote branch?