- Yet Another git Guide for svn Users
- Other Subjects
- A Day In The Life Of...
- Special Examples
- What's Not Here
- Other Resources
Thanks to the COVID-19 pandemic, I'm stuck at home. My work team is migrating from `svn` to `git` and I was going to put together a brown bag or two for them, but since I'm doing it from home I'm able to put a copy on GitHub.
There are many sites out there that cover similar things, but many seem to be "how to migrate a repo" or simple cheat sheets. I want a simple set of things I can present over a lunch session or two.
Every technical tool has its own jargon, and there is some overlap between the two. I'd like to make sure we're always on the same page. Here are a few key terms that we've already been using with svn and how they apply to git, along with a few extra you'll need:
Term | svn | git |
---|---|---|
working copy | your local checkout | N/A, but kinda the same |
repository | the One True Copy(TM) | a copy of the codebase with all history included; some people (incorrectly) say "the repository" to indicate a central copy, e.g. the repo on GitHub |
revision | snapshot of the state of all files across all branches | snapshot of the file tree at any specific time |
branch | another full copy of entire file tree | a special type of reference (see below) |
commit | save your changes into a new revision for everybody to immediately see | save your changes into a new revision in your local repository |
reference | N/A | a pointer into a specific revision |
index | N/A | the "staging area" between your local file system (working copy-ish) and the repository |
A full reference is available with `git help gitglossary`.
As noted above, a revision is a snapshot of everything at a specific time. Each of these snapshots, combined with its metadata (author, comment, ancestors, etc.), is hashed with SHA-1 to create the unique identifier that labels that revision.
Reference | Location | Use |
---|---|---|
`HEAD` | N/A | Points to a specific revision in the repository that the localfs is "based on." |
branch name | `.git/refs/heads/` | Points to the HEAD of the given branch. A new commit to this ref will move ("follow") this to the new revision. |
tag | `.git/refs/tags/` | Points to a specific revision. If currently HEAD of a branch, it will not move on a new commit. |
Tags are special references that don't move, similar to subversion's tags. If you `cat` any of the files listed above, e.g. `.git/refs/heads/master`, you will see it is simply the 40-hex SHA-1 hash and a newline.
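A quick way to convince yourself of this is to build a throwaway repository and look at the ref file directly. This is only a sketch: it assumes a default git setup (loose ref files, SHA-1 hashes), and the temporary directory, identity, and file name are made up for illustration.

```shell
# Throwaway repository; directory, identity and file name are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# The branch name depends on your git configuration (master, main, ...).
branch=$(git symbolic-ref --short HEAD)

# The ref file holds nothing but the hash of the revision it points to.
ref_contents=$(cat ".git/refs/heads/$branch")
ref_size=$(wc -c < ".git/refs/heads/$branch" | tr -d ' ')
head_hash=$(git rev-parse HEAD)

echo "branch:   $branch"
echo "ref file: $ref_contents ($ref_size bytes)"
echo "HEAD:     $head_hash"
```

The two hashes printed should be identical, and the file size should be the 40 hex characters plus the newline.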
Often the shortened version of the hash is unique enough (7 chars), and you can use "`R~N`" to say "N revisions before R" (no N means 1; `R^` is the first parent of R, the same as `R~1`). References can be used in many places on the command line and are very useful. For example, if I want to see what files were changed in the last commit, knowing that the most recent revision is always `HEAD`:
$ git diff HEAD^ HEAD
# shortened with bash shortcut:
$ git diff HEAD{^,}
$ git diff HEAD{^,} --stat
README.md | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
1 file changed, 62 insertions(+), 29 deletions(-)
$ git diff HEAD{^,} --stat -b -w # ignore whitespace
README.md | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------
1 file changed, 55 insertions(+), 22 deletions(-)
The `git lg` alias below is very helpful to see history from the command line along with important refs:
* 5845e57 - (HEAD -> scratch, origin/scratch) WIP done for now (21 hours ago) <Aaron D. Marasco>
* b0cfc5e - WIP: Need to see those images (22 hours ago) <Aaron D. Marasco>
* eba6752 - Merge branch 'scratch' of https://github.com/AaronDMarasco/git4svn into scratch (22 hours ago) <Aaron D. Marasco>
|\
| * 725b186 - WIP save to see new images (23 hours ago) <Aaron D. Marasco>
* | a7c7334 - More images (22 hours ago) <Aaron D. Marasco>
* | 564d80d - More images (23 hours ago) <Aaron D. Marasco>
|/
* ef4aec1 - New image with fetch etc (23 hours ago) <Aaron D. Marasco>
...
* 8dc42ff - (origin/master, origin/HEAD, master) Initial commit (3 days ago) <Aaron D. Marasco>
What you cannot see here is the colors: the abbreviated hashes on the left (e.g. `564d80d`) are red and the named refs (e.g. `origin/scratch`) are yellow. This makes them easy to identify. You can also see it uses ASCII art to show where two copies of the source code diverged (after `ef4aec1`) and then merged (`eba6752`), because I was creating the images on my local machine while using an online editor for the main document.
Many of us learn better by example, so here is one. This is my file tree:
LICENSE
README.md
I don't have them in a git repository yet, so I will create one. Details omitted, because it's easy to search and not relevant here. Since the files are new to the repository, I need to `git add` them. What actually happens when I do that?
`git` begins to create a new revision, called the index, and adds the contents of `LICENSE` and `README.md` to it immediately.
Why did I emphasize the word immediately? Because `svn add` says "from now on, I need to start watching `LICENSE` and `README.md`", while `git add` stages the contents of the files as they are right now in the index. If I then do `echo FOO >> README.md`, a commit in `svn` would have "FOO" added to the end of the file. In `git`, the added "FOO" won't be committed. I'll try to illustrate it a little:
After that set of commands, the index has the original file, while the file system has the new "FOO" at the end:
$ git diff
diff --git a/README.md b/README.md
index 1a29b06..24b5c0a 100644
--- a/README.md
+++ b/README.md
<extra info removed>
+FOO
And if we want to see "how much" has changed (compared to what would be committed):
$ git diff --stat
README.md | 1 +
1 file changed, 1 insertion(+)
Again, the default of `git diff` tells you the difference between the file system and the index. This is important, because `git commit` and `svn commit` are very similar, but act differently.
The subversion mindset is "but these files changed, I want them checked in." The git mindset is "only check in the changes I explicitly tell you to."
This is further illustrated here, along with the `git diff` variation you need to use to see what is about to be committed:
This is a major source of confusion for subversion users and it's extremely important in unlocking some of the power of git!
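The difference between the working copy, the index, and what actually gets committed can be reproduced in a scratch repository. The following is a minimal sketch, assuming nothing beyond a stock git install; the temporary directory, identity, and file contents are invented for the demonstration.

```shell
# Scratch repository; path, identity and contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "original text" > README.md
git add README.md            # stages the contents as they are right now
echo "FOO" >> README.md      # later edit: exists only in the working copy

# Working copy vs. index: contains the unstaged "+FOO" line.
unstaged=$(git diff)
# Index vs. last commit: what "git commit" is about to record (no FOO).
staged=$(git diff --cached)

git commit -q -m "Add README"
committed=$(git show HEAD:README.md)
echo "$committed"            # the committed file does not contain FOO
```

After the commit, `git status` would still report `README.md` as modified, because the "FOO" line never went through `git add`.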
So many times, you'll see the advice to "just run `git add --all` (or `git commit -a`) to add it all." This is using a halberd when you want a scalpel, but it's the default mindset of an svn user, because that's the only tool they had.
I've been working for a few hours on a small feature, and I didn't have it in a branch (so I'm working in `master`). My local working copy has about six files changed, but mostly added debug statements and other stuff that I don't want checked in. My coworker asks me to tweak a file with some secret sauce that only I know about and nobody else.
You probably already have two or three working copies from the same repo, right? So you switch to one of the others, do an `svn update` (more on that later), make the change, and then `svn commit`.
You fix the two magic files that needed to be fixed. A changeset is effectively a series of modifications (patches) to a file, and you can run `git add -p`, a special mode of `add`ing that presents each patch to you and asks whether it should be added to the index. You know certain files don't need to be asked about, so you can either tell it not to ask any more about that file ("`d`") or just give it the two files on the command line: `git add -p file1 file2`. If you had no unrelated changes in the files, you could've skipped the `-p`, but we're going to say you had debug enabled at the top of the file, and you don't want that committed. When you're done:
- `git diff --cached` shows you only the changes needed to fix your coworker's problem
- `git commit` will commit only those changes, with all your other files still modified
Hopefully this has shown you some of the power of the index. But what if somebody made changes on the central repo, and you haven't been keeping up to date? Then your `push` might fail.
Another useful tool in the git toolbox is The Stash. It is a place where you can stash changes temporarily for various reasons. In this example, it's because you need to make changes but didn't get the latest "official" code (see below for more).
$ git stash save "need to fix something"
Saved working directory and index state On master: need to fix something
$ git stash list
stash@{0}: On master: need to fix something
stash@{1}: WIP on master: 2ca9eec diffs image
The stash can be thought of as a special branch that stores changesets that are only available to the local repository. That means they won't ever leave that working copy. So we've saved off our work (all that quick-and-dirty debugging with `printf` etc.) and now have a clean version of `master` again!
$ git pull
# code code code and fix the problem
$ git diff
$ git commit -am "Fixed the doohickey again" # git commit -a is NOT RECOMMENDED
$ git push
$ git stash pop
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: README.md
no changes added to commit (use "git add" and/or "git commit -a")
Dropped refs/stash@{0} (ed5e73f01a6c0de518cdc24a9cf2da4dd7a6fff5)
See there that `stash@{0}` is a revision? It was `ed5e73f01a6c0de518cdc24a9cf2da4dd7a6fff5`, but now it's gone. Since a revision is roughly a changeset, you can actually `apply` it instead and it will stay in the stash. So you could actually do something like this:
# write code to enable debug mode in a submodule
$ git stash save "Enable debug mode in submodule X"
What would this do? Three weeks from now, you can be working in submodule X. You already have a snippet that enables all the debugging that you can `apply` automatically at any time!
$ git stash list
# find the one you want, let's say it's stash@{3}
$ git stash apply stash@{3}
# your working copy now has debug enabled on submodule X!
If you didn't catch the subtle difference, `pop` is a combination of `apply` and then, if successful, a `drop` (which I didn't cover). So `apply` leaves the revision in the stash.
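The `apply` versus `pop` behavior is easy to check in a scratch repository. This sketch uses `git stash push -m`, the current spelling of `git stash save`; the directory, identity, file name, and the "DEBUG=1" line are all invented for illustration.

```shell
# Scratch repository; path, identity and file contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "code" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Make a change and stash it away; the working copy is clean again.
echo "DEBUG=1" >> notes.txt
git stash push -q -m "Enable debug mode"

# apply: the change comes back, and the entry stays in the stash.
git stash apply -q 'stash@{0}'
after_apply=$(git stash list | wc -l | tr -d ' ')

git checkout -q -- notes.txt   # throw the applied change away again

# pop: the change comes back, and the entry is dropped.
git stash pop -q
after_pop=$(git stash list | wc -l | tr -d ' ')

echo "entries after apply: $after_apply, after pop: $after_pop"
# prints "entries after apply: 1, after pop: 0"
```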
This is another sticky point for svn users, so let's talk about what happens when. As a reminder:
`origin` is the default name for an upstream ("remote") repository: the "main" copy hosted on a corporate server, GitHub, etc. It's where you originally "cloned" from.
Earlier we talked about `git add` and how to put your changes into the index. If you then `git commit` them, they are put into your repository. In subversion, it's everybody's repository, so you knew you were always working with the latest code (because you were forced to `svn update` before you could commit if you weren't!). In git, you're also responsible for keeping your entire repository synchronized with other(s).
Command | Usage |
---|---|
`add` | adds a file to the index to be committed (covered above in index discussion) |
`commit` | writes a set of changes into the repository as a revision (covered above in index discussion) |
`clone` | copies a repository from a remote location (not covered here) |
`fetch` | synchronizes the database from a remote repository to the local (read-only) |
`merge` | merges two revisions into a single revision (not always a branch!) |
`pull` | combines fetch and merge into a single command |
`push` | synchronizes the database from a local repository to the remote (write-only) |
`checkout` | copies file(s) from the repository to the localfs |
This section's examples and text assume you set `autosetuprebase` as noted in the Setup section below.
`git pull` is roughly equivalent to `svn update` and is the command you will likely use the most. However, for completeness, let's examine the two underlying commands because they can be useful on their own.
git-fetch - Download objects and refs from another repository
This will read all the changes from a remote repository (by default `origin`) and replicate them in the local repository. These revisions are now all available immediately in our repository. However, they are not expressed in our local filesystem. If I was working yesterday on a branch named `branchA`, and I pushed it to the server, the one revision `24f3b8fecf` in my repository is referenced as my branch `branchA` and also `origin/branchA`. However, if my coworker made some changes this morning and pushed them, I just received them in my repository. `branchA` on my repository has not changed, but `origin/branchA` is now `fcbd2855f`. An example of this:
$ git status
On branch branchA
Your branch is behind 'origin/branchA' by 1 commit, and can be fast-forwarded.
(use "git pull" to update your local branch)
nothing to commit, working tree clean
By adding `--prune` to the end of `fetch` (or `pull`), any remote branches that no longer exist will be removed from your local repository. This is often what you want, so it is referenced below as a recommended configuration.
git-merge - Join two or more development histories together
This command is usually used for another reason (which we'll touch on, but as an svn user, you already know). But, as noted above, `git merge` merges two revisions into a single revision, and we want to merge "our" `branchA` (`24f3b8fecf`) with "their" `branchA` (`origin/branchA` or `fcbd2855f`). So to do that (assuming `branchA` is already checked out locally), we use the same merge command, but the source of the merge looks special (the target is the current `HEAD`):
$ git merge origin/branchA
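The fetch-then-merge sequence can be tried entirely on one machine by letting a bare repository stand in for the server. This is a rough sketch under that assumption; the directory names (`server.git`, `mine`, `coworker`), identities, and file contents are all hypothetical.

```shell
# Everything lives in a temp directory; a bare repo plays the "server".
work=$(mktemp -d)
cd "$work"
git init -q --bare server.git

# My clone: create branchA and push it.
git clone -q server.git mine 2>/dev/null
cd mine
git config user.name "Me"
git config user.email "me@example.com"
git checkout -q -b branchA
echo "one" > file.txt
git add file.txt
git commit -q -m "First revision"
git push -q -u origin branchA
cd ..

# The coworker's clone: add a revision to branchA and push it.
git clone -q -b branchA server.git coworker
cd coworker
git config user.name "Coworker"
git config user.email "coworker@example.com"
echo "two" >> file.txt
git commit -q -am "Coworker's change"
git push -q
cd ../mine

# fetch moves origin/branchA, but my own branchA stays where it was.
git fetch -q
local_before=$(git rev-parse branchA)
remote_after=$(git rev-parse origin/branchA)

# merge then brings my branchA (and my files) up to date.
git merge -q --ff-only origin/branchA
local_after=$(git rev-parse branchA)

echo "before merge: branchA=$local_before origin/branchA=$remote_after"
echo "after merge:  branchA=$local_after"
```

Between the `fetch` and the `merge`, `git status` in `mine` would report the branch as behind `origin/branchA` by one commit.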
git-pull - Fetch from and integrate with another repository or a local branch
So that leaves us with `git pull`. It's basically a shortcut - it is mostly equivalent to `git fetch && git merge origin/<branchname>`. As noted above, it is what you will do 99% of the time, and can be treated as a rough equivalent of `svn update`. The difference is that if the merge fails, the fetch did happen, so you can locally examine what is wrong, e.g.:
$ git diff origin/branchA README.md
Because (don't forget) your local repository has all the information, and the reference to what the upstream has for `branchA` is `origin/branchA`.
Note: I'm hand-waving here a bit, because I hope you set `autosetuprebase` as noted in the Setup section below. If you did, then it's actually doing a "`git rebase`" after the `fetch`, in place of the `merge`. This makes our repo a lot cleaner and easier to follow. Essentially, it "rolls back" all your changes since you last synchronized to the upstream. Then, it updates your branch to match what is upstream. Once that is complete, it re-applies your changesets but based off of the "new" branch. If this is able to happen cleanly, then a merge revision was never needed. It will clearly tell you when it is doing it as well:
$ git pull --rebase # This is your default if you set autosetuprebase
remote: Enumerating objects: 8, done.
remote: Counting objects: 100% (8/8), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 6 (delta 4), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (6/6), done.
From https://github.com/AaronDMarasco/git4svn
eba6752..c44bf7f branchA -> origin/branchA
First, rewinding head to replay your work on top of it...
Applying: Image tweaked
git-push - Update remote refs along with associated objects
This command simply sends your latest changes to the remote repository. If the remote has "moved on" past what your repo "knew" about, it will fail and require you to `pull` again. There are server-side hooks that may also reject your changes for various reasons (branch control, etc.). There is no equivalent in subversion, because `commit` handled that. Don't forget to do this if you are expecting somebody else to see your code!
git-revert - Revert some existing commits
A "revert" is explicitly an additional revision that will undo a previous revision. To maintain history, both the insertion and the deletion remain in the repo. If your code was ever pushed to a remote, this is what you should be doing. It does not roll back to a previous revision. If you never pushed the change and you want to roll back, there are tricks you can do with `git checkout` that can be found online.
git-checkout - Switch branches or restore working tree files
This command is another source of confusion because subversion's `checkout` is totally different (it's the same as `git clone`). As shown in the illustration above, `git checkout` checks file(s) out of the repo. The normal 99.44% use case is to change what branch you are currently working on:
$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
$ git checkout branchA
Switched to branch 'branchA'
Your branch is up to date with 'origin/branchA'.
When you want to create a new branch, you can add `-b` to the command and it will branch from wherever you are, including a "dirty" workspace.
But yes, you can `checkout` a single file (and it will auto-`add`, which I don't like).
$ git checkout master README.md
$ git status
On branch branchA
Your branch is up to date with 'origin/branchA'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: README.md
This also covers the "just throw away everything I did to this file and bring it back to what's in the repo" scenario (equivalent to `svn revert`):
$ git checkout -- README.md
$ git status
On branch branchA
Your branch is up to date with 'origin/branchA'.
nothing to commit, working tree clean
Branching is yet another paradigm shift that you should embrace. With subversion, a new branch meant you sacrificed a ton of disk space and had to wait while things were copied, etc. Then when you re-build, all the files are new so `make` runs forever. In git, a branch is a 41-byte file because it's simply a reference into the repository at a certain revision. This is why they are often referred to as "lightweight" and branching is extremely encouraged. Because of the decentralized nature of git, your branches are unknown to anybody else unless you `push` them to a remote repository. This means you can make branches for the tiniest of things if you think you would need to roll back. You should almost always be working in a branch even if it is local-only. You can always merge it back into the mainline development on your schedule and with your sanitized notes (see "squashed commits" elsewhere). For example, you may want to commit to your branch:
- Hourly. Seriously, you can then `diff` and see what you changed in the past hour.
- After code successfully compiled. Then you can always get back to the working state. Do this before trying to clean anything up.
- When you're about to experiment with an alternative option with something.
It's difficult to emphasize how much of a life-changer this can be until you actually start using it, especially when you combine it with `git diff` to see the differences.
There are actually three kinds of merges, and you should be familiar with them because they make things a lot easier to follow if used properly.
- The first kind of merge is a "fast-forward" merge:
Since there are no changes in the local repository between what the upstream considers `branchA` and what we have as `branchA`, we can simply "fast forward" the reference to the new revision:
$ git pull
...
Fast-forward
README.md | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
1 file changed, 62 insertions(+), 29 deletions(-)
Current branch branchA is up to date.
If you think this is what should happen, you can use `git merge --ff-only` to ensure that's the case. If, for some reason, you find this unacceptable, you can use the `--no-ff` flag.
- The second kind of merge is a "standard" merge:
For this, both repositories (ours and the remote) have new revisions that we both consider part of `branchA`, so we need to create a new revision that merges them. (Again, this wouldn't happen with `autosetuprebase`, but you could imagine two different branches instead; it's the same.) Of course, this still needs to be pushed as noted above. When performing this kind, I highly recommend using the `--no-commit` flag so you can review what the merge would have done and then manually `git commit`; it will autopopulate the commit message properly for a merge.
- The last kind of merge is a "squashed" merge:
In this image, there's a dotted arrow between `7589a237e` and `2c57a9eec` because the merge happened and all the data is there, but the metadata doesn't record it. The log message is (by default) a concatenation of the log messages of all the changesets in between. Once the "my_work" branch is deleted, its revisions (including `7589a237e`) are orphans and are subject to garbage collection in the future. A much more detailed example of squashed merges (I'm a huge fan of them) is below.
Squashing commits is a useful way to keep related changes within a single changeset for later examination. There are good arguments both for squashing ("no need to see how the sausage was made and these intermediate changesets that make no sense on their own") and against ("the way this document was tweaked is unique enough that I might want to reference that specific changeset later").
The following is an example of merging in a branch "`feature--cool-intro`" from the top-level of the repository:
git checkout master
Ensure latest sync with server:
git pull --rebase
git merge --no-commit --squash feature--cool-intro
At this point, the merge is staged in the "index," but the git-proposed commit message is not very human-readable (it includes the full logs of all intermediate commits). This information is stored in your repository in `.git/SQUASH_MSG`:
$ grep commit .git/SQUASH_MSG | tail -3
commit 5faf49f11536fbae7819c559a62a21a47ec505c3
commit 0cd5cd8ae470377ed11b22058a45a895e318d4fa
commit 1b4f298de3ae062063c02b767f045aa3ecf3be82
Note the last commit ID, in this example `1b4f298de3ae062063c02b767f045aa3ecf3be82`, which is the first change in the branch when it first came out of "`master`" (or when it was last pushed there).
Now we will create a new proposed log message in a prettier format and condensed information:
git log --format='%h (%cD)%+s%+b' --graph 1b4f298de3ae062063c02b767f045aa3ecf3be82^..feature--cool-intro > /tmp/commit_message
Don't miss the "^", which indicates that the log should begin at the changeset before the requested one, which makes it include that changeset as well.
git commit -eF /tmp/commit_message
This will launch your editor. If your editor is git-savvy, it will note that your current commit format is invalid. Be sure to insert two lines into the beginning:
- Your one-line commit summary, used to generate log messages like the one currently being edited. It should probably be something like "Squashed commit of feature--cool-intro".
- A totally blank line.
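The effect of a squashed merge on history can be seen in a toy repository. The sketch below is a simplified variant of the steps above, assuming a local-only repository (so no `git pull`) and a short hand-written commit message instead of the generated one; paths, identity, and file contents are invented.

```shell
# Toy repository; path, identity and contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > doc.txt
git add doc.txt
git commit -q -m "Base"
main_branch=$(git symbolic-ref --short HEAD)   # master, main, ...

# Two intermediate "sausage-making" commits on the feature branch.
git checkout -q -b feature--cool-intro
echo "step 1" >> doc.txt
git commit -q -am "WIP 1"
echo "step 2" >> doc.txt
git commit -q -am "WIP 2"

# Squash-merge: changes are staged, nothing is committed yet.
git checkout -q "$main_branch"
git merge --squash feature--cool-intro >/dev/null
squash_msg_exists=no
if [ -f .git/SQUASH_MSG ]; then squash_msg_exists=yes; fi

git commit -q -m "Squashed commit of feature--cool-intro"

# "hash parent" = 2 words: the new revision has exactly one parent.
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
commits_on_main=$(git rev-list --count HEAD)
echo "SQUASH_MSG present: $squash_msg_exists"
echo "revisions on $main_branch: $commits_on_main"
```

The mainline ends up with two revisions ("Base" and the squashed commit), while the two "WIP" revisions remain reachable only through the feature branch.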
This is stuff that I think is important / useful but I couldn't fit it elsewhere.
A directed acyclic graph (AKA "DAG") is how the revisions are related to each other - see Wikipedia and the `git dag` add-on mentioned elsewhere. For example, the illustrations shown for "git fetch" and "git merge" show DAGs.
git-help - Display help information about Git
This command is how you access the documentation for git, e.g.:
git help diff
git help commit
git help checkout
git help gitglossary
git help
git-status - Show the working tree status
This one is used all the time and you should probably scan through `git help status` to see how powerful it can be.
Some reminders:
- The companion `git diff --stat` can show roughly how much you've deviated from what's currently checked in
- With no parameters, it shows the entire repo, so you often want to give it a path (or just `.`)
git-grep - Print lines matching a pattern
If you're on a system where you cannot get `ripgrep` installed, then `git grep` is the next best thing. It's just like `grep -r` but will use all your available CPUs in parallel to search the repository database, which is insanely faster. By default it will only search tracked files, but you can also ask it to search `--untracked` files as well. I cannot emphasize enough how useful this capability is.
The options I prefer for `grep` are listed in the config info below so I can just do `git g <expression>`.
Because a repository is "only" a subdirectory `.git` at the top-level of a directory tree, you can put a git repository anywhere you think having history and the ability to roll back would be useful. Some examples:
- `/etc/` before upgrading some RPMs
- Any configuration directory before you make a bunch of changes
- Don't forget that Windows 10 has embedded Ubuntu with git available
- Inside a subversion repo (if you're stuck in svn and don't want to use the built-in git-svn capabilities)
  - If doing this, it's useful to `git tag` svn revisions
Then when you're done with whatever "risky" thing you were doing, simply `rm -rf .git`.
git-diff - Show changes between commits, commit and working tree, etc
Yes, it was tangentially covered above, but you really need to play with it to start to understand the power. You can effectively get a diff of nearly anything across all space and time. You can compare a file in one branch to another file (different name) in another branch, etc.
git-difftool - Show changes using common diff tools
If you'd prefer a graphical interface, you can configure one and then use `git difftool` to launch it. See below for configuring it for `meld`. It also supports `--cached` and any other `git diff` options.
git-log - Show commit logs
This tool has many amazing options, like `--since` and `--before/--after`, e.g. `git log --since="yesterday"` or `git log --since="last month"` or even "`last tues`".
The configuration below has two aliases with `git log` options: `git lg` (shown before) and `git last`, which shows the last change.
git-mergetool - Run merge conflict resolution tools to resolve merge conflicts
When a merge fails, subversion leaves you high and dry. Git lets you define a graphical tool (see below for configuring it for `meld`) to launch to attempt to fix the broken merge.
git-cherry-pick - Apply the changes introduced by some existing commits
This is very helpful when you are deep in a branch for weeks and somebody tells you, "in my branch, I fixed that really important bug that you've been hitting."
$ git fetch
$ git lg origin/helpful_branch
$ git cherry-pick -x <ref> # if last, can simply be origin/helpful_branch
This fixes that one important issue without a full merge of the other branch, putting that off until you are ready later. The `-x` records its source in the commit message for later reference.
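To see what `-x` actually records, a local-only toy repository is enough. This sketch assumes the helpful branch is a local branch rather than `origin/helpful_branch`, and the branch names other than `helpful_branch`, the file names, and the identity are invented for illustration.

```shell
# Toy repository; path, identity and file names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Base"
main_branch=$(git symbolic-ref --short HEAD)

# The coworker's branch: one important fix plus unrelated work.
git checkout -q -b helpful_branch
echo "bugfix" > fix.txt
git add fix.txt
git commit -q -m "Fix the important bug"
fix_hash=$(git rev-parse HEAD)
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "Unrelated work"

# My branch: take only the fix, not the rest of helpful_branch.
git checkout -q -b my_branch "$main_branch"
git cherry-pick -x "$fix_hash" >/dev/null

message=$(git log -1 --format=%B)
echo "$message"   # ends with "(cherry picked from commit <hash of the fix>)"
```

Only `fix.txt` arrives on `my_branch`; `other.txt` stays behind until a real merge happens.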
- See also: `git help giteveryday`
What you'll need to do every day is covered elsewhere within this document, but let's reiterate one more time the workflow that will happen 90% of the time. This assumes a multi-day change that might be peer-reviewed, etc.:
$ git checkout master # switches our localfs to master (or whatever the main development branch is named)
$ git pull # always ensure you have the latest
$ git checkout -b my_branch # the -b tells it to make the new branch from current revision
$ git push --set-upstream origin my_branch
You don't need to remember that last command; I don't myself. If you try to do a "regular" push, `git` tells you exactly what to do:
$ git push
fatal: The current branch my_branch has no upstream branch.
To push the current branch and set the remote as upstream, use
git push --set-upstream origin my_branch
# 10:
# Do some work
$ git diff
$ git commit # remember - this is NOT visible by coworkers
# goto 10
You probably want to throw in a `git push` at least once a day; hard drives fail sometimes.
If you worry about missing major changes in `master`, it's a good idea to occasionally bring it into your branch. This helps minimize your conflicts later down the road, and ensures the code you test is "more realistic."
$ git fetch
$ git merge origin/master
OR
$ git checkout master
$ git pull --rebase
$ git checkout my_branch
$ git merge master
The latter ensures your local `master` is also in sync; this is useful if you want to keep an eye on your differences with `git diff master` as part of your development cycle.
It's been a few days, and you're done. Your work has possibly even been peer reviewed, run through a test harness, etc.
$ git checkout master
$ git pull --rebase
$ git merge --no-commit my_branch # might be --squash, --no-ff, etc...
$ git difftool # Make sure everything looks sane
$ git commit
$ git push
As noted above, we have options when merging; we need to decide if we will have a standard merge with every revision of `my_branch` or if we want to throw away the intermediate steps (squashed merge).
There is now a branch we no longer care about in our repository, which isn't a big deal. The bigger concern is it's on the central copy we all share, and that can get cluttered very easily. Especially if we have a CI/CD infrastructure like Jenkins operating on every branch in the repo!
$ git branch -d my_branch
Deleted branch my_branch (was 36fa99f).
$ git push --delete origin my_branch
To https://github.com/AaronDMarasco/git4svn.git
- [deleted] my_branch
If the branch was squash-merged, or for some other reason you want to abandon the branch without merging, `git` will try to protect you and remind you one last time how to get to that revision:
$ git branch -d my_branch
warning: deleting branch 'my_branch' that has been merged to
'refs/remotes/origin/my_branch', but not yet merged to HEAD.
Deleted branch my_branch (was 3fb9dd3).
At this point, I could still `git checkout 3fb9dd3` to get it back. It won't be there forever; there is a garbage collector.
If something broke and you're not sure where, you want to use `git bisect`. In subversion, if you knew `r200` was broken, and you're sure that `r100` worked, you could manually split the problem space and say "let's check `r150`!" This is impossible to do by eye in git, because hashes are not sequential and there's no way to figure out what is halfway between `6fb166f` and `e29fff0` without more metadata. That's where `git bisect` comes in.
This example is fairly automatic; if you can simply run a test script to say good/bad, then it can be fully automated! If not, you can manually tell `git` "OK, this one worked" etc. You can find more resources online or under `git help bisect`.
Note: This is a real example with hashes from a private git repo and anonymized
- Write a script that will be able to tell if a specific version works or not. Here's my example script that I manually tested on the known good and known bad. Basically, my program would crash nearly immediately as the failure, so I launch it in the background and then check on it in five seconds.
#!/bin/bash -x
set -e
cd common/build # git bisect must be run from the top-level of the repo
make -j distclean
../bootstrap.sh
make -j
make check
./myprog &
sleep 5
jobs %% # This will fail if it is not still running
kill -9 %%
wait
- Determine where you want to start and end the search. In my example, I know that my branch "`adm`" has something broken, while "`master`" is good (n.b. `master` hasn't changed since I branched off).
- Run it!
user@host$ git bisect start adm master
Bisecting: 29 revisions left to test after this (roughly 5 steps)
[3b30109-fullhash] [commit message]
user@host ((3b30109...)|BISECTING)$ git bisect run ./test_script.sh
...
Bisecting: 7 revisions left (roughly 3 steps)
...
[hash] is the first bad commit
commit [hash]
Author, Date, etc.
...
bisect run success
Be sure to check the help with git help bisect
for lots of interesting options, like the ability to skip a certain revision if it is totally unusable but independently of your actual problem. For example, after running the above, I added "|| exit 125
" to the make
calls to indicate that this revision should be skipped but not blamed because I was also messing with Makefile
s previously. I could have also forced the working Makefile
into every check by adding git checkout adm Makefile
to my testing script.
- Fix it
You now know what was broken, and you fix it. But you now have a patch for a revision from three weeks ago - what to do with that?
Make a branch from the first broken revision (git checkout -b my_hotfix) and then commit the patch (git commit -am "Hotfix"). Switch back to the original branch (git checkout adm) and then bring in the patch (git merge --no-commit my_hotfix, which stops before creating the merge commit; a plain git reset afterwards drops the merge state but keeps the patched files in your working tree). From there, manipulate as needed. When done, delete the temporary branch (git branch -D my_hotfix) since nobody needs it / cares any more.
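The hotfix flow can be tried end to end in a throwaway repository, as sketched below. The repository layout and file names (app.txt, bug.txt, later.txt) are invented for the demo. The sketch follows the merge with a plain git reset to drop the merge state while keeping the patched files; that is an assumption about what "manipulate as needed" calls for, so adapt it to taste.

```shell
#!/bin/bash
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b adm

# History: a good commit, the first bad commit, then later unrelated work
echo "line one" > app.txt;     git add app.txt;   git commit -q -m "good"
echo "bug" > bug.txt;          git add bug.txt;   git commit -q -m "first bad"
first_bad=$(git rev-parse HEAD)
echo "later work" > later.txt; git add later.txt; git commit -q -m "later work"

# Branch from the first bad revision and commit the fix there
git checkout -q "$first_bad"
git checkout -q -b my_hotfix
echo "fixed" > bug.txt
git commit -q -am "Hotfix"

# Bring the fix into the original branch without committing the merge
git checkout -q adm
git merge -q --no-commit my_hotfix
git reset -q                      # assumption: keep the files, drop the merge state
git status --short                # the fix shows up as an ordinary local change

# The temporary branch is no longer needed
git branch -q -D my_hotfix
```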
This section is fairly esoteric; you might say "git is always offline unless I tell it to fetch or something." That would be correct, but sometimes you might have to work only with a subset of the repository; for example you don't want to have the entire repository's history taking up space in your dropbox. It was written as an example where you go on a trip to a customer location and find a bug in the code that you can easily fix. "What now?"
Note: There's nothing special about your copy of the branch. If you didn't bundle before your trip, somebody back at the office can drop you a bundle!
- Decide which branch you want your changes to be based on. In this example, we'll say "my_branch".
- Back up the branch using "git bundle": git bundle create my_branch.bundle my_branch
  - The my_branch.bundle is the filename to dump to
  - The "my_branch" is any git reference, in this case the HEAD of your chosen branch
    - If you wanted to make sure you had the latest, you can add another "master" to the end if you'd like
    - You can add as many branches as you might need; feel free to tweak as needed and see the filesize trade-off
- Check what you've done: git bundle list-heads my_branch.bundle
  - Should show your current HEAD's hash with refs/heads/my_branch
  - Test run the next section somewhere
- Upload my_branch.bundle to dropbox, USB key, etc.
- Download my_branch.bundle to working PC
- Clone a new working copy from the bundle: git clone /tmp/my_branch.bundle
  - It's OK if it says something like "warning: remote HEAD refers to nonexistent ref, unable to checkout."
- cd my_branch
- git branch -av
  - Should show all refs you bundled, e.g. my_branch and master
- git checkout my_branch
  - Now all your files should be there
- Make changes / do work
  - git commit to your heart's content
- Save all your changes as a new bundle: git bundle create my_branch_diff.bundle origin/my_branch..HEAD
  - The resulting file will be much smaller - only the diffs you've made!
  - Again, the last argument might include other things. If space is not an issue, you can just put the branch name again and get them all.
- Verify you've got what you think
  - Compare the hash from git log -1 to git bundle list-heads my_branch_diff.bundle
- Upload my_branch_diff.bundle to dropbox, USB key, etc.
Warning: If you used git stash and didn't save the final results into a named branch and then bundle, those changes will be lost!
- Download my_branch_diff.bundle to your machine
- Change directory to your "daily" working copy
- Make sure your workspace is clean
- git bundle unbundle my_branch_diff.bundle
  - Note the reference it gave you here, e.g. 5bab64db350fcf45033481191a976164e8551538 HEAD
- git merge --no-commit 5bab64db350fcf45033481191a976164e8551538
  - If there were no changes since you left, it will say "Fast-forward" and you're done; git push to the server
  - If there were changes, you are in a "normal" merge situation to be handled appropriately
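The whole round trip above can be rehearsed locally before you travel. The sketch below is a dry run under assumptions: the "office" and "travel" directories and file.txt are stand-ins created in a temporary directory, and the bundle files sit next to them instead of going through dropbox or a USB key.

```shell
#!/bin/bash
set -e
work=$(mktemp -d)

# "Office" repository with the branch you want to take along
git init -q "$work/office"
cd "$work/office"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b my_branch
echo "v1" > file.txt
git add file.txt
git commit -q -m "office work"
git bundle create "$work/my_branch.bundle" my_branch
git bundle list-heads "$work/my_branch.bundle"

# "Travel" clone built from nothing but the bundle
# (the nonexistent-ref warning mentioned above is expected here)
git clone -q "$work/my_branch.bundle" "$work/travel"
cd "$work/travel"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q my_branch
echo "v2" > file.txt
git commit -q -am "fix found on site"
git bundle create "$work/my_branch_diff.bundle" origin/my_branch..HEAD

# Back at the office: unpack the small bundle and merge its tip
cd "$work/office"
git bundle unbundle "$work/my_branch_diff.bundle"
tip=$(git bundle list-heads "$work/my_branch_diff.bundle" | awk '{print $1}')
git merge --no-commit "$tip"
```

Because nothing changed in the office copy while you were "away", the final merge here is a fast-forward.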
There are some other things I've already documented on an internal wiki for my team that may interest public users; treat this as a breadcrumb that you might want to search the internet for more information:
- Using git bisect to automate finding where something is broken
  - This is important because unlike svn, git revision identifiers are hashes rather than sequential numbers, so it is non-trivial to say "I want a revision halfway between then and now"
- Using git bundle when traveling and needing a minimal set of files with you
Also, I just didn't know where to fit it - a remote can be any git repository, which includes a clone in another directory on the local filesystem!
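As a quick illustration of a directory acting as a remote, here is a small sketch. The directory names a and b, the branch name topic, and the remote name neighbor are all arbitrary choices for the demo.

```shell
#!/bin/bash
set -e
work=$(mktemp -d)

# First repository, with one commit on a branch called "topic"
git init -q "$work/a"
cd "$work/a"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b topic
echo "hello" > note.txt
git add note.txt
git commit -q -m "first note"

# Second repository treats the first one's directory as a remote
git init -q "$work/b"
cd "$work/b"
git remote add neighbor "$work/a"
git fetch -q neighbor
git branch -r        # lists neighbor/topic
```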
Lastly, it seems newer versions of git are adding commands like "restore" to be more user-friendly. These are not yet in the Linux distro we use, so I haven't migrated this document to reference them. Pull requests referencing these alternate methods are welcome!
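For anyone whose git already has them, the sketch below shows how the newer spellings line up with the classic ones used in this document. The helper names (git_at_least, discard_changes, switch_branch) are hypothetical; the only facts relied on are that git switch and git restore arrived in git 2.23 and that git checkout still works everywhere.

```shell
#!/bin/bash
# Succeeds when the installed git is at least MAJOR.MINOR,
# e.g. git_at_least 2 23
git_at_least() {
    local version major minor
    version=$(git --version | awk '{print $3}')
    major=${version%%.*}
    minor=${version#*.}
    minor=${minor%%.*}
    [ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
}

# Hypothetical helper: throw away uncommitted edits to the given files
discard_changes() {
    if git_at_least 2 23; then
        git restore -- "$@"       # newer spelling
    else
        git checkout -- "$@"      # classic spelling
    fi
}

# Hypothetical helper: change to an existing branch
switch_branch() {
    if git_at_least 2 23; then
        git switch "$1"           # newer spelling
    else
        git checkout "$1"         # classic spelling
    fi
}
```

Similarly, git switch -c new_branch corresponds to git checkout -b new_branch, and git restore --staged file corresponds to git reset HEAD file.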
I started learning git back in 2015, and I noted that the following sites were a great help; I still highly recommend them:
- A one-hour preview talk from 2007 by Randal Schwartz (of perl fame)
  - Great intro, including around 3:20 when he simply asks "What is git?" and notes it tracks "Changes to a tree of files over time"
  - Ignore what he says about git rebase - in my experience, it's rarely used
- The Thing About Git by Ryan Tomayko was a good intro to the usage side
- Think Like (a) Git by Sam Livingston-Gray was an excellent site that I used that really helped me understand the (graph) theory, the way it all comes together
- Git Immersion by Neo Innovation is a great hands-on lab-like approach to learning
In my previous office, we had "free rein" so I was able to use any tools I wanted. In a perfect world, I'd still have access to them all:
- git dag - from git-cola (also has a nice diff viewer)
- git lg - an alias I use daily - possibly from here; old notes of mine are below
- Powerline looks great (but I've had problems with it if your git repo is on an NFS mount, e.g. under your /home/ in an enterprise environment)
  - mkdir ~/.config/powerline
  - cp /etc/xdg/powerline/config.json ~/.config/powerline/
  - vim ~/.config/powerline/config.json
    - Change shell:theme from default to default_leftonly
- (Manual) Bash prompt support
- Found in various locations but part of git's "contributed" - see this git page for more
- Source is here
- Add the following to your ~/.bashrc file to get immediate feedback from the shell concerning the status of your working copy:
if [ -e /usr/share/git-core/contrib/completion/git-prompt.sh ]; then
. /usr/share/git-core/contrib/completion/git-prompt.sh
export GIT_PS1_SHOWDIRTYSTATE=1
export GIT_PS1_SHOWCOLORHINTS=1
export GIT_PS1_SHOWUPSTREAM="auto"
PROMPT_COMMAND='history -a;__git_ps1 "\u@\h:\w" "\\\$ "'
fi
- Some settings to add to your ~/.gitconfig for some nice aliases:
[branch]
autosetuprebase = always
[alias]
# bav: Show all branches, including if "gone" from remote and if you are ahead/behind remote
bav = branch -av
# bavv: Even more verbose (will tell you names of the remote branches)
bavv = branch -avv
# cm: Commit with a message
cm = commit -m
# diffr: Show the difference made by a specific revision, e.g. HEAD
diffr = "!f() { v=$1; shift || :; git diff "$v"^.."$v" $@; }; f"
# dc: Show diff of index (what you are about to commit)
dc = diff --cached
# dcs: Show diff of index (stats)
dcs = diff --cached --stat
# ds: Show diff of working copy (stats)
ds = diff --stat
# filelog / flog: Show logs (optionally of a file) along with the patches that each commit applied
filelog = log -p
flog = log -p
# find: Find a file in the current directory or deeper (using standard grep)
find = "!f() { cd ${GIT_PREFIX:-./}; git ls-files | grep -i $@; }; f"
# g: Quick grep with easy to copy/paste filenames
g = grep --break --heading --line-number -i
# gfind: global find - anywhere in the git repo
gfind = "!f() { git ls-files | grep -i $@; }; f"
# gi: grep case ignored
gi = grep -i
# gn: grep case ignored but only list file names found
gn = grep --break --heading --line-number -i --name-only
# last: Show the last log entry
last = log -1 HEAD
# lg/lga/lgs: Show a pretty graph of the logs of the current branch / all branches / + stat of each changeset
lg = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative --full-history --simplify-merges
lga = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative --full-history --simplify-merges --all
lgs = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative --full-history --simplify-merges --stat
# tips: Show the tip of every branch in the repository (local or not)
tips = "!f() { git show -s --oneline --decorate $(git show-branch --independent -a); }; f"
- You might want to set your "grep" settings as well:
  - git config --global grep.lineNumber true - makes it easy to copy-paste file names and jump to the line in editors
  - git config --global grep.extendedRegexp true - "better" regex support
  - git config --global grep.patternType perl - "best" regex support
- meld as your merge resolver (when conflicts occur, running git mergetool will launch it)
$ git config --global diff.tool meld
$ git config --global merge.tool meld
- If your organization has lots of ephemeral branches that are pushed to the remote server, you most likely want to always "prune" on a fetch or pull to remove stale references to dead branches:
git config --global fetch.prune true
- Dillinger, an online Markdown editor
- gh-md-toc for the (offline) generation of the Table of Contents
- Tables Generator for online table generation
- WebGraphviz for online Graphviz graphics