Git submodules let you embed one or more Git repositories as subdirectories of another Git repository, each at a specific path.
When you add a submodule in Git, the submodule's code is not added to the main repository. Git only records information about the submodule: where it lives and which commit it points at. As a result, the submodule's code is not updated automatically when the submodule's repository changes. This is a good thing: your code might not work with the latest commit of the submodule, so pinning a commit prevents unexpected behaviour.
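To see what the main repository actually stores, here is a minimal sketch that builds two throwaway repositories in a temporary directory and inspects the index entry of the submodule. The repository names, the vendor/lib path, and the demo identity are made up for illustration. It assumes Git 2.28 or newer (for git init -b), and protocol.file.allow=always is only there because the demo clones from a local path, which recent Git versions refuse for submodules by default.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in "library" repository with one commit.
git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

# The main repository that embeds the library as a submodule.
git init -q -b main "$work/super"
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add "$work/lib" vendor/lib

# The index holds one entry for the submodule, not the library's files.
entry="$(git -C "$work/super" ls-files --stage vendor/lib)"
pinned="$(git -C "$work/lib" rev-parse HEAD)"
echo "index entry:   $entry"
echo "pinned commit: $pinned"
```

The entry has mode 160000 (a "gitlink") and its hash is the library commit that the main repository is pinned to.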
Use cases:
- If you have a library that's used in multiple projects, a submodule lets every project pin a known-good commit of the library and move to a newer one when it is ready.
- If you have a large project and you want to split up the codebase into smaller, more manageable pieces.
- If you have code that is developed in parallel across multiple repositories and you need to reference a particular state of that external code.
See the official Git documentation on submodules for more details.
You can add a submodule to your repository using the following command:
git submodule add -b <branch> <repository> <path>
git submodule add -b main https://github.com/owner/repo.git my-folder
This command adds the given Git repository as a submodule, cloned into the directory given by <path> (my-folder in the example above). The optional -b flag records which branch the submodule should track.
First you should notice the new .gitmodules file. This is a configuration file that stores the mapping between the project’s URL and the local subdirectory you’ve pulled it into:
[submodule "git_submodules/Python-Guide"]
path = git_submodules/Python-Guide
url = git@github.com:BenjaminYde/Python-Guide.git
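Because .gitmodules uses the same format as other Git configuration files, it can be read with git config instead of being parsed by hand. The sketch below is only an illustration: it adds a local throwaway repository as a submodule at my-folder and reads the recorded values back. The repository contents and the demo identity are assumptions, and protocol.file.allow=always is needed only because the demo clones from a local path.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in repository to embed.
git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

git init -q -b main "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add -b main "$work/lib" my-folder

# Show the generated file, then query single values from it.
cat .gitmodules
sub_path="$(git config -f .gitmodules --get submodule.my-folder.path)"
sub_url="$(git config -f .gitmodules --get submodule.my-folder.url)"
sub_branch="$(git config -f .gitmodules --get submodule.my-folder.branch)"
echo "path=$sub_path url=$sub_url branch=$sub_branch"
```

The branch key is only present because the submodule was added with -b.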
Use the following command to list all the submodules in your repository, together with the commit each one has checked out:
git submodule
When you clone a Git project, its submodules are not downloaded by default. Their directories are created but left empty.
You need to initialize and update them with these commands:
git submodule init
git submodule update --recursive
- git submodule init: This command initializes your submodules. It reads the submodule entries from .gitmodules and copies their URLs into your local configuration (.git/config), so that the submodules can subsequently be cloned. It does not download anything yet.
- git submodule update: This command fetches the data from those repositories and checks out the appropriate commit (specified by the main repository). If you haven't initialized the submodules yet, git submodule update won't do anything, because no submodule is registered to fetch and update. The --recursive flag also processes submodules nested inside other submodules.
Or you can do both with a single command when cloning the repository:
git clone --recurse-submodules <repository>
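The clone, init, and update sequence can be tried out locally. The following sketch is a hedged illustration using throwaway repositories in a temporary directory; the names, the vendor/lib path, and the demo identity are assumptions, and protocol.file.allow=always is only required because the submodule is cloned from a local path.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

git init -q -b main "$work/super"
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add "$work/lib" vendor/lib
git -C "$work/super" commit -q -m "Add vendor/lib submodule"

# A plain clone creates vendor/lib but leaves it empty.
git clone -q "$work/super" "$work/clone"
before="$(git -C "$work/clone" submodule status)"
echo "before: $before"   # the line starts with "-" (not initialized)

# Register the submodule locally, then fetch it and check out the pinned commit.
git -C "$work/clone" submodule --quiet init
git -C "$work/clone" -c protocol.file.allow=always \
    submodule --quiet update --recursive
after="$(git -C "$work/clone" submodule status)"
echo "after: $after"     # the leading "-" is gone
```

In the status output a leading "-" marks a submodule that has not been initialized yet.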
git mv is a convenient command used to move or rename files and directories within a Git repository. It also works on submodules: it moves the submodule directory and updates its path in .gitmodules.
git mv <path/to/submodule> <path/to/new/submodule>
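As a rough sketch of what the move does, the example below relocates a submodule from libs/old to libs/new and checks the result. All repository names and paths are hypothetical, and protocol.file.allow=always is only needed because the demo clones from a local path.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

git init -q -b main "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" libs/old
git commit -q -m "Add submodule at libs/old"

# Move the submodule; Git rewrites the path in .gitmodules and stages it.
git mv libs/old libs/new
git commit -q -m "Move submodule to libs/new"

# The submodule keeps its original name (libs/old); only the path changes.
new_path="$(git config -f .gitmodules --get submodule.libs/old.path)"
echo "path is now: $new_path"
```

Note that the section name in .gitmodules stays the same after the move, so the name and the path of a moved submodule no longer match.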
"Shallow" is a general term that describes a Git repository that doesn't contain the full history of commits. A shallow clone is created using the --depth flag followed by a number that indicates the depth of the clone, i.e., the number of recent commits from each branch that you want to fetch. The term "shallow" doesn't specify how shallow the clone is; it only indicates that you're not getting the full history.
git submodule add --depth 1 <repository_url> <path>
Here, --depth 1 indicates that you want a shallow clone with only the latest commit from the submodule repository.
The command will perform a shallow clone of the submodule repository, but it will not automatically add the shallow = true line to the .gitmodules file. This means that the shallow state of the submodule will not be preserved when someone else tries to initialize the submodule from the superproject repository.
You will need to manually add shallow = true to the submodule's entry in .gitmodules:
[submodule "<submodule_name>"]
path = <submodule_path>
url = <repository_url>
shallow = true
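Instead of editing the file by hand, the same key can be written with git config. The sketch below is an illustration under assumptions: the repositories, the libs/lib path, and the demo identity are made up, the library is cloned through a file:// URL because Git ignores --depth for plain local paths, and protocol.file.allow=always is needed only for that local transport.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A library with three commits, so a depth-1 clone really cuts history off.
git init -q -b main "$work/lib"
for n in 1 2 3; do
  echo "v$n" > "$work/lib/lib.txt"
  git -C "$work/lib" add lib.txt
  git -C "$work/lib" commit -q -m "lib: commit $n"
done

git init -q -b main "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add --depth 1 "file://$work/lib" libs/lib

# Record the shallow preference so that later clones of the main repository honor it.
git config -f .gitmodules submodule.libs/lib.shallow true
git add .gitmodules
git commit -q -m "Add shallow submodule libs/lib"

depth="$(git -C libs/lib rev-list --count HEAD)"
is_shallow="$(git -C libs/lib rev-parse --is-shallow-repository)"
echo "commits in submodule: $depth (shallow: $is_shallow)"
```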
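If you want to update the submodule to the latest commit on a specific branch (like the default branch), you can navigate to the submodule directory and run the git pull command, or use the following command from the main repository:
git submodule update --remote
The --remote option tells Git to go into the submodule, fetch from its remote, and check out the latest commit of the tracked branch. That is the branch recorded in .gitmodules (for example with -b when the submodule was added) or, if none is recorded, the remote's default branch. The main repository then sees the submodule as changed, so stage and commit the submodule path to record the new commit.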
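The following sketch walks through one update cycle with throwaway repositories: the library gets a new commit, the main repository pulls it in with --remote, and the new pointer is committed. The names, the vendor/lib path, and the demo identity are assumptions, and protocol.file.allow=always is only required because the submodule lives at a local path.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

git init -q -b main "$work/super"
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add -b main "$work/lib" vendor/lib
git -C "$work/super" commit -q -m "Add vendor/lib submodule"

# The library moves on; the main repository still points at the old commit.
echo "v2" > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -am "lib: second commit"
latest="$(git -C "$work/lib" rev-parse HEAD)"

# Fetch and check out the tip of the tracked branch inside the submodule.
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet update --remote

# Record the new commit in the main repository.
git -C "$work/super" add vendor/lib
git -C "$work/super" commit -q -m "Bump vendor/lib"
recorded="$(git -C "$work/super" ls-files --stage vendor/lib)"
echo "recorded: $recorded"
```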
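When you switch branches or make changes within a submodule, your main repository will recognize this as a change to the submodule, which you can then stage and commit in the main repository to record the new commit. To switch branches inside a submodule: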
cd <submodule_path>
git checkout <branch_name>
Git does not have a single built-in command that removes every trace of a submodule, so for now submodule removal has to be done manually in a few steps.
- Identify the submodule you want to remove from the list of submodules.
- Use the following command to deinitialize the submodule. Replace <path_to_submodule> with the relative path to the submodule:
# Remove the submodule entry from .git/config
git submodule deinit -f <path_to_submodule>
- Use the following command to remove any leftover Git data for the submodule:
# Remove the submodule's repository from the superproject's .git/modules directory
rm -rf .git/modules/<path_to_submodule>
- Use the following command to remove the submodule from the index and the working tree:
# Remove the entry in .gitmodules and remove the submodule directory located at path/to/submodule
git rm -f <path_to_submodule>
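The removal steps above can be sketched end to end as follows. This is an illustration with throwaway repositories, not a definitive procedure: the names, the vendor/lib path, and the demo identity are assumptions, and the .git/modules path matches the submodule path only because the submodule was never renamed.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

git init -q -b main "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "Add vendor/lib submodule"

sub=vendor/lib   # path of the submodule to remove

git submodule --quiet deinit -f "$sub"  # drop the submodule section from .git/config
rm -rf ".git/modules/$sub"              # delete the stored submodule repository
git rm -q -f "$sub"                     # remove the gitlink, the directory, and the .gitmodules entry
git commit -q -m "Remove submodule $sub"

git status --short   # prints nothing when the removal is fully committed
```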
Relative URLs in Git submodules are a feature that allows you to specify submodule URLs relative to the URL of the superproject (the main repository containing the submodules). This is particularly useful in scenarios where you want to switch between HTTPS and SSH protocols without having to update the submodule URLs manually. It's also handy when you're working with repository mirrors or moving repositories between different servers.
For example, assume your superproject's repository URL is:
- SSH: git@github.com:User/SuperProject.git
- HTTPS: https://github.com/User/SuperProject.git
And you have a submodule located at:
- SSH: git@github.com:User/SubModule.git
- HTTPS: https://github.com/User/SubModule.git
You could specify the submodule URL in .gitmodules like this:
[submodule "path/to/submodule"]
path = path/to/submodule
url = ../SubModule.git
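To illustrate how the relative URL is resolved, the sketch below places two bare repositories next to each other in a temporary directory, standing in for SuperProject.git and SubModule.git on a hosting service, and adds the submodule with a relative URL. The directory layout and the demo identity are assumptions, and protocol.file.allow=always is needed only because the "server" here is a local path.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Create two sibling bare repositories: remote/SuperProject.git and remote/SubModule.git.
mkdir -p "$work/seed" "$work/remote"
for name in SuperProject SubModule; do
  git init -q -b main "$work/seed/$name"
  echo "$name" > "$work/seed/$name/README"
  git -C "$work/seed/$name" add README
  git -C "$work/seed/$name" commit -q -m "init $name"
  git clone -q --bare "$work/seed/$name" "$work/remote/$name.git"
done

git clone -q "$work/remote/SuperProject.git" "$work/checkout"
cd "$work/checkout"

# The relative URL is resolved against the URL of the superproject's remote.
git -c protocol.file.allow=always submodule --quiet add ../SubModule.git path/to/submodule

recorded="$(git config -f .gitmodules --get submodule.path/to/submodule.url)"
resolved="$(git config --get submodule.path/to/submodule.url)"
echo "in .gitmodules:  $recorded"
echo "in .git/config:  $resolved"
```

The file that is committed keeps the relative form, while the local configuration holds the URL resolved against wherever the superproject was cloned from.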
To switch between HTTPS and SSH URLs for Git submodules, especially for CI/CD purposes, you can modify the URL in the .gitmodules file and then synchronize the settings. Here's how to do it:
- Edit .gitmodules: Open your .gitmodules file and change the URL for the submodule from HTTPS to SSH.
From:
[submodule "path/to/submodule"]
path = path/to/submodule
url = https://github.com/User/SubModule.git
To:
[submodule "path/to/submodule"]
path = path/to/submodule
url = git@github.com:User/SubModule.git
- Synchronize and Update:
git submodule sync # Copies the URL from .gitmodules into the local config
git submodule update --init --recursive # Fetches and checks out the recorded commit
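The effect of the sync step can be checked locally. In the sketch below, a second bare copy of the library stands in for the "other" URL (in practice this would be the SSH or HTTPS variant). The repository names, paths, and demo identity are assumptions, and protocol.file.allow=always is only required because everything lives at local paths.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

# A mirror of the library, standing in for the alternative URL.
git clone -q --bare "$work/lib" "$work/mirror.git"

git init -q -b main "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "Add vendor/lib submodule"

# Change the URL in .gitmodules (same effect as editing the url line by hand).
git config -f .gitmodules submodule.vendor/lib.url "$work/mirror.git"

# Propagate the new URL to .git/config and to the submodule's own remote.
git submodule --quiet sync
git -c protocol.file.allow=always submodule --quiet update --init --recursive

in_config="$(git config --get submodule.vendor/lib.url)"
in_submodule="$(git -C vendor/lib remote get-url origin)"
echo "superproject config: $in_config"
echo "submodule origin:    $in_submodule"
```

Without the sync step, the local configuration and the submodule's remote would keep pointing at the old URL even though .gitmodules has changed.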
When you see ...-dirty in the context of a Git submodule, it typically means that there are changes in the submodule directory that have not been committed. These could be modifications to tracked files, or new files that are not yet tracked by Git. The term "dirty" in this context refers to a working directory that is not in a clean state.
To reset a "dirty" submodule and bring it back to a clean state, you can do the following:
Option 1: Discard the uncommitted changes
cd <path_to_submodule>
git reset --hard
git clean -fd
Option 2: Commit the changes
cd <path_to_submodule>
git add .
git commit -m "Committing changes in submodule"
git push
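The sketch below reproduces a dirty submodule and then cleans it up with Option 1. It is an illustration only: the repositories, the vendor/lib path, and the demo identity are made up, and protocol.file.allow=always is needed only because the submodule is cloned from a local path.

```shell
set -eu
# Throwaway workspace; every name and path below is hypothetical.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q -b main "$work/lib"
echo "v1" > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "lib: first commit"

git init -q -b main "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "Add vendor/lib submodule"

# Make an uncommitted edit to a tracked file inside the submodule.
echo "local edit" >> vendor/lib/lib.txt
dirty_diff="$(git diff --submodule=short)"
echo "$dirty_diff"   # includes a "+Subproject commit <hash>-dirty" line

# Option 1: discard the uncommitted changes.
git -C vendor/lib reset -q --hard
git -C vendor/lib clean -q -fd
clean_diff="$(git diff --submodule=short)"
```

After the reset and clean, the diff of the main repository is empty again because the submodule is back at the recorded commit with a clean working directory.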