A lot has been said about git submodules lately. Many well-known developers seem to subscribe to the opinion that a native submodule primitive in a VCS is somehow a bad thing and start wildly proposing alternative solutions. In my role as a technical lead at Glanzkinder I was confronted with similar arguments more than once. Many times we launched into deep research and review procedures investigating all these claims. Now, some years later, we are still using submodules. We also never had any of the problems that many people seem to advertise. From time to time a less experienced new hire might come up with questions and the idea to scrap git submodule altogether, but that's about it. Actually the criticism itself only ever came from younger developers who weren't yet completely fluent in things like Unix, protocols and VCS.
After collecting my thoughts for a while and reading up on our own past research, two big issues come to mind. As the analysis will show, they actually boil down to one specific problem that isn't even near the software layer.
The "this code doesn't exist!" issue.
This presents itself when people who don't understand how git and the underlying
transport protocols work get lost in submodule hell. Usually it is a case of not
following best practices. Not using git via ssh or the git protocol, or not authenticating via
certificates, can lead to states where submodules don't come along automatically when checking out a
project. This of course isn't a big issue; most experienced users can easily cobble together a manual
fetch of submodules via the CLI. Even if you are not experienced, all you have to do is look at the
documentation and you are set for life.
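For illustration, here is a minimal sketch of that manual fetch. It is not taken from any real project: the repository names (lib, app, checkout) are made up, and throwaway local repos stand in for real remotes so the script can run anywhere. The part that matters is the last git command.

```shell
#!/bin/sh
# Sketch: a clone made without its submodules, repaired by hand.
# lib, app and checkout are hypothetical names used only for this demo.
set -eu
work=$(mktemp -d)
cd "$work"

# Commits need an identity; set one for this demo only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
# Assumption: submodules that live at local paths need the file transport,
# which newer git versions disable for submodules by default.
# Real ssh or https remotes don't need this.
export GIT_CONFIG_COUNT=1
export GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

# The library that will become a submodule.
git init -q lib
echo 'hello' > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m 'lib: initial commit'

# The base repo that includes it.
git init -q app
echo 'app' > app/README
git -C app add README
git -C app commit -q -m 'app: initial commit'
git -C app submodule --quiet add "$work/lib" vendor/lib
git -C app commit -q -m 'app: add lib as submodule'

# A plain clone knows about the submodule but does not fetch it.
git clone -q app checkout
git -C checkout submodule status   # a leading '-' means "not initialized"

# The manual fetch: register the submodules and check out the pinned commits.
git -C checkout submodule --quiet update --init --recursive
cat checkout/vendor/lib/lib.txt
```

Passing `--recurse-submodules` to `git clone` does the same thing in one step for a fresh clone.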
The "version conflict" problem.
By nature this is another variation of the "this code doesn't exist!" problem: people missing
a version bump of a submodule. It happens when someone updates a submodule in their copy
of the repository and pushes. The dependency on the new
version of the submodule will, correctly, also propagate to the including repo. Users pulling
the new commits to the master repo will also have to pull the changes to the submodule. So what
if the submodule repo hasn't been pushed to and the changes only exist on the local machine
of the developer who made them? That is very unfortunate indeed, isn't it? One can
of course just roll back that commit, but that isn't the point here. When using modules in a
project one has to comprehend the consequences. You can't go along and treat the whole repo as one
monolithic block. You have to be conscious of the fact that the submodules must have a separate
release lifecycle from the base repo. This also means that making changes in the submodule is
actually non-standard procedure. Of course you have the flexibility of doing it, but when committing,
the changes to the submodule must be released (i.e. pushed to the upstream repo) before the
base repo changes.
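The ordering rule above can be enforced by git itself. The following is a sketch under assumptions, not a description of our actual setup: lib.git and app.git are hypothetical bare repos standing in for the upstream servers. It shows `git push --recurse-submodules=check` refusing to publish a base repo commit that points at an unpushed submodule commit.

```shell
#!/bin/sh
# Sketch: let git refuse a base repo push while the submodule is unreleased.
# lib.git and app.git are hypothetical stand-ins for real upstream repos.
set -eu
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
# Assumption: only needed because the demo remotes are local paths.
export GIT_CONFIG_COUNT=1
export GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

# Upstream of the submodule.
git init -q lib
echo 'v1' > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m 'lib: v1'
git clone -q --bare lib lib.git

# Base repo with its own upstream.
git init -q --bare app.git
git init -q app
echo 'app' > app/README
git -C app add README
git -C app commit -q -m 'app: initial commit'
git -C app submodule --quiet add "$work/lib.git" vendor/lib
git -C app commit -q -m 'app: add lib as submodule'
git -C app remote add origin "$work/app.git"
git -C app push -q origin HEAD

# The risky part: change the submodule locally and bump it in the base repo.
echo 'v2' >> app/vendor/lib/lib.txt
git -C app/vendor/lib commit -q -am 'lib: v2, local only'
git -C app add vendor/lib
git -C app commit -q -m 'app: bump lib to v2'

# With "check", git refuses because the v2 commit is on no submodule remote.
if git -C app push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
    echo 'unexpected: base repo pushed before the submodule' >&2
    exit 1
else
    echo 'push refused: submodule commit is not released yet'
fi

# Correct order: release the submodule first, then the base repo.
git -C app/vendor/lib push -q origin HEAD
git -C app push -q --recurse-submodules=check origin HEAD
echo 'both pushed, submodule first'
```

Setting `git config push.recurseSubmodules check` makes this the default for a repository. The `on-demand` value goes one step further and pushes the submodule for you, which is convenient but hides the separate release step.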
So this class of problems actually stems from a lack of understanding and discipline rather than from drawbacks of the software. If you are encountering such problems in your teams, the real culprit is not your git strategy but overall compliance with logical workflows. Try looking at the project from further away and reflect a bit more on dev operations; such problems will explain and solve themselves eventually once your basics are solid.
In my opinion the drawbacks of using git subtree are not acceptable: