[October] The first security issue and usable `gix status` (CLI)
#1124
Byron announced in Progress Update
This month was full of smaller updates, with some progress on `gix status`, which now works correctly. Overall, however, it feels like not a lot of substantial work was done, while a lot of time once more went into maintenance.

Security Patch: arbitrary code execution by passing special SSH URLs for fetching or cloning
Who would have thought that I could fall for this, and particularly, how I could fall for this. The short story is that it was possible to craft SSH URLs that contained SSH flags instead of the host name. These flags could instruct SSH to run a proxy program, which of course can also be an arbitrary shell script.

With this it doesn’t take much to hide URLs that `gix` (CLI) could try fetching from.

`ssh` is quite aware of this and the solution is simple: just pass `--` to make it stop processing arguments, which makes it consider the following argument to be the host no matter what. `gitoxide`, however, solves the problem just like `git` does, by pre-validating the host before using it.

The lesson to be learned here is to always double and triple check what kind of arguments are passed to a command-line tool.
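To make the two defences concrete, here is a minimal sketch in Rust. It is not the actual `gitoxide` code: `looks_like_command_line_option` and `ssh_args` are hypothetical names, and the rule "a host must not start with `-`" is an assumption about what such a pre-validation could look like.

```rust
/// Hypothetical helper: returns true if `host` would be read by `ssh` as an
/// option rather than as a host name, e.g. `-oProxyCommand=...`.
fn looks_like_command_line_option(host: &str) -> bool {
    host.starts_with('-')
}

/// Build the argument list for an `ssh` invocation (illustrative only).
/// The host is validated up front, and `--` is added as a second line of
/// defence so that `ssh` stops parsing options before it sees the host.
fn ssh_args(host: &str, remote_command: &str) -> Result<Vec<String>, String> {
    if host.is_empty() || looks_like_command_line_option(host) {
        return Err(format!("refusing ambiguous host name: {:?}", host));
    }
    Ok(vec!["--".into(), host.into(), remote_command.into()])
}
```

With this, a host like `-oProxyCommand=...` is rejected before any process is spawned, while an ordinary host name is passed through after the `--` separator.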
The same issue could possibly happen with `file://` URLs, but fortunately `gitoxide` is already validating here that the path, which could also contain command-line flags, is a valid directory on disk.

`gix status`
It took plenty of testing to finally iron out all the small issues that one has to deal with when doing a status check on the worktree, that is, checking whether a tracked file changed since its data was recorded in the index file.
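As a rough illustration of what such a check boils down to, the sketch below compares stat data recorded earlier with what is on disk now. `RecordedStat` and `possibly_modified` are hypothetical names and a heavy simplification: a real index entry records more fields, and a stat mismatch typically only means the content has to be compared next.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical, heavily simplified stand-in for the stat data of an index entry.
#[derive(Debug, Clone, PartialEq)]
struct RecordedStat {
    size: u64,
    mtime: SystemTime,
}

impl RecordedStat {
    /// Capture the current size and modification time of `path`.
    fn from_path(path: &Path) -> io::Result<Self> {
        let meta = fs::metadata(path)?;
        Ok(RecordedStat {
            size: meta.len(),
            mtime: meta.modified()?,
        })
    }
}

/// Returns true if the file on disk no longer matches what was recorded.
/// This is only a cheap first check; it does not look at file content.
fn possibly_modified(path: &Path, recorded: &RecordedStat) -> io::Result<bool> {
    Ok(RecordedStat::from_path(path)? != *recorded)
}
```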
Now it’s at a state where it appears to be working just like `git` does, while being reasonably fast.

Something that needed special attention is the handling of conflicts, as these may span multiple chunks. Fortunately this was straightforward to solve by deciding which chunk ‘owns’ the conflict, while allowing one chunk to reach into another one for reading. Interestingly, this change moved the whole architecture of the system away from having mutable access to the index towards producing events describing how it should change. This also turned out to be much cleaner and ultimately easier to work with.
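The ownership idea can be sketched as follows. This is not the `gix-status` implementation: `Entry`, `Event`, `process_chunk` and `collect_events` are hypothetical, and the rule assumed here is that the chunk containing the first entry of a conflict owns it. Each chunk only gets shared, read-only access to all entries and returns events instead of mutating anything.

```rust
use std::ops::Range;

/// Hypothetical index entry: stage 0 is a normal entry, stages 1 to 3 mark a conflict.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    path: &'static str,
    stage: u8,
}

/// Hypothetical event describing what was found, to be applied by the caller later.
#[derive(Debug, PartialEq)]
enum Event {
    Conflict { path: &'static str, first: usize, len: usize },
}

/// True if entry `i` belongs to a conflict that started at an earlier entry.
fn continues_conflict(all: &[Entry], i: usize) -> bool {
    i > 0 && all[i].stage != 0 && all[i - 1].stage != 0 && all[i - 1].path == all[i].path
}

/// Process one chunk with read-only access to all entries.
fn process_chunk(all: &[Entry], chunk: Range<usize>) -> Vec<Event> {
    let mut events = Vec::new();
    let mut i = chunk.start;
    // Entries continuing a conflict from the previous chunk are owned by that chunk.
    while i < chunk.end && continues_conflict(all, i) {
        i += 1;
    }
    while i < chunk.end {
        if all[i].stage == 0 {
            i += 1;
            continue;
        }
        // This chunk owns the conflict and may read past its own end to see all of it.
        let (first, path) = (i, all[i].path);
        let mut end = i + 1;
        while end < all.len() && all[end].stage != 0 && all[end].path == path {
            end += 1;
        }
        events.push(Event::Conflict { path, first, len: end - first });
        i = end;
    }
    events
}

/// Run all chunks in parallel and concatenate their events in chunk order.
/// `chunk_size` must be greater than zero.
fn collect_events(all: &[Entry], chunk_size: usize) -> Vec<Event> {
    std::thread::scope(|s| {
        let handles: Vec<_> = (0..all.len())
            .step_by(chunk_size)
            .map(|start| {
                let end = (start + chunk_size).min(all.len());
                s.spawn(move || process_chunk(all, start..end))
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("chunk worker must not panic"))
            .collect()
    })
}
```

Because no worker ever writes to the shared entries, a conflict that straddles a chunk boundary is reported exactly once, and applying the collected events becomes a separate, single-threaded step.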
It’s worth noting that I wasn’t able to figure out what the difference is in terms of threading, i.e. `git` performs great with 20 threads, while `gix` gets slower even when going beyond 4 threads on macOS. On Linux there is no issue at all, so it’s something very specific to the filesystem.

What makes it worse is that there are rare occasions when the index needs to be refreshed, which is when using many threads speeds up the operation tremendously even on macOS, so just reducing the default number of threads on macOS isn’t always the best option either.

I even tried to change the mode of parallelism to match that of `git` so the access pattern is more similar, but to no avail. Maybe it’s something related to how the Rust standard library abstracts over the respective C-library calls, who knows.

API Improvements
Packetline tracing
A hang could occur when fetching via `file://` or `ssh://` (and possibly `git://`) with the V1 protocol, and it took a long time to figure out a fix.

One step on the way was to finally make it possible to observe all packetlines as they are sent over the wire, just like `GIT_TRACE_PACKET` does for `git`. As `gix` integrates this with its tracing engine, which supports structure and hierarchy, it looks really nice and it becomes immediately obvious which lines are sent along with which ‘stage’ of the protocol.

Here is an example for those interested.

The hanging bug could finally be resolved by ‘trying’ a patch that was applied by the original author of the bug, which I could never reproduce. In conjunction with the test suite, this led to an even uglier V1 implementation, which really only tries to avoid dealing with the intricacies of this old (but not outdated) version of the protocol.
The reason I don’t like V1 is that `gix` abstracts over all of these details, which makes it way easier to implement fetches, but it also doesn’t allow for the incredible amount of special handling that V1 would otherwise need, leading to certain heuristics being used to ‘overcome’ V1 complexities.

I just hope that this was it now and it will just keep working 😅.
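For readers unfamiliar with the packetlines being traced here: in the git wire protocol each line is framed by four hex digits giving the total length (payload plus the 4-byte prefix), and `0000` is a flush packet. The sketch below shows only that framing. It is not the `gix-packetline` API: `encode_packet_line`, `PacketLine` and `decode_packet_lines` are hypothetical names, and the other special packets and the maximum line length are deliberately left out.

```rust
/// Encode `data` as one packet line: four hex digits for the total length
/// (payload plus the 4-byte prefix), followed by the payload itself.
/// Simplification: the protocol's maximum length is not enforced here.
fn encode_packet_line(data: &[u8]) -> Vec<u8> {
    let mut out = format!("{:04x}", data.len() + 4).into_bytes();
    out.extend_from_slice(data);
    out
}

/// Hypothetical, minimal view of what a tracer would observe on the wire.
#[derive(Debug, PartialEq)]
enum PacketLine<'a> {
    Flush,
    Data(&'a [u8]),
}

/// Split a byte stream into packet lines, e.g. to log them one by one.
fn decode_packet_lines(mut input: &[u8]) -> Result<Vec<PacketLine<'_>>, String> {
    let mut lines = Vec::new();
    while !input.is_empty() {
        if input.len() < 4 {
            return Err("truncated length prefix".into());
        }
        let hex = std::str::from_utf8(&input[..4]).map_err(|e| e.to_string())?;
        let len = usize::from_str_radix(hex, 16).map_err(|e| e.to_string())?;
        match len {
            0 => {
                lines.push(PacketLine::Flush);
                input = &input[4..];
            }
            // Lengths 1 to 3 cannot describe a data line; they are not handled in this sketch.
            1..=3 => return Err(format!("unsupported special packet: {hex}")),
            n if n > input.len() => return Err("truncated packet line".into()),
            n => {
                lines.push(PacketLine::Data(&input[4..n]));
                input = &input[n..];
            }
        }
    }
    Ok(lines)
}
```

For example, the payload `hello\n` is six bytes long, so it is framed as `000ahello\n`.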
Split-worktree support
Opening a `git` repository is anything but trivial, and I keep comparing it to a boot sequence rather than opening a bunch of files. With that complexity it comes as no surprise that, from time to time, I learn about new issues related to opening `git` repositories.

This one was actually reported in GitPython, which has trouble opening a `git` repository with the `core.worktree = /path/to/worktree` configuration set.

This means one can have a repository at `/repo` and the worktree at `/worktree`, merely by configuration.

`gitoxide`, specifically `gix-discover`, can now handle this in such a way that it guesses the kind of repository more correctly when encountering one of these, allowing `gix` to later correct for the imprecision when ‘booting’ the repository for real.

It’s worth noting that `gix-discover` implements a high-performance guessing machinery to quickly see if a folder at hand contains a `git` repository, along with enough information to help speed up actually opening it (to avoid repeating most of the work already done).

More intuitive `HEAD` conversions

Most programs that interact with
`git` repositories will have to start somewhere, and that’s typically through `HEAD`. Its access functions have been revised to make typical conversions, like `HEAD` to tree and `HEAD` to commit, (even) easier.

While at it, object conversions are also more consistent now, and there is a newly added `Blob` type for completeness. It’s useful to double-check that the retrieved object is indeed the desired kind of object: `repo.find_object(hash)?.try_into_blob()?` is now possible, for instance.

Better 32bit support
Even though `gix` is meant to be run on 64-bit systems, 32-bit systems will compile it just fine but wouldn’t be able to handle pack files larger than 4GB. Now it should work a little better by ensuring that file sizes are represented as `u64` instead of `usize`.

Community
`gix-url` engine rewrite

This PR contributes a core-engine rewrite for the URL parser, which makes it more similar to `git` and more capable, and which allows more of the baseline tests to be parsed correctly.

Interestingly, that also led to a regression due to a test that was (always) missing, which was fixed later on. Now I’d think `gix-url` is in a good position to one day pass all baseline tests.

rustsec admin tool uses `gitoxide`
For completeness, I also ported the `rustsec-admin` tool to `gitoxide`, which led to a 2.2x performance improvement on CI.

Now all of the rustsec tooling, as far as I am aware, no longer uses `git2` and could reduce its dependence on C code quite a bit.

Gix in Cargo
There was no progress here, once again, as there was no substantial progress in `gix-status`.

Cheers,
Sebastian
PS: The latest timesheets can be found here.