Organisation of subprojects #3818
Comments
Bonus points if you manage not to lose our change history for those projects when moving them to submodules. Splitting folder history is possible in Git, but re-applying changes from a different tree (even if the code matches) may be harder.
I didn't know lack of permissions was a roadblock for more cool work. I think that can be arranged.
This would also be a blocker for me. I do see the advantage in terms of organization if more stuff was in submodules, but it does require all maintainers to have access to all the relevant projects. Cleaning up the root folder of the bizhawk repo is independent from that though, and I'd definitely approve moving stuff to a unified folder. There is
If people tell me what to fork under tasemus, I can do that and invite you guys as admins to those forks. I may also discuss adding more admins to the org itself...
Well, most of the repos I linked should be forked, but pragmatically the larger ones are far more important, and as CPP said, some projects have had more extensive changes than others. IMO forks can be made "as needed", i.e. the next time someone wants to update a core.
What's the objective here: making it easier to merge upstream changes? In that case, the difficulty comes chiefly from how much the code has diverged from upstream, not the mechanism by which it's stored.
It's a tradeoff we've made different ways at different times: Heavy modifications to source let us get off the ground more quickly with something that works for us. Light modifications to source mean we have to do work that's sometimes more awkward and difficult to integrate with the existing code, but we can take upstream changes more easily.
I don't think there's one clear answer. The cores that sync with upstream seem to be the better long-term bet, and we can see that especially in cases where we've done it both ways and can compare the two (e.g., early Mednafen efforts vs the Nyma system). But we also have to consider that some of these ported cores might not even exist if we hadn't taken the "easy" way out.
I'm against any specific requirements here, because we're all volunteers and the most important thing is being useful for the person who will actually work on the core. I think core porters should consider the value of various approaches, but if they want to get hacking, and they're the only one who will do it, let them.
To be clear, this issue is not arguing against the heavily-modified path. I'm just against adding hundreds of files directly to this repo.
I don't understand why that's worth discussion. Apart from concerns on implementing the cores, and concerns about keeping the cores synced with upstream, why does it matter?
For one, having an extra million LOC in the repo drastically increases the time it takes to clone. That affects both humans and CI.
I guess; aren't most devs going to check out most submodules anyway?
I for one only ever check out the submodules I require. That said, I don't think the C/C++ cores in the repo contribute much to the clone size; it's probably going to be the assets folder for the most part.
Seeing as you don't need the submodules cloned to build the solution, and we don't really advertise their existence, I'd imagine no, most devs aren't cloning the submodules.
Downloading edf5f15 as a zip (not the same as cloning, but similar, and easier to measure):
Well if we're doing this, here's a probably at least somewhat accurate and representative list of what contributes most to the Git object size list
Initial list was generated using
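One way a list like this could be produced (an assumption, not necessarily the command used above) is to pipe `git rev-list --objects --all` into `git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'` and group the blob sizes by top-level directory. The sketch below only does the grouping step; the function name and the sample paths in the usage note are hypothetical. Note that `%(objectsize)` is the uncompressed size, so the totals only approximate what a clone transfers.

```python
from collections import Counter


def size_by_top_level(batch_check_lines):
    """Sum blob sizes per top-level path component.

    Each input line is expected to look like
    "<type> <object id> <size> <path>", i.e. the output of the
    cat-file format string described above.
    """
    totals = Counter()
    for line in batch_check_lines:
        # Split into at most 4 fields so paths containing spaces stay intact.
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and malformed lines
        _kind, _object_id, size, path = parts
        top_level = path.split("/", 1)[0]
        totals[top_level] += int(size)
    # Largest contributors first.
    return totals.most_common()
```

Feeding it the lines of the piped command output (for example via `sys.stdin`) returns `(directory, bytes)` pairs, largest first.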
They're in
Anyway, that download size is awful, but the problem there is mostly unrelated to submodules and this ticket, as it's about prebuilt binaries. We'll need to get all of our crusty binary build setups into an easily reproducible build system.
A minor problem with doing this specifically is that CMake is used in places, and CMake does not allow including stuff in parent directories; you can only include stuff in the current directory and subdirectories. So unless you end up also having the CMakeLists be within the /submodules folder alongside the submodule, or possibly in the BizHawk root directory itself, you're screwed.
That's dumb. And I assume you need CMake on Windows too, so you can't just use a symlink, bind mount, etc. to work around it.
tl;dr: I'm calling for a moratorium on copying third-party code into the repo, and I'm proposing that we move a few dirs around.
We all hate code duplication. So why is it okay for us to needlessly copy entire codebases?
snip
(Cut out a bunch of back-and-forth here since I feel none of you need to be convinced this is a good idea, only that it's worth the effort to do it. edit: Apparently natt did need convincing. Scroll down for arguments.)
So let's assess the damage:
/ExternalCoreProjects/Virtu, Virtu core, < 11 kLOC, upstream is https://github.com/digital-jellyfish/Virtu/tree/master/Virtu
/ExternalProjects/NLua, Lua host, < 10 kLOC, upstream is https://github.com/NLua/NLua
/ExternalProjects/iso-parser, disc system, < 2 kLOC, upstream was on Google Code
/blip_buf, multiple cores, < 2 kLOC, upstream was on Google Code
/libmupen64plus, Mupen64Plus and plugins, < 340 kLOC, upstreams are under https://github.com/mupen64plus
/lynx, Handy core, < 12 kLOC, upstream is in tarball (or https://github.com/TASEmulators/mednafen/tree/master/src/lynx)
/psx, Octoshock core, < 68 kLOC, upstream is in tarball (or https://github.com/TASEmulators/mednafen/tree/master/src/psx)
/quicknes (replaced with submodule)
/waterbox/ares64/ares, Ares64 core, < 55 kLOC, upstream is https://github.com/ares-emulator/ares
/waterbox/bsnescore/bsnes, (new) BSNES core, < 67 kLOC, upstream is https://github.com/bsnes-emu/bsnes
/waterbox/gpgx, Genplus-gx core, < 88 kLOC, upstream is https://github.com/ekeeke/Genesis-Plus-GX
/waterbox/libsnes, (old) BSNES core, < 63 kLOC, upstream is https://github.com/bsnes-emu/bsnes
/waterbox/nyma/zlib, zlib for Nyma cores, < 19 kLOC, upstream is https://github.com/madler/zlib
/waterbox/picodrive, PicoDrive core, < 79 kLOC, upstream is https://github.com/notaz/picodrive
/waterbox/tic80, TIC-80 core, < 17 kLOC, upstream is https://github.com/nesbox/TIC-80
/waterbox/uzem, Uzem core, < 5 kLOC, upstream is https://github.com/Uzebox/uzebox/tree/master/tools/uzem
/waterbox/virtualjaguar/src, Virtual Jaguar core, < 52 kLOC, upstream was on a self-hosted Git server
/wonderswan, Cygne core, < 8 kLOC, upstream is in tarball (or https://github.com/TASEmulators/mednafen/tree/master/src/wswan)
In total, the upper bound is 917 kLOC of source code, severed from its trunks and waiting to bitrot. (For reference, just the BizHawk solution is ~450 kLOC, and Mesen is ~325 kLOC.)
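The issue does not say how these kLOC figures were measured. As a rough sketch of one way to get a comparable upper bound, the following counts every line (including blanks and comments) in files whose extension looks like source code. The extension set and function names are assumptions made for illustration, not anything taken from the BizHawk repo.

```python
import os

# Assumed set of "source" extensions; adjust to taste.
SOURCE_EXTENSIONS = {".c", ".h", ".cc", ".cpp", ".hpp", ".cs", ".s", ".asm"}


def count_lines(root, extensions=SOURCE_EXTENSIONS):
    """Count all lines in source-like files under root (blank lines included)."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if os.path.splitext(name)[1].lower() in extensions:
                # Binary mode avoids decoding errors on odd encodings.
                with open(os.path.join(dirpath, name), "rb") as handle:
                    total += sum(1 for _ in handle)
    return total


def kloc_upper_bound(root):
    """Round the line count up to whole thousands, giving a '< N kLOC' style bound."""
    return -(-count_lines(root) // 1000)  # ceiling division
```

Because comments, blank lines, and vendored dependencies are all counted, the result overestimates the amount of real code, which matches the "upper bound" framing above.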
Not listed are several binaries in /Assets/dll which are rundeps for unmanaged cores. I'll be looking at these as part of my Nix experiments. I also discounted dependencies of these projects whenever I noticed e.g. .../vendor near the bottom of the filesize list, but they are nonetheless checked in to this repo.
I propose that from today we only use Git submodules for this purpose: submodules in /submodules, and supplemental files like Makefiles or shell scripts in one of /ExternalProjects, /ExternalCoreProjects, or /waterbox when those are necessary.
I also propose we migrate all subprojects, starting with those listed above, to the same scheme.
Adding a submodule and removing the checked-in copy can be squished into a single commit.
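A minimal sketch of what that single commit could look like, expressed as the sequence of Git commands to run from the repo root. The helper names and the destination path in the usage note are hypothetical, and the sketch assumes the checked-in copy carries no local modifications; a modified copy would first need a fork holding those changes.

```python
import shlex


def submodule_migration_commands(vendored_path, upstream_url, submodule_path, message=None):
    """Return the Git commands that swap a checked-in copy for a submodule."""
    if message is None:
        message = f"Replace {vendored_path} with submodule {submodule_path}"
    return [
        # Remove the checked-in copy from the index and working tree.
        ["git", "rm", "-r", "--quiet", vendored_path],
        # Register the upstream repo; this stages .gitmodules and the gitlink.
        ["git", "submodule", "add", upstream_url, submodule_path],
        # Record the removal and the addition together.
        ["git", "commit", "-m", message],
    ]


def render(commands):
    """Format the commands as shell lines for review before running them."""
    return "\n".join(shlex.join(command) for command in commands)
```

For example, `render(submodule_migration_commands("waterbox/uzem", "https://github.com/Uzebox/uzebox", "submodules/uzebox"))` prints the three commands for review; `submodules/uzebox` is only an illustrative destination.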
Moved here from my personal notes #118; see also #2423, my personal notes #11, and #2312.