Fix up CLI and FathomFox dependencies to make fathom train run again #329

Open, wants to merge 8 commits into base: master

@linabutler (Member) commented Jun 28, 2023

Hi! 👋🏼 I'm doing some prototyping with Fathom, and ran into a few dependency-related snags getting `fathom train` to work. This PR is an attempt to fix all of them up. With this patch stack, `fathom train` works for me, and prints metrics and results! 🎉

I haven't worked with (or on) Fathom before, but happy to revert or fix up any commits. The individual commit messages have some more details about the versions I chose, but here's a quick summary:

  • Newer NumPy versions aren't compatible with Fathom's version of tensorboardX, so I pinned NumPy to the newest compatible version.
  • Selenium 4 has some breaking changes, but 3 isn't compatible with modern Firefoxes, so I bumped the version of Selenium (and geckodriver, on the FathomFox side) and updated the Vectorizer to use it.
  • Yarn was very confused trying to install FathomFox's dependencies, even after I regenerated FathomFox's copy of `yarn.lock`. I think it's unhappy that `npm install` now saves to `package.json` by default, and generates its own `package-lock.json` lockfile that interferes with Yarn's. After a bit of debugging, I just replaced Yarn with npm.

I also had to use Python 3.9.13. It looks like 3.11 is too new for the version of PyTorch that Fathom depends on—but I wasn't sure about bumping that dependency just yet.

Thanks!

/cc @gleonard-m @DimiDL

tensorboardX 1.9 isn't compatible with NumPy >= 1.24.0
(pytorch/pytorch#91516), because it uses a deprecated comparison
that throws a `TypeError` in newer versions.
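
For illustration, the constraint above could be expressed as a small guard. This is a minimal sketch, not code from this patch: the names `parse_major_minor` and `numpy_is_compatible` are hypothetical, and the only fact taken from the commit message is the `1.24.0` cutoff.

```python
# First NumPy release reported as incompatible with tensorboardX 1.9
# (cutoff taken from the commit message; everything else is illustrative).
FIRST_INCOMPATIBLE = (1, 24)


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.23.5'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def numpy_is_compatible(version):
    """True if this NumPy version predates the 1.24.0 cutoff."""
    return parse_major_minor(version) < FIRST_INCOMPATIBLE


if __name__ == "__main__":
    for candidate in ("1.23.5", "1.24.0", "2.0.0"):
        print(candidate, numpy_is_compatible(candidate))
```

The same upper bound can be written at install time as a requirement specifier such as `numpy<1.24`.
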

Selenium 4 removes the deprecated signatures and methods still used by
the Vectorizer (mozilla#308), and Selenium 3 is incompatible
with newer versions of Firefox, so we can't pin to the older version.
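
As a rough picture of what the Selenium 4 style looks like in general, here is a sketch that assumes Selenium 4 and geckodriver are installed. The helper names `make_firefox_driver` and `find_by_css` are hypothetical, and this is not the Vectorizer's actual code.

```python
def make_firefox_driver(geckodriver_path, headless=True):
    """Start Firefox using Selenium 4's Service and Options objects."""
    # Imported lazily so this module still loads where Selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    if headless:
        # Firefox's own command-line flag for headless mode.
        options.add_argument("-headless")
    # The geckodriver location is passed through a Service object
    # rather than as a direct keyword argument to webdriver.Firefox().
    service = Service(executable_path=geckodriver_path)
    return webdriver.Firefox(service=service, options=options)


def find_by_css(driver, selector):
    """Selenium 4 spelling of the removed find_element_by_css_selector()."""
    from selenium.webdriver.common.by import By

    return driver.find_element(By.CSS_SELECTOR, selector)
```

Calling `make_firefox_driver("/path/to/geckodriver")` would launch a browser, so the path here is a placeholder to replace with a real one.
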

Flake8 3.x uses a deprecated interface that was removed in
`importlib-metadata` 5 (python/importlib_metadata#406).

Yarn fails to install its dependencies during the "building FathomFox
with your ruleset" phase, reporting a syntax error in the cached
FathomFox's `yarn.lock`.

Regenerating `yarn.lock` in the repo's copy of FathomFox fixes the
syntax errors, but then Yarn fails to resolve package URLs during the
same phase.

I think this is because the Vectorizer's invocation of `npm install
yarn` in the cached FathomFox directory itself alters `package.json`,
and writes a `package-lock.json` that confuses Yarn. Yarn specifically
warns about `package-lock.json` causing resolution inconsistencies.

npm's performance has improved quite a bit since Yarn was added, and
its lockfile also supports hashes. Instead of debugging further, I
opted to use npm to manage FathomFox's dependencies, and replaced
`yarn install` with `npm install`.
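
The replacement step can be sketched roughly as follows. This is an assumption-laden illustration rather than the code in this patch: `npm_install_command` and `install_dependencies` are hypothetical names, and the directory argument stands in for wherever the cached FathomFox copy lives.

```python
import subprocess
from pathlib import Path


def npm_install_command():
    """Command that replaces the earlier `yarn install` invocation."""
    return ["npm", "install"]


def install_dependencies(fathomfox_dir):
    """Run `npm install` inside a FathomFox checkout (path is caller-supplied)."""
    directory = Path(fathomfox_dir)
    if not (directory / "package.json").is_file():
        raise FileNotFoundError(f"no package.json in {directory}")
    # check=True raises CalledProcessError if npm exits with a non-zero status.
    subprocess.run(npm_install_command(), cwd=directory, check=True)
```

Checking for `package.json` first gives a clearer error than letting npm fail in an unexpected directory.
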