Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[24.0] Use hg clone --stream to clone repos #17786

Merged
merged 2 commits into from
Mar 19, 2024

Conversation

mvdbeek
Copy link
Member

@mvdbeek mvdbeek commented Mar 19, 2024

We're exceeding (already generous) timeouts in test_0110_reset_metadata_on_all_repositories

The docs says this about --stream:

In normal clone mode, the remote normalizes repository data into a common exchange format and the receiving end translates this data into its local storage format. --stream activates a different clone mode that essentially copies repository files from the remote with minimal data processing. This significantly reduces the CPU cost of a clone both remotely and locally. However, it often increases the transferred data size by 30-40%. This can result in substantially faster clones where I/O throughput is plentiful, especially for larger repositories. A side-effect of --stream clones is that storage settings and requirements on the remote are applied locally: a modern client may inherit legacy or inefficient storage used by the remote or a legacy Mercurial client may not be able to clone from a modern Mercurial remote.

I think this is overall beneficial for us, since we often need to clone few but large files (think compressed or binary test data).

I have no idea if that solves the test timeouts, it's marginally faster locally, but this also doesn't time out locally.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

We're exceeding (already generous) timeouts in test_0110_reset_metadata_on_all_repositories

The docs says this about `--stream`:

> In normal clone mode, the remote normalizes repository data into a common exchange format and the receiving end translates this data into its local storage format. --stream activates a different clone mode that essentially copies repository files from the remote with minimal data processing. This significantly reduces the CPU cost of a clone both remotely and locally. However, it often increases the transferred data size by 30-40%. This can result in substantially faster clones where I/O throughput is plentiful, especially for larger repositories. A side-effect of --stream clones is that storage settings and requirements on the remote are applied locally: a modern client may inherit legacy or inefficient storage used by the remote or a legacy Mercurial client may not be able to clone from a modern Mercurial remote.

I think this is overall beneficial for us, since we often need to clone
few but large files (think compressed or binary test data).

I have no idea if that solves the test timeouts, it's marginally faster
locally, but this also doesn't time out locally.
@mvdbeek mvdbeek changed the title [24.0] Use hg clone --stream to clone repos [24.0] Use hg clone --stream to clone repos Mar 19, 2024
@github-actions github-actions bot added this to the 24.1 milestone Mar 19, 2024
@mvdbeek
Copy link
Member Author

mvdbeek commented Mar 19, 2024

With the changes test_0110_reset_metadata_on_all_repositories takes 89 seconds, without it's 134 seconds ... caveat is I only ran it once, but it also seems like the tests are now passing when they consistently failed before.

@jdavcs jdavcs modified the milestones: 24.1, 24.0 Mar 19, 2024
@mvdbeek
Copy link
Member Author

mvdbeek commented Mar 19, 2024

Hmm, still timing out :(

@mvdbeek
Copy link
Member Author

mvdbeek commented Mar 19, 2024

ok, 1 out of 3 times out ... I'll put an xfail on it, no need to hold up unrelated PRs

Copy link
Contributor

@davelopez davelopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mvdbeek mvdbeek merged commit 7850c04 into galaxyproject:release_24.0 Mar 19, 2024
48 of 49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants