731 Performance Analysis for Nihms Loader #63

Merged
tsande16 merged 12 commits into main from 731-perfom-analysis-nihms-loader on Oct 11, 2023

Conversation

tsande16
Contributor

Identified areas where the run time of the Nihms Loader could be improved. The goal was to bring it under 5 minutes; it currently runs in 4m56s locally.

Notable changes:

  • The Entrez PMID lookup was calling an external service, but because the PMIDs in TransformAndLoadSmokeIT are test IDs, the lookups returned null. Each null result triggers a retry after a timeout (required by the external service), so the test spent more time waiting than necessary. The test now uses WireMock for the service, and a new system environment variable sets the timeout to 0 ms.
  • Reduced the number of real award numbers used to test grant award number searching. I went through the file manually, identified similar award numbers, and removed the redundant ones.
  • Reduced the number of test cases in TransformAndLoadSmokeIT. A few cases were similar to each other, so the redundant ones were removed without affecting coverage.
  • Performed some small cleanups as well.
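
For reviewers who want a picture of the first change, here is a minimal sketch of a retry wait that can be configured down to 0 ms. It is not the loader's actual code: the class name `RetryWaitSketch`, the methods `resolveWaitMs` and `lookupWithRetry`, and the 2000 ms default are all hypothetical, and a plain `Supplier` stands in for the stubbed service rather than WireMock itself.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RetryWaitSketch {

    // Hypothetical default wait between the first lookup and the retry.
    static final long DEFAULT_WAIT_MS = 2000L;

    // Parse a configured wait (e.g. read from an environment variable).
    // Falls back to the default when the value is unset, invalid, or negative.
    static long resolveWaitMs(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_WAIT_MS;
        }
        try {
            long value = Long.parseLong(configured.trim());
            return value >= 0 ? value : DEFAULT_WAIT_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_WAIT_MS;
        }
    }

    // Run the lookup once; if it returns null, wait and try exactly once more.
    static <T> T lookupWithRetry(Supplier<T> lookup, long waitMs) {
        T result = lookup.get();
        if (result != null) {
            return result;
        }
        try {
            Thread.sleep(waitMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return lookup.get();
    }

    public static void main(String[] args) {
        // As if the test environment had set the wait to "0".
        long waitMs = resolveWaitMs("0");
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for a stubbed service that never finds the test PMID.
        String record = lookupWithRetry(() -> {
            calls.incrementAndGet();
            return null;
        }, waitMs);
        System.out.println("wait=" + waitMs + "ms, lookups=" + calls.get() + ", record=" + record);
    }
}
```

With the wait at 0 ms the retry still happens, so the retry path stays covered, but the test no longer sits idle between the two attempts.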

To test:

mvn verify

@tsande16 tsande16 self-assigned this Oct 10, 2023
@tsande16 tsande16 force-pushed the 731-perfom-analysis-nihms-loader branch from 59dba49 to a9421db Compare October 10, 2023 14:07
@tsande16
Contributor Author

Forgot to rebase before opening the PR. It has now been rebased onto main.

Contributor

@rpoet-jh rpoet-jh left a comment

Nice work, Tim! mvn verify completed in ~5 mins for the whole nihms project on my machine; that is a great improvement.

@tsande16 tsande16 merged commit a18cc57 into main Oct 11, 2023
@tsande16 tsande16 deleted the 731-perfom-analysis-nihms-loader branch October 11, 2023 12:52
2 participants