
Investigate title page performance #73

Closed

AMDmi3 opened this issue Jan 29, 2018 · 4 comments

Comments

AMDmi3 (Member) commented Jan 29, 2018

The title page currently renders in ~1.5 seconds, which is too slow.

There were also several performance pessimizations after the latest refactoring of the database class:

[attached image: title page response time graph]

Should investigate and improve.

PS. This is not too critical for now, as the title page is aggressively cached by our nginx.

AMDmi3 (Member, Author) commented Jan 29, 2018

  • 13th - likely the wikidata addition, which added a lot of version labels
  • 19th - database refactoring (2804a63)
  • 24th - likely deploy of 9f8b9f7, also cb573a9 and 21323ad

AMDmi3 (Member, Author) commented Jan 29, 2018

Yeah, too many versions from wikidata. At least on the title page, we probably want to limit the number of outdated versions to around ten, since it's only a demonstration.
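
For illustration, a minimal sketch of what such a cap could look like when preparing title page data; the function, the data structure and the `outdated` status value are assumptions for this example, not the actual repology code:

```python
# Hypothetical sketch of capping the outdated versions shown per project on
# the title page; field names and structures are assumptions, not taken from
# the real repology codebase.

MAX_OUTDATED_VERSIONS = 10  # the "around ten" limit discussed above


def trim_outdated(versions, limit=MAX_OUTDATED_VERSIONS):
    """Return versions with at most `limit` outdated entries kept."""
    outdated_kept = 0
    result = []
    for version in versions:  # assumed: dicts with 'version' and 'status' keys
        if version['status'] == 'outdated':
            if outdated_kept >= limit:
                continue  # drop excess outdated versions from display only
            outdated_kept += 1
        result.append(version)
    return result
```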

AMDmi3 (Member, Author) commented Oct 11, 2018

Was:
a. Getting packages from the database: 282ms
b. Calculating summaries: 520ms (now 210ms)
c. Rendering: 171ms

After 6a899c4, which optimized (b) by 2.5x, the slowest part is the database, but the remaining time is spread roughly evenly across the other parts as well. Possible solutions:

  • Fetch less from the database, as currently it's a random fetch per package, and there are 27k packages for the 112 projects we show on the title page. Fetching only 5 columns does not help either.
    • Limit the number of projects on the main page. That feels like a retreat, though. Note that the same scenario occurs when running a metapackage search with the lower limit on families set to some high value, and we can't affect that list. Technically, though, we could lower the number of items displayed on a page based on the metapackage statistics we fetch before the actual packages. 👎
    • Only fetch recent versions - this would limit the amount of data fetched and transferred, but would not affect the number of projects shown. This is tricky, as we still need to fetch each package and check its version, and calculating a version threshold is tricky on its own. 👎
    • Introduce additional database structures for this kind of load, such as a covering index on the required fields (see the sketch after this list). It turns out not to weigh too much (around 100MB currently) and can actually replace packages_effname_idx, which is 60MB, thus taking only 40 additional megabytes. Tests show that even when it is used, performance doesn't change, though. A separate table doesn't help either, even though it weighs 1/10th of packages. On another box, though, the same query takes 15 seconds, and in the index-only scan case it doesn't produce any heap fetches.
    • General database optimization which will make more data fit into memory
    • Cache summary information and update piecewise
  • Drop Python
  • Limit the number of versions shown on the site
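
For the covering-index option above, a minimal sketch of what could be tried; only the effname key (implied by the existing packages_effname_idx) comes from the discussion, while the index name, the INCLUDEd columns and the connection string are assumptions, and PostgreSQL 11+ INCLUDE syntax is assumed:

```python
# Hypothetical sketch of the covering index mentioned above. Only the effname
# key column is implied by the existing packages_effname_idx; the index name,
# the INCLUDEd columns and the DSN are assumptions for illustration.
import psycopg2

DSN = 'dbname=repology'  # assumed connection string

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        # PostgreSQL 11+ covering index: effname is the lookup key, the
        # INCLUDE columns are stored in the index so the title page query
        # can be served by an index-only scan without heap fetches.
        cur.execute("""
            CREATE INDEX IF NOT EXISTS packages_effname_covering_idx
                ON packages (effname)
                INCLUDE (version, versionclass, repo)
        """)
```

If adopted, such an index could replace the existing packages_effname_idx as noted above, though the measurements in this comment suggest it did not change query performance on the main box.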

AMDmi3 transferred this issue from repology/repology-updater Oct 4, 2019
AMDmi3 (Member, Author) commented Nov 15, 2024

Close in favor of repology/repology-rs#93

AMDmi3 closed this as not planned Nov 15, 2024