
(Discussion) High CPU usage in DSpace frontend related to Angular Server Side Rendering (SSR) #3110

Open · tdonohue opened this issue Jun 10, 2024 · 23 comments
Labels: bug, component: SEO, help wanted, high priority, needs discussion, performance / caching

@tdonohue (Member) commented Jun 10, 2024

Describe the bug

This is a discussion ticket related to ongoing issues with high CPU usage related to Angular SSR. It's a place to gather public notes & recommendations while developers work on solutions that can be brought back into DSpace code and documentation.

Hints / Notes

A few common tips/hints have been helpful:

  1. First, ensure you are aware of the Performance Tuning Documentation.
    • Make sure you are using "cluster mode" for PM2! Define several "instances" in that PM2 cluster so that PM2 can scale across multiple CPUs. (A minimal example configuration is sketched after this list.)
    • The built-in, memory-based caching of server-side rendered pages seems beneficial for some sites. But be aware that it is currently not a shared cache across the "instances" in the PM2 cluster (each instance keeps its own cache). Because of this, larger sites may not see much benefit, as caching large amounts of content requires a lot of memory.
    • See discussion of example settings in this dspace-tech thread.
  2. Consider disabling Angular's inlineCriticalCss, which can improve SSR performance.
  3. Some sites have noted that, immediately after upgrading, they were swarmed by bots (search engines reindexing, AI-related crawlers, etc.). Over time, some have found that this bot traffic slows down, but the early days may require actively blocking or throttling highly active bots.
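For reference, a minimal PM2 cluster configuration might look like the sketch below (the path and instance count are illustrative; consult the Performance Tuning Documentation for settings appropriate to your site). Such a file is often saved as dspace-ui.json and started with "pm2 start dspace-ui.json", after which PM2 load-balances incoming requests across the Node processes:

{
  "apps": [
    {
      "name": "dspace-ui",
      "cwd": "/home/dspace/dspace-ui-deploy",
      "script": "dist/server/main.js",
      "instances": 4,
      "exec_mode": "cluster",
      "env": {
        "NODE_ENV": "production"
      }
    }
  ]
}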

Known Community Customizations

What follows are a few (known) customizations from DSpace community members to work around or partially solve these CPU issues:

Brainstorms

Random ideas/brainstorms that are unproven at this time:

  • Is there a way to use Apache "mod_cache" to cache SSR-generated pages? (An untested sketch follows this list.)
    • As with any external cache solution, it may negatively impact download/view statistics (in Solr statistics), as hits served from the external cache would never send "viewevents" to the REST API (which are necessary for statistics). It is unclear how to work around this limitation of external caching solutions.
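As a starting point for experimentation, here is an untested sketch of what such a setup might look like. The directives are standard mod_cache / mod_cache_disk; the interaction with sessions and view statistics noted above is exactly the open question:

# Untested sketch: disk-backed cache in front of the Angular UI.
# Requires mod_cache and mod_cache_disk to be enabled.
CacheQuickHandler off
CacheRoot "/var/cache/apache2/mod_cache_disk"
CacheEnable disk "/"
CacheDefaultExpire 300
# Never cache the REST API (assuming it is proxied under /server),
# otherwise auth responses and "viewevents" would be cached too.
CacheDisable "/server"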

Mailing List Discussions

These are (known) mailing lists discussions related to this topic:

How can you help?

DSpace Committers & Developers are striving to improve the performance of DSpace 7+ and minimize its resource requirements. We welcome brainstorms, solutions, shared code, and improved documentation that align with these goals. DSpace is community-built/maintained software (with no centralized development team). The more details community members can share, the more quickly we can locate/design solutions that benefit everyone.

@tdonohue added the labels bug, help wanted, high priority, performance / caching, needs discussion — Jun 10, 2024
@paulo-graca (Contributor)

I want to share that we saw the same behavior described here when we went live with our first DS7 deployment.
[screenshot: CPU usage graph]

We are still experiencing constant CPU demand compared to the previous version, mainly in the Angular processes, due to crawler/robot activity.

We use Apache mod_proxy in front of the service. We have created Apache rules that return a 404 for requests to /simple-search?query= and /handle/HANDLE_PREFIX/ID/simple-search?query=, preventing Angular from rendering those broken URLs.

Like:

<Location "/simple-search">
  # Legacy (pre-DSpace 7) search URL, still probed by crawlers: always return 404
  Order Deny,Allow
  Deny from all
  Allow from 127.0.0.1
  ErrorDocument 404 "Page not found"
  RewriteEngine On
  RewriteRule .* - [R=404,L]
</Location>

<LocationMatch "/handle/(.*)/(.*)/simple-search">
  # Same for the per-handle legacy search URLs
  Order Deny,Allow
  Deny from all
  Allow from 127.0.0.1
  ErrorDocument 404 "Page not found"
  RewriteEngine On
  RewriteRule .* - [R=404,L]
</LocationMatch>

Also, we use Apache mod_evasive with some strict traffic restrictions (we are still fine-tuning it), and we intend to try out some of the above suggestions (like disabling inlineCriticalCss) and to further tune the Angular cache values.

@mwoodiupui (Member)

The built-in memory cache isn't even shared across processes on the same server.

The Redis approach is interesting to me because, while Redis is another in-memory cache, it can be shared across processes.

@mwoodiupui (Member)

We tried Apache HTTPD mod_cache here and ran into indiscriminate sharing of Shibboleth tickets across distinct users' sessions. I haven't had time to investigate, so it might be due to unrelated changes that were hastily backed out at the same time that we disabled the external cache. If the cache was the culprit, I do not know whether a smarter configuration would fix that.

@hutattedonmyarm (Contributor)

We (Universität Konstanz) have also made some customizations.

We've opted to use Varnish (instead of Redis, mod_cache, or the built-in cache). This has a few advantages for us:

  • The cache is shared between all instances
  • We run a multilingual repository, and the built-in cache does not distinguish by language
  • Finer control over what gets cached, and when
  • Things like static assets rarely need to touch the Node processes at all, which is especially useful while the frontend is under high load

We've also disabled SSR for logged-in users, because they don't need it and it saves resources for us. A minimal sketch of the idea (not our exact patch):
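(This sketch assumes cookie-parser is active and that the DSpace auth cookie name is exported as TOKENITEM, which server.ts already imports from the auth-token-info model; names here are illustrative.)

function ngApp(req, res) {
  // Serve plain CSR to authenticated users; reserve SSR for anonymous users and crawlers
  const loggedIn = !!(req.cookies && req.cookies[TOKENITEM]);
  if (environment.universal.preboot && !loggedIn) {
    serverSideRender(req, res);
  } else {
    clientSideRender(req, res);
  }
}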

«4Science has updated DSpace-CRIS to exclude the Search component from SSR. This search component is not necessary for SEO, so it may make sense to disable SSR for this component.»

We did something similar, but only for the "related publications list" on person entity pages, as that took quite a while to load. We may extend this to all searches in the future, though.

We also extended the robots.txt to disallow additional robots we don't want; for example:
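(Illustrative only, not our exact file; real bot names vary. A robots.txt entry excluding one crawler and slowing down the rest looks like:)

# Block a specific unwanted crawler entirely
User-agent: ExampleAggressiveBot
Disallow: /

# Ask remaining crawlers to slow down (non-standard directive, but honored by some bots)
User-agent: *
Crawl-delay: 10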

@michdyk (Contributor) commented Jun 25, 2024

We provide documentation that you may find helpful: a case study on the Jagiellonian University Repository describing the solution we used to replace the built-in cache of server-side rendered pages with Redis:
Replace the build-in Server-Side Rendered pages with Redis.pdf
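For orientation, the general idea of a shared Redis cache for SSR output can be sketched as Express middleware like the following. This is not the code from the PDF; all names are illustrative, and it assumes the node-redis v4 client and that the middleware is registered before the SSR handler in server.ts:

/*
 * Sketch: cache rendered HTML in Redis so it is shared by all PM2 instances.
 * Cache failures are deliberately non-fatal.
 */
import { NextFunction, Request, Response } from 'express';
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL || 'redis://localhost:6379' });
redis.connect().catch(console.error);

export function redisSsrCache(ttlSeconds = 300) {
  return async (req: Request, res: Response, next: NextFunction) => {
    // Only cache GET page requests
    if (req.method !== 'GET') { return next(); }
    const key = 'ssr:' + req.originalUrl;
    const cached = await redis.get(key);
    if (cached) {
      res.setHeader('Content-Type', 'text/html');
      res.send(cached);
      return;
    }
    // Wrap res.send so the HTML produced by Angular SSR is stored on the way out
    const originalSend = res.send.bind(res);
    res.send = ((body: unknown) => {
      if (res.statusCode === 200 && typeof body === 'string') {
        redis.setEx(key, ttlSeconds, body).catch(() => { /* ignore cache errors */ });
      }
      return originalSend(body);
    }) as typeof res.send;
    next();
  };
}

A real deployment would also need to skip authenticated requests and vary the key by language; see the PDF for the actual approach used.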

@tdonohue added the component: SEO label — Jun 26, 2024
@pnbecker (Member)

A person with some Angular experience (not a complete expert) asked me why we make so many small requests to the backend instead of fewer but larger ones. One example is the submission form, where we send one request for each section of the form, one request for each vocabulary, and even more requests, instead of one request for all the information we need to build up the form.

@hutattedonmyarm (Contributor)

We've been wondering about that as well. Same with /api/authz/authorizations/: so many requests. Same thing with search facets: the configuration, the facets, and the facet values are all loaded in separate requests.

@tdonohue (Member, Author) commented Jun 27, 2024

There was some good discussion in today's Dev Mtg about areas where we should consider "bundling" many smaller REST Requests into a single (or fewer) requests. They include:

Those in attendance agreed that we should work on efficiency of these requests. Ideally we should bundle requests better and send fewer requests per page.

That said, this sort of work requires help/volunteers/investigation. I'll work on creating tickets for each of these, as they are separate issues which will often require individual solutions.

(These many requests are semi-related to SSR performance, as they likely contribute to the slowness. However, they are less likely to be "quick fixes", as some may require creating a new backend endpoint and refactoring the UI code.)

@hutattedonmyarm (Contributor)

There's another one: browsing the landing page of the demo and clicking through to one of the communities also loads the data twice:
[screenshot: duplicate community requests in the network tab]
The first one is a 302 redirect, but it already contains the community object. Then the actual object is requested again.
#458 also loads data twice.

It's not only the number of requests but also their size. E.g., clicking on a publication to go to its page fetches the whole "owningCollection" object in a separate request. It's unclear what this is for; if it is needed, it should be embedded in the item rather than requested separately.

@atarix83 (Contributor) commented Jun 28, 2024

@tdonohue @pnbecker

«A person with some Angular experience (not a complete expert) asked me why we make so many small requests to the backend instead of fewer but larger ones. One example is the submission form, where we send one request for each section of the form, one request for each vocabulary, and even more requests, instead of one request for all the information we need to build up the form.»

The many requests made during the submission form are not relevant to this problem, since the submission form is not rendered during SSR:
https://github.com/DSpace/dspace-angular/blob/main/src/app/submission/edit/submission-edit.component.ts#L150

Anyway, I agree that we should optimize the number of requests to the server where possible.
I reckon that when we have multiple requests to the same endpoint referring to different DSpace objects, we could try to group them into a single one.
The authorizations endpoint is one example: it has a search method which takes multiple uuids for the same authorization feature request (https://github.com/4Science/Rest7Contract/blob/main/authorizations.md#objects). This search method should be used wherever we handle a set of DSpace objects, as in the search component.

We did something along these lines in DSpace-CRIS with the item endpoint (https://github.com/4Science/Rest7Contract/blob/main-cris/items.md#findallbyids), to avoid multiple requests to the item endpoint just to fetch embedded objects (as the search does for thumbnails and other objects):
[screenshot: a single findAllByIds request replacing multiple item requests]

This approach helps not only SSR but also search functionality in general, because it speeds up the loading of search results.

More generally, beyond the high number of REST requests, SSR itself can introduce overhead, especially for resource-intensive apps that require heavy server-side processing or complex data computations. In such cases, SSR may strain server resources and slow down rendering. To mitigate these issues, we should profile and optimize resource-intensive code using a Node monitoring tool.

@mwoodiupui (Member)

May I suggest that removing redundant requests and bundling related requests are important and should be addressed, but that we may need a second ticket for issues that are unrelated to SSR.

@tdonohue (Member, Author) commented Jul 3, 2024

Today, I've moved a number of the performance issues described above into separate tickets. They are not directly related to this SSR issue (or only very loosely related), but obviously are performance issues themselves which require fixing:

If there are other obvious areas where DSpace is (inefficiently) making many requests, then I'd recommend others create additional tickets as well. While these issues are all semi-related, they all may need to be solved separately, as the solutions may not be identical in all cases.

@paulo-graca (Contributor)

«4Science has updated DSpace-CRIS to exclude the Search component from SSR. This search component is not necessary for SEO, so it may make sense to disable SSR for this component.»

I tend to disagree here. Search results are relevant for SEO, since faceted results can lend your repository/site relevance on particular topics. Imagine that your repository covers the topic of horses: you could have faceted results per species, which gives the repository more relevance for the "horse" topic and for each species itself. If you pay attention to your repository traffic, you will see several bots trying combinations of search results. If we remove search from SSR, crawlers could not index those pages, nor promote your site on specific topics.

I think there is room for improvement regarding the search results, though. For a single request to a search results page retrieving just one result (a single item per page), our system performs 38 different database queries:

  • 8 from resourcepolicy table
  • 7 from collection table
  • 6 from item table
  • 4 from metadatavalue table
  • 3 from public.community table
  • 1 from handle table
  • 1 from epersongroup table
  • 1 from relationship_type table
  • 1 from relationship table
  • 1 from community2collection table
  • 1 from community2community table
  • 1 from bundle2bitstream table
  • 1 from fileextension table
  • ...

If I change it to retrieve 10 records per page, the system performs more than 500 database queries. This is not scalable. What if we just used Solr, and retrieved the data stored in Solr for search results?

There is a trade-off here: more time spent indexing documents, versus more time and performance degradation retrieving them.
I propose a big change to DSpace here: DSpace/DSpace#9736, extending the concept of virtual metadata in DSpace. In practice, we could store structured data in Solr and retrieve it later for search results, avoiding extra DB queries for additional or related data.

@tdonohue (Member, Author) commented Aug 2, 2024

@paulo-graca : I think your comment may (partially) belong over on #3231, which is the ticket created specifically for this task.

As I've clarified on that ticket, the reason I noted that search results are "not necessary for SEO" is that we already tell crawlers in our robots.txt file to ignore the search page & facets: https://github.com/DSpace/dspace-angular/blob/main/src/robots.txt.ejs#L11 So, it's odd to have this component available in SSR when (well-behaved) crawlers are being told not to use it.

That said, I do agree that we may want to make this configurable, so that sites can choose to allow search results via SSR. Overall the goal of #3231 is to see if there is "quick fix" performance improvement at least until we can determine ways to improve performance of search in general.

I agree with you that performance of search is NOT ideal, but that also seems like a larger redesign / discussion. Any larger redesign would not help the performance of sites on 7.x or 8.x (as it wouldn't be possible to backport). However, a small change to optionally exclude the search.component from SSR could be backported to 7.x/8.x, as it might be considered a performance fix for those platforms.

UPDATE: Also, it's worth noting we have a separate ticket that is about smaller fixes to search performance: #3163 However, this ticket is more about decreasing the number of REST API requests (from the UI) to perform a single search...which hopefully should decrease the database queries as well.

@paulo-graca (Contributor)

Thank you @tdonohue. I agree; I'm proposing a big change that perhaps can't be added easily to DS8 and DS7. I've also added a comment to #3231, and I agree that it should be configurable. My only point is that it matters for SEO to have search results available in SSR, both for crawlers and for the Internet Archives out there.

@vitorsilverio (Contributor)

This week @autavares-dev and I tried disabling SSR on all pages that are not in the sitemaps; in short, all pages whose URL does not include "/items/". Our infrastructure has 5 frontend pods, which had a CPU usage of 1.9 cores; that dropped to 500 millicores after this change.
This change will not affect crawlers that follow robots.txt and the sitemaps.

@tdonohue (Member, Author) commented Aug 27, 2024

@vitorsilverio or @autavares-dev : Could you share how you disabled SSR (by linking to code, or providing an example)? I'm also not entirely clear what you mean when you say usage started at "1.9 cores" and "decreased to 500 millicores". Could you clarify further?

Overall, this sounds promising, but we'd want to understand more about what you've found to be successful. It might be that others could try a similar approach, or even help make it configurable (so that sites can easily turn SSR off for some pages as needed).

@vitorsilverio (Contributor) commented Aug 28, 2024

@tdonohue
Version: dspace-angular 7.6.1
File: server.ts line 236:

/*
 * The callback function to serve server side angular
 */
function ngApp(req, res) {
  // Only use server-side rendering for specific routes that include items or entities
  if (environment.universal.preboot && (req.url.includes('/items/') || req.url.includes('/entities/')) && req.method === 'GET') {
    // Render the page to user via SSR (server side rendering)
    serverSideRender(req, res);
  } else {
    // Otherwise (preboot disabled, non-GET request, or any other route), just serve the client
    console.log('Universal off, serving for direct client-side rendering (CSR)');
    clientSideRender(req, res);
  }
}

CPU usage:
[graph: CPU usage over time — the original code until Fri 23, the change above afterwards]

@vitorsilverio (Contributor) commented Sep 2, 2024

@tdonohue after contact from Ollie (at Google Scholar) we realized that not all endpoints in the sitemaps were using SSR, so we changed to:

diff --git a/server.ts b/server.ts
index 93f3e8687..a5ad17c59 100644
--- a/server.ts
+++ b/server.ts
@@ -56,6 +56,7 @@ import { extendEnvironmentWithAppConfig } from './src/config/config.util';
 import { logStartupMessage } from './startup-message';
 import { TOKENITEM } from './src/app/core/auth/models/auth-token-info.model';
 
+import { SSR_PATHS } from './ssrpaths';
 
 /*
  * Set path for the browser application's dist folder
@@ -238,7 +239,8 @@ export function app() {
  * The callback function to serve server side angular
  */
 function ngApp(req, res) {
-  if (environment.universal.preboot) {
+  // Only use server-side rendering for GET requests whose URL matches one of SSR_PATHS
+  if (environment.universal.preboot && req.method === 'GET' && SSR_PATHS.some(p => req.url.includes(p))) {
     // Render the page to user via SSR (server side rendering)
     serverSideRender(req, res);
   } else {
diff --git a/ssrpaths.ts b/ssrpaths.ts
new file mode 100644
index 000000000..9e63d7a12
--- /dev/null
+++ b/ssrpaths.ts
@@ -0,0 +1,11 @@
+/**
+ * Path list for SSR
+ */
+export const SSR_PATHS = [
+  '/items/',
+  '/entities/',
+  '/collections/',
+  '/communities/',
+  '/bitstream/',
+  '/bitstreams/',
+];

@vitorsilverio (Contributor)

Two new path prefixes are required by Google Scholar:
'/bitstream/' and '/bitstreams/'

@pcg-kk commented Sep 26, 2024

I think that we are focusing on the wrong part of the optimization. IMO, there is no problem with SSR as such in Angular; the issue is how CPU-consuming the whole application is. A few areas cause this situation:

  • Many selectors with parameters, for which NgRx memoization doesn't work
  • Everything is pushed to the store and subscribed to, so the store is refreshed a lot, and (because of the previous point) the application accumulates unnecessary subscriptions.

This complicated structure sometimes causes multiple unnecessary data refreshes with additional mapping; the script, instead of mapping data from the response, is too busy computing selections. That drives the CPU to its maximum.

And on the SSR side we simply render a lot of pages at the same time, so the CPU is a crucial resource. (The first bullet is illustrated in general terms in the sketch below.)
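To make the parameterized-selector pitfall concrete in general NgRx terms (this is an illustration, not DSpace code): a factory that builds a new selector on every call defeats memoization, so every store emission re-runs the projector for every subscriber. Pooling selector instances per parameter value is one common mitigation.

import { createFeatureSelector, createSelector, MemoizedSelector } from '@ngrx/store';

interface CacheState { [href: string]: unknown; }
const selectCache = createFeatureSelector<CacheState>('cache');

// Anti-pattern: each call returns a fresh selector with its own (empty) memo
export const selectByHref = (href: string) =>
  createSelector(selectCache, (cache) => cache[href]);

// Mitigation: reuse one selector instance per parameter value
const pool = new Map<string, MemoizedSelector<object, unknown>>();
export function pooledSelectByHref(href: string) {
  let selector = pool.get(href);
  if (!selector) {
    selector = selectByHref(href);
    pool.set(href, selector);
  }
  return selector;
}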

@tdonohue (Member, Author)

@pcg-kk : I think we'd need more information about your analysis. You may be entirely correct, but it's difficult to understand what you are looking at without links to examples or suggestions for possible improvements. Maybe you could document your findings in a Google Doc or on the general DSpace Wiki (e.g., under our "Development > Discussion" page). I think more details would help others better understand what you are seeing.

Regardless, I do agree that this ticket is not entirely about SSR. As you can see from the discussion points above, we've spun off other tickets from this one about endpoints/pages which are sending abnormally large numbers of requests. So, we are aware that the core bug is not necessarily in SSR, but we are also working to find "quick fix" solutions which can help institutions running 7.x and 8.x to "stabilize" their system if they hit CPU issues specifically in the SSR processing.

In any case, I'd encourage you to share more of your findings...that way I can bring them to the attention of other DSpace developers.

@pcg-kk commented Sep 27, 2024

@tdonohue please look at almost any request executed by the frontend. When you look at the Initiator tab in dev tools, there is a list of (in my example) 1410 methods called between the subscription and the moment the request is sent... and my browser doesn't display more, so I still don't know which method was executed first. We execute a lot of requests, and each of them passes through that crazy path before it is sent.

[screenshot: DevTools Initiator tab showing the call chain]

I can understand that in DSpace we try to create a lot of generic methods that can do many things at once, but I think we can agree that executing every request this way is a waste of resources. I don't know the original design decisions: why don't we just use effects and actions to call methods, and then success/error actions to save the results in the store? Instead, we process each request through the store before it is sent, while it is pending, and on success. Each request is piped multiple times, with a lot of checks, subscriptions, and combineLatest calls.

--EDIT

Here is one more nice example where we can at least see what happens, and we can also notice that we go through the same code multiple times:

[screenshot: DevTools call stack with repeated code paths]
