SOLR-17568: SolrCloud shouldn't proxy/route a core specific request #2885

dsmiley · 2024-11-26T02:05:38Z

https://issues.apache.org/jira/browse/SOLR-17568

Because cores are a node level concept, not a cluster one. This also removed a collection loop.

# Conflicts: # solr/CHANGES.txt

dsmiley · 2024-11-27T06:20:57Z

solr/CHANGES.txt

@@ -36,7 +36,8 @@ Improvements

 Optimizations
 ---------------------
-(No changes)
+* SOLR-17568: The CLI bin/solr export tool now contacts the appropriate nodes directly for data instead of proxying through one.


@epugh you might be interested in this aspect as it touches the CLI. Or perhaps not since you are working on CLI infrastructure and not the business of what they do.

Nice fix. I had no idea that it could be so inefficient. In the future, export might leverage a streaming expression; I wonder if Streaming Expressions are currently smart enough for this?

Shrug; I found the issue above by seeing a test failure. There's no test failure relating to what you say.

gbellaton

I fail to understand why the previous code was even bothering to go through all the collections from the ClusterState.
The new code looks good to me.

epugh

LGTM, seems like a gnarly pattern to spot and fix? One nit about the major changes text.

epugh · 2024-11-27T11:28:26Z

solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java

This is some nice simplification!

epugh · 2024-11-27T11:30:13Z

solr/solr-ref-guide/modules/upgrade-notes/pages/major-changes-in-solr-10.adoc

+=== SolrCloud request routing
+
+HTTP requests to SolrCloud that are for a specific core must be delivered to the node with that core.
+Past SolrCloud versions would scan all cores everywhere to find a node with the core, and if found then proxy the request.


Feels like a slightly dangling sentence... I now understand the "past", but what is happening now? Or was this just an inefficient bug pattern?

now what is happening

Solr will return HTTP 404 if it can't resolve the request. I should add that. I don't find the last sentence to be dangling, but feel free to offer a rewording suggestion.
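For readers following this thread, the new behavior can be sketched roughly as below. This is a minimal, hypothetical model and not the actual `HttpSolrCall` logic: the class `LocalCoreRouter`, the method `resolve`, and the plain `int` status codes are illustrative names. It assumes a node knows only its own cores and answers 404 for a core it does not host, rather than scanning the cluster and proxying. Collection-level requests are not modeled here.

```java
import java.util.Set;

/** Hypothetical sketch: a node resolves core-specific requests only against its own cores. */
final class LocalCoreRouter {
  private final Set<String> localCores;

  LocalCoreRouter(Set<String> localCores) {
    this.localCores = Set.copyOf(localCores);
  }

  /** Returns an HTTP-like status: 200 if this node hosts the core, 404 otherwise (no proxying). */
  int resolve(String path) {
    // Expect paths shaped like /solr/<coreName>/<handler>; anything else is unresolvable here.
    String[] parts = path.split("/");
    if (parts.length < 3 || !"solr".equals(parts[1])) {
      return 404;
    }
    return localCores.contains(parts[2]) ? 200 : 404;
  }
}
```

In this sketch, a request for a core that lives on another node gets a 404 from this node, which is why a client has to address the node that actually hosts the core.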

janhoy · 2024-11-27T12:31:05Z

solr/core/src/java/org/apache/solr/cli/ExportTool.java

-
-        try (SolrClient client = CLIUtils.getSolrClient(baseurl, credentials)) {
+        // reference the replica's node URL, not the baseUrl in scope, which could be anywhere
+        try (SolrClient client = CLIUtils.getSolrClient(replica.getBaseUrl(), credentials)) {


Since this is a CLI tool, and the CLI tool is potentially being used from a computer outside the cluster itself, I wonder if a side-effect of this might be that it won't work through reverse proxy / ingress with a different external DNS name than what Solr has on the inside.

If I interpret this code correctly, replica.getBaseUrl() is data fetched from ZK clusterstate, while baseUrl was the user-supplied --solr-url. Thus it could be that export tool would currently work through a proxy / ingress due to the auto proxying, but may not work with this change.

A related and similar issue is that the Admin UI does not work well behind a reverse proxy, if you try to click on links for other Cores it will all break.

If we truly want to support clients through reverse proxies, I believe the correct fix is for the Solr server to support the Forwarded HTTP header (and its X-Forwarded-* counterparts) and rewrite URLs served in REST responses to adhere to the external address space, not the internal.

It's true that this change would break in such a scenario. Thankfully the ExportTool is using the Solr/HTTP ClusterStateProvider instead of ZK; thus the URLs may be rewritable by such a proxy since it'd be at the HTTP level. However, the Solr client would need to be configured to use XML instead of javabin, and I think javabin is the default. I could imagine adding special support for this for CLUSTERSTATE or COLSTATUS.

Despite that issue, I still believe in this change. I don't think it makes sense for SolrCloud to go through such guessing efforts to resolve collection/core ambiguity. Ideally we'd improve on that so that we wouldn't have ambiguity. I could imagine a /solr/collectionName!shardName/select syntax or similar.

It’s a corner case for sure and not worth modifying until there is a user request. Just wanted to highlight the issue and raise awareness as to whether we want bin/solr to work through reverse proxy.
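The URL rewriting idea raised in this thread could look something like the sketch below. It is purely illustrative and nothing like it exists in this PR: `ForwardedUrlRewriter` and `rewrite` are hypothetical names, and the sketch assumes the proxy supplies its external scheme and host through the conventional `X-Forwarded-Proto` and `X-Forwarded-Host` headers.

```java
import java.net.URI;

/** Hypothetical sketch: map an internal replica base URL onto an externally visible address. */
final class ForwardedUrlRewriter {
  /**
   * @param internalBaseUrl base URL as recorded in cluster state (internal address space)
   * @param forwardedProto scheme a proxy might pass in X-Forwarded-Proto, e.g. "https"
   * @param forwardedHost host[:port] a proxy might pass in X-Forwarded-Host
   */
  static String rewrite(String internalBaseUrl, String forwardedProto, String forwardedHost) {
    URI internal = URI.create(internalBaseUrl);
    // Keep only the path of the internal URL; scheme and authority come from the proxy headers.
    String path = internal.getRawPath() == null ? "" : internal.getRawPath();
    return forwardedProto + "://" + forwardedHost + path;
  }
}
```

Note that when a single external host fronts several nodes, the rewritten URL no longer identifies a particular node, so a rewrite like this would not by itself get a core-specific request to the node hosting that core.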

dsmiley · 2024-11-27T23:36:43Z

Thanks for the feedback, everyone. I'll merge in a day or two. Reminder: this is 10.0 only.

bruno-roustant · 2024-11-28T13:14:56Z

solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java

-      shuffledSlices = new ArrayList<>(slices);
-      Collections.shuffle(shuffledSlices, Utils.RANDOM);
-    }
+    Iterator<Slice> shuffledSlices = new RandomIterator<>(Utils.RANDOM, slices);


I was wondering why we need to shuffle if we look for a specific core by its name. But I see that the byCoreName parameter is now always false.
So the code wants to find any active replica of any shard in a specific collection, and then return a url with replica.getBaseUrl() + "/" + origCoreName. Do I understand correctly? Is it the intended behavior?

Thanks for looking closer. I see there's some dead code to remove: byCoreName, and there's no need to pass origCoreName in either. I'll do another update to clarify.
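To illustrate why a random iterator can replace the copy-and-shuffle in the diff above, here is a minimal sketch of a lazily shuffling iterator. It is not the source of Solr's `RandomIterator`; `LazyShuffleIterator` is a hypothetical stand-in written under the assumption that the goal is to visit slices in random order and stop at the first one that has an active replica.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

/** Hypothetical sketch of a lazily shuffling iterator; not Solr's RandomIterator source. */
final class LazyShuffleIterator<E> implements Iterator<E> {
  private final Random random;
  private final List<E> remaining;

  LazyShuffleIterator(Random random, Collection<? extends E> elements) {
    this.random = random;
    this.remaining = new ArrayList<>(elements); // private copy; the input is left untouched
  }

  @Override
  public boolean hasNext() {
    return !remaining.isEmpty();
  }

  @Override
  public E next() {
    if (remaining.isEmpty()) {
      throw new NoSuchElementException();
    }
    // One Fisher-Yates step: pick a random survivor, then move the last element into its slot.
    int last = remaining.size() - 1;
    int pick = random.nextInt(remaining.size());
    E chosen = remaining.get(pick);
    remaining.set(pick, remaining.get(last));
    remaining.remove(last); // removes by index, which is O(1) at the end of an ArrayList
    return chosen;
  }
}
```

Each element is returned exactly once, and the random work is done only for the elements actually consumed, which suits a caller that usually stops after the first match.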

dsmiley added 2 commits November 25, 2024 18:44

SolrCloud shouldn't proxy/route a core specific request.

dea80ed

Because cores are a node level concept, not a cluster one. This also removed a collection loop.

Use RandomIterator utility.

eaf76ba

github-actions bot added test-framework jetty-server cat:cloud labels Nov 26, 2024

dsmiley added 3 commits November 27, 2024 00:56

fix another test

dc4c37c

CLI ExportTool should have been contacting the correct node but wasn't

a357f4a

CHANGES.txt with upgrade notes.

1a6be13

github-actions bot added documentation Improvements or additions to documentation cat:cli labels Nov 27, 2024

Merge branch 'refs/heads/main' into solr-17568

f0201b6

# Conflicts: # solr/CHANGES.txt

dsmiley requested a review from bruno-roustant November 27, 2024 06:19

dsmiley commented Nov 27, 2024

gbellaton approved these changes Nov 27, 2024

epugh approved these changes Nov 27, 2024

janhoy reviewed Nov 27, 2024

improve upgrade notes

b1cf02b

bruno-roustant reviewed Nov 28, 2024

simplify getCoreUrl; no byCoreName

8969ae0

github-actions bot added the cat:api label Dec 2, 2024

bruno-roustant approved these changes Dec 2, 2024

dsmiley merged commit 3e35891 into apache:main Dec 2, 2024
4 checks passed

dsmiley deleted the solr-17568 branch December 2, 2024 23:08
