Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAI-4837 : In-memory synthetic SolrCore for QA node #181

Merged
merged 20 commits into from
May 2, 2024

Conversation

patsonluk
Copy link
Collaborator

@patsonluk patsonluk commented Feb 24, 2024

Description

A prototype to attempt using an in-memory SyntheticSolrCore similar to the SolrCoreProxy in our Solr 8.8

The synthetic collection/replica introduced in Solr 9 added the collection folder to physically disk and also the zookeeper solr/collections.

However, it seems that what we essentially need is a SolrCore in the memory for each config set, which might be able to simply register it against the CoreContainer without actually registering to ZK.

Solution

Introduced SyntheticSolrCore which extends SolrCore, SyntheticSolrCore is only used in Coordinator node to support a subset of SolrCore functionalities required by Coordinator operations such as aggregating and writing out response and providing configset info.

Such core is only registered to the CoreContainer but is no longer register itself to zookeeper.

Take note that to minimize code changes, there will still be snapshot/index data directory written to the QA node. We could have overridden initSnapshotMetaDataManager and initIndex to be no-op to avoid creating them, however, there is currently other code (and possibly future code too) that might rely on the existence of those values. It is safer to just keep them as is.

The data folder for the synthetic core would NOT be loaded on QA node start-up as there's no core.properties in such directory. This is because instead of going through CoreContainer#create for such synthetic core, it's using SyntheticSolrCore#createAndRegisterCore, which bypass core.properties creation and zk registration

Tests

  1. TestCoordinatorRole passed, which provides good integration test coverage to ensure SyntheticSolrCore is working as expected
  2. Local and playpen testing:
    i. Basic query and restart tests from QA node
    ii. Shard split and moves and ensure QA node can distribute the requests to new shards
    iii. collection name is logged correctly
    iv. test on multiple collections to ensure that query work for them all, and the synthetic core should only be created once (by observing the log message Loading synthetic core for config set...)

@@ -1175,7 +1178,7 @@ private SolrCore(
initSearcher(prev);

// Initialize the RestManager
restManager = initRestManager();
restManager = isSynthetic() ? new RestManager() : initRestManager();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how this is helping here?

Copy link
Collaborator Author

@patsonluk patsonluk Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a quick prototype so I basically quickly handle NPEs when i ran into them. 😁 I put this one is so it will init w/o throwing exceptions.

I believe for a better implementation, we will need to understand all the init happened in the SolrCore, there's a fair chance that we can simply bypass most of the init routine for a synthetic core (rest manager, event handler/listener, plugin, searcher etc), as synthetic core does not do any real searching, it simply relays it to the actual data node

@@ -66,6 +66,12 @@ public ZkConfigSetService(SolrZkClient zkClient) {
this.zkClient = zkClient;
}

@Override
protected SolrResourceLoader createConfigSetResourceLoader(CoreDescriptor cd, String configSetName) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to understand better, why we need to override here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For synthetic core, the resource loader only needs to load the configset resource, it would have no access to collection specific resource - which is non existent for this new synthetic core/collection, since it does not even exists in ZK.

There are several implementation of ConfigSetService, and for prototype purpose, we only support this for ZkConfigSetSrvice

e);
}
}
coreContainer.registerCore(syntheticCoreDescriptor, syntheticCore, false, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not publishing into zk? nice!

@patsonluk
Copy link
Collaborator Author

Also pushed some fixes, so the unit test case would pass

@patsonluk
Copy link
Collaborator Author

@chatman @noblepaul . I would ❤️ to gather some thoughts from you both! Mostly on the idea, as for the implementation details, we might need to further cleanup the logic if we decide to move forward.

The synthetic collection was created to avoid excess proxy cores created in 8.x AFAIK, I'm wondering if this proposal could address such concern without raising new ones?

@noblepaul
Copy link
Collaborator

The biggest challenge is pushing this upstream. Making changes to SolrCore and introducing a concept of synthetic for a one of the core classes, for a niche feature is hardsell. However if this is to be only kept in-house , this should be fine

@patsonluk patsonluk force-pushed the patsonluk/qa-core-proxy branch from 5b7b8c1 to 0d2b744 Compare April 18, 2024 00:47
@patsonluk
Copy link
Collaborator Author

patsonluk commented Apr 18, 2024

The biggest challenge is pushing this upstream. Making changes to SolrCore and introducing a concept of synthetic for a one of the core classes, for a niche feature is hardsell. However if this is to be only kept in-house , this should be fine

That's a good point, totally agree! Even if we do not upstream this, minimizing changes to SolrCore could save us from merge issues.

I have added some minor changes and javadoc to my implementation. The change to SolrCore should now only be the access modifier of bufferUpdatesIfConstructing. I'm hoping that the apache community will find it not as invasive (and it used to be protected as well in 8.8).

@patsonluk patsonluk changed the title Experiment : in-memory Solr Core Proxy for QA node Experiment : in-memory synthetic SolrCore for QA node Apr 18, 2024
@patsonluk patsonluk marked this pull request as ready for review April 18, 2024 21:36
@patsonluk patsonluk changed the title Experiment : in-memory synthetic SolrCore for QA node In-memory synthetic SolrCore for QA node Apr 18, 2024
@noblepaul
Copy link
Collaborator

noblepaul commented Apr 22, 2024

I like the fact that the changes to SolrCore is trivial/non-invasive. We can push this upstream as well

👍

Copy link
Collaborator

@noblepaul noblepaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

// core's collection from ZK
// which synthetic core is not registered in ZK.
// We do not expect RestManager ops on Coordinator Nodes
return new RestManager();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what restManager enables?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there is only a tiny feature called "managed resources" being powered by RestManager. It's not even reuired

Copy link
Collaborator

@hiteshk25 hiteshk25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

(Thanks @patsonluk , please make sure QA logs are ok (collname, etc), though I don't feel any change in that area)

Ensure only one synthetic core created per config set
@patsonluk
Copy link
Collaborator Author

@hiteshk25

Good call on please make sure QA logs are ok (collname, etc) . The existing proposal did miss the collection name!

I have committed some fixes:

  1. Collection name should be set properly for the collection being queried on now.
  2. The SyntheticSolrCore should now be one instance per configset, my previous code didn't check correctly

Would you mind to review the changes in a32a2ac . Many thanks again!!!

Copy link
Collaborator

@hiteshk25 hiteshk25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (Thanks @patsonluk )

@patsonluk patsonluk merged commit cba3230 into fs/branch_9_3 May 2, 2024
7 checks passed
@patsonluk patsonluk deleted the patsonluk/qa-core-proxy branch May 2, 2024 21:05
@patsonluk patsonluk changed the title In-memory synthetic SolrCore for QA node SAI-4837 : In-memory synthetic SolrCore for QA node May 2, 2024
patsonluk added a commit that referenced this pull request May 2, 2024
* Experimental QA core as proxy in memory

* Experimental QA core as proxy in memory

* Experimental QA core as proxy in memory

* ./gradlew tidy

* Comment out portion that checks for synthetic collection/core existence

* Fixes watches and test cases

* Refactoring to avoid modifying existing method modifiers

* Cleanup

* Some refactoring to minimize changes to SolrCore

* javadoc

* ./gradlew tidy

* code cleanup

* code cleanup

* Fix resource loader issue if configSetName is null

* ./gradlew tidy

* Fixed issue with incorrect collection name in context
Ensure only one synthetic core created per config set

* ./gradlew tidy

* Correctly use <config set>_core as synthetic core name for clarity
Should open the registered core on first synthetic core creation since it is a getCore op

* ./gradlew tidy

* Fixed issue with incorrect collection name in context
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants