add concurrent limit on datasource and sessions #2390

Merged (3 commits) on Oct 30, 2023

Conversation

Collaborator

@penghuo penghuo commented Oct 28, 2023

Description

  1. Add a cluster-level datasource limit.
  2. Add a cluster-level session limit.
  3. Move query processing into handlers selected by query type. TODO: DROP INDEX handling will be addressed in Redefine Drop Index as logical delete #2386.
  4. Add LeaseManager; currently it only limits the number of concurrent sessions in the cluster.
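
As a rough illustration of item 4, a lease manager that caps concurrent sessions could look like the sketch below. This is not the code from this PR: the class name `LeaseManagerSketch`, the `borrow`/`release`/`active` methods, and the in-memory counter are all assumptions made for illustration. Only the exception name `ConcurrencyLimitExceededException` is taken from the file list in this PR, and its constructor here is assumed. A real cluster-level limit would have to consult shared state rather than a per-node counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a session lease manager; not the plugin's actual LeaseManager. */
final class LeaseManagerSketch {

  /** Exception name taken from the PR's file list; the constructor shape is an assumption. */
  static final class ConcurrencyLimitExceededException extends RuntimeException {
    ConcurrencyLimitExceededException(String message) {
      super(message);
    }
  }

  private final int sessionLimit;
  // Assumption: an in-memory counter stands in for cluster-wide session state.
  private final AtomicInteger activeSessions = new AtomicInteger();

  LeaseManagerSketch(int sessionLimit) {
    if (sessionLimit < 0) {
      throw new IllegalArgumentException("sessionLimit must be >= 0");
    }
    this.sessionLimit = sessionLimit;
  }

  /** Takes one session lease, or throws if the limit is already reached. */
  void borrow() {
    while (true) {
      int current = activeSessions.get();
      if (current >= sessionLimit) {
        throw new ConcurrencyLimitExceededException(
            "concurrent session limit reached: " + sessionLimit);
      }
      // Compare-and-set so two concurrent callers cannot both take the last lease.
      if (activeSessions.compareAndSet(current, current + 1)) {
        return;
      }
    }
  }

  /** Returns one lease; the counter never drops below zero. */
  void release() {
    activeSessions.updateAndGet(n -> Math.max(0, n - 1));
  }

  int active() {
    return activeSessions.get();
  }

  public static void main(String[] args) {
    LeaseManagerSketch leases = new LeaseManagerSketch(1);
    leases.borrow(); // first session is admitted
    try {
      leases.borrow(); // second session exceeds the limit of 1
    } catch (ConcurrencyLimitExceededException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    leases.release();
  }
}
```

Rejecting with an exception, rather than blocking, lets the caller surface the limit to the user immediately.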

Issues Resolved

#2379

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -780,7 +787,7 @@ void testDispatchDescribeIndexQuery() {
     StartJobRequest expected =
         new StartJobRequest(
             query,
-            "TEST_CLUSTER:index-query",
+            "TEST_CLUSTER:non-index-query",
Collaborator Author

@dai-chen @vamsi-amazon we can process DESC SKIPPING INDEX as an interactive/batch query, right?

Collaborator

Yes, all SHOW/DESC index statements can be treated as direct queries.

@@ -690,6 +695,7 @@ void testRefreshIndexQuery() {
     HashMap<String, String> tags = new HashMap<>();
     tags.put(DATASOURCE_TAG_KEY, "my_glue");
     tags.put(CLUSTER_NAME_TAG_KEY, TEST_CLUSTER_NAME);
+    tags.put(JOB_TYPE_TAG_KEY, JobType.BATCH.getText());
Collaborator Author

@penghuo penghuo Oct 28, 2023

@dai-chen @vamsi-amazon REFRESH XXX means a batch job, right? It will not be a long-running job for now.

Collaborator

Yes, REFRESH only supports triggering a manual refresh. Ref: opensearch-project/opensearch-spark#100

Map<String, String> tags = context.getTags();
tags.put(INDEX_TAG_KEY, indexQueryDetails.openSearchIndexName());
DataSourceMetadata dataSourceMetadata = context.getDataSourceMetadata();
tags.put(JOB_TYPE_TAG_KEY, JobType.STREAMING.getText());
Collaborator Author

We only handle long-running streaming queries in StreamingQueryHandler, so JOB_TYPE is always streaming.
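
The idea that each handler owns exactly one job type can be sketched as follows. This is a minimal illustration, not the plugin's code: `JobTypeTagSketch`, `QueryHandler`, `fixedJobType`, the tag key value `"type"`, and the enum text values are assumptions. Only the expressions `JobType.STREAMING.getText()` and `JobType.BATCH.getText()` appear in the snippets quoted in this review.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: each handler stamps its own fixed job type onto the request tags. */
final class JobTypeTagSketch {

  /** Mirrors the JobType usage seen in the diff; the text values are assumptions. */
  enum JobType {
    INTERACTIVE("interactive"),
    STREAMING("streaming"),
    BATCH("batch");

    private final String text;

    JobType(String text) {
      this.text = text;
    }

    String getText() {
      return text;
    }
  }

  // Assumption: the real tag key constant may hold a different string.
  static final String JOB_TYPE_TAG_KEY = "type";

  /** Illustrative handler contract: turn the shared tags into the tags sent with the job. */
  interface QueryHandler {
    Map<String, String> tagsFor(Map<String, String> baseTags);
  }

  /** Builds a handler whose job type never varies, so callers cannot mislabel a job. */
  static QueryHandler fixedJobType(JobType jobType) {
    return baseTags -> {
      Map<String, String> tags = new HashMap<>(baseTags); // copy; do not mutate the caller's map
      tags.put(JOB_TYPE_TAG_KEY, jobType.getText());
      return tags;
    };
  }

  static final QueryHandler STREAMING_HANDLER = fixedJobType(JobType.STREAMING);
  static final QueryHandler BATCH_HANDLER = fixedJobType(JobType.BATCH);
}
```

Because the job type is bound when the handler is built, the dispatcher only has to pick the right handler and never sets the tag itself.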


if (IndexQueryActionType.DROP.equals(indexQueryDetails.getIndexQueryActionType())) {
  // todo, fix in DROP INDEX PR.
  return handleDropIndexQuery(dispatchQueryRequest, indexQueryDetails);
Collaborator Author

@vamsi-amazon DROP INDEX will be handled in the next PR; keeping this as-is for now.

return handleNonIndexQuery(dispatchQueryRequest);
DataSourceMetadata dataSourceMetadata =
this.dataSourceService.getRawDataSourceMetadata(dispatchQueryRequest.getDatasource());
dataSourceUserAuthorizationHelper.authorizeDataSource(dataSourceMetadata);
Collaborator Author

@vamsi-amazon we can always do AuthZ before any processing, right?
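
The ordering discussed in this thread (authorize the datasource first, then pick a handler by query type) can be sketched as below. The class, enum, and method names are hypothetical and chosen for illustration. The routing table only reflects what the review comments state: DROP goes to its own path for now, SHOW/DESC index statements are treated like direct queries, REFRESH is a batch job, and long-running index queries go to the streaming handler.

```java
import java.util.function.Consumer;

/** Hypothetical sketch of "authorize first, then route by query type". */
final class AuthorizeThenDispatchSketch {

  /** Illustrative query categories; these names are assumptions. */
  enum QueryKind { DROP_INDEX, SHOW_OR_DESCRIBE_INDEX, REFRESH_INDEX, STREAMING_INDEX, NON_INDEX }

  /** Illustrative handler targets; these names are assumptions. */
  enum Route { DROP_INDEX, DIRECT, BATCH, STREAMING }

  /** Pure routing decision, kept separate so it can be tested without any I/O. */
  static Route route(QueryKind kind) {
    switch (kind) {
      case DROP_INDEX:
        return Route.DROP_INDEX;
      case REFRESH_INDEX:
        return Route.BATCH;
      case STREAMING_INDEX:
        return Route.STREAMING;
      case SHOW_OR_DESCRIBE_INDEX:
      case NON_INDEX:
      default:
        return Route.DIRECT;
    }
  }

  /**
   * Runs the authorizer before anything else. The authorizer is expected to throw
   * when access is denied, so no routing or job submission happens for a denied request.
   */
  static Route dispatch(String datasource, QueryKind kind, Consumer<String> authorizer) {
    authorizer.accept(datasource);
    return route(kind);
  }
}
```

Putting the authorization call on the first line of `dispatch` means every query type passes through the same check, regardless of which handler is chosen afterwards.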

@codecov

codecov bot commented Oct 28, 2023

Codecov Report

Merging #2390 (cd01cf8) into main (1bcacd1) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #2390      +/-   ##
============================================
- Coverage     95.55%   95.55%   -0.01%     
- Complexity     4918     4922       +4     
============================================
  Files           468      471       +3     
  Lines         13668    13675       +7     
  Branches        915      921       +6     
============================================
+ Hits          13061    13067       +6     
  Misses          587      587              
- Partials         20       21       +1     
Flag Coverage Δ
sql-engine 95.55% <100.00%> (-0.01%) ⬇️

Files Coverage Δ
...ces/transport/TransportCreateDataSourceAction.java 100.00% <100.00%> (ø)
...rch/sql/opensearch/setting/OpenSearchSettings.java 100.00% <100.00%> (ø)
...search/sql/spark/dispatcher/AsyncQueryHandler.java 100.00% <ø> (ø)
...search/sql/spark/dispatcher/BatchQueryHandler.java 100.00% <100.00%> (ø)
.../sql/spark/dispatcher/InteractiveQueryHandler.java 100.00% <100.00%> (ø)
...rch/sql/spark/dispatcher/SparkQueryDispatcher.java 100.00% <100.00%> (ø)
...ch/sql/spark/dispatcher/StreamingQueryHandler.java 100.00% <100.00%> (ø)
...ch/sql/spark/execution/session/SessionManager.java 100.00% <ø> (ø)
...rch/sql/spark/execution/statestore/StateStore.java 81.75% <100.00%> (-0.27%) ⬇️
...easemanager/ConcurrencyLimitExceededException.java 100.00% <100.00%> (ø)
... and 1 more

... and 1 file with indirect coverage changes

Signed-off-by: Peng Huo <[email protected]>
@penghuo penghuo merged commit d3ce049 into opensearch-project:main Oct 30, 2023
22 of 23 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.x
# Create a new branch
git switch --create backport/backport-2390-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d3ce049be22e6df7e83b57f8b61f8533241aab83
# Push it to GitHub
git push --set-upstream origin backport/backport-2390-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-2390-to-2.x.

@opensearch-trigger-bot
Contributor

The backport to 2.11 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.11 2.11
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.11
# Create a new branch
git switch --create backport/backport-2390-to-2.11
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d3ce049be22e6df7e83b57f8b61f8533241aab83
# Push it to GitHub
git push --set-upstream origin backport/backport-2390-to-2.11
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.11

Then, create a pull request where the base branch is 2.11 and the compare/head branch is backport/backport-2390-to-2.11.

penghuo added a commit to penghuo/os-sql that referenced this pull request Oct 30, 2023

* add concurrent limit on datasource and sessions

Signed-off-by: Peng Huo <[email protected]>

* fix ut coverage

Signed-off-by: Peng Huo <[email protected]>

---------

Signed-off-by: Peng Huo <[email protected]>
(cherry picked from commit d3ce049)
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 30, 2023
* add concurrent limit on datasource and sessions

Signed-off-by: Peng Huo <[email protected]>

* fix ut coverage

Signed-off-by: Peng Huo <[email protected]>

---------

Signed-off-by: Peng Huo <[email protected]>
(cherry picked from commit d3ce049)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
penghuo pushed a commit that referenced this pull request Oct 30, 2023
* add concurrent limit on datasource and sessions



* fix ut coverage



---------


(cherry picked from commit d3ce049)

Signed-off-by: Peng Huo <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
mengweieric added a commit to mengweieric/sql that referenced this pull request Nov 8, 2023
vamsimanohar added a commit to mengweieric/sql that referenced this pull request Nov 13, 2023
vamsimanohar added a commit that referenced this pull request Nov 13, 2023
* Revert "Add more metrics and handle emr exception message (#2422) (#2426)"

This reverts commit b57f7cc.

* Revert "Block settings in sql query settings API and add more unit tests (#2407) (#2412)"

This reverts commit 3024737.

* Revert "Added session, statement, emrjob metrics to sql stats api (#2398) (#2400)"

This reverts commit 6e17ae6.

* Revert "Redefine Drop Index as logical delete (#2386) (#2397)"

This reverts commit e939bb6.

* Revert "add concurrent limit on datasource and sessions (#2390) (#2395)"

This reverts commit deb3ccf.

* Revert "Add Flint Index Purging Logic (#2372) (#2389)"

This reverts commit dd48b9b.

* Revert "Refactoring for tags usage in test files and also added explicit denly list setting. (#2383) (#2385)"

This reverts commit 37e010f.

* Revert "Enable session by default (#2373) (#2375)"

This reverts commit 7d95e4c.

* Revert "Create new session if client provided session is invalid (#2368) (#2371)"

This reverts commit 5ab7858.

* Revert "Add where clause support in create statement (#2366) (#2370)"

This reverts commit b620a56.

* Revert "create new session if current session not ready (#2363) (#2365)"

This reverts commit 5d07281.

* Revert "Handle Describe,Refresh and Show Queries Properly (#2357) (#2362)"

This reverts commit 16e2f30.

* Revert "Add Session limitation (#2354) (#2359)"

This reverts commit 0f334f8.

* Revert "Bug Fix, support cancel query in running state (#2351) (#2353)"

This reverts commit 9a40591.

* Revert "Fix bug, using basic instead of basicauth (#2342) (#2355)"

This reverts commit e4827a5.

* Revert "Add missing tags and MV support (#2336) (#2346)"

This reverts commit 8791bb0.

* Revert "[Backport 2.x] deprecated job-metadata-index (#2340) (#2343)"

This reverts commit bea432c.

* Revert "Integration with REPL Spark job (#2327) (#2338)"

This reverts commit 58a5ae5.

* Revert "Implement patch API for datasources (#2273) (#2329)"

This reverts commit 4c151fe.

* Revert "Add sessionId parameters for create async query API (#2312) (#2324)"

This reverts commit 3d1a376.

* Revert "Add Statement (#2294) (#2318) (#2319)"

This reverts commit b3c2e94.

* Revert "Upgrade json (#2307) (#2314)"

This reverts commit 6c65bb4.

* Revert "Minor Refactoring (#2308) (#2317)"

This reverts commit 051cc4f.

* Revert "add InteractiveSession and SessionManager (#2290) (#2293) (#2315)"

This reverts commit 6ac197b.

---------

Co-authored-by: Vamsi Manohar <[email protected]>