Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow TransportConfigUpdateAction when security config initialization has completed #3810

Merged
merged 33 commits into from
Jan 5, 2024

Conversation

cwperks
Copy link
Member

@cwperks cwperks commented Dec 8, 2023

Description

Introduces another variable on DynamicConfigFactory called bgThreadComplete that behaves differently than the initialized variable. bgThreadComplete is a flag that signals to TransportConfigUpdateAction that it can start accepting updates.

There are 2 ways the security index can be created from scratch:

  1. If plugins.security.allow_default_init_securityindex is set to true it will create the security index and load all yaml files
  2. If plugins.security.allow_default_init_securityindex is set to false, the security index is not created on bootstrap and requires a user to run securityadmin to initialize security. When securityadmin is utilized, the cluster does depend on TransportConfigUpdateAction to initialize security so there still needs to be an avenue where this can update the config before initialized is set to true

This PR sets bgThreadComplete to false on node startup and explicitly sets it to true once its ok for TransportConfigUpdateAction to start accepting transport actions. In case 2) up above, it can be set to true before DynamicConfigFactory is initialized so that it can accept requests from securityadmin.

  • Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation)

Bug fix

Issues Resolved

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Dec 8, 2023

Codecov Report

Attention: 33 lines in your changes are missing coverage. Please review.

Comparison is base (1f9edf4) 66.58% compared to head (4d53edc) 65.17%.
Report is 12 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3810      +/-   ##
==========================================
- Coverage   66.58%   65.17%   -1.42%     
==========================================
  Files         298      298              
  Lines       21188    21218      +30     
  Branches     3453     3460       +7     
==========================================
- Hits        14109    13828     -281     
- Misses       5363     5674     +311     
  Partials     1716     1716              
Files Coverage Δ
.../opensearch/security/OpenSearchSecurityPlugin.java 84.48% <100.00%> (+0.04%) ⬆️
...ch/security/securityconf/DynamicConfigFactory.java 55.62% <ø> (ø)
...g/opensearch/security/support/ConfigConstants.java 95.23% <ø> (ø)
...tion/configupdate/TransportConfigUpdateAction.java 95.45% <66.66%> (-4.55%) ⬇️
...earch/security/privileges/PrivilegesEvaluator.java 72.03% <0.00%> (-0.53%) ⬇️
...ecurity/configuration/ConfigurationRepository.java 69.11% <65.55%> (-10.35%) ⬇️

... and 37 files with indirect coverage changes

Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
@cwperks
Copy link
Member Author

cwperks commented Dec 19, 2023

Looking at the CI failures

Signed-off-by: Craig Perkins <[email protected]>
Copy link
Member

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned these test cases have will be flaky due to system performance differences on github runners. Lets make sure the validation is really solid since we are fixing a timing issue based problem.

Copy link
Member

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created a PR [1] that I think helps clean up the logic of this class so there is a set way that the background initialization is triggered - its always bugged me that we start a thread in the constructor but then 'restart it' after init is called. Let me know what you think of this approach.

@peternied peternied added the backport 2.x backport to 2.x branch label Jan 4, 2024
@peternied
Copy link
Member

@scrawfor99 Could you take another look?

@peternied peternied changed the title Only allow TransportConfigUpdateAction after bgThread for initialization has completed Allow TransportConfigUpdateAction when security config initialization has completed Jan 5, 2024
Copy link
Contributor

@stephen-crawford stephen-crawford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work Craig!

@stephen-crawford stephen-crawford merged commit 045d4ef into opensearch-project:main Jan 5, 2024
81 of 83 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/security/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/security/backport-2.x
# Create a new branch
git switch --create backport/backport-3810-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 045d4ef8a5d9bb8470e03c1b4d7cc68847b986cb
# Push it to GitHub
git push --set-upstream origin backport/backport-3810-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/security/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-3810-to-2.x.

peternied pushed a commit to peternied/security that referenced this pull request Jan 5, 2024
… initialization has completed (opensearch-project#3810)

Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Co-authored-by: Peter Nied <[email protected]>
Co-authored-by: Peter Nied <[email protected]>
(cherry picked from commit 045d4ef)
cwperks added a commit that referenced this pull request Jan 8, 2024
… initialization has completed (#3810) (#3927)

### Description
- Backport of #3810 from 045d4ef

### Check List
- [X] New functionality includes testing
- [X] New functionality has been documented
- [X] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and
signing off your commits, please check
[here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).

---------

Signed-off-by: Peter Nied <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Co-authored-by: Craig Perkins <[email protected]>
cwperks added a commit to cwperks/security that referenced this pull request Mar 12, 2024
… has completed (opensearch-project#3810)

Introduces another variable on DynamicConfigFactory called
`bgThreadComplete` that behaves differently than the `initialized`
variable. `bgThreadComplete` is a flag that signals to
TransportConfigUpdateAction that it can start accepting updates.

There are 2 ways the security index can be created from scratch:

1. If `plugins.security.allow_default_init_securityindex` is set to
**true** it will create the security index and load all yaml files
2. If `plugins.security.allow_default_init_securityindex` is set to
**false**, the security index is not created on bootstrap and requires a
user to run securityadmin to initialize security. When securityadmin is
utilized, the cluster does depend on
[TransportConfigUpdateAction](https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/tools/SecurityAdmin.java#L975-L977)
to initialize security so there still needs to be an avenue where this
can update the config before `initialized` is set to **true**

This PR sets `bgThreadComplete` to **false** on node startup and
explicitly sets it to **true** once its ok for
TransportConfigUpdateAction to start accepting transport actions. In
case 2) up above, it can be set to **true** before DynamicConfigFactory
is `initialized` so that it can accept requests from securityadmin.

* Category (Enhancement, New feature, Bug fix, Test fix, Refactoring,
Maintenance, Documentation)

Bug fix

- Resolves opensearch-project#3204

- [X] New functionality includes testing
- [ ] New functionality has been documented
- [X] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and
signing off your commits, please check
[here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).

---------

Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Co-authored-by: Peter Nied <[email protected]>
Co-authored-by: Peter Nied <[email protected]>
(cherry picked from commit 045d4ef)
cwperks added a commit that referenced this pull request Mar 13, 2024
… initialization has completed (#4115)

Backport #3810 to 1.3

Testing performed with the configuration below:


[1.3-latest.zip](https://github.com/opensearch-project/security/files/14577043/1.3-latest.zip)

This uses a configuration similar to the demo configuration, but with
the following modifications that make a reproduction of this error more
pronounced:

```
# plugins.security.allow_default_init_securityindex: true
plugins.security.unsupported.restapi.allow_securityconfig_modification: true
plugins.security.unsupported.load_static_resources: false
```

Steps to test:

1. Start the 3 node cluster
2. Exec into one of the nodes: ` docker exec -it
13-latest-opensearch-node2-1 /bin/bash`
3. Run securityadmin: `cd plugins/opensearch-security/tools &&
./securityadmin.sh -cd ../securityconfig/ -icl -nhnv \
  -cacert ../../../config/root-ca.pem \
  -cert ../../../config/kirk.pem \
  -key ../../../config/kirk-key.pem`
4. Exit the remote session: `exit`
5. Start infinite loop of config update in another terminal: `while
true;do curl -ai -u "admin:admin" -k -X PATCH
https://localhost:9200/_opendistro/_security/api/securityconfig -H
'Content-Type: application/json' -d'[{"op": "replace", "path":
"/config/dynamic/authc/basic_internal_auth_domain/transport_enabled",
"value": "true"}]'; done`
6. Restart one of the nodes

Query the rebooted node and you will not get a response:

```
> curl -XGET https://admin:admin@localhost:9201 -k
No response
```

If the cache is loaded correctly you will get a response, but access
denied since static resource loading is disabled (this is expected and
it means that internal users have been loaded into the cache):

```
> curl -XGET https://admin:admin@localhost:9200 -k
{"error":{"root_cause":[{"type":"security_exception","reason":"no permissions for [cluster:monitor/main] and User [name=admin, backend_roles=[admin], requestedTenant=null]"}],"type":"security_exception","reason":"no permissions for [cluster:monitor/main] and User [name=admin, backend_roles=[admin], requestedTenant=null]"},"status":403}
```

---------

Signed-off-by: Craig Perkins <[email protected]>
cwperks added a commit that referenced this pull request Mar 13, 2024
…config initialization… (#4116)

Backport #3810 to opendistro-1.13

Testing performed with the configuration below:


[opendistro-1.13-latest.zip](https://github.com/opensearch-project/security/files/14578486/opendistro-1.13-latest.zip)

This uses a configuration similar to the demo configuration, but with
the following modifications that make a reproduction of this error more
pronounced:

```
# opendistro_security.allow_default_init_securityindex: true
opendistro_security.unsupported.restapi.allow_securityconfig_modification: true
opendistro_security.unsupported.load_static_resources: false
```

Steps to test:

1. Start the 3 node cluster
2. Exec into one of the nodes: ` docker exec -it
13-latest-opensearch-node2-1 /bin/bash`
3. Run securityadmin: `cd plugins/opendistro_security/tools && chmod 777
* && ./securityadmin.sh -cd ../securityconfig/ -icl -nhnv \
  -cacert ../../../config/root-ca.pem \
  -cert ../../../config/kirk.pem \
  -key ../../../config/kirk-key.pem`
4. Exit the remote session: `exit`
5. Start infinite loop of config update in another terminal: `while
true;do curl -ai -u "admin:admin" -k -X PATCH
https://localhost:9200/_opendistro/_security/api/securityconfig -H
'Content-Type: application/json' -d'[{"op": "replace", "path":
"/config/dynamic/authc/basic_internal_auth_domain/transport_enabled",
"value": "true"}]'; done`
6. Restart one of the nodes

Query the rebooted node and you will not get a response:

```
> curl -XGET https://admin:admin@localhost:9201 -k
Unauthorized
```

If the cache is loaded correctly you will get a response, but access
denied since static resource loading is disabled (this is expected and
it means that internal users have been loaded into the cache):

```
> curl -XGET https://admin:admin@localhost:9200 -k
{"error":{"root_cause":[{"type":"security_exception","reason":"no permissions for [cluster:monitor/main] and User [name=admin, backend_roles=[admin], requestedTenant=null]"}],"type":"security_exception","reason":"no permissions for [cluster:monitor/main] and User [name=admin, backend_roles=[admin], requestedTenant=null]"},"status":403}
```

-----

When checking out this branch you can use `mvn clean install
-DskipTests` to build a jar. The jar is built in the `target` folder.

Signed-off-by: Craig Perkins <[email protected]>
dlin2028 pushed a commit to dlin2028/security that referenced this pull request May 1, 2024
… has completed (opensearch-project#3810)

Introduces another variable on DynamicConfigFactory called
`bgThreadComplete` that behaves differently than the `initialized`
variable. `bgThreadComplete` is a flag that signals to
TransportConfigUpdateAction that it can start accepting updates.

There are 2 ways the security index can be created from scratch:

1. If `plugins.security.allow_default_init_securityindex` is set to
**true** it will create the security index and load all yaml files
2. If `plugins.security.allow_default_init_securityindex` is set to
**false**, the security index is not created on bootstrap and requires a
user to run securityadmin to initialize security. When securityadmin is
utilized, the cluster does depend on
[TransportConfigUpdateAction](https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/tools/SecurityAdmin.java#L975-L977)
to initialize security so there still needs to be an avenue where this
can update the config before `initialized` is set to **true**

This PR sets `bgThreadComplete` to **false** on node startup and
explicitly sets it to **true** once its ok for
TransportConfigUpdateAction to start accepting transport actions. In
case 2) up above, it can be set to **true** before DynamicConfigFactory
is `initialized` so that it can accept requests from securityadmin.

* Category (Enhancement, New feature, Bug fix, Test fix, Refactoring,
Maintenance, Documentation)

Bug fix

- Resolves opensearch-project#3204

- [X] New functionality includes testing
- [ ] New functionality has been documented
- [X] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and
signing off your commits, please check
[here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).

---------

Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Co-authored-by: Peter Nied <[email protected]>
Co-authored-by: Peter Nied <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] TransportConfigUpdateAction causing a race condition during node bootstrap leading to 403/500 errors
4 participants