Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Point in time security model #2087

Closed
Tracked by #3959
bharath-techie opened this issue Sep 12, 2022 · 16 comments
Closed
Tracked by #3959

Point in time security model #2087

bharath-techie opened this issue Sep 12, 2022 · 16 comments
Assignees
Labels
enhancement New feature or request sprint backlog triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable. v2.4.0 'Issues and PRs related to version v2.4.0'

Comments

@bharath-techie
Copy link
Contributor

bharath-techie commented Sep 12, 2022

Point in time security model

Context:

What is point in time ?

User can create a Point In Time for a set of indices. A Point In Time is a set of segments which are frozen in time taken at the moment of creation. We lock the segments of those indices’ shards and create contexts to access and query only those shards.

Such point in time contexts can be reused across multiple search queries and the data that is searched upon will be on the locked segments. This solution is used with search after for pagination with data consistency.

This can be related to ‘scroll’ , whereas ‘scroll’ context gets created for each query ( cannot be reused for different search queries ) and ‘scroll’ can only move forward.
Design doc : opensearch-project/OpenSearch#3960
Meta issue: opensearch-project/OpenSearch#3959

Point in time APIs

  • Create point in time
    • The request implements ‘IndicesRequest.Replaceable“
    • action - “indices:data/read/point_in_time/create”
    • User can create point in time on any index for which the user has access to ( similar to search )
  • Delete point in time
    • User can pass list of PITs to delete (or) user can choose to delete all PITs
    • action - “indices:data/read/point_in_time/delete”
  • List all PITs
    • This API lists all PITs that the user has access to
    • Rest user facing action to list PITs- “indices:data/read/point_in_time/readall”
    • Action to retreive all cluster PITs - “cluster:admin/point_in_time/readall”
  • Point in time segments API
    • This shows the resource utilization of segments held because of a point in time context.
    • User can pass list of PITs to show segments info (or) user can choose to show all PIT segments info
    • Action - "indices:monitor/segments/point_in_time"

Permission model

Action group:

manage_point_in_time:
  reserved: true
  hidden: false
  static: true
  allowed_actions:
    - "indices:data/read/point_in_time/create"
    - "indices:data/read/point_in_time/delete"
    - "indices:data/read/point_in_time/readall"
    - "cluster:admin/point_in_time/readall"
    - "indices:monitor/point_in_time/segments"
  type: "index"
  description: "Manage point in time actions"

Role which has access to all PIT actions :

point_in_time_1:
  reserved: true
  hidden: false
  description: "Migrated from v6 (all types mapped)"
  cluster_permissions:
    - "cluster:admin/point_in_time/read_from_nodes"
  index_permissions:
    - index_patterns:
        - "pit_1"
      dls: null
      fls: null
      masked_fields: null
      allowed_actions:
        - "manage_point_in_time"

Requirement

  • By design, the user is able to ‘create pit’ on any index and ‘search’ using the same PIT if the underlying index permissions are present
  • So the user should be able to delete or list the PITs which they were able to create ( as long as the underlying index permission is present )
  • User should not be restricted to these APIs (delete, list or segments ) if they don’t have all indices read permission

In order to meet the above requirements, we are making changes in security plugin to evaluate permissions of PITs.

Following two items need review:

  1. Proposed security model
  2. List all PITs design

1. Proposed security model

We need to handle two cases for the APIs

  • Access for explicit PITs ( LIST request )
  • Access for ‘all’ type PIT operations
    • Access when no PITs are present in cluster

When is a PIT permitted ?

A PIT is made of one ore many indices. If the user has access to all the indices of a PIT, then PIT is permitted.
This is common for all APIs.

What if user does not have action permissions ?

Operation is denied.

a. Access for explicit PITs
  • All PITs in the request must be permitted, then we allow the operation ( ALLOW )
  • Even if one PIT is not permitted, we deny the operation ( DENY )
  • This is similar to search and other indices APIs ( user must have access to all explicit indices given as part of search request )
b. Access for operation with ‘all’ or ‘ * ’ PITs
  • Out of all PITs in cluster , if any PIT is permitted , then operation will carry forward ( ALLOW )
  • If there are no permitted PITs, throw forbidden error ( DENY )
  • This is inline with search API
No PITs in cluster
  • General case, Deny operation. Throw 403 - forbidden error ( DENY )
  • Special case - Similar to other indices operation, if the config property - do_not_fail_on_forbidden_empty is true - then allow operation ( ALLOW )
@Override
    public boolean isDnfofForEmptyResultsEnabled() {
        return config.dynamic.do_not_fail_on_forbidden_empty;
    }

2. List all PITs Design

Context

There are two options currently at discussion

Option 1 : New action filter for PITs in security plugin

We will introduce a new action filter which will be called by security filter for PIT requests to wrap the listener to call the PIT privilege evaluator to get the permitted PITs.
Check this thread : opensearch-project/OpenSearch#4388 (comment)

filter pit

Pros :
No extra action / permission required

Cons :
Maintainability of additional filter code in security plugin

Option 2 : Dummy action to filter PITs

Dummy action in opensearch to filter PITs

dummy action 1

Pros :
Simple solution, doesn't add any code to security plugin and it is already tested in the current PR

Cons :

  • Extra dummy action is used for nothing other than filtering when security plugin is enabled. No business logic is present.
  • Extra action permission required for user
@bharath-techie bharath-techie added enhancement New feature or request untriaged Require the attention of the repository maintainers and may need to be prioritized labels Sep 12, 2022
@bharath-techie
Copy link
Contributor Author

@peternied @cliu123 please review this.

@reta @Bukhtawar

@cwperks cwperks removed the untriaged Require the attention of the repository maintainers and may need to be prioritized label Sep 12, 2022
@cwperks
Copy link
Member

cwperks commented Sep 12, 2022

Assigning this issue to @peternied to provide next steps.

@peternied
Copy link
Member

peternied commented Sep 12, 2022

@bharath-techie Thanks for writing this up and its really useful to have available. This is a really meaty document with many questions and assumption inside it - this is making it hard for me to build up a model for how this system works. It will be faster to build on smaller scenarios first as we move towards encapsulating all scenarios you'd like to support.

I am having serious issues with the REST API layout, as it isn't following a convention of an addressable resource.

Creation is done via POST /index-1,index-2,index-N/_point_in_time
There is no single retrieval for the PIT instance.
DELETE /_search/point_in_time is built off of search action.

This is creating problems describing the security posture because our typically conventions are grounded in addressable objects. Can we make the access patterns look like the following, and then create straight forward permissions off of them?

POST /_point_in_time
{
   "indices": ["index-1", "index-2", "index-N"],
   "creationTime": 1658146050064 
}

{
   "id": "FOO_BAR",
   "indices": ["index-1", "index-2", "index-N"],
   "creationTime": 1658146050064,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    }
}
GET /_point_in_time/{id}

{
   "id": "FOO_BAR",
   "indices": ["index-1", "index-2", "index-N"],
   "creationTime": 1658146050064 
}
DELETE /_point_in_time/{id}
GET /_point_in_time/_search?q=index-1
{
  "hits": {...}
}

@peternied
Copy link
Member

From List all PITs Design

We will introduce a new action filter which will be called by security filter for PIT requests to wrap the listener to call the PIT privilege evaluator to get the permitted PITs.

I'm not sure I understand the argument against writing the additional code. Using a PIT focus action filter seems like a good separation of concerns.

@bharath-techie
Copy link
Contributor Author

bharath-techie commented Sep 13, 2022

create

List all PITs API is exposed so that user can view all PITs assigned to them. (cat indices for example ) So the need of the API is to get the IDs and additional info - so we can't pass ID to the api.

For delete,
We are not doing something different from existing APIs.
https://opensearch.org/docs/latest/opensearch/rest-api/scroll/ -- scroll supports deleting all and deleting list ( ids are passed in body or in url )

Also these APIs are designed keeping opensearch dashboards use cases as well in mind, so we need all these patterns to be supported.

@peternied
Copy link
Member

Lets focus in on where PIT is closely related to scroll and set aside the cluster/admin* scenarios:

Scroll has fewer permissions indices:data/read/scroll and indices:data/read/scroll/clear. Why do we need the additional permissions?

Looking at the capabilities of a PIT object and what warrants protection:

  • Creation PIT objects allows for allocation of resources on the cluster, safe guarding this with a permission indices:data/read/point_in_time/create looks well justified
  • Deletion of a PIT can cause request to fail that depend on it, so the lifecycle should be guarded as well, indices:data/read/point_in_time/delete looks well justified
  • Enumeration, having the id of PIT discloses the indices its associated with - I believe this is why there is the statement If the user has access to all the indices of a PIT, then PIT is permitted. - could we alter the PIT id not to include this information? Then there is no need to protect it.

If the PIT does not include the index name inside of it, as the use case of PIT is tightly coupled with _search action on an index that permissions check will implicitly occur, and if the PIT is not associated with it the request will fail.

If this understanding of the scenario isn't accurate, please provide scenarios for why an admin would want additional permissions granularity.

@bharath-techie
Copy link
Contributor Author

bharath-techie commented Sep 13, 2022

So you mean to say list all PITs should be a cluster admin operation and no need of granularity to it ?

Only concern is that let's say 'user a' has access to 'index a' and created a pit for it.

If 'user b' has 'list all pits' permissions ( cluster admin permission) but don't have access to any indices, still 'user b' will be able to read the PITs info 'user a' created.

Is that acceptable behavior ?

@bharath-techie
Copy link
Contributor Author

GET /_search
{
  "size": 10000,
  "query": {
    "match" : {
      "user.id" : "elkbee"
    }
  },
  "pit": {
    "id":  "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", 
    "keep_alive": "100m"
  },
  "sort": [ 
    {"@timestamp": {"order": "asc", "format": "strict_date_optional_time_nanos"}},
    {"_shard_doc": "desc"}
  ]
}

This is how search operation uses PIT as well, indices are not present as part of search operation, whenever PIT ID is used as PIT comprises of indices information.

So PIT will include index information as that's per the current design.

@reta
Copy link
Collaborator

reta commented Sep 14, 2022

Thanks @bharath-techie , I think with Option 1: New action filter for PITs in security plugin we could also eliminate the need to have "cluster:admin/point_in_time/readall" and only use "indices:data/read/point_in_time/readall", correct?

@bharath-techie
Copy link
Contributor Author

bharath-techie commented Sep 14, 2022

No @reta if we need to use "indices:data/read/point_in_time/readall", user needs "all indices read permission", which we are trying to avoid.

Because "indices:data/read" actions either needs indices as part of request or user needs all indices read permission.

Only thing action filter will do is filter the permitted PITs which we will get after "cluster:admin/readpits" fetches all the pits.

So we can't get rid of "cluster:admin" call if we're going for granular model.

@davidlago
Copy link

davidlago commented Sep 14, 2022

Perhaps naïve questions but I'll ask regardless:

  • Are FGAC checks (DLS/FLS etc) still enforced based on the user sending the search query using a PIT?
  • Are we comfortable with the information being base64 encoded in the pit_id being known by any user interacting with this feature?
  • Are the copies of the data PIT makes readable in any other way outside of selecting them by pit_id in a search query, where we do check for index permissions on the associated indices? (i.e. any way of bypassing that check through other methods to get to that copied data?)

@bharath-techie
Copy link
Contributor Author

bharath-techie commented Sep 14, 2022

Are we comfortable with the information being base64 encoded in the pit_id being known by any user interacting with this feature?

  • Are FGAC checks (DLS/FLS etc) still enforced based on the user sending the search query using a PIT?
    
  • Yes in rest layer we decode PIT ID to indices and all checks done on a normal search operation will be done for PIT search as well.
  • Are we comfortable with the information being base64 encoded in the pit_id being known by any user interacting with this feature?
    
  • All info available in pit id are publicly available.Reason we are trying to be more restrictive with listing all PITs is because we don't want all the users with just the action permission to read all the PITs ( which will be the case if we make it a cluster admin API ).
  • Are the copies of the data PIT makes readable in any other way outside of selecting them by pit_id in a search query, where we do check for index permissions on the associated indices? (i.e. any way of bypassing that check through other methods to get to that copied data?)
    
  • PIT ( PIT ID ) contains the node / segments information with which we're able to access the data. Without that user will be able to search only live data. And other than search API flow, there are no flows which uses PIT to read data.

@peternied
Copy link
Member

peternied commented Sep 14, 2022

So you mean to say list all PITs should be a cluster admin operation and no need of granularity to it ?

I understand there are implementation considerations, but I'd like us to focus on the user scenarios - we add permission objects to protect a kind of resource/information. Can you help me align use case for applying individual permissions and their rules.

Cases where no PIT permissions are needed

Only concern is that let's say 'user a' has access to 'index a' and created a pit for it. If 'user b' has 'list all pits' permissions ( cluster admin permission) but don't have access to any indices, still 'user b' will be able to read the PITs info 'user a' created.

PITs, as I understand them, are not user-specific resources, they are system-wide resources. User A is not imparting any ownership information onto PIT it has created - should this be supported so User B's access to them can be controlled?
[Edit] From offline conversation, they are system-wide there is no question.

Do I understand the following scenario understood correctly? If not, the following use cases can be augmented to describe the desired behavior

[Edit] From offline, this scenario is correct, because when the search is made using the PIT the indices for the search are implicitly included. User B cannot issue a search that only returns results from Index-2 with this implementation.

User A creates PIT-1 on Index-1 and Index-2
User B only has access to Index-2
User B searches on Index-2 with PIT-1, response is access is denied, because User B does not have access to Index-1

Driving towards closure, making sure we understand use cases

@bharath-techie could you please express the permissions in the form of Use Case for each permission, like the following:

Use Cases for PIT permissions

  1. The User Admin has granted User A 'indices:data/read/point_in_time/create` so they can create PITs on indices.
  2. The User Admin has granted User A 'indices:data/read/point_in_time/delete` so they can delete PITs on indices.

Please fill out/confirm these use cases for the remaining permissions

  1. The User B can search on the index they have access to with a PIT that has been created on this index by User A, they require no '*point_in_time*'permissions.
  2. The User Admin has granted User A 'indices:data/read/point_in_time/readall` so ____.
  3. The User Admin has granted User A 'indices:monitor/point_in_time/segments` so ____.
  4. The User Admin has granted User A 'cluster:admin/point_in_time/read_from_nodes` so ____.

If there are additional scenarios please add them, or we can discuss nuanced use cases as sub-bullets of each use case

@bharath-techie
Copy link
Contributor Author

bharath-techie commented Sep 15, 2022

To simplify the security model and to address the concerns about '_all' type operations, we discussed with @peternied and are proposing this new model.

Granular permissions:

Granular permissions will still be permitted for APIs where explicit PITs are provided by user.

  • Create PIT
  • Search PIT
  • Delete list of PITs
  • Segments list of PITs

Handling 'all' type operations:

User will need ' * ' indices permissions for these actions for the '_all' type operations to work.

To take existing pattern as an example i.e. For cat indices operation, specifying target index just requires target index permission, whereas if no target is specified, user will need ' * ' indices permission for the operation.

Actions:

  1. indices:data/read/point_in_time/create - Create PIT
  2. indices:data/read/point_in_time/delete - Delete PIT
  3. indices:data/read/point_in_time/readall (or) indices:monitor/point_in_time/readpit - List all PITs
  4. indices:monitor/point_in_time/segments - Segments API

Action group:

manage_point_in_time:
  reserved: true
  hidden: false
  static: true
  allowed_actions:
    - "indices:data/read/point_in_time/create"
    - "indices:data/read/point_in_time/delete"
    - "indices:data/read/point_in_time/readall" (or) "indices:monitor/point_in_time/readpit"
    - "indices:monitor/point_in_time/segments"
    - "indices:data/read/search"
  type: "index"
  description: "Manage point in time actions and PIT search"

Role which has access to all PIT actions :

We'll add this as predefined role for Point in time.

point_in_time_full_access:
  reserved: true
  hidden: false
  description: "Migrated from v6 (all types mapped)"
  index_permissions:
    - index_patterns:
        - "*" <----- user will be able to perform all pit apis with explicit or with '_all' as request (list all, delete all)
      dls: null
      fls: null
      masked_fields: null
      allowed_actions:
        - "manage_point_in_time"

We won't give the following role as default role but we can document the usage so that admin can provide granular permissions to perform PIT operation with the limitation of not being able to access '_all' type operations.

point_in_time_granular_access:
  reserved: true
  hidden: false
  description: "Migrated from v6 (all types mapped)"
  index_permissions:
    - index_patterns:
        - "index-1" <----- user will be able to perform create, search, delete list and segments list
      dls: null
      fls: null
      masked_fields: null
      allowed_actions:
        - "manage_point_in_time"

Use Cases for PIT permissions

  • The User Admin has granted User A 'indices:data/read/point_in_time/create` so they can create PITs on indices.

  • The User Admin has granted User A 'indices:data/read/point_in_time/delete`

    • User A can delete PITs passed as part of body , provided they have underlying specific index permissions.
    • User can delete all PITs only if they have ' * ' indices permission
  • The User B can search on the index they have access to with a PIT that has been created on this index by User A, they require no 'point_in_time'permissions. They need 'search' permission.

  • The User Admin has granted User A 'indices:data/read/point_in_time/readall` , so user A will be able to list all PITs only when they have ' * ' indices permission.

  • The User Admin has granted User A 'indices:monitor/point_in_time/segments`

    • so user will be able to view PIT segments when the pass explicit PITs as part of body, if they have associated indices permission .
    • To view all PIT segments info, user needs ' * ' indices permission for the actions

If there are additional scenarios please add them, or we can discuss nuanced use cases as sub-bullets of each use case

Pros:

  1. This model is inline with existing patterns like cat indices
  2. We can get rid of 'cluster:admin/point_in_time/readall' action and follow the same convention as other existing API patterns.
  3. No need of additional filter or dummy action .

Cons:

  1. Cons of this approach is that list all pits, delete all and segments all API will not be supported for users with granular indices permissions.

@reta let us know if this model works

@reta
Copy link
Collaborator

reta commented Sep 15, 2022

@bharath-techie @peternied thank you guys for clearing out the concerns and simplifyng the model. I think it looks logical and concise now, the workaround are gone.

@RyanL1997 RyanL1997 added the v2.4.0 'Issues and PRs related to version v2.4.0' label Sep 26, 2022
@davidlago davidlago added triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable. sprint backlog labels Oct 10, 2022
@stephen-crawford
Copy link
Contributor

[CLOSE] Close in favor of the existing issue list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request sprint backlog triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable. v2.4.0 'Issues and PRs related to version v2.4.0'
Projects
None yet
Development

No branches or pull requests

7 participants