-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels/Triage for OpenSearch Core Repo #11617
Conversation
…labels in core for easier understanding of what's there Signed-off-by: Mark Cohen <[email protected]>
Signed-off-by: Mark Cohen <[email protected]>
Compatibility status:Checks if related components are compatible with change 8a3481e Incompatible componentsIncompatible components: [https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git] Skipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/alerting.git] |
❕ Gradle check result for 9491101: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #11617 +/- ##
============================================
- Coverage 71.41% 71.34% -0.08%
+ Complexity 59397 59367 -30
============================================
Files 4923 4923
Lines 279212 279178 -34
Branches 40595 40581 -14
============================================
- Hits 199408 199181 -227
- Misses 63223 63437 +214
+ Partials 16581 16560 -21 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for starting to write this up, however; I'm not sure that I like this living it its own file, seems like it could be part of the contributing.md How do you see this being used?
1. Read through the issue to understand the request. | ||
1. If the issue does not meet the template requirements, ask the requester for missing details and keep the untriaged label until all the details have been disclosed . | ||
2. Move to 3 if you have all the details on the associated issue. | ||
3. If the issue does not impact the OpenSearch core repository, then transfer the issue to the appropriate team using the Transfer issue button on the lower right hand corner. If you don’t have the access to transfer, then you can add comment by mentioning [@opensearch-project/admin](https://github.com/orgs/opensearch-project/teams/admin) to do the transfer for you. If not, move to step 4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the @opensearch-project/triage team able to handle these requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think someone in the triage team can handle #3 unless they have maintainer status in both repos. Otherwise, I'd say they can. Are you suggesting that they do handle them? There are 120 people in that team today and I don't think all of them have eyes on this repo. Maybe maintainers or appointed label owners (could be one or more than one) who are contributors (not necessarily maintainers) to core could do this work.
# Triage in OpenSearch Core | ||
## What Do We Do With These Labels Anyway? | ||
|
||
Across all repositories in opensearch-project, we have a process called “triage.” The goal of this process is to make sure repository owners and maintainers check new issues to understand them. First - we learn if the issue is security related (e.g. CVE), second - we understand if we have enough information to act on the issue, third - we re-label or respond on the issue. During triage we do not work on these issues, but instead just try to get the issue to a next step. Because the OpenSearch core repository (https://github.com/opensearch-project/OpenSearch) has so many different functions with different sets of contributors working on these functions, repo owners decided to treat these functions like separate repositories when triaging to get through the backlog faster and have more focus. There are a few maintainers that do triage in public (see https://meetup.com/OpenSearch to join) and there will be more coming. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Triage is described in the project wide guidance [1], what do you think about referencing that guidance and call out how this project is deviates?
@@ -0,0 +1,49 @@ | |||
# Triage in OpenSearch Core | |||
## What Do We Do With These Labels Anyway? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tone is a little casual to my taste, consider changing this title
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Whatever, man. ;)
## What Do We Do With These Labels Anyway? | |
## Why are there Specialized Labels in Core? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually do have a style guide for the whole project in this regard: https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#tone
@@ -0,0 +1,49 @@ | |||
# Triage in OpenSearch Core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd highly recommend looking at the security team's triage guidance [1], while the format is based off communicating how the triage meeting operates the FAQ format might be useful to highlight edge cases.
1. Every issue is created with an untriaged label by default. | ||
2. No less than once per week, untriaged issues should be reviewed: | ||
1. Read through the issue to understand the request. | ||
1. If the issue does not meet the template requirements, ask the requester for missing details and keep the untriaged label until all the details have been disclosed . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its important that 'untriaged' label don't linger, with this repro's high volume of incoming issues, I might recommend stronger guidance to close with a message such as "Thanks for filing - after reviewing this issue there are details missing that are needed to accept this issue and have closed this issue. Please update the issue to the missing all details and re-open so we know to look at it again."
In the security repo we handle this by having a separate 'triaged' flag that is only added during the triage meeting. Therefore anyone can quickly look at untriaged issues, remove the untriaged, and only during the open meeting are they accepted. YMMV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like a fine line between building the community and continuing to have the problems we have, but I'm open to it. Would definitely like to hear from other maintainers on this one, but I can make the update.
1. If the issue does not meet the template requirements, ask the requester for missing details and keep the untriaged label until all the details have been disclosed . | |
1. If the issue does not meet the template requirements, close the issue with a comment like: "Thanks for contributing - after reviewing this issue there are details missing that are needed to accept this issue and have closed this issue. Please update the issue to include X, Y, and Z re-open so we know to look at it again. We look forward to working with you on this." |
5. At any point a maintainer or issue owner can add or remove a label which will help clarify the work that needs to be done. This RFC will also become a markdown file in the OpenSearch repo and we will add a link to the issue templates in the repo so issue creators can reference this process to help it go smoother. | ||
|
||
|
||
TABLE 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of items manually included on this table, do we need this table? What would you think about pointing folks to the labels in the project or the template so we don't have to keep this up to date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure these labels stand alone in definition. They may need renaming or a reference. I agree that automation and simplicity would be better. If there was a way to have a description as metadata for a label then I think that could work better than a separate read me. Or if the labels were more self-describing.
3. Make sure to enrich the issue with more details by adding your thoughts, add more context to the issue, or ask the author for more clarification if needed ...etc. | ||
4. Remove the untriaged label | ||
5. Add the component level labels based on TABLE 1 | ||
5. At any point a maintainer or issue owner can add or remove a label which will help clarify the work that needs to be done. This RFC will also become a markdown file in the OpenSearch repo and we will add a link to the issue templates in the repo so issue creators can reference this process to help it go smoother. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This RFC will also become a markdown file in the OpenSearch repo and we will add a link to the issue templates in the repo so issue creators can reference this process to help it go smoother.
Should this be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YES! Thanks. Will do.
|
||
Across all repositories in opensearch-project, we have a process called “triage.” The goal of this process is to make sure repository owners and maintainers check new issues to understand them. First - we learn if the issue is security related (e.g. CVE), second - we understand if we have enough information to act on the issue, third - we re-label or respond on the issue. During triage we do not work on these issues, but instead just try to get the issue to a next step. Because the OpenSearch core repository (https://github.com/opensearch-project/OpenSearch) has so many different functions with different sets of contributors working on these functions, repo owners decided to treat these functions like separate repositories when triaging to get through the backlog faster and have more focus. There are a few maintainers that do triage in public (see https://meetup.com/OpenSearch to join) and there will be more coming. | ||
|
||
Here is an example of the workflow for a new issue in the core repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there should be a body of maintainers/contributors that agreeing to work in this way, not a generic example.
Note; I'm not part of any such team for core, so I triage on my own whims and timescales, otherwise I'd volunteer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open to whatever the maintainers/other contributors want to do here. I haven't seen any other objections to date so this example remains. Can you suggest something more concrete? Not sure I understand what you mean.
Thanks for taking on this review!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any luck finding a first follower(s) on this topic? I am concerned about documenting a process in a document that no one follows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I'll post this to the Slack #maintainers channel and ask for folks to review this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I side with @peternied here - we should leave the process(es) and workflow(s) up to repository maintainers to decide. The pace of changes varies greatly between repositories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The triaging that's happening now in core isn't enough, IMO. Do maintainers agree?
That is very fair, we don't have triage meetings for core specifically but we should I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @reta. Are we back to the idea of a core public triage meeting, then? It may make sense to develop the process from there and maintainers can document the process used and decide on useful labels with the community there. @peternied, @reta, I can help set something like that up with you (I'm not a core maintainer so I would be open to facilitating and recommending, but the maintainers should determine the process).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@macohen You're trying to document the process that is currently being done, right? Much of this happens internally within Amazon teams, but some of it happens in open community meetings, but to date the actually process isn't documented anywhere in the open. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @reta. Are we back to the idea of a core public triage meeting, then? I
I would love to see that happening (it is going well for security / ml-commons / search relevance) but I do not see (personally, as of today) how we could make it work: there are many components (or subsystems) in there, search relevance is an example of cutting a small piece of the core and looking into it, but there are many others pieces.
Having one meeting to go over the OpenSearch core is an option (we need at least few leads to be present there) - it may be efficient use of maintaners' time but not users times. Having N meetings (one per component or subsystem) is another option that has own pros and cons.
Depending on the way we would like to approach the problem, the triaging rules may differ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@macohen You're trying to document the process that is currently being done, right? Much of this happens internally within Amazon teams, but some of it happens in open community meetings, but to date the actually process isn't documented anywhere in the open. Is that right?
Right. I don't know what the process is across all repos and I think the process has recently become more confusing and unclear so I'm attempting a declarative document that says what should be. If it's the wrong declaration, then let's change the document, but let's document the right thing. And let's keep this specific to OpenSearch repo since that's where the complexity is. I would also say that we need to make this process public and I am even more certain that the maintainers should own it.
@macohen I've merged the doc describing the public triage meeting that I'll be driving. I think most of the content in this PR is covered by TRIAGING.md. I don't think we will be merging this PR, but before closing this one out I'd recommend reviewing the other doc to see if there is anything you've got here that you'd like to include in that doc, Happy to review PRs / issues in this space - thanks for spurring this effort forward! |
❕ Gradle check result for 8ce29c5: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
❌ Gradle check result for 8a3481e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
This PR is stalled because it has been open for 30 days with no activity. |
@macohen I'm going to close out this PR, let us know if you'd like to revisit it |
Description
If maintainers agree, this should provide documentation on the triage process followed in core along with definition of some of the labels. Could use additional definition for some of the blank ones, but this is a start.
Related Issues
Check List
New functionality includes testing.All tests passNew functionality has been documented.New functionality has javadoc addedFailing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)Commit changes are listed out in CHANGELOG.md file (See: Changelog)Public documentation issue/PR createdBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.