Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeout for each workflow steps #317

Merged
merged 5 commits into from
Dec 27, 2023

Conversation

jackiehanyang
Copy link
Collaborator

@jackiehanyang jackiehanyang commented Dec 23, 2023

12/26 revision:

  • Change the system default NODE_TIMEOUT_DEFAULT_VALUE to 10s

  • Removed all 10s timeout values in the workflow-step.json file, only leaving the timeout field for those workflow steps that require an overwrite timeout value other than 10s.
    For example, register_local_model has default timeout value as 60s, and deploy_model has default timeout value as 15s

  • Do try catch at where we read workflow-step.json file instead of passing it up.

  • Add a new unit test case to cover all possible ways of how we parse timeout value


Description

This pr does two things:

  • Having a specific timeout for all the workflow steps in workflow-steps.json
  • rename indices:
    .plugins-ai-global-context -> .plugins-flow-framework-templates
    .plugins-workflow-state -> .plugins-flow-framework-state

Issues Resolved

#249

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added the backport 2.x backport PRs to 2.x branch label Dec 23, 2023
Copy link
Member

@dbwiddis dbwiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice code, and I learned something! Two general comments:

  • Let's use TimeValue rather than String as early as we can in parsing
  • Let's handle exceptions as early as we can in parsing

@jackiehanyang
Copy link
Collaborator Author

Nice code, and I learned something! Two general comments:

  • Let's use TimeValue rather than String as early as we can in parsing
  • Let's handle exceptions as early as we can in parsing

Thanks for the quick review! Yeah, I immediately regretted the decision not to handle exceptions in parsing right after I sent out this PR and closed my laptop. But since I had already closed my laptop, I wanted to wait until Tuesday to address this. I will send out a new revision soon to address it. :)

@jackiehanyang
Copy link
Collaborator Author

Nice code, and I learned something! Two general comments:

  • Let's use TimeValue rather than String as early as we can in parsing
  • Let's handle exceptions as early as we can in parsing

I prefer to leave timeout as a String value, and leave it as timeout = parser.text() for the following reasons:

  1. TimeValue.parseTimeValue(...) throws NPE if parser.text() is null - https://github.com/opensearch-project/OpenSearch/blob/71d7e4daf7dc05c9598d447e4424072e76de79d1/libs/common/src/main/java/org/opensearch/common/unit/TimeValue.java#L374-L375. But a String type allows it to be null.
  2. The reason we allow it to be null is that we want to handle parsing errors when we parse it, rather than in the actual parser. If we handle it in the actual parser, we would need to catch every switch case. However, if we handle it when we call the parser, we can catch all parsing errors at once. For example, we don't throw exception here, instead we do try and catch here - https://github.com/opensearch-project/flow-framework/blob/main/src/main/java/org/opensearch/flowframework/workflow/WorkflowProcessSorter.java#L143-L144
  3. Since the TIMEOUT in the workflow-step.json file is part of our own configuration, it shouldn't be null. However, just in case it is, we assign a default value in the code to prevent failing the entire workflow. As it's not a user-provided field, it makes no sense for the workflow to fail just because it's null. Therefore, we catch it and assign a default value in every step where possible.

Signed-off-by: Jackie Han <[email protected]>
Copy link
Member

@dbwiddis dbwiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nit and few outstanding comments from prior review.

Signed-off-by: Jackie Han <[email protected]>
@dbwiddis dbwiddis merged commit 9fdf03c into opensearch-project:main Dec 27, 2023
19 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 27, 2023
* read timeout from workflow-steps

Signed-off-by: Jackie Han <[email protected]>

* change indices name

Signed-off-by: Jackie Han <[email protected]>

* address comments

Signed-off-by: Jackie Han <[email protected]>

* address comment - parse timeout into TimeValue

Signed-off-by: Jackie Han <[email protected]>

* address minor comments

Signed-off-by: Jackie Han <[email protected]>

---------

Signed-off-by: Jackie Han <[email protected]>
(cherry picked from commit 9fdf03c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dbwiddis pushed a commit that referenced this pull request Dec 28, 2023
Add timeout for each workflow steps (#317)

* read timeout from workflow-steps



* change indices name



* address comments



* address comment - parse timeout into TimeValue



* address minor comments



---------


(cherry picked from commit 9fdf03c)

Signed-off-by: Jackie Han <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x backport PRs to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CHORE] Change system index names in code [FEATURE] Add node timeout for the workflow steps
2 participants