[YUNIKORN-2235] Add new RESTful API for retrieving application #750

Closed
wants to merge 6 commits into from

@laysfire laysfire commented Dec 7, 2023

What is this PR for?

This PR adds two RESTful APIs.
One for retrieving one application object directly via /ws/v1/partition/{partitionName}/application/{applicationID}.
The other for listing application IDs via /ws/v1/partition/{partitionName}/applications/{state}.
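
For illustration, a client could build the two request URLs roughly as follows. This is a minimal sketch, not code from the PR: the helper names and the base address are assumptions, and only the two path templates come from the description above.

```go
package main

import (
	"fmt"
	"net/url"
)

// applicationURL builds the URL for retrieving one application object.
// baseURL is a hypothetical scheduler address, not taken from the PR.
func applicationURL(baseURL, partition, appID string) string {
	return fmt.Sprintf("%s/ws/v1/partition/%s/application/%s",
		baseURL, url.PathEscape(partition), url.PathEscape(appID))
}

// applicationsByStateURL builds the URL for the per-state endpoint.
func applicationsByStateURL(baseURL, partition, state string) string {
	return fmt.Sprintf("%s/ws/v1/partition/%s/applications/%s",
		baseURL, url.PathEscape(partition), url.PathEscape(state))
}

func main() {
	base := "http://localhost:9080" // assumed address, adjust to your deployment
	fmt.Println(applicationURL(base, "default", "app-0001"))
	fmt.Println(applicationsByStateURL(base, "default", "active"))
}
```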

What type of PR is it?

  • - Bug Fix
  • - Improvement
  • - Feature
  • - Documentation
  • - Hot Fix
  • - Refactoring

Todos

N/A

What is the Jira issue?

https://issues.apache.org/jira/browse/YUNIKORN-2235

How should this be tested?

It's been locally tested using go test

Screenshots (if appropriate)

N/A

Questions:

N/A

Contributor

@pbacsko pbacsko left a comment

See comments.


codecov bot commented Dec 8, 2023

Codecov Report

Attention: 35 lines in your changes are missing coverage. Please review.

Comparison is base (ebf7107) 77.94% compared to head (b88f25d) 77.91%.
Report is 20 commits behind head on master.

Files                        Patch %   Lines
pkg/metrics/scheduler.go     29.41%    24 Missing ⚠️
pkg/metrics/queue.go         68.00%    8 Missing ⚠️
pkg/webservice/handlers.go   96.00%    2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #750      +/-   ##
==========================================
- Coverage   77.94%   77.91%   -0.03%     
==========================================
  Files          82       82              
  Lines       13373    13523     +150     
==========================================
+ Hits        10424    10537     +113     
- Misses       2622     2659      +37     
  Partials      327      327              

Contributor

@chia7712 chia7712 left a comment

nice patch! a couple of comments left.

@@ -60,6 +60,12 @@ const (
NodeDoesNotExists = "Node not found"
)

var allowedAppStates = map[string]bool{
Contributor

Pardon me, why only partial states are allowed?

Contributor

It was discussed upstream w/ Wilfred. We don't want to return all kinds of states. A comment here would be useful.

Contributor

@laysfire could you share the discussion with me? Also, adding a comment here can help readers in the future :)

@@ -56,3 +56,7 @@ type PlaceholderDAOInfo struct {
Replaced int64 `json:"replaced,omitempty"`
TimedOut int64 `json:"timedout,omitempty"`
}

type ApplicationIDsDAOInfo struct {
AppIDs []string
Contributor

Does it need json tag?

Contributor

Also, other responses name the ID-related field "applicationId". Should we stay consistent?

Author

Yes, I will add the json tag and keep the naming consistent.
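
To make the effect of the tag concrete, here is a small sketch of how encoding/json serializes the field with and without one. The struct names and the tag value "applicationIDs" are hypothetical and chosen only for illustration; the thread does not state the final field name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// untagged mirrors the struct as first proposed: no json tag,
// so the Go field name is used verbatim as the JSON key.
type untagged struct {
	AppIDs []string
}

// tagged uses a hypothetical tag value, not necessarily the merged one.
type tagged struct {
	AppIDs []string `json:"applicationIDs"`
}

// mustMarshal returns the JSON encoding of v as a string.
func mustMarshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	ids := []string{"app-1", "app-2"}
	fmt.Println(mustMarshal(untagged{AppIDs: ids})) // {"AppIDs":["app-1","app-2"]}
	fmt.Println(mustMarshal(tagged{AppIDs: ids}))   // {"applicationIDs":["app-1","app-2"]}
}
```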

@@ -425,6 +425,10 @@ func (pc *PartitionContext) removeAppInternal(appID string) *objects.Application
return app
}

func (pc *PartitionContext) GetApplication(appID string) *objects.Application {
Contributor

This exposed method should take the lock (pc.RLock()/defer pc.RUnlock()), and the internal method should be the unlocked version.

Author

@laysfire laysfire Dec 13, 2023

Hi @chia7712, thanks for the review.
There are many places that call the internal method.
Which way do you think is better?
First,

  • Add RLock/RUnlock in exposed method & Remove RLock/RUnlock in internal method
  • Add explicit RLock/RUnlock snippet in every place that call the internal method

Second,
Just change the exposed name

Contributor

> Add RLock/RUnlock in exposed method & Remove RLock/RUnlock in internal method

yep

> Add explicit RLock/RUnlock snippet in every place that call the internal method

How about just replacing the internal method getApplication with the public version GetApplication? Small changes are good for avoiding new bugs :)

Also, you can file a new Jira as a follow-up to improve lock usage if you observe any weird use cases.

Contributor

It might have been a package-internal method until this point. Lots of places call it, but they are all in the package. The partition is where the events from the shim get processed and where the scheduling cycle runs from. That means we have multiple goroutines that could be changing values stored in the partition. The locking makes sure we have no data races and a consistent view.
The internal method was locked for that reason and needs to stay like that.
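
As a generic illustration of the locking options discussed in this thread, the sketch below shows an exported accessor that takes the read lock and an unlocked helper for callers that already hold it. The types and method names are hypothetical stand-ins, not the YuniKorn code, and this is not necessarily the variant that was merged.

```go
package main

import (
	"fmt"
	"sync"
)

// application and partition are hypothetical stand-ins for illustration.
type application struct{ id string }

type partition struct {
	sync.RWMutex
	applications map[string]*application
}

// GetApplication is the exported accessor: it takes the read lock so that
// concurrent goroutines see a consistent view of the map.
func (p *partition) GetApplication(appID string) *application {
	p.RLock()
	defer p.RUnlock()
	return p.getApplicationNoLock(appID)
}

// getApplicationNoLock must only be called while the lock is already held.
func (p *partition) getApplicationNoLock(appID string) *application {
	return p.applications[appID]
}

// addApplication takes the write lock before mutating the map.
func (p *partition) addApplication(app *application) {
	p.Lock()
	defer p.Unlock()
	p.applications[app.id] = app
}

func main() {
	p := &partition{applications: map[string]*application{}}
	p.addApplication(&application{id: "app-1"})
	fmt.Println(p.GetApplication("app-1") != nil)   // true
	fmt.Println(p.GetApplication("missing") == nil) // true
}
```

Go's sync.RWMutex is not reentrant, which is why code that already holds the lock needs an unlocked helper rather than calling the locked accessor again.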

Contributor

@pbacsko pbacsko left a comment

Some new comments.

@manirajv06
Contributor

@laysfire I have shared my views in jira. Please check.

@laysfire
Author

@manirajv06 @chia7712 @pbacsko I have addressed some of the comments; please help review this again.

route{
"Scheduler",
"GET",
"/ws/v1/partition/:partition/applications/:state",
Contributor

this endpoint is under discussion. https://issues.apache.org/jira/browse/YUNIKORN-2235

Contributor

Updated jira with more details. Please check.

@wilfred-s
Contributor

Dropped in a large comment in the jira:

@laysfire
Author

@manirajv06 @wilfred-s @chia7712 @pbacsko please help review again.
The summary is as follows:

  • Expose GetApplication on the partition, with a no-lock version
  • Make the getApplication handler handle /ws/v1/partition/{partitionName}/application/{applicationID}
  • Add getPartitionApplicationByState to retrieve applications by state
  • Only 3 states are allowed: Active (fake), Rejected and Completed
  • For the Active state, only the New, Accepted, Starting, Running, Completing, Failing and Resuming statuses are allowed
  • I did not modify the returned DAOs, for example to clean allocations for completed applications. I think we can do this in a follow-up Jira.

Contributor

@pbacsko pbacsko left a comment

Two questions.

Contributor

@pbacsko pbacsko left a comment

Almost. One more round.

Comment on lines 664 to 665
partition := vars.ByName(strings.ToLower("partition"))
appState := vars.ByName(strings.ToLower("state"))
Contributor

Partitions are case-sensitive, so drop that.
For appState, the proper approach is:

appState := strings.ToLower(vars.ByName("state"))

@laysfire
Author

@pbacsko Thanks for the review. I have added case-insensitive support for appState & activeState, but I had to add many strings.ToLower calls in the code. I'm not sure whether this is good practice.

Contributor

@pbacsko pbacsko left a comment

I agree that this change adds too much distracting stuff. Let's get rid of the ToLower() calls and just use plain string literals. See my comments.

Comment on lines 83 to 89
allowedAppActiveStates[strings.ToLower(objects.New.String())] = true
allowedAppActiveStates[strings.ToLower(objects.Accepted.String())] = true
allowedAppActiveStates[strings.ToLower(objects.Starting.String())] = true
allowedAppActiveStates[strings.ToLower(objects.Running.String())] = true
allowedAppActiveStates[strings.ToLower(objects.Completing.String())] = true
allowedAppActiveStates[strings.ToLower(objects.Failing.String())] = true
allowedAppActiveStates[strings.ToLower(objects.Resuming.String())] = true
Contributor

@pbacsko pbacsko Jan 2, 2024

We might need better naming here. It's a bit difficult to distinguish between allowedAppStates and allowedAppActiveStates. At least some comments:

var allowedStatesMsg string  // returned error message when the requested application state is invalid
var allowedActiveStatesMsg string // returned error message when the actual state of the application is not an active state like Running, Accepted, etc.
var allowedAppStates map[string]bool // ??? (this one is a bit tough to explain, we mix two existing states with a higher level one which)
var allowedAppActiveStates map[string]bool // list of application states that are valid for filtering

We can actually just drop allowedAppStates because it only contains 3 values. Readability is important, so this is better:

	if appState != "active" && appState != "rejected" && appState != "completed" {
		buildJSONErrorResponse(w, "Only following application states are allowed: active, rejected, completed", http.StatusBadRequest)
		return
	}

You can also just drop the strings.ToLower() calls, it makes the code a bit clumsy. Just write:

	allowedAppActiveStates["new"] = true
	allowedAppActiveStates["accepted"] = true
	allowedAppActiveStates["starting"] = true
...

Obviously this is hard-coded and we no longer reference the code which defines the states, but something for something. However, it's pretty unlikely that the state machine will ever change. You can add a comment to application_state.go above the const section:

// Application states are used for filtering in the webservice handlers. Please check&update the logic if needed if the state machine is modified

Sometimes we need to be pragmatic.
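
Put together, the plain-literal approach could look roughly like the sketch below. It is an illustration under assumptions, not the merged code: the helper function names are hypothetical, and the keys are sorted before building the message because Go does not specify map iteration order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// allowedAppActiveStates lists the filterable states as plain lower-case
// literals instead of deriving them from the state machine constants.
var allowedAppActiveStates = map[string]bool{
	"new":        true,
	"accepted":   true,
	"starting":   true,
	"running":    true,
	"completing": true,
	"failing":    true,
	"resuming":   true,
}

// isAllowedActiveState (hypothetical helper) lower-cases the input once
// and looks it up in the map.
func isAllowedActiveState(state string) bool {
	return allowedAppActiveStates[strings.ToLower(state)]
}

// allowedActiveStatesMsg (hypothetical helper) builds the error text.
// Keys are sorted so the message is stable across runs.
func allowedActiveStatesMsg() string {
	keys := make([]string, 0, len(allowedAppActiveStates))
	for k := range allowedAppActiveStates {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return "Only following active states are allowed: " + strings.Join(keys, ",")
}

func main() {
	fmt.Println(isAllowedActiveState("Running"))   // true
	fmt.Println(isAllowedActiveState("completed")) // false
	fmt.Println(allowedActiveStatesMsg())
}
```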

@laysfire laysfire requested a review from pbacsko January 3, 2024 09:09
Contributor

@pbacsko pbacsko left a comment

+1 LGTM

I'll let others review it, too.

Contributor

@chia7712 chia7712 left a comment

@laysfire thanks for this nice feature. two small comments left. please take a look.

return
}
if appState != "active" && appState != "rejected" && appState != "completed" {
buildJSONErrorResponse(w, "Only following application states are allowed: active, rejected, completed", http.StatusBadRequest)
Contributor

Could we move this line to line#683? for example:

	case "active":
		if status := strings.ToLower(r.URL.Query().Get("status")); status != "" {
			if !allowedAppActiveStates[status] {
				buildJSONErrorResponse(w, allowedActiveStatesMsg, http.StatusBadRequest)
				return
			}
			for _, app := range partitionContext.GetApplications() {
				if strings.ToLower(app.CurrentState()) == status {
					appList = append(appList, app)
				}
			}
		} else {
			appList = partitionContext.GetApplications()
		}
	case "rejected":
		appList = partitionContext.GetRejectedApplications()
	case "completed":
		appList = partitionContext.GetCompletedApplications()
	default:
		buildJSONErrorResponse(w, "Only following application states are allowed: active, rejected, completed", http.StatusBadRequest)
		return
	}

Author

Cool. Moving #659 to the "default" branch makes the code clearer, and the default branch becomes reachable.
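
The shape of that dispatch, with the validation living in the default branch, can be sketched without the web layer as follows. The store type and appsByState function are hypothetical stand-ins for the partition context and the handler, and the check of the status against the allowed list is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

const invalidStateMsg = "Only following application states are allowed: active, rejected, completed"

// store is a hypothetical stand-in for the partition context.
type store struct {
	active    map[string]string // application ID -> current state
	rejected  []string
	completed []string
}

// appsByState returns application IDs for the requested state. An unknown
// state falls through to the default branch, which reports the error.
func (s store) appsByState(state, status string) ([]string, error) {
	switch strings.ToLower(state) {
	case "active":
		var ids []string
		for id, cur := range s.active {
			// An empty status means "no filter".
			if status == "" || strings.EqualFold(cur, status) {
				ids = append(ids, id)
			}
		}
		sort.Strings(ids) // stable order for display
		return ids, nil
	case "rejected":
		return s.rejected, nil
	case "completed":
		return s.completed, nil
	default:
		return nil, errors.New(invalidStateMsg)
	}
}

func main() {
	s := store{
		active:   map[string]string{"app-1": "Running", "app-2": "Accepted"},
		rejected: []string{"app-3"},
	}
	ids, err := s.appsByState("active", "running")
	fmt.Println(ids, err)
	ids, err = s.appsByState("unknown", "")
	fmt.Println(ids, err)
}
```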

case "active":
if status := strings.ToLower(r.URL.Query().Get("status")); status != "" {
if !allowedAppActiveStates[status] {
buildJSONErrorResponse(w, allowedActiveStatesMsg, http.StatusBadRequest)
Contributor

Should allowedActiveStatesMsg be renamed to allowedActiveStatusMsg if the query key is called "status"?

Contributor

@chia7712 chia7712 left a comment

@laysfire sorry that I am overengineering today. one small comment left.

BTW, please file a follow-up to document the new APIs.

for k := range allowedAppActiveStates {
activeStates = append(activeStates, k)
}
allowedActiveStatusMsg = fmt.Sprintf("Only following active states are allowed: %s", strings.Join(activeStates, ","))
Contributor

Could you change active states to active statuses? The query key is status so I prefer naming consistency.

Author

@chia7712 Thanks for the review. To keep things consistent, should I also change allowedAppActiveStates to allowedAppActiveStatuses?

Contributor

yep!

Contributor

@chia7712 chia7712 left a comment

LGTM

@chia7712 chia7712 closed this in 77e19f6 Jan 5, 2024