Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/systemstatus 1291 #3549

Merged
merged 9 commits into from
May 15, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions apps/dashboard/app/controllers/system_status_controller.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
class SystemStatusController < ApplicationController
def index
@job_clusters = OodAppkit.clusters
.select(&:job_allow?)
.reject { |c| c.metadata.hidden }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you pulled this from here? Seems like we shouldn't reimplement it or in fact even need it here - the view can directly use Configuration.job_clusters.each.

def job_clusters
@job_clusters ||= OodCore::Clusters.new(
OodAppkit.clusters
.select(&:job_allow?)
.reject { |c| c.metadata.hidden }
)
end

end
end
105 changes: 105 additions & 0 deletions apps/dashboard/app/helpers/system_status_helper.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
# frozen_string_literal: true

# helper for the system status page
module SystemStatusHelper
# Used to not repeatedly request the job adapter
def update_cluster(cluster)
@job_adapter = cluster.job_adapter
@cluster_info = @job_adapter.cluster_info
@cluster = cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want member variables in helpers. They should be treated as functional methods - i.e., they don't hold any state, are idempotent and so on.

end

def job_adapter(cluster)
if @cluster != cluster
update_cluster cluster
end

@job_adapter
end

def cluster_info(cluster)
if @cluster != cluster
update_cluster cluster
end

@cluster_info
end

def title(cluster)
"#{cluster.metadata.title.titleize} Cluster Status"
end

def generic_pct(value, total)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably just call this percent right?

"#{(value.to_f / total * 100).round 2}%"
end

def generic_status(active, total, name)
"#{active} of #{total} #{name} Active (#{total - active} Free)"
end

def node_status(cluster)
c_info = cluster_info cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use parentheses on method calls? I know Ruby allows this, but I feel like it's more readable with the enclosing ().

 c_info = cluster_info(cluster)

generic_status(c_info.active_nodes, c_info.total_nodes, 'Nodes')
end

def node_pct(cluster)
c_info = cluster_info cluster
generic_pct(c_info.active_nodes, c_info.total_nodes)
end

def processor_status(cluster)
c_info = cluster_info cluster
generic_status(c_info.active_processors, c_info.total_processors, 'Processors')
end

def processor_pct(cluster)
c_info = cluster_info cluster
generic_pct(c_info.active_processors, c_info.total_processors)
end

def gpu_status(cluster)
c_info = cluster_info cluster
generic_status(c_info.active_gpus, c_info.total_gpus, 'GPUs')
end

def gpu_pct(cluster)
c_info = cluster_info cluster
generic_pct(c_info.active_gpus, c_info.total_gpus)
end

def active_jobs(cluster)
active_count = 0
job_adapter(cluster).info_all_each do |i|
if i.status.running?
active_count += 1
end
end
active_count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work as well? If you find yourself creating/defining variables outside of loops like active_count = 0, you'll often find there's a nicer way to do it in ruby.

job_adapter(cluster).info_all_each.select do |info|
  info.status.running?
end.length

end

def active_job_status(cluster)
"#{active_jobs cluster} Jobs Running"
end

def queued_jobs(cluster)
queued_count = 0
job_adapter(cluster).info_all_each do |i|
if i.status.queued?
queued_count += 1
end
end
queued_count
end

def queued_job_status(cluster)
"#{queued_jobs cluster} Jobs Queued"
end

def active_job_pct(cluster)
generic_pct(active_jobs(cluster), active_jobs(cluster) + queued_jobs(cluster))
end

def queued_job_pct(cluster)
generic_pct(queued_jobs(cluster), active_jobs(cluster) + queued_jobs(cluster))
end
end
2 changes: 1 addition & 1 deletion apps/dashboard/app/javascript/config.js
Original file line number Diff line number Diff line change
Expand Up @@ -71,4 +71,4 @@ export function bcPollDelay() {
export function bcIndexUrl() {
const cfgData = configData();
return cfgData['bcIndexUrl'];
}
}
17 changes: 17 additions & 0 deletions apps/dashboard/app/javascript/system_status.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
import { bcPollDelay } from './config'
import { replaceHTML } from './turbo_shim'

function poll() {
url = "";
fetch(url, { headers: { Accept: "text/vnd.turbo-stream.html" } })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this URL set to "" is a bit dangerous as this may be used in other places. I'd suggest adding the route to the app/views/layouts/_config.html.erb and then adding the corresponding method to the config.js file to extract the correct URL.

.then(response => response.ok ? Promise.resolve(response) : Promise.reject(response.text()))
.then((r) => r.text())
.then((html) => replaceHTML("system-status", html))
.then(setTimeout(poll, bcPollDelay()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want a new configuration for this, but let's hard code it to 30000 (30 seconds) for now.

.catch((err) => {
console.log('Cannot retrieve batch connect sessions due to error:');
console.log(err);
});
}

jQuery(poll)
2 changes: 2 additions & 0 deletions apps/dashboard/app/views/system_status/index.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
<div class="row justify-content-center" id="system-status">Loading</div>
<%= javascript_include_tag 'system_status' %>
33 changes: 33 additions & 0 deletions apps/dashboard/app/views/system_status/index.turbo_stream.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
<turbo-stream action="replace" target="system-status">
<template>
<% @job_clusters.each do |c| %>
<div class="col-sm-6 col-xs-12 p-2">
<div class="col border rounded p-2 text-center">
<h4><strong><%= title c %></strong></h4>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful with headings. Semantically, they only kinda make sense if there are also h1-h3's on the page. an h4 by itself is going to throw screen reader users off if they don't see others.

If it's a sizing thing you can directly use the h4 bootstrap class to get the appropriate text size.

<div>Hello world at <%= Time.now %></div>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any need for this div anymore.

<div><%= node_status c %></div>
<div class="progress">
<div class="progress-bar" role="progressbar" style="width: <%= node_pct c %>"><%= node_pct c %></div>
</div>
<div><%= processor_status c %></div>
<div class="progress">
<div class="progress-bar" role="progressbar" style="width: <%= processor_pct c %>"><%= processor_pct c %></div>
</div>
<div><%= gpu_status c %></div>
<div class="progress">
<div class="progress-bar" role="progressbar" style="width: <%= gpu_pct c %>"><%= gpu_pct c %></div>
</div>
<h4 class="mt-2 pt-2 border-top">Jobs</h4>
<div><%= active_job_status c %></div>
<div class="progress">
<div class="progress-bar" role="progressbar" style="width: <%= active_job_pct c %>"><%= active_job_pct c %></div>
</div>
<div><%= queued_job_status c %></div>
<div class="progress">
<div class="progress-bar" role="progressbar" style="width: <%= queued_job_pct c %>"><%= queued_job_pct c %></div>
</div>
</div>
</div>
<% end %>
</template>
</turbo-stream>
2 changes: 2 additions & 0 deletions apps/dashboard/config/routes.rb
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,8 @@
delete '/activejobs' => 'active_jobs#delete_job', :as => 'delete_job'
end

get '/systemstatus' => 'system_status#index'

get '/jobs/info/:cluster/:id' => 'jobs#info', :defaults => { :format => 'json' }, :as => 'jobs_info'

post 'settings', :to => 'settings#update'
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
require "test_helper"

class SystemStatusControllerTest < ActionDispatch::IntegrationTest
# test "the truth" do
# assert true
# end
end
Loading