Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource prediction #142

Merged
merged 38 commits into from
Feb 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
fcbc9e5
Added max and avg metric per step level in resources page
aleenathomas Nov 16, 2023
382c515
WIP: prediction for dry runs
aleenathomas Nov 16, 2023
a329aab
Added prediction for cpu
aleenathomas Nov 23, 2023
2d3086b
WIP: added predictions for memory
aleenathomas Nov 23, 2023
4bc99b0
WIP - store initial input filesizes in dry run workflow template
aleenathomas Dec 5, 2023
07185e8
Added s3 key in frontend
aleenathomas Dec 5, 2023
ff313b4
WIP - predict duration based on previous dry runs
aleenathomas Dec 8, 2023
22188e8
WIP: user inputs initial filesize in create new dry run
aleenathomas Dec 10, 2023
f73f272
Very basic prediction page
aleenathomas Dec 11, 2023
8e9d44c
Fixed bug with logs display
aleenathomas Dec 13, 2023
b28c302
feat: ✨ artifact size
fungiboletus Dec 13, 2023
273f478
Merge branch 'resource-prediction' of https://github.com/DataCloud-pr…
aleenathomas Dec 13, 2023
b4eace4
WIP: integrating new filesize api
aleenathomas Dec 14, 2023
94d3db7
Eslint
aleenathomas Dec 14, 2023
8b923a1
fix: 🐛 fix renamed type
fungiboletus Dec 14, 2023
876c97f
Merge branch 'resource-prediction' of github.com:DataCloud-project/SI…
fungiboletus Dec 14, 2023
69fdcb0
feat: ✨ support both minio public and internal URLs
fungiboletus Dec 14, 2023
a978c73
WIP fix memory predictions 0B
aleenathomas Dec 19, 2023
58d9714
fix: 🐛 fix capitalisation
fungiboletus Dec 20, 2023
70972bd
WIP - Code refactoring
aleenathomas Jan 18, 2024
64b794d
WIP: renamed routes.
aleenathomas Jan 18, 2024
51217e2
Merge branch 'resource-prediction' of https://github.com/DataCloud-pr…
aleenathomas Jan 18, 2024
ea3e466
Es lint formatting
aleenathomas Jan 18, 2024
dbe8946
Added multiple memory units in prediction
aleenathomas Jan 22, 2024
30cff5b
WIP: improving code quality
aleenathomas Jan 22, 2024
f0605fd
WIP: code refactoring
aleenathomas Jan 22, 2024
0e3d5fb
Added support for argo template field env
aleenathomas Jan 23, 2024
1cd6dfc
Added new type Templates
aleenathomas Jan 23, 2024
d1ff1bc
Added error check for new argo project
aleenathomas Jan 30, 2024
271c7f4
Metrics page; duration bug correction
aleenathomas Jan 30, 2024
18ac013
Eslint
aleenathomas Jan 30, 2024
17474aa
Added argo error message for error/failed dry runs
aleenathomas Jan 31, 2024
1046d27
WIP: improving code quality
aleenathomas Feb 1, 2024
2e3fdcf
Merge branch 'main' of https://github.com/DataCloud-project/SIM-PIPE …
aleenathomas Feb 2, 2024
7a3806d
WIP: corrections on resource prediction page
aleenathomas Feb 2, 2024
135f172
WIP: added error handling to project and dry run pages
aleenathomas Feb 6, 2024
b85c6b8
WIP added error handling
aleenathomas Feb 8, 2024
91e6dce
Deleted unused files folder
aleenathomas Feb 8, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions charts/simpipe/templates/deployment-controller.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -44,8 +44,10 @@ spec:
key: minioSecretKey
- name: JWT_DEV_MODE
value: "true"
- name: MINIO_URL
value: {{ .Values.controller.minio.url | quote }}
- name: MINIO_PUBLIC_URL
value: {{ .Values.controller.minio.publicUrl | quote }}
- name: MINIO_INTERNAL_URL
value: {{ .Values.controller.minio.internalUrl | quote }}
- name: MINIO_BUCKET_NAME
value: {{ .Values.controller.minio.bucketName | quote }}
- name: MINIO_REGION
Expand Down
3 changes: 2 additions & 1 deletion charts/simpipe/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,8 @@ controller:

minio:
# Configured by default for the localhost port forwarding in forwarding.py
url: http://localhost:8085
publicUrl: http://localhost:8085
internalUrl: http://simpipe-minio.default.svc.cluster.local:9000
accessKey: simpipe
secretKey: simpipe1234
bucketName: artifacts
Expand Down
6 changes: 3 additions & 3 deletions controller/src/argo/dry-runs.ts
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ import {
import { DryRunNodePhase, DryRunNodeType, DryRunPhase } from '../server/schema.js';
import getPodName from './get-pod-name.js';
import type {
DryRun, DryRunNode, DryRunNodeArtifact, DryRunNodePod,
Artifact, DryRun, DryRunNode, DryRunNodePod,
} from '../server/schema.js';
import type { ArgoClientActionNames, ArgoNode, ArgoWorkflow } from './argo-client.js';
import type ArgoWorkflowClient from './argo-client.js';
Expand Down Expand Up @@ -125,8 +125,8 @@ export function convertArgoWorkflowNode(node: ArgoNode, argoWorkflow: ArgoWorkfl
const type = convertArgoNodeType(node.type);

let podName: string | undefined;
let inputArtifacts: DryRunNodeArtifact[] | undefined;
let outputArtifacts: DryRunNodeArtifact[] | undefined;
let inputArtifacts: Artifact[] | undefined;
let outputArtifacts: Artifact[] | undefined;

if (type === DryRunNodeType.Pod) {
podName = getPodName(node, argoWorkflow);
Expand Down
3 changes: 2 additions & 1 deletion controller/src/config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,8 @@ export const minioAccessKey = process.env.MINIO_ACCESS_KEY;
export const minioSecretKey = process.env.MINIO_SECRET_KEY;
export const minioBucketName = process.env.MINIO_BUCKET_NAME ?? 'artifacts';
export const minioRegion = process.env.MINIO_REGION ?? 'no-region';
export const minioUrl = process.env.MINIO_URL ?? 'http://localhost:8085';
export const minioPublicUrl = process.env.MINIO_PUBLIC_URL ?? 'http://localhost:8085';
export const minioInternalUrl = process.env.MINIO_INTERNAL_URL ?? minioPublicUrl;

// Argo client
export const argoClientEndpoint = process.env.ARGO_CLIENT_ENDPOINT ?? 'http://localhost:8084/';
Expand Down
31 changes: 23 additions & 8 deletions controller/src/minio/minio.ts
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,10 @@ import { URL } from 'node:url';
import {
minioAccessKey,
minioBucketName,
minioInternalUrl,
minioPublicUrl,
minioRegion,
minioSecretKey,
minioUrl,
} from '../config.js';

if (!minioAccessKey) {
Expand All @@ -16,23 +17,37 @@ if (!minioSecretKey) {
throw new Error('MINIO_SECRET_KEY is not set');
}

const parsedUrl = new URL(minioUrl);
const minioClient = new MinioClient({
endPoint: parsedUrl.hostname,
port: Number.parseInt(parsedUrl.port, 10) ?? (parsedUrl.protocol === 'https:' ? 443 : 80),
useSSL: parsedUrl.protocol === 'https:',
const parsedMinioPublicUrl = new URL(minioPublicUrl);
const minioPublicClient = new MinioClient({
endPoint: parsedMinioPublicUrl.hostname,
port: Number.parseInt(parsedMinioPublicUrl.port, 10) ?? (parsedMinioPublicUrl.protocol === 'https:' ? 443 : 80),
useSSL: parsedMinioPublicUrl.protocol === 'https:',
accessKey: minioAccessKey,
secretKey: minioSecretKey,
region: minioRegion,
});
const parsedMinioInternalUrl = new URL(minioInternalUrl);
const minioInternalClient = new MinioClient({
endPoint: parsedMinioInternalUrl.hostname,
port: Number.parseInt(parsedMinioInternalUrl.port, 10) ?? (parsedMinioInternalUrl.protocol === 'https:' ? 443 : 80),
useSSL: parsedMinioInternalUrl.protocol === 'https:',
accessKey: minioAccessKey,
secretKey: minioSecretKey,
region: minioRegion,
});

export async function computePresignedPutUrl(objectName: string): Promise<string> {
const expire = 60 * 60 * 24 * 7; // 7 days
return await minioClient.presignedPutObject(minioBucketName, objectName, expire);
return await minioPublicClient.presignedPutObject(minioBucketName, objectName, expire);
}

export async function computePresignedGetUrl(objectName: string): Promise<string> {
// One hour
const expire = 3600;
return await minioClient.presignedGetObject(minioBucketName, objectName, expire);
return await minioPublicClient.presignedGetObject(minioBucketName, objectName, expire);
}

export async function getObjectSize(objectName: string): Promise<number> {
const stat = await minioInternalClient.statObject(minioBucketName, objectName);
return stat.size;
}
28 changes: 22 additions & 6 deletions controller/src/server/resolvers.ts
Original file line number Diff line number Diff line change
Expand Up @@ -26,18 +26,18 @@ import { SIMPIPE_PROJECT_LABEL } from '../k8s/label.js';
import {
createProject, deleteProject, getProject, projects, renameProject,
} from '../k8s/projects.js';
import { computePresignedGetUrl, computePresignedPutUrl } from '../minio/minio.js';
import { computePresignedGetUrl, computePresignedPutUrl, getObjectSize } from '../minio/minio.js';
import { assertPrometheusIsHealthy } from '../prometheus/prometheus.js';
import queryPrometheusResolver from '../prometheus/query-prometheus-resolver.js';
import { NotFoundError, PingError } from './apollo-errors.js';
import type { ArgoWorkflow, ArgoWorkflowTemplate } from '../argo/argo-client.js';
import type ArgoWorkflowClient from '../argo/argo-client.js';
import type K8sClient from '../k8s/k8s-client.js';
import type {
Artifact,
DryRun,
DryRunNode,
DryRunNodeArgs as DryRunNodeArguments,
DryRunNodeArtifact,
DryRunNodeMetrics,
DryRunNodeMetricsCpuSystemSecondsTotalArgs as DryRunNodeMetricsCpuSystemSecondsTotalArguments,
DryRunNodePod, DryRunNodePodLogArgs as DryRunNodePodLogArguments,
Expand Down Expand Up @@ -160,6 +160,13 @@ const resolvers = {
const { sub } = user;
return await getWorkflowTemplate(name, argoClient, sub);
},
async artifacts(
_p: EmptyParent, _a: EmptyArguments, context: AuthenticatedContext,
): Promise<Query['artifacts']> {
// eslint-disable-next-line @typescript-eslint/no-unused-vars
const { argoClient } = context;
return [];
},
} as Required<QueryResolvers<AuthenticatedContext, EmptyParent>>,
Mutation: {
async createDryRun(
Expand Down Expand Up @@ -547,16 +554,25 @@ const resolvers = {
threadsMax: queryPrometheusResolver.bind(undefined, 'simpipe_threads_max', 'main'),
ulimitsSoft: queryPrometheusResolver.bind(undefined, 'simpipe_ulimits_soft', 'main'),
},
DryRunNodeArtifact: {
Artifact: {
async url(
dryRunNodeArtifact: DryRunNodeArtifact,
): Promise<DryRunNodeArtifact['url']> {
const { key } = dryRunNodeArtifact;
artifact: Artifact,
): Promise<Artifact['url']> {
const { key } = artifact;
if (!key) {
return undefined;
}
return await computePresignedGetUrl(key);
},
async size(
artifact: Artifact,
): Promise<Artifact['size']> {
const { key } = artifact;
if (!key) {
return undefined;
}
return await getObjectSize(key);
},
},
WorkflowTemplate: {
async project(
Expand Down
11 changes: 8 additions & 3 deletions controller/src/server/schema.graphql
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,9 @@ type Query {
workflowTemplate(
""" Name of the Argo workflow template to be fetched """
name: String!): WorkflowTemplate @auth

""" List of all the artifacts """
artifacts: [Artifact!]! @auth
}

""" Project is the high level concept for a specific pipeline in SIM-PIPE. It contains all the previous dry runs of the pipeline and its results. """
Expand Down Expand Up @@ -229,10 +232,10 @@ type DryRunNodePod implements DryRunNode {
log(maxLines: Int, grep: String, sinceSeconds: Int, sinceTime: Int): [String!]

""" Input artifacts of the node. """
inputArtifacts: [DryRunNodeArtifact!]
inputArtifacts: [Artifact!]

""" Output artifacts of the node. """
outputArtifacts: [DryRunNodeArtifact!]
outputArtifacts: [Artifact!]
}

type DryRunNodeMisc implements DryRunNode {
Expand Down Expand Up @@ -301,13 +304,15 @@ type DryRunNodeMetrics {
}

""" Contains information about files produced during a dry run execution """
type DryRunNodeArtifact {
type Artifact {
""" The artifact name """
name: String!
""" The artifact path """
key: String
""" URL to download the artifact using an HTTP GET."""
url: String
""" The artifact size """
size: Int
}

"""
Expand Down
56 changes: 31 additions & 25 deletions controller/src/server/schema.ts
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,19 @@ export type Scalars = {
TimeStamp: { input: number; output: number; }
};

/** Contains information about files produced during a dry run execution */
export type Artifact = {
__typename?: 'Artifact';
/** The artifact path */
key?: Maybe<Scalars['String']['output']>;
/** The artifact name */
name: Scalars['String']['output'];
/** The artifact size */
size?: Maybe<Scalars['Int']['output']>;
/** URL to download the artifact using an HTTP GET. */
url?: Maybe<Scalars['String']['output']>;
};

/** The input data to create a new dry run */
export type CreateDryRunInput = {
/**
Expand Down Expand Up @@ -139,17 +152,6 @@ export type DryRunNode = {
type: DryRunNodeType;
};

/** Contains information about files produced during a dry run execution */
export type DryRunNodeArtifact = {
__typename?: 'DryRunNodeArtifact';
/** The artifact path */
key?: Maybe<Scalars['String']['output']>;
/** The artifact name */
name: Scalars['String']['output'];
/** URL to download the artifact using an HTTP GET. */
url?: Maybe<Scalars['String']['output']>;
};

/** Prometheus metrics for the node. */
export type DryRunNodeMetrics = {
__typename?: 'DryRunNodeMetrics';
Expand Down Expand Up @@ -569,14 +571,14 @@ export type DryRunNodePod = DryRunNode & {
finishedAt?: Maybe<Scalars['String']['output']>;
id: Scalars['String']['output'];
/** Input artifacts of the node. */
inputArtifacts?: Maybe<Array<DryRunNodeArtifact>>;
inputArtifacts?: Maybe<Array<Artifact>>;
/** The logs of the node. */
log?: Maybe<Array<Scalars['String']['output']>>;
/** The name of the pod. */
metrics: DryRunNodeMetrics;
name: Scalars['String']['output'];
/** Output artifacts of the node. */
outputArtifacts?: Maybe<Array<DryRunNodeArtifact>>;
outputArtifacts?: Maybe<Array<Artifact>>;
phase: DryRunNodePhase;
/** The name of the pod. */
podName: Scalars['String']['output'];
Expand Down Expand Up @@ -823,6 +825,8 @@ export type PrometheusSample = {

export type Query = {
__typename?: 'Query';
/** List of all the artifacts */
artifacts: Array<Artifact>;
/** List of all docker registry credentials */
dockerRegistryCredentials: Array<DockerRegistryCredential>;
/** Get a dry run by ID */
Expand Down Expand Up @@ -952,6 +956,7 @@ export type ResolversInterfaceTypes<RefType extends Record<string, unknown>> = {
export type ResolversTypes = {
ArgoWorkflow: ResolverTypeWrapper<Scalars['ArgoWorkflow']['output']>;
ArgoWorkflowTemplate: ResolverTypeWrapper<Scalars['ArgoWorkflowTemplate']['output']>;
Artifact: ResolverTypeWrapper<Artifact>;
Boolean: ResolverTypeWrapper<Scalars['Boolean']['output']>;
CreateDryRunInput: CreateDryRunInput;
CreateProjectInput: CreateProjectInput;
Expand All @@ -960,7 +965,6 @@ export type ResolversTypes = {
DockerRegistryCredentialInput: DockerRegistryCredentialInput;
DryRun: ResolverTypeWrapper<DryRun>;
DryRunNode: ResolverTypeWrapper<ResolversInterfaceTypes<ResolversTypes>['DryRunNode']>;
DryRunNodeArtifact: ResolverTypeWrapper<DryRunNodeArtifact>;
DryRunNodeMetrics: ResolverTypeWrapper<DryRunNodeMetrics>;
DryRunNodeMisc: ResolverTypeWrapper<DryRunNodeMisc>;
DryRunNodePhase: DryRunNodePhase;
Expand All @@ -985,6 +989,7 @@ export type ResolversTypes = {
export type ResolversParentTypes = {
ArgoWorkflow: Scalars['ArgoWorkflow']['output'];
ArgoWorkflowTemplate: Scalars['ArgoWorkflowTemplate']['output'];
Artifact: Artifact;
Boolean: Scalars['Boolean']['output'];
CreateDryRunInput: CreateDryRunInput;
CreateProjectInput: CreateProjectInput;
Expand All @@ -993,7 +998,6 @@ export type ResolversParentTypes = {
DockerRegistryCredentialInput: DockerRegistryCredentialInput;
DryRun: DryRun;
DryRunNode: ResolversInterfaceTypes<ResolversParentTypes>['DryRunNode'];
DryRunNodeArtifact: DryRunNodeArtifact;
DryRunNodeMetrics: DryRunNodeMetrics;
DryRunNodeMisc: DryRunNodeMisc;
DryRunNodePod: DryRunNodePod;
Expand All @@ -1019,6 +1023,14 @@ export interface ArgoWorkflowTemplateScalarConfig extends GraphQLScalarTypeConfi
name: 'ArgoWorkflowTemplate';
}

export type ArtifactResolvers<ContextType = any, ParentType extends ResolversParentTypes['Artifact'] = ResolversParentTypes['Artifact']> = {
key?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
name?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
size?: Resolver<Maybe<ResolversTypes['Int']>, ParentType, ContextType>;
url?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
__isTypeOf?: IsTypeOfResolverFn<ParentType, ContextType>;
};

export type DockerRegistryCredentialResolvers<ContextType = any, ParentType extends ResolversParentTypes['DockerRegistryCredential'] = ResolversParentTypes['DockerRegistryCredential']> = {
name?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
server?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
Expand Down Expand Up @@ -1052,13 +1064,6 @@ export type DryRunNodeResolvers<ContextType = any, ParentType extends ResolversP
type?: Resolver<ResolversTypes['DryRunNodeType'], ParentType, ContextType>;
};

export type DryRunNodeArtifactResolvers<ContextType = any, ParentType extends ResolversParentTypes['DryRunNodeArtifact'] = ResolversParentTypes['DryRunNodeArtifact']> = {
key?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
name?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
url?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
__isTypeOf?: IsTypeOfResolverFn<ParentType, ContextType>;
};

export type DryRunNodeMetricsResolvers<ContextType = any, ParentType extends ResolversParentTypes['DryRunNodeMetrics'] = ResolversParentTypes['DryRunNodeMetrics']> = {
cpuSystemSecondsTotal?: Resolver<Maybe<Array<ResolversTypes['PrometheusSample']>>, ParentType, ContextType, Partial<DryRunNodeMetricsCpuSystemSecondsTotalArgs>>;
cpuUsageSecondsTotal?: Resolver<Maybe<Array<ResolversTypes['PrometheusSample']>>, ParentType, ContextType, Partial<DryRunNodeMetricsCpuUsageSecondsTotalArgs>>;
Expand Down Expand Up @@ -1126,11 +1131,11 @@ export type DryRunNodePodResolvers<ContextType = any, ParentType extends Resolve
exitCode?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
finishedAt?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
id?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
inputArtifacts?: Resolver<Maybe<Array<ResolversTypes['DryRunNodeArtifact']>>, ParentType, ContextType>;
inputArtifacts?: Resolver<Maybe<Array<ResolversTypes['Artifact']>>, ParentType, ContextType>;
log?: Resolver<Maybe<Array<ResolversTypes['String']>>, ParentType, ContextType, Partial<DryRunNodePodLogArgs>>;
metrics?: Resolver<ResolversTypes['DryRunNodeMetrics'], ParentType, ContextType>;
name?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
outputArtifacts?: Resolver<Maybe<Array<ResolversTypes['DryRunNodeArtifact']>>, ParentType, ContextType>;
outputArtifacts?: Resolver<Maybe<Array<ResolversTypes['Artifact']>>, ParentType, ContextType>;
phase?: Resolver<ResolversTypes['DryRunNodePhase'], ParentType, ContextType>;
podName?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
progress?: Resolver<Maybe<ResolversTypes['String']>, ParentType, ContextType>;
Expand Down Expand Up @@ -1199,6 +1204,7 @@ export interface PrometheusStringNumberScalarConfig extends GraphQLScalarTypeCon
}

export type QueryResolvers<ContextType = any, ParentType extends ResolversParentTypes['Query'] = ResolversParentTypes['Query']> = {
artifacts?: Resolver<Array<ResolversTypes['Artifact']>, ParentType, ContextType>;
dockerRegistryCredentials?: Resolver<Array<ResolversTypes['DockerRegistryCredential']>, ParentType, ContextType>;
dryRun?: Resolver<ResolversTypes['DryRun'], ParentType, ContextType, RequireFields<QueryDryRunArgs, 'dryRunId'>>;
ping?: Resolver<ResolversTypes['String'], ParentType, ContextType>;
Expand All @@ -1222,10 +1228,10 @@ export type WorkflowTemplateResolvers<ContextType = any, ParentType extends Reso
export type Resolvers<ContextType = any> = {
ArgoWorkflow?: GraphQLScalarType;
ArgoWorkflowTemplate?: GraphQLScalarType;
Artifact?: ArtifactResolvers<ContextType>;
DockerRegistryCredential?: DockerRegistryCredentialResolvers<ContextType>;
DryRun?: DryRunResolvers<ContextType>;
DryRunNode?: DryRunNodeResolvers<ContextType>;
DryRunNodeArtifact?: DryRunNodeArtifactResolvers<ContextType>;
DryRunNodeMetrics?: DryRunNodeMetricsResolvers<ContextType>;
DryRunNodeMisc?: DryRunNodeMiscResolvers<ContextType>;
DryRunNodePod?: DryRunNodePodResolvers<ContextType>;
Expand Down
9 changes: 9 additions & 0 deletions frontend/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading
Loading