Separate function logs from the other logs #1400

Open
akihikokuroda opened this issue Jul 8, 2024 · 11 comments
@akihikokuroda
Collaborator

What is the expected enhancement?

A function may be provided by a party different from the one that executes it. In that case it may be better to separate the function's log from the rest of the job log.

@akihikokuroda
Collaborator Author

Here is an idea to implement this.

Each log entry from the function would be tagged (marked) like [Function: <function name>] contents [End Function], either by adding the tag manually or by using a helper provided in the Serverless SDK.
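A minimal sketch of how such a tag could be produced; the helper name function_log and the default function name are hypothetical, not an existing Serverless SDK API:

    def function_log(message: str, function_name: str = "my-function") -> None:
        # Wrap each log line in the proposed markers so it can be filtered later.
        print(f"[Function: {function_name}] {message} [End Function]")

    function_log("optimization step finished")
    # prints: [Function: my-function] optimization step finished [End Function]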

Then

The scheduler filters the log entries, puts them into a new function_log field of the Job, and a new gateway API returns the contents of that field

OR

The gateway filters the log and provides APIs that return either the function log or the rest of the log
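Either way, the filtering step could look like this sketch, assuming the tag format above (the regex and function name are illustrative only):

    import re

    # Split a raw job log into the tagged function log and everything else.
    TAG = re.compile(r"\[Function: (?P<name>[^\]]+)\] (?P<body>.*) \[End Function\]")

    def split_logs(raw_logs: str) -> tuple[str, str]:
        function_lines, other_lines = [], []
        for line in raw_logs.splitlines():
            match = TAG.search(line)
            if match:
                function_lines.append(match.group("body"))
            else:
                other_lines.append(line)
        return "\n".join(function_lines), "\n".join(other_lines)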

WDYT @Tansito @psschwei @IceKhan13

@akihikokuroda
Collaborator Author

Do we need to emit the function log somewhere?

akihikokuroda self-assigned this Jul 8, 2024
@psschwei
Collaborator

psschwei commented Jul 8, 2024

One could make the argument that end users just need the function output/result and logs should only be available to the provider. That would probably require providers to generate something to return when jobs fail (so would shift some ownership to them), but it would also remove the need for us to filter logs (reducing our responsibility a bit, which I also like).

@Tansito
Member

Tansito commented Jul 9, 2024

Yeah, we have been discussing good approaches to this, and given the deadlines we have, our proposal would be:

  • Logs from provider functions are going to be accessible only to provider users

For this, the proposal I was thinking of is something similar to what Paul is saying:

  • A user calls the /job/:id/logs end-point
    • if the user is a provider, we return the logs (we know that a user is a provider because a provider has an admin_group assigned); see the sketch after this list
    • if not, we return a message
  • A provider can obtain job ids from its functions through /function/:id/jobs
    • This way providers can discover jobs that are being executed by their functions.
  • We need to settle on a structure for the result field, something like:
    • Format: { status: "ERROR_CODE_PRE_DEFINED", result: "" }
    • This way, as Paul said, providers can manage their errors and show the user what they want
  • In the Runner, provide a logger to be used by the providers (maybe the one from ray).
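A minimal sketch of the access check for /job/:id/logs described in the first item; the program.provider relation, the admin_group_id field, and the message text are assumptions, not the gateway's actual code:

    def can_read_logs(user, job) -> bool:
        # A job from a provider function is readable only by members of that
        # provider's admin group; the field names here are illustrative.
        provider = getattr(job.program, "provider", None)
        if provider is None:
            return True  # function created by the user themselves: current behavior
        return user.groups.filter(id=provider.admin_group_id).exists()

    def job_logs_response(user, job) -> dict:
        if can_read_logs(user, job):
            return {"logs": job.logs}
        return {"logs": "Logs are only available to the provider of this function."}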

WDYT?

@akihikokuroda
Collaborator Author

A user calls the /job/:id/logs end-point
if the user is a provider, we return the logs (we know that a user is a provider because a provider has an admin_group assigned)
if not, we return a message

Our log is a string returned by ray.job_submission.JobSubmissionClient.get_job_logs. It is saved into Job.logs and retrieved via api/v1/job/<job id>/logs.
So this API would be changed to return the log only to the provider.
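For reference, a minimal sketch of how that string is fetched from Ray (the head-node address and job id are placeholders):

    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient("http://127.0.0.1:8265")  # head-node address is illustrative
    raw_logs = client.get_job_logs("raysubmit_XXXXXXXXXXXX")  # placeholder id; returns one string
    # The gateway saves this string into Job.logs and serves it via api/v1/job/<job id>/logs.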

A provider can obtain job ids from its functions through /function/:id/jobs

Here is the relationship among jobs and functions:
middleware job id <1 - 1> function <1 - m> runtime job id
These may be useful APIs for the provider:

  1. retrieve middleware job ids of Function executions to get the logs of execution
  2. retrieve runtime job ids of Middleware job

We need to settle on a structure for the result field, something like:
Format: { status: "ERROR_CODE_PRE_DEFINED", result: "" }

The result is the return value of the Runner.run function. It must be a JSON string now. We should probably put some recommendations in the SDK doc.
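A sketch of what such a recommendation could look like; the status codes and payload keys are assumptions, not a defined contract:

    import json

    # Recommended shape for the JSON string returned by Runner.run.
    success = json.dumps({"status": "OK", "result": {"expectation_value": 0.42}})
    failure = json.dumps({"status": "ERROR_INVALID_INPUT", "result": ""})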

In the Runner, provide a logger to be used by the providers (maybe the one from ray).

We can pre-configure a logger in main.tmpl to output to a local file and push its contents to a new job field (like job.function_logs) at the end of execution.
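A minimal sketch of such a pre-configured logger; the logger name and file path are assumptions about what main.tmpl could set up:

    import logging

    function_logger = logging.getLogger("function")
    handler = logging.FileHandler("/tmp/function.log")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    function_logger.addHandler(handler)
    function_logger.setLevel(logging.INFO)

    # At the end of execution the scheduler could read /tmp/function.log and
    # store its contents in a new field such as job.function_logs.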

@Tansito
Member

Tansito commented Jul 9, 2024

So this API would be changed to return the log only to the provider.

Just to clarify this point, Aki. We are going to continue supporting the current logic: for qiskit functions created by users, the user will continue to have access to their logs. The difference is that now we are going to check whether the job comes from a qiskit function shared by a provider, and in that case only the provider will be able to read those logs.

These may be useful APIs for the provider:

  1. retrieve middleware job ids of Function executions to get the logs of execution
  2. retrieve runtime job ids of Middleware job

Exactly. What we are trying to do with this is offer a way to analyze executions. Maybe we could also add some filters to the end-point, like the status, in case the provider wants to analyze FAILED jobs.

  1. Yes
  2. I didn't think about this use-case but I think it can make sense, yep. We can discuss it with @pandasa123

The result is the return value of the Runner.run function. It must be a JSON string now. We should probably put some recommendations in the SDK doc.

Starting with a recommendation could work, yep.

We can pre-configure a logger in main.tmpl to output to a local file and push its contents to a new job field (like job.function_logs) at the end of execution.

I would like to start with something simple. Just changing our current print in the runner to a logger is more than enough in this case.
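That is, something as small as this sketch (the logger name and message are illustrative):

    import logging

    logger = logging.getLogger("runner")

    # before
    print("executing function entrypoint")
    # after
    logger.info("executing function entrypoint")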

@akihikokuroda
Collaborator Author

OK. The required changes right now are:

  1. Change the job.logs() API to check the user; if the job is executing a provider function, change the output accordingly
  2. Provide a new API doing "retrieve middleware job ids of Function executions to get the logs of execution"

@Tansito
Member

Tansito commented Jul 9, 2024

Yep, basically that!

@akihikokuroda
Collaborator Author

provide new api doing "retrieve middleware job ids of Function executions to get the logs of execution"

For this, a new API, /api/v1/programs/<program id>/get_jobs, is added to the gateway.

Where should this API be called from in the client?

  1. add get_jobs(function: QiskitFunction) -> List[str] (job ids) to the ServerlessClient
  2. add get_jobs() -> List[str] (job ids) to QiskitFunction (see the sketch below)
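A minimal sketch of option 2, assuming it delegates to the new gateway end-point; the gateway_url, token, and program_id attributes and the response shape are assumptions, not the existing SDK API:

    from typing import List

    import requests

    class QiskitFunction:
        def __init__(self, gateway_url: str, token: str, program_id: str):
            self.gateway_url = gateway_url
            self.token = token
            self.program_id = program_id

        def get_jobs(self) -> List[str]:
            # Return the middleware job ids of this function's executions.
            response = requests.get(
                f"{self.gateway_url}/api/v1/programs/{self.program_id}/get_jobs",
                headers={"Authorization": f"Bearer {self.token}"},
            )
            response.raise_for_status()
            return [job["id"] for job in response.json()]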

any opinion? @psschwei @Tansito @IceKhan13

@Tansito
Member

Tansito commented Jul 11, 2024

I was thinking of a workflow like:

    function = serverless.get("my-first-pattern")
    function.get_jobs()

@akihikokuroda
Collaborator Author

OK. It seems good. Thanks!
