Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/async start of serve #15972

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

GitAlexxx
Copy link

Before executing serve() - when running configured deployments in bulk via a script - it is often necessary to perform an initialization function (create a block, create/reset a concurrency tag, and others). It is normal that it can be asynchronous (in addition to the fact that you want to have a completely asynchronous code, get_client c sync_client=True is not functionally equal to asynchronous get_client). The current implementation does not allow you to run the described case in main as asyncio.run(main()).

Alexander Sidorenko added 2 commits November 11, 2024 16:53
Before executing serve() - when running configured deployments in bulk via a script - it is often necessary to perform an initialization function (create a block, create/reset a concurrency tag, and others). It is normal that it can be asynchronous (in addition to the fact that you want to have a completely asynchronous code, get_client c sync_client=True is not functionally equal to asynchronous get_client). The current implementation does not allow you to run the described case in main as asyncio.run(main()).
Copy link

codspeed-hq bot commented Nov 11, 2024

CodSpeed Performance Report

Merging #15972 will not alter performance

Comparing GitAlexxx:feat/async_start_of_serve (d0a8312) with main (2bf9d05)

Summary

✅ 3 untouched benchmarks

@desertaxle
Copy link
Member

Thanks for opening a PR @GitAlexxx! Can you elaborate on the situation you're looking to improve? Even though serve is a sync function, you can still call it from an async context. A short script where you're seeing a problem would be handy.

@GitAlexxx
Copy link
Author

GitAlexxx commented Nov 11, 2024

Thanks for opening a PR @GitAlexxx! Can you elaborate on the situation you're looking to improve? Even though serve is a sync function, you can still call it from an async context. A short script where you're seeing a problem would be handy.

Hi @desertaxle !

Yes, serve() can be called in any context, but the commit refers more to the context itself - now it is not possible to perform preparatory asynchronous functions BEFORE calling serve(), now this can only be done by explicitly calling the Runner object (in fact, simply rewriting the existing serve). Previously, in perfect 2.X, serve() was asynchronous and did not implement work with event loop internally, there is not even any transitional compatibility with the codebase + I would like to use uvloop.run() for known performance reasons (but more on that in another commit).
Commit does not change the existing approach, but expands it.

An example of the case described in the commit description (just an example, the point is to be able to run asynchronous functions before serve()):

import asyncio

from prefect import serve, get_client, flow


@flow(name='clustering-flow', log_prints=True)
async def clustering_flow():
    print('Clustering results')


async def init() -> None:
    await create_heavy_concurrency_limit()
    await reset_heavy_concurrency_limit()


async def create_heavy_concurrency_limit() -> None:
    async with get_client() as client:
        await client.create_concurrency_limit(tag='heavy-tag', concurrency_limit=1)


async def reset_heavy_concurrency_limit() -> None:
    async with get_client() as client:
        await client.reset_concurrency_limit_by_tag(tag='heavy-tag')


async def main() -> None:
    await init()  # this won't work in the current prefect implementation.

    clustering_deploy = await clustering_flow.to_deployment(name='clustering-deployment')

    await serve(clustering_deploy)


if __name__ == '__main__':
    asyncio.run(main())

@desertaxle
Copy link
Member

Thanks for the example @GitAlexxx!

In 3.0, we decided to make serve only synchronous because our @sync_compatible hides typing information on decorated functions, and since serve runs indefinitely, we thought that there was less utility in having an async interface for serve.

I can think of two ways we could approach this:

  1. We expose a loop kwarg on serve, allowing users to provide a running event loop. I think that will allow you to use uvloop instead of asyncio, but I'm not entirely sure.
  2. We add a aserve with is an explicitly async version of serve.

If approach 1 works, I would prefer it over approach 2 to avoid duplicating implementation between serve and aserve, but let me know if approach 1 would work for your usecase!

@GitAlexxx
Copy link
Author

Thanks for the example @GitAlexxx!

In 3.0, we decided to make serve only synchronous because our @sync_compatible hides typing information on decorated functions, and since serve runs indefinitely, we thought that there was less utility in having an async interface for serve.

I can think of two ways we could approach this:

  1. We expose a loop kwarg on serve, allowing users to provide a running event loop. I think that will allow you to use uvloop instead of asyncio, but I'm not entirely sure.
  2. We add a aserve with is an explicitly async version of serve.

If approach 1 works, I would prefer it over approach 2 to avoid duplicating implementation between serve and aserve, but let me know if approach 1 would work for your usecase!

@desertaxle, thanks for the clarifications regarding serve() and the reasons for its complete synchronicity, this is really important.
I also note that a simple call to serve() is not always enough for more complex deployment scenarios (including using asynchronous initialization functions), of course there are fewer such cases, but this is not a reason not to consider them.

I have considered both suggested options: the first option solves only the problem with the event loop used (asyncio/uvloop), the second option (creating aserve) solves both problems: setting up an asynchronous context and using its own event loop.
I can't completely agree about the duplication of functionality. By design, they may look similar, but they solve completely different tasks, aserve() not only adds compatibility with the existing user code base, but allows you to configure deployments more flexibly in complex scenarios.

In this regard, I am sending a commit to add aserve().

Copy link
Member

@desertaxle desertaxle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GitAlexxx that approach makes sense! Could you add some test coverage to the new aserve function? Also, I think it'd make sense to remove the event loop handling from serve and instead raise an error when serve is used from a synchronous context that directs users to use aserve instead.

Let me know if you have any questions about those requests!

@GitAlexxx
Copy link
Author

GitAlexxx commented Nov 16, 2024

@GitAlexxx that approach makes sense! Could you add some test coverage to the new aserve function? Also, I think it'd make sense to remove the event loop handling from serve and instead raise an error when serve is used from a synchronous context that directs users to use aserve instead.

Let me know if you have any questions about those requests!

@desertaxle, thanks for the idea, it really differentiates the use of serv/aserv more. I am sending a commit with changes and test coverage.

During the writing of tests (as it happens)) I noticed the serve in the Flow class (I do not use such a launch myself in projects - a separate function is preferable). In fact, it is now repeating serve before the change - with event loop processing and no asynchronous version.

Do you think it makes sense to change it? It seems to me that this method makes no sense at all, given that it simply duplicates the existing serve and does not solve any new task (in the new implementation, separate serve and aserve cover all possible launch cases).

@desertaxle
Copy link
Member

During the writing of tests (as it happens)) I noticed the serve in the Flow class (I do not use such a launch myself in projects - a separate function is preferable). In fact, it is now repeating serve before the change - with event loop processing and no asynchronous version.

Do you think it makes sense to change it? It seems to me that this method makes no sense at all, given that it simply duplicates the existing serve and does not solve any new task (in the new implementation, separate serve and aserve cover all possible launch cases.

Yes, I think it makes sense to update Flow.serve since it is a useful, convenient interface for users. We should update it to have an explicit sync and aysnc interface and delegate execution to serve and aserve, but that can be handled in a seperate PR and doesn't need to be addressed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants