[controller][server][vpj][samza][producer] Use lazy initialization of producer in VeniceWriter #713

Closed
sushantmane wants to merge 2 commits

Conversation

sushantmane
Contributor

@sushantmane commented Oct 25, 2023

Use lazy initialization of producer in VeniceWriter

My old PR #224 removed the lazy producer init in VWFactory; this PR brings it back.

How was this PR tested?

CI

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@sixpluszero previously approved these changes Oct 25, 2023
Contributor

@sixpluszero left a comment

LGTM except for the clarification above.

OPEN_VENICE_WRITER_COUNT.decrementAndGet();
} catch (Exception e) {
logger.warn("Swallowed an exception while trying to close the VeniceWriter for topic: {}", topicName, e);
VENICE_WRITER_CLOSE_FAILED_COUNT.incrementAndGet();
}
- threadPoolExecutor.shutdown();
+ getThreadPoolExecutor().shutdown();
Contributor

This might need a bit of improvement: shutdown only makes sure no new tasks will be accepted. Take a look at our conventions for closing a thread pool, e.g., VeniceParentHelixAdmin.close or elsewhere.

Contributor Author

Do you mean using shutdownNow to interrupt the running threads?

Contributor

Yes, shutdownNow + awaitTermination with a limited timeout. Does that make sense?
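
For reference, a minimal sketch of the shutdown-then-shutdownNow pattern being suggested here; the 5-second timeout is an illustrative assumption, and the actual convention in VeniceParentHelixAdmin.close may differ:

```java
// Illustrative sketch only; assumes java.util.concurrent.TimeUnit is imported.
threadPoolExecutor.shutdown();                    // stop accepting new tasks
try {
  if (!threadPoolExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
    threadPoolExecutor.shutdownNow();             // interrupt tasks that are still running
  }
} catch (InterruptedException e) {
  threadPoolExecutor.shutdownNow();
  Thread.currentThread().interrupt();             // preserve the interrupt status
}
```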

OPEN_VENICE_WRITER_COUNT.decrementAndGet();
} catch (Exception e) {
logger.warn("Swallowed an exception while trying to close the VeniceWriter for topic: {}", topicName, e);
VENICE_WRITER_CLOSE_FAILED_COUNT.incrementAndGet();
}
- threadPoolExecutor.shutdown();
+ getThreadPoolExecutor().shutdown();
Contributor

Ditto.

@@ -87,7 +88,7 @@ public class VeniceWriter<K, V, U> extends AbstractVeniceWriter<K, V, U> {
new ChunkedPayloadAndManifest(null, null);

// use for running async close and to fetch number of partitions with timeout from producer
- private final ThreadPoolExecutor threadPoolExecutor;
+ private final Lazy<ThreadPoolExecutor> threadPoolExecutorLazy = Lazy.of(this::createThreadPoolExecutor);
Contributor

Should we make this static? Do we really want one thread pool per VW? We can have a lot of VWs per server...

Contributor

But with a single pool, you won't be able to limit the thread count per VW? Or is it still possible?

Contributor

The limit would need to be higher, but do we really need dedicated per-VW threads? If we have 1K VWs in a server, do we really want 2K threads spread across 1K TPs? Would it be enough to have a single static TP with, e.g., max 100 threads, where other tasks get queued if we tap out all the threads?

Contributor

Hmm, yeah, that's probably more reasonable.
I thought the 2 threads per VW were required for XInfra purposes. Maybe @sushantmane knows better about this...

Contributor Author

@sushantmane Oct 25, 2023

Originally I was thinking about having just one ThreadPool for all VWs, but I wanted to let callers have some control over the concurrency of close operations, which is where this thread pool is primarily used. Making this a static pool would mean that the concurrency isn't fully controlled by the callers.

Contributor

Ah I see... so I guess it could be fine then...
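
For reference, a hypothetical sketch of the shared static pool discussed above; the field name, the 100-thread cap, and the idle-timeout choice are illustrative assumptions, not part of this PR:

```java
// Hypothetical alternative: one bounded pool shared by all VeniceWriter instances.
// Assumes java.util.concurrent.{ThreadPoolExecutor, LinkedBlockingQueue, TimeUnit} are imported.
private static final ThreadPoolExecutor SHARED_VW_EXECUTOR = createSharedExecutor(100);

private static ThreadPoolExecutor createSharedExecutor(int maxThreads) {
  ThreadPoolExecutor executor = new ThreadPoolExecutor(
      maxThreads, maxThreads,               // at most maxThreads threads across all writers
      60L, TimeUnit.SECONDS,                // idle threads are reclaimed after a minute
      new LinkedBlockingQueue<>());         // additional tasks queue up once all threads are busy
  executor.allowCoreThreadTimeOut(true);    // allow the pool to shrink to zero when idle
  return executor;
}
```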

Contributor Author

@FelixGV - Should we merge this PR then?

Contributor Author

@sushantmane Oct 26, 2023

I'm changing the scope of this PR slightly:

  1. Keep ThreadPoolExecutor per VW for now
  2. Keep lazy init of PubSubProducer and ThreadPool
  3. Close segments concurrently (VW::endAllSegments); a rough sketch follows below

Do let me know if there are other things that I should address. Thanks.
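
For item 3, a rough sketch of what concurrent segment closing could look like; endSegment, numberOfPartitions, and the per-task error handling are assumptions about VeniceWriter internals rather than the actual implementation:

```java
// Rough sketch only: submit one end-of-segment task per partition, then wait with a bounded timeout.
// Assumes java.util.concurrent.{Future, TimeUnit} and java.util.{ArrayList, List} are imported.
public void endAllSegments(boolean finalSegment) {
  List<Future<?>> futures = new ArrayList<>(numberOfPartitions);
  for (int partition = 0; partition < numberOfPartitions; partition++) {
    int p = partition;
    futures.add(threadPoolExecutorLazy.get().submit(() -> endSegment(p, finalSegment)));
  }
  for (Future<?> future: futures) {
    try {
      future.get(closeTimeOutInMs, TimeUnit.MILLISECONDS);   // bound the wait per segment
    } catch (Exception e) {
      logger.warn("Failed to end a segment for topic: {}", topicName, e);
    }
  }
}
```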

Contributor Author

Updated the PR with the above changes.

Contributor

Thanks, I left another review.

@sushantmane marked this pull request as draft October 25, 2023 18:51
@nisargthakkar
Contributor

Since this touches VW, can you update the affected components to include everything that uses VW (or use all)? Probably only routers, TC, and FC are excluded.

controller, server, dvc, vpj, samza, producer, ...

@sushantmane changed the title from [controller][server] Use lazy initialization of producer in VeniceWriter to [controller][server][vpj][samza][producer] Use lazy initialization of producer in VeniceWriter on Oct 27, 2023
@sushantmane marked this pull request as ready for review October 27, 2023 11:24
Contributor

@FelixGV left a comment

Left some minor comments. Thanks for working on this.

@@ -41,6 +43,7 @@ public <K, V, U> VeniceWriter<K, V, U> createVeniceWriter(VeniceWriterOptions op
return new VeniceWriter<>(
options,
props,
- producerAdapterFactory.create(props, options.getTopicName(), options.getBrokerAddress()));
+ Lazy.of(() -> producerAdapterFactory.create(props, options.getTopicName(), options.getBrokerAddress())),
Contributor

@FelixGV Oct 27, 2023

I'm curious about this... have you found specific code paths that were instantiating a VW but then not actually using it to produce anything?

In many places, we are already wrapping the VW into Lazy, or else use other patterns of lazy init (e.g. computeIfAbsent). So now we'll be doing double-lazy wrapping in many places. In theory it's fine but it seems a bit dirty to me... maybe the clean thing to do is to remove the "outer" lazy wrapping and just keep this inner one, or perhaps to make the remaining unwrapped VW into Lazy<VW> (and avoid the inner lazy). I don't have a strong opinion either way but I feel like it's something we should consider.

Anyway, I don't mean to say you should embark on a big refactoring in this PR, but I am interested in hearing your opinion on the above... I imagine you looked at this code more recently than I did if you're doing this, so maybe you caught some interesting context IDK about along the way.
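
To illustrate the double wrapping being described, a hypothetical call site; the byte[] type parameters and the veniceWriterFactory variable name are placeholders, not code from this PR:

```java
// Outer lazy at the call site: the writer itself is only built on first use.
Lazy<VeniceWriter<byte[], byte[], byte[]>> lazyWriter =
    Lazy.of(() -> veniceWriterFactory.createVeniceWriter(options));

// Inner lazy inside the writer (this PR's diff above): the producer is only built on first use, via
// Lazy.of(() -> producerAdapterFactory.create(props, options.getTopicName(), options.getBrokerAddress())).
// With both in place, the producer ends up two Lazy layers away from the caller.
```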

Contributor Author

You're right. I just double-checked the code, and, except for two places (neither of which is in the server startup path), we're indeed initializing VW only when we need it. One trigger for adding lazy initialization of VW's producer was a hazy reading of the logs, where I saw producers being created one after another during startup. I should have properly checked the code before adding this.

Will update the PR to remove lazy init. Thanks!

Comment on lines +367 to +373
private PubSubProducerAdapter getProducerAdapter() {
return producerAdapterLazy.get();
}

private ThreadPoolExecutor getThreadPoolExecutor() {
return threadPoolExecutorLazy.get();
}
Contributor

I think personally I would prefer not hiding usages of Lazy in this way. I prefer letting the producerAdapterLazy.get() be present at the call site, as it gives maintainers a nudge to think about whether they really want to initialize the Lazy instance. In some cases, it may be desirable to only use the lazy if it's already initialized, by calling ifPresent(instance -> { ... }) on it.

Contributor Author

@sushantmane Oct 28, 2023

That makes sense. Will update the PR.

Comment on lines +411 to +417
getProducerAdapter().close(topicName, closeTimeOutInMs, gracefulClose);
OPEN_VENICE_WRITER_COUNT.decrementAndGet();
} catch (Exception e) {
logger.warn("Swallowed an exception while trying to close the VeniceWriter for topic: {}", topicName, e);
VENICE_WRITER_CLOSE_FAILED_COUNT.incrementAndGet();
}
- threadPoolExecutor.shutdown();
+ getThreadPoolExecutor().shutdown();
Contributor

Wrap inside ifPresent rather than doing a late initialization just for the sake of tearing it down right after...
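
A minimal sketch of the suggested fix, assuming the Lazy utility's ifPresent mentioned earlier in this review accepts a consumer:

```java
// Only shut the executor down if something actually initialized it;
// avoid creating the pool just to tear it down.
threadPoolExecutorLazy.ifPresent(ThreadPoolExecutor::shutdown);
```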

Contributor Author

Good catch! Thanks

@sushantmane closed this Nov 3, 2023
@sushantmane deleted the li-fix-flakies branch November 3, 2023 02:44
@sushantmane restored the li-fix-flakies branch November 3, 2023 02:44
@sushantmane reopened this Nov 3, 2023
@sushantmane marked this pull request as draft November 6, 2023 08:07
@sushantmane
Contributor Author

sushantmane commented Nov 6, 2023

I'm gonna put it in draft mode because I want to take some time to think it over and make the PR better.

PR #720 has substantially reduced the number of MetaStore VWs per host. I'm considering updating the PR based on Felix's suggestion to use a single thread pool for all VWs instead of a thread pool per VW.

I would also like to consider sending CMs concurrently.

@sushantmane
Contributor Author

Closing this PR for now. I would like to complete it at some point.
