Support node type based allocation #205
Conversation
@confluentinc It looks like @whynick1 just signed our Contributor License Agreement. 👍 Always at your service, clabot |
@ewencp could use your feedback here. I'm a bit surprised that it seems ducktape doesn't understand different types of nodes. It appears that by default all nodes are just grabbed from a pool. Kafka seems to get around this by having everything (including tests and ZK) available in all images. I kind of feel we might be "doing it wrong" or missing something. Could use your guidance here. |
You're not doing it wrong, you're just running into some simplifying assumptions that we haven't had to update for almost 4 years now :) We started out with a super simple model b/c with the Vagrant setup, combined with the fact that originally we were just booting up a base image and then loading in the extra software, it was fine to assume everything was symmetric and we could have a simple pool. Things have definitely gotten a bit more complex, but for most testing in AK and Confluent Platform, we still just assume uniformity. On the Confluent Platform side, this works out because we actually pre-bake an AMI and use vagrant-aws to launch a bunch of instances. That said, I can see how this wouldn't be ideal for everyone :)

In fact, we have run into a bit of this wrt OS support -- testing on mixed OS clusters (e.g. testing a Windows-based client against Linux-based services) requires a mixed cluster and some smarts on allocating nodes. See #155 for where most of the additions to support that were made. tbh, those additions were almost 2 years ago and I'm forgetting the context and details on them, but the general idea is that we wanted to extend nodes in the cluster with additional parameters and allow tests and services to filter on them when requesting nodes.
For example, you might use something like:
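(Rough sketch from memory -- the names below are illustrative rather than exactly what #155 added:)

```python
# Rough sketch only; names are illustrative, not necessarily what #155 added.
# The idea: a node spec carries extra parameters (here just the OS) and
# requests filter on them instead of grabbing any node from the pool.
LINUX = "linux"
WINDOWS = "windows"

class NodeSpec(object):
    def __init__(self, operating_system=LINUX):
        self.operating_system = operating_system

# e.g. a mixed-OS test: a Windows client node plus Linux service nodes
client_spec = NodeSpec(operating_system=WINDOWS)
broker_spec = NodeSpec(operating_system=LINUX)
```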
OS was specifically the thing being targeted in that PR, but it totally makes sense to support other parameters as well. Probably a generic tagging setup is the right way to handle this. In addition to your use case, I actually want to see this happen for a few other use cases as well. Another place this would definitely be applicable is if you need heterogeneous nodes (e.g. for a perf test you might want beefier machines). I'm more than happy to help get in improvements that support filtering the nodes. The main thing I'd want to make sure we pay attention to is how we express it. |
Hi @ewencp, thanks for the feedback! I am excited to see the potential use cases. As @criccomini said, it would be great if you could provide some semi-detailed implementation suggestions. Meanwhile, I have two questions for you:
So, my plan is to add an optional `node_type` parameter to the `Service` class.
Here is roughly what I have in mind:
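(A very rough sketch; the names below are just placeholders for the idea:)

```python
# Very rough sketch; names are placeholders. The node's type is derived from
# how it is named in the Vagrantfile (e.g. "zookeeper-1" -> "zookeeper"), and
# Service grows an optional node_type argument, so a service can ask for a
# specific type while existing services keep working unchanged.
def node_type_from_name(hostname):
    return hostname.rsplit("-", 1)[0]

class Service(object):
    def __init__(self, context, num_nodes, node_type=None):
        self.context = context
        self.num_nodes = num_nodes
        self.node_type = node_type  # None keeps today's "any node" behavior
```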
To me this is kind of less flexible. Do you have any recommendations? |
(please ignore reference.. copied wrong GH PR link :) ) |
Waiting for the fix for |
For 1, you should just be able to use the support added in #155. One interesting design question is how we handle things if there are multiple dimensions and the caller only specifies one (e.g. if we have varying sizes of linux machines, but I only ask for a linux machine, how does the test runner decide which node to allocate?). tbh, I'm wary of allowing this to get much more complex than some very small number of dimensions (OS was the obvious one for us, but I think generic tags could work as well).

For some historical context, we in fact never really wanted to require the `@cluster` annotation at all.

For 2, today the node type allocation is very basic and just baked into the Cluster implementation (see https://github.com/confluentinc/ducktape/blob/master/ducktape/cluster/json.py#L103) and the allocation is similar to what you describe, but currently based only on the OS type (see https://github.com/confluentinc/ducktape/blob/master/ducktape/cluster/node_container.py#L33). For vagrant specifically, personally I would try to figure out a way to not tie it to the naming, as that's inflexible, requires you to specify a "serialization" format if you ever want to support multiple tags, etc. However, due to the way the ...

wrt impl, let me loop back around, though tbh it's been so long since looking at this code in detail that I'm not sure how much I can just summarize without basically implementing it myself. Hopefully the pointers above are a good starting point and I can take a quick pass at the current code to call out gaps/issues I might find easily. |
@ewencp Thank you so much for your comment. I'll try to make a brief summary of each section and provide some of my thoughts, as below:

Q1: You should just be able to use the support added in #155.

Q2: Generic tagging makes sense, as long as we keep it to a very small number of dimensions.

Q3: For historical reasons, we lost the ability to know the number of nodes up front automatically. But we want to solve that, and eventually get rid of the node-count annotation.

Q4: I'm open to different options for how to decide a node's type so that it fits into a standard model. We want to make it flexible rather than tie it to the naming. |
@criccomini I am not working on developer tooling anymore, so it's up to @ewencp and the @confluentinc/tools team :) |
Hi @jarekr, would you please leave a comment? 😃 Thanks! |
@criccomini This is my fault. Ironically, we wish there were more contributions back to this repo to improve the test framework generally (rather than, e.g., hacking around things in tests, which does happen...), but then when they arrive, we're not set up for them because they are so infrequent :) As I'm sure you can appreciate, we also have limited reviewer bandwidth. Most folks that have been tagged here on this PR aren't actually reviewers for this code! We have an assigned team just for ownership purposes, but actually the core framework sees so few changes (whether that's a good or bad thing) that some of the folks mentioned really don't have the context to process or review these changes. I need a bit of time to follow up and review (took me a full day just to process, evaluate, and write up the earlier response), but I will allocate time to review followups from @whynick1. And I apologize for the delay. Tomorrow looks shit, but I will follow up on this by Monday. Longer term, @confluentinc/quality-eng is a new org that hasn't necessarily taken ownership yet, but is a bit more targeted and likely to be more responsive for cases like this. I think it'd be good to get them looped into this now, especially since we have community interested in testing like this, which is a rare thing! |
No worries! Totally understand, and I know how it goes. I really don't want to fork, but we're nearing a point on our end where we have to since we've developed a bunch of code around ducktape at this point. :P LMK how I can help. |
Yeah, that's roughly what I meant. I think we should consider dropping the need for a `@cluster` annotation at all. But if that doesn't work, roughly what you're getting at works. (Note that even if we need to construct the `ClusterSpec` up front ...)
Yeah, tradeoff makes sense. Mainly with things like this I just want to make sure we plan for possible extension and set ourselves up to be able to do so compatibly.
Agreed that as long as we have the current issue with counting automatically, this works fine. And I'm happy to merge something based on your proposal. I was just pointing out that we might also want to think about that eventual future and where/how this metadata should be expressed. If we were able to get rid of the node count, then having to specify things this way isn't great. In general it's an issue because rather than naturally expressing it with the ...
Yeah, I was trying to get a bit of design reasoning behind this update as well because the OS additions were a bit of a hack job that I'm not sure I'm happy with. The reason I was suggesting something other than just naming was due to previous comment about possibly specifying preset dimensions (e.g. cpu, memory, etc). If we aren't going down that path and staying very open ended, then yeah, naming is probably the only reasonable approach. The other downside of using naming is that you can't do things like ask for any node that meets certain criteria. From a test scheduling and cluster utilization perspective, this might not be great in some cases. For example, if I have just a few tests that need very large nodes, given current model you still have to have those available for the entire test run, but those nodes could go unutilized. We can sort of address that if most tests just don't specify and will use any node. If you get into too many classes of nodes, I can see this still possibly being a problem.
If we can't generalize, then we don't need to do it. I was mainly raising this if we do something more than just a fixed string approach. If we decide to just go with strings and user-defined classes without any additional knowledge in the framework, then this wouldn't be needed. |
Agree! Let's do that.
Thanks for your explanation, and I understand your concern. It is true that generic type tagging with "multiple dimensions" would be more flexible than matching on "name" alone.

This means we could possibly read these settings (cpu, memory, disk, etc.) directly from the machines themselves.

btw, I can start working on this next Monday at the earliest. But beforehand, I do want to hear your thoughts. 😃 @ewencp cc @criccomini |
@whynick1 Yeah, so re: A4, my point was more of a tagging/matching approach. We rely a ton today on the vagrant cluster type, but really that's a detail (and increasingly I'd like to see VagrantCluster become an implementation detail). For example, docker or ec2 clusters would also work well, but potentially offer flexibility/scalability that Vagrant doesn't. So I had been thinking more along the lines of the constraints that cluster resource managers tend to enforce, e.g. minimum cpu, memory, and disk requests. I'm not tied to either approach; it probably makes sense to experiment a bit and deprecate if necessary. I'm just prone to extreme care because deprecating and removing functionality can be time consuming and costly, and we're not even at a 1.0 for ducktape yet :) |
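A rough illustration of the constraint-style request described above (the keys and units are hypothetical, not an existing ducktape format):

```python
# Hypothetical constraint-style node request, similar in spirit to what a
# cluster resource manager enforces; keys and units are illustrative only.
node_constraints = {
    "cpu": 2,          # at least 2 cores
    "memory_gb": 8,    # at least 8 GB of RAM
    "disk_gb": 100,    # at least 100 GB of disk
    "os": "linux",
}
```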
Sounds good to me! I think we can provide users with both options.
Another thing I want to discuss with you is how we tag a machine with a certain type (a multi-dimension type, including cpu, memory, disk, etc.).

We can directly read this from the machine! Like:
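(Sketch only, assuming the node's remote account exposes an `ssh_output`-style helper; the commands and parsing here are illustrative:)

```python
# Illustrative only: probe a node's resources over SSH and build a
# machine-type dict from the results. The ssh_output helper and the exact
# commands/parsing are assumptions, not a final API.
def probe_machine_type(node):
    cpu_cores = int(node.account.ssh_output("nproc").strip())
    mem_kb = int(node.account.ssh_output(
        "awk '/MemTotal/ {print $2}' /proc/meminfo").strip())
    disk_kb = int(node.account.ssh_output(
        "df -k / | awk 'NR==2 {print $2}'").strip())
    return {
        "cpu_core": cpu_cores,
        "mem_size_gb": mem_kb // (1024 * 1024),
        "disk_size_gb": disk_kb // (1024 * 1024),
    }
```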
Then we attach these configurations to each node in the cluster.
@ewencp Does this make sense to you? 😃 |
@whynick1 is going to move forward with the proposed implementation. If you guys don't want it, we'll just fork ducktape. |
Yeah, this approach seems like it works well. I suspect longer term we'll end up with some sort of mix, because within a suite of tests it might be handy to be able to alias an actual set of requirements (e.g. give a name to a particular set of requirements and reuse it across tests). In terms of annotating nodes with that info, yeah, I think detection like this should work, though you do need bootstrapping info. For example, info like OS type will necessarily have to come from the cluster implementation before having access, because otherwise we don't know how to connect to it (ssh vs rdp) or what commands to run (the examples given won't work on windows). There's actually a related use case where tagging the nodes gets even a bit more complicated -- if you want to run tests against existing services, e.g. you already have a Kafka cluster and want to back KafkaService with that instead of having it allocate out of the standard pool. In that use case, we have yet another source of information for tagging nodes (and we probably want to be able to tag arbitrarily, not just via a fixed set of resource types, though perhaps the fixed built-in set is a good baseline). |
@ewencp Supported node allocation based on (multi-dimension) machine-type according to the previous discussion. Before we fix all the tests, I just want to get some feedback from you.

1. What is supported after the change?

2. What's the allocation strategy?
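Roughly: a greedy match over the requested dimensions, failing fast when a request cannot be satisfied. A sketch of the idea (class and field names are placeholders, not the PR's exact code):

```python
# Sketch of a greedy, dimension-aware allocation; names are placeholders.
class MachineType(object):
    def __init__(self, cpu_core, mem_size_gb, disk_size_gb):
        self.cpu_core = cpu_core
        self.mem_size_gb = mem_size_gb
        self.disk_size_gb = disk_size_gb

    def satisfies(self, requested):
        """True if this machine meets or exceeds every requested dimension."""
        return (self.cpu_core >= requested.cpu_core and
                self.mem_size_gb >= requested.mem_size_gb and
                self.disk_size_gb >= requested.disk_size_gb)

def allocate(free_nodes, requested_type, count):
    """Greedily take the first `count` free nodes that satisfy the request."""
    matched = [n for n in free_nodes if n.machine_type.satisfies(requested_type)]
    if len(matched) < count:
        # fail the test immediately if the request cannot be satisfied
        raise RuntimeError("cannot satisfy request for %d nodes" % count)
    allocated = matched[:count]
    for node in allocated:
        free_nodes.remove(node)
    return allocated
```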
cc @criccomini |
@ewencp ping |
I'm going to try to take a pass at the code today, but wanted to ask a couple of questions re: ux.
I think in general what you outlined with these updates makes sense. My only concern is that it feels a bit clunky, but I think this is just because you are covering the most complicated case. In particular:
- I assume the common case would not be to inline these definitions, but probably to share some common definitions, e.g. do something like `STORAGE_NODE = {...}`, defining the minimum requirements (so you could, e.g., allocate nodes with arbitrary CPU that still match)?
- I assume, similarly, that with the test specs, more often than not we'll end up with just a pre-defined spec matching those in services? e.g. the common case would be to just do something like `@cluster(cluster_spec=ClusterSpec.from_list([{BASE_NODE, num_nodes: 3}, {STORAGE_NODE, 'num_nodes': 1}]))`. In fact, now writing this, I realize the above is not quite right and we might want some method on `ClusterSpec` that copies and changes the number of nodes, so those entries in the list can be more like `BASE_NODE.with_num_nodes(3)` that copies everything but overrides `num_nodes`, to streamline this common use case (see the sketch after this list).
- Given that common case, based on the allocation strategy, is it correct that you should never have trouble with the allocation/mapping, since they will always align? So in the common case it'd still be pretty concise, and wrt allocation you wouldn't really need to worry about availability as long as the nodes that can match are disjoint (which again seems like the common case, as different node types probably optimize for different types of resources, unless you have broad classifications like `SMALL_NODE` and `LARGE_NODE`).
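Something like this is what I have in mind for that helper (sketch only; the names are hypothetical, not existing ducktape API):

```python
# Sketch of a with_num_nodes-style helper: copy a predefined node spec and
# override only the node count. Names here are hypothetical.
import copy

class NodeSpec(object):
    def __init__(self, num_nodes=1, **requirements):
        self.num_nodes = num_nodes
        self.requirements = requirements

    def with_num_nodes(self, num_nodes):
        spec = copy.deepcopy(self)
        spec.num_nodes = num_nodes
        return spec

BASE_NODE = NodeSpec(cpu_core=2, mem_size_gb=8)
STORAGE_NODE = NodeSpec(cpu_core=4, mem_size_gb=16, disk_size_gb=500)
```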
Part of this is just to make sure UX for the common case is nice for the user, but also so we can update documentation w/ both common and advanced cases.
wrt allocation strategy, generally a greedy approach looks good, just want to clarify this point:
> Test will fail instantly once a request cannot be satisfied

Does this reflect the existing policy, or does this mean a change compared to the existing approach that can handle a lack of available nodes? i.e. does this make `TestScheduler` fail fast if you have the parallel test runner enabled and the next test can't be satisfied from the remaining nodes? This is pretty critical because as you build up even a pretty small test suite, the parallel test runner becomes critical to running in a reasonable amount of time.
@ewencp Based on your comment, I propose a more user-friendly pattern:
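Something along these lines (sketch only; `MachineType` here is the multi-dimension type sketched earlier, and `with_num_nodes` / `ClusterSpec.from_list` follow the suggestion above -- none of these are existing ducktape API yet):

```python
# Sketch of the proposed user-facing pattern; exact names may change.
# Shared machine-type definitions, reused across tests with different counts.
BASE_NODE = MachineType(cpu_core=2, mem_size_gb=8, disk_size_gb=50)
STORAGE_NODE = MachineType(cpu_core=4, mem_size_gb=16, disk_size_gb=500)

class ReplicationTest(Test):
    @cluster(cluster_spec=ClusterSpec.from_list([
        BASE_NODE.with_num_nodes(3),      # three general-purpose nodes
        STORAGE_NODE.with_num_nodes(1),   # one beefier storage node
    ]))
    def test_replication(self):
        ...
```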
Let me know if that looks good to you.
Yes, I think the allocation strategy should work for most cases.

The change I'm making should not alter the existing fail-fast policy. Is there any unit test that covers that? |
@ewencp NAG NAG NAG NAG :D |
Looks good to me. Seems simple and straightforward to use.
Ok, just couldn't be sure from the description. As long as we maintain the behavior, we should be good. I think kafka tests still don't have parallel test runner turned on, but Confluent's internal tests using ducktape have had them turned on for a couple of years -- we'd be happy to do a bunch of runs with a modified version of ducktape to check for any regressions. wrt unit tests, tbh I'm not sure since I think the parallel test runner was merged ~3 years ago and hasn't seen much movement since. given general coverage numbers, i'm pretty sure we have some coverage of that code, but not sure about that specific functionality. we can visit this in the code review. |
Hi, I'm new on the Confluent QE team. I apologize if I'm just missing vital information or misunderstanding something, but I'm concerned that using hardware specs this way will cause long-term maintenance problems. Even in the example provided, the specs that are given are in fact being used as a proxy for a specific capability: you are planning to run zookeeper on machines with a given cpu and disk configuration, so you write your test to search for that cpu and disk configuration and assume that it's a zookeeper node. But this assumption may not hold forever. Additionally, by tying it into the source itself, it needs to hold for other people who want to run this test as well. Mentioned above was the question of whether or not "type" would be too limiting, and hardware specs were given as an example of dimensions that could get complicated quickly. I do think "type" is too limiting, but only because there is one per vm: you could use a provisioned vagrant environment for multiple tests, and there's no guarantee that you'd need exactly the same types for all the tests. What I think we should have instead is a tag system, which could be implemented as simply as:
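(Rough sketch; the names below are mine, not an existing ducktape API:)

```python
# Rough sketch of the tag idea: each node carries a set of free-form tags,
# and a spec just asks for the tags it needs; hardware is never mentioned.
class TagSpec(object):
    def __init__(self, tags=None):
        self.tags = set(tags or [])

    def matches(self, node_tags):
        # a node qualifies if it has every requested tag
        return self.tags.issubset(set(node_tags))

# the cluster definition tags a node as able to run zookeeper...
node_tags = ["zookeeper", "linux"]
# ...and the test asks for intent ("a zookeeper node"), not hardware specs
zk_spec = TagSpec(tags=["zookeeper"])
assert zk_spec.matches(node_tags)
```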
(Please forgive my potentially broken syntax.) Then, in the service or test definition, you ask for the tags you need.

Then, your test is expressing intent rather than hardware. There will (/may) be tests that actually require specific hardware configurations, but in my experience that's quite rare. Additionally, intent ("I need to run zookeeper on this machine") is inherently different from technical specs ("this machine has 100 GB of RAM"), so rather than trying to write one system that anticipates both right now, we can pretend that we don't need to worry about actual machine specs in the ClusterSpec, because as I understand it we don't. |
😩 can you guys get on the same page? 😢 I'm unclear whether this comment is just your opinion, or whether it's actually the team's preference. This is not what we've been discussing with @ewen. We are spending developer $ to work on this, but we need to not flail like this. Please get together and come up with what the heck you want us to do, and then let us know. If it's machine tags, fine. If it's hardware tags, fine. Again, please clarify whether you have appropriate buy-in on your side for this approach, or whether this is just off-the-cuff thoughts from you. |
Sorry; like I said, I’m new here. Those were just my thoughts. I will sync with @ewen and we’ll come to consensus so there should be no more mixed signals. |
@jspong @ewencp @mohnishbasha any update? |
Anyone? |
@jspong @ewencp @mohnishbasha halp |
Forked here for those interested: |
|
We use VagrantCluster and define a couple of nodes in the Vagrantfile as below:

As we have different GCE images (https://github.com/mitchellh/vagrant-google) for the zookeeper node and the server node, we want to find a way to have ducktape allocate different types of nodes for different services.
Therefore, we try to add an optional parameter `node_type` to the `Service` class, so that developers can specify the type of node to allocate during integration tests. This change is backward-compatible, as `node_type` defaults to `None`.
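For illustration, usage of the new parameter might look roughly like this (the service names are hypothetical):

```python
# Hypothetical usage of the optional node_type parameter proposed in this PR.
from ducktape.services.service import Service

class ZookeeperService(Service):
    def __init__(self, context):
        # pin this service to "zookeeper"-type nodes
        super(ZookeeperService, self).__init__(context, num_nodes=3,
                                               node_type="zookeeper")

class ProducerService(Service):
    def __init__(self, context):
        # no node_type given: behaves exactly as before and accepts any node
        super(ProducerService, self).__init__(context, num_nodes=1)
```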