-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plumb customer-configured DNS into instances #4265
Conversation
@gjcolombo FYI — requested a review from you because this was written prior to #4194 landing, and I did sort of the bare minimum merge from main just now. I'd like to hear if you think there's a more logical spot to put these changes to Nexus / Sled Agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new instance plumbing LGTM (with my thanks for spending time picking through the merge). I didn't look at the OPTE or service plumbing at all.
hostname: Some( | ||
hardware | ||
.properties | ||
.hostname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily for this PR, but I think it might be worth an filing issue: I'm guessing that if this conversion fails it implies that the hostname in the instance create parameters was completely bogus such that this instance will never be able to start. If that's so we should see about rejecting invalid hostnames in Nexus::project_instance_create
so that we reject bad hostnames up front instead of letting the create go through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't gone and verified this is the case, but I believe that our restrictions on the Name
type make them all valid single-label hostnames. It would be good to prove this out though (or use a single consistent type across Omicron and OPTE).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was the intent. In RFD 4 names were defined to be compliant with RFC 1035. However, I don't think we ever validated hostname the same way:
omicron/common/src/api/internal/nexus.rs
Lines 38 to 40 in 463cc1a
/// RFC1035-compliant hostname for the instance. | |
// TODO-cleanup different type? | |
pub hostname: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! Well. I'll see if that's not too bad to fix along with this PR.
Two follow-up items I'd like to do are:
But we may want to land this before those tasks are done to avoid blocking the release. I can turn these into follow-up issues if we like. |
I think that makes sense. I wouldn't say those are blockers. |
Currently OPTE hardcodes DNS servers to 8.8.8.8, and we would prefer to not do this.
When setting up the rack, customers configure DNS servers. Currently this is added to Nexus's deployment configuration, so that Nexus can resolve external domain names for SAML. This change plumbs these DNS servers through to sled-agent and OPTE.
For non-instance OPTE ports,
DhcpCfg::default()
is used, which is equivalent to "no hostname, no domain name, no DNS servers, no search domains".This is not the final state of the DHCP work we want to do; in talking with @rmustacc and @rcgoodfellow we agree that implementing this minimal DHCP option set is the correct thing to do urgently; and that we still need to design the broader inter-instance networking picture (whether we want an instance-facing recursive resolver on the rack and whether it should resolve domain names for other instances with VPC IPs, what those hostnames and search domains should be, and how customers can modify these settings).