Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Installer/MCO: store pointer ignition customizations in MachineConfig #540

Merged

Conversation

hardys
Copy link
Contributor

@hardys hardys commented Nov 16, 2020

This is an alternative to #467

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 16, 2020
@hardys
Copy link
Contributor Author

hardys commented Nov 16, 2020


#### Story 2

As a baremetal IPI user, I need to deploy in an environment where network
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but...as discussed ideally we move baremetal IPI to using the Live ISO flow which already has techniques to address this by effectively having two Ignition runs.

xref coreos/ignition#979

(OK this is covered below, thanks)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's the long term plan and discussions are in-progress about how we may integrate the IPI baremetal automation with the assisted-install flow (which uses the live iso) - the idea with this enhancement is an interim solution so we can solve the problem for customers in 4.7 (as bond+vlans is a common requirement on baremetal)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that if you embed the pointer into the live iso that still doesn't solve the problem in the case where the MCS URL is on a tagged network, so you need to wire in keyfiles to coreos-install --copy-network - which then aren't managed via the MCO.

So it may be preferable in the eventual IPI+ISO flow to generate an iso on the fly with the rendered config embedded (or provided via a config drive alongside the ISO) - in which case the changes made here are still relevant.

metal3-io/metal3-docs#150 is some early exploratory work in that direction FYI, but it's something for 4.8+

config where config-drive size limits allow. However this results in the [same
issue with losing any user customizations](https://bugzilla.redhat.com/show_bug.cgi?id=1833483) provided directly via the pointer config.

Discussions indicate we may want to deprecate/remove this ignition-config method of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way to do this without solving openshift/machine-config-operator#1720
People definitely rely on customizing the pointer config with networking in UPI scenarios (as well as controlling the kernel cmdline).

Has the machine-specific problem been addressed in some other way in baremetal IPI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No we don't have a solution for the machine-specific problem in IPI - I'm hoping we can make progress on the MCO issue you mentioned to resolve that 👍

The aim of this enhancement is mostly to solve the bond+vlan case for IPI, where you don't need per-machine configs (like static ips), but you do need to avoid the pointer ignition download, because the vlan isn't configured yet.

@cgwalters
Copy link
Member

Overall if all we're getting out of this is "baremetal IPI can do not-machine-specific bonding"...that's not very much. What if we went for a much more tactical short term fix like having baremetal IPI support injecting network configuration in the same way that it works with the CoreOS ISO? In the end we just want NM configs injected into the target system's /boot - BMIPI could accept that as a tarball even.

@hardys
Copy link
Contributor Author

hardys commented Nov 19, 2020

Overall if all we're getting out of this is "baremetal IPI can do not-machine-specific bonding"...that's not very much

@cgwalters that's not all it gives us, it also unblocks the previously reverted work to manage the pointer ignition config in the MCO?

@runcom what are your thoughts, IMO one of the advantages of this approach is it also solves the problems that caused that work to be reverted? That's a benefit to all platforms AFAICT?

Also, this means side-channel customizations performed via the installer in the IPI flow become visible (on all platforms) at the MachineConfig level, and in the cases where the MCD supports it those configs can even be updated on day 2?

What if we went for a much more tactical short term fix like having baremetal IPI support injecting network configuration in the same way that it works with the CoreOS ISO? In the end we just want NM configs injected into the target system's /boot - BMIPI could accept that as a tarball even.

The interfaces to do that don't currently exist in Ironic, there's no way to inject arbitrary files into the image unless we implement some custom hook in the deploy ramdisk (and even if we do that, it's not clear where the data should come from, e.g do we invent some new configuration interface when users are already familiar with ignition/MachineConfig?)

@hardys
Copy link
Contributor Author

hardys commented Nov 19, 2020

@cgwalters followup question - what would an openshift-install interface to provide arbitrary NM keyfiles to pass into coreos-install --copy-network look like?

I'm not aware of any other platforms doing that, so it's another point of friction if we're considering such an approach (regardless of the underlying tool that installs RHCOS and writes those files)

@kirankt
Copy link

kirankt commented Nov 20, 2020

FYI. I have a WIP branch of an implementation of this enhancement here: https://github.com/kirankt/installer/tree/ignition-update-check

@cgwalters
Copy link
Member

cgwalters commented Nov 20, 2020

In UPI scenarios, the pointer Ignition config is provided "manually' by the admin. For example in a bare metal PXE scenario it might be manually copied to a static webserver. The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems. I don't see how this enhancement could solve it in general.

For IPI scenarios, I think we are in total control of everything end-to-end; we shouldn't support manually customizing the Ignition config, only injecting MachineConfig. Being in total control over the pointer config will allow us to do something better than #443 in IPI.

I think BMIPI has created a special case exception between these two because it seems like it's effectively re-injecting customized pointer configs into the cluster and we currently need networking before fetching the final config.

So with this enhancement would the BMIPI code look specifically for this machineconfig object name and include it in the pointer config?

@kirankt
Copy link

kirankt commented Nov 20, 2020

In UPI scenarios, the pointer Ignition config is provided "manually' by the admin. For example in a bare metal PXE scenario it might be manually copied to a static webserver. The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems. I don't see how this enhancement could solve it in general.

For IPI scenarios, I think we are in total control of everything end-to-end; we shouldn't support manually customizing the Ignition config, only injecting MachineConfig. Being in total control over the pointer config will allow us to do something better than #443 in IPI.

I think BMIPI has created a special case exception between these two because it seems like it's effectively re-injecting customized pointer configs into the cluster and we currently need networking before fetching the final config.

So with this enhancement would the BMIPI code look specifically for this machineconfig object name and include it in the pointer config?

From what I understand, converting the pointerignition to machineconfig during the 'create' target (which is what the above linked implementation does) preserves any customization done by the user. We have couple options before we do the conversion. 1) We can remove the URL and CA Cert parts and just send any customization as a MachineConfig, which from my tests has rendered correctly by MCO. 2) Send the whole pointer ignition as-is. This seems to work OK as well because the MCO, is smart to fetch the contents of the data URL and render the final config.
Perhaps, we can consider this conversion an intermediate step, where we can log/warn the users that we will deprecate direct customization of the pointer ignition and perhaps shepherd them towards creating machineconfigs instead. Further down the road, we can completely do away with the pointer ignition -> machineconfigs code completely.

@hardys
Copy link
Contributor Author

hardys commented Nov 23, 2020

The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems

I'm not sure I follow - the enhancement to manage the pointer ignition was already proposed, accepted and implemented.

It got reverted because it broke this pointer ignition customization interface (for all platforms) without deprecation/warning - this enhancement is an attempt to resolve that.

Note that the changes under discussion in this enhancement would not change the flow at all for UPI, it's only about reflecting pointer ignition customizations via the MCO rendered config in IPI deployments.

For IPI scenarios, I think we are in total control of everything end-to-end; we shouldn't support manually customizing the Ignition config, only injecting MachineConfig. Being in total control over the pointer config will allow us to do something better than #443 in IPI.

That's why I discussed potentially deprecating this customization interface (potentially only for IPI), but we can't just remove an existing interface without deprecation because we know users are using it, rightly or wrongly.

Currently there's no way to detect the pointer ignition is customized and warn the user to do something else, but if we modify the installer as described then we could tell the user to use a MachineConfig instead.

Note that injecting network configuration via MachineConfig doesn't actually solve the problem we're interested in for baremetal, e.g the case where the controlplane network is on a non-default vlan (because networking needs to be configured at the point of evaluating the pointer config merge URL)

I think BMIPI has created a special case exception between these two because it seems like it's effectively re-injecting customized pointer configs into the cluster and we currently need networking before fetching the final config.

BM IPI just injects the pointer ignition, there's nothing special about that compared to other platforms.

To enable bond+vlan configuration, we already implemented a solution where we just injected the MCO rendered config, but had to revert it because some users in the field are using pointer ignition customizations.

For the same reason the work from @runcom last cycle to manage the pointer ignition inside the MCO was also reverted.

The only special thing about BM IPI is that we currently don't use coreos-install, but AFAICT that doesn't help for IPI because there's no way to pass NM keyfiles to the --copy-network interface via the openshift-install anyway? That's just a UPI thing atm, right?

So with this enhancement would the BMIPI code look specifically for this machineconfig object name and include it in the pointer config?

No, the idea is we'd just consume the MCO rendered config, and ignore the pointer config, but ensure any user customizations injected via create ignition-configs is included in that config (to avoid breaking the existing interface, even if we document/warn that MachineConfig is preferred)

@runcom
Copy link
Member

runcom commented Nov 23, 2020

In UPI scenarios, the pointer Ignition config is provided "manually' by the admin. For example in a bare metal PXE scenario it might be manually copied to a static webserver. The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems. I don't see how this enhancement could solve it in general.

that drawback was indeed covered in https://github.com/openshift/enhancements/blob/master/enhancements/machine-config/user-data-secret-managed.md#drawbacks
with this enhancement, we're going to be able to manage the pointer ignition (as we would have done and we'll do for migrating it from ign v2 to v3 but we couldn't because we were missing any customization made). Now the customizations will be lifecycled with MCs and can just translate up/down as needed (not that we forsee any new migration soon but still) as well as anything else.
I think the UPI case was and it's still a drawback in this scenario but we're not covering that here (or in my other PR that was reverted)

@cgwalters
Copy link
Member

I'm not sure I follow - the enhancement to manage the pointer ignition was already proposed, accepted and implemented.

Yes. I would clarify that as "manage the pointer config in IPI". Per above we simply cannot rely on doing so in UPI. And that means we need to be careful about what we can design that relates to the pointer config.

@runcom
Copy link
Member

runcom commented Nov 23, 2020

Yes. I would clarify that as "manage the pointer config in IPI". Per above we simply cannot rely on doing so in UPI. And that means we need to be careful about what we can design that relates to the pointer config.

so Colin, this just suffers from the same UPI drawbacks for which we'll really need a better story - for UPI, as soon as the cluster is up and running (which won't then care about where the ignition pointer is) it's still going to work normally as it's still reading the pointer config from an external server anyway on scale up.

@hardys
Copy link
Contributor Author

hardys commented Nov 24, 2020

Installer PR up for review - openshift/installer#4413

Testing that and it seems to work well - when I inject a file via a customization to the master/worker pointer ignition I see:

$ oc get MachineConfig | grep installer
99-installer-ignition-master                                                                  3.1.0             46m
99-installer-ignition-worker                                                                  3.1.0             46m

And that change then shows up as expected in the MCS/rendered output

@cgwalters
Copy link
Member

OK now my concern here is around (mostly UPI) setups that are modifying the pointer config today for per-machine state. I want to be sure we're not breaking them.

The SOP today should be to run create ignition-configs and copy that rather than doing any modifications in place; you'd basically have to do that to have one per machine.

So I think this should be fine. The logic of storing it only if modified makes sense.

I would say though in the end this enhancement is mostly an aid to metal IPI.

Also, related to advanced networking configs - one option is to use a single NIC at install time by injecting kernel arguments just for the firstboot, then switch to bonding via MachineConfig. If metal IPI can inject kernel arguments only for the firstboot it could do this today without this enhancement.

Anyways though
/approve

@hardys
Copy link
Contributor Author

hardys commented Nov 25, 2020

OK now my concern here is around (mostly UPI) setups that are modifying the pointer config today for per-machine state. I want to be sure we're not breaking them.

The SOP today should be to run create ignition-configs and copy that rather than doing any modifications in place; you'd basically have to do that to have one per machine.

So I think this should be fine. The logic of storing it only if modified makes sense.

Yeah I don't think this will impact UPI setups at all.

I would say though in the end this enhancement is mostly an aid to metal IPI.

FWIW the reason I spent time on this proposal as an alternative to #467 is that I think it does provide some benefit beyond solving the day-1 networking problem for metal IPI:

  • It unblocks previously reverted MCO work from @runcom to have MCO manage the pointer ignition
  • It provides us a means to potentially warn IPI users they should prefer MachineConfig over pointer customizations in future

Also, related to advanced networking configs - one option is to use a single NIC at install time by injecting kernel arguments just for the firstboot, then switch to bonding via MachineConfig. If metal IPI can inject kernel arguments only for the firstboot it could do this today without this enhancement.

Yeah we don't have a clean way to do that in metal IPI atm, and there's no openshift-install configuration interface that supports it?

As previously mentioned, I've started work to enable metal IPI to use the RHCOS live-iso via metal3-io/metal3-docs#150 which may give us more options in future, but we'll need to figure out the openshift-install interfaces.

Anyways though
/approve

Thanks!

hardys pushed a commit to hardys/installer that referenced this pull request Dec 1, 2020
In the case where an IPI user customizes the pointer config, this
config is only persisted via the user-data secret, which was an
issue when moving to an MCO templated pointer config:

openshift#4228

It was also a problem for attempts for IPI baremetal to consume
the MCO rendered config directly, with the aim of enabling
network configuration via MachineConfig:

openshift#3589

With this new approach, we detect the case where a change has
been made to the pointer config by the user, and in that case
it is stored via a MachineConfig manifest, such that any
customizations are reflected in the MCO rendered config.

Implements: openshift/enhancements#540
Co-Authored-By: Steven Hardy <[email protected]>
hardys pushed a commit to hardys/installer that referenced this pull request Dec 2, 2020
This restores the work which was previously done via openshift#3276
but then reverted via openshift#3589 due to breaking users who customized
the pointer ignition config in IPI deployments.

A solution to that has been proposed via openshift#4413 - see openshift/enhancements#540 for more details.

Note that some additional changes beyond the initial implementation
were required due to the MCO now supporting multiple
ignition versions, thus this depends on openshift-metal3/terraform-provider-ironic#46

Co-Authored-By: Steven Hardy <[email protected]>
@hardys hardys force-pushed the ignition_machineconfig branch from aed19d3 to e4ff697 Compare December 3, 2020 10:54
@hardys
Copy link
Contributor Author

hardys commented Dec 3, 2020

This was implemented via openshift/installer#4413

@hardys
Copy link
Contributor Author

hardys commented Dec 9, 2020

@kirankt since this is now implemented, perhaps you can give it the final lgtm so it merges?

@kirankt
Copy link

kirankt commented Jan 28, 2021

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 28, 2021
@openshift-ci-robot openshift-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2021
@stbenjam
Copy link
Member

/test markdownlint

@stbenjam
Copy link
Member

/retest

@stbenjam
Copy link
Member

There's a few markdown issues to correct:

 enhancements/machine-config/custom-ignition-machineconfig.md:70:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 1]
enhancements/machine-config/custom-ignition-machineconfig.md:72:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 1]
enhancements/machine-config/custom-ignition-machineconfig.md:75:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 1]
enhancements/machine-config/custom-ignition-machineconfig.md:125:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]
enhancements/machine-config/custom-ignition-machineconfig.md:126:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]
enhancements/machine-config/custom-ignition-machineconfig.md:157:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]
enhancements/machine-config/custom-ignition-machineconfig.md:159:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2] 

@hardys hardys force-pushed the ignition_machineconfig branch from e4ff697 to 7b61959 Compare January 29, 2021 16:45
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 29, 2021
@stbenjam
Copy link
Member

/approve
/lgtm

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, kirankt, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 29, 2021
@openshift-merge-robot openshift-merge-robot merged commit 382dbf0 into openshift:master Jan 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants