-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cloud-init changes behaviour with ubuntu cloud image ova starting from version 20230602 #4188
Comments
Thank you @lethargosapatheia for filing a bug and trying to improve cloud-init. Given that cloud-init is the same version in both images it is highly unlikely that cloud-init is the problem, but something outside of cloud-init in images affecting the behavior you are seeing with login prompt changes and reboots. The initial difference I see in cloud-init behavior between 5/24 and 06/02 is that the first boot in 0524 only runs cloud-init's
On 0602 image that first boot actually runs both Something seems to have triggered reboot of the machine later on 0624, or for some reason systemd didn't include Unfortunately, the journal.txt included by default in cloud-init collect-logs is only the most recent boot (from journalctl -b 0) what I think we'd need to see is journalctl -b 1 output from both images. that'll give us service information run on previous boot to make sure there isn't some sort of systemd ordering cycle that kicked out cloud-init system services from the boot goals. |
Thank you for your fast response. I'm adding here the two logs (these are newly created templates using the same two images, but I'm guessing it shouldn't be a problem, as the behaviour is consistent). When using |
I've also tried using Maybe there's a way to force ssh not to start at first boot. |
Ok, I seem to have found the solution. I was able to tell cloud-init to stop the ssh service only once, before rebooting, and then not run the command again, as it did when simply running bootcmd:
- [ cloud-init-per, once, systemctl, stop, sshd ] |
Glad you found a workaround to your issue. I'm going to close this issue. Please reopen with more details if you see further issue related to cloud-init. |
Well, my solution is a completely different story than the fact that cloudinit has changed its behaviour in the ova ubuntu cloudimages. So someone made a really important change, which now seems to be ignored because I have found a solution. |
Are you sure? @blackboxsw already pointed out that this is the same version of cloud-init, so I'd be surprised if it was a cloud-init behavior change. Something changed, agreed. Perhaps a service that cloud-init depends on didn't start for some reason? |
Well, initially I mentioned this on the #ubuntu-server (not on the cloud-init one) irc channel and @blackboxsw told me that I should file a bug here, which I was a little bit surprised of, also because he himself said that he suspected the cloud-init versions are the same. I also kind of expected a feedback related to the logs that I added afterwards, but I guess there's nothing to infer from there, as far as I understand, so I'm putting it down to a lack of communication. That said, where could I then file a bug related to this issue? This is where I've also posted, but the status now is 'expired' :D |
I'm still at a loss as to why this issue is ignored. The change is really important and no one seems to be bothered by this. I've actually underestimated the issue, because this doesn't only affect the packer templates. When cloning them and using cloud-init again, I get the same behaviour where the virtual machine restarts after terraform connects to it. |
This seems to be related to vmtools (perl guest os costumization in vSphere) which by default reboots the virtual machine after cloud-init is finished. It's kind of a mess and rather difficult to get rid of. |
Hello @lethargosapatheia,
Best regards |
Hi @vitality411. Thanks for reaching back! What I ended up doing was completely removing the symlink Your solution seems much more elegant, of course, because at least it hopefully observes the logic of vmware tools. P.S. I also want to reiterate that, in contrast to your issue as you describe it here (vmware/open-vm-tools#684), in my case the cloud-init scripts do run successfully, but that packer (and terraform for that matter) will connect to the virtual machines over ssh before a last reboot, and the provisioning will be interrupted. On the other hand I can't say for certain that my cloud-init scripts exceed 30 seconds, but I'm not ruling out anything. I use ansible with packes to provision the templates and I use ssh for terraform to retrieve certain pieces of information. In both cases the process is interrupted. |
The consequences of my workaround is that vmware-tools behaves like before: just executing gosc script without waiting for cloud-init and rebooting. In my case cloud-init is able to only execute the Even when your case is a little different I am sure that the cause is the same as ssh server will only start after cloud-init is able to finish I was able to see the difference clearly by running |
I have to correct my workaroud. This does not work on an upstream image as open-vm-tools.service starts before cloud-init and so is running with the default timeout (30s). The only working workaround I have is building my own image with wait-cloudinit-timeout=0 already set. |
For me changing the cloudimage beforehand is a no-go. That's exactly why I use the cloud-image, so that I have a baseline image that I can change afterwards as I like, without having to edit it beforehand and that's the reason why I switched from ubuntu iso images where I'd go through the whole install process to the cloud-image. I also rely on the updates that cloud-images provide and the image is ready-made. I know this has become rather off-topic, but I think this is important for people who come across this issue nonetheless. |
I couldn't agree more. For me changing cloudimage beforehand is also a no-go. |
Bug report
Since the Ubuntu 22.04 20230602 (https://cloud-images.ubuntu.com/jammy/20230602/) cloud image version (tested on OVA) the behaviour of cloudinit has changed. Until this version appeared the virtual machine would start, apply the cloudinit settings, but it wouldn't boot fully, that is to say, I wasn't getting a login prompt and, more importantly, the SSH server didn't start. This is the expected behaviour.
Since 20230602 this behaviour has unexpectedly changed. In this intermediary phase, the virtual machine does boot fully, I get a login prompt and the SSH starts.
For me this looks like a bug, because I expect the SSH server not to start before all the cloud-init settings have been applied, meaning, before the virtual machine actually reboots.
Of course, this behaviour completely disturbs automatic provisioning with tools like Packer, which rely on SSH server being available at the right time, when everything is ready. In my case, packer (1.7.9) starts doing the provisioning through ansible (2.14.6) and it's interrupted by the reboot, making it impossible to automate the process.
I can confirm the provisioning works properly with version 20230524 (https://cloud-images.ubuntu.com/jammy/20230524/) of Ubuntu 22.04, where the VM doesn't boot fully and restarts correctly, without starting the SSH server before rebooting.
I am using vSphere 6.7.0.54000 to provision my virtual machines.
Environment details
cloud-init logs
I'm uploading the cloudinit logs for both images using
cloud-init collect-logs
and those fromvar/log/cloud-init.log
.cloud-init-20230602.tar.gz
var-log-cloud-init-20230602.tar.gz
cloud-init-20230524.tar.gz
var-log-cloud-init-20230524.tar.gz
Please let me know if I need to add any information.
Best regards!
The text was updated successfully, but these errors were encountered: