[fcos] Enable openshift installer to decompress xz files and upload them as page blobs to Azure #3033
Conversation
Hi @jomeier. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with the ok-to-test command. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I compiled the installer and tried it out. It decompresses the fcos image and uploads it into a page blob container. But: as explained a few times before, the upload takes far too long (at least on my PC) and finally terminates with a timeout.
After 27 minutes Terraform stopped with this error message:
It seems more like a timeout to me because it always happens after that time on my PC. Next I will try the upload directly in an Azure cloud shell. This should be faster because we are already on Azure.
/retitle [fcos] Enable openshift installer to decompress xz files and upload them as page blobs to Azure
Extra rhcos92dmz.vhd file included
Next problem: the upload was much faster when I ran openshift-installer in the Azure cloud shell. But:
The fcos vhd file seems to be dynamic, but Azure images require FIXED vhd images. So we have to give them a fixed size and use an uploader that avoids uploading zero blocks. I don't know if Terraform can do this. I used this command line tool: azure-vhd-utils. It only uploads what is necessary and it prints a progress counter on the console, which would also be nice to have. I have no idea how to convert a VHD file from dynamic to fixed size in golang.
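Roughly what I mean by "not uploading zero blocks" (only a sketch; uploadNonZeroChunks and the upload callback are placeholders for whatever page blob "put pages" call the Azure SDK provides, not a real API):

```go
package main

import (
	"bytes"
	"io"
	"os"
)

// isZero reports whether every byte in the chunk is zero.
func isZero(chunk []byte) bool {
	return bytes.Count(chunk, []byte{0}) == len(chunk)
}

// uploadNonZeroChunks walks the fixed VHD in 2 MiB chunks and hands only the
// non-zero ones to upload. upload stands in for the page blob "put pages"
// call of whatever Azure SDK is used; it is hypothetical here.
func uploadNonZeroChunks(path string, upload func(offset int64, data []byte) error) error {
	const chunkSize = 2 * 1024 * 1024

	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	buf := make([]byte, chunkSize)
	var offset int64
	for {
		n, readErr := io.ReadFull(f, buf)
		if n > 0 && !isZero(buf[:n]) {
			if err := upload(offset, buf[:n]); err != nil {
				return err
			}
		}
		offset += int64(n)
		if readErr == io.EOF || readErr == io.ErrUnexpectedEOF {
			return nil
		}
		if readErr != nil {
			return readErr
		}
	}
}
```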
Correct me if I am wrong, but wouldn't it be a good idea if the fcos image were provided as a fixed-size, compressed vhd file by the Fedora CoreOS team? The fixed-size file contains lots of zeroes, so it should compress very well, and the step of converting the dynamic vhd file to a fixed-size vhd file wouldn't be necessary.
Awesome progress! Some small comments to help out. Also, I think you might have accidentally committed rhcos92dmz.vhd.
pkg/tfvars/internal/cache/cache.go
Outdated
return err
}
defer writer.Close()
//defer os.Remove(dest)
Minor question: is there a reason for this?
@sgreene570: No. I took it from Vadim's original code. It may be that some code snippets aren't necessary anymore.
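For context, the usual reason for a line like that is cleanup on failure; a minimal sketch of the pattern (not necessarily what cache.go needs, names are made up):

```go
package cache

import (
	"io"
	"os"
)

// decompressTo illustrates the cleanup pattern a "defer os.Remove(dest)"
// usually belongs to: if anything fails after the destination file has been
// created, the half-written file is removed again; on success it is kept.
func decompressTo(dest string, decompress func(w io.Writer) error) (err error) {
	out, err := os.Create(dest)
	if err != nil {
		return err
	}
	defer func() {
		out.Close()
		if err != nil {
			os.Remove(dest) // don't leave a partial file behind on failure
		}
	}()

	return decompress(out)
}
```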
go.mod
Outdated
@@ -100,6 +100,7 @@ require (
github.com/terraform-providers/terraform-provider-openstack v1.24.0
github.com/terraform-providers/terraform-provider-random v1.3.2-0.20191204175905-53436297444a
github.com/terraform-providers/terraform-provider-vsphere v1.14.0
+ github.com/ulikunitz/xz v0.5.6
Did you run go mod vendor as well?
No. Will do that.
pkg/tfvars/internal/cache/cache.go
Outdated
return nil
}

func decompressFileXZ(src, dest string) error {
This function is a lot like decompressFileGzip. Do you think we could combine them to save some lines? 😃
These lines are not needed anymore because everything we need is already in pkg/tfvars/internal/cache/cache.go
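If combining the two decompress helpers is still wanted, a rough sketch of what a single function could look like (function and parameter names are assumptions, not the actual cache.go code; only the reader constructors differ between the two formats):

```go
package cache

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"

	"github.com/ulikunitz/xz"
)

// newDecompressor returns a reader that decompresses src; gzip and xz are
// the only formats this sketch handles.
func newDecompressor(format string, src io.Reader) (io.Reader, error) {
	switch format {
	case "gz":
		return gzip.NewReader(src)
	case "xz":
		return xz.NewReader(src)
	default:
		return nil, fmt.Errorf("unsupported compression format %q", format)
	}
}

// decompressFile copies the decompressed contents of src to dest and could
// replace both decompressFileGzip and decompressFileXZ.
func decompressFile(src, dest, format string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	reader, err := newDecompressor(format, in)
	if err != nil {
		return err
	}

	out, err := os.Create(dest)
	if err != nil {
		return err
	}
	defer out.Close()

	_, err = io.Copy(out, reader)
	return err
}
```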
/assign
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
The code which should convert the dynamic vhd to a fixed-size vhd is committed. The installer can download, decompress, convert from dynamic to fixed vhd size and upload the decompressed VHD file to Azure, and a VM image can also be created. But: it creates a running VM neither on Azure nor in Windows Hyper-V. For sure it has something to do with the dynamic-to-fixed VHD size conversion. I pad the file with zeroes to fulfill the requirements of Azure, because Azure complains if the blob file size doesn't fit the size given in the VHD footer :-( If I use the Powershell command:
(would love to see its source code) I can start a VM with Hyper-V on my Windows PC with it. So this seems to be a good test setup for finding the problem. I found a vhd analyzer tool: https://github.com/franciozzy/VHD-Sherlock but it doesn't report any issues. Maybe someone can help me with that? I'm no VHD file structure expert. I'm giving up for today.
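One thing that can at least be checked locally is whether the blob matches the footer: a fixed VHD is the raw disk data followed by a 512-byte footer, the "current size" field at offset 48 of that footer (big-endian) plus those 512 bytes has to equal the file size, and Azure also wants the virtual size aligned to 1 MiB. A rough check along those lines, based on the published VHD footer layout (a sketch, not code from this PR):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// checkFixedVHD verifies the two things Azure seems to be unhappy about:
// the file size must be the footer's "current size" plus the 512-byte
// footer itself, and the virtual size must be 1 MiB aligned. Offsets follow
// the published VHD footer layout (cookie at 0, current size at 48, disk
// type at 60).
func checkFixedVHD(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return err
	}

	footer := make([]byte, 512)
	if _, err := f.ReadAt(footer, info.Size()-512); err != nil {
		return err
	}

	if string(footer[0:8]) != "conectix" {
		return fmt.Errorf("no VHD footer cookie found at the end of the file")
	}

	currentSize := binary.BigEndian.Uint64(footer[48:56])
	diskType := binary.BigEndian.Uint32(footer[60:64])

	if diskType != 2 {
		return fmt.Errorf("footer disk type is %d, expected 2 (fixed)", diskType)
	}
	if uint64(info.Size()) != currentSize+512 {
		return fmt.Errorf("file is %d bytes, but footer expects %d data bytes plus a 512 byte footer",
			info.Size(), currentSize)
	}
	if currentSize%(1024*1024) != 0 {
		return fmt.Errorf("virtual size %d is not aligned to 1 MiB", currentSize)
	}
	return nil
}
```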
I would like to propose a different strategy because this is getting more and more ugly :-) I would use Terraform to create a helper VM on Azure where all this download, extract, conversion and upload work is performed. We could use standard tools (qemu-img, azcopy) for that, and things would be much faster because everything already happens on Azure. Does this sound feasible?
In my opinion Terraform is configured wrongly in the installer. The basic provisioners local-exec and remote-exec don't work: the installer can't handle the "internal-plugin" command, which is necessary to finally get this fcos decompression, conversion and upload code running. I'd love to get your feedback on this image upload strategy; I don't know how to implement it without remote-exec. Who wrote the Terraform code in the installer? Maybe he or she can help here?
I found out why the built-in provisioners (remote-exec, local-exec, ...) don't work in the openshift-installer. The Terraform library spawns a sub-process with the binary it's linked into, and this sub-process gets a few command line args starting with "internal-plugin". The original Terraform binary knows this parameter; openshift-installer doesn't. All I had to do was add a Cobra command line parameter for "internal-plugin" and pass a few arguments through to the Terraform library function that takes care of it. In my opinion this will enhance openshift-installer's Terraform capabilities a lot, and it will help to get my PoC for the image upload running. Currently I'm proud like a king :-) The PR takes a while ...
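Roughly what this looks like with Cobra (just a sketch: runInternalPlugin is a hypothetical stand-in for the vendored Terraform function that actually serves the built-in plugin, and the real wiring depends on the Terraform version the installer vendors):

```go
package main

import (
	"fmt"

	"github.com/spf13/cobra"
)

// runInternalPlugin is a hypothetical stand-in for the vendored Terraform
// function that actually serves the built-in plugin; the real call depends
// on the Terraform version the installer vendors.
func runInternalPlugin(args []string) error {
	return fmt.Errorf("not wired up in this sketch: %v", args)
}

// newInternalPluginCommand is the idea described above: a hidden Cobra
// command that accepts the "internal-plugin" invocation Terraform uses when
// it re-executes its own binary for built-in provisioners (local-exec,
// remote-exec, ...) and forwards the raw arguments.
func newInternalPluginCommand() *cobra.Command {
	return &cobra.Command{
		Use:                "internal-plugin",
		Hidden:             true,
		DisableFlagParsing: true, // pass everything through untouched
		RunE: func(cmd *cobra.Command, args []string) error {
			return runInternalPlugin(args)
		},
	}
}
```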
…s (local-exec, remote-exec, ...) are used.
… converts it to a fixed image and uploads it to Azure. Everything with standard tools (AZcopy, qemu-img, ...).
What does this PR do:
This 'supporting VM' thing might look a little bit confusing at first but ... Advantages over the previous solution:
Please try it out and don't hesitate to give me feedback. Thanks a lot also to @vrutkovs and @mjudeikis for not giving up on me ;-)
This would be a temporary solution until the fcos image lands in the Azure Marketplace.
@jomeier thank you for working on this! coreos/fedora-coreos-tracker#148 would then just be one more step to automate.
/hold
@LorbusChris: If the coreos team converts the fcos image to a fixed-size VHD (I think that's a good idea): how does this 8 GByte image get into the Azure storage account? If the installer must upload it, this will take very long, which doesn't seem feasible if it's necessary each time we run the installer. Yes, it would be great if we didn't have to deal with the fcos image at all. I hope it gets to the Azure Marketplace very soon.
I think this is a larger topic and needs to be decided on the architects' side. You could create an enhancement proposal for the installer to support remote-exec and local-exec as a first step in https://github.com/openshift/enhancements However, I don't think running an additional VM for this step is the general direction we want to take here (at least not by default). Right now we want to keep the delta between the fcos and master branches small, so I'm very hesitant to accept this as-is. xref'ing: #3042
@jomeier btw do you mind updating the first comment in this issue with the summary you wrote just above here? ^
@LorbusChris: Treat this PR as a way (even if it's hacky) to get the image automatically uploaded to the correct location during the installation process, so people can test Azure support of OKD until the fcos image is available on Azure. I fully understand your arguments and I don't expect you to accept the PR. Everything is fine.