
Azure: Fixed size vhd file necessary to create Azure VM images #361

Closed
jomeier opened this issue Jan 31, 2020 · 22 comments
@jomeier

jomeier commented Jan 31, 2020

Hi,

currently I am trying to get OpenShift OKD working with the Azure FCOS image.

The OKD installer downloads the xz compressed VHD file, decompresses it and uploads it into an Azure storage blob.

If we want to create an Azure VM image out of this VHD file, Azure complains that it is dynamically sized rather than fixed size.

That's a problem because the common tools for converting VHD files from dynamic to fixed size are mostly written in PowerShell or similar languages.

Is it possible for you to offer FCOS as a compressed fixed-size VHD image, so we don't have to do the conversion ourselves? It is not trivial, but it is required for Azure.

Because the image will mostly be padded with zeroes, compression should be able to take care of the size increase.
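As a quick sanity check of that claim, long runs of zeroes do compress to almost nothing. A minimal sketch (the 8 MiB file size is arbitrary, not the real image size):

```shell
# Zero-filled data compresses to almost nothing. Create an 8 MiB file of
# literal zeroes, xz-compress a copy, and compare the sizes.
dd if=/dev/zero of=zeros.img bs=1M count=8 status=none
xz -k zeros.img                      # -k keeps the original for comparison
orig=$(stat -c%s zeros.img)
comp=$(stat -c%s zeros.img.xz)
echo "original=${orig} compressed=${comp}"
```

The compressed file ends up on the order of a kilobyte, which is why the fixed-size padding should cost little in the published `.xz` artifact.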

Greetings,

Josef

@jomeier
Author

jomeier commented Feb 1, 2020

And if possible: offer it on the Azure marketplace, please. This would make things so much easier.

@LorbusChris
Contributor

xref: openshift/installer#3033 (comment)

Could we change coreos-assembler buildextend-azure to output a compressed fixed size VHD image that could be uploaded directly to Azure without any further steps?

cc'ing @dustymabe and @darkmuggle who I believe have worked on that command in the past.

@jlebon
Member

jlebon commented Feb 3, 2020

@jomeier Looks like you've already figured out the magic qemu-img convert flags to pass in openshift/installer#3033. Do you want to suggest it as a change to cosa? (Likely just needs a tweak in qemuvariants.py).

It would also be good to sanity-check that the compressed image size indeed isn't affected much.

@lucab
Contributor

lucab commented Feb 3, 2020

offer it on the Azure marketplace

This is tracked already at #148. Let's keep it out of this ticket here.

The OKD installer downloads the xz compressed VHD file, decompresses it and uploads it into an Azure storage blob.
If we want to create an Azure VM image out of this VHD file Azure complains about that it is not fixed size but dynamic size.

I feel like there is some piece that I'm missing.
To the best of my knowledge, Azure VHD images for FCOS and RHCOS are built the same way (but just differ in the final compression format). The latter seems to work with the blob we upload, but the former doesn't when the OKD installer uploads it.
Are we doing some hidden post-processing on RHCOS? Or is there some specific setting at upload time that we should document for FCOS users?

@LorbusChris
Contributor

@jlebon
Member

jlebon commented Feb 3, 2020

Ahh, I think I see it. ore uses the azure-vhd-utils Go package, which conveniently converts dynamic to fixed on the fly: https://github.com/microsoft/azure-vhd-utils#how-upload-work. And again, RHCOS does uploads today; FCOS doesn't.

So presumably, the installer could vendor that package too and use that API. I'm not against just having it fixed in the first place in cosa, though, if the size difference is indeed negligible.

@darkmuggle
Contributor

xref: openshift/installer#3033 (comment)

Could we change coreos-assembler buildextend-azure to output a compressed fixed size VHD image that could be uploaded directly to Azure without any further steps?

cc'ing @dustymabe and @darkmuggle who I believe have worked on that command in the past.

File an issue against COSA. We can look.

@LorbusChris
Contributor

Possibly relevant here, from https://docs.microsoft.com/en-us/azure/virtual-machines/windows/prepare-for-upload-vhd-image#convert-the-virtual-disk-to-a-fixed-size-and-to-vhd:

Regarding the size of the VHD:

All VHDs on Azure must have a virtual size aligned to 1MB. When converting from a raw disk to VHD you must ensure that the raw disk size is a multiple of 1 MB before conversion. Fractions of a megabyte will cause errors when creating images from the uploaded VHD.
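That round-up to the next 1 MiB boundary is simple integer arithmetic (the function name below is just illustrative):

```python
MIB = 1024 * 1024

def round_up_to_mib(size: int) -> int:
    """Round a byte count up to the next 1 MiB boundary (no-op if aligned)."""
    return (size + MIB - 1) // MIB * MIB

print(round_up_to_mib(3 * MIB))      # already aligned: 3145728 (unchanged)
print(round_up_to_mib(3 * MIB + 1))  # one byte over: rounds up to 4194304
```

Note that the `(size + MIB - 1) // MIB` form leaves an already-aligned size unchanged, whereas `size // MIB + 1` would always add a full extra megabyte.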

@jomeier
Author

jomeier commented Feb 3, 2020

@jlebon:
You are right: azure-vhd-utils seems to do the conversion job. Currently I upload the decompressed VHD file, and the storage blob is a page blob of 8 GiB.

The advantage of azure-vhd-utils is that it only transfers the data that is actually necessary. Because conversion to fixed size also expands the VHD file on disk, transfer efficiency plays a big role.

@LorbusChris: I made my first OKD tests on Azure with FCOS and azure-vhd-utils. It does the job.

@jomeier
Author

jomeier commented Feb 3, 2020

Maybe it's sufficient to simply use azure-vhd-utils with FCOS in the Fedora CoreOS CI/CD pipeline as well.

@lucab
Contributor

lucab commented Feb 3, 2020

I'm not against just having it fixed in the first place in cosa, though, if the size difference is indeed negligible.

I fetched the latest testing-devel image, converted to "fixed disktype" VHD (via qemu-img), and re-compressed (via xz):

  • the final compressed size is only slightly larger (470 MiB vs 435 MiB)
  • the intermediate uncompressed size is much larger (9.8 GiB vs 1.7 GiB)
  • the compression step (xz -T0) on my quad-core takes about 1.5x the current run time for "fixed" (6 min vs 4 min)

@jomeier
Author

jomeier commented Feb 3, 2020

@lucab:
The size must be a multiple of 1 MiB:

#!/bin/bash
set -euo pipefail

sudo apt-get update -qq
sudo apt-get install -y qemu-utils gawk

# ${vhd_url} points at the published fcos .vhd.xz artifact.
wget "${vhd_url}" -O fcos.vhd.xz
xz -d fcos.vhd.xz

echo "Converting VHD file to raw image..."
qemu-img convert -f vpc -O raw fcos.vhd fcos.raw

# Calculate a new size which is a multiple of 1 MiB (rounded up,
# unchanged if already aligned).
rawdisk="fcos.raw"
MB=$((1024*1024))
size=$(qemu-img info -f raw --output json "$rawdisk" | gawk 'match($0, /"virtual-size": ([0-9]+),/, val) {print val[1]}')
rounded_size=$(((size + MB - 1) / MB * MB))

echo "Resizing raw image to ${rounded_size} bytes..."
qemu-img resize -f raw "$rawdisk" "$rounded_size"
qemu-img convert -f raw -o subformat=fixed,force_size -O vpc "$rawdisk" fcos-fixed.vhd

It's essential to use '-o subformat=fixed,force_size'. qemu-img needs this to create a fixed-size VHD that Azure accepts.

@jlebon
Member

jlebon commented Feb 3, 2020

intermediate uncompressed size is much larger (9.8 GiB vs 1.7 GiB)

Ouch. Hmm, I guess the qemu-img convert output file isn't sparsified at the host filesystem level? If you still have that file around, does fallocate -d help? Worst case, if we do bring this into cosa, maybe what we want is something similar to what we do with GCP, which is to compress at the build step too.
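For reference, `fallocate -d` (dig holes) deallocates any all-zero blocks of a file in place, turning a fully allocated zero-padded file back into a sparse one. A quick sketch on a scratch file (assumes a filesystem with hole-punching support, e.g. ext4 or XFS):

```shell
# Write 8 MiB of literal zeroes (allocated on disk, not sparse),
# then punch holes where the blocks are all zero.
dd if=/dev/zero of=demo.img bs=1M count=8 status=none
before=$(du -B1 demo.img | cut -f1)
fallocate -d demo.img || echo "hole punching unsupported on this filesystem"
after=$(du -B1 demo.img | cut -f1)
echo "on-disk before=${before} after=${after}"
```

The apparent (stat) size stays at 8 MiB throughout; only the on-disk allocation reported by `du` shrinks.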

jlebon added a commit to jlebon/coreos-assembler that referenced this issue Feb 3, 2020
We changed this behaviour during the refactor. There's quite a bit of
history here on why we do this. But a major one at least is that we want to
be able to just `rename(2)` the final build artifacts into place. This
saves a bunch of time and I/O.

I noticed this due to the fact that we were losing sparsity from the
output of `qemu-img convert` because `shutil.move` doesn't do the
equivalent of `cp --sparse=auto`. This patch fixes that, though I think
we should also be able to change that call to a simple `os.rename()` in
a follow-up to make it explicit.

Related: coreos/fedora-coreos-tracker#361
@jlebon
Member

jlebon commented Feb 3, 2020

OK, so the image qemu-img convert outputs is sparse already. There's one tweak though which prevented this from happening in cosa: coreos/coreos-assembler#1097. With that patch:

$ du -h *-azure*
1.8G    fedora-coreos-31.20200203.dev.0-azure.x86_64.vhd
8.1G    fedora-coreos-31.20200203.dev.0-azure.x86_64.vhd.bak 

(The .bak there is without that patch.)

@jomeier
Author

jomeier commented Feb 4, 2020

@darkmuggle
I tried converting the VHD file to fixed size without the force_size parameter, and it failed to boot both in Hyper-V and in an Azure VM.

There is a known incompatibility in qemu regarding Azure VMs.

The conversion should look like this IMHO:

qemu-img convert -f raw -o subformat=fixed,force_size -O vpc fcos.raw fcos-fixed.vhd

https://possiblelossofprecision.net/?p=2452
https://serverfault.com/questions/770378/problems-preparing-a-disk-image-for-upload-to-azure

Greetings,

Josef

jlebon added a commit to coreos/coreos-assembler that referenced this issue Feb 4, 2020
@darkmuggle
Contributor

I'm hitting a problem where qemu-img renders the VHD to a raw file and it's missing the qemu header.

@LorbusChris
Contributor

Is this a bug we should raise at https://bugs.launchpad.net/qemu/ ?

@jomeier
Author

jomeier commented Feb 10, 2020

@darkmuggle:
Have you tried running each step of my script above? Currently I process the FCOS vhd.xz file like that and upload the converted image this way, and it works like a charm.

@jlebon
Member

jlebon commented Mar 30, 2020

Can someone verify whether the latest Azure images still suffer from this? It should be fixed already by coreos/coreos-assembler#1131.
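One way to verify without a full upload: per the VHD format specification, the footer is the last 512 bytes of the file, starts with the `conectix` cookie, and carries a big-endian disk-type field at offset 60 (2 = fixed, 3 = dynamic). A sketch of such a check (function name is illustrative; the demo file at the end is synthetic, not a real image):

```python
import struct

def vhd_disk_type(path: str) -> str:
    """Read the 512-byte VHD footer and report the disk-type field."""
    with open(path, "rb") as f:
        f.seek(-512, 2)                     # footer is the last 512 bytes
        footer = f.read(512)
    if footer[0:8] != b"conectix":          # footer cookie per the VHD spec
        raise ValueError("no VHD footer found")
    (disk_type,) = struct.unpack(">I", footer[60:64])
    return {2: "fixed", 3: "dynamic", 4: "differencing"}.get(
        disk_type, f"unknown ({disk_type})")

# Demonstrate on a synthetic file with a minimal fixed-disk footer appended.
footer = bytearray(512)
footer[0:8] = b"conectix"
footer[60:64] = struct.pack(">I", 2)        # 2 = fixed
with open("demo.vhd", "wb") as f:
    f.write(b"\x00" * 1024 + bytes(footer))
print(vhd_disk_type("demo.vhd"))            # → fixed
```

Running the same check against a freshly downloaded Azure `.vhd` would show whether the published artifact is now fixed-size.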

@jomeier
Author

jomeier commented Mar 30, 2020

@jlebon:
Just trying it out now.

@jomeier
Author

jomeier commented Mar 30, 2020

@jlebon @LorbusChris
It works!

Uploading the expanded VHD file from my PC takes forever (at most 5 MB/s), so as always I used a helper VM on Azure. It's much faster to download/upload the VHD file if you are already on Azure :-)

I think we can close this issue.

Thanks a lot !

@LorbusChris
Contributor

🎉
Thank you for verifying @jomeier!
