-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PR: Add libpng and libjpeg-turbo requirement into conda recipe #2301
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, it's great that all tests pass!
I have a few questions / comments.
In particular, I'm thinking if using the raw libjpeg API would enable us to fix some of the incompatibilities that arises from the other libraries -- IIRC, libjpeg-turbo had the same ABI as libjpeg, so one could switch to use it at runtime.
Also, could you add a note somewhere (maybe in the README?) explaining what are the steps that the user should do to get the image extensions compiled depending on their system?
# Pillow introduces unwanted conflicts with libjpeg-turbo, as it depends on jpeg | ||
# The fix depends on https://github.com/conda-forge/conda-forge.github.io/issues/673 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain when / how these conflicts materialize?
Could that lead to segfaults when the user imports torchvision and PIL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conflict here is that we're substituting the libjpeg library, which is used primarily by Pillow, however this may not affect us as the ABI between jpegturbo and libjpeg is the same AFAIK.
Most Linux distributions distinguish between libturbojpeg (the library itself) and libjpeg-turbo (the libjpeg version using turbojpeg), thus, they do not have this problem, as we link against libturbojpeg, rather than against libjpeg-turbo.
In conda-forge, we have this conflict because the recipe for libjpeg-turbo produces both the turbo flavored libjpeg and libturbojpeg. I spoke to @isuruf, one of the maintainers in conda-forge, and he told me that they are working towards a solution in conda-forge/conda-forge.github.io#673, but right now there are no alternatives to this conflict.
I guess we are not having any problem with this conflicting installation, as the tests are passing. However, we should encourage users to install PyTorch and torchvision on a separate environment, so we prevent other errors that could be caused as part of this conflict and we're not aware of
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, this makes me a bit worried, because AFAIK we don't have tests running on OSX, only Linux and Windows (we do compile on OSX though).
So if there is a problem happening in OSX we wouldn't be able to see it in CI.
Do you think that, if we were to use libjpeg API, everything would be safer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't they running on the binary_macos_conda_*_pyxx
CircleCI pipelines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, yeah, we were not running tests for OSX with wheels, only conda.
But still, do you think if we were to use libjpeg API it would make things safer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think right now we don't have much trouble on our setup, as we are linking against libturbojpeg and not libjpeg directly. If we were compilling against libjpeg, then we would be in trouble. Right now the only conflict that we have is the one on conda-forge.
A provisional solution would be compilling libturbojpeg (Without libjpeg) ourselves and publish it into the conda pytorch channel until conda-forge/conda-forge.github.io#673 is fixed. What do you think about this?
@@ -1,3 +1,6 @@ | |||
channel_sources: | |||
- defaults,conda-forge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it now safe to use conda-forge
as well? At some point we had issues with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems only libturbojpeg is being pulled from there, so all other packages are being pulled from defaults or the main PyTorch channel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @andfoy !
Caused failure on master: _________________________ ImageTester.test_decode_jpeg _________________________
RuntimeError: No such operator image::decode_jpeg
During handling of the above exception, another exception occurred:
self = <test_image.ImageTester testMethod=test_decode_jpeg>
def test_decode_jpeg(self):
for img_path in get_images(IMAGE_ROOT, "jpg"):
img_pil = torch.from_numpy(np.array(Image.open(img_path)))
size = os.path.getsize(img_path)
img_ljpeg = decode_jpeg(torch.from_file(img_path, dtype=torch.uint8, size=size))
norm = img_ljpeg.shape[0] * img_ljpeg.shape[1] * img_ljpeg.shape[2] * 255
err = torch.abs(img_ljpeg.flatten().float() - img_pil.flatten().float()).sum().float() / (norm)
self.assertLessEqual(err, 1e-2)
with self.assertRaisesRegex(ValueError, "Expected a non empty 1-dimensional tensor."):
decode_jpeg(torch.empty((100, 1), dtype=torch.uint8))
with self.assertRaisesRegex(ValueError, "Expected a torch.uint8 tensor."):
decode_jpeg(torch.empty((100, ), dtype=torch.float16))
with self.assertRaisesRegex(RuntimeError, "Error while reading jpeg headers"):
> decode_jpeg(torch.empty((100), dtype=torch.uint8))
E AssertionError: "Error while reading jpeg headers" does not match "No such operator image::decode_jpeg"
|
This PR has broken the doc push job on PyTorch main repo:
I'm going to hotfix it by pinning doc push to an older version of PyTorch, but this will need to get fixed eventually. |
@ezyang can you point to the location of the location of where the doc push job is defined? This issue arises because the image might have an older version of libpng. We should guard against this case (I thought we did, but looks like we missed something) |
@andfoy the errors pointed out by @peterjc123 are due to the latest CI changes that were merged just before yours, in #2328 |
@ezyang, is it possible to get access to the full log of that build? |
…ch#2301) * Add libpng requirement into conda recipe * Try to install libjpeg-turbo * Add PNG reading capabilities * Remove newline * Add image extension to compilation instructions * Include png functions as part of the main library * Update CMakeLists * Detect if building on conda-build * Debug * More debug messages * Print globbed libreries * Print globbed libreries * Point to correct PNG path * Remove libJPEG preventively * Debug extension loading * Link libpng explicitly * Link with PNG * Add PNG reading capabilities * Add libpng requirement into conda recipe * Try to install libjpeg-turbo * Remove newline * Add image extension to compilation instructions * Include png functions as part of the main library * Update CMakeLists * Detect if building on conda-build * Debug * More debug messages * Print globbed libreries * Print globbed libreries * Point to correct PNG path * Remove libJPEG preventively * Debug extension loading * Link libpng explicitly * Link with PNG * Install libpng on conda-based wheel distributions * Add -y flag * Add -y flag to yum * Locate LibPNG on windows conda * Remove empty else * Copy libpng16.so * Copy dylib on Mac * Improve check on Windows * Try to install ninja using conda on windows * Use libpng on Windows * Package lib on windows wheel * Point library to the correct place * Include binaries as part of wheel * Copy libpng.so on linux * Look for png.h on Windows when using conda-build * Do not skip png tests on Mac/Win * Restore libjpeg-turbo * Install jpeg-turbo on wheel distributions * Install libjpeg-turbo from conda-forge on wheel distributions * Do not pull av on conda-build * Add pillow disclaimer * Vendors libjpeg-turbo 2.0.4 * Merge JPEG work * Remove submodules * Regenerate circle config * Fix style issues * Fix C++ style issues * More style corrections * Add JPEG-turbo to linking libraries * More style corrections * More style corrections * More style corrections * Install libjpeg-turbo-devel * Install libturbo-jpeg on typing pipeline * Update Circle template * Windows and Unix turbojpeg have the same linking name * Install turbojpeg-devel instead of libjpeg-turbo * Copy TurboJPEG binaries to wheel * Move test image * Move back test image * Update JPEG test path * Remove dot from extension * Move image functions to extension * Use stdout arg in subprocess * Disable image extension if libpng or turbojpeg are not found * Append libpng stdout * Prevent list appending on lists * Minor path correction * Minor error correction * Add linking flags * Style issues correction * Address minor review corrections * Refactor library search * Restore access index * Fix JPEG tests * Update libpng version in Travis * Add -y flag * Remove dot * Update libpng using apt * Check libpng version * Change libturbojpeg binary * Update import * Change call * Restore av in conda recipe * Minor error correction * Remove unused comment in travis.yml * Update README * Fix missing links * Remove fixes for 16.04 Co-authored-by: Ryad ZENINE <[email protected]>
pytorch#2301)" (pytorch#2375) This reverts commit 766721b.
Fixes #2291