AWS registry is not working properly with 21.4.26 #343

Open
YevheniiSemendiak opened this issue Jun 29, 2021 · 7 comments
Labels
2reproduce, bug (Something isn't working), Stale

Comments

@YevheniiSemendiak
Contributor

We cannot build images using Kaniko with the newest version of platformregistryapi on AWS clusters.

The problem appears when Kaniko pulls a cached layer from our registry to speed up the image build.
At this step, our registry fails with the following error:

aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
2021-06-29 07:55:44,393 - aiohttp.server - ERROR - Error handling request
Traceback (most recent call last):
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/root/.local/lib/python3.7/site-packages/sentry_sdk/integrations/aiohttp.py", line 123, in sentry_app_handle
    reraise(*_capture_exception(hub))
  File "/root/.local/lib/python3.7/site-packages/sentry_sdk/_compat.py", line 54, in reraise
    raise value
  File "/root/.local/lib/python3.7/site-packages/sentry_sdk/integrations/aiohttp.py", line 113, in sentry_app_handle
    response = await old_handle(self, request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp_remotes/x_forwarded.py", line 94, in middleware
    return await handler(request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/root/.local/lib/python3.7/site-packages/platform_registry_api/api.py", line 599, in handle
    auth_headers=auth_headers,
  File "/root/.local/lib/python3.7/site-packages/platform_registry_api/api.py", line 711, in _proxy_request
    async for chunk in client_response.content.iter_any():
  File "/root/.local/lib/python3.7/site-packages/aiohttp/streams.py", line 39, in __anext__
    rv = await self.read_func()
  File "/root/.local/lib/python3.7/site-packages/aiohttp/streams.py", line 386, in readany
    raise self._exception
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed

This means that the image layer was not downloaded completely and, as a result, Kaniko fails the layer checksum verification.
Example of the Kaniko error message:

INFO[0153] Found cached layer, extracting to filesystem
error building image: error building stage: failed to execute command: extracting fs from image: error verifying sha256 checksum; got "9a593afe07a07cc862136409824a9b391a352945a4dcbbd7de8084ab8b13d572", want "92c6c2f9925e03fe28f4e343e6aa128aeb78eccc59c50589cefa15db665857c8"
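The same kind of check can be run by hand on a layer blob pulled through the registry. This is a minimal sketch, assuming the blob has already been saved to a local file; the helper names `sha256_of_file` and `layer_digest_matches` and the chunk size are illustrative and are not part of Kaniko or the registry code:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as blob:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: blob.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def layer_digest_matches(path, expected):
    """Compare a local blob against an expected digest.

    `expected` may be written as "sha256:<hex>" or as bare hex.
    """
    expected_hex = expected.split(":", 1)[-1].lower()
    return sha256_of_file(path) == expected_hex
```

A blob that was cut off mid-transfer hashes to a different value than the complete one, which would be consistent with the "got"/"want" mismatch in the message above.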

This problem is not reproducible with platform registry api v21.2.11.
As a result, we downgraded our platformregistryapi on AWS clusters from v21.4.26 to v21.2.11.

I don't know yet what the reason might be.

@YevheniiSemendiak YevheniiSemendiak added the bug Something isn't working label Jun 29, 2021
@sentry-io

sentry-io bot commented Jul 20, 2021

Sentry issue: STAGING-Q1

@anayden
Contributor

anayden commented Jul 28, 2021

uvloop 0.15.x causes this behavior with AWS + Kaniko. uvloop 0.14.0 works just fine.
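To tell quickly whether an environment has a uvloop from the series reported as failing, something like the following could be used. It is only a sketch: the function names are made up for illustration, the "suspect" range is taken from this comment (0.15.x) rather than from any uvloop changelog, and `importlib.metadata` requires Python 3.8 or newer:

```python
from importlib import metadata


def is_suspect_uvloop(version):
    """Return True for the 0.15.x series, which this thread reports as failing."""
    return version.split(".")[:2] == ["0", "15"]


def installed_uvloop_version():
    """Return the installed uvloop version string, or None if it is not installed."""
    try:
        return metadata.version("uvloop")
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_suspect_uvloop(installed_uvloop_version() or "")` is false when uvloop is missing or is 0.14.0, and true for any 0.15.x release.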

In order to verify whether subsequent releases of uvloop have a fix, do the following:

1. Make sure you use AWS registry
2. git clone https://github.com/neuro-inc/mlops-demo-oss-names
3. git checkout ab4bfe9779cab394e8d5fd9174e1f2179bc0acac
4. neuro-flow build myimage
5. neuro-flow build myimage -F

If the bug is present, step 5 tries to use the layer cache from step 4 and fails due to an incomplete download. If the bug is fixed, the second build succeeds.
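Steps 2 to 5 could be scripted roughly as follows. This is a sketch under several assumptions: `git` clones into a directory named after the last URL segment, the `neuro-flow` commands are run from the repository root, and `neuro-flow` exits with a non-zero status when a build fails. Step 1 (using the AWS registry) still has to be checked by hand. The names `repro_steps` and `run_repro` are illustrative:

```python
import subprocess

REPO_URL = "https://github.com/neuro-inc/mlops-demo-oss-names"
COMMIT = "ab4bfe9779cab394e8d5fd9174e1f2179bc0acac"
# Assumption: git clones into a directory named after the last URL segment.
REPO_DIR = REPO_URL.rsplit("/", 1)[-1]


def repro_steps():
    """Return (command, working directory) pairs for steps 2 to 5."""
    return [
        (["git", "clone", REPO_URL], None),
        (["git", "checkout", COMMIT], REPO_DIR),
        (["neuro-flow", "build", "myimage"], REPO_DIR),
        (["neuro-flow", "build", "myimage", "-F"], REPO_DIR),
    ]


def run_repro():
    """Run the steps in order; return True only if every command exits with 0."""
    for command, cwd in repro_steps():
        # Assumption: a failed build is signalled by a non-zero exit status.
        if subprocess.run(command, cwd=cwd).returncode != 0:
            print("failed:", " ".join(command))
            return False
    return True
```

Calling `run_repro()` and getting `False` on the last command would correspond to the failing case described above.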

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the Stale label Jan 26, 2022
@YevheniiSemendiak
Contributor Author

There is a new release of uvloop, 0.16.0.
We could try bumping it and testing against the steps to reproduce reported by Alexey.

@github-actions github-actions bot removed the Stale label Jan 27, 2022
@asvetlov
Contributor

Who is a volunteer?

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the Stale label Aug 10, 2022
4 participants