[AEM 6.5] Asset share : Downloads limited to < 2147483647 bytes (2.147GB) due to int type #1115

Open
wt-jking opened this issue Apr 17, 2024 · 13 comments

Comments

@wt-jking

When you have a file > 2.147 GB, this class will throw a NumberFormatException. We're trying to add large file download support for a client and ran into this issue. An alternative would be to use BigInteger or Long when calculating the content-length header.

https://github.com/adobe/asset-share-commons/blob/develop/core/src/main/java/com/adobe/aem/commons/assetshare/content/renditions/download/impl/AssetRenditionDownloadResponse.java#L102

@davidjgonzalez
Contributor

Sounds reasonable! Care to make a PR?

@wt-jking
Author

After further review, this isn't the only problem with supporting large files. Because ByteArrayOutputStream is used extensively throughout the asset share download code, large file support will require a significant rewrite that uses more memory-friendly types.

@davidjgonzalez
Contributor

@wt-jking where/how exactly are you seeing this error?

I uploaded a 2.8 GB file and can download it directly (the original rendition), as well as add it to a zip file for download.

I am testing this on the AEM SDK locally right now. My content length header is coming back with 3004301321 which is bigger than an Int.

@wt-jking
Author

wt-jking commented Apr 22, 2024

@davidjgonzalez Interesting indeed. Our use case was to support up to 6GB. I was first getting a NumberFormatException on the code I commented on; then, after patching that, we started getting JDK OOM errors because the zip step tries to read the entire file into memory. Our file is an MXF mov (5.3GB) video with two other renditions (mp4 and webm). I'd be happy to connect with you more on this, as we still haven't found a solution (we're trying to avoid rewriting the whole download package if possible).

@davidjgonzalez
Contributor

@wt-jking I'm grabbing a 9GB file right now (taking some time to download)... A few thoughts.

I'm fairly certain ByteArrayOutputStream can handle more than Integer.MAX_VALUE bytes (even though the size is in bytes). I believe the limit is more on your heap size (since it would create it in memory), which would explain the OOM if you don't have enough memory for the JVM to hold all the bits in memory.

Few questions:

  1. What version of AEM?
  2. Are you on local AEM Quickstart?
  3. What version of Asset Share Commons?
  4. What exact operation(s) are triggering this?
  5. What size are the mp4 and webm renditions? (I assume you're trying to zip the original, mp4 and webm, and that's when this is failing?)

It sounds like if we can just change that setHeader(..) call to set a Long instead of an Int, whatever is calling that in AEM (I don't see this method called in ASC) wouldn't fail on a number format exception. You'd still have memory constraints (assuming you're on AEM 6.5; AEM CS uses AEM's download framework, which is more robust). I think we could do something like:

[Attached image "2024-04-22 at 5 04 PM"; not available in this text]
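
The attachment referenced above is not reproduced in this text. As a rough sketch of the general idea only (not the actual patch), a Content-Length value can be parsed with `Long.parseLong` so that values above 2147483647 no longer raise `NumberFormatException`. The class `ContentLengthParser` and its `parse` method are hypothetical names used for illustration; they are not part of Asset Share Commons.

```java
import java.util.OptionalLong;

/** Hypothetical helper (illustration only): parses a Content-Length header value. */
final class ContentLengthParser {
    private ContentLengthParser() {}

    /**
     * Returns the header value as a long, or empty if the value is
     * missing, non-numeric, or negative. Long.parseLong accepts values
     * far beyond Integer.MAX_VALUE (2147483647), which is where
     * Integer.parseInt starts throwing NumberFormatException.
     */
    static OptionalLong parse(String headerValue) {
        if (headerValue == null) {
            return OptionalLong.empty();
        }
        try {
            long length = Long.parseLong(headerValue.trim());
            return length >= 0 ? OptionalLong.of(length) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

Where the servlet container supports Servlet 3.1 or later, the parsed value could then be passed to `ServletResponse.setContentLengthLong(long)` rather than the int-based `setContentLength(int)`.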

@wt-jking
Author

wt-jking commented Apr 23, 2024

I don't think it's quite that simple; casting a Long to an int will result in an invalid content-length value. I'm not sure whether the browser would complain or not.
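
To illustrate the concern about casting, here is a small sketch of what a narrowing cast does to the two lengths mentioned in this thread. The class name `NarrowingDemo` is made up for the example; the arithmetic is standard Java behaviour (the cast keeps only the low 32 bits).

```java
/** Illustration only: shows how a long-to-int cast corrupts large lengths. */
final class NarrowingDemo {
    private NarrowingDemo() {}

    /** Plain narrowing cast: silently keeps only the low 32 bits. */
    static int truncate(long contentLength) {
        return (int) contentLength;
    }

    /** True if the value can be represented as an int without loss. */
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }
}
```

With these inputs, `truncate(5671541444L)` yields 1376574148 and `truncate(3004301321L)` yields -1290665975, neither of which is the real length. `Math.toIntExact(long)` throws `ArithmeticException` instead of silently truncating, which may be preferable wherever an int is unavoidable.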

Relevant Stack trace before fix

17.04.2024 17:21:48.652 *ERROR* [10.50.1.45 [1713374508645] POST /content/marinesdam/marines-asset-search/actions/download/_jcr_content/root/responsivegrid/download.download-asset-renditions.zip HTTP/1.1] org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Uncaught Throwable
java.lang.NumberFormatException: For input string: "5671541444"
	at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.base/java.lang.Integer.parseInt(Integer.java:652)
	at java.base/java.lang.Integer.parseInt(Integer.java:770)
	at com.adobe.aem.commons.assetshare.content.renditions.download.impl.AssetRenditionDownloadResponse.setHeader(AssetRenditionDownloadResponse.java:73) [com.adobe.aem.commons.assetshare.core:2.1.10]

After fix, this is the stack trace with the OOM issue. Note, I'm testing when using Author in preview mode.


It's failing in AssetRenditionStreamerImpl on this line

            assetRenditionDownloadResponse = sendDispatchForAssetRendition(request, response, asset, renditionName);
2024-04-23 09:16:43.410 DEBUG [com.adobe.aem.commons.assetshare.content.renditions.impl.dispatchers.StaticRenditionDispatcherImpl] Serving internal static rendition [ /content/dam/legacy/dam/test-large-files/JK-TEST-A024C816_221210W9_CANON.MXF.mov/jcr:content/renditions/original ] with resolved rendition name [ original ] through internal Sling Forward
2024-04-23 09:16:43.415 ERROR [com.adobe.aem.commons.assetshare.content.renditions.download.impl.AssetRenditionLargeFileDownloadResponse] Unable to parse content-length value [5671541444]
23.04.2024 09:16:43.806 *ERROR* [[0:0:0:0:0:0:0:1] [1713878203397] POST /content/marinesdam/marines-asset-search/actions/download/_jcr_content/root/responsivegrid/download.download-asset-renditions.zip HTTP/1.1] org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Uncaught SlingException
java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
	at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
	at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
	at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
	at com.adobe.acs.commons.util.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:53)
	at org.apache.sling.servlets.get.impl.helpers.StreamRenderer.streamResource(StreamRenderer.java:258) [org.apache.sling.servlets.get:2.1.38]
	at org.apache.sling.servlets.get.impl.helpers.StreamRenderer.render(StreamRenderer.java:164) [org.apache.sling.servlets.get:2.1.38]
	at org.apache.sling.servlets.get.impl.DefaultGetServlet.doGet(DefaultGetServlet.java:316) [org.apache.sling.servlets.get:2.1.38]
	at org.apache.sling.api.servlets.SlingSafeMethodsServlet.mayService(SlingSafeMethodsServlet.java:266) [org.apache.sling.api:2.22.0]
	at org.apache.sling.api.servlets.SlingSafeMethodsServlet.service(SlingSafeMethodsServlet.java:342) [org.apache.sling.api:2.22.0]
	at org.apache.sling.api.servlets.SlingSafeMethodsServlet.service(SlingSafeMethodsServlet.java:374) [org.apache.sling.api:2.22.0]
	at org.apache.sling.engine.impl.request.RequestData.service(RequestData.java:570) [org.apache.sling.engine:2.7.4]
	at org.apache.sling.engine.impl.filter.SlingComponentFilterChain.render(SlingComponentFilterChain.java:45) [org.apache.sling.engine:2.7.4]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:82) [org.apache.sling.engine:2.7.4]
	at com.day.cq.wcm.core.impl.WCMDeveloperModeFilter.doFilter(WCMDeveloperModeFilter.java:119) [com.day.cq.wcm.cq-wcm-core:5.12.172]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at com.day.cq.wcm.core.impl.WCMDebugFilter.doFilterWithErrorHandling(WCMDebugFilter.java:192) [com.day.cq.wcm.cq-wcm-core:5.12.172]
	at com.day.cq.wcm.core.impl.WCMDebugFilter.doFilter(WCMDebugFilter.java:159) [com.day.cq.wcm.cq-wcm-core:5.12.172]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at com.adobe.acs.commons.granite.ui.components.impl.include.IncludeDecoratorFilterImpl.doFilter(IncludeDecoratorFilterImpl.java:92) [com.adobe.acs.acs-aem-commons-bundle:5.0.10]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at com.day.cq.wcm.core.impl.WCMComponentFilter.doFilter(WCMComponentFilter.java:249) [com.day.cq.wcm.cq-wcm-core:5.12.172]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at com.day.cq.wcm.core.impl.page.PageLockFilter.doFilter(PageLockFilter.java:91) [com.day.cq.wcm.cq-wcm-core:5.12.172]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at com.day.cq.personalization.impl.TargetComponentFilter.doFilter(TargetComponentFilter.java:94) [com.day.cq.cq-personalization:5.12.44]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at com.adobe.granite.csrf.impl.CSRFFilter.doFilter(CSRFFilter.java:217) [com.adobe.granite.csrf:1.0.20.CQ650-B0002]
	at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.4]
	at org.apache.sling.engine.impl.SlingRequestProcessorImpl.processComponent(SlingRequestProcessorImpl.java:283) [org.apache.sling.engine:2.7.4]
	at org.apache.sling.engine.impl.SlingRequestProcessorImpl.dispatchRequest(SlingRequestProcessorImpl.java:323) [org.apache.sling.engine:2.7.4]
	at org.apache.sling.engine.impl.request.SlingRequestDispatcher.dispatch(SlingRequestDispatcher.java:211) [org.apache.sling.engine:2.7.4]
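
One general Java point that may be relevant to the trace above: `ByteArrayOutputStream` is backed by a single `byte[]`, and a Java array cannot hold more than `Integer.MAX_VALUE` elements, so buffering a multi-gigabyte rendition this way is bounded by array size as well as by heap. If the buffer is only there to learn the number of bytes written, a counting wrapper avoids holding the data at all. The sketch below is an assumption about one possible approach, not code from Asset Share Commons or ACS Commons; `CountingOutputStream` is a hypothetical name.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/**
 * Hypothetical wrapper (illustration only): forwards bytes to the
 * underlying stream and keeps a running total in a long, so nothing
 * is retained in memory and totals above 2 GB are counted correctly.
 */
final class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Delegate the whole chunk at once instead of byte-by-byte.
        out.write(b, off, len);
        count += len;
    }

    /** Total number of bytes written through this wrapper so far. */
    long getCount() {
        return count;
    }
}
```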

@davidjgonzalez
Contributor

@wt-jking IIUC there are 2 issues here:

  1. The content length was being set as an int, which causes the NumberFormatException (the browser doesn't care if the value in the header is bigger than a Java int). It seems this is fixed with the patch?
  2. Running out of memory when zipping large files. I'm not sure there's a way around this, or at least a way that wouldn't require significant rewrites. This is why the AEM team built the AEM as a Cloud Service Async Download Framework. Even the AEM 6.5 zip downloads are limited to heap space.

What version of Java are you running? IIRC since 11(?) heap space is automatically allocated based on overall free memory. If you're on Java I believe you still specify the max heap space. Curious as to what your settings are, and how much system/free memory you have when you hit this.

I was able to zip a 9GB file on my local -- but I'm on Java 11 and 32GB memory.
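
For anyone comparing setups, a minimal sketch for reporting the heap limits the running JVM actually ended up with, using only `java.lang.Runtime`. The class name `HeapReport` is hypothetical; the three `Runtime` methods are standard.

```java
/** Illustration only: reports the heap limits of the current JVM. */
final class HeapReport {
    private HeapReport() {}

    /** Maximum heap the JVM will attempt to use, in bytes (reflects -Xmx if set). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** One-line summary in MiB: maximum heap, currently committed heap, and free space within it. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mib = 1024L * 1024L;
        return String.format("max=%d MiB, committed=%d MiB, free-in-committed=%d MiB",
                rt.maxMemory() / mib, rt.totalMemory() / mib, rt.freeMemory() / mib);
    }
}
```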

@wt-jking
Author

Test setup in my local

#Start AEM Author
java -Xmx16G -jar aem-6.5-author-p4502.jar -debug 30303

g001206:author jking$ java -version
openjdk version "11.0.22" 2024-01-16
OpenJDK Runtime Environment Homebrew (build 11.0.22+0)
OpenJDK 64-Bit Server VM Homebrew (build 11.0.22+0, mixed mode)

It is worth noting that we are running an older version of asset-share-commons (2.1.10). Do you think we'd have better results running the latest? I didn't see any code changes in 2 years in the download package. I agree, from what I saw in the codebase, that the zipping would require a rewrite, which is why I originally closed this. We're hoping to find another solution leveraging Brand Portal, but if we do end up making this work for our needs, we'll be sure to share that with this project. We are limited to AMS hosting only due to client requirements.

@davidjgonzalez
Contributor

davidjgonzalez commented Apr 24, 2024

What version of AEM 6.5?

Also, I'm shocked you're running out of heap on a 2GB file with a 16GB heap... on a local, which ostensibly is doing very little else.

FWIW - though I don't think it would cause this - OpenJDK isn't supported. downloads.experiencecloud.adobe.com has Oracle JDK downloads if you want.

@davidjgonzalez
Contributor

@wt-jking also - curious - if you go to AEM Author > Assets and select this 2GB file and another smaller file (or maybe its renditions) to download ... does that work? AFAIK this ASC zipper/streamer should ~work the same (in terms of memory footprint) as AEM 6.5/BP.

@wt-jking
Author

I'm testing with a 5GB file locally. My version is

Installed Products
Adobe Experience Manager (6.5.10.0)

I'm able to download any combination of assets from author > assets without issue. We are certain it doesn't work the same as asset share.

PS - we just learned of a 4GB limit with Brand Portal as well, so this may be our only option to support large file downloads.

@wt-jking
Author

Same result with Oracle JDK 11.

@davidjgonzalez
Contributor

davidjgonzalez commented Apr 24, 2024

@wt-jking OK - I'm seeing the same on 6.5.14 with the latest ASC ... Agree this is going to be a bit of work to resolve, as it starts to leak into the Rendition Dispatchers as well.

I think the problem is that the path through the Zipper -> Streamer -> Rendition Dispatcher ends up copying the streams around rather than just streaming them through... but because of the reliance on Sling includes, it's tricky to understand exactly what's going on under the covers ... (at least without a deep dive .. it's been a while since I looked at the AEM 6.5 set download code)
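
To make the "streaming it through" idea concrete, here is a minimal sketch of adding one entry to a zip with a fixed-size buffer, so memory use stays constant regardless of file size. It uses only `java.util.zip` and is not the Asset Share Commons zipper; `StreamingZip`, `addEntry`, and the 64 KiB buffer size are assumptions made for the example. How the rendition `InputStream` would be obtained without going through a Sling include is left open, since that is the hard part described above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Illustration only: streams a source into a zip entry without buffering it whole. */
final class StreamingZip {
    // Assumed buffer size; any modest fixed size keeps memory use constant.
    private static final int BUFFER_SIZE = 64 * 1024;

    private StreamingZip() {}

    /**
     * Copies the source stream into a new zip entry chunk by chunk and
     * returns the number of uncompressed bytes written, counted as a long.
     * The caller remains responsible for closing both streams.
     */
    static long addEntry(ZipOutputStream zip, String entryName, InputStream source)
            throws IOException {
        zip.putNextEntry(new ZipEntry(entryName));
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            zip.write(buffer, 0, read);
            total += read;
        }
        zip.closeEntry();
        return total;
    }
}
```

If the `ZipOutputStream` wraps the servlet response's output stream directly, the bytes go to the client as they are read, at the cost of not knowing the final zip size up front (so no Content-Length header for the zip itself).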

@davidjgonzalez davidjgonzalez changed the title Asset share : Downloads limited to < 2147483647 bytes (2.147GB) due to int type [AEM 6.5] Asset share : Downloads limited to < 2147483647 bytes (2.147GB) due to int type Oct 29, 2024