-
Notifications
You must be signed in to change notification settings - Fork 972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HTTPCLIENT-1843] - Delegate compression handling to Apache Commons Compress #580
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.27.1 is the current version of Commons Compress.
a5a5890
to
48de43f
Compare
changed. |
httpclient5/src/main/java/org/apache/hc/client5/http/entity/CompressingEntity.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/CompressingEntity.java
Outdated
Show resolved
Hide resolved
|
||
private static final Map<String, String> COMPRESSION_ALIASES; | ||
static { | ||
final Map<String, String> aliases = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arturobernalg
Why just those stream types?
How about using:
org.apache.commons.compress.compressors.CompressorStreamProvider.getInputStreamCompressorNames()
org.apache.commons.compress.compressors.CompressorStreamProvider.getOutputStreamCompressorNames()
- Should I add an API to return
org.apache.commons.compress.compressors.CompressorStreamFactory.ALL_NAMES
?
httpclient5/src/main/java/org/apache/hc/client5/http/entity/CompressorFactory.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DecompressEntity.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DecompressEntity.java
Outdated
Show resolved
Hide resolved
public InputStream getContent() throws IOException { | ||
if (super.isStreaming()) { | ||
if (content == null) { | ||
content = getDecompressingStream(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do not lock, we should document this class as not thread-safe.
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DecompressEntity.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DecompressEntity.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DeflateInputStream.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DeflateInputStreamFactory.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/DeflateDecompressingEntity.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/entity/LazyDecompressInputStream.java
Outdated
Show resolved
Hide resolved
300da7f
to
0c68966
Compare
HI @garydgregory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arturobernalg
Thank you for your update.
- I have comments embedded throughout.
- There is some lazy-initialization code here and there, this should be called out in the class Javadoc making the class thread-unsafe, we have an annotation for that:
org.apache.hc.core5.annotation.ThreadingBehavior
.
...ng/src/test/java/org/apache/hc/client5/testing/extension/sync/StandardTestClientBuilder.java
Outdated
Show resolved
Hide resolved
...ng/src/test/java/org/apache/hc/client5/testing/extension/sync/StandardTestClientBuilder.java
Outdated
Show resolved
Hide resolved
...t5-testing/src/test/java/org/apache/hc/client5/testing/extension/sync/TestClientBuilder.java
Outdated
Show resolved
Hide resolved
...-testing/src/test/java/org/apache/hc/client5/testing/extension/sync/TestClientResources.java
Outdated
Show resolved
Hide resolved
...t5-testing/src/test/java/org/apache/hc/client5/testing/sync/AbstractIntegrationTestBase.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/impl/classic/HttpClientBuilder.java
Outdated
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/impl/classic/HttpClientBuilder.java
Show resolved
Hide resolved
httpclient5/src/main/java/org/apache/hc/client5/http/impl/classic/HttpClientBuilder.java
Outdated
Show resolved
Hide resolved
httpclient5/src/test/java/org/apache/hc/client5/http/entity/TestEntityBuilder.java
Outdated
Show resolved
Hide resolved
final GzipDecompressingEntity gunzipe = new GzipDecompressingEntity(out); | ||
Assertions.assertEquals("some kind of text", EntityUtils.toString(gunzipe, StandardCharsets.US_ASCII)); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra blank lines, the comments suffice.
d20a4d8
to
4cdc08a
Compare
HI @garydgregory |
a6aab71
to
ebfbd8c
Compare
5b84d95
to
9793eaf
Compare
* @since 5.5 | ||
*/ | ||
@Contract(threading = ThreadingBehavior.UNSAFE) | ||
public class DecompressEntity extends HttpEntityWrapper { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class name is mismatched with the "CompressingEntity" name, either they should both be "...ing" names, or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I git-hub resolved most of the comment I wrote. I think we should consistently name ALL our new classes either with or without the "ing" word part, either [Compress|Decompress]Entity
or [Compressing|Decompressing]Entity
. I think "ing" is better and in keeping with our existing other HttpEntity
implementations like IncomingHttpEntity
and DecompressingEntity
(conflicting).
I also think that all this new compression/decompression code should live in its own package, maybe org.apache.hc.client5.http.entity.compress
.
Please @garydgregory take another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @arturobernalg
Thank you for the updates! 😄
I added more comments throughout.
We're very close I think 😄
TY!
*/ | ||
public String getFormattedName(final String name) { | ||
if (name == null || name.isEmpty()) { | ||
LOG.warn("Compression name is null or empty"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API should not do any logging, especially at such a high level. This is just a string-to-string conversion API, it has no business rasing warnings IMO.
Args.notNull(entity, "Entity"); | ||
Args.notNull(contentEncoding, "Content Encoding"); | ||
if (!isSupported(contentEncoding, false)) { | ||
LOG.warn("Unsupported decompression type: {}", contentEncoding); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class should not do any logging IMO, especially warnings, it should throw an illegal argument exception or let the call site deal with a null result.
private Set<String> fetchAvailableInputProviders() { | ||
final Set<String> inputNames = compressorStreamFactory.getInputStreamCompressorNames(); | ||
return inputNames.stream() | ||
.map(String::toLowerCase) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might have some odd results depending on the locale context (https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/) if you do not use a Locale like the PR does above using ROOT.
private Set<String> fetchAvailableOutputProviders() { | ||
final Set<String> outputNames = compressorStreamFactory.getOutputStreamCompressorNames(); | ||
return outputNames.stream() | ||
.map(String::toLowerCase) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
* | ||
* @return a set of available output stream compression providers in lowercase. | ||
*/ | ||
private Set<String> fetchAvailableOutputProviders() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "fetch" method name prefix is unusual and surprising to me, I would follow the "get" naming convention.
@@ -38,7 +38,9 @@ | |||
* Common base class for decompressing {@link HttpEntity} implementations. | |||
* | |||
* @since 4.4 | |||
* @deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Javadoc tag is missing its comment to redirect reader.
try { | ||
compressorStream = CompressingFactory.INSTANCE.getCompressorOutputStream(contentEncoding, outStream); | ||
} catch (final CompressorException e) { | ||
throw new IOException("Error initializing decompression stream", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you mean "Error initializing compression stream"?
// Write compressed data | ||
super.writeTo(compressorStream); | ||
// Close the compressor stream after writing | ||
compressorStream.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this method be rewritten to use a try-with-resources block?
} | ||
|
||
/** | ||
* Returns the formatted name of the provided compression format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this Javadoc means. What's a "formatted" name? Maybe the API is misnamed? Is the job of the method to map from a content type value to a Commons Compress key? Is it something else? I think the comment should at least make it clear, which should guide us to a better API name.
final String formattedName = getFormattedName(name); | ||
return isSupported(formattedName, false) | ||
? createCompressorInputStream(formattedName, inputStream, noWrap) | ||
: inputStream; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Javadoc and implementation don't match:
- Javadoc states decompression is performed
- The code doesn't decompress if the format isn't supported.
Either the Javadoc needs updating, or, more likely, we should throw an exception indicating the operation is not supported for that format type name.
I must say I find the API and Javadoc confusing, we are creating a "Compressor" input stream in order to "decompress" with a "compression" format! What?
- Maybe calling the API
getDecompressorInputStream
would help because it "decompresses". I expect a "Compressor" to compress, not the other way around. - Maybe using the term "format type name" instead "compression format" would remove the need for the reader to constantly parse "compress" vs. "decompress" in the text.
- Maybe only using one term "decompress" or "compress" for one method (Javadoc and code) would alleviate the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @garydgregory . Agreed—the Javadoc and implementation don’t match. We’ll:
-
Throw an exception on unsupported formats instead of returning the original stream, aligning with Javadoc expectations.
-
Rename to getDecompressorInputStream to reflect decompression.
-
Standardize to “format type” instead of “compression format” to clarify.
These changes should make both the API and docs clearer. Thanks for flagging this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @garydgregory
Can you do another pass??
Thank you.
197c21a
to
63088bc
Compare
…ompress * Integrated Apache Commons Compress into CompressorFactory to handle compression and decompression of HTTP entities using supported algorithms (gzip, deflate, etc.).
63088bc
to
153155f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- TY for the work 👍
- Minor: For the Javadoc
since
tags, I think the next feature release will be5.5
since the current release is5.4.1
and the git master POM already has 5.5 in it. - Side note: In Commons Compress,
CompressorException
usually wraps anIOException
, so it's too bad that we wrap it in another IOException here ;-) Since compression and decompression is IO, I'll see about making CompressorException extend IOException.
…se 'ing' form (e.g., CompressingEntity) to align with existing conventions like DecompressingEntity. Moved compression/decompression classes to org.apache.hc.client5.http.entity.compress package
ea08574
to
363c7ca
Compare
Hi @garydgregory ,
Let me know if there’s anything else to address. |
Hi @arturobernalg, What do you think about apache/commons-compress#605 |
Hi @garydgregory |
Thank you for the review on apache/commons-compress#605, I merged that PR. |
All right. Let me know if we need to do another change. |
This PR addresses HTTPCLIENT-1843, allowing Apache HttpClient to delegate compression handling to Apache Commons Compress. The goal is to provide broader support for various compression algorithms while reducing custom implementation and maintenance.