CryptoStreamFactory creates sympathetic chunking OutputStreams #587

Merged 3 commits into develop from ckozak/CryptoStreamFactory_chunking on Nov 12, 2021

Conversation

carterkozak

For detailed analysis of the problem, see
#586

CipherOutputStream appears to perform very poorly on large
buffers, but substantially better when data is segmented into
small chunks, which can be done by looping over the original buffer.

With this in place, the openssl wrapper no longer provides any
benefit over JCE.
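
To illustrate the chunking idea described above, here is a minimal sketch of a stream that loops over a large input buffer and forwards it to its delegate in bounded slices. It is not the code merged in this PR: the outer class name, the `CHUNK_SIZE` value of 16 KiB, and the `RecordingOutputStream` helper are all assumptions made for illustration.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; names and chunk size are assumptions, not the PR's code. */
final class ChunkingSketch {

    /** Assumed chunk size, chosen only for illustration. */
    static final int CHUNK_SIZE = 16 * 1024;

    /** Forwards array writes to the delegate in slices of at most CHUNK_SIZE bytes. */
    static final class ChunkingOutputStream extends FilterOutputStream {
        ChunkingOutputStream(OutputStream delegate) {
            super(delegate);
        }

        @Override
        public void write(byte[] buffer, int offset, int length) throws IOException {
            int written = 0;
            while (written < length) {
                int chunk = Math.min(CHUNK_SIZE, length - written);
                // No copy is made: each call passes a window of the original buffer.
                out.write(buffer, offset + written, chunk);
                written += chunk;
            }
        }
    }

    /** Test helper that records the size of every write it receives. */
    static final class RecordingOutputStream extends OutputStream {
        final List<Integer> writeSizes = new ArrayList<>();

        @Override
        public void write(int b) {
            writeSizes.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeSizes.add(len);
        }
    }

    /** Writes inputLength zero bytes in one call and returns the delegate write sizes. */
    static List<Integer> chunkSizesFor(int inputLength) {
        RecordingOutputStream recorder = new RecordingOutputStream();
        try (OutputStream stream = new ChunkingOutputStream(recorder)) {
            stream.write(new byte[inputLength]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return recorder.writeSizes;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizesFor(40 * 1024)); // prints [16384, 16384, 8192]
    }
}
```

In real use the delegate would presumably be a `javax.crypto.CipherOutputStream` rather than a recorder, so that the cipher only ever sees small inputs.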

==COMMIT_MSG==
CryptoStreamFactory creates sympathetic chunking OutputStreams with performance characteristics matching the apache commons-crypto implementation
==COMMIT_MSG==

@policy-bot policy-bot bot requested a review from xRuiAlves November 9, 2021 16:06
@carterkozak carterkozak requested review from ellisjoe and robert3005 and removed request for xRuiAlves November 9, 2021 16:06
* in order to prevent degraded performance on large buffers as described in
* <a href="https://github.com/palantir/hadoop-crypto/pull/586">hadoop-crypto#586</a>.
*/
static final class ChunkingOutputStream extends FilterOutputStream {
Contributor

any reason a BufferedOutputStream won't work for you here?

Contributor Author

Yep! BufferedOutputStream actually does the opposite of what we want here! Given a large input (beyond the configured buffer size), the BufferedOutputStream will flush any data in its buffer, and write the input buffer directly to the delegate stream as is.

In most cases a BufferedOutputStream is helpful to avoid native overhead for crypto, but I think that's a bit outside of the scope of this change, and introducing a buffered stream around our chunking stream may result in additional unnecessary copies.
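
The pass-through behavior described in this reply can be observed with a small experiment. The sketch below is illustrative only: the class name, the `RecordingOutputStream` helper, and the chosen sizes are assumptions. It wraps a recording delegate in a `BufferedOutputStream`, writes a few pending bytes, then writes an input larger than the buffer.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo; names and sizes are assumptions for illustration. */
final class BufferedPassThroughDemo {

    /** Test helper that records the size of every write it receives. */
    static final class RecordingOutputStream extends OutputStream {
        final List<Integer> writeSizes = new ArrayList<>();

        @Override
        public void write(int b) {
            writeSizes.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeSizes.add(len);
        }
    }

    /**
     * Writes pendingLength bytes, then inputLength bytes, through a
     * BufferedOutputStream and returns the sizes of the writes the delegate saw.
     */
    static List<Integer> delegateWriteSizes(int bufferSize, int pendingLength, int inputLength) {
        RecordingOutputStream recorder = new RecordingOutputStream();
        try (OutputStream buffered = new BufferedOutputStream(recorder, bufferSize)) {
            buffered.write(new byte[pendingLength]); // small write: held in the buffer
            buffered.write(new byte[inputLength]);   // large write: flush, then pass through
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return recorder.writeSizes;
    }

    public static void main(String[] args) {
        // 8 KiB buffer, 100 pending bytes, then a 1 MiB input.
        System.out.println(delegateWriteSizes(8192, 100, 1 << 20)); // prints [100, 1048576]
    }
}
```

The delegate receives the full 1 MiB array in a single call, which is exactly the large-buffer case the chunking stream is meant to avoid.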

@ellisjoe ellisjoe merged commit f24f1ca into develop Nov 12, 2021
@schlosna schlosna deleted the ckozak/CryptoStreamFactory_chunking branch October 3, 2022 17:04