Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement steaming encryption #321

Closed
wants to merge 30 commits into from

Conversation

evilaliv3
Copy link
Contributor

This pull requests supersedes @hellais #306.

In relation to that one it:

  • removes the deterministic key generation part that will be evenutally provided in a separated pull request;
  • uses stream opts.objectMode = true; to make the patch compatible with Browsers/Phantomjs;
  • adds unittests towards all encryption algos: tripledes, cast5, blowfish, aes128, aes192, aes256, twofish;
  • adds unittests for both integregrity protexted modes: true/false.

@evilaliv3 evilaliv3 force-pushed the feature/streaming branch 2 times, most recently from 538ff9a to 32de087 Compare May 25, 2015 13:42
@hellais
Copy link

hellais commented May 25, 2015

Ok great. @tanx said to also remove the dist build. Then it's good to go. Will now close mine.

@evilaliv3
Copy link
Contributor Author

yep @hellais i've already removed the dist changes from the pull request.

@evilaliv3 evilaliv3 force-pushed the feature/streaming branch 3 times, most recently from 7690724 to e7e5101 Compare May 26, 2015 12:46
@evilaliv3 evilaliv3 force-pushed the feature/streaming branch 3 times, most recently from 550a591 to 891d39e Compare June 13, 2015 23:51
@evilaliv3 evilaliv3 force-pushed the feature/streaming branch from 0bf96a5 to b0f3cf4 Compare June 30, 2015 15:07
@evilaliv3
Copy link
Contributor Author

@tanx @toberndo: the patch should be ready; as agreed i worked in removing the Buffer usage.

i've removed also the Stream usage cause using the node stream object just to have its event API was causing Node to push in also the Buffer library anyhow.

@bartbutler can you take a look to the library and provide your feedback?

@evilaliv3
Copy link
Contributor Author

@tanx / @toberndo ; can you take care of reviewing and merging this?

@evilaliv3 evilaliv3 force-pushed the master branch 13 times, most recently from 9ddd66d to 808ae07 Compare January 9, 2016 17:21
@@ -0,0 +1,284 @@
/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer use forge. It has been replaced by Rusha and asmCrypto on the current master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better! going to change it

@tanx
Copy link
Member

tanx commented Feb 13, 2016

@evilaliv3 I did a first rough review of this PR and made some comments. In general this PR is in no state to merge. It does not seem to be rebased onto the v2.x branch (now master) correctly and the travis build doesn't pass either.

That being said, I am strongly opposed to merging this patch to the core OpenPGP.js library at all. Here's why:

  1. This PR introduces many node specific apis, such as stream, buffer and util.inherits. This creates a lot of overhead, for a feature that most users of the library will never use.
  2. I have not heard any interest from other OpenPGP.js users for streams and the feature seems very Globalleaks specific to me. Considering the amount of baggage that comes with this feature, why should all other users of OpenPGP.js have to include of all the baggage in their app?

I see two options:

a.) either Globalleaks maintains their streams additions on their own fork of OpenPGP.js.
b.) Globalleaks creates a separate module (github repository) that is basically a plugin to openpgp.js which adds a streams abstraction to the core library. But the core library does not depend on it and can be built without it.

I don't see a good enough reason though, why the streams module should be included in the core library. Sorry guys!

@tanx tanx closed this Feb 13, 2016
@evilaliv3
Copy link
Contributor Author

you are right it is wrongly rebased; i'm fixing it now anyhow and if others are not interested in we can do what you suggest, but francly i don't understand the reasons.

i a lot convinced that this feature is needed for general use cases;

by now the file encryption possibilities of openpgp are highly capped by the RAM of the user, in fact supposing one (e.g. protonmail) wants to upload a file of 30MB, for sure would have to keep in ram at once 30MB (of plaintext read at once) + at least 30MB of encrypted payload => twice the size of the file uploaded.

having a streaming capability instead one would have in ram even only a plaintext chunk + an encrypted chunk depending on the ram availability.

@bartbutler @toberndo could you express what you think?

if that is not valuable for you i'm giving up to persist and i will mock openpgpjs in my project as @tanx suggests or maintain an aligned fork if that result more easy given the build process.

@drakkhen
Copy link

I am extremely interested in having this capability with OpenPGP.js.

Would it be possible to include a separate, optional file (OpenPGP-Streaming.js?) that augments core functionality for those who need it? I've seen other large JS projects handle similar issues this way.

@fpietrosanti
Copy link

+1 (streaming is going to be required when handling/consuming large data such as big files)

@tanx
Copy link
Member

tanx commented Feb 14, 2016

by now the file encryption possibilities of openpgp are highly capped by the RAM of the user, in fact supposing one (e.g. protonmail) wants to upload a file of 30MB, for sure would have to keep in ram at once 30MB (of plaintext read at once) + at least 30MB of encrypted payload => twice the size of the file uploaded.

having a streaming capability instead one would have in ram even only a plaintext chunk + an encrypted chunk depending on the ram availability.

I understand the need for streaming and generally have nothing against support for it in OpenPGP.js. But not how it is currently implemented in the proposed PR. I'm currently trying to create a more modular design by pushing parts that do not need to be in the core library out into their own GitHub repos and consuming them via npm e.g. see the asmCrypto-lite module, which is a dependency of the core library.

I don't want to become the next angular.js with a big monolithic library that everyone is afraid to include into their site. The angular developers have learned from their mistake and Angular 2 is much more modular and the core library is much leaner. I think this is the way modern JS libraries need to be written going forward... put only the important stuff in the core library and push everything optional into separate modules.

How's this?

  1. Port the new stream module to a separate repo. I created this one under the openpgpjs org for you: https://github.com/openpgpjs/openpgp-stream
  2. Include all the dependencies you need including forge and whatnot (our forth SHA lib it seems) in that repo.
  3. Create a browserify build in the openpgp-stream repo that creates a file called openpgp-stream.js
  4. Change require('./stream') to require('openpgp-stream') here.
  5. Set the 'openpgp-stream' module as external in the current browserify builds
  6. Add the openpgp-stream.js to your app sitting right next the the openpgp.js core lib.

This setup should allow you to have everything you need, but keep all the dependencies out of the core lib. Everyone wins. What do you think?

@evilaliv3
Copy link
Contributor Author

i love this!

for what relates to forge it is not needed i can use one of the existen libraries, simply i need a sha256 with it's update function and i hope one of the three you have already offer that.

i've already rebased it all, i need to understand what in the v2.x is making the streaming failing and i would kindly ask you to help me in the packaging. is it fine for you?

@evilaliv3
Copy link
Contributor Author

@tanx i managed to align all my streaming patch to the new API: evilaliv3@1787e54 and i've followed your suggestions for correction.

i had to keep node-forge as the "ripped" version of jsSHA is not implementing the 'update' function; if you find valuable i can manage implement and i would suggest to have anyway a canonical sha('type'), update(), digest() or even to provide the uintarray patch upstram to jsSHA to not have to maintain it ripped internally.

i've removed the node.inherits as you are right it is no more needed.

can i please ask you to make for me the design you identified as valuable? i think you master better this packaging staff and this way you can do it exactly as you highlight.

@evilaliv3
Copy link
Contributor Author

@bartbutler i've some comments in relation to the jsSHA edits in relation to the uint8arrays; as you did them probably you can confirm them and we can work on some improvements.

the reason i'm suggesting to have anyhow an update function is that for larger payload (a reasonable file of 30MB), aside from the streaming capability, doing a sha at once will duplicate the ram usage for sure as currently the ripped jsSHA function does a type conversion at once, and so allocating at least ~60MB only for calculating the sha

in addition i've noted that the typed2binb function allocate memory in a loop of array.push; i think that even if you do not consider to have an update (that i kindly suggest) we should evaluate to do the allocation one shot, that probably would cost less overhead and cause less fragmentation to the browsers. the same apply for all the functions like b642binb, bytes2binb, hex2binb, str2binb and probably others.

@bartbutler
Copy link
Member

@evilaliv3 are we talking about the forge implementation or the rusha/asmcrypto SHA implementations? The forge implementation does conversion because I didn't want to rewrite it completely to use Uint8Arrays--same goes from some of the RSA encrypt/decrypt stuff, though that doesn't involve lots of data so it's less of an issue. Note that the forge SHA is no longer used to calculate SHA for data packets.

You're right, it would be great to have a SHA implementation that supports 'update' and do SHA calculation incrementally, and of course that's required for streaming, but as @tanx said, that will happen when we get rid of forge SHA for good, because it's really really slow.

I didn't write any of the '2binb' and don't recall what they are used for. If you can rewrite them to pre-allocate that works for me, though I'm not sure that's possible with non-typed arrays, which is what I think we are talking about, as the elements aren't a predefined size.

General comment is that I really like the idea of eventually doing streaming, but mostly coupled to some kind of chunked external interface to facilitate encrypted upload/writing large files, which is going to take a tremendous amount of work. Your example is 30 MB files: have you tried the 2.x release for 30 MB files? When I initially wrote the typed arrays branch I put a lot of work into reducing the memory usage and CPU time. On my computer encrypting a 15MB file takes under a second with the new branch. I'm not certain streaming is necessary for 30 MB. For 500 MB, sure, but 30MB should be much improved as-is.

@evilaliv3
Copy link
Contributor Author

@bartbutler i'm talking about the jsSHA that i think is the one you optimized right?
https://github.com/openpgpjs/openpgpjs/blob/master/src/crypto/hash/sha.js#L1278
if i'm not wrong it uses the typed2binb that has the loop of array.push i'm talking about.

the forge has been removed completely by @tanx and is not used at all in the master branch.

@tanx
Copy link
Member

tanx commented Feb 15, 2016

can i please ask you to make for me the design you identified as valuable? i think you master better this packaging staff and this way you can do it exactly as you highlight.

@evilaliv3 sorry, but I'm not the one that needs streaming. Not gonna do your work for ya ;)

@evilaliv3
Copy link
Contributor Author

ehehe :) i wasn't thinking that simple.

can you simply clarify what you think should go in that repository? only my modified files or all the library? because from what you commented above i was thinking you want the openpgpjs library use the stream-module and not vice versa, while i'm thiniing more easy to make the steam-module use openpgp dependency and add cuntionalities

@tanx
Copy link
Member

tanx commented Feb 15, 2016

So what I described is just a rough sketch. Ideally the stream addition leaves no footprint to the core lib at all. Angular modules e.g. just attach themselves to the global angular object. So if you could do it that way so that your app loads the core lib first and then openpgp-stream.js that would be fantastic. So in code:

openpgp-stream.js:

var openpgp = typeof window !== 'undefined' ? window.openpgp : require('openpgp');
openpgp.stream = openpgpStreamObj;

You don't even need babel or ES6 if you don't want to use that in your module. Babel compiles ES6 modules to common.js syntax anyway.

@tanx
Copy link
Member

tanx commented Feb 15, 2016

So to answer your question the stream github repo depends on the core lib. This way you can also implement all the tests in the respective repo and add as many dependencies as you like e.g. forge.

@bartbutler
Copy link
Member

@evilaliv3 I didn't touch that one, what I did was make bulk hashes (i.e. data packets) use Rusha/asmCrypto instead. If jsSHA is used at all (it must be or @tanx would have removed it) it's used for small hashes that I doubt matter much for performance. We should still work to get rid of using multiple SHA libraries.

@evilaliv3
Copy link
Contributor Author

ah okay! sorry for the confusion, i was thinking so given the amount of changes to that file you and tanx did recently.

@tanx
Copy link
Member

tanx commented Feb 15, 2016

Yeah. Forge was removed completely in favor of rusha/asmCrypto (which provide only SHA-1 and SHA-256). jsSHA provides the rest of the hash functions and it's still there only for decode compatability to other PGP implementations. It's not used for encoding, since SHA-256 is the prefered hash set in the config. I would removed it but unfortunately asmCrypto does not offer all SHA hashes.

@tanx
Copy link
Member

tanx commented May 24, 2016

@evilaliv3 FYI. A streaming webcrypto is under discussion w3c/webcrypto#73

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants