Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite source to typescript for avoid @types/megajs is wrong #66

Closed
bluelovers opened this issue Jun 3, 2020 · 11 comments
Closed

Rewrite source to typescript for avoid @types/megajs is wrong #66

bluelovers opened this issue Jun 3, 2020 · 11 comments

Comments

@bluelovers
Copy link
Contributor

Rewrite source to typescript for avoid @types/megajs is wrong

@qgustavor
Copy link
Owner

It is one of the points of #64. I will leave that issue open as the main topic for the discussion. This issue can be left open if you find it better.

Note that, although I agree that using TypeScript would be better, my experience with it is just this project. I had to use Parcel because it was the only way I found to set up TypeScript. Before I tried the standalone compiler and the Rollup plugin, but without success.

The current build process use Rollup. If possible can you create a fork setting up it to work with Typescript so we can have a starting point for this rewrite? Or you think it's better to replace it with other bundler?

@AlttiRi
Copy link

AlttiRi commented Jun 4, 2020

Using of JSDocs can be an alternative of index.d.ts.

@qgustavor
Copy link
Owner

qgustavor commented Jun 17, 2020

I don't have many issues writing Typescript, just setting up it. I'm trying it using Deno and it helped it a lot. As the encryption code will need to be rewritten type checking will be important to avoid bugs. Creating a index.d.ts is just a bonus.

At the moment I got it working with Deno and Parcel. Deno is not the good tool for this job, of course. As I said before I had issues with the Rollup plugin. Seems Parcel can generate libraries. There is also microbundle. I think everyone works with webpack, sadly it can't be used as it can't output ES modules without breaking tree shaking.

One issue with building the library today is that it needs to be build with four different configurations:

As it will be a rewrite the API will be unified (just the tree shakeable API will remain), it will be less an issue. In the other hand I need to find a way to build the browser version as I don't have any plans to dropping support for that as I use it in some projects (although the last time I used it in a browser application was last September).

Maybe in the browser build would be better to replace Node built-ins (like Buffers) with standardized web APIs or simpler modules, so it can be smaller and, as a bonus, compatible with Deno and other JavaScript runtimes (Cloudflare Workers, maybe?). It needs some experimenting.

I will try using Parcel or microbundle to simplify setting up TS. In the last case, if none of those tools help solving this issue, I will try setting up @rollup/plugin-typescript again.

Update:

  • Parcel is fast and easy to use, I could set it to output ESM and CJS Node bundles and ESM and UMD browser bundles. It already handles Node built-ins: in my tests it included a 22 KB Buffer implementation which, unlike the implementation currently being used, don't use __proto__ so it's Deno compatible! In the other I could not find a way to make it generate TypeScript definitions;
  • Because of this issue seems hard to make microbundle output both Node and browser bundles.

I will try the Babel plugin again...

Update 2: it's going to be another mess.
Update 3: Is possible to tsc generate type definitions even using Parcel.
Update 4: I will try to rewrite following this guide except by the TypeScript rule as the proposal it suggests as far from mature as TypeScript, jsDoc seems not handle everything TS does and I don't mind rewriting it again back to JS.

Update 5: I tried implementing some functions using async generators, like this:

export declare function encrypt(key?: Uint8Array): AsyncGenerator<Uint8Array, void, Uint8Array>;
export declare function decrypt(key: Uint8Array): AsyncGenerator<Uint8Array, void, Uint8Array>;
export declare function verify(key: Uint8Array): AsyncGenerator<void, boolean, Uint8Array>;

The issue is that there isn't any function already in Node or browsers that can convert a AsyncGenerator into a Node writable stream or a WHATWG writable stream. There is stream.Readable.from and there is this proposal, but those are only for readable streams thus don't help us as those three functions need to be writable in some way.

Requiring users to import a helper function for that is bad and adding a helper function for that goes against the idea of rewriting the code without using Node built-ins. While is possible to replace buffers with typed arrays and TextDecoder, and most cases we use events can be handled using promises, seems it's too soon to think that Streams API can replace Node Streams.

rollup-plugin-node-globals is not updated since 2018 and seems that is the main reason the browserified version of Buffer of it uses __proto__, which is not standard, not compatible with Deno and was one of the reasons I rewrote tonistiigi's mega in the first place. Parcel, in the other hand, uses a modern version without this issue, so I think it's better to use Node Streams, at least for now.

Update 6: I could not find any working alias plugin for Parcel 2. The built-in alias functionality is too simple and aliases can't be changed in different builds. Because of that I can't just create a import './aes' that imports a module that uses Node crypto in Node and other that uses a pure JavaScript AES implementation in other engines. I also tried implementing a plugin reading the current Parcel 2 documentation but without success.

Returning to Rollup is an option but then I will need to replace rollup-plugin-node-globals with something else that is better maintained. I could avoid this issue by writing the code to be browser first using Web Streams then adding a Node wrapper around it but it would be quite bad as it would hurt performance and most users of this library (which use the Node build) would be affected.

@qgustavor
Copy link
Owner

About how the build tool: I'm using esbuild in two projects. This tool supports TypeScript and from my experience it's easy to use, also saying that's fast is an understatement.

About how the library would be exported: I think we can use just ES exports as it is stable in Node and supported by most browsers.

About the platform compatibility issue: an idea would be exporting a core module and wrappers for Node, Deno and Browsers that will provide HTTPS request and crypto functionality. In the other hand how binary data is streamed is something that can't be handled well in this way.

My experience with WHATWG streams is that there are many compatibility issues between platforms, as an example typeof WritableStream is undefined in Firefox 90, so I prefer avoiding it. IIRC there are performance issues with AsyncGenerator in Node. Seems the best idea is keeping things as it is today and using Node Streams and some kind of polyfill for other platforms. Deno can use the compatibility module. As most library users use Node seems reasonable to optimize the library for it.

About the API design: that's something I still not decided. I wish something that can be tree-shaken (i.e. do not export a function as the default export, do not export anything inside objects), some high level functions for most users (loading public files, logging into accounts, creating accounts, etc.) and some low level functions (encryption, decryption, MAC verification).

I don't know if is possible to use the wrapper idea along with tree-shaking. Does someone know how Vue manages to handle tree-shaking and also export those many variations?


No, this project is not dead. As the library is working for me and as "if a thing ain't broken, don't fix it" I'm spending my time with other projects.

In the other hand I would like to fix those issues. The MAC verification issue, for me, seems bad because it forces library users trust MEGA servers, but if MEGA would start tampering data then other clients would notice that, so I don't think that's too bad. Rewriting it to TS is good as it helps reducing bugs and providing type hints. Fixing Deno support would be great as I'm using Deno for some projects and I can't run megajs on it because it uses __proto__ from buffer-es6 . Well, I can use Deno's Buffer implementation instead.

@qgustavor
Copy link
Owner

Update on this issue: this (unexpected huge) pull request fixes issues with browsers and Deno, keeping almost the same API (request was removed, so code that change its options was removed) and, important, adding types is in the plans. If someone can help there are a few issues blocking it such as:

  • Since the library is available in CJS and ESM, how to handle multiple exports?
  • How to automate type generation? I read TypeScript documentation and have a little more experience with Deno, so I now rewriting will be hard, because of that I'm fine with @AlttiRi 's suggestion of using JSDocs (if it meant in order to generate .d.ts files instead of rewriting everything in .ts)
  • Does the browser/Deno builds need types too?

@qgustavor
Copy link
Owner

qgustavor commented Feb 19, 2022

Looks like 1.0 fixed this issue. For 2.0 I want to rewrite the library to change the API design from...

console.log(await import('megajs'))
// { Storage, File, file, decrypt, encrypt, verify, ...  }

... to something like this:

console.log(await import('megajs'))
// { login, loadUrl, upload, download, share, createAccount, ... }

That would isolate huge functions such as upload and download from the rest of the library which would make tree-shaking way more efficient. Also, those functions would return upload and download controllers, not streams, so those can be handled like fetch() and will make error handling easier (fixing #30), example:

const downloader = await download(file) // get downloader
buffer = await downloader.arrayBuffer({ start, end }) // get buffer
stream = await downloader.stream({ start, end }) // get stream
stream = downloader.url // get raw download URL

Other changes moving internals to use promises and removing callbacks entirely, separating huge functions from classes to exports, trying to reduce the number of classes, replacing Buffers and another Node built-ins with modern JavaScript features, move internals to use TypeScript, replace RSA functions to use native crypto, implementing account sign ups and file dropping.

@jimmywarting
Copy link

When i read things like

console.log(await import('megajs'))
// { Storage, File, file, decrypt, encrypt, verify, ...  }

And have a look at the your own custom file class
then i can't help but feel that there are a naming conflict with the spec, browser have had it for a long time, nodejs now has blob, and will maybe even get a file class one day.

i got one idea that you could use (if you like) for a major breaking change
and that's to somehow utilize fetch-blob, it's generic enough to be able to accept any blob look-a-like item.

you could technically build a thing such as this blob look-a-like http-reader that has the same blob IDL stuff attached to it and wrap it in a fetch-blob this is essentially what we did with BlobDataItem that wraps a local file into a blob look-a-like object that works with FormData and fetch

then user can utilize it however they want without holding anything in memory.
it would also mean that it can function with formdata-polyfill and also undini and node-fetch formdata implementation that both accept 3th party blob like object

imagine if you build something like our BlobDataItem or the http-blob reader

// and user could somehow do:
const blob = await mega.getFileLikeObject()

// then user would be able to basically have nothing in the memory
// and they could read some small chunk of it using

const head = await blob.slice(0, 1024).arrayBuffer()
// or stream it
await blob.stream().pipeTo(destination).then(doneSaving)

// or re-upload it to some other place else
fd = new FormData()
fd.append('file', blob)
fetch(url, { method: 'post', body: fd })

/*
from this point you have done nothing to download the hole file into memory
or the disk, yet it will still be able to have a imaginary blob that don't hold all
the data and is basically just seen as a handle with some know properties
such as size, type, and lastmodified and some standard read methods 
that you can pass around to anything
*/

@AlttiRi
Copy link

AlttiRi commented Feb 19, 2022

BTW, is implemented bunch (bulk) downloading of thumbnails, previews; and grouped API requests?


I have implemented it in my unfinished library, but... it was almost 2 years ago and the library is still backlogged.

Not the best code, and it should be refactored, but maybe you will find something interested:
https://github.com/AlttiRi/meganz-api

(I have made it public now.)

@qgustavor
Copy link
Owner

then i can't help but feel that there are a naming conflict with the spec,

I noticed that when I was testing TypeScript types. That's the reason I reacted positively to when Deno renamed Deno.File to Deno.FsFile. That's why the example for 2.0 does not include a File export, instead a loadUrl function (I wrote "load URL" because it could load shared files or folders... and getSharedNode sounded a bit weird)! For 1.0 the library will keep File as it would be, as you said, a major change, but for 2.0 I think is better renaming it to something else to avoid the conflict.

About the blob idea: that's great! I was thinking that using blobs would make uploading simpler, but I didn't knew that Node supported it. Adding blob support to upload is possible and would be a minor change. The thing is: the upload function is a mess, it accepts too many different styles for accepting inputs. I want to simplify it so the number of bugs are also reduced. Blobs are so great they could be the only accepted way: memory friendly, it's easy to implement chunked uploading, cross-environment compatible. In the other hand, making it the only accepted way to upload files would be a major change, so that's better to leave it for 2.0.

Also, using the same blob interface for downloads would be great too so new library newcomers would already know how to download files if they already used blob before. I would still prefer having a loadUrl function to get file metadata, a login function to replace new Storage() (having it being called "login" would make it more intuitive too) and a download function that accepts the file objects returned from those first two functions, so if someone is using this library in a browser application and never calls download then all the complex download logic (chunking, streaming, AES-CTR, AES-ECBC-MAC, etc) would be removed by tree shaking. The upload function would also be separated because it's complex and so it would be used even when logged-out (for uploading files in folders shared using MEGAdrop).

BTW, is implemented bunch (bulk) downloading of thumbnails, previews; and grouped API requests?

Thumbnail and preview downloads is still not implemented (issue #23, you commented it). Grouped API requests is something I was thinking about but I could not think a good way to implement it. Thinking better looks like a way to implementing it is changing the API class so request do not call fetch immediately, instead it but requests into a queue, wait for the next tick and do a single fetch with all request in the queue. I will check your code. I think those features can be implemented in a minor release.

By the way, I will be enabling GitHub discussions and moving discussions to there soon. I have been working on 1.0 for about one month so I want to take some time off this project before continuing work on it.

@jimmywarting
Copy link

some of the blob like function could be applied to the already existing file class. like the .arrayBuffer(), and .text(), the .stream() could be a tiny bit tricky since it uses a whatwg stream.

but if .stream() returns a async iterable that yields uint8arrays then you can easily wrap it in a fetch-blob and avoid most of the hazard to eg validate slice option, implementing a own text() and arrayBuffer() function and strip BOM. most functions comes for free when using our generic blob class

import { File } from 'fetch-blob/file.js'

const blob = new MegaBlob(...)
const file = new File([blob], FileName, { lastModifed, type })

our File class is generic enough to accept most blob lock-a-like item and normalize its methods as long as the blob-file like object has as a known size and a stream or a arrayBuffer method

you can even make a file out of this:

const blobLike = [{
  size: 123,
  slice(start, end) {
    // start and end is normalized by our blob class to not be
    // out of range and you don't need to worry about the type either
    
    // you also don't need to worry about the type argument...
  },
  async * stream() { 
    yield new Uint8Array(123)
  },
  [Symbol.toStringTag]: 'File'
}]
// and you would be able to do:
new Blob(blobLike).stream() // whatwg readable stream
await new Blob(blobLike).arrayBuffer() // will read from the async stream() generator fn you provided
await new Blob(blobLike).text()

i guess it wouldn't take much to make it also work with FormData either

@qgustavor
Copy link
Owner

Seems to be a good idea for the next minor version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants