This repository has been archived by the owner on Aug 11, 2021. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Major API changes #50
Major API changes #50
Changes from 1 commit
128decd
190de47
930398b
888cd27
3e79d97
55585d4
f04424d
bf2184d
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really encourage to remove requirement to return Buffer in favor of Uint8Array. This requirement makes browser use case unnecessarily complicated, and in practice isn't really necessary. Other IPFS libraries can do the casting where necessary.
P.S. node Buffer is also a valid Uint8Array so node case will require no change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love if there was some word about errors as well. Specifically how decoder can return error and how that error propagates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving from Node.js to Browser based primitives is certainly on my todo list, but out of scope for these API changes. I don't want to do everything at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did wrote more about this above but I think I want add one more note here - Requirement for implementation to provide a proxy just to avoid full decoding makes a high bar, in fact I would argue that custom
DataView
class with getters is probably a far better alternative. However that still implies 5 allocations when accessing something like/deeply/A0/01/FC-nested/path.txt
which may seem insignificant but in some cases (e.g. representing Virtual DOM changelists in dominion library) that makes a difference between fast and unusable performance.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a requirement, "may also be a Proxy object", but I'm very negative on Proxies regardless. Aside from performance concerns, I think we've learned from many previous attempts in other languages that things like Proxies should be reserved for metaprogramming and other reflection use-cases rather than general magical data structures. Pitfalls abound.
Perhaps the way forward is to remove the reference to proxies completely and leave it up to each format and allow experimentation.
Alternatively go the container route which is much nicer--I'm not sure
DataView
is the right container, maybe a bit too low-level, something that feels more like a Map might be more accessible. In fact, a whole custom Map-like that you could.toObject()
to fully instantiate as a non-lazy object representation might provide the best of both worlds?It would require specification, and even provision of a custom class, maybe just a simple extension of
Map
.For deserialization where you know the result is cheap and you want it all you could just instantiate it in the same call:
Are there possible use-cases where getters might need to be async or is it safe to frontload that in
decode()
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you may have misunderstood what I meant by
DataView
- just a container tgat wraps underlying buffer and provides getters to access / decode fields on demandget key() {
. So pretty much what you described as Map except named getters on prototype instead ofget(key)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole point of IPLD is to provide common interface for traversing linked data no ? If each implementation is free to design one, the common interface is gone & I think that’s a step in a wrong direction - I much rather have common imperfect interface that can be used across arbitrary formats than have to learn each one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this then won't be lazy for nested objects, will it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For implementing eager case this will make no difference. For lazy case avoiding proxies and lazy getters would be significantly simpler than needing to have them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I wrote my initial response in this thread I was trying to imagine a use-case where the properties would need to be asynchronously accessed and I couldn't see an obvious one except for very large blobs that need to be stored on disk--but those wouldn't be passed in to a
deserialize(blob)
call in the first place. Maybe there's a case if there was some really heavy crypto applied to sections of a blob, but even then it's questionable. Async computational tasks are usually made pointless by the boundary crossing problems and is why async methods in Node's'crypto'
module have been mostly resisted, computationally intensive thread offloading can't be demonstrated except for very heavy work. The browser is a slightly different case because you're not maximising core use as with a server and responsiveness is important, maybe that's a worthy argument? I'm doubtful.So again, it looks like premature optimisation in search of a use-case.
Agreed, and it'll help make quicker progress on this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I care about JS implementation in browser context (on server go is likely a better choice anyway). In browser crypto APIs are async (for good reasons) ignoring that seems strange to me.
And honestly it’s not about optimizations, just about not making currently well supported use cases second class.
P.S. I am working on crypto heavy use case for browsers right now and this API channge will require me to switch to handrolled solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Gozala ok this is good context, excuse my ignorance. In your use case, is it necessary that the crypto be handled at the format layer or is it reasonable to push it up to ipld itself? I was imagining ipld/ipld#64 being something in ipld and the blocks just providing the encrypted binary data. Perhaps an encrypted block coming from a format gets transposed into an unencrypted ipld block where the key is provided but this happens at the ipld layer, not in the format. Or are you seeing a different mechanism? Again, I'm fairly new to thinking about this stuff so additional insight is appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created #51 which hopefully accommodates all the points brought up here. It's a smaller change to the current API. Let's have a discussion over there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of a question.
From what I can tell implementation of the
cid
is pretty much the same across the board. Is there reason to require implementation ofcid
in first place ? Could this be made optional, so only implementation that need a custom behavior can provide it ?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An IPLD Format has all the information it needs to calculate the CID. The multicodec code as well as the hash. So you can call
cid()
from layers that build on top of IPLD Formats without knowing those details.