Skip to content
This repository has been archived by the owner on Aug 11, 2021. It is now read-only.

Update the spec (new methods) #11

Closed
wants to merge 5 commits into from
Closed

Update the spec (new methods) #11

wants to merge 5 commits into from

Conversation

daviddias
Copy link
Member

No description provided.

- version - the CID version to be used (defaults to 1)
- hashAlg - the hash algorithm to be used (default to the one set by the format)

`callback` must have the signature `function (err, cid)`, where `err` is an Error is the function fails and `cid` is a CID instance of the dagNode.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanderer resisted swapping dagNode by serialized node and instead decided to create another method.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CIDs of serialized nodes should never change. IMO, the version/hash algorithm should generally be specified when the node is created and carried along with the node; providing options here is, IMO, a footgun. In go-ipfs, we deal with this by storing the CID prefix along with the block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the block service level, we have the Block class that carries the CID there. However, at the DAG Level, the user should be able to specify its CID.

I believe we have the same :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid I'd like to keep the surface of this API as little as possible. Higher level APIs should be part of js-ipld. Hence I'd prefer having a single util.cid() which works on the serialized node. If you want to get the CID of a DAG node, you could just call serialize() first, which you would need to call internally anyway, wouldn't you?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vmx 👍 from me.


> get the CID of the dagNode

Options include:
- version - the CID version to be used (defaults to 1)
- hashAlg - the hash algorithm to be used (default to the one set by the format)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added these options so that we can pick the right hash function and not just default on the format (the format can always say that it only supports one hash function)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I take it that dagNodes are abstract editable nodes that have never been assigned a CID?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't say assigned it is more. A dagNode is a thing in which we can calculate the CID of it, but it doesn't necessarily carry the CID as property.

The main reason why is because we want to have pure JSON blobs being first class dagNodes with CBOR and to add the CID in there I would have to wrap the JSON object into another object and that would be just a mess API wise.


> return a copy of the dagNode

`callback` must have the signature `function (err, dagNodeCopy)`, where `err` is an Error is the function fails and `dagNodeCopy` is a copy of the dagNode.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this method. It will be useful and go-ipld has to have it anyway so better make it part of the spec.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use case? I'm hesitate to add new methods just because the seem useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vmx folks shooting themselves in the foot, mostly.


`callback` must have the signature `function (err, nodeJSON)`, where `err` is an Error is the function fails and `nodeJSON` is a JSON representation of the dagNode.

This method was introduced to add some convinience when dealing with formats that have a non practical structure. For formats like dag-cbor, you won't need to use it as the dagNode itself is already a JSON 1:1 mapping to the CBOR representation.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kumavis followed your request and added this method

Copy link

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, go-ipld does this very differently. While I'm not a fan of many parts of the go-ipld API (cruft from the pre-ipld days), the core structure is actually pretty nice.

  1. It uses interfaces instead of a passing around binary data to a bunch of free-floating functions.

  2. Serialized nodes are typed, not just binary blobs.

At the lowest level is the Block. Blocks have methods to retrieve their associated CIDs and data.

At the next level is the Node. Nodes are blocks (have CIDs and raw data) that also know how to resolve paths through themselves, return list of resolvable paths, and return a list of links to other nodes.

The next level that we're currently missing is the concept of an abstract, general-purpose, CID-less node for DAG construction that can be converted to and from any fully-featured DAG format (i.e., JSON, CBOR, not DAGProtoBuf). That is, we need a way to fully deserialize into an abstract in-memory node and then re-serialize into a Node of a given format. In js-ipld, this looks like dagNode.


This gives us a few nice properties:

  1. We don't have to re-parse serialized nodes to traverse them multiple times. When we first start handling a node, we "decode" it into an node object that satisfies the Node interface. The implementation may decide to either decode the entire object up-front, never decode it (traverse it in-place), or decode it lazily as needed. Basically, Node objects are type-safe, efficient "views" into serialized nodes.

  2. When reading a DAG, one generally doesn't have to care about node formats; it's all abstracted away. One will still have to choose the format one wants to use when editing/creating DAGs but that's unavoidable.

  3. Because we use these Node and Block types instead of passing around raw serialized blocks of data, nodes/blocks can carry CID information with them. This makes it hard to accidentally change CIDs, by, e.g., choosing a new hash function.


> get the CID of the dagNode

Options include:
- version - the CID version to be used (defaults to 1)
- hashAlg - the hash algorithm to be used (default to the one set by the format)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I take it that dagNodes are abstract editable nodes that have never been assigned a CID?

- version - the CID version to be used (defaults to 1)
- hashAlg - the hash algorithm to be used (default to the one set by the format)

`callback` must have the signature `function (err, cid)`, where `err` is an Error is the function fails and `cid` is a CID instance of the dagNode.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CIDs of serialized nodes should never change. IMO, the version/hash algorithm should generally be specified when the node is created and carried along with the node; providing options here is, IMO, a footgun. In go-ipfs, we deal with this by storing the CID prefix along with the block.

@Stebalien
Copy link

Is there any spec defining the format/interface for dagNodes as used in this spec? Given the toJSON method, it can't always be JSON.

@daviddias
Copy link
Member Author

Is there any spec defining the format/interface for dagNodes as used in this spec?

In JS land, that is really up for the format. Note, for the IPLD Resolver/Service, a dagNode type is really an opaque thing, we never call anything directly because we always call functions provided by the format interface and the format interface implementation should know how to touch it.

`callback` must have the signature `function (err, cid)`, where `err` is an Error is the function fails and `cid` is a CID instance of the dagNode.

#### `util.cidSerialized(binaryBlob[, options], callback)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not to build an anti-corrosion-bike-shelter - but the name would make me think this produces a cid that is serialized. Maybe cidFromBinary / cidFromBlob / addCidPrefix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of adding from, like cidFromSerialized.

We use the from in other modules, so it will fit well here.

@daviddias
Copy link
Member Author

@vmx as you review this PR and handle outstanding requests, can you also make sure to have PR's for the multiple formats and ensure that everything fits well with ipld-resolver?

vmx added a commit that referenced this pull request Mar 15, 2018
This update of the spec is based on the feedback from
#11

The changes are:

 - `util.cid()` no longer takes a DAG node, but a serialized node
   as input. It also gains optional options
 - `util.toJSON()` is added
 - `resolver.isLink()` is removed, it should be part of js-ipld
   instead.
@vmx vmx mentioned this pull request Mar 15, 2018
@vmx
Copy link
Member

vmx commented May 7, 2018

I think I've addressed everything from this PR with individual PRs and the spec is now what this PR intended. If you don't agree, please re-open this issue.

PS: One thing which isn't part of the updated spec is copy(), but I'd rather not do this for now, hence I left it out until there's a strong use case for it.

@vmx vmx closed this May 7, 2018
@daviddias daviddias deleted the update-spec branch May 7, 2018 12:31
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants