Filtering on codec type in a datastore #8907

Closed
3 tasks done
Ericson2314 opened this issue Apr 22, 2022 · 3 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@Ericson2314

Checklist

  • My issue is specific & actionable.
  • I am not suggesting a protocol enhancement.
  • I have searched on the issue tracker for my issue.

Description

It would perhaps be nice to expose codecs to datastore plugins. This could be done with a new sort of datastore, or with an optional codec provided as a hint. This would allow "domain-specific" datastores (which don't actually store the raw bytes, but store something post-parsing) to preemptively filter out requests they know they cannot handle.

This use case came up when working on https://www.softwareheritage.org/2022/02/10/building-bridge-to-the-software-heritage-archive/. The SWH archive only has content matching certain multicodecs, so there is an opportunity to preemptively avoid some spurious lookups that are known to fail.
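A minimal sketch of the kind of filtering described above, using only the Go standard library. Everything here is hypothetical and for illustration: `Key`, `Store`, `CodecFilter`, and `countingStore` are made-up names, not kubo or go-datastore APIs, and the allowlist is an arbitrary example rather than the actual set of codecs SWH serves. Only the numeric codec values come from the multicodec table.

```go
package main

import (
	"errors"
	"fmt"
)

// Multicodec codes taken from the multicodec table.
const (
	CodecRaw    uint64 = 0x55
	CodecDagPB  uint64 = 0x70
	CodecGitRaw uint64 = 0x78
)

// ErrNotFound is a hypothetical "no such block" error.
var ErrNotFound = errors.New("block not found")

// Key is a hypothetical key that keeps the codec next to the hash,
// which a plain multihash-keyed datastore does not do.
type Key struct {
	Codec uint64
	Hash  string
}

// Store is a hypothetical, minimal read-only store interface.
type Store interface {
	Get(k Key) ([]byte, error)
}

// CodecFilter rejects keys whose codec is not in Allowed before the
// (possibly remote and expensive) inner store is consulted.
type CodecFilter struct {
	Allowed map[uint64]bool
	Inner   Store
}

func (f CodecFilter) Get(k Key) ([]byte, error) {
	if !f.Allowed[k.Codec] {
		return nil, ErrNotFound // known miss: skip the inner lookup
	}
	return f.Inner.Get(k)
}

// countingStore is an in-memory stand-in for a remote archive that
// counts how many lookups actually reach it.
type countingStore struct {
	blocks map[Key][]byte
	calls  int
}

func (s *countingStore) Get(k Key) ([]byte, error) {
	s.calls++
	if b, ok := s.blocks[k]; ok {
		return b, nil
	}
	return nil, ErrNotFound
}

func main() {
	inner := &countingStore{blocks: map[Key][]byte{
		{Codec: CodecGitRaw, Hash: "abc"}: []byte("blob"),
	}}
	store := CodecFilter{Allowed: map[uint64]bool{CodecGitRaw: true}, Inner: inner}

	// Filtered out: the inner store is never asked.
	_, err := store.Get(Key{Codec: CodecDagPB, Hash: "abc"})
	fmt.Println(err, inner.calls)

	// Allowed codec: the request is forwarded.
	data, err := store.Get(Key{Codec: CodecGitRaw, Hash: "abc"})
	fmt.Println(string(data), err, inner.calls)
}
```

The point of the wrapper is only that a request with a disallowed codec never reaches the inner store; whether such a hint can be threaded through the real interfaces is exactly what this issue is asking about.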

A background assumption is that bitswap does indeed use CIDs, not raw multihashes. https://github.com/ipfs/go-ipfs/issues/4189#issuecomment-1094388711 alleges that is no longer the case, so my apologies if I am out of date.

@Ericson2314 added the kind/enhancement label Apr 22, 2022
@Ericson2314
Author

(In #8907 I brought up making the key type generic so use-case-specific key types can be used instead. That would presumably make implementing this much less annoying.)

@aschmahmann
Contributor

The blockstore drops the codec data before it reaches the datastore https://github.com/ipfs/go-ipfs-blockstore/blob/1e9b86f2c6cd6dc0452c6dac1daaac5a7196f188/blockstore.go#L147, so what you really need is a blockstore plugin.
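To illustrate why the codec is gone by the time the datastore sees a request, here is a stdlib-only sketch. It assumes, loosely following the linked line, that the datastore key is derived from the multihash alone. The `cid` type and `dsKey` helper are hypothetical simplifications, not the real go-cid or go-ipfs-ds-help APIs, and the bytes used are illustrative rather than a valid multihash.

```go
package main

import (
	"encoding/base32"
	"fmt"
)

// cid is a hypothetical, simplified stand-in for a CID:
// a codec plus the multihash of the content.
type cid struct {
	codec     uint64
	multihash []byte
}

// dsKey builds a datastore key from the multihash only.
// The codec is never consulted, so it cannot influence the key.
func dsKey(c cid) string {
	enc := base32.StdEncoding.WithPadding(base32.NoPadding)
	return "/" + enc.EncodeToString(c.multihash)
}

func main() {
	// Illustrative bytes only, not a well-formed multihash.
	mh := []byte{0x12, 0x20, 0xde, 0xad, 0xbe, 0xef}

	asRaw := cid{codec: 0x55, multihash: mh}   // raw
	asDagPB := cid{codec: 0x70, multihash: mh} // dag-pb

	fmt.Println(dsKey(asRaw))
	fmt.Println(dsKey(asDagPB))
	fmt.Println(dsKey(asRaw) == dsKey(asDagPB))
}
```

Both CIDs produce the same key, so a datastore behind this layer has no way to tell which codec the caller asked for. Any codec-based filtering has to happen at or above the blockstore.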

If you want to go the plugin route then you should give a +1 to #9010, or give it a test; that should allow you to swap out the existing blockstore for a custom implementation.

That being said, my understanding from #9155 is that you are trying to make an IPFS implementation that serves up SWH data backed by a remote source. How much of kubo are you actually using for this? Perhaps you'd have an easier time taking just the pieces you need, as in https://github.com/hsanjuan/ipfs-lite or some of the other Go implementations in https://docs.ipfs.tech/basics/ipfs-implementations/. If you've got questions on how to put the pieces together, open an issue on discuss.ipfs.io, or say hi on one of the chat platforms (https://docs.ipfs.tech/community/chat/#discord) and check out the #ipfs-implementers channel.

@Ericson2314
Author

@aschmahmann Thanks for the info.

I've now refactored our repo to implement both the Datastore and Blockstore interfaces.

I do indeed prefer library composition over plugins / wanton dependency injection, but for this project, having less code and relying on stable interfaces so that it is easy to keep up with kubo is important.


2 participants