feat: roundabout gets raw cids as blobs #359

Merged
merged 2 commits into from
Apr 29, 2024

vasco-santos
@vasco-santos vasco-santos commented Apr 26, 2024

Part of storacha/project-tracking#49

Note that Roundabout currently serves production traffic for SPs downloading Piece bytes, and the w3filecoin storefront is planned to use it to validate a Piece CID.

SP reads

  1. An SP request comes with a PieceCID, which we use to look up the equivalency claim mapping this Piece to some content.
  2. In the current world (store/* protocol), this will in most cases be a CAR CID that we can get from R2 carpark-prod-0 as carCid/carCid.car. However, store/add does not actually require the content to be a CAR, so it could be another kind of CID that is still stored with the same key format in the R2 bucket.
  3. In the new world (blob/* protocol), it will be a RAW CID that we can get from R2 carpark-prod-0 as b58btc(multihash)/b58btc(multihash).blob.
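
The two key formats above can be sketched as a pair of small helpers. This is only an illustration: the function names are hypothetical and not taken from Roundabout, and the blob helper assumes the caller has already base58btc-encoded the multihash.

```javascript
// Hypothetical helpers for illustration; not Roundabout's actual API.

// Current world (store/*): the object key is built from the content CID string.
function carKey(cid) {
  return `${cid}/${cid}.car`
}

// New world (blob/*): the object key is built from the base58btc-encoded
// multihash. Assumption: `multihashB58` is already encoded by the caller.
function blobKey(multihashB58) {
  return `${multihashB58}/${multihashB58}.blob`
}
```

Both helpers only build bucket keys; which bucket they apply to (carpark-prod-0 here) is decided elsewhere.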

w3filecoin reads

  1. filecoin/offer is performed with a given content CID.
  2. In the current client world, a CarCID is provided on filecoin/offer. This CID is used to get the bytes for the content in order to derive the Piece for validation. In addition, the equivalency claim is issued with the CarCID.
  3. In the new world, we aim to have filecoin/offer rely on RAW CIDs, which will be used both for reading content and for issuing equivalency claims.

This PR

We need a transition period where we support both worlds.

This PR enables Roundabout to attempt to distinguish between a Blob and a CAR when it gets a retrieval request. If the requested CID is a CAR (or a Piece that is equivalent to a CAR), we can assume the old path and key format immediately. On the other hand, if the requested CID is RAW, we may need to return either a Blob object or a "CAR"-like stored object.

For the transition period, this PR proposes that if we have RAW content to locate, we MUST do a HEAD request to see whether a Blob exists and, if so, redirect to a presigned URL for it. Otherwise, we need to fall back to the old key format. As an alternative, we could decide that the store/add handler no longer accepts non-CAR CIDs, even though we would lose the ability to retrieve old things from Roundabout (which may be fine as well 🤔).
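
A minimal sketch of that HEAD-then-fallback flow, assuming two injected dependencies that are hypothetical here: `headObject(key)`, which rejects with an error carrying `$metadata.httpStatusCode` when the object is missing, and `signer.getUrl(key, { expiresIn })`, which returns a presigned URL. The function name and parameter shape are illustrative, not the PR's actual code.

```javascript
// Sketch only: `headObject` and `signer` are injected, hypothetical dependencies.
async function resolveRawContentUrl({ blobKey, carKey, headObject, signer, expiresIn }) {
  let key = blobKey
  try {
    // HEAD the blob key first; success means the object is stored as a Blob.
    await headObject(blobKey)
  } catch (err) {
    // Any error other than "not found" is unexpected and is propagated.
    if (err?.$metadata?.httpStatusCode !== 404) throw err
    // Blob not found: fall back to the old CAR key format without a second HEAD.
    key = carKey
  }
  return signer.getUrl(key, { expiresIn })
}
```

In this sketch the fallback key is never checked, so a client following the presigned URL for a missing object would see the 404 from the bucket itself.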

Please note that this is still not hooked up to content claims to figure out which bucket to use, and still relies on the assumption of CF R2 carpark-prod-0. It only uses equivalency claims to map a PieceCID to a ContentCID.

seed-deploy bot commented Apr 26, 2024

View stack outputs

@vasco-santos vasco-santos force-pushed the feat/roundabout-gets-raw-cids-as-blobs branch from dea1465 to 77a9a9b Compare April 26, 2024 10:57
@vasco-santos vasco-santos force-pushed the feat/roundabout-gets-raw-cids-as-blobs branch from 77a9a9b to 14edcc6 Compare April 26, 2024 11:02
@vasco-santos vasco-santos force-pushed the feat/roundabout-gets-raw-cids-as-blobs branch from 14edcc6 to 5ac61d3 Compare April 26, 2024 11:28
@vasco-santos vasco-santos marked this pull request as ready for review April 26, 2024 11:41
@vasco-santos vasco-santos requested review from Gozala and alanshaw April 26, 2024 11:43
@Gozala Gozala left a comment

I think we should reject non-CARs added via store/add so that we do not have to look up the blob before we redirect, but I think we need to decide what to do with non-CAR data that we have already captured as CAR before we proceed. So I think we should land this and maybe add that to the backlog. I am guessing we could look at store/add records to identify all such instances and migrate them to the blob format.

} catch (err) {
  if (err?.$metadata?.httpStatusCode === 404) {
    // Fallback to attempt CAR CID
    return signer.getUrl(carKey, { expiresIn })
Contributor

What if we do not have the CAR CID either? Would we redirect to a 404?

Contributor Author

Yeah, as we discussed, this will give a 404 when the client tries to read from the presigned URL. The advantage is that we don't lose performance or pay for redundant HEAD requests to reach the same result.

Gozala commented Apr 26, 2024

Created a PR to restrict the store API to CARs, so that no new non-CAR content can be stored via the store API: storacha/w3up#1415

@vasco-santos vasco-santos force-pushed the feat/roundabout-gets-raw-cids-as-blobs branch from c615d01 to 3c8dafd Compare April 26, 2024 17:21
@seed-deploy seed-deploy bot temporarily deployed to pr359 April 26, 2024 17:21 Inactive
@vasco-santos vasco-santos merged commit cdb7c65 into main Apr 29, 2024
3 checks passed
@vasco-santos vasco-santos deleted the feat/roundabout-gets-raw-cids-as-blobs branch April 29, 2024 07:23