This repository has been archived by the owner on Jun 11, 2024. It is now read-only.
Add a storetheindex delegated provider #158
IIUC this will try the caching provider store concurrently, which we expect to fail, which will then enqueue an async DHT lookup. Those are expensive, will always fail, and will contend for resources (the work queue) with regular DHT queries. Is there a way to avoid that?
Are you saying you're concerned that all the requests that end up being fulfilled by the indexers will also trigger DHT queries that will likely fail, and that the load from those is the problem?
If so, we have some options here depending on the semantics we want. One would be a "fallback provider" that, instead of trying all the providers in parallel, tries them sequentially and only moves on when the previous ones fail. In that case we could decide to only do a DHT lookup if both the Datastore and Indexer systems returned no records.
This wouldn't cover every case (e.g. if there's some record in the DHT that we're missing for a given multihash, but the data is separately advertised by the indexers).
If there's a change in logic here, it should be in a different PR in order to keep the scope of this one reasonable.
Sounds good, but can we agree to not deploy this to the Hydras until this is fixed?
This change is not making the current situation any worse, right?
Do we have consensus on something better?
I think this change does make it worse, for the reasons listed above.
What @aschmahmann brought up seems like a good compromise. I can make the change if it helps.
Agreed with Gus. In practice, merging this code will make it worse. The change I proposed should be pretty small, though.
I think I wasn't clear earlier - what I meant by 'this change' was that this PR uses the same composition structure as the already-merged delegated routing structure. I agree that spinning up load in this path is something we need to watch in case it leads to lots of failing DHT queries, and that the proposed change to the composition structure seems good.
There isn't going to be a substantial amount of upstream bitswap data loaded into storetheindex in the coming week. It would be useful for providers to begin testing the end-to-end flow, though, so if the additional change is going to take more than this coming week, we should consider whether we can get away without it temporarily.
@guseggert if you're able to make the proposed change, that would be great!
This seems fine, but why wouldn't we do this check further up, next to the
if r.Err == nil
check, when accumulating the addresses before merging them?
Because we aren't doing an explicit iteration through the addrInfos/keys during that part of the accumulation.
Would it make sense to use multibase b58 encoding here?
Yes, but this is an existing endpoint, which we are planning on replacing with the delegated routing one anyway.
Side note: @willscott you probably want to change the endpoint at some point in the future to use multibase. The cost of not having that one extra character is almost never worth it.
Yeah, I was just wondering if we should change the endpoint to use multibase encoding, if it's not too late. If it's getting replaced soon, then disregard :) (and consider doing this for the replacement).
This whole setup is for the 'works now' variant until
go-delegated-routing
is solidified and migrated to.
If this always has one element, then why is it in an array? Is it expected to change in the future? If so, can we just loop over the array so this is forwards-compatible?
The query endpoint allows an array of multihashes to be queried; this client only queries for an individual one at a time.
What are these fields for? They look unused
The response from the indexer node contains these fields, which are used by some providers; they're here for completeness of the message format. There have been some conversations about providers using them in ways that could be relevant here, for instance to express priorities, or to indicate that multiple records with the same contextID should be de-duplicated.