This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Add a storetheindex delegated provider #158

Merged
merged 8 commits on Feb 16, 2022

Conversation

willscott
Contributor

@willscott willscott commented Feb 2, 2022

This looks a lot like the current delegated provider, but makes the request with the JSON/URL format spoken by the current storetheindex find HTTP server (sketched below).

This PR also connects metrics up to views so that stats on these delegated providers become visible to Prometheus.

The code in providers/storetheindex is a re-homing of this PR, which has an end-to-end test. The go-delegated-routing repo isn't a good home for it, as this is more of a current kludge than the long-term protocol we want to support.
I'm not including the test from that PR in this repo as it depends on the storetheindex codebase, which uses a newer/incompatible version of all the libp2p core dependencies.
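For concreteness, the request/response shape is roughly the following. This is a hedged sketch assembled from the client code and review threads quoted below in this conversation; only MultihashResults, ContextID, and Metadata are named in the PR, so the ProviderResults nesting and everything else here are assumptions, not the authoritative schema.

package stifind

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// findResponse mirrors the JSON the find server returns for a query.
// Only the fields discussed in this PR are listed; the nesting under
// ProviderResults is an assumption.
type findResponse struct {
	MultihashResults []struct {
		ProviderResults []struct {
			ContextID []byte          // opaque provider-supplied context
			Metadata  json.RawMessage // provider-specific metadata
		}
	}
}

// find encodes the whole request in the URL: <endpoint>/<b58-multihash>.
func find(endpoint, b58Multihash string) (*findResponse, error) {
	u := fmt.Sprint(endpoint, "/", b58Multihash)
	resp, err := http.Get(u)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var parsed findResponse
	if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
		return nil, err
	}
	return &parsed, nil
}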

Contributor

@petar petar left a comment

This seems fine. One small bug/typo which I noted. Next steps:

  • We keep this PR in this branch and don't merge it until we test it in production.
  • Tommy has a way of deploying a target commit to a single Hydra machine.

head/head.go (resolved)
Contributor

@aschmahmann aschmahmann left a comment

Just did a very cursory look at this and found a config option bug. I'd recommend running this locally and testing that it works before trying to deploy this into production.

FWIW you may want to use something like https://github.com/aschmahmann/vole to issue DHT queries to a head.

head/head.go (resolved)
providers/storetheindex/findproviders.go (resolved)
@willscott
Contributor Author

Tested locally using vole.
Made queries, and can see:
a) the result from sti
b) the success measure on the Prometheus /metrics HTTP site go up
c) the Wireshark HTTP request/response made to the STI instance

@willscott
Contributor Author

willscott commented Feb 4, 2022

@thattommyhall when you get a chance, can you test with the latest commit in this branch and the config flag

-store-the-index-addr https://a190ab46c53bb433487ff687e39d34b6-795906228.us-east-1.elb.amazonaws.com/

or the env setting:

HYDRA_STORE_THE_INDEX_ADDR=https://a190ab46c53bb433487ff687e39d34b6-795906228.us-east-1.elb.amazonaws.com/

@willscott willscott temporarily deployed to DockerBuilders February 9, 2022 13:51
@willscott willscott temporarily deployed to DockerBuilders February 10, 2022 14:25
@@ -80,7 +80,9 @@ func mergeAddrInfos(infos []peer.AddrInfo) []peer.AddrInfo {
 	}
 	var r []peer.AddrInfo
 	for k, v := range m {
-		r = append(r, peer.AddrInfo{ID: k, Addrs: v})
+		if k.Validate() == nil {
+			r = append(r, peer.AddrInfo{ID: k, Addrs: v})
+		}
Contributor

This seems fine, but why wouldn't we do this check further up next to the if r.Err == nil check when accumulating the addresses before merging them?

Contributor Author

Because we aren't doing an explicit iteration through the AddrInfos / keys during that part of the accumulation.

@aschmahmann
Contributor

This needs to be rebased on master before we can merge it since it has conflicts

@willscott
Contributor Author

That's the merge commit that got pushed this morning, no? GitHub says there are no conflicts.


func (c *client) FindProviders(ctx context.Context, mh multihash.Multihash) ([]peer.AddrInfo, error) {
// encode request in URL
u := fmt.Sprint(c.endpoint.String(), "/", mh.B58String())
Contributor

would it make sense to use multibase b58 encoding here?

Contributor

yes, but this is an existing endpoint which we are planning on replacing with the delegated routing one anyway.

Side note: @willscott you probably want to change the endpoint at some point in the future to use multibase. The cost of not having that one extra character is almost never worth it.

Contributor

@guseggert guseggert Feb 11, 2022

yes, but this is an existing endpoint which we are planning on replacing with the delegated routing one anyway.

Yeah, I was just wondering if we should change the endpoint to use multibase encoding, if it's not too late. If it's getting replaced soon, then disregard :). (And consider doing this for the replacement.)

Contributor Author

This whole setup is the 'works now' variant until go-delegated-routing is solidified and migrated to.
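For reference, a multibase-prefixed variant of the URL encoding could look like the sketch below, using go-multibase. This is illustrative only; the replacement endpoint's actual format is not defined in this PR.

package stifind

import (
	"fmt"

	"github.com/multiformats/go-multibase"
	"github.com/multiformats/go-multihash"
)

// encodeForURL renders the multihash with an explicit multibase prefix
// ('z' for base58btc), so decoders can detect the base from one extra
// leading character instead of assuming bare B58.
func encodeForURL(endpoint string, mh multihash.Multihash) (string, error) {
	enc, err := multibase.Encode(multibase.Base58BTC, mh)
	if err != nil {
		return "", err
	}
	return fmt.Sprint(endpoint, "/", enc), nil
}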

providers/storetheindex/findproviders.go (resolved)
Comment on lines +48 to +50
if len(parsedResponse.MultihashResults) != 1 {
return nil, fmt.Errorf("unexpected number of responses")
}
Contributor

If this always has one, then why is it in an array? Is it expected to change in the future? If so, can we just loop over the array so this can be forwards compatible?

Contributor Author

The query endpoint allows an array of multihashes to be queried; this client only queries for an individual one at a time.
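A sketch of the forwards-compatible loop suggested above, in the style of the quoted snippet (the ProviderResults nesting and the Provider field are assumptions; this thread only shows ContextID and Metadata):

// Loop over every result instead of asserting exactly one, so a future
// batched query still works with this client.
var infos []peer.AddrInfo
for _, res := range parsedResponse.MultihashResults {
	for _, pr := range res.ProviderResults {
		infos = append(infos, pr.Provider) // Provider field assumed
	}
}
return infos, nil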

Comment on lines 69 to 70
ContextID []byte
Metadata json.RawMessage
Contributor

What are these fields for? They look unused

Contributor Author

The response from the indexer node contains these fields, which are used by some providers. They're here for completeness of the message format. There have been some conversations about providers using them in ways that could be relevant here, for instance to express priorities, or to indicate that multiple records with the same ContextID should be de-duplicated.
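For illustration, de-duplicating records that share a ContextID might look like this. The policy is hypothetical (nothing in this PR implements it), and providerResult is a stand-in type carrying the ContextID field shown above.

package stifind

// dedupeByContextID keeps the first record seen for each ContextID.
// This is a hypothetical consumer-side policy, not code from this PR.
func dedupeByContextID(results []providerResult) []providerResult {
	seen := make(map[string]bool)
	var out []providerResult
	for _, pr := range results {
		key := string(pr.ContextID)
		if seen[key] {
			continue // same ContextID: treat as a duplicate record
		}
		seen[key] = true
		out = append(out, pr)
	}
	return out
}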

providers/storetheindex.go (resolved)
main.go (resolved)
k8s/alasybil.yaml (resolved)
head/head.go
if err != nil {
return nil, nil, fmt.Errorf("failed to instantiate delegation client (%w)", err)
}
providerStore = hproviders.CombineProviders(providerStore, hproviders.AddProviderNotSupported(stiProvider))
Contributor

@guseggert guseggert Feb 11, 2022

IIUC this will try the caching provider store concurrently, which we expect to fail, which will then enqueue an async DHT lookup. Those are expensive, will always fail, and will contend for resources (the work queue) with regular DHT queries...is there a way to avoid that?

Contributor

@aschmahmann aschmahmann Feb 11, 2022

Are you saying you're concerned that all the requests that end up being fulfilled by the indexers will result in DHT queries that will likely fail and you're concerned about the load?

If so, we have some options here depending on the semantics we want. One option might be a "fallback provider" that, instead of trying all the providers in parallel, tries them sequentially, moving on only when the previous ones fail. In this case we could then decide to only do a DHT lookup in the event the Datastore and Indexer systems returned no records.

This wouldn't cover every case (e.g. if there's some record in the DHT that we're missing for a given multihash, but the data is separately advertised by the indexers)
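A minimal sketch of that fallback idea, assuming a GetProviders-style read interface (the real provider-store interfaces in this repo may differ, and all names here are illustrative):

package providers

import (
	"context"

	"github.com/libp2p/go-libp2p-core/peer"
)

// providerGetter is the narrow read interface assumed for this sketch.
type providerGetter interface {
	GetProviders(ctx context.Context, key []byte) ([]peer.AddrInfo, error)
}

// fallbackProviders tries each store sequentially and stops at the
// first one that returns records, so a hit from the datastore or the
// indexer never triggers the expensive DHT lookup.
type fallbackProviders struct {
	stores []providerGetter
}

func (f *fallbackProviders) GetProviders(ctx context.Context, key []byte) ([]peer.AddrInfo, error) {
	for _, s := range f.stores {
		infos, err := s.GetProviders(ctx, key)
		if err != nil {
			return nil, err
		}
		if len(infos) > 0 {
			return infos, nil // earlier store answered; skip the rest
		}
	}
	return nil, nil // no store had records; caller may fall back to the DHT
}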

Contributor Author

If there's a change in logic here, it should be in a different PR in order to keep the scope of this one reasonable.

Contributor

Sounds good, but can we agree to not deploy this to the Hydras until this is fixed?

Contributor Author

This change is not making the current situation any worse, right?

Do we have a consensus agreement for something better?

Contributor

I think this change does make it worse, for the reasons listed above.

What @aschmahmann brought up seems like a good compromise. I can make the change if it helps.

Contributor

Agreed with Gus. In practice merging this code will make it worse. The change I proposed should be pretty small though

Contributor Author

I think I wasn't clear earlier: what I meant by 'this change' was that this PR uses the same composition structure as the already-merged delegated routing code. I agree that spinning up load on this path is something we need to watch in case it leads to lots of failing DHT queries, and the proposed change to the composition structure seems good.

  • There isn't going to be a substantial amount of upstream bitswap data loaded into storetheindex in the coming week. It would be useful for providers to begin testing the end-to-end flow, though, so if the additional change is going to take more than this coming week, we should consider whether we can get away without it temporarily.

  • @guseggert if you're able to make the proposed change, that would be great!

head/head.go (resolved)
@willscott
Contributor Author

I think I've responded to the actionable things brought up. Please take another look.

@guseggert
Contributor

I pushed some changes to implement the logic laid out above, and fixed a few other things. Let me know if this works for you.

@guseggert
Contributor

Actually there are some problems with my commit, let me fix them.

@guseggert
Contributor

Okay I think it's correct now, and fixed up the end-to-end test to exercise the StoreTheIndex code path.

Also:

* Reuse the delegated routing HTTP client across all heads
* Don't set arbitrary error strings as Prometheus labels, to avoid hitting the time series limit (see the sketch after this list)
* Unexport some structs
* Rip out the other delegated routing stuff since it's unused & dangerous
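On the Prometheus point: the usual fix is to map errors onto a small fixed set of label values instead of raw error strings, so label cardinality stays bounded. A sketch; the function and label names are hypothetical, not this PR's actual code:

package metricsutil

import (
	"context"
	"errors"
)

// errorLabel buckets an arbitrary error into a bounded set of label
// values, so the metric's error label cannot create unbounded time
// series the way raw error strings would.
func errorLabel(err error) string {
	switch {
	case err == nil:
		return "none"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return "other" // never the raw error string
	}
}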
@BigLep

BigLep commented Feb 15, 2022

2022-02-15: @petar will review. Potentially sync with @guseggert.

Contributor

@petar petar left a comment

lgtm + fyi notes

}
if cfg.ProvidersFinder != nil && cfg.StoreTheIndexAddr != "" {
Contributor

I think this sequence of if statements is correct, but for future reference: when your intention is to have mutually-exclusive cases, both for readability and safety, it is best to capture them either as a switch statement or as an "if else" chain. Here a switch statement would be best (Go switch cases don't fall through, so no break is needed):

switch {
case cfg.ProvidersFinder != nil && cfg.StoreTheIndexAddr == "":
	...
case cfg.ProvidersFinder != nil && cfg.StoreTheIndexAddr != "":
	...
case cfg.ProvidersFinder == nil && cfg.StoreTheIndexAddr != "":
	...
default:
	// something is not right
}

if err != nil {
return addrInfos, err
}

if len(addrInfos) > 0 {
recordPrefetches(ctx, "local")
return addrInfos, nil
}

return nil, d.Finder.Find(ctx, d.Router, key, func(ai peer.AddrInfo) {
Contributor

This is probably not worth the effort, but for future reference: it is bad style to reuse the error return value for two different purposes. Above, returned errors indicate errors in the provider stores. Here, errors indicate errors in the finder. Considering that the finder is async functionality, independent of the GetProviders execution path, its error should be logged, not returned.
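A sketch of the suggested alternative, logging the finder error instead of returning it (the logger and surrounding names are assumptions, in the style of the snippet quoted above):

// Inside GetProviders: the finder is best-effort and async, so its
// failure shouldn't be conflated with provider-store errors.
if err := d.Finder.Find(ctx, d.Router, key, func(ai peer.AddrInfo) {
	// handle asynchronously found provider
}); err != nil {
	log.Warnw("failed to start provider find", "key", key, "err", err)
}
return nil, nil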

Contributor

Otherwise this function seems to be a correct implementation of the logic we discussed in colo.

Contributor

I haven't heard of that style guideline before; if a method is supposed to do two things and it can't do one of them, then it's normal to return an error, regardless of which one failed. The finder is best-effort, so an error from the finder here doesn't represent an error in the finding itself; it means that we couldn't even try to find (e.g. in the async case, that the work couldn't be queued for some reason). My intention was for the CachingProviderStore to not know or care that things are happening async, but given that we don't have a sync version of this, I can see how that just adds confusion, so I can rename things a bit here to clarify.

@petar
Contributor

petar commented Feb 16, 2022 via email

@BigLep

BigLep commented Mar 9, 2022

This work will be undone in the future as part of #162.
