Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: correctly apply addrFilters in the dht #872

Merged
merged 4 commits into from
Sep 4, 2023

Conversation

Jorropo
Copy link
Contributor

@Jorropo Jorropo commented Aug 23, 2023

This still does not do the fullrt client but it wasn't doing it before either.

Fixes smart dialler regression from #870.

@Jorropo Jorropo requested a review from dennis-tra August 23, 2023 22:44
@Jorropo Jorropo requested review from guillaumemichel and a team as code owners August 23, 2023 22:44
dht.go Show resolved Hide resolved
pb/protocol_messenger.go Outdated Show resolved Hide resolved
@Jorropo Jorropo requested a review from dennis-tra August 24, 2023 15:00
@Jorropo Jorropo force-pushed the stop-sending-addrs-in-add-provide branch from a3e39f6 to 16835eb Compare August 24, 2023 15:05
This still does not do the fullrt client but it wasn't doing it before either.
@Jorropo Jorropo force-pushed the stop-sending-addrs-in-add-provide branch from 16835eb to fe4a592 Compare August 24, 2023 15:12
Copy link
Contributor

@guillaumemichel guillaumemichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Jorropo
Copy link
Contributor Author

Jorropo commented Aug 24, 2023

cc @dennis-tra @guillaumemichel, @BigLep really wants us to have a test here, I don't have time to do it, could you please pick it up ?

@guillaumemichel
Copy link
Contributor

I cannot take this up at the moment 👎🏻

@BigLep
Copy link

BigLep commented Aug 24, 2023

@dennis-tra : are you able to take? My understanding is that #870 hit a regression. Bummer, but that's life. I like to hold to the standard that we at least add regression tests when we trigger breakage/bugs. (Or is this code that will all be going away in the next month with refactoring?)

@dennis-tra
Copy link
Contributor

dennis-tra commented Aug 24, 2023

@BigLep, @Jorropo and I have had discussions around #870 and we think differently about it. From my point of view it's certainly not a regression but instead puts a spotlight on the missing filters.

#870 finally sets and increases the address TTL as intended in the code and suggested by RFM 17.1 (here are details about when the bug was introduced). We're trading increased pressure on the smart dialler against DHT walks. Filtering the addresses (this PR) is complementary and the right way to decrease pressure in the general case (and not only specifically because of the changes in #870).

The increased pressure on the smart dialler is dwarfed by the speed-up of skipping a second DHT walk to find the addresses.

That being said, I'm afraid I'm not sure what a regression test should test if I don't see a regression here.

I could verify that the literal changes of the code in this PR do what they aim to do - filter addresses when storing/fetching/sending multiaddresses to/from the peerstore or other peers. However, that's only implicitly related to increased pressure on the smart dialler and I wouldn't call it a regression test. Terminology aside, I'm happy to add these tests!

However, this area of the code will and has changed in the v2 codebase. I've already ported the handlers to the v2-develop branch in #864 where I also already test the address filtering.

One genuine note to the end, I really appreciate that @Jorropo you sharpen my eye for the implications of seemingly minor changes, as in #870.

@Jorropo
Copy link
Contributor Author

Jorropo commented Aug 25, 2023

@dennis-tra I don't understand what you mean, this is not reverting your change (#870), but applying address filters, right now your change cache private and public ips on the wan dht of kubo, instead only public ips will be cached.

@dennis-tra
Copy link
Contributor

dennis-tra commented Aug 25, 2023

@Jorropo I see you're not reverting #870 here and I also understand the significance of not caching private IPs.

Let's get this change in and if we get the chance to chat in sync about it, do it during one of the next co-los.

@yiannisbot
Copy link

Terminology aside, I'm happy to add these tests!

The impact of the original change here is expected to be significant, i.e., avoiding a second DHT lookup (that's our hope and we've got all the measurements in place to monitor the result). So, I'd say let's add these tests (I agree they're not regression tests as this PR here doesn't fix a regression, but it's an optimisation on top of #870. I'd call them tests on the optimisation proposed by @Jorropo) and make sure this goes into the next libp2p release.

@BigLep
Copy link

BigLep commented Aug 25, 2023

Thanks all. I'm not close to the details and not going to dive in more unless needed. I just want to make sure that we hold the line of making sure the any new code we write or touch has the proper test coverage. Basically, we aren't going to go on a campaign to fix the sins of the past, but I want to make sure our present and future selves does the right thing, and when we get tripped up by decisions of the past that we bring them up to our desired standard. As long as we're following that principle, I'm good. And if we're not following it, I want to make sure we're upfront and intentional about it (where this can mean something like "not adding a backfill test because this code will be replaced in one month and the proper assertions have been added to the new code in commit X").

@dennis-tra
Copy link
Contributor

dennis-tra commented Aug 29, 2023

added two tests where testing infrastructure was already present.

I think the serving part is missing. When we hand out addresses we should also filter them. When the peer connects to us to store provider records we'll store its addresses (private + public) in the peerstore because of the identify exchange. This means in the first 30min (the ttl for recently connected peers) we serve both. The same applies when we're connected to the peer.

@guillaumemichel guillaumemichel self-requested a review August 30, 2023 09:32
@guillaumemichel
Copy link
Contributor

guillaumemichel commented Aug 30, 2023

Thank you @dennis-tra for the tests!


All calls to add peers to the peerstore go through maybeAddAddrs

go-libp2p-kad-dht/dht.go

Lines 923 to 936 in 42d3453

func (dht *IpfsDHT) maybeAddAddrs(p peer.ID, addrs []ma.Multiaddr, ttl time.Duration) {
// Don't add addresses for self or our connected peers. We have better ones.
if p == dht.self || dht.host.Network().Connectedness(p) == network.Connected {
return
}
dht.peerstore.AddAddrs(p, dht.filterAddrs(addrs), ttl)
}
func (dht *IpfsDHT) filterAddrs(addrs []ma.Multiaddr) []ma.Multiaddr {
if f := dht.addrFilter; f != nil {
return f(addrs)
}
return addrs
}

The only call that isn't filtered is the following:

// AddProvider adds a provider
func (pm *ProviderManager) AddProvider(ctx context.Context, k []byte, provInfo peer.AddrInfo) error {
ctx, span := internal.StartSpan(ctx, "ProviderManager.AddProvider")
defer span.End()
if provInfo.ID != pm.self { // don't add own addrs.
pm.pstore.AddAddrs(provInfo.ID, provInfo.Addrs, ProviderAddrTTL)
}
prov := &addProv{
ctx: ctx,
key: k,
val: provInfo.ID,
}
select {
case pm.newprovs <- prov:
return nil
case <-ctx.Done():
return ctx.Err()
}
}

We cannot call maybeAddAddrs because the ProviderManager doesn't have a reference to the IpfsDHT. Hence we would need to add a filter attribute to the ProviderManager and make sure that the mentioned call to peerstore.AddAddrs provides filtered addresses.


Edit:

The only caller to ProviderStore.AddProvider filters the multiaddrs before the call so we should be good.

// We run the addrs filter after checking for the length,
// this allows transient nodes with varying /p2p-circuit addresses to still have their anouncement go through.
addrs := dht.filterAddrs(pi.Addrs)
dht.providerStore.AddProvider(ctx, key, peer.AddrInfo{ID: pi.ID, Addrs: addrs})

The other callers (IpfsDHT.Provide and FullRT.Provide) add content that is being provided by the host, and do not provide any addrs (because we don't want to add self to the peerstore).

@guillaumemichel
Copy link
Contributor

@Jorropo can you have a last look before we merge this?

@Jorropo
Copy link
Contributor Author

Jorropo commented Sep 4, 2023

Thx a lot for doing this, LGTM I havn't reviewed the tests in detail, just the non test code. ❤️ 🙏

@Jorropo Jorropo merged commit 0c90569 into master Sep 4, 2023
9 checks passed
@Jorropo Jorropo deleted the stop-sending-addrs-in-add-provide branch September 4, 2023 13:12
@Jorropo Jorropo mentioned this pull request Sep 4, 2023
cortze pushed a commit to cortze/go-libp2p-kad-dht that referenced this pull request Jul 30, 2024
added tests for addrfilter

fix: don't add unresponsive DHT servers to the Routing Table (libp2p#820)

* added check to avoid adding unresponsive dht peers to the dht routing table

* removed lock in adding peers to the rt

* made variable names more meaningful

* fixed network loop and corrected tests

* added UsefulPeer() references from current PR

* go mod tidy

* added delay in TestRefreshBelowMinRTThreshold

* addressed review

* go mod tidy

* addressed Jorropo review

* added comments

* removed state of peers probed recently

* fix conflicts merge

* updated deps

* added optimizations documentation

* Update dht.go

* updated md files

---------

Co-authored-by: Jorropo <[email protected]>

fix: don't add unresponsive DHT servers to the Routing Table (libp2p#820)

* added check to avoid adding unresponsive dht peers to the dht routing table

* removed lock in adding peers to the rt

* made variable names more meaningful

* fixed network loop and corrected tests

* added UsefulPeer() references from current PR

* go mod tidy

* added delay in TestRefreshBelowMinRTThreshold

* addressed review

* go mod tidy

* addressed Jorropo review

* added comments

* removed state of peers probed recently

* fix conflicts merge

* updated deps

* added optimizations documentation

* Update dht.go

* updated md files

---------

Co-authored-by: Jorropo <[email protected]>
(cherry picked from commit 8c9fdff)

chore: release v0.24.0

chore: Update .github/workflows/stale.yml [skip ci]

fix: leaking go routines

fix: bump kbucket for abba bug

refactor: remove goprocess

fix: decrease tests noise, update kbucket and fix fixRTIUfNeeded

chore: release v0.24.1

chore: update go-libp2p to a non deadlocky version

chore: release v0.24.2

chore: Update .github/workflows/stale.yml [skip ci]

tracing: fix DHT keys as string attribute not being valid utf-8

chore: release v0.24.3

fix the server mode log in handleNewMessage by changing it to debug

chore: delete templates [skip ci] (libp2p#861)

ci: uci/copy-templates (libp2p#862)

* chore: add or force update .github/workflows/go-test.yml

* chore: add or force update .github/workflows/go-check.yml

* chore: add or force update .github/workflows/releaser.yml

* chore: add or force update .github/workflows/release-check.yml

* chore: add or force update .github/workflows/tagpush.yml

chore: bump go.mod to Go 1.20 and run go fix

chore: bump go-libp2p and/or quic-go to latest version

tracing: add protocol messages client tracing

chore: release v0.25.0

add provider record addresses to peerstore

fixes issue libp2p#868

fixing base32 import

fix: apply addrFilters in the dht (libp2p#872)

* fix: correctly apply addrFilters in the dht

This still does not do the fullrt client but it wasn't doing it before either.

* Add address filter tests

* use channel instead of waitgroup

to catch timeouts

* filter multiaddresses when serving provider records

---------

Co-authored-by: Dennis Trautwein <[email protected]>

fix: properly iterate in tracing for protocol messenger

MB Copy Paste Typo,

This caused panics while tracing.

chore: use go-libp2p-routing-helpers for tracing needs

go-libp2p-routing-helpers has an optimized implementation that does nothing if we are not tracing, it also properly log all IO of the request.

perf: don't buffer the output of FindProvidersAsync

We always select it against ctx and we don't rely on the fact it will always be non blocking since when count is zero we can't realistically preallocate.
Buffering use more memory makes more garbage and is less efficient than direct copies.

chore: release v0.25.1

add ctx canceled err check

return canceled err not retry

reset stream and return err when ctx canceled

release v0.25.2

chore: Update .github/workflows/stale.yml [skip ci]

chore: bump go.mod to Go 1.21 and run go fix

chore: run go mod tidy

chore: add or force update .github/workflows/go-test.yml

chore: add or force update .github/workflows/go-check.yml

chore: add or force update .github/workflows/releaser.yml

chore: add or force update .github/workflows/release-check.yml

chore: add or force update .github/workflows/tagpush.yml

chore: add or force update .github/workflows/go-test.yml

chore: add or force update .github/workflows/go-check.yml

chore: add or force update .github/workflows/releaser.yml

chore: add or force update .github/workflows/release-check.yml

chore: add or force update .github/workflows/tagpush.yml

chore: add or force update .github/workflows/go-test.yml

chore: add or force update .github/workflows/go-check.yml

chore: add or force update .github/workflows/releaser.yml

chore: add or force update .github/workflows/release-check.yml

chore: add or force update .github/workflows/tagpush.yml

findnode(self) now returns multiple peers

don't lookup check if not enough peers in table

enhanced tests

moved rt size check after request

Upgrade to go-log v2.5.1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants