p2p: refactor MaxPendingPeers handling #3981

Merged: 1 commit, Apr 28, 2022

battlmonstr (Contributor) commented Apr 26, 2022

  • use semaphore instead of a chan struct{}
  • move MaxPendingPeers default value to DefaultConfig.P2P
  • log Error if Accept fails
  • replace quit channel with context
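
For illustration only, here is a minimal stdlib-only sketch of the shape these bullets describe. It is not the erigon code: `listenLoop`, `acquireSlot`, `demo`, the `accept`/`setup` callbacks and the addresses are all hypothetical, and a buffered channel stands in for the semaphore so the example needs no external module. It shows a bounded number of pending setups, shutdown through a `context.Context` rather than a quit channel, and a logged error when accept fails.

```go
package main

import (
	"context"
	"errors"
	"log"
	"sync"
)

// acquireSlot blocks until a pending-peer slot is free or ctx is cancelled.
// Cancellation is checked first so that shutdown wins over a free slot.
func acquireSlot(ctx context.Context, slots chan struct{}) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	select {
	case slots <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// listenLoop (hypothetical) accepts connections until ctx is cancelled and
// allows at most maxPending setups to run at the same time.
func listenLoop(ctx context.Context, maxPending int, accept func() (string, error), setup func(addr string)) error {
	slots := make(chan struct{}, maxPending)
	var wg sync.WaitGroup
	defer wg.Wait() // let in-flight setups finish before returning
	for {
		if err := acquireSlot(ctx, slots); err != nil {
			return err
		}
		addr, err := accept()
		if err != nil {
			log.Printf("accept failed: %v", err)
			<-slots // give the slot back before retrying
			continue
		}
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-slots }() // release once setup ends
			setup(addr)
		}()
	}
}

// demo feeds three fake connections into listenLoop, then cancels the context.
func demo() (handled int, cancelled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	queue := []string{"10.0.0.1:30303", "10.0.0.2:30303", "10.0.0.3:30303"}
	seen := make(chan string, len(queue))
	accept := func() (string, error) {
		if len(queue) == 0 {
			cancel() // simulate shutdown once the queue drains
			return "", errors.New("listener closed")
		}
		addr := queue[0]
		queue = queue[1:]
		return addr, nil
	}
	err := listenLoop(ctx, 2, accept, func(addr string) { seen <- addr })
	return len(seen), errors.Is(err, context.Canceled)
}
```

Calling `demo()` drives the loop with two slots and three connections. The third connection waits for a free slot, and the loop stops once the context is cancelled.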

slots <- struct{}{}
srv.log.Trace("Rejected inbound connection", "addr", fd.RemoteAddr(), "err", err)
_ = fd.Close()
slots.Release(1)
Collaborator:

Better to do the release in a defer; otherwise a panic can lead to a deadlock.

battlmonstr (Contributor Author), Apr 27, 2022:

I'm not sure how to do this without a bigger refactoring of the structure of this function.

A defer only runs when the function returns, and can't be cancelled once it is set.
Here the logic needs to release either before the goroutine starts, if the checks fail, or after the setup goroutine ends.

One way to do it would be like this (pseudocode):

slots.Acquire()
preSetupBarrier = WaitGroup(1)
go {
    defer slots.Release()
    conn, err = preSetup(preSetupBarrier)
    if err != nil { return }
    setup(conn)
}
preSetupBarrier.Wait()

//

func preSetup(preSetupBarrier) (conn, err) {
    defer preSetupBarrier.Done()
    conn, err = accept()
    if err != nil { return nil, err }
    err = checks(conn)
    return conn, err
}

It is definitely neat and future-proof with a single Release, but it adds a barrier and a new func, so that the .Done() call is also performed by defer instead of being repeated on multiple return paths.
Besides, log.Trace and fd.Close are primitives and are unlikely to panic.

Do you think this refactoring is worth it?

Also I wonder if we'd like to keep the ability to apply patches from geth upstream?
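
The pseudocode above could be fleshed out roughly as follows. This is only a sketch under assumptions: `slotPool`, `conn`, `preSetup` and `handleOne` are hypothetical names, a buffered channel stands in for the semaphore, and the `accept`/`setup` callbacks replace the real networking code. The point it shows is that the slot is released by exactly one `defer`, while a `sync.WaitGroup` barrier keeps the caller blocked until accept and the checks are done.

```go
package main

import (
	"errors"
	"sync"
)

// slotPool is a hypothetical stand-in for the semaphore discussed in the PR.
type slotPool chan struct{}

func newSlotPool(n int) slotPool { return make(slotPool, n) }
func (s slotPool) Acquire()      { s <- struct{}{} }
func (s slotPool) Release()      { <-s }
func (s slotPool) InUse() int    { return len(s) }

// conn is a hypothetical placeholder for an accepted connection.
type conn struct{ ok bool }

var errCheckFailed = errors.New("inbound connection rejected")

// preSetup accepts and validates one connection. The deferred Done lifts the
// barrier on every return path, so the caller is never left waiting.
func preSetup(barrier *sync.WaitGroup, accept func() (*conn, error)) (*conn, error) {
	defer barrier.Done()
	c, err := accept()
	if err != nil {
		return nil, err
	}
	if !c.ok {
		return nil, errCheckFailed
	}
	return c, nil
}

// handleOne runs one iteration of the listen loop. It returns once accept and
// the checks are done; the returned channel is closed when the goroutine ends.
func handleOne(slots slotPool, accept func() (*conn, error), setup func(*conn)) <-chan struct{} {
	slots.Acquire()
	var barrier sync.WaitGroup
	barrier.Add(1)
	done := make(chan struct{})
	go func() {
		defer close(done)     // runs last, after the slot is free again
		defer slots.Release() // the single Release for this slot
		c, err := preSetup(&barrier, accept)
		if err != nil {
			return
		}
		setup(c)
	}()
	barrier.Wait() // do not accept the next connection until pre-setup is over
	return done
}
```

A caller would invoke `handleOne` in a loop. The slot returns to the pool whether the checks fail or `setup` completes, without a second `Release` call on the rejection path.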

battlmonstr (Contributor Author), Apr 27, 2022:

I'm also considering using a bounded channel, but I feel that there would have to be two channels for this: it needs to allow parallel setup while queueing up Accept'ed conns.

battlmonstr force-pushed the listenLoop branch 2 times, most recently from 1510e98 to f60cb3f, April 27, 2022 14:29
AskAlexSharov (Collaborator):

TestUDPv5_lookupE2E v5_udp_test.go:94: (a065529b1d5276dc) [TRACE] [04-27|18:07:53.448] UDP read error err="read udp4 127.0.0.1:55675: use of closed network connection"
Can probably take a look at this later.

AskAlexSharov merged commit c6649f5 into devel, Apr 28, 2022
AskAlexSharov deleted the listenLoop branch, April 28, 2022 02:21
AlexeyAkhunov added a commit that referenced this pull request May 3, 2022
* Change version to alpha (#3926)

Co-authored-by: Alexey Sharp <[email protected]>
Co-authored-by: Alex Sharp <[email protected]>

* docs: update libmdbx links (#3929)

* Makefile: refactor build flags and fix 1.17 (#3930)

* Fix some cli flag descriptions (#3933)

* Fix some cli flag descriptions

* add node about verbosity

* min requirement to go 1.18 (#3934)

* save

* save

* save

* Added Ethstats service (#3931)

* somewhat there but not yet

* lol

* more efficient ethstats

* lint

* not die on no wifi

* Update bor mumbai config (#3937)

* Update ci.yml (#3936)

* Use heimdall url in integration bor consensus (#3940)

* Downloader: re-use flags defaults (#3941)

* torrent: print peers amount in logs (#3942)

* Observer - P2P network crawler (#3928)

Observer crawls the Ethereum network and collects information about the nodes.

* Torrent conns print (#3943)

* save

* save

* [erigon2] Fuzz tests for commitment (#3939)

* [erigon2] Fuzz tests for commitment

* Cleanup

* Update to erigon-lib main

Co-authored-by: Alexey Sharp <[email protected]>

* Introduce unlimited download rate (#3945)

* Introduce unlimited download rate

* More generous burst

Co-authored-by: Alexey Sharp <[email protected]>

* Replace ioutil with io and os (#3946)

* Sentry GRPC: rename Peers to PeerEvents (#3944)

* Sentry GRPC: rename Peers to PeerEvents

see erigontech/interfaces#101

* Update to erigon-lib main

Co-authored-by: Alexey Sharp <[email protected]>

* cleaned up forkchoices db insertions #3949

* fixed ethstats (#3951)

* bsc: disable snap sync (#3955)

* bsc: disable snap sync (#3956)

* Snapshots: support empty buf case (#3957)

* Snapshots: rare nil pointer at fresh start (#3958)

* got rid of the automatic usage of net api (#3952)

* got rid of the automatic usage of net api

* less confusing comment

* ops

* ops2

* important

* ops

* RPC: admin.peers() (#3960)

* RPC: admin.peers()

This RPC method returns information about the connected remote nodes.
https://geth.ethereum.org/docs/rpc/ns-admin#admin_peers

The peers are collected from all configured sentries.
See: erigontech/interfaces#102

Test with:
curl -X POST -H "Content-Type: application/json" --data '{"jsonrpc": "2.0", "method": "admin_peers", "params": [], "id":1}' localhost:8545

* save

* liner fix

Co-authored-by: alex.sharov <[email protected]>

* sentry: refactor flags, add maxpeers. (#3961)

* Experiment files 1 by 1 (#3959)

* Experiment files 1 by 1

* Remove check

* sort preverified snapshots

* docs: docker permissions

* sort preverified snapshots

* sort preverified snapshots

* sort preverified snapshots

* sort preverified snapshots

* sort preverified snapshots

* sort preverified snapshots

* save

* Fix speed log, remove file name

* Move timer out of the loop

* Calculate total size of downloaded files

* Fixes

* Fix

* Fix

* Fix

* Move downloadData

* Fix

* Revert "Fix"

This reverts commit 038e02b.

* Revert "Move downloadData"

This reverts commit 8130a4d.

* Revert "Fix"

This reverts commit 1dca25b.

* Revert "Fix"

This reverts commit ee5a1e8.

* Revert "Fix"

This reverts commit 8af7be7.

* Revert "Fixes"

This reverts commit 50509af.

* Revert "Calculate total size of downloaded files"

This reverts commit 64a26df.

* Remove progress

* Remove progress

Co-authored-by: Alexey Sharp <[email protected]>
Co-authored-by: alex.sharov <[email protected]>

* Update stage_headers.go (#3966)

* Snapshots: open bittorrent udp port in docker (#3969)

* Snapshots: open torrent udp in docker-compose.yml

* Snapshots: open torrent udp in docker-compose.yml

* Delete blocks in [from, to) range (#3970)

* Snapshots: allow stage_headers --unwind behind available snapshots (#3971)

* save

* save

* save

* Integration: allow headers --reset (#3972)

* Bsc: enable syncmode=snap by default #3973

* rlp: add support for optional struct fields (#22832) (#3977)

This adds support for a new struct tag "optional". Using this tag, structs used
for RLP encoding/decoding can be extended in a backwards-compatible way,
by adding new fields at the end.

see geth commit ethereum/go-ethereum@700df14

Co-authored-by: Felix Lange <[email protected]>

* Forgot to check err status (#3978)

* Forgot to check err status

* Invalid header shouldn't fail the entire stage

* Potential fix for verification (#3962)

* Potential fix for verification

* multi verify

Co-authored-by: Alexey Sharp <[email protected]>
Co-authored-by: Alex Sharp <[email protected]>

* p2p/discover/v4wire: use optional RLP field for EIP-868 seq (#3963)

This changes the definitions of Ping and Pong, adding an optional field
for the sequence number. This field was previously encoded/decoded using
the "tail" struct tag, but using "optional" is much nicer.

see ethereum/go-ethereum#22842

Co-authored-by: Felix Lange <[email protected]>

* FullSync instead of FastSync (#3980)

* Update README.md (#3984)

* Update README.md (#3985)

* Update README.md (#3987)

* Update README.md (#3988)

* Update README.md (#3989)

* save (#3983)

* Update to erigon-lib main (#3992)

Co-authored-by: Alex Sharp <[email protected]>

* TxLookup fix 2 (#3994)

* save

* save

* tolerate some fails

* tolerate some fails

Co-authored-by: Alexey Sharp <[email protected]>

* No NewBlock gossip after Merge (#3995)

* Check that safe & finalized blocks are canonical for no-op forkChoice (#3997)

* Place finishHandlingForkChoice after startHandlingForkChoice

* forkChoiceMessage -> forkChoice

* Check that safe & finalized blocks are canonical for no-op forkChoice

* Re-introduced cleanup of temporary table (#3999)

* Re-introduced cleanup of temporary table

* Fix sign

* Fix lint

* Fix lint

* Revert

Co-authored-by: Alex Sharp <[email protected]>

* Update skip_analysis.go (#4003)

* Downloader: calc stat inside, add --torrent.download.slots and limit downloads inside (#3986)

* save

* save

* save

* save

* save

* save

* save

* save

* save

* p2p: speed-up TestUDPv4_LookupIterator (#4000)

The test was slow, because it was trying to find
predefined nodeIDs (lookupTestnet) by generating random keys
and trying to find their neighbours
until it hits all nodes of the lookupTestnet.
In addition each FindNode response was waited for 0.5 sec (respTimeout).
This could take up to 30 sec and fail the test suite.

A fake random key generator is now used during the test.
It issues the expected keys, and the lookup converges quickly.
The reply timeout is reduced for the test.
Now it normally takes less than 0.1 sec.

* p2p: refactor MaxPendingPeers handling (#3981)

* use semaphore instead of a chan struct{}
* move MaxPendingPeers default value to DefaultConfig.P2P
* log Error if Accept fails
* replace quit channel with context

* downloader stuck on 99.9% fix #4004

* Open only existing torrent files (#4007)

* save

* save

* save

* save

* save

* Open shorter logs #400

* Fix empty "Tables" log line (#4008)

* save

* save

* save

* Torrent: maxpeers flag were used incorrectly

* reduce downloader deps (#4010)

* reduce downloader deps

* reduce downloader deps

* reduce downloader deps (#4011)

* Handle system-txn in block_reader (#4012)

* reduce downloader deps

* reduce downloader deps

* save

* reduce downloader deps

* [integration tool] Clean BorReceipt when reset state (#4013)

* Update reset_state.go

* Update reset_state.go

* rename field "type" (#4015)

* save

* save

* save

* typed sender (#4016)

* save

* save

* Observer: fix panic on clean start (#4002) (#4017)

Problem: (nil, nil) from CountPingErrors was not handled.
This happens if the node is not in the db (a bootstrap node),
and was never crawled before.

* Add override.terminaltotaldifficulty flag (#4018)

* cmd/utils: initialize f.Value before setting variable

* override.terminaltotaldifficulty flag

* Add OverrideTerminalTotalDifficulty to default_flags

* p2p: fix flaky TestUDPv5_lookupE2E (#4020)

The test was flaky, because of the "endpoint prediction".
The test starts 5 nodes one by one.
Node 0 is used as a bootstrap node for nodes 1-4.
When it is about to add, say, node 3, nodes 0 and 1 might already have had a chance to communicate,
and updateEndpoints() deletes the node 0 UDP port, because fallbackUDP port was not configured.

In this case node 3 would get a bootstrap node 0 without a port and lead to an error:

    v5_udp_test.go:110: bad bootstrap node "enr:...": missing UDP port

The problem was reproducible by this command:

    go test ./p2p/discover -run TestUDPv5_lookupE2E -count 500

* Added Goerli Full Node Space Requirements (#4021)

* p2p: crawler-friendly handshake (#3982)

* exchange RLPx Hello even when maxpeers limit is reached
* bump MaxPendingPeers to increase the default handshake queue
  (and the likelyhood of Hello exchange)

* Add link about rqspbery po (#4022)

* More efficient header verification of headers for Parlia when snapshots are used (#3998)

* Update stageloop.go

* Print

* Consider snapshot headers as parlia checkpoints

* Not fail after not loading snapshot

* Lazy snapshots

* Print number of validators

* More printing

* Use epoch instead of checkpoint interval

* Reduce logging

* Fix compilation

* Remove trace jump dest

* Fix lint

* Not store snapshots every epoch

* Separate snapshot for verification and finalisation

Co-authored-by: Alex Sharp <[email protected]>
Co-authored-by: Alexey Sharp <[email protected]>

* Docker build: make db-tools to depend on git-submodules (#4024)

* save

* save

* save

* save

* save

* More careful handle of sequences in stage_headers --reset (#4023)

* save

* save

* save

* save

* added ovveride merge fork block (#4027)

* Fix non-starting download (#4031)

* save

* save

* save

* save

* save (#4032)

* Truncate bor receipts on unwind (#4033)

Co-authored-by: Alexey Sharp <[email protected]>

* eth/filters: Fix filterLogs() (#4036)

* index segments by maximum by 2 workers #4041

* trace read parent header from snapshot and lru #4042

* make sure stage_headers --reset doesn't left garbage in bodies table #4043

* Fix for Bor (Polygon) (#4044)

* print branchHash

* Print state changes

* Print val

* Fix for author

* Remove prints

Co-authored-by: Alexey Sharp <[email protected]>

* Cleanup isBor (#4045)

Co-authored-by: Alexey Sharp <[email protected]>

* Speed up docker image build by use layer cache (#4038)

* speed up docker image build by use layer cache

* rearrenge Dockerfile

* enable docker layer cache in github action

* state_processor: fix ignored SkipAnalysis() result (#4046)

`cfg` is not a pointer

* p2p: improve test TestTable_findnodeByID (#4047)

* refactor test
* add a fast fixed examples test for the main suite
* split slow test for the integration suite

* Update skip_analysis.go (#4052)

* More relax inclusion of headers in the downloader (#4050)

* More relax inclusion of headers in the downloader

* Fix

Co-authored-by: Alexey Sharp <[email protected]>

* Revert "Speed up docker image build by use layer cache (#4038)" (#4054)

This reverts commit e758fb8.

* Increase max DB size to 8 Tb for chain data only (#4055)

* Update node.go

* Update node.go

* Point to erigon-lib alpha

Co-authored-by: Alexey Sharp <[email protected]>
Co-authored-by: Alex Sharp <[email protected]>
Co-authored-by: battlmonstr <[email protected]>
Co-authored-by: Chase Wright <[email protected]>
Co-authored-by: Alex Sharov <[email protected]>
Co-authored-by: Giulio rebuffo <[email protected]>
Co-authored-by: Krishna Upadhyaya <[email protected]>
Co-authored-by: Håvard Anda Estensen <[email protected]>
Co-authored-by: Enrique Jose  Avila Asapche <[email protected]>
Co-authored-by: Felix Lange <[email protected]>
Co-authored-by: Andrew Ashikhmin <[email protected]>
Co-authored-by: gaia <[email protected]>
Co-authored-by: EXEC <[email protected]>
Co-authored-by: Groute <[email protected]>

2 participants