Session roll over fix #1536

Merged Apr 10, 2023 (6 commits)

Conversation

nodiesBlade
Contributor

@nodiesBlade nodiesBlade commented Mar 16, 2023

Addresses #1464 by adding a configurable session tolerance, client_session_sync_allowance, to the pocket config, with a default value of 1.

This means that once a new session begins, servicer nodes will still accept relays for the old session for up to 1 session (4 blocks at the time of this PR).
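
For illustration, a minimal sketch of how a setting like this could be loaded with its default. The struct and loader below are hypothetical stand-ins and not the real pocket-core config types; the int64 type is an assumption, and only the key name and the default of 1 come from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pocketConfigSketch is a hypothetical stand-in for the relevant part of the
// pocket config. Only the JSON key and its default of 1 come from this PR.
type pocketConfigSketch struct {
	ClientSessionSyncAllowance int64 `json:"client_session_sync_allowance"`
}

// loadConfigSketch starts from the default and overlays whatever the JSON sets.
func loadConfigSketch(raw []byte) (pocketConfigSketch, error) {
	cfg := pocketConfigSketch{ClientSessionSyncAllowance: 1} // default per this PR
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

func main() {
	unset, _ := loadConfigSketch([]byte(`{}`))
	disabled, _ := loadConfigSketch([]byte(`{"client_session_sync_allowance": 0}`))
	fmt.Println(unset.ClientSessionSyncAllowance, disabled.ClientSessionSyncAllowance)
}
```

Under these assumptions, a node operator who wants the old strict behavior would set the value to 0 explicitly.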

DISCLAIMER: SHOULD DEFINITELY TEST... USING THE REGULAR RELEASE CYCLE MECHANISMS

@nodiesBlade
Contributor Author

Newly added tests

=== RUN   TestKeeper_HandleRelay
I[2023-03-16|00:54:58.549] Initializing 1512ec818e72bfec7696f443e2324f79d90b4762 session and evidence cache 
--- PASS: TestKeeper_HandleRelay (0.20s)
=== RUN   TestKeeper_HandleRelayWithinTolerance
I[2023-03-16|00:54:58.726] Initializing 5408648319da126ccd8cce3c5efa44a93a9d8943 session and evidence cache 
--- PASS: TestKeeper_HandleRelayWithinTolerance (0.15s)
=== RUN   TestKeeper_HandleRelayOutsideTolerance
I[2023-03-16|00:54:58.904] Initializing 0e694b1904edc2fbdc13978ec27487a717d51bde session and evidence cache 
I[2023-03-16|00:54:59.058] Initializing d8dbfee56de47cb4c4708e61c012d1b99bb42d8b session and evidence cache 
--- PASS: TestKeeper_HandleRelayOutsideTolerance (0.33s)
PASS

Process finished with the exit code 0

@nodiesBlade
Contributor Author

nodiesBlade commented Mar 16, 2023

Note: some nodes will seal their evidence claim immediately after session rollover, which will result in evidence sealed errors. To remediate this, we could modify the claim and proof txs to only occur on the 2nd, 3rd, or 4th block of a session, depending on the node's address offset. Incorporating this network-wide will require more due diligence, as the reason this code exists in the first place is to prevent too many claims/proofs from being submitted at once in a single block.
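
A rough sketch of the staggering idea above, purely as an assumption of how an address-based offset could skip the first block of a session. `claimBlockOffsetSketch` is a hypothetical helper; the existing offset logic in pocket-core may be computed differently.

```go
package main

import "fmt"

// claimBlockOffsetSketch maps a node address to a 1-based position inside a
// session, skipping position 1 (the session block itself).
// Hypothetical helper: the real offset logic in pocket-core may differ.
func claimBlockOffsetSketch(address []byte, blocksPerSession int64) int64 {
	if blocksPerSession < 2 {
		return 1 // nothing to stagger over
	}
	var sum int64
	for _, b := range address {
		sum += int64(b)
	}
	// Positions 2..blocksPerSession, chosen deterministically per address.
	return 2 + sum%(blocksPerSession-1)
}

func main() {
	for _, addr := range [][]byte{{0x00}, {0x01}, {0x02}, {0x03}} {
		fmt.Println(claimBlockOffsetSketch(addr, 4))
	}
}
```

With 4 blocks per session, every address lands on position 2, 3, or 4, so claims stay spread out but never hit the rollover block.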

@Olshansk self-requested a review March 16, 2023 16:43
@Olshansk added the bug and optimization labels Mar 16, 2023
@Olshansk added this to the Network Cost milestone Mar 16, 2023
@Olshansk
Member

@PoktBlade Thanks for doing this!

I haven't reviewed the code but just wanted to ACK that I'm aware and it's on my backlog.

DISCLAIMER: SHOULD DEFINITELY TEST... USING THE REGULAR RELEASE CYCLE MECHANISMS

I, unfortunately, think that testing and getting confidence in this change is going to be one of the hardest parts, aside from the social coordination of deploying it.

I have two requests:

  1. Can you look into whether we can extend the e2e testing framework to support relays and add an integration test that validates this change?

  2. I have an open PR that enables spinning up a LocalNet and sending relays by specifying altruist endpoints (details are in the docs).
    2.1 Can you use it to validate/test this change?
    2.2 Please note that the infra is janky but the best we have (to the best of my knowledge), so I apologize in advance.

@nodiesBlade
Contributor Author

nodiesBlade commented Mar 16, 2023

Hey,

I have set up my own localnet stack from rigging up my own E2E stack. So I can test to see if session roll over works.

the e2e automatic test suite is new to me, I don't have the bandwidth to test that. It was new and delivered abruptly so no one besides pni has context of that. The previous testing mechanisms included a QA suite of test cases and then a manual regression test using said qa test suite. I can add some scenarios to that, and someone can translate it to the automatic version.

@Olshansk
Member

Olshansk commented Mar 16, 2023

tl;dr Noted, I will put the time into this myself.


I have set up my own localnet stack from rigging up my own E2E stack. So I can test to see if session roll over works.

I tried to put together good & easy-to-follow documentation (so it can be used by others), but if you can share the results of your own stack (potentially a video and/or documentation), that would be good too. Ref: https://github.com/pokt-network/pocket-core/blob/fixing_localnet/localnet.md

the e2e automatic test suite is new to me, I don't have the bandwidth to test that. It was new and delivered abruptly so no one besides pni has context of that

Understood. I tried to make the documentation easy to follow but will invest time in extending, improving and testing it. https://github.com/pokt-network/pocket-core/tree/staging/e2e

@nodiesBlade
Contributor Author

Hey @Olshansk, I responded to most of your comments. Thanks for taking the time to review it. I think the largest concern was that you mentioned one of the tests was not running for you after reverting the major core piece of this branch, indicating something might be wrong. Can you provide your test results so we can take it from there? Thank you!

@Olshansk
Member

Shouldn't this test fail if I revert the changes you made?

Screenshot 2023-03-28 at 7 58 28 PM

@nodiesBlade
Contributor Author

nodiesBlade commented Mar 29, 2023

Shouldn't this test fail if I revert the changes you made?

Screenshot 2023-03-28 at 7 58 28 PM

Tests should not fail under the current designs.

Explanation

So the test scenario is mainly bootstrapped by a function, setupHandleRelayTest, which initializes practically everything that wasn't mockable so that these functions actually run without runtime errors. This function also always returns a signed relay encoded with a session block height of 976 (referred to as relaySessionBlockHeight from now on). Finally, it mocks out the context so that the current height is set by currentBlockHeight in the function signature.

Change made

In your scenario, you revert my changes, so the network will always grab the latest session block height dictated by currentBlockHeight.

Tests:

Test One (Simple handle relay):

  1. Block height 976
  2. Relay encoded with relaySessionBlockHeight as 976
  3. GetLatestSessionBlockHeight is still 976
  4. Validation passes, relay is handled.

Test Two (Within tolerance):

  1. Block height 977
  2. Relay encoded with relaySessionBlockHeight as 976
  3. GetLatestSessionBlockHeight is still 976
  4. Validation passes, relay is handled.

Test Three (Outside tolerance):

  1. Block height 926
  2. Relay encoded with relaySessionBlockHeight as 976
  3. GetLatestSessionBlockHeight is now 926
  4. So relaySessionBlockHeight != GetLatestSessionBlockHeight, and therefore the invalid block height error assertion matches.
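
The three scenarios above can be reproduced with a standalone sketch. It assumes sessions start at heights where height % blocksPerSession == 1 (as in IsSessionBlock) and uses the 25 blocks per session of the test setup; `latestSessionBlockHeight` here is a free function written for illustration, not the keeper method itself.

```go
package main

import "fmt"

// latestSessionBlockHeight returns the first block of the session containing
// height, assuming sessions start where height % blocksPerSession == 1.
func latestSessionBlockHeight(height, blocksPerSession int64) int64 {
	if height%blocksPerSession == 0 {
		// The last block of a session belongs to the session that started
		// blocksPerSession-1 blocks earlier.
		return height - blocksPerSession + 1
	}
	return (height/blocksPerSession)*blocksPerSession + 1
}

func main() {
	const relaySessionBlockHeight = 976 // fixed by the test setup
	const blocksPerSession = 25         // value used in the test environment
	for _, current := range []int64{976, 977, 926} {
		latest := latestSessionBlockHeight(current, blocksPerSession)
		// A strict equality check is the behavior with the change reverted.
		fmt.Printf("height=%d latest=%d match=%t\n",
			current, latest, latest == relaySessionBlockHeight)
	}
}
```

Under a strict equality check the first two heights match and the third does not, which is why the same three tests pass with or without the tolerance change.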

pls explain

Now you might be asking, what's up with these random/magic numbers in these tests, and why are the tests structured this way. Unfortunately, this is just tech debt within the code base already, and I'm happy to resolve as much as I can and provide clarity as you see fit. What's more important to me is that you understand the code that I'm writing and you agree that the logic works. Everything else, I prefer to be a bit more pragmatic and timely on how to solve these PRs. I'll solve as much as I can in the same PR and will suggest moving the unresolved ones to another GH issue/PR. Can you mark the ones as blockers, and the ones as nice to have, and I will prioritize as needed.

P.S - I used delve to debug to see the values on the heap/stack and the function calls trace, good tool to use

Member

@Olshansk Olshansk left a comment

What's more important to me is that you understand the code that I'm writing and you agree that the logic works.

I spent several hours yesterday looking at this and trying to understand it, but I personally still don't have confidence that this works.

What I'd prefer is something similar to what @msmania has in #1534.

  1. Add feature switch
  2. Show failing test with feature switch off
  3. Turn on feature switch
  4. Show successful test with feature switch on

Now you might be asking, what's up with these random/magic numbers in these tests, and why are the tests structured this way. Unfortunately, this is just tech debt within the code base already, and I'm happy to resolve as much as I can and provide clarity as you see fit.

Sorry about that. I'm having a hard time wrapping my head around it, which is why I'm asking for the renaming of certain functions/variables and adding comments where appropriate.

Can you mark the ones as blockers, and the ones as nice to have, and I will prioritize as needed.

I added follow-up comments and re-reviewed the PR.

  1. Please take a look at my comments and re-request a review once it's ready for another look.
  2. If you add me as an editor on your fork (or move the branch to the main repo), I can help make some of the edits myself on this branch.

@nodiesBlade
Contributor Author

Hey @Olshansk

Addressed most of your comments, try giving it a read again with the test cases, it should provide more clarity for your reading. If it takes too long, just message me async and I'll do my best to explain the logic.

@nodiesBlade
Contributor Author

nodiesBlade commented Mar 31, 2023

Also, you mention we should have a feature flag.. Just a couple notes to that:

  1. The tests written previously never accounted for invalid block height tolerance; there was only a simple happy-path test for relay handling, which is why you don't see a 'failure case' test. The outside tolerance test effectively adds an invalid block height test that applies both with and without my changes.
  2. The handleRelay function is purely client sided, it has nothing to do with consensus, so there is not really a need for a true 'feature flag'.

@msmania
Contributor

msmania commented Apr 3, 2023

Sorry for interrupting the ongoing discussion. If I understand this patch correctly, the change in HandleDispatch is not necessary, right?

@nodiesBlade
Contributor Author

Sorry for interrupting the ongoing discussion. If I understand this patch correctly, the change in HandleDispatch is not necessary, right?

I would deem it necessary as if we are going to allow for “sessions to be rolled over” then we should allow the node to dispatch for the old sessions as well.

@msmania
Contributor

msmania commented Apr 3, 2023

I would deem it necessary as if we are going to allow for “sessions to be rolled over” then we should allow the node to dispatch for the old sessions as well.

Session rollover can be achieved by the change in HandleRelay, right? The other part is not necessary.

When someone, basically the Portal, starts a relay, it should always use the latest session and shouldn't request a relay with an old session because app or node's staking status may have been changed. And the Portal indeed does that by setting SessionHeader.SessionBlockHeight = 0 (I think it happens here), so it doesn't change the relay behavior anyway.

What your change in HandleDispatch achieves is to enable us to query an old session. I agree this is convenient, but I think it's out of the scope of this PR.

Moreover, you need to be careful about the second argument of NewSession, which must be at most the end of the session of the first argument. Otherwise it may return a wrong session as described in #1529.

@nodiesBlade
Contributor Author

What your change in HandleDispatch achieves is to enable us to query an old session. I agree this is convenient, but I think it's out of the scope of this PR.

Fair enough. Removed it. Portals who want to dispatch can later send in a PR or modify their own PCC to dispatch for an old session.

@msmania
Contributor

msmania commented Apr 3, 2023

Thanks! One more ask about test code. In your test, each testcase calls setupHandleRelayTest, creating the whole thing from scratch. This is not an ideal design. Also, you should not hardcode block numbers even though they're stored separately. Please use ctx.BlockHeight(), which is 976, and keeper.BlocksPerSession(ctx), which is 25.

I think the test should be something like this. With this, one setup covers all scenarios and magic numbers are removed completely.

func TestKeeper_HandleRelay(t *testing.T) {
	setupHandleRelayTest(...)

	blocksPerSession := keeper.BlocksPerSession(ctx)
	allowance := int64(1) // any positive value

	// Allowance off: relays for older sessions are expected to be rejected
	types.GlobalPocketConfig.ClientSessionBlockSyncAllowance = 0

	// Send relay with older heights without session roll over
	for i := int64(0); i < blocksPerSession*(allowance+1); i++ {
		height := ctx.BlockHeight() - i
		// create a relay payload with `height`
		resp, err := keeper.HandleRelay(mockCtx, validRelay)
		// verify the result accordingly
	}

	// Allowance on: relays within `allowance` sessions are expected to pass
	types.GlobalPocketConfig.ClientSessionBlockSyncAllowance = allowance

	// Send relay with older heights with session roll over
	for i := int64(0); i < blocksPerSession*(allowance+1); i++ {
		height := ctx.BlockHeight() - i
		// create a relay payload with `height`
		resp, err := keeper.HandleRelay(mockCtx, validRelay)
		// verify the result accordingly
	}
}

@nodiesBlade
Contributor Author

nodiesBlade commented Apr 4, 2023

Thanks! One more ask about test code. In your test, each testcase calls setupHandleRelayTest, creating the whole thing from scratch. This is not an ideal design. Also, you should not hardcode block numbers even though they're stored separately. Please use ctx.BlockHeight(), which is 976, and keeper.BlocksPerSession(ctx), which is 25.

If we were using something like Ginkgo to structure our tests, I'd agree. Otherwise, having the tests separately helps isolate the environments and what we're testing for since these are unit tests and not integration / e2e tests.

Also, you should not hardcode block numbers even though they're stored separately. Please use ctx.BlockHeight(), which is 976, and keeper.BlocksPerSession(ctx), which is 25.

Also agree, but this would require more restructuring. I would push for a follow-up PR to improve tests if we really want to sharpen this old code base (not worth it for this use case IMO); my bandwidth is quite limited atm to do more refactoring. If it's an absolute must, then please let me know and I'll find time to do it, as I'd like to see pokt continue to improve. (Or if you'd like to set up these changes, that'd be great too.)

@Olshansk
Member

Olshansk commented Apr 4, 2023

/reviewpad summarize

@reviewpad

reviewpad bot commented Apr 4, 2023

AI-Generated Pull Request Summary: This pull request makes several changes to the configuration file, including adding a new ClientSessionBlockSyncAllowance field, updating the default values and modifying the HandleRelay, GetCacheKey and IsBetween functions. Additionally, it adds new tests for the HandleRelay function and updates the existing tests. It also introduces a new IsProofSessionHeightWithinTolerance function to check if the latest session block height is within a specified tolerance range.

@reviewpad bot added the medium and waiting-for-review labels Apr 4, 2023
Member

@Olshansk Olshansk left a comment

@PoktBlade A few more minor comments / NITs + I asked to leave a TODO per @msmania's suggestion so he or I can do it in a separate PR.

Two questions:

  1. Does the following diagram capture (visually) the changes made in a simple way in your opinion? Lmk if I should change anything
sequenceDiagram
    title 4 Blocks Per Session

    participant C as Client
    participant N as Node(h=100)

    alt 1. before change - ok
        C->>+N: Relay(100<=h<=103)
        N->>-C: Response + Reward
    else 2. before change - err
        C->>+N: Relay(h<100 || h > 103)
        N->>-C: Err
    else 3. after change - ok
        C->>+N: Relay(100<=h<=103)
        N->>-C: Response + Reward
        C->>+N: Relay(96<=h<=99)
        N->>-C: Response + ?? NoReward??
    else 4. after change - err
        C->>+N: Relay(h<96 || h > 103)
        N->>-C: Err                
    end
  1. Am I correct to understand that in case (3) above, the servicer won't be rewarded for serving a relay from a previous session

  2. I was looking at x/pocketcore/types/service.go and trying to understand why we don't need to change the relay meta validation. Could you help me understand this?

func (m RelayMeta) Validate(ctx sdk.Ctx) sdk.Error {
	// ensures the block height is within the acceptable range
	if ctx.BlockHeight()+int64(GlobalPocketConfig.ClientBlockSyncAllowance) < m.BlockHeight || ctx.BlockHeight()-int64(GlobalPocketConfig.ClientBlockSyncAllowance) > m.BlockHeight {
		return NewOutOfSyncRequestError(ModuleName)
	}
	return nil
}

@@ -55,6 +55,18 @@ func (k Keeper) IsSessionBlock(ctx sdk.Ctx) bool {
return ctx.BlockHeight()%k.posKeeper.BlocksPerSession(ctx) == 1
}

// IsProofSessionHeightWithinTolerance checks if the latest session block height is bounded by minSessionBlockHeight <= x <= latestBlockHeight
Member

Suggested change
- // IsProofSessionHeightWithinTolerance checks if the latest session block height is bounded by minSessionBlockHeight <= x <= latestBlockHeight
+ // IsProofSessionHeightWithinTolerance checks if the relaySessionBlockHeight is bounded by (latestSessionBlockHeight - tolerance) <= x <= latestBlockHeight

Contributor Author

@nodiesBlade nodiesBlade Apr 5, 2023

I've adjusted the max of the range to be latestSessionHeight, which is what my tests reflect as well. There is no reason to believe that the relay session block height would ever be above the latest session block height.

If the relay proof session block height is above the latest session block height, then the app sent in a wrong app session height or the app dispatcher is ahead of the servicer node, which should result in an error.
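
The bounds described above can be sketched as a standalone function. This is an illustration under assumptions, not the merged implementation: `withinSessionTolerance` is a hypothetical name, all inputs are passed explicitly, and the allowance is treated as a count of sessions multiplied by the blocks per session.

```go
package main

import "fmt"

// withinSessionTolerance reports whether relaySessionHeight lies in
// [latestSessionHeight - allowance*blocksPerSession, latestSessionHeight].
// Hypothetical standalone sketch of the check discussed in this thread.
func withinSessionTolerance(relaySessionHeight, latestSessionHeight, blocksPerSession, allowance int64) bool {
	minHeight := latestSessionHeight - allowance*blocksPerSession
	return minHeight <= relaySessionHeight && relaySessionHeight <= latestSessionHeight
}

func main() {
	// 4 blocks per session; the node's latest session starts at 101.
	fmt.Println(withinSessionTolerance(101, 101, 4, 1)) // current session
	fmt.Println(withinSessionTolerance(97, 101, 4, 1))  // previous session
	fmt.Println(withinSessionTolerance(93, 101, 4, 1))  // two sessions back
	fmt.Println(withinSessionTolerance(105, 101, 4, 1)) // ahead of the node
	fmt.Println(withinSessionTolerance(97, 101, 4, 0))  // allowance disabled
}
```

Capping the range at the latest session height means a relay from a dispatcher that is ahead of the servicer is rejected rather than served.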

@reviewpad bot added the medium label and removed the large label Apr 7, 2023
@reviewpad

reviewpad bot commented Apr 7, 2023

AI-Generated Summary: This pull request introduces a new configuration parameter ClientSessionSyncAllowance to give the clients a range to sync session to node's height. The IsProofSessionHeightWithinTolerance method is added to check if the relay session block height is within an acceptable range. A new test case is introduced to verify the changes, and the HandleRelay method is updated accordingly.

@reviewpad
Copy link

reviewpad bot commented Apr 7, 2023

AI-Generated Summary: This pull request introduces a new configuration option named client_session_sync_allowance, which allows setting a tolerance for the number of sessions that a relay request is allowed to be out-of-sync with the current session height. The implementation includes changes in the PocketConfig struct and DefaultConfig, as well as adding a new function IsProofSessionHeightWithinTolerance in the keeper. The corresponding documentation in quickstart.md has also been updated. Unit tests for testing the new functionality are added in service_test.go.

@nodiesBlade
Contributor Author

Squashed all my initial commits to rebase easily

  • updated test cases for out of range and in range.
  • renamed client_session_block_sync_allowance to client_session_sync_allowance
  • Updated the update config command to include a default allowance (a small detail I remembered from LP, not sure if this is even used)
  • Updated quickstart read.me

I believe this is good to merge in. @Olshansk, merge in if you think so as well!

@msmania
Contributor

msmania commented Apr 7, 2023

I'd move IsProofSessionHeightWithinTolerance to service.go, but I'm just fine. This patch all looks good to me. Thanks!

Member

@Olshansk Olshansk left a comment

Just some nits related to comments. Will approve after.


Unrelated and optional: @PoktBlade, sharing personal experience: Rebasing has worked well for me in other environments (e.g. big companies) and platforms (e.g. gerrit). But I've found that because of a lot of factors (OSS, github, PR flow, squash & merging to main, etc...), I just always "merge with main" and resolve conflicts.

@reviewpad

reviewpad bot commented Apr 8, 2023

AI-Generated Summary: This pull request introduces a new configuration property client_session_sync_allowance to manage the allowed range of session heights in relay requests. It updates the types/config.go and app/config.go files to include this new property with a default value of 1. Also, it modifies the x/pocketcore/keeper/service.go to check if the relay's session height is within the allowed range while handling the relay. Lastly, relevant test cases are updated and added to x/pocketcore/keeper/service_test.go to ensure the new feature's correctness.

@nodiesBlade
Contributor Author

Resolved the nits, good to go again.

Contributor

@msmania msmania left a comment

Thank you for your patience!

Member

@Olshansk Olshansk left a comment

Thanks @PoktBlade. Feel free to squash & merge into staging!

I'll also take a stab at documenting all of the great discussions & notes we have here and will send your way to review by EOW. 🙏

@nodiesBlade nodiesBlade merged commit 8b7b30f into pokt-network:staging Apr 10, 2023
msmania added a commit that referenced this pull request Apr 18, 2023
With session rollover (#1536), we accept relays for older sessions.  For
example, a node at block 101 accepts a relay for the session height 97.

There is a bug in the relay validation function `relay.Validate`.  In the
example above, the function sees the following values:

```
ctx.BlockHeight() = 101
sessionBlockHeight = r.Proof.SessionBlockHeight = 97
sessionCtx.BlockHeight() = 97
```

and if the corresponding session is not cached, it passes `sessionCtx` and `ctx`
to `NewSession`.  This may return a wrong session because the second argument
is supposed to be the end of the session, but in this case it's not.

The proposed fix is if `ctx` is on the next session, we get a new context of
the session end and pass it to `NewSession`.
msmania added a commit that referenced this pull request Apr 18, 2023
With session rollover (#1536), we accept relays for older sessions.  For
example, a node at block 101 accepts a relay for the session height 97.

There is a bug in the relay validation function `relay.Validate`.  In the
example above, the function sees the following values:

```
ctx.BlockHeight() = 101
sessionBlockHeight = r.Proof.SessionBlockHeight = 97
sessionCtx.BlockHeight() = 97
```

and if the corresponding session is not cached, it passes `sessionCtx` and `ctx`
to `NewSession`.  This may return a wrong session because the second argument
is supposed to be the end of the session, but in this case it's not.

The proposed fix is if `ctx` is beyond the relay session, we get a new context
of the session end and pass it to `NewSession`.
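
A small sketch of the height selection described in the commit message above, assuming a session spans blocksPerSession blocks starting at its session block height. `sessionEndHeightSketch` is a hypothetical helper that only computes the height; obtaining a context at that height is left to the real code.

```go
package main

import "fmt"

// sessionEndHeightSketch picks the height to use for NewSession's second
// argument: the node's current height, capped at the last block of the
// relay's session. Assumes a session spans blocksPerSession blocks starting
// at sessionBlockHeight.
func sessionEndHeightSketch(currentHeight, sessionBlockHeight, blocksPerSession int64) int64 {
	sessionEnd := sessionBlockHeight + blocksPerSession - 1
	if currentHeight > sessionEnd {
		return sessionEnd
	}
	return currentHeight
}

func main() {
	fmt.Println(sessionEndHeightSketch(101, 97, 4)) // node already in the next session
	fmt.Println(sessionEndHeightSketch(99, 97, 4))  // node still inside the session
}
```

In the example from the commit message (node at 101, relay session at 97, 4 blocks per session), the capped height is 100 rather than 101.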
msmania added a commit that referenced this pull request Apr 20, 2023
@POKT-Discourse

This pull request has been mentioned on Pocket Network Forum. There might be relevant details there:

https://forum.pokt.network/t/into-the-gateway-verse-bootstrapping-a-multi-gateway-ecosystem/4532/1

Olshansk pushed a commit that referenced this pull request Jul 5, 2023
## Description
This PR sets the default session sync allowance back to zero, effectively removing the session rollover fix as a default setting. Clients can re-enable as they see fit. Gateways should code logic that accounts for old sessions. Gateways should also always use the latest session when possible.

PNI recently transitioned from AWS to GCP, and from their Portal V1 stack to the V2 stack. This has resulted in a drop from 1.3B to ~600M relays being submitted on the chain. Our fleet of 4K+ nodes has also seen significantly reduced served traffic.

The rationale behind this change is simple: too many changes are happening at once. This change is simply to prevent additional noise from the ongoing incident. Once PNI's V2 gateway and network traffic are stable, we can set this default back to `1` with a new build, given that it is not consensus-changing.

NOTE: This is to prevent any more damage from the ongoing incident and to avoid additional confusion for the gateway and node runners. Since we don't know the root cause of the missing traffic, not including it by default removes one factor for them to consider.

Relevant PR: #1536

<!-- reviewpad:summarize:start -->
### Summary generated by Reviewpad on 27 Jun 23 04:41 UTC
This pull request updates the default session sync allowance configuration from 1 to 0, representing a session unit (irrespective of num blocks per session). This ensures that the session sync allowance configuration is more adjustable for user's needs.
<!-- reviewpad:summarize:end -->
Labels
bug Something isn't working medium Pull request is medium optimization waiting-for-review
Projects
Status: Done
4 participants