
Backport fixes to release/2.7 #890

Merged: 7 commits merged into release/2.7 on Nov 26, 2024

Conversation

tiagolobocastro
Contributor

chore(bors): merge pull request #887

887: Fix regression for pool creation timeout retry r=tiagolobocastro a=tiagolobocastro

    test: use tmp in project workspace

    Use a tmp folder from the workspace, allowing us to clean up things like
    LVM volumes much more easily, since we can simply purge it.

    Signed-off-by: Tiago Castro <[email protected]>

---

    test(pool): create on very large or very slow disks

    Uses LVM Lvols as backend devices for the pool.
    We suspend these before pool creation, allowing us to simulate slow
    pool creation.
    This test ensures that the pool creation completes by itself, and also
    that a client can complete it by calling create again.

    Signed-off-by: Tiago Castro <[email protected]>
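The test scenario above (a create call that times out against a suspended backing device, then completes after the device is resumed) can be sketched with an in-process stand-in. This is purely illustrative: `FakePoolService` and its members are hypothetical names, not the project's actual test code, and a `threading.Event` stands in for the suspended LVM Lvol.

```python
import threading

class FakePoolService:
    """Stand-in for the data-plane: pool creation blocks while the
    backing device is 'suspended', mimicking a very slow disk."""

    def __init__(self):
        self.resumed = threading.Event()       # set() plays the role of resuming the device
        self.pool_created = threading.Event()

    def create_pool(self, timeout: float) -> bool:
        # Kick off the background creation if it hasn't finished yet.
        if not self.pool_created.is_set():
            threading.Thread(target=self._create_in_background, daemon=True).start()
        # The gRPC-style call itself only waits up to `timeout`.
        return self.pool_created.wait(timeout)

    def _create_in_background(self):
        self.resumed.wait()        # blocked until the device is resumed
        self.pool_created.set()

svc = FakePoolService()
assert svc.create_pool(timeout=0.1) is False   # first call "times out"
svc.resumed.set()                              # resume the suspended device
assert svc.create_pool(timeout=1.0) is True    # retry now completes
```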

---

    fix: allow pool creation to complete asynchronously

    When the initial create gRPC times out, the data-plane may still be creating
    the pool in the background, which can happen for very large pools.
    Rather than assume failure, we allow this to complete in the background, up
    to a large (arbitrary) amount of time. If the pool creation completes
    within that window, we retry the creation flow.
    We don't simply use very large timeouts because the gRPC operations are
    currently sequential, mostly for historical reasons.
    Now that the data-plane is allowing concurrent calls, we should also allow
    this on the control-plane.
    TODO: allow concurrent node operations

    Signed-off-by: Tiago Castro <[email protected]>
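The retry flow described above (don't treat a gRPC deadline as failure; wait, bounded, for the background creation, then retry) can be sketched as follows. The actual control-plane is Rust; the function and the `BACKGROUND_CAP_SECS` bound here are hypothetical illustrations of the idea, not the real API.

```python
import time

class DeadlineExceeded(Exception):
    """Stands in for a gRPC DEADLINE_EXCEEDED status."""

BACKGROUND_CAP_SECS = 5.0   # the "large arbitrary amount of time"

def create_pool_with_retry(call_create, pool_exists, poll=0.05):
    """On timeout, the data-plane may still be creating the pool in the
    background. Wait (bounded) for it to appear, then retry the flow."""
    try:
        return call_create()
    except DeadlineExceeded:
        deadline = time.monotonic() + BACKGROUND_CAP_SECS
        while time.monotonic() < deadline:
            if pool_exists():          # background creation finished
                return call_create()   # retry the (idempotent) creation flow
            time.sleep(poll)
        raise                          # still not there: give up

# Simulated usage: first call times out, background creation finishes,
# the retry succeeds.
state = {"calls": 0}
def call_create():
    state["calls"] += 1
    if state["calls"] == 1:
        raise DeadlineExceeded()
    return "pool-online"

assert create_pool_with_retry(call_create, lambda: True) == "pool-online"
```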

---

    fix: check for correct not found error code

    A previous fix ended up not working correctly because it was merged
    incorrectly, somehow!

    Signed-off-by: Tiago Castro <[email protected]>

---

    chore: update terraform node prep

    Pull the Release key from a recent k8s version since the old keys are no
    longer valid.
    This will have to be updated from time to time.

Co-authored-by: Tiago Castro <[email protected]>

fix(resize): atomically check for the required size

Ensures races don't lead to volume resize failures.

Signed-off-by: Tiago Castro <[email protected]>
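The fix above is the classic "make check-then-act atomic" pattern. A minimal sketch, with hypothetical names (the real code is Rust and the real state lives in the control-plane's registry): the size check and the update happen under one lock, so a concurrent or duplicate resize request that finds the volume already at the required size succeeds as a no-op instead of failing.

```python
import threading

class Volume:
    """Sketch: check the required size and apply the resize atomically."""

    def __init__(self, size: int):
        self._size = size
        self._lock = threading.Lock()

    def resize(self, required_size: int) -> int:
        with self._lock:                  # check and update as one step
            if required_size == self._size:
                return self._size         # already the required size: no-op
            if required_size < self._size:
                raise ValueError("shrinking is not supported")
            self._size = required_size
            return self._size

v = Volume(10)
assert v.resize(20) == 20
assert v.resize(20) == 20   # racing/duplicate request: succeeds as a no-op
```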

test(bdd/thin): fix racy thin prov test

Add a retry that waits for the condition to be met.

Signed-off-by: Tiago Castro <[email protected]>
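The fix above replaces a racy one-shot assertion with a poll-until-true helper, a common pattern in BDD tests that assert on eventually-consistent state. A minimal sketch (the helper name is hypothetical):

```python
import time

def wait_for(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns truthy or the timeout elapses.
    Returns True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Example: a condition that only becomes true after a few polls.
state = {"n": 0}
def eventually_three():
    state["n"] += 1
    return state["n"] >= 3

assert wait_for(eventually_three, timeout=1.0) is True
assert wait_for(lambda: False, timeout=0.2) is False
```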

feat(topology): remove internal labels when displaying topology

Signed-off-by: sinhaashish <[email protected]>
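Filtering internal bookkeeping labels out of user-facing output boils down to a prefix filter. A sketch, where the prefix and function name are assumptions for illustration, not the control-plane's actual convention:

```python
# Hypothetical internal-label prefix, for illustration only.
INTERNAL_PREFIX = "openebs.io/internal-"

def display_labels(labels: dict) -> dict:
    """Drop internal bookkeeping labels before showing topology to users."""
    return {k: v for k, v in labels.items() if not k.startswith(INTERNAL_PREFIX)}

labels = {
    "zone": "us-east-1a",
    "openebs.io/internal-created-by": "operator",
}
assert display_labels(labels) == {"zone": "us-east-1a"}
```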

fix(fsfreeze): improve the error message when the volume is not staged

Signed-off-by: Abhinandan Purkait <[email protected]>

fix(deployer): increase the max number of allowed connection attempts to the io-engine

Signed-off-by: sinhaashish <[email protected]>

fix(topology): hasTopologyKey overwrites affinityTopologyLabels

Signed-off-by: sinhaashish <[email protected]>
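The bug above is a merge-order problem: wildcard `hasTopologyKey` entries clobbered concrete `affinityTopologyLabels` values when both named the same key. A sketch of the corrected merge, with a hypothetical function shape (the real logic lives in the Rust control-plane):

```python
def merge_topology(affinity_labels: dict, topology_keys: list) -> dict:
    """Combine affinityTopologyLabels with hasTopologyKey entries.
    Only add a wildcard entry for a key if it is not already pinned
    to a concrete value by an affinity label, so keys never clobber values."""
    merged = dict(affinity_labels)
    for key in topology_keys:
        merged.setdefault(key, "")   # "" = match-any placeholder, never overwrite
    return merged

merged = merge_topology({"rack": "r1"}, ["rack", "zone"])
assert merged == {"rack": "r1", "zone": ""}   # "rack" keeps its pinned value
```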

@tiagolobocastro
Contributor Author

bors try

bors-openebs-mayastor bot pushed a commit that referenced this pull request Nov 26, 2024
@bors-openebs-mayastor

try

Build succeeded:

@tiagolobocastro
Contributor Author

bors merge

bors-openebs-mayastor bot pushed a commit that referenced this pull request Nov 26, 2024
@bors-openebs-mayastor

Build failed:

@tiagolobocastro
Contributor Author

bors merge

@bors-openebs-mayastor

Build succeeded:

@bors-openebs-mayastor bors-openebs-mayastor bot merged commit b928b51 into release/2.7 Nov 26, 2024
4 checks passed
@bors-openebs-mayastor bors-openebs-mayastor bot deleted the cherry-pick branch November 26, 2024 18:46