Race condition in approle login with identical role-id #3746

Closed
Kevin-Phillips-CK opened this issue Jan 3, 2018 · 8 comments

@Kevin-Phillips-CK

Environment:

  • Vault Version: Vault v0.9.0 ('bdac1854478538052ba5b7ec9a9ec688d35a3335')
  • Operating System/Architecture: OSX 10.12.6

Vault Config File:

storage "file" {
  path = "/tmp/vault/data"
}

listener "tcp" {
  address     = "127.0.0.1:8200"
  tls_disable = 1
}

Startup Log Output:

...
2018/01/02 17:01:10.045995 [INFO ] core: post-unseal setup complete
2018/01/02 17:01:10.149368 [INFO ] core: enabled credential backend: path=approle/ type=approle
2018/01/02 17:01:10.742978 [DEBUG] core: creating a new entity: alias=&{approle auth_approle_d638ab40 7abd8541-d9bc-3936-15ad-964d8217732d}
2018/01/02 17:01:10.743025 [DEBUG] core: creating a new entity: alias=&{approle auth_approle_d638ab40 7abd8541-d9bc-3936-15ad-964d8217732d}
==> Vault shutdown triggered
2018/01/02 17:01:12.301662 [DEBUG] core: marked as sealed
...

Expected Behavior:
Approle /auth/approle/login should always succeed given a valid role-id and secret-id.

Actual Behavior:
Login returns a 500 response with an error message such as: alias "6701e7da-142f-12a9-22cf-ebf5b343414d" in already tied to a different entity "a29c7397-516b-2595-8c36-ff8ca0af7edd".

Steps to Reproduce:

Perform multiple logins to an approle within microseconds of one another. I am unable to estimate the threshold at which this race condition is encountered (it could be anywhere between 1 and 1000 μsec).
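
A minimal sketch of the reproduction step above, assuming Python 3 and only the standard library (the reporter's own client appears to be Python 2 and is not shown in the issue). The helper names `approle_login` and `concurrent_logins` are hypothetical; the only Vault-specific detail is the `/v1/auth/approle/login` endpoint that appears in the client log. A barrier releases all threads at once to make the window between requests as small as possible.

```python
import json
import threading
import urllib.error
import urllib.request


def approle_login(addr, role_id, secret_id):
    """POST to /v1/auth/approle/login and return the HTTP status code."""
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode()
    req = urllib.request.Request(
        addr + "/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx responses (such as the 500 reported here) land in this branch


def concurrent_logins(login_fn, credentials):
    """Call login_fn(role_id, secret_id) for every pair at nearly the same time."""
    barrier = threading.Barrier(len(credentials))
    results = [None] * len(credentials)

    def worker(index, role_id, secret_id):
        barrier.wait()  # release all threads together
        results[index] = login_fn(role_id, secret_id)

    threads = [
        threading.Thread(target=worker, args=(i, role_id, secret_id))
        for i, (role_id, secret_id) in enumerate(credentials)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Against a local dev server this might be called as `concurrent_logins(lambda r, s: approle_login("http://127.0.0.1:8200", r, s), [(role_id, secret_a), (role_id, secret_b)])`, where both pairs share one role-id but carry different secret-ids, as in the log below. Whether the race is actually hit on a given run is not guaranteed.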

Logging from my client implementation (all values below were generated for the purpose of this report):

...
2018-01-02 17:09:47,559 [DEBUG] server recvd EHLO from client-1 with payload: {u'role_name': u'develop'}
2018-01-02 17:09:47,559 [DEBUG] Got EHLO from client-1 with payload: {u'role_name': u'develop'}
2018-01-02 17:09:47,559 [DEBUG] Attempting to enroll client-1
2018-01-02 17:09:47,559 [DEBUG] Sending enroll to client-1
2018-01-02 17:09:47,567 [DEBUG] Starting new HTTP connection (1): localhost
2018-01-02 17:09:47,568 [DEBUG] http://localhost:8200 "GET /v1/auth/approle/role/develop/role-id HTTP/1.1" 200 208
2018-01-02 17:09:47,571 [DEBUG] http://localhost:8200 "POST /v1/auth/approle/role/develop/secret-id HTTP/1.1" 200 270
2018-01-02 17:09:47,574 [DEBUG] http://localhost:8200 "POST /v1/sys/wrapping/wrap HTTP/1.1" 200 318
2018-01-02 17:09:47,574 [DEBUG] server recvd EHLO from client-2 with payload: {u'role_name': u'develop'}
2018-01-02 17:09:47,574 [DEBUG] Got EHLO from client-2 with payload: {u'role_name': u'develop'}
2018-01-02 17:09:47,574 [DEBUG] Attempting to enroll client-2
2018-01-02 17:09:47,574 [DEBUG] Sending enroll to client-2
2018-01-02 17:09:47,575 [DEBUG] http://localhost:8200 "GET /v1/auth/approle/role/develop/role-id HTTP/1.1" 200 208
2018-01-02 17:09:47,577 [DEBUG] http://localhost:8200 "POST /v1/auth/approle/role/develop/secret-id HTTP/1.1" 200 270
2018-01-02 17:09:47,580 [DEBUG] http://localhost:8200 "POST /v1/sys/wrapping/wrap HTTP/1.1" 200 319
2018-01-02 17:09:47,581 [DEBUG] client-1 recvd ENROLL with payload: b95494d0-47c2-1c38-d50f-1b205164a6df
2018-01-02 17:09:47,581 [DEBUG] client-2 recvd ENROLL with payload: b344dfb4-35c9-d428-4817-22946c756dfc
2018-01-02 17:09:47,582 [DEBUG] client-2 handling enroll
2018-01-02 17:09:47,582 [DEBUG] client-1 handling enroll
2018-01-02 17:09:47,589 [DEBUG] Starting new HTTP connection (1): localhost
2018-01-02 17:09:47,589 [DEBUG] Starting new HTTP connection (1): localhost
2018-01-02 17:09:47,593 [DEBUG] http://localhost:8200 "POST /v1/sys/wrapping/unwrap HTTP/1.1" 200 258
2018-01-02 17:09:47,594 [DEBUG] client-2 unwrapped role-id: 8b352e16-81ff-bf72-5848-a5f2402d7fbb and secret-id: e5bf19ee-d1fe-b058-5391-364b3d6b6091
2018-01-02 17:09:47,594 [DEBUG] http://localhost:8200 "POST /v1/sys/wrapping/unwrap HTTP/1.1" 200 258
2018-01-02 17:09:47,594 [DEBUG] client-1 unwrapped role-id: 8b352e16-81ff-bf72-5848-a5f2402d7fbb and secret-id: f41ef4e3-e818-c09c-0774-91b118831782
2018-01-02 17:09:47,595 [DEBUG] Starting new HTTP connection (1): localhost
2018-01-02 17:09:47,596 [DEBUG] Starting new HTTP connection (1): localhost
2018-01-02 17:09:47,597 [DEBUG] http://localhost:8200 "POST /v1/auth/approle/login HTTP/1.1" 500 141
2018-01-02 17:09:47,598 [ERROR] client-1 Fatal exception in enroll_callback
2018-01-02 17:09:47,598 [DEBUG] http://localhost:8200 "POST /v1/auth/approle/login HTTP/1.1" 200 424
2018-01-02 17:09:47,598 [ERROR] alias "adf76e89-36ec-0087-761d-ff6a7a964287" in already tied to a different entity "6399d380-acc2-239b-1139-7cfd6d7ab439"
...

Important Factoids:

Occurs intermittently in my test environment (two concurrent client login requests). I assume this is tied to network jitter on my loopback interface. It cannot be reproduced if a random sleep (up to 1s) is introduced between login requests.
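
The random-sleep observation above can be sketched as follows. This only spaces the requests out on the client side, so it hides the race rather than fixing it. The function name `staggered_logins` and its parameters are hypothetical, and `login_fn` stands for whatever callable performs one login.

```python
import random
import time


def staggered_logins(login_fn, credentials, max_delay=1.0, sleep=time.sleep):
    """Run logins one after another, sleeping a random interval in between."""
    results = []
    for index, (role_id, secret_id) in enumerate(credentials):
        if index > 0:
            # Random pause of up to max_delay seconds between consecutive logins.
            sleep(random.uniform(0, max_delay))
        results.append(login_fn(role_id, secret_id))
    return results
```

The `sleep` parameter is injectable so the delay can be stubbed out in tests.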

Appears to be related to transaction alias sanity checking.

References:

@jefferai jefferai added this to the 0.9.2 milestone Jan 3, 2018
@jefferai jefferai modified the milestones: 0.9.2, 0.9.3 Jan 17, 2018
@jefferai jefferai modified the milestones: 0.9.3, 0.9.4 Jan 28, 2018
@patelpu94

We are also facing the same issue. Any update on this?

@jefferai
Member

jefferai commented Feb 7, 2018

@patelpu94 What version of Vault?

@patelpu94

Vault version 0.9.0, but we are using app-id/user-id instead of approle.

@jefferai
Member

jefferai commented Feb 8, 2018

@patelpu94 It's actually a race condition in Identity, so it affects all auth backends. The fix above should handle it in all cases though!
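
To illustrate the kind of race described here, below is a toy model of a check-then-create sequence on an alias-to-entity map. It is not Vault's implementation and not the actual fix; the class `ToyIdentityStore`, the helper `run_logins`, and the use of a single lock are all assumptions made for illustration. The startup log in the report shows two "creating a new entity" lines for the same alias, which is the pattern the unlocked variant imitates.

```python
import threading
import time
import uuid


class ToyIdentityStore:
    """Toy model only; not Vault's Identity store."""

    def __init__(self, locked):
        self.alias_to_entity = {}
        self.entities_created = 0
        self._lock = threading.Lock() if locked else None

    def _lookup_or_create(self, alias_id):
        entity = self.alias_to_entity.get(alias_id)  # check
        if entity is None:
            time.sleep(0.01)  # artificially widen the window between check and act
            entity = str(uuid.uuid4())  # act: create a new entity
            self.entities_created += 1
            self.alias_to_entity[alias_id] = entity
        return entity

    def login(self, alias_id):
        if self._lock is None:
            # Unlocked: two callers can both see "no entity" and both create one.
            return self._lookup_or_create(alias_id)
        with self._lock:
            # Locked: check and create happen as one step per caller.
            return self._lookup_or_create(alias_id)


def run_logins(store, alias_id, n=2):
    """Start n simultaneous logins for one alias; return the set of entity IDs seen."""
    barrier = threading.Barrier(n)
    seen = []

    def worker():
        barrier.wait()
        seen.append(store.login(alias_id))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return set(seen)
```

With `ToyIdentityStore(locked=True)`, `run_logins` returns a single entity ID. With `locked=False` it will usually return more than one, which corresponds to one alias ending up associated with different entities.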

@patelpu94

patelpu94 commented Jun 18, 2018

Even after upgrading to Vault 0.9.4 we saw this issue. Vault does not complain about it until it is restarted, and then it does not elect a leader:
[ERROR] core: post-unseal setup failed: error=failed to update entity in MemDB: alias "xxxxx" in already tied to a different entity "xxxxx"

@vishalnayak
Member

@patelpu94 Are you facing this issue during the upgrade, or have you already been using 0.9.4? I ask because, if the conflicting entities in question were created before the upgrade, it is possible that the fix won't have an effect on the outcome that you are facing. The fix only tries to avoid racy duplications; it doesn't clean up conflicting entities that already exist.

On the other hand, if you were already using 0.9.4 and the entities in question were created after the upgrade, then we'll have to see why this is happening.

Are you able to bring up Vault at all, or are you completely stuck? If you are able to bring up Vault, manually clearing the entities that are conflicting and recreating them might be a good way forward. If you happen to have a way to consistently reproduce this, it would help us a lot.
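
As a rough aid for spotting which entities conflict, the sketch below scans a list of entity records and reports every alias ID that is attached to more than one entity. The record shape (a dict with an `id` and a list of `aliases`, each with its own `id`) is an assumption made for illustration, as is the function name `conflicting_aliases`; how the records are exported from Vault is not covered here.

```python
from collections import defaultdict


def conflicting_aliases(entities):
    """Return {alias_id: sorted entity IDs} for aliases held by more than one entity."""
    owners = defaultdict(set)
    for entity in entities:
        for alias in entity.get("aliases", []):
            owners[alias["id"]].add(entity["id"])
    return {
        alias_id: sorted(entity_ids)
        for alias_id, entity_ids in owners.items()
        if len(entity_ids) > 1
    }


# Hypothetical sample data: alias "a1" is attached to both "e1" and "e2".
sample = [
    {"id": "e1", "aliases": [{"id": "a1"}]},
    {"id": "e2", "aliases": [{"id": "a1"}, {"id": "a2"}]},
    {"id": "e3", "aliases": []},
]
conflicts = conflicting_aliases(sample)
```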

@patelpu94

patelpu94 commented Jun 19, 2018

This is on 0.9.4, which we have been running for a while; it did not happen during an upgrade. We were able to use Vault again after we removed entries one at a time from vault/logical/xxxx/packer/buckets/YYYY.
What are these packer/buckets? When are they created or updated? And how does Vault get into this state, where it maps the same ID to multiple aliases and the issue is not detected until Vault is restarted and unsealed again?
There is no consistent way to reproduce this. We use SSH and APPID/USERID mounts in our deployment, if that helps.

@vabovyan

vabovyan commented Aug 8, 2018

We faced the same problem running 0.9.4 and deleting vault/logical/xxxx/packer/buckets/YYYY solved it.


5 participants