chore(compass-web): refactor sandbox to better handle cert issuing logic #6295

Merged
gribnoysup merged 8 commits into main from compass-web-sandbox-improvements
Sep 30, 2024

Collaborator

@gribnoysup gribnoysup commented Sep 27, 2024

Now that we are actually using the sandbox more for feature development, some pain points have started to show up. This patch tries to address them:

  • One issue was that the mms session cookie would expire very quickly. This one is hard to track down specifically, but I think the root cause is that the mms UI updates the session token pretty often, so in the browser you almost always get a new one via the set-cookie header that the server sends on every request. In the sandbox proxy this set-cookie wasn't really doing anything, because requests were mostly going through the http proxy. The fix is to start using electron.net in the AtlasCloudAuthenticator. Using the electron network stack should deal with two things: we don't need to manually get the cookie for those requests, since electron can map them on its own by domain name, and it should also take the cookie-setting response into account correctly, meaning less chance for the token to expire.
  • Another thing that Basit noticed (and I also ran into a few times) is that there is apparently a limit on the number of certs you can issue for a single db user in the project. We were using a non-unique user name and, even more importantly, always issuing a new cert whenever the websocket proxy requested one. This change was pretty involved and caused most of the changes in this PR:
    • I changed the test user creation logic to use a unique-ish name so that we can create a new user every time you use the sandbox.
    • On top of that, I changed the logic to store the cert in memory so that we don't re-issue it on every request (you can't request existing certs from the mms backend as far as I can tell).
    • Because the user is now unique per session, this change required handling some additional considerations:
      • The code now waits for the user creation job to be rolled out to all deployments by checking the project's plans in progress. This is needed to make sure that by the time we attempt the connection, the clusters already know that this user exists. It's important to mention that this can sometimes take a while, so connections can time out when using the sandbox in "atlas mode"; just retrying them should help (and I'll look into how to provide a higher connection timeout in the sandbox so that it's not an issue).
      • Because the username is now unique, and to avoid polluting projects with these test users, I restructured the code to have a more robust cleanup flow when you stop the sandbox. Most notably, instead of the webpack dev server being the main process, the electron proxy now is: it manages all the other servers and stops / cleans them up when the process is stopped.
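
The cert-related flow described in the list above (a unique-ish user per session, waiting for the rollout, keeping the issued cert in memory) could look roughly like the sketch below. This is an illustration under assumptions, not the code from this PR: `createSessionUsername`, `waitForPlansToFinish`, `CertCache`, and the `hasPlansInProgress` / `issueCert` callbacks are hypothetical names, and the real authenticator talks to the mms backend rather than to stubs.

```javascript
// Hypothetical helper: builds a unique-ish user name for one sandbox session.
function createSessionUsername(prefix = 'sandbox-test-user') {
  const suffix =
    Date.now().toString(36) + Math.random().toString(36).slice(2, 8);
  return `${prefix}-${suffix}`;
}

// Hypothetical helper: polls until `hasPlansInProgress()` resolves to false,
// and rejects if that doesn't happen within `timeoutMs`.
async function waitForPlansToFinish(
  hasPlansInProgress,
  { intervalMs = 1000, timeoutMs = 60000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  while (await hasPlansInProgress()) {
    if (Date.now() >= deadline) {
      throw new Error('Timed out waiting for project plans to finish');
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Hypothetical cache: issues the cert at most once and hands the same
// in-flight or resolved promise to every later caller.
class CertCache {
  #issueCert;
  #certPromise = null;

  constructor(issueCert) {
    this.#issueCert = issueCert;
  }

  getCert() {
    if (!this.#certPromise) {
      this.#certPromise = Promise.resolve()
        .then(() => this.#issueCert())
        .catch((err) => {
          // Drop the failed attempt so that a later call can retry
          this.#certPromise = null;
          throw err;
        });
    }
    return this.#certPromise;
  }
}
```

Caching the promise rather than the resolved value means that concurrent requests arriving while the first cert is still being issued share that one request instead of each triggering a new one.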

@gribnoysup gribnoysup marked this pull request as ready for review September 30, 2024 10:24
@@ -112,6 +129,23 @@ class AtlasCloudAuthenticator {
return new URL(url, 'http://localhost').pathname.replace('/v2/', '');
}

async #fetch(path, init) {
Contributor

I keep forgetting that private properties are even a thing for literally years at a time 😆

bw.close();
});
}),
once(bw, 'close', { signal: abortController.signal }).then(() => {
throw new Error('Window closed before finished signing in');
}),
]);
-electronApp.dock.show();
+electronApp.dock?.show();
Contributor

I know this is old code, but what's this doing? Is it this https://www.electronjs.org/docs/latest/api/dock#dockshow-macos ? That returns a promise, btw.

Collaborator Author

On macOS it hides the dock icon when we're done showing a browser window for login purposes. It just makes the process behave closer to a console process when there is no actual UI spawned by it. I'll need to rewrite this file in TypeScript at some point... 🙈
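
For context on the `?.` in the changed line: in Electron, `app.dock` is only defined on macOS, and `dock.show()` returns a promise there. A minimal sketch of the guarded call, using plain stand-in objects instead of the real Electron `app` (`macApp`, `otherApp`, and `showDockIcon` are hypothetical names for illustration):

```javascript
// Stand-ins for Electron's `app` object: `dock` only exists on macOS.
const macApp = { dock: { show: () => Promise.resolve() } };
const otherApp = {};

async function showDockIcon(app) {
  // `?.` short-circuits to undefined when `dock` is missing. Awaiting
  // undefined is harmless; on macOS we wait for the returned promise.
  await app.dock?.show();
  return app.dock !== undefined;
}
```

Awaiting the call is optional if nothing depends on the dock icon being visible, which is presumably why the code under review does not await it.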

await session.defaultSession.clearStorageData({ storages: ['cookies'] });
async cleanupAndLogout() {
// When logging out, delete the test user too. If we don't do this now, we
// will loose a chance to do it later due to missing auth
Contributor

grammar nit: lose, not loose. loose is an adjective that means not firmly attached 😄

Collaborator Author

Thanks! Fixed in both places where I made this mistake 😅

try {
await atlasCloudAuthenticator.deleteTestDBUser();
} catch {
// Can fail if user wasn't even created yet
Contributor

@lerouxb lerouxb Sep 30, 2024

nit: maybe worth at least debug logging when we ignore errors just to make sure that it is the error we expect and not some NPE or other programmer error. (I'm aware that this isn't exactly intended to be production code)

Collaborator Author

No, yeah, good idea, doesn't hurt to log here 👍
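
One way to follow the suggestion above is to route every "ignore this error" site through a small wrapper that logs before swallowing. This is only a sketch: `bestEffort` is a hypothetical helper, not something from this PR, and `console.debug` stands in for whatever logger the sandbox actually uses.

```javascript
// Hypothetical helper: runs a cleanup step and logs, rather than silently
// swallows, whatever it throws. Resolves to true on success, false on error.
async function bestEffort(label, step, log = console.debug) {
  try {
    await step();
    return true;
  } catch (err) {
    // Logging the error object itself keeps the stack trace, so an
    // unexpected TypeError is distinguishable from the expected failure.
    log(`[sandbox] ignoring error during ${label}:`, err);
    return false;
  }
}
```

With a helper like this, the cleanup in the diff could be written as `await bestEffort('test db user deletion', () => atlasCloudAuthenticator.deleteTestDBUser())`.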

@@ -282,6 +360,10 @@ proxyWebServer.use('/authenticate', async (req, res) => {

try {
const { projectId } = await atlasCloudAuthenticator.authenticate();
// Start issuing the cert to save some time when signing in
void atlasCloudAuthenticator.getX509Cert().catch(() => {
// ignore errors
Contributor

Similar comment to elsewhere. Should this at least debug log?

Collaborator Author

Will add one to the get method itself, thanks 👍

@gribnoysup gribnoysup merged commit a1d78d9 into main Sep 30, 2024
20 of 21 checks passed
@gribnoysup gribnoysup deleted the compass-web-sandbox-improvements branch September 30, 2024 14:38
2 participants