Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download alternative server depending on platform requirements #206790

Closed
deepak1556 opened this issue Mar 4, 2024 · 19 comments
Closed

Download alternative server depending on platform requirements #206790

deepak1556 opened this issue Mar 4, 2024 · 19 comments
Assignees
Labels
engineering VS Code - Build / issue tracking / etc. on-testplan remote Remote system operations issues
Milestone

Comments

@deepak1556
Copy link
Collaborator

As a next step in the linux server story we want to build and consume two different flavors of server that will be used depending on the platform requirements. This is useful to pretest the server with newer requirements before the EOL of legacy server which is currently set for February 2025.

  1. Server with glibc >= 2.28 and libstdc++ >= 3.4.25
  2. Server with glibc >= 2.17 and libstdc++ >= 3.4.19, we will call this legacy server

There are 3 parts to this next step,

(i) Production:

VS Code production pipeline will create the above two variants of the server for each of the linux supported architectures (x64, armhf, arm64). A prototype was already created in #204139, I will polish the PR and prepare it for merge as next step. The end result will look like the following screenshot.
Screenshot 2024-03-04 at 14 21 31

(ii) Detection:

Today we have detection of glibc and libstdc++ in two variations, one in the tunnel binary used in the exec server mode and the other as a shell script. The shell script can only perform the detection on a best effort basis due to various limitations but thanks for efforts from the community it works in most cases.

However I wonder if we should just have one variant of the detection as part of this effort, basically we rely on the tunnel binary to do the detection, add a new property to ExecEnvironment that will provide the glibc version and libstdc++ max version. Thoughts @connor4312 ?

Another question, can the remote extensions today use the execServer mode just for detection even if it is not for the other parts of the modal ? That is, get the result from execServer.env() and proceed to their usual server setup flow.

export interface ExecEnvironment {
		...
             /** provides the result of gnu_get_libc_version **/
             glibcVersion?: string
             /** provides the maximum version of the symbol found in the libstdc++.so.6 library **/
             glibcxxMaxVersion?: string
}

(iii) Consumption:

This needs change in the places where the server gets downloaded today, based on the result of detection we want to either download from https://update.code.visualstudio.com/commit:${commit}/${legacy ? 'legacy-' : ''}server-${build.id}${web ? '-web' : ''}/${quality}. The only difference in the download endpoint will be the legacy- prefix.

Let me know if this plan makes sense.

@deepak1556 deepak1556 added engineering VS Code - Build / issue tracking / etc. remote Remote system operations issues labels Mar 4, 2024
@deepak1556 deepak1556 added this to the March 2024 milestone Mar 4, 2024
@connor4312
Copy link
Member

I made a draft PR for the CLI side of this a bit ago #204194. Consumers of the exec server let it do the download, so they just request the version of the server they need and the CLI will determine the appropriate one for the platform -- including the glibc check with that PR. Consumers who are already using the CLI won't need to do any changes in that regard.

Another question, can the remote extensions today use the execServer mode just for detection even if it is not for the other parts of the modal ? That is, get the result from execServer.env() and proceed to their usual server setup flow.

We could add a new method that would just return the server the CLI thinks is appropriate for the platfom.

@aeschli
Copy link
Contributor

aeschli commented Mar 4, 2024

However, I wonder if we should just have one variant of the detection as part of this effort,

But don't we still use the shell script to get a nicer error message when running a regular code-server on a legacy platform?

@deepak1556
Copy link
Collaborator Author

Yes we used to do that based on the custom exit code from the shell script in containers and wsl, but we have disabled the exit code till the new end of support period is reached. It does make sense to have a similar feedback from the execServer so that we can bring back the error dialogs after February 2025. How about the following ?

export interface ExecServer {
   /**
    * Detects the server variant that can be executed on the current platform.
    * @returns a valid download url if all requirements are satisfied, otherwise 
    * a status object that contains the custom exit codes related to the specific failure.
    */
   getServerDownloadURL?(): Thenable<{url: string | undefined; status: ProcessExit}>;
}

@chrmarti
Copy link
Collaborator

chrmarti commented Mar 5, 2024

Would using the CLI for version checking necessarily involve an additional download or could the CLI in the non-legacy server (is the CLI already part of the server?) be used to do that? If the percentage of legacy systems is low, it might make more sense to optimize for the non-legacy case.

Codespaces today downloads the server ahead of time to optimize startup time. The code doing the download and later the startup is owned by us and runs standalone on a VM (i.e., there is no VS Code API available).

@deepak1556
Copy link
Collaborator Author

After some offline discussion with @chrmarti , his main concerns were around startup performance and the codespaces scenario. Since the CLI is part of the server bundle which makes detection and download a two step process which we don't want to happen for the majority cases, what does everyone think of the following steps

  1. Make non-legacy server the default
  2. Use the bundled cli to do the detection, this will happen in the places were shell script is used today referenced in Download alternative server depending on platform requirements #206790 (comment)
  3. If detection succeeds, continue with current server
  4. If detection fails, download legacy server and proceed to connect

For codespaces, will keep the current legacy server as default and avoid any double downloads. There is an idea of caching both legacy and non-legacy servers but that is something to discuss as next step once we have finished transitioning the extensions.

@aeschli
Copy link
Contributor

aeschli commented Mar 5, 2024

The code-cli is not bundled with the server build. It contains a script file remote-cli but that's unrelated.

The code-cli would be a separate download. To do that download is the future, but we're not there for the classic remote use cases (local wsl and dev containers) and I was hoping for an easier change.

What happens now when running a new node on a legacy linux? Instead of a shell script, can the test be implemented in JavaScript?

@deepak1556
Copy link
Collaborator Author

The code-cli is not bundled with the server build. It contains a script file remote-cli but that's unrelated.

Oh I should have checked the file contents, misunderstood with the name.

What happens now when running a new node on a legacy linux?

Following error will be seen from the runtime linker

# ./node-v18.0.0-linux-arm64/bin/node 
./node-v18.0.0-linux-arm64/bin/node: /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by ./node-v18.0.0-linux-arm64/bin/node)

Instead of a shell script, can the test be implemented in JavaScript?

I don't want to fragment the checking to yet another script, instead what about this we bundle the cli as part of the server and have a command line mode that can perform the detection, return the results similar to the shell script. This can be used for classic remote use ?

> code-tunnel --check-requirements

@chrmarti
Copy link
Collaborator

chrmarti commented Mar 6, 2024

@deepak1556 Sounds good to me. Is the main reason for preferring the CLI that it can use some OS API to check the library versions whereas the shell script needs to use a set of non-standardized ways to detect the versions?

@deepak1556
Copy link
Collaborator Author

Partly yes, with the shell script we saw issues that demonstrated really unique setups with multiple libc and libstdc++ installations. We are currently looping through them and checking if any of them have the required symbols, not specifically the one that gets loaded into the binary.

For obtaining runtime gnu libc version, the reliable way would be using gnu_get_libc_version api which we already check in the cli

let v = unsafe { libc::gnu_get_libc_version() };
. There is no API way for libstdc++, so next best thing is relying on the ldd <path-to-server-executable> and parse the output.

The other reason to move to cli is that the shell script was only created as a temporary solution as seen in the discussion here, the goal is we only have a single way to do requirements detection through the cli moving forward. I am thinking now is the best time make that change as part of this polishing.

@aeschli
Copy link
Contributor

aeschli commented Mar 6, 2024

It doesn't make sense to add the CLI to the server build just for the version check. That not only adds 20MB to the disk space as well as 7.5MB download size, but is also very confusing. It's the CLI that downloads the server.

We have the script, so lets use it.

Or we skip the compatibility check and just look out for the message
version GLIBC_2.28' not found (required by ./node-v18.0.0-linux-arm64/bin/node)

@deepak1556
Copy link
Collaborator Author

There is a disconnect for me on how the classic and cli way of server will be used, have setup a call to get proper context.

we skip the compatibility check and just look out for the message

This also won't work in all cases, users can have sufficient version of glibc but not libstdc++ and even if the server executable passed the conditions we can have native modules that didn't satisfy the requirements which will only be known when they get loaded into the process.

@connor4312
Copy link
Member

It doesn't make sense to add the CLI to the server build just for the version check. That not only adds 20MB to the disk space as well as 7.5MB download size, but is also very confusing. It's the CLI that downloads the server.

👍 the CLI is a precursor to the server. Having the server also include the CLI only increase download size.

The "classic" way of remote extensions is the extension 'somehow' determining the appropriate VS Code server to use, downloading that onto the remote machine, and then bootstrapping the server using it. However if we use the GNU build of the CLI the extension would need to do a glibc check anyway. It could instead always use the musl build, but then libc::gnu_get_libc_version is not available, so it's a bit of a chicken and egg problem.

@deepak1556
Copy link
Collaborator Author

Summarizing discussion from today's call:

  1. For the classic scenario, we will use the existing script to detect the requirements and if they succeed only then download the new server

  2. For the cli scenario, we are building with 2.17 today but we cannot be sure how long this can be done as this depends on the rust toolchain minimum glibc requirement. So we will need a check for cli, but that's for another day.

@aeschli
Copy link
Contributor

aeschli commented Mar 13, 2024

Now I'm confused. Maybe I misunderstood, I thought we want to download the new server which runs the existing script and fails with an error code when the OS is old. I that case the extension downloads the old server. Optionally we can set a marker file so the next time we go for the legacy server right away.

@deepak1556
Copy link
Collaborator Author

deepak1556 commented Mar 18, 2024

@connor4312 couple of updates,

  1. Merged the CI changes today to build both default and legacy servers
  2. CLI is still built using glibc 2.17, so we can just ship a single CLI for all OS
  3. On the classic server detection side, we have added a check based on the presence of a certain file at $HOME which would allow users to force legacy server. Basically a safety net for cases the script is not reliable. Changes at https://github.com/microsoft/vscode/pull/208000/files. Might want to consider something similar for the exec detection mode as well ?

I will be triggering a new insiders shortly so the core changes can be tested/adopted in wsl and container extensions, maybe it is good time to revive #204194

@plwalsh
Copy link

plwalsh commented Mar 19, 2024

@deepak1556 It seems the new insiders build is not properly falling back to the legacy server for the older glibc systems. It was working fine for me yesterday. And then this morning, insiders alerted me of a new version, so I updated. And after the update, I can no longer connect to my Ubuntu 18.04 or RHEL7 systems, and am seeing this popup message:

Failed to connect to the remote extension host server (Error: WrappedError(WrappedError { message: "error checking server integrity", original: "failed to run command "<my-home>/.vscode-server-insiders/cli/servers/Insiders-7c51753f7ce936a7aeb84ac881818c35c5a9ab8d.staging/server/bin/code-server-insiders --version" (code 100): Warning: Missing GLIBC >= 2.28! from /lib/x86_64-linux-gnu/libc-2.27.so\nWarning: Missing required dependencies. Please refer to our FAQ https://aka.ms/vscode-remote/faq/old-linux for additional information.\n" }))

@deepak1556
Copy link
Collaborator Author

@plwalsh we haven't setup legacy server changes on the extension side yet, please disable updates until this issue is closed.

@deepak1556
Copy link
Collaborator Author

deepak1556 commented Mar 22, 2024

@deepak1556
Copy link
Collaborator Author

Thanks everyone for the quick adoption! marking as closed for endgame testing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
engineering VS Code - Build / issue tracking / etc. on-testplan remote Remote system operations issues
Projects
None yet
Development

No branches or pull requests

7 participants