Skip to content

Commit

Permalink
Avoid indexing the workspace for single-file mode (#13770)
Browse files Browse the repository at this point in the history
## Summary

This PR updates the language server to avoid indexing the workspace for
single-file mode.

**What's a single-file mode?**

When a user opens the file directly in an editor, and not the folder
that represents the workspace, the editor usually can't determine the
workspace root. This means that during initializing the server, the
`workspaceFolders` field will be empty / nil.

Now, in this case, the server defaults to using the current working
directory which is a reasonable default assuming that the directory
would point to the one where this open file is present. This would allow
the server to index the directory itself for any config file, if
present.

It turns out that in VS Code the current working directory in the above
scenario is the system root directory `/` and so the server will try to
index the entire root directory which would take a lot of time. This is
the issue as described in
astral-sh/ruff-vscode#627. To reproduce, refer
astral-sh/ruff-vscode#627 (comment).

This PR updates the indexer to avoid traversing the workspace to read
any config file that might be present. The first commit
(8dd2a31)
refactors the initialization and introduces two structs `Workspaces` and
`Workspace`. The latter struct includes a field to determine whether
it's the default workspace. The second commit
(61fc39b)
utilizes this field to avoid traversing.

Closes: #11366

## Editor behavior

This is to document the behavior as seen in different editors. The test
scenario used has the following directory tree structure:
```
.
├── nested
│   ├── nested.py
│   └── pyproject.toml
└── test.py
```

where, the contents of the files are:

**test.py**
```py
import os
```

**nested/nested.py**
```py
import os
import math
```

**nested/pyproject.toml**
```toml
[tool.ruff.lint]
select = ["I"]
```

Steps:
1. Open `test.py` directly in the editor
2. Validate that it raises the `F401` violation
3. Open `nested/nested.py` in the same editor instance
4. This file would raise only `I001` if the `nested/pyproject.toml` was
indexed

### VS Code

When (1) is done from above, the current working directory is `/` which
means the server will try to index the entire system to build up the
settings index. This will include the `nested/pyproject.toml` file as
well. This leads to bad user experience because the user would need to
wait for minutes for the server to finish indexing.

This PR avoids that by not traversing the workspace directory in
single-file mode. But, in VS Code, this means that per (4), the file
wouldn't raise `I001` but only raise two `F401` violations because the
`nested/pyproject.toml` was never resolved.

One solution here would be to fix this in the extension itself where we
would detect this scenario and pass in the workspace directory that is
the one containing this open file in (1) above.

### Neovim

**tl;dr** it works as expected because the client considers the presence
of certain files (depending on the server) as the root of the workspace.
For Ruff, they are `pyproject.toml`, `ruff.toml`, and `.ruff.toml`. This
means that the client notifies us as the user moves between single-file
mode and workspace mode.

#13770 (comment)

### Helix

Same as Neovim, additional context in
#13770 (comment)

### Sublime Text

**tl;dr** It works similar to VS Code except that the current working
directory of the current process is different and thus the config file
is never read. So, the behavior remains unchanged with this PR.

#13770 (comment)

### Zed

Zed seems to be starting a separate language server instance for each
file when the editor is running in a single-file mode even though all
files have been opened in a single editor instance.

(Separated the logs into sections separated by a single blank line
indicating 3 different server instances that the editor started for 3
files.)

```
   0.000053375s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp, using default settings
   0.009448792s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp
   0.009906334s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/test.py
   0.011775917s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered

   0.000060583s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp/nested, using default settings
   0.010387125s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
   0.011061875s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/nested.py
   0.011545208s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered

   0.000059125s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp/nested, using default settings
   0.010857583s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
   0.011428958s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/other.py
   0.011893792s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered
```

## Test Plan

When using the `ruff` server from this PR, we see that the server starts
quickly as seen in the logs. Next, when I switch to the release binary,
it starts indexing the root directory.

For more details, refer to the "Editor Behavior" section above.
  • Loading branch information
dhruvmanila authored Oct 18, 2024
1 parent 3d0bdb4 commit 040a591
Show file tree
Hide file tree
Showing 9 changed files with 211 additions and 53 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions crates/ruff_server/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ rustc-hash = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
shellexpand = { workspace = true }
thiserror = { workspace = true }
tracing = { workspace = true }
tracing-subscriber = { workspace = true }

Expand Down
2 changes: 1 addition & 1 deletion crates/ruff_server/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
pub use edit::{DocumentKey, NotebookDocument, PositionEncoding, TextDocument};
use lsp_types::CodeActionKind;
pub use server::Server;
pub use server::{Server, Workspace, Workspaces};
pub use session::{ClientSettings, DocumentQuery, DocumentSnapshot, Session};

#[macro_use]
Expand Down
170 changes: 140 additions & 30 deletions crates/ruff_server/src/server.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,13 @@
use lsp_server as lsp;
use lsp_types as types;
use lsp_types::InitializeParams;
use lsp_types::WorkspaceFolder;
use std::num::NonZeroUsize;
use std::ops::Deref;
use std::panic::PanicInfo;
use std::str::FromStr;
use thiserror::Error;
use types::ClientCapabilities;
use types::CodeActionKind;
use types::CodeActionOptions;
Expand All @@ -18,6 +22,7 @@ use types::OneOf;
use types::TextDocumentSyncCapability;
use types::TextDocumentSyncKind;
use types::TextDocumentSyncOptions;
use types::Url;
use types::WorkDoneProgressOptions;
use types::WorkspaceFoldersServerCapabilities;

Expand All @@ -29,6 +34,7 @@ use self::schedule::Task;
use crate::session::AllSettings;
use crate::session::ClientSettings;
use crate::session::Session;
use crate::session::WorkspaceSettingsMap;
use crate::PositionEncoding;

mod api;
Expand Down Expand Up @@ -71,17 +77,23 @@ impl Server {

crate::message::init_messenger(connection.make_sender());

let InitializeParams {
initialization_options,
workspace_folders,
client_info,
..
} = init_params;

let mut all_settings = AllSettings::from_value(
init_params
.initialization_options
initialization_options
.unwrap_or_else(|| serde_json::Value::Object(serde_json::Map::default())),
);
if let Some(preview) = preview {
all_settings.set_preview(preview);
}
let AllSettings {
global_settings,
mut workspace_settings,
workspace_settings,
} = all_settings;

crate::trace::init_tracing(
Expand All @@ -91,34 +103,13 @@ impl Server {
.log_level
.unwrap_or(crate::trace::LogLevel::Info),
global_settings.tracing.log_file.as_deref(),
init_params.client_info.as_ref(),
client_info.as_ref(),
);

let mut workspace_for_url = |url: lsp_types::Url| {
let Some(workspace_settings) = workspace_settings.as_mut() else {
return (url, ClientSettings::default());
};
let settings = workspace_settings.remove(&url).unwrap_or_else(|| {
tracing::warn!("No workspace settings found for {}", url);
ClientSettings::default()
});
(url, settings)
};

let workspaces = init_params
.workspace_folders
.filter(|folders| !folders.is_empty())
.map(|folders| folders.into_iter().map(|folder| {
workspace_for_url(folder.uri)
}).collect())
.or_else(|| {
tracing::warn!("No workspace(s) were provided during initialization. Using the current working directory as a default workspace...");
let uri = types::Url::from_file_path(std::env::current_dir().ok()?).ok()?;
Some(vec![workspace_for_url(uri)])
})
.ok_or_else(|| {
anyhow::anyhow!("Failed to get the current working directory while creating a default workspace.")
})?;
let workspaces = Workspaces::from_workspace_folders(
workspace_folders,
workspace_settings.unwrap_or_default(),
)?;

Ok(Self {
connection,
Expand All @@ -127,7 +118,7 @@ impl Server {
&client_capabilities,
position_encoding,
global_settings,
workspaces,
&workspaces,
)?,
client_capabilities,
})
Expand Down Expand Up @@ -462,3 +453,122 @@ impl FromStr for SupportedCommand {
})
}
}

#[derive(Debug)]
pub struct Workspaces(Vec<Workspace>);

impl Workspaces {
pub fn new(workspaces: Vec<Workspace>) -> Self {
Self(workspaces)
}

/// Create the workspaces from the provided workspace folders as provided by the client during
/// initialization.
fn from_workspace_folders(
workspace_folders: Option<Vec<WorkspaceFolder>>,
mut workspace_settings: WorkspaceSettingsMap,
) -> std::result::Result<Workspaces, WorkspacesError> {
let mut client_settings_for_url = |url: &Url| {
workspace_settings.remove(url).unwrap_or_else(|| {
tracing::info!(
"No workspace settings found for {}, using default settings",
url
);
ClientSettings::default()
})
};

let workspaces =
if let Some(folders) = workspace_folders.filter(|folders| !folders.is_empty()) {
folders
.into_iter()
.map(|folder| {
let settings = client_settings_for_url(&folder.uri);
Workspace::new(folder.uri).with_settings(settings)
})
.collect()
} else {
let current_dir = std::env::current_dir().map_err(WorkspacesError::Io)?;
tracing::info!(
"No workspace(s) were provided during initialization. \
Using the current working directory as a default workspace: {}",
current_dir.display()
);
let uri = Url::from_file_path(current_dir)
.map_err(|()| WorkspacesError::InvalidCurrentDir)?;
let settings = client_settings_for_url(&uri);
vec![Workspace::default(uri).with_settings(settings)]
};

Ok(Workspaces(workspaces))
}
}

impl Deref for Workspaces {
type Target = [Workspace];

fn deref(&self) -> &Self::Target {
&self.0
}
}

#[derive(Error, Debug)]
enum WorkspacesError {
#[error(transparent)]
Io(#[from] std::io::Error),
#[error("Failed to create a URL from the current working directory")]
InvalidCurrentDir,
}

#[derive(Debug)]
pub struct Workspace {
/// The [`Url`] pointing to the root of the workspace.
url: Url,
/// The client settings for this workspace.
settings: Option<ClientSettings>,
/// Whether this is the default workspace as created by the server. This will be the case when
/// no workspace folders were provided during initialization.
is_default: bool,
}

impl Workspace {
/// Create a new workspace with the given root URL.
pub fn new(url: Url) -> Self {
Self {
url,
settings: None,
is_default: false,
}
}

/// Create a new default workspace with the given root URL.
pub fn default(url: Url) -> Self {
Self {
url,
settings: None,
is_default: true,
}
}

/// Set the client settings for this workspace.
#[must_use]
pub fn with_settings(mut self, settings: ClientSettings) -> Self {
self.settings = Some(settings);
self
}

/// Returns the root URL of the workspace.
pub(crate) fn url(&self) -> &Url {
&self.url
}

/// Returns the client settings for this workspace.
pub(crate) fn settings(&self) -> Option<&ClientSettings> {
self.settings.as_ref()
}

/// Returns true if this is the default workspace.
pub(crate) fn is_default(&self) -> bool {
self.is_default
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ impl super::SyncNotificationHandler for DidChangeWorkspace {
) -> Result<()> {
for types::WorkspaceFolder { uri, .. } in params.event.added {
session
.open_workspace_folder(&uri)
.open_workspace_folder(uri)
.with_failure_code(lsp_server::ErrorCode::InvalidParams)?;
}
for types::WorkspaceFolder { uri, .. } in params.event.removed {
Expand Down
9 changes: 5 additions & 4 deletions crates/ruff_server/src/session.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,13 @@ use std::sync::Arc;
use lsp_types::{ClientCapabilities, NotebookDocumentCellChange, Url};

use crate::edit::{DocumentKey, DocumentVersion, NotebookDocument};
use crate::server::Workspaces;
use crate::{PositionEncoding, TextDocument};

pub(crate) use self::capabilities::ResolvedClientCapabilities;
pub use self::index::DocumentQuery;
pub(crate) use self::settings::AllSettings;
pub use self::settings::ClientSettings;
pub(crate) use self::settings::{AllSettings, WorkspaceSettingsMap};

mod capabilities;
mod index;
Expand Down Expand Up @@ -42,11 +43,11 @@ impl Session {
client_capabilities: &ClientCapabilities,
position_encoding: PositionEncoding,
global_settings: ClientSettings,
workspace_folders: Vec<(Url, ClientSettings)>,
workspaces: &Workspaces,
) -> crate::Result<Self> {
Ok(Self {
position_encoding,
index: index::Index::new(workspace_folders, &global_settings)?,
index: index::Index::new(workspaces, &global_settings)?,
global_settings,
resolved_client_capabilities: Arc::new(ResolvedClientCapabilities::new(
client_capabilities,
Expand Down Expand Up @@ -136,7 +137,7 @@ impl Session {
}

/// Open a workspace folder at the given `url`.
pub(crate) fn open_workspace_folder(&mut self, url: &Url) -> crate::Result<()> {
pub(crate) fn open_workspace_folder(&mut self, url: Url) -> crate::Result<()> {
self.index.open_workspace_folder(url, &self.global_settings)
}

Expand Down
23 changes: 14 additions & 9 deletions crates/ruff_server/src/session/index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ use rustc_hash::FxHashMap;
pub(crate) use ruff_settings::RuffSettings;

use crate::edit::LanguageId;
use crate::server::{Workspace, Workspaces};
use crate::{
edit::{DocumentKey, DocumentVersion, NotebookDocument},
PositionEncoding, TextDocument,
Expand Down Expand Up @@ -67,12 +68,12 @@ pub enum DocumentQuery {

impl Index {
pub(super) fn new(
workspace_folders: Vec<(Url, ClientSettings)>,
workspaces: &Workspaces,
global_settings: &ClientSettings,
) -> crate::Result<Self> {
let mut settings = WorkspaceSettingsIndex::default();
for (url, workspace_settings) in workspace_folders {
settings.register_workspace(&url, Some(workspace_settings), global_settings)?;
for workspace in &**workspaces {
settings.register_workspace(workspace, global_settings)?;
}

Ok(Self {
Expand Down Expand Up @@ -167,11 +168,12 @@ impl Index {

pub(super) fn open_workspace_folder(
&mut self,
url: &Url,
url: Url,
global_settings: &ClientSettings,
) -> crate::Result<()> {
// TODO(jane): Find a way for workspace client settings to be added or changed dynamically.
self.settings.register_workspace(url, None, global_settings)
self.settings
.register_workspace(&Workspace::new(url), global_settings)
}

pub(super) fn num_documents(&self) -> usize {
Expand Down Expand Up @@ -284,6 +286,7 @@ impl Index {
settings.ruff_settings = ruff_settings::RuffSettingsIndex::new(
root,
settings.client_settings.editor_settings(),
false,
);
}
}
Expand Down Expand Up @@ -398,10 +401,10 @@ impl WorkspaceSettingsIndex {
/// workspace. Otherwise, the global settings are used exclusively.
fn register_workspace(
&mut self,
workspace_url: &Url,
workspace_settings: Option<ClientSettings>,
workspace: &Workspace,
global_settings: &ClientSettings,
) -> crate::Result<()> {
let workspace_url = workspace.url();
if workspace_url.scheme() != "file" {
tracing::info!("Ignoring non-file workspace URL: {workspace_url}");
show_warn_msg!("Ruff does not support non-file workspaces; Ignoring {workspace_url}");
Expand All @@ -411,17 +414,19 @@ impl WorkspaceSettingsIndex {
anyhow!("Failed to convert workspace URL to file path: {workspace_url}")
})?;

let client_settings = if let Some(workspace_settings) = workspace_settings {
ResolvedClientSettings::with_workspace(&workspace_settings, global_settings)
let client_settings = if let Some(workspace_settings) = workspace.settings() {
ResolvedClientSettings::with_workspace(workspace_settings, global_settings)
} else {
ResolvedClientSettings::global(global_settings)
};

let workspace_settings_index = ruff_settings::RuffSettingsIndex::new(
&workspace_path,
client_settings.editor_settings(),
workspace.is_default(),
);

tracing::info!("Registering workspace: {}", workspace_path.display());
self.insert(
workspace_path,
WorkspaceSettings {
Expand Down
Loading

0 comments on commit 040a591

Please sign in to comment.