Language server concurrency and functionality upgrades #979

Closed

micahscopes
Collaborator

@micahscopes micahscopes commented Jan 29, 2024

This PR introduces significant changes to the language server, focusing on improving concurrency and LSP functionality. The language server has been rewritten based on tower-lsp to support concurrent execution of tasks.

Here's a detailed overview of the key components and their interconnections:

server module

  • Server: This struct handles the I/O of the Language Server Protocol (LSP). It receives requests and notifications from the client and sends responses and notifications back to the client. The Server struct uses the tower-lsp crate, which provides a clean and easy-to-use API for implementing LSP servers. Because state is omitted from the Server struct, it functions essentially as an async task manager for LSP events.
  • MessageSenders and MessageReceivers: These structs are generated by a procedural macro in language-server-macro using information from the tower_lsp::LanguageServer trait implementation. They are used for communication between the Server and Backend via tokio MPSC and oneshot channels.

backend module

  • Backend: This struct has exclusive ownership of language server state, which is stored in LanguageServerDatabase and Workspace members, and mediates all state modifications. Struct methods defined on Backend are used to handle stream events set up in the streams module.
  • Workspace: This struct has been refined to avoid excessive reliance on mutable references. It represents the current state of the user's workspace, including the open files and their contents. It functions as an index of salsa inputs to be used with LanguageServerDatabase.
  • LanguageServerDatabase: This struct provides access to functionality in compiler crates. It implements Salsa's ParallelDatabase trait to enable parallel salsa queries via the Snapshot mechanism.

functionality module

  • functionality::streams: This module sets up streams for handling LSP events on the backend side. It uses the tokio::select! macro to concurrently listen to multiple streams and handle events as they arrive. Streams allow for declarative control over the order in which LSP events are processed and concise expression of additional reactive behavior. I'm especially happy with the chunked+debounced diagnostics stream and its handler, to give an example.
  • functionality::handlers: This module contains functions for handling different LSP events. Each function corresponds to a specific LSP event and contains the logic for processing that event and producing an appropriate response.
  • The other modules in functionality implement various LSP features, including improved go-to-definition, hover info, and diagnostics.

This architecture is designed to handle concurrent execution of tasks efficiently and to provide a clean separation between the I/O handling (Server) and the actual processing of LSP events (Backend). This separation makes concurrency easier to reason about and allows for parallel execution of expensive tasks. The use of tokio channels and streams allows for efficient handling of concurrent tasks and provides control over the order of task execution.
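The Server/Backend split described above can be sketched without any async runtime. This is a minimal illustration, not the PR's code: it uses `std::sync::mpsc` and a plain thread instead of tokio channels and tasks, and the `Message` enum, `spawn_backend`, and `hover` names are hypothetical. A request carries its own reply sender, which plays the role of the oneshot channel.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Messages the I/O side forwards to the backend (hypothetical names).
/// A request carries its own reply sender, standing in for a oneshot channel.
pub enum Message {
    DidChange { path: String, text: String },
    Hover { path: String, reply: Sender<Option<String>> },
}

/// Spawns the backend, which is the only owner of the document state,
/// so no lock is needed around `docs`.
pub fn spawn_backend() -> (Sender<Message>, JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Message>();
    let handle = thread::spawn(move || {
        let mut docs: HashMap<String, String> = HashMap::new();
        // The loop ends once every sender has been dropped.
        for msg in rx {
            match msg {
                Message::DidChange { path, text } => {
                    docs.insert(path, text);
                }
                Message::Hover { path, reply } => {
                    // Ignore the error if the requester has gone away.
                    let _ = reply.send(docs.get(&path).cloned());
                }
            }
        }
    });
    (tx, handle)
}

/// What a stateless server-side method might do for a hover request:
/// forward it with a fresh reply channel and wait for the answer.
pub fn hover(tx: &Sender<Message>, path: &str) -> Option<String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Message::Hover { path: path.to_string(), reply: reply_tx }).ok()?;
    reply_rx.recv().ok().flatten()
}
```

Because both message kinds travel over one FIFO channel in this sketch, a hover sent after a change always observes that change.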

Changes and features

  • Async handlers for LSP events (requests + notifications) via the tower-lsp crate
  • Separation of LSP server I/O (Server) and language server state (Backend) via tokio channels
    • Using a procedural macro to generate channels
    • Broadcast channels to send out LSP events
    • Oneshot channels are sent along with LSP requests to facilitate responses
    • Tokio channel separation is optional; events can be handled directly in the tower-lsp interface if needed
  • Proof of concept stream-based LSP event handling on the backend side
    • Intended for dealing with LSP event handler execution order, cancellation, and debouncing of expensive event handlers
    • Simple use case: aggregating document updates and handling them with a single on_change handler
    • See tower-lsp issue #284 for examples of more complex scenarios that could arise
  • Separate tokio executors (worker pools) for server vs backend contexts (proof of concept, could be leveraged to keep long running tasks separate from LSP I/O)

More changes, following the initial review

  • Refactor to avoid mutable references when not exclusively modifying salsa inputs, generally avoiding mutable references to language server db outside of document change handler
    • Modify salsa inputs explicitly in a single step without doing other stuff
    • Remove salsa input modifications from handlers like hover, ensure they are updated in the change handler
    • Remove diagnostics storage in the language server db, diagnostics shouldn't need a mutable reference
  • Refactor to avoid using broadcast channels by default, only use them if broadcasting/forking is strictly necessary
  • Review stream forking, avoid unnecessarily forking
  • Refactor stream handler configuration
    • One single spawned select! loop for mutating handlers and multiple spawns for read-only handlers
    • Potentially move read-only select loops to a separate executor to enable parallel execution
  • Switch to tracing for logging to ensure logs are correctly ordered
  • Document broadcast channel / stream architecture and associated proc macro
  • Clean up code organization and module naming
  • Refine and document stream setup for controlling order of execution for tasks triggered by LSP events
  • Give Backend exclusive ownership of workspace and database state; no locks needed
  • Add tests
  • Investigate and implement a cancellation mechanism for potentially long running processes, both on the language server and in the compiler

Functionality upgrades

  • Goto works with intermediate path segments
  • Useful hover information
    • render doc comments as markdown
    • target origin
    • definition source
  • non-blocking, multithreaded diagnostics
  • batched, deduplicated ingot wide diagnostic handler
  • Improved VS Code extension support for stuff like comment toggling and autoclosing brackets

Not urgent/maybe

  • Ensure WASI target compiles
    • Replace WASM test target with WASI test target
  • automatically send LSP events through channels; would require modification to tower-lsp
  • Performance profiling tasks

Initial impressions (previously)

Update: the uncertainties below have been addressed by a channel/stream augmentation to the tower-lsp implementation

Managing request handler execution order

tower-lsp doesn't really provide a way of managing the order of handler execution. Neither does lsp-server for that matter. What exactly should this execution dependency graph look like in various cases? It's hard to foresee what this will look like as more and more LSP functionality gets implemented, but it's clear that we need a way to control the order of task execution somehow or another.

As a simple example, if I were to rename a file in vscode it could trigger multiple handlers (did_open, did_close, watched_files_did_change) concurrently.

How can we ensure that this group of LSP events gets handled appropriately? We're not just opening any file, we're opening a renamed file and we need to ensure that the workspace cache is updated to reflect this before executing diagnostics. The watched_files_did_change handler ends up being redundant in this case if it gets executed after the did_open handler, but in the case where a file gets directly renamed outside of the LSP client, it's still important.

In this example, it's not a big deal to check for deleted (renamed) files in both handlers, since those checks are relatively short lived given how infrequently they'd be executed. But it'll be important to ensure that e.g. diagnostics are run carefully, that the salsa inputs are set up correctly before running diagnostics or other complex tasks. It's also important to ensure that expensive tasks aren't triggered redundantly in parallel.

Shared state and deadlocks

Related to the issue of concurrency... How do we manage sharing of the salsa database and inputs cache in a concurrent environment?

In this prototype the salsa db and workspace cache are currently shared via std::sync::Mutex locks and the language server client is shared via tokio::sync::Mutex locks. This works just fine but it requires care not to cause deadlocks. The tokio shared state docs were very helpful for me in understanding this better.

Useful info

@micahscopes micahscopes force-pushed the language-server-tower-lsp branch from bd1bbd5 to 9ecdf02 Compare January 29, 2024 17:56
@micahscopes micahscopes marked this pull request as draft January 29, 2024 18:11
@micahscopes micahscopes force-pushed the language-server-tower-lsp branch 2 times, most recently from 6357849 to 2b43c77 Compare January 30, 2024 00:50
@micahscopes micahscopes force-pushed the language-server-tower-lsp branch from fc74841 to fcdd0d9 Compare February 29, 2024 20:24
@micahscopes micahscopes force-pushed the language-server-tower-lsp branch 4 times, most recently from bd01409 to e31c0be Compare March 6, 2024 10:06
@micahscopes
Collaborator Author

micahscopes commented Mar 8, 2024

Notes from review with @sbillig and @Y-Nak:

  • avoid mutable references when we aren't exclusively modifying salsa inputs
  • salsa inputs should be modified explicitly in a single step that doesn't do other stuff, e.g.
    • don't modify salsa inputs in the hover handler, they should already be updated from the change handler
    • shouldn't need to store diagnostics on the language server db, diagnostics shouldn't need a mutable reference
    • generally we shouldn't see mutable references to language server db outside of document change handler
    • this will avoid excessive cache invalidation and free up multithreaded computation either using a rwlock or salsa's snapshot mechanism
  • salsa's snapshot mechanism is similar to a rwlock but avoids deadlocks in case of cycles
  • avoid using broadcast channels by default, only use them if broadcasting/forking is strictly necessary
    • maybe we can think in this order of preference: mpsc channels by default; then some kind of stream-splitting mechanism (potentially using broadcast channels but guarding access to create extra receivers); then broadcast channels
    • this will allow just sending the raw oneshot channel instead of wrapping it
  • try to avoid forking streams unnecessarily; be intentional about forking them
  • multithreaded executor is unnecessary for the current stream handler configuration (one loop with a big select! statement containing all the stream handlers)
    • stream handler will be more i/o bound once the multithreaded salsa queries are sorted out
    • need to think more carefully about how to break stuff up
    • maybe we can have one select loop for mutating stream handlers and spawn separate select loops (on a separate executor?) for read only stream handlers, that way we can get all the read only stuff done in parallel and free up the db lock as soon as possible
  • use tracing for logging, it will help guarantee that logs are correctly ordered
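The "mutate in one place, read in parallel" idea from these notes can be illustrated with a plain `RwLock`. This is only a sketch under stated assumptions: `State` and `change_then_read` are hypothetical, the two readers stand in for read-only handlers such as hover or diagnostics, and a real implementation would use salsa inputs and snapshots rather than a `String`.

```rust
use std::sync::RwLock;
use std::thread;

/// Stand-in for the language server state (hypothetical).
pub struct State {
    pub text: String,
}

/// Applies one mutation under the write lock, releases it, and then runs two
/// read-only "handlers" in parallel under read locks.
/// Returns (line count, word count) of the updated text.
pub fn change_then_read(state: &RwLock<State>, new_text: &str) -> (usize, usize) {
    {
        // The only place that needs exclusive access.
        let mut guard = state.write().unwrap();
        guard.text = new_text.to_string();
    } // Write guard dropped here, so readers can proceed.

    thread::scope(|s| {
        // Both readers can hold the read lock at the same time.
        let lines = s.spawn(|| state.read().unwrap().text.lines().count());
        let words = s.spawn(|| state.read().unwrap().text.split_whitespace().count());
        (lines.join().unwrap(), words.join().unwrap())
    })
}
```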

@Y-Nak
Member

Y-Nak commented Mar 8, 2024

salsa's snapshot mechanism is similar to a rwlock but avoids deadlocks in case of cycles

This is not correct. What Snapshot does is almost the same as what RwLock does, so it's possible to introduce a deadlock either way. But it's rather difficult to cause a deadlock as long as we don't try to mutate the db from a salsa tracked function, e.g., by sending an event to the main thread to let the thread mutate the db.
E.g.,

#[salsa::tracked]
// Sender needs to implement `Clone` and `Hash` to be an argument of
// a salsa-tracked function, but I ignore the fact for simplicity.
fn rename(db: &dyn Db, rename_event: Event, tx: Sender) {
    // Perform renaming.
    // ...

    // Send an event to the main thread, and the main thread will try to mutate the database.
    // This might cause a deadlock.
    tx.send(Event::SourceTextChanged);

    // ...
}

Another possibility for deadlock is unrelated to mutability, i.e., the deadlock might happen even if we only use &Db in a multi-threaded setting. But salsa detects this situation and raises a cycle error, which is, of course, nice.
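The first deadlock scenario can be made concrete with a plain `std::sync::RwLock`, which the comment above describes as behaving almost the same as a Snapshot. The sketch below is an analogy only (it does not use salsa, and the function names are hypothetical): while a read guard is alive, a writer cannot get in, so a reader that waits for the writer to finish would wait forever.

```rust
use std::sync::{RwLock, TryLockError};

/// Returns true if a write attempt would have to block while a read guard
/// (think: a running query holding a snapshot) is still alive.
pub fn write_blocked_while_reading(db: &RwLock<Vec<String>>) -> bool {
    let _reader = db.read().unwrap();
    // `try_write` never blocks; it reports `WouldBlock` instead.
    matches!(db.try_write(), Err(TryLockError::WouldBlock))
}

/// The safe ordering: finish reading, drop the guard, then mutate.
/// Returns the number of entries after the write.
pub fn write_after_read(db: &RwLock<Vec<String>>, text: &str) -> usize {
    {
        let reader = db.read().unwrap();
        let _seen = reader.len();
    } // Read guard dropped before any mutation is attempted.
    let mut writer = db.write().unwrap();
    writer.push(text.to_string());
    writer.len()
}
```

If the reader in `write_blocked_while_reading` called a blocking `write()` (directly, or indirectly by waiting on another thread that does) instead of `try_write()`, it would never return. That is the shape of the problem in the `rename` example.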

Please refer to the below links for more information.

@micahscopes micahscopes force-pushed the language-server-tower-lsp branch 3 times, most recently from 1c2dbb2 to d6e07fb Compare March 19, 2024 22:25
@micahscopes micahscopes force-pushed the language-server-tower-lsp branch from fc567e1 to a34114e Compare March 27, 2024 06:31
@micahscopes micahscopes force-pushed the language-server-tower-lsp branch from f4ad0bf to 9e296ce Compare March 28, 2024 06:30
@micahscopes micahscopes changed the title Language server concurrency (tower-lsp rewrite) Language server concurrency and functionality upgrades Mar 29, 2024
@micahscopes micahscopes marked this pull request as ready for review March 29, 2024 04:01
Result<Option<lsp_types::Hover>, tower_lsp::jsonrpc::Error>,
>,
) {
let db = self.db.snapshot();
Collaborator

There's no reason to use a snapshot here, right? hover_handler(&self.db, ... should suffice (unless this is changed so that the task is spawned off to run separately)

Collaborator Author

Yup, this is also vestigial, I'd been handling hover in a worker but took it out and forgot to remove this.

Member

@Y-Nak Y-Nak left a comment

I looked through the PR. I'll probably add more comments later.

@@ -889,7 +889,7 @@ enum DefKind {
     Adt(AdtDef),
     Trait(TraitDef),
     ImplTrait(Implementor),
-    Impl(HirImpl, TyId),
+    Impl(HirImpl, #[allow(dead_code)] TyId),
Member

Suggested change
-    Impl(HirImpl, #[allow(dead_code)] TyId),
+    Impl(HirImpl, TyId),

Collaborator Author

When I upgraded rust with rustup clippy complained about TyId being unused. I added this temporarily but wasn't sure how to proceed.

@@ -24,6 +24,7 @@ pub trait DiagnosticVoucher: Send {
     fn error_code(&self) -> GlobalErrorCode;
     /// Makes a [`CompleteDiagnostic`].
     fn to_complete(&self, db: &dyn SpannedHirDb) -> CompleteDiagnostic;
+    fn clone_box(&self) -> Box<dyn DiagnosticVoucher>;
Member

Why do you need clone_box? It seems you don't use this method in the LSP implementation.

Collaborator Author

I'm not sure if it's the correct way of doing it, but this allowed me to implement snapshotting (the ParallelDatabase trait) without the compiler complaining about not being able to clone stuff.

}

impl salsa::Database for LanguageServerDatabase {
fn salsa_event(&self, _: salsa::Event) {}
Member

It'd be nice if we could get tracing logs when salsa events happen.

Collaborator Author

The logger has been showing some salsa events already even without implementing this, I guess from the other crates? I actually disabled them temporarily here in order to focus on messages from the language-server crate.

It could be cool to have separate outputs for these or some other way of toggling.

}
}

pub fn run_server() -> Result<()> {
let (connection, io_threads) = Connection::stdio();
#[language_server_macros::message_channels]
Member

I don't quite understand why we need as many channels as LSP request kinds.
It feels like this macro is an abuse of proc macros: it decreases readability and maintainability, and the generated structs already contain unused channels.

Is there any specific reason that a message enum doesn't work in this case?

Member

Oh, I missed this part:

Proof of concept stream-based LSP event handling on the backend side
Intended for dealing with LSP event handler execution order, cancellation, and debouncing of expensive event handlers
Simple use case: aggregating document updates and handling them with a single on_change handler
See ebkalderon/tower-lsp#284 for examples of more complex scenarios that could arise

But the macro does not seem so good...

Member

@Y-Nak Y-Nak Mar 30, 2024

Still, I don't think we need one-to-one correspondence between channels and request kinds.

  1. I would like to avoid async in our implementation. LSPs are mostly
    computationally bound rather than I/O bound, and async adds a lot of
    complexity to the API, while also making it harder to reason about
    execution order. This leads into the second reason, which is...
  2. Any handlers that mutate state should be blocking and run in the
    event loop, and the state should be lock-free. This is the approach that
    rust-analyzer uses (also with the lsp-server/lsp-types crates as a
    framework), and it gives us assurances about data mutation and execution
    order. tower-lsp doesn't support this, which has caused some
    issues around data
    races and out-of-order handler execution.
  3. In general, I think it makes sense to have tight control over
    scheduling and the specifics of our implementation, in exchange for a
    slightly higher up-front cost of writing it ourselves. We'll be able to
    fine-tune it to our needs and support future LSP features without
    depending on an upstream maintainer.

This is the reasoning about why ruff chose lsp-server instead of tower-lsp.
It seems reasonable to me.
astral-sh/ruff#10158

Collaborator Author

@micahscopes micahscopes Apr 1, 2024

😅 I see how the macro isn't exactly a simple solution, and the motivation for it was also messy (getting around those tower-lsp LanguageServer trait limitations). My intention was to reduce the surface area of boilerplate when creating these channels to bridge to the Backend, but the tradeoff is opacity and complexity. Plus, even with this macro there's still a lot of boilerplate in setting up all the stream handlers.

As fun as it was to write I agree that it'd be good not to have it if possible.

As you mentioned on discord one possibility could be to use the support-mutable-methods branch, at least until they figure out a more permanent solution. I like that suggestion a lot, it'd seriously simplify things. That way there'd be no requirement to make a channel/stream just to handle an LSP event directly. We could still get the flexibility and control of the async Stream ecosystem as needed and the lock-free state but without needing to set up all those channels/streams by default. I also appreciate that the tower-lsp maintainers seem to be taking this issue seriously. Another possibility is this async-lsp crate which seems quite solid, even though the API is less readable.

As for the decision of lsp-server vs tower-lsp: I really like tower-lsp's high level API and the built-in task management inherent to rust async. Using lsp-server would require us to build some kind of task manager regardless, which to me seems like an inherently complex thing to reason about and maintain.

The thing that keeps bothering me is how LSP is an async protocol by design (1, 2) in nuanced ways that I'm not convinced can be adequately dealt with just by faithfully processing events in the order they're received... LSP events don't always have a well defined total order. Processing some events in order is essential (e.g. didChange events), but there are situations when a single user action can trigger multiple loosely coupled LSP events more or less concurrently. For example, renaming a file can trigger:

  • willRename (a request that the client waits on a response for before renaming)

Followed by:

  • didRename (after the client renames)
  • didChangeWatchedFiles (from the filesystem watcher with two changes, a create and delete, seems to be delayed by ~1 second on my computer)
  • didOpen (when the file with the new name is opened)
  • didClose (when the file of the previous name is closed)

Great care is needed to ensure that the handlers for each of these does exactly what it needs to do without overlap or conflict, especially in cases where state gets mutated.

I'm anticipating edge cases that will be tricky to reason about imperatively, where we may want to condition event handling on the presence or absence of several other recent or pending events, perhaps over a given debounce period. Ideally we won't need to do this, but it doesn't seem unlikely, especially when we get into code actions or refactoring features.

The idea of trying to manage these complexities imperatively/synchronously with a custom task manager makes me nervous, regardless of how faithfully we adhere to the order these events arrive in.

In spite of the current rigidity of tower-lsp and the potential complexity of choosing rust async, I do think the async ecosystem (especially streams) could bring a lot of flexibility in managing execution order and in making sense of concurrent events, if we use it carefully.


A few more notes about it before letting that proc macro go (RIP 🪦):

  • It only makes channels for trait methods that are implemented, so only for those that are used
  • I thought about using an enum or doing some kind of "multiplexing" over a single channel, but the amount of boilerplate code it would take to define the enums, join the data, and split the channel into separate streams seemed comparable to the amount it would take to just have a single channel per LSP event type. Either way, it felt like too much.
  • The macro was inspired by the one used to generate the routes in tower-lsp (here) but I admit that it got quite a bit more complex, a work of art
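For comparison, the enum-plus-multiplexing alternative mentioned in these notes might look roughly like the sketch below. It is hypothetical throughout (`LspEvent`, `Split`, and `demux` are invented for illustration, and it uses `std::sync::mpsc` rather than tokio streams), but it shows where the boilerplate goes: one enum variant, one queue, and one match arm per event kind.

```rust
use std::sync::mpsc::{self, Receiver};

/// One variant per LSP event kind, all sent over a single channel.
pub enum LspEvent {
    DidOpen(String),
    DidChange(String),
    DidClose(String),
}

/// One receiver per event kind after splitting.
pub struct Split {
    pub opened: Receiver<String>,
    pub changed: Receiver<String>,
    pub closed: Receiver<String>,
}

/// Drains `rx` until all of its senders are dropped, routing each event to the
/// queue for its kind. Relative order within each kind is preserved.
pub fn demux(rx: Receiver<LspEvent>) -> Split {
    let (open_tx, opened) = mpsc::channel();
    let (change_tx, changed) = mpsc::channel();
    let (close_tx, closed) = mpsc::channel();
    for event in rx {
        // Sends cannot fail here because the receivers are still alive.
        let _ = match event {
            LspEvent::DidOpen(uri) => open_tx.send(uri),
            LspEvent::DidChange(uri) => change_tx.send(uri),
            LspEvent::DidClose(uri) => close_tx.send(uri),
        };
    }
    Split { opened, changed, closed }
}
```

Note that splitting discards the relative order between different kinds of events, which is one reason a single enum-typed queue may be preferable when ordering across kinds matters.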

Comment on lines +39 to +47
tokio::select! {
// setup logging
_ = handle_log_messages(rx, server.client.clone()) => {},
// start the server
_ = tower_lsp::Server::new(stdin, stdout, socket)
.serve(service) => {}
// backend
_ = functionality::streams::setup_streams(&mut backend, message_receivers) => {}
}
Member

I don't understand why you need tokio::select here.

Is there any reason that you can't do the one below?

// Prepare channels for `Server` <-> `Backend` communication.
let (tx, rx) = make_channels(...);

let backend = Backend::new(rx, ...);
// Run the backend in a spawned task.
tokio::spawn(backend.run());

// Make `Server`. `Backend` and the logging thread are managed by
// this server via channels, e.g., for graceful shutdown.
let server = ...

// Run server.
server.serve(service).await;

Member

@Y-Nak Y-Nak left a comment

I've added a review for the implementation of goto.


An additional note for future reference:

For optimization purposes, it would be beneficial to narrow the possible range before collecting actual paths when we know the passed cursor is included in ItemKind::Body.

This is because:

  • The Body contains numerous paths (even local variables are defined as paths in HIR).
  • HIR maintains Expr <=> Span and Stmt <=> Span mappings.

Therefore, it would be more efficient to narrow down the possible range before collecting paths. Implementing find_enclosing_stmt and find_enclosing_expr would be required for this (I think these are necessary in the near future either way).

This narrowing-down-range feature should probably be generalized, e.g.,

/// Find the closest HIR element that includes the cursor, 
/// by traversing the elem-span hierarchy.
fn find_closest_hir_elem<T: HirElem>(db: &dyn LanguageServerDb, cursor: Cursor) -> Option<T> {
    ...
}
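The core of such a lookup is "pick the smallest span that still contains the cursor". A self-contained sketch of just that step follows; the `Span` type and `find_closest_enclosing` are hypothetical and use plain byte offsets over a flat list, whereas the real version would traverse the HIR element/span hierarchy.

```rust
/// Half-open byte range `[start, end)` (hypothetical stand-in for a HIR span).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

impl Span {
    fn contains(&self, cursor: usize) -> bool {
        self.start <= cursor && cursor < self.end
    }

    fn width(&self) -> usize {
        self.end - self.start
    }
}

/// Returns the index of the narrowest span that contains `cursor`, if any.
/// With properly nested spans this is the innermost enclosing element.
pub fn find_closest_enclosing(spans: &[Span], cursor: usize) -> Option<usize> {
    spans
        .iter()
        .enumerate()
        .filter(|(_, span)| span.contains(cursor))
        .min_by_key(|(_, span)| span.width())
        .map(|(index, _)| index)
}
```

Once the innermost statement or expression is known, paths would only need to be collected inside that element instead of across the whole body.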

Comment on lines 54 to 58
fn visit_ident(
&mut self,
ctxt: &mut VisitorCtxt<'_, hir::visitor::prelude::LazySpanAtom>,
ident: hir::hir_def::IdentId,
) {
Member

PathSpanCollector::visit_item is called; this means all idents are collected as path segments even if the ident is not a segment (e.g., if the item is Func, then the func name is also collected).
It's necessary to collect segments manually by iterating over them in visit_path.

Collaborator Author

Ah yes, I see. That will be more straight to the point.

Comment on lines 53 to 84
pub(super) async fn handle_deleted(
&mut self,
params: lsp_types::FileEvent,
tx_needs_diagnostics: tokio::sync::mpsc::UnboundedSender<String>,
) {
let path = params.uri.to_file_path().unwrap();
info!("file deleted: {:?}", path);
let path = path.to_str().unwrap();
let _ = self
.workspace
.remove_input_for_file_path(&mut self.db, path);
let _ = tx_needs_diagnostics.send(path.to_string());
}

pub(super) async fn handle_change(
&mut self,
doc: TextDocumentItem,
tx_needs_diagnostics: tokio::sync::mpsc::UnboundedSender<String>,
) {
info!("change detected: {:?}", doc.uri);
let path_buf = doc.uri.to_file_path().unwrap();
let path = path_buf.to_str().unwrap();
let contents = Some(doc.text);
if let Some(contents) = contents {
let input = self
.workspace
.touch_input_for_file_path(&mut self.db, path)
.unwrap();
let _ = input.sync_from_text(&mut self.db, contents);
}
let _ = tx_needs_diagnostics.send(path.to_string());
}
Collaborator

(Associated change for the removal of the stream forking)

Suggested change
pub(super) async fn handle_change(
&mut self,
change: FileChange,
tx_needs_diagnostics: tokio::sync::mpsc::UnboundedSender<String>,
) {
let path = change.uri.to_string();
match change.kind {
ChangeKind::Open(contents) => {
info!("file opened: {:?}", &path);
self.update_input_file_text(&path, contents);
}
ChangeKind::Create => {
info!("file created: {:?}", &path);
let contents = tokio::fs::read_to_string(&path).await.unwrap();
self.update_input_file_text(&path, contents)
}
ChangeKind::Edit(contents) => {
info!("file edited: {:?}", &path);
let contents = if let Some(text) = contents {
text
} else {
tokio::fs::read_to_string(&path).await.unwrap()
};
self.update_input_file_text(&path, contents);
}
ChangeKind::Delete => {
info!("file deleted: {:?}", path);
self.workspace
.remove_input_for_file_path(&mut self.db, &path)
.unwrap();
}
}
tx_needs_diagnostics.send(path).unwrap();
}
fn update_input_file_text(&mut self, path: &str, contents: String) {
let input = self
.workspace
.touch_input_for_file_path(&mut self.db, path)
.unwrap();
let _ = input.sync_from_text(&mut self.db, contents);
}

Collaborator

I used tokio::fs::read_to_string here, but that might introduce the possibility of changes being handled out of order.

I share Yoshi's skepticism of using async because of these tricky situations, but I remain open minded. If we did drop async, I think the "streamy" essence of the code could remain; we'd just, e.g., explicitly spawn a thread (or "actor") that sequentially handles all change events that arrive on a channel.
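A thread-based version of that idea could look something like the following sketch. It is an illustration under assumptions, not a proposal for the exact code: `FileChange` and `ChangeKind` here are simplified stand-ins for the types in the suggestion above (only `Edit` and `Delete`), and a `BTreeMap` stands in for the workspace and salsa inputs.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Simplified stand-ins for the change types in the suggestion above.
pub enum ChangeKind {
    Edit(String),
    Delete,
}

pub struct FileChange {
    pub path: String,
    pub kind: ChangeKind,
}

/// Spawns an actor thread that applies changes strictly in arrival order.
/// Joining the handle returns the final file contents once all senders are dropped.
pub fn spawn_change_actor() -> (Sender<FileChange>, JoinHandle<BTreeMap<String, String>>) {
    let (tx, rx) = mpsc::channel::<FileChange>();
    let handle = thread::spawn(move || {
        let mut files = BTreeMap::new();
        // One change is fully applied before the next one is looked at.
        for change in rx {
            match change.kind {
                ChangeKind::Edit(text) => {
                    files.insert(change.path, text);
                }
                ChangeKind::Delete => {
                    files.remove(&change.path);
                }
            }
        }
        files
    });
    (tx, handle)
}
```

Since the actor is the only code touching `files`, no locking is needed, and blocking file reads inside a handler cannot be overtaken by a later change.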

Collaborator Author

@micahscopes micahscopes Apr 1, 2024

I haven't tried it yet but this looks great. Can you help me understand how these changes might get handled out of order? My thought is that by merging all those change events into a single stream we will enforce that they're processed serially in the order they were merged, at least relative to one another. The tokio::fs::read_to_string(...).await call will yield to other non-change-stream events but block further change handling until it finishes.

Collaborator Author

(By the way, I shared my complex thoughts on the async stuff here: #979 (comment))

@micahscopes
Collaborator Author

Closing in favor of #1022

3 participants