Add awareness features to handle server state #170

Merged: 21 commits, Oct 9, 2024

brichet
Contributor

@brichet brichet commented Oct 3, 2024

This PR adds the awareness class, extracted from https://github.com/jupyter-server/pycrdt-websocket.

It also adds some features to that class so that the server can update and observe the awareness.

Collaborator

@davidbrochart davidbrochart left a comment

Thanks @brichet!
Now that pycrdt has API reference documentation, all public classes and methods should have docstrings.
Also, we'll need 100% test coverage. You can check with:

coverage run -m pytest tests
coverage report --show-missing --fail-under=100

@brichet
Contributor Author

brichet commented Oct 3, 2024

Also, we'll need 100% test coverage.

It should be OK now

@brichet brichet changed the title Add awareness Add awareness features to handle server state Oct 4, 2024
@davidbrochart
Collaborator

@brichet I took the liberty of making some changes in 98540a5, please let me know if you agree with them.
Among other things, I removed the on_change() callback. The encoded state is just (optionally) returned when the local state is set. That leaves the door open for emitting the encoded state in the future.

@brichet
Contributor Author

brichet commented Oct 7, 2024

Thanks for the suggestion, I can take a look at it.
Maybe we should keep get_change() as deprecated to avoid breaking backward compatibility.

@brichet
Contributor Author

brichet commented Oct 7, 2024

@davidbrochart I updated the class according to your suggestions; it should be closer to awareness.js now.

@davidbrochart
Collaborator

Thanks, do you mind if I take it from here? It will be quicker than requesting changes, and you will still be able to accept/reject my changes afterwards.

@davidbrochart
Collaborator

@brichet Let me know what you think about c75c8f9.

Contributor Author

@brichet brichet left a comment

Thanks @davidbrochart.

Does it work on your side? I get errors when I try to use it:

      File "/home/brichet/projects/pycrdt/python/pycrdt/_awareness.py", line 136, in apply_awareness_update
        state_str = decoder.read_var_string()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/brichet/projects/pycrdt/python/pycrdt/_sync.py", line 227, in read_var_string
        return message.decode("utf-8")
               ^^^^^^^^^^^^^^^^^^^^^^^
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte

Comment on lines 221 to 223
def _update_states(
    self, client_id: int, clock: int, state: Any, states_changes: dict[str, list[int]]
) -> None:
Contributor Author

I think this function could be kept, using an origin to know if we should remove the local state or not.

Collaborator

But it's slightly different between setLocalState and applyAwarenessUpdate, right? In particular setLocalState doesn't handle the clock.

Contributor Author

I'm not sure I understand "handle the clock". It does update it too, doesn't it?

Collaborator

But what about this, which _update_states() was doing unconditionally?

Contributor Author

I agree that it was wrong in _update_states(): it should also have tested that the origin is not "local".
Anyway, the current implementation works. My suggestion was just to avoid duplicating code.

@davidbrochart
Collaborator

Does it work on your side? I get errors when I try to use it:

      File "/home/brichet/projects/pycrdt/python/pycrdt/_awareness.py", line 136, in apply_awareness_update
        state_str = decoder.read_var_string()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/brichet/projects/pycrdt/python/pycrdt/_sync.py", line 227, in read_var_string
        return message.decode("utf-8")
               ^^^^^^^^^^^^^^^^^^^^^^^
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte

Do you have a reproducible example?

@brichet
Contributor Author

brichet commented Oct 8, 2024

Do you have a reproducible example?

Not really. I have an environment with dev installs of pycrdt, pycrdt_websocket, jupyter_collaboration and jupyter_ydoc. All these packages need to be updated to test it properly.
I'll try to figure out where the error comes from.

Comment on lines 57 to 58
if state is None:
    del self._states[client_id]
Contributor Author

This will probably raise an error if the local state has not been defined and we try to set it to None.

Collaborator

Raising an exception is a feature IMO (rather than keep it silent), but maybe you mean that we should raise a more meaningful exception?

Contributor Author

I don't think we need to raise an exception. If we want to set it to null and that is already the case, it can stay silent, like here

Collaborator

But in set_local_state we are responsible for what we do, in apply_awareness_update less so. But I don't have a strong opinion on this, I will add the check back 👍
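
The two positions in this thread can be compared with a minimal sketch. It assumes the states live in a plain dict keyed by client ID, as the `del self._states[client_id]` snippet above suggests. `remove_state_strict` and `remove_state_silent` are hypothetical helper names used only for illustration; they are not pycrdt functions.

```python
from typing import Any


def remove_state_strict(states: dict[int, Any], client_id: int) -> None:
    # Mirrors `del self._states[client_id]`: raises KeyError when no state is set.
    del states[client_id]


def remove_state_silent(states: dict[int, Any], client_id: int) -> bool:
    # Checks first, so clearing an already-empty state is a no-op.
    # Returns True only when a state was actually removed.
    if client_id in states:
        del states[client_id]
        return True
    return False


states: dict[int, Any] = {1: {"user": "alice"}}
assert remove_state_silent(states, 1) is True
assert remove_state_silent(states, 1) is False  # second call stays silent
try:
    remove_state_strict(states, 1)
except KeyError:
    print("strict variant raised KeyError")
```

The silent variant still reports whether anything changed, so a caller could skip broadcasting an update when nothing was removed.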

# Remote client removed the local state. Do not remove state.
# Broadcast a message indicating that this client still exists by increasing
# the clock.
clock += 1
Contributor Author

Shouldn't it be

clock = curr_clock + 1

to update the local clock?

Collaborator

This mimics the JavaScript implementation. Should it be different?

Contributor Author

Maybe I misunderstood the clock, but my understanding is that each client has its own.
If that is the case, why should we rely on a value coming from a remote client to update the local clock?

Collaborator

Do you want to open an issue in https://github.com/yjs/y-protocols to discuss this, or maybe @dmonad can comment here?
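
The thread leaves this question open. As one possible reading only, the sketch below compares the two candidates under two assumptions that are not confirmed in this conversation: an incoming update is ignored when its clock is older than the one held locally, and a peer accepts a non-null state only when it carries a strictly newer clock than the one the peer holds. `clock_after_remote_removal` and `peer_accepts` are hypothetical names.

```python
from typing import Optional


def clock_after_remote_removal(curr_clock: int, incoming_clock: int) -> Optional[int]:
    """Clock to store after a remote update tried to remove the local state.

    Returns None when the update is older than what is already held and is
    therefore ignored (assumed guard, modeled on awareness.js).
    """
    if incoming_clock < curr_clock:
        return None
    # Same effect as `clock += 1` in the snippet above.
    return incoming_clock + 1


def peer_accepts(peer_clock: int, broadcast_clock: int) -> bool:
    # Assumption: a peer applies a non-null state only if its clock is newer.
    return peer_clock < broadcast_clock


curr_clock, incoming_clock = 2, 7
rebroadcast = clock_after_remote_removal(curr_clock, incoming_clock)
print(peer_accepts(incoming_clock, rebroadcast))     # True: incoming clock + 1
print(peer_accepts(incoming_clock, curr_clock + 1))  # False: local clock + 1
```

Under these assumptions, a peer that already saw the removal at clock 7 would ignore a re-broadcast at clock 3, which may be why the JavaScript implementation increments the incoming value. When the incoming clock equals the local one, both candidates give the same result.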

Comment on lines 162 to 163
elif client_meta is not None and state is None:
    removed.append(client_id)
Contributor Author

Won't we end up in this branch if a remote client tries to remove the local state?

Collaborator

It's the translation of this. Do you mean that there is an issue in the JavaScript implementation?

Contributor Author

Maybe...
We should test it, but it seems to me that this condition is fulfilled if a remote client tries to set the local awareness to null.
The client_meta is the local meta (which may be non-null), and the state is the state to apply, which is null in that case.

Collaborator

But a remote client cannot remove the local state, right?

Contributor Author

I agree with that, but it will send wrong information to the other clients, namely that the local state has been removed, no?

Collaborator

I don't know, maybe open a PR in https://github.com/yjs/y-protocols? It should be fixed there too, and they will confirm if it's a bug.
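
The concern raised in this thread can be reproduced with a simplified model of the classification step. Only the `removed` branch appears in the reviewed snippet; the `added` and `updated` branches are assumptions modeled on awareness.js, and the `local_state_kept` guard is a purely hypothetical illustration of one way to avoid the report, not something proposed in the PR.

```python
from typing import Any, Optional


def classify_change(
    client_meta: Optional[dict[str, int]],
    state: Any,
    local_state_kept: bool = False,
) -> Optional[str]:
    # Hypothetical guard: skip reporting when the local state was preserved.
    if local_state_kept:
        return None
    if client_meta is None and state is not None:
        return "added"
    elif client_meta is not None and state is None:
        # Branch under discussion: reached even when the state was not deleted.
        return "removed"
    elif state is not None:
        return "updated"
    return None


local_meta = {"clock": 3}
# A remote update carries a null state for the local client.
print(classify_change(local_meta, None))                         # removed
print(classify_change(local_meta, None, local_state_kept=True))  # None
```

In this model the local client is reported as removed although its state was kept, which matches the scenario described by the contributor. Whether that is a bug in the reference implementation is left open in the conversation.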

@brichet
Contributor Author

brichet commented Oct 8, 2024

Thanks @davidbrochart.

Does it work on your side? I get errors when I try to use it:

      File "/home/brichet/projects/pycrdt/python/pycrdt/_awareness.py", line 136, in apply_awareness_update
        state_str = decoder.read_var_string()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/brichet/projects/pycrdt/python/pycrdt/_sync.py", line 227, in read_var_string
        return message.decode("utf-8")
               ^^^^^^^^^^^^^^^^^^^^^^^
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte

@davidbrochart it seems that we do not take into account the full length of the message (the first integer) when encoding and decoding.
This call was removing the first integer, which corresponds to the length in bytes of the message, but it was removed in a previous commit.
And when encoding, we should add the full length before returning the message (https://github.com/jupyter-server/pycrdt/pull/170/files#diff-d0908cdc750b8113ae6b4433e378db136aebe96a30e773aef2cf08313b6669d8R117). I observe the same error in the console.
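
The role of that leading integer can be shown with a self-contained sketch. It assumes the wire layout used by the y-protocols awareness encoding (a client count, then for each client an ID, a clock, and a JSON string, all using variable-length unsigned integers). Every function below is written for illustration and is not the pycrdt code; `frame` and `unframe` are hypothetical names for adding and stripping the length prefix.

```python
import json
from typing import Any


def write_var_uint(num: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    out = bytearray()
    while num > 0x7F:
        out.append(0x80 | (num & 0x7F))
        num >>= 7
    out.append(num)
    return bytes(out)


def read_var_uint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, position of the next unread byte)."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if byte < 0x80:
            return value, pos


def encode_update(states: dict[int, tuple[int, Any]]) -> bytes:
    """Encode {client_id: (clock, state)} as an awareness update body."""
    out = write_var_uint(len(states))
    for client_id, (clock, state) in states.items():
        raw = json.dumps(state).encode("utf-8")
        out += write_var_uint(client_id) + write_var_uint(clock)
        out += write_var_uint(len(raw)) + raw
    return out


def decode_update(update: bytes) -> dict[int, tuple[int, Any]]:
    count, pos = read_var_uint(update)
    states: dict[int, tuple[int, Any]] = {}
    for _ in range(count):
        client_id, pos = read_var_uint(update, pos)
        clock, pos = read_var_uint(update, pos)
        length, pos = read_var_uint(update, pos)
        states[client_id] = (clock, json.loads(update[pos:pos + length].decode("utf-8")))
        pos += length
    return states


def frame(update: bytes) -> bytes:
    # Prefix the update with its length in bytes (the "first integer").
    return write_var_uint(len(update)) + update


def unframe(payload: bytes) -> bytes:
    # Strip the length prefix before handing the update to the decoder.
    length, pos = read_var_uint(payload)
    return payload[pos:pos + length]


states = {42: (3, {"user": "alice"})}
payload = frame(encode_update(states))
assert decode_update(unframe(payload)) == states
```

If the decoder receives `payload` without the `unframe` step, every field is read one integer too early, so the bytes interpreted as the JSON string are no longer valid text. That is consistent with the `UnicodeDecodeError` in the traceback, although the exact cause in the PR is not confirmed here.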

@davidbrochart
Collaborator

Good catch!
The thing is, I don't know if it should be part of encode_awareness_update and apply_awareness_update. It doesn't seem to be the case in awareness.js. Since this PR is already a breaking change, maybe we should leave it as-is and do the job outside?

@davidbrochart
Collaborator

An alternative would be to pass from_message/to_message boolean parameters, that would do the extra work if True.

@brichet
Contributor Author

brichet commented Oct 8, 2024

An alternative would be to pass from_message/to_message boolean parameters, that would do the extra work if True.

I can't see a use case where we need an 'incomplete' encoded awareness (without the length).
Maybe we don't have to mimic exactly what is done in awareness.js if it is not relevant.

Also, the YMessageType is missing from the encoder, and again I don't really know when encode_awareness_update() would be called for another message type...

@davidbrochart
Collaborator

An update and a message are two different things, the latter encapsulates the former. For instance, Doc.get_update() is a document update that also has to be encapsulated in a message, using create_message().
So I think encode_awareness_update() and apply_awareness_update() are fine like that, they do what their name implies. But I can add a create_awareness_message(update) function (read_message() can be used to get an awareness update from a message).
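
The distinction between an update and a message can be sketched as follows. This is an illustration under assumptions, not the function that was added to pycrdt: the message type tag `1` for awareness is assumed from the y-websocket convention, and `wrap_awareness_update` and `unwrap_awareness_update` are hypothetical names chosen to avoid confusion with the real API.

```python
AWARENESS_MESSAGE = 1  # assumed tag, following the y-websocket convention


def write_var_uint(num: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    out = bytearray()
    while num > 0x7F:
        out.append(0x80 | (num & 0x7F))
        num >>= 7
    out.append(num)
    return bytes(out)


def read_var_uint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, position of the next unread byte)."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if byte < 0x80:
            return value, pos


def wrap_awareness_update(update: bytes) -> bytes:
    # Message = type tag + length of the update + the update itself.
    return bytes([AWARENESS_MESSAGE]) + write_var_uint(len(update)) + update


def unwrap_awareness_update(message: bytes) -> bytes:
    if message[0] != AWARENESS_MESSAGE:
        raise ValueError("not an awareness message")
    length, pos = read_var_uint(message, 1)
    return message[pos:pos + length]


update = b"\x01\x2a\x03\x02{}"  # one client, ID 42, clock 3, state "{}"
message = wrap_awareness_update(update)
assert unwrap_awareness_update(message) == update
```

Keeping the wrapper separate means the update encoder stays a direct counterpart of awareness.js, while the transport layer decides how the update is tagged and framed.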

Collaborator

@davidbrochart davidbrochart left a comment

I think it's good to go, any objection to merge @brichet?

@brichet
Contributor Author

brichet commented Oct 9, 2024

I think it's good to go, any objection to merge @brichet?

Yes, it looks good to me too.

@davidbrochart
Collaborator

Thanks @brichet.

@davidbrochart davidbrochart merged commit fd74268 into jupyter-server:main Oct 9, 2024
27 checks passed
@brichet brichet deleted the awareness branch October 9, 2024 09:26
@brichet
Contributor Author

brichet commented Oct 9, 2024

Thanks a lot for your help on this, @davidbrochart.

@davidbrochart
Collaborator

Released in v0.10.0.
