Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Taskchampion Python bindings #385

Open
illya-laifu opened this issue May 4, 2024 · 15 comments
Open

Taskchampion Python bindings #385

illya-laifu opened this issue May 4, 2024 · 15 comments

Comments

@illya-laifu
Copy link

Hello,

Really appreciate your work on taskwarrior/taskchampion!

With the release of a new way to interact with the app (via the rust library), python bindings are needed. It's a big request, so I come bearing gifts - a fork with a POC barebones attempt to connect with an existing database instance and task creation here.

It's very much a barebones POC, with two methods, to get the impression of the workload needed; right now the code lives in the py-lib directory, and can be assembled via maturin.

Here's the demo in python console as an easy test:
Screenshot_20240504_160351

Guess I am volunteering to work on this?

@djmitche
Copy link
Collaborator

djmitche commented May 5, 2024

That's fantastic! And, I think this approach looks good: interfacing directly from Python into the taskchampion crate. The taskchampion-lib crate was a bit of a mistake (soon to be replaced with use of CXX from Taskwarrior), so I'm glad you didn't choose to use that.

A few questions toward finding a way to make progress here;

  • Is there an example you're following, embedding a Rust library in a Python app?
  • How are PyO3 wrappers typically packaged and released? Are they usually developed in the same repo as the library?

Maybe a good next step is to get something that works into a PR and get that merged, with multiple TODO's scattered about. That can include README describing the intended implementation, and indicating that it's not yet stable or ready for use. Then we can get a few more issues filed and perhaps a few people can work on separate aspects in parallel.

@illya-laifu
Copy link
Author

To answer your questions:

  • I wasn't following any specific example, just what I saw in the taskchampion_lib crate as an inspiration, plus a healthy dose of docs from the pyo3. However, there are a number of crates that do could be used as an example, like tokenizers
  • I am not sure about the best practices for the PyO3 packaging, the tokenizers put all their bindings in a single subfolder called bindings, but it is possible to do this standalone as well. Packaging is done via the same tool used to instantiate and build the project - maturin. On build is builds a wheel package, that can be uploaded to pypi via maturin publish, or by using other publishing tools.

@illya-laifu
Copy link
Author

As per your suggestion, I will continue work in the forked repository, so we can possibly PR. This can be pulled into its own, separate repository later on, if need arises.

@djmitche
Copy link
Collaborator

djmitche commented May 6, 2024

Sounds good!

@illya-laifu
Copy link
Author

Ok, I think I have the the core. I'll create PR in a bit, documenting the install/test process.

Currently implemented and tested:

  • Annotation class and the getters/setters for the two fields.
  • Replica class (::dependency_map method needs more testing and polishing due to the Rc reference)
  • Status enum
  • Tag class
  • Task class *
  • WorkingSet class

Implemented but not tested:

  • DependencyMap class
  • Storage concrete classes.

TODO:

  • Be able to construct replica by passing in the specific Storage implementations (e.g. InMemoryStorage, SqliteStorage or users own implementation of Storage)
  • Current Task class is the immutable version of it, it would be nice to extend it to also include the methods from TaskMut and convert to and the mutable state in those method calls. It is impossible to implement the TaskMut via pyo3 due to the lifetime parameters.*
  • Be able to iterate over WorkingSet via next() and with for item in ws:, but this needs it to hold the reference to an iterator.
  • Python has it's own uuid library, currently the methods that accept a UUID, do that by accepting str and converting it into the rust's implementation of UUID.
  • Better documentation, but the Rust comments on the pyo3 annotated functions/objects directly translate into the docstrings.
  • Not entirely sure if it is possible (need to research), but it would be convenient to provide signature stub files along with the library, so the LSP's, linters and such would be able to infer the types.

@djmitche
Copy link
Collaborator

This is really cool! I will try to take a look, but I've been quite busy so I invite anyone else following along to also jump in and offer feedback.

@illya-laifu
Copy link
Author

Quick update on the todo items -- adding signature stub files is quite simple, just a bit tedious.

image

@djmitche
Copy link
Collaborator

I wonder if the new TaskData interface (#372) is easier to wrap into Python?

@illya-laifu
Copy link
Author

There seems to be a few API changes as well, with the deprecations of many Replica related operations, so I'll look into those, and bring up the deprecations up-to-date.

@illya-laifu
Copy link
Author

I am pretty much done with the updates, had to make a few changes due to API updates -

  • instead of passing around address of Operations for methods like create_task(&mut self, &mut Operations), I create the Operation internally, and return the Operation to the caller, so it's up to the developer to add it into an array and pass said array into commit_operations():
ops = []
task, op = empty_replica.create_task(uuid.uuid4())
ops.append(op)

op = task.set_status(Status.Pending)
ops.append(op)

....

empty_replica.commit_operations(ops)

tasks = empty_replica.all_task_uuids()
assert len(tasks) == 1

The only issue i have ran into so far - my previously written tests for the WorkingSet are now broken, and I haven't been able to figure out why yet.

@pytest.fixture
def working_set(tmp_path: Path):
    r = Replica(str(tmp_path), True)
    ops = []
    result = r.create_task(str(uuid.uuid4()))
    assert result is not None
    task, op = result
    ops.append(op)

    result = r.create_task(str(uuid.uuid4()))
    assert result is not None
    task, op = result
    ops.append(op)
    r.commit_operations(ops)

    return r.working_set() # For whatever reason this working set is empty, even though `r.all_task_uuids()` is not, and contains valid UUIDs. 

Out of desperation, I've tried to r.rebuild_working_set to no effect.

I've noticed that the C bindings are gone now, have they been moved to a separate repo, or removed? And if so - should this wrapper also be standalone?

@djmitche
Copy link
Collaborator

That's fantastic! I am glad to hear the new API worked well.

instead of passing around address of Operations...

This sounds like a great change. That's basically what the Rust code is doing, just in a more ergonomic way in that language.

Speaking of ergonomics, consider having create_task always return a 2-tuple, that is (None, None) on error? That avoids the need to use a temporary variable and later unpack:

(task, op) = r.create_task(...)
assert task is not None
...

my previously written tests for the WorkingSet are now broken

I suspect this is because the created tasks have no status - they are just a uuid with no properties. I suspect adding

ops.append(task.set_status(Status.Pending))

after each ops.append(op) would fix this issue.

I've noticed that the C bindings are gone now, have they been moved to a separate repo, or removed? And if so - should this wrapper also be standalone?

The story is: when I started on all of this, my thinking was that Taskchampion should have a single C interface, and all other languages would integrate against that. It turns out that's not really "how it's done", now that I've had a chance to see some other Rust FFI. Instead, most languages integrate directly with the Rust, even if there's some C involved in the implementation of that integration. And, for the most part C++ applications tend to build their own integrations that are specific to how they use the Rust library, while Python applications tend to use a Python package wrapping the Rust library.

So, the C bindings went away, and Taskwarrior now has its own C++ bindings (using https://cxx.rs) that are specific to how it wants to work with Taskchampion. Notably, it doesn't wrap the Task type, just TaskData. The cxx crate uses C ABIs internally, but they're pretty well hidden.

I don't have a strong opinion on whether the Python wrapper should be in this repo or not. A few thoughts:

  • In this repo:
    • CI can help us keep it working and up-to-date with TaskChampion
    • The wrapper will share maintainership with Taskchampion
    • Wrapper and TaskChampion can be released together and co-versioned
    • But, people working on this repo may know less about Python and make un-Pythonic decisions
  • In another repo:
    • Invites other language users to create and own similar repos without asking for permission (I think this is a good thing!)
    • Can be maintained and versioned independently of TaskChampion
    • Can easiliy be forked if some decision is not appropriate for some set of users (although this seems unlikely)
    • But, changes to TaskChampion might require toil by maintainers of this repo in order to "keep up", and there is the risk of the repo becoming umaintained

What would you prefer?

@illya-laifu
Copy link
Author

Speaking of ergonomics, consider having create_task always return a 2-tuple, that is (None, None) on error? That avoids the need to use a temporary variable and later unpack

That a good suggestion, the only problem is the Python's error handing - if an error occurs, let's say, during UUID parsing, the Python raises a RuntimeException. So there wouldn't be a return value to work with, as the program wound just quit on an error.

The signature of these functions is still not correct however - I foolishly copied over Rust signatures, forgetting about the raising of errors. Removing python's Optional from the signatures resolves this -

# before
def create_task(uuid: str) -> Optional[tuple[Task, Operation]]: ...

# after
def create_task(uuid: str) -> tuple[Task, Operation]: ...

I found the issue with my tests and partially my code - the set_status call creates not 1, but 2 Operations - First one sets modified timestamp, which otherwise is None and the second one actually modifies the status property.

# before 
def set_status(status: Status) -> Operation: ...

# after
def set_status(status: Status) -> list[Operation]: ...

Making this change and extending the ops list solves the issue with the tests. This is something I was afraid of when implementing the original return-the-operation approach - I wasn't sure if there was a situation where a single method call inserted multiple Operations instead of just one. But this should still be an easy fix, as all I have to do is return the entire list instead of a single Operation.


I started this project from a fork, because it was already being done with the C library. But there really is no reason to couple this wrapper with the Taskchampion as all it's doing is providing a convenience wrapper to public structs/methods - the same can be accomplished by using it as a dependency.

@djmitche
Copy link
Collaborator

But this should still be an easy fix, as all I have to do is return the entire list instead of a single Operation.

Oh, good find! Just to be sure: that will be necessary for all modifications, not just set_status.

Regarding the repository: would you like me to create a new repo in this org? Perhaps taskchampion-py?

@illya-laifu
Copy link
Author

Oh, good find! Just to be sure: that will be necessary for all modifications, not just set_status.

Yea, that was my plan all along.

Regarding the repository: would you like me to create a new repo in this org? Perhaps taskchampion-py?

Feel free to!
I already have a working repository that is reduced to just the inner module of the py-lib contents, with working tests.

@djmitche
Copy link
Collaborator

djmitche commented Sep 2, 2024

I've added you with write access to empty repo https://github.com/GothenburgBitFactory/taskchampion-py/. Feel free to push to that!

Please have a look at this repo for the necessary metadata files (license, code of conduct, etc.).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Backlog
Development

No branches or pull requests

2 participants