Logseq Sync

An attempt at an open-source version of the Logseq Sync service, intended for individual, self-hosted use.

It's vaguely functional (see What Works? below), but decidedly pre-alpha software. Don't point a real, populated Logseq client at it; I have no idea what will happen.

What's Done/Exists?

Right now, the repo contains (in cmd/server) a mostly implemented version of the Logseq API, including credentialed blob uploads, signed blob downloads, and a SQLite database for persistence. Most of the API surface is at least partially implemented.

Currently, running any of this requires a modified version of the Logseq codebase (here) and the @logseq/rsapi package (here).

On that note, many thanks to the Logseq Team for open-sourcing rsapi recently; it made this project significantly easier to work with.

What Works?

With a modified Logseq, you can use the local server to:

- Create a graph
- Upload (passphrase-encrypted) encryption keys
- Get temporary AWS credentials to upload your encrypted files to your private S3 bucket
- Upload your encrypted files

And that's basically the full end-to-end flow! The big remaining things are:

- Implement the WebSocket protocol
  - There's some documentation for it
- Figure out how/when to increment the transaction (tx) counter
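Since the rule for incrementing tx is still unknown, the following is only one plausible policy, sketched to make the open question concrete: the server bumps tx once per accepted batch of changes and rejects a client whose tx is out of date. The type and function names are hypothetical, and nothing here is confirmed behavior of the official service.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errStaleTx is returned when the client's view of the graph is out of date.
var errStaleTx = errors.New("client tx does not match server tx")

// graphTx holds the transaction counter for a single graph.
// Assumption: kept in memory here; a real server would persist it in SQLite.
type graphTx struct {
	mu sync.Mutex
	tx int64
}

// commit accepts a batch of changes only if clientTx equals the server's
// current tx, then increments the counter and returns the new value.
// On a mismatch it returns the server's current tx so the client can catch up.
func (g *graphTx) commit(clientTx int64) (int64, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if clientTx != g.tx {
		return g.tx, errStaleTx
	}
	g.tx++
	return g.tx, nil
}

func main() {
	g := &graphTx{}

	newTx, err := g.commit(0) // first client is up to date
	fmt.Println(newTx, err)

	newTx, err = g.commit(0) // second client still thinks tx is 0
	fmt.Println(newTx, err)
}
```

Whether the real service increments per file, per batch, or per some other unit is exactly what still needs to be worked out.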

API Documentation

There's some documentation for the API in docs/API.md. This is the area where I could most use more information and help; see Contributing below.

Open Questions

S3 API

The real Logseq Sync API gets temporary S3 credentials and uploads files directly to S3. I haven't looked closely enough to see if we can swap this out for something S3-compatible like s3proxy or MinIO; see #2 for a bit more discussion.

Currently, amazonaws.com is hardcoded in the client, so that'll be part of a larger discussion on how to make all of this configurable in the long run.
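As a rough idea of what "configurable" could mean, here is a minimal sketch that builds a path-style object URL from a user-supplied endpoint instead of a hardcoded amazonaws.com host. The `objectURL` function, the bucket and key names, and the localhost:9000 endpoint are assumptions for illustration; the actual client change would live in Logseq and rsapi, not in Go.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// objectURL builds a path-style URL (endpoint/bucket/key) for an object.
// The endpoint must include a scheme and host, e.g. "http://localhost:9000"
// for a local S3-compatible server.
func objectURL(endpoint, bucket, key string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("endpoint %q must include a scheme and host", endpoint)
	}
	// Any path on the endpoint itself is replaced; String() escapes as needed.
	u.Path = "/" + bucket + "/" + strings.TrimPrefix(key, "/")
	return u.String(), nil
}

func main() {
	// Hypothetical endpoints: one AWS-hosted, one self-hosted.
	for _, endpoint := range []string{
		"https://s3.us-east-1.amazonaws.com",
		"http://localhost:9000",
	} {
		u, err := objectURL(endpoint, "my-bucket", "graph-1/page.md")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u)
	}
}
```

Path-style addressing is used here because it avoids needing per-bucket DNS names on a self-hosted endpoint.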

Associated Changes to Logseq

Being able to connect to a self-hosted sync server requires some changes to Logseq as well, namely to specify where your sync server can be accessed. Those changes are in a rough, non-functional state here: https://github.com/logseq/logseq/compare/master...bcspragu:logseq:brandon/settings-hack

Adding a database migration

The self-hosted sync backend has rudimentary support for persistence in a SQLite database. We use sqlc to do Go codegen for SQL queries, and Atlas to manage generating diffs.

The process for changing the database schema looks like:

1. Update db/sqlite/schema.sql with your desired changes
2. Run ./scripts/add_migration.sh <name of migration> to generate the relevant migration
3. Run ./scripts/apply_migrations.sh to apply the migrations to your SQLite database

Why do it this way?

With this workflow, the db/sqlite/migrations/ directory is more or less unused by both sqlc and the actual server program. The reason it's structured this way is to keep a more reviewable audit log of the changes to a database, which a single schema.sql doesn't give you.

Contributing

If you're interested in contributing, thanks! I sincerely appreciate it. There are a few main avenues for contributions:

Getting official buy-in from Logseq

The main blocker right now is getting buy-in from the Logseq team, as I don't want to do the work to add self-hosting settings to the Logseq codebase if they won't be accepted upstream. I've raised the question on the Logseq forums, as well as in a GitHub Discussion on the Logseq repo, but have received no official response.

Understanding/documenting the API

One area where I would love help is specifying the official API more accurately. My API docs are based on a dataset of one: my own account. As a result, there are areas that are underspecified, unknown, or where I just don't understand the flow. Any help there would be great!

Specifically, I'd like to understand:

- The details of the WebSocket protocol (doc started here), and
- How and when to update the transaction counter (tx in the API)

Debugging S3 signature issues

I believe there's a bug (filed upstream, initially here) in the s3-presign crate used by Logseq's rsapi component, which handles the actual sync protocol bits (encryption, key generation, S3 upload, etc.).

The bug causes flaky uploads with self-hosted, AWS-backed (i.e. S3 + STS) servers, but I haven't had the time to investigate the exact root cause. The source code for the s3-presign crate is available here, but the GitHub repo itself doesn't appear to be public.
