Skip to content
This repository has been archived by the owner on Jan 9, 2023. It is now read-only.

Syncing for offline use not working #731

Closed
jkleinsc opened this issue Oct 12, 2016 · 25 comments
Closed

Syncing for offline use not working #731

jkleinsc opened this issue Oct 12, 2016 · 25 comments
Labels
🐛bug issue/pull request that documents/fixes a bug in progress indicates that issue/pull request is currently being worked on

Comments

@jkleinsc
Copy link
Member

Expected behavior:
When a user is online records should be synced to the user's browser for offline use.
Actual behavior:
Records are not synced for offline use.
Steps to reproduce:

  1. Make sure you don't have offline records already synced. Make sure the app isn't running in any tabs and then you can clear out any offline data stored by going to chrome://settings/cookies in a Chrome tab. You should see a screen like this:
    screen shot 2016-10-12 at 4 14 59 pm
    Type in localhost (or whatever host your app is running on) and you should see an entry like the one in the screenshot. Click on the entry and then click on the X on the right hand side of the screen.
  2. Make sure the app is running via ember serve or via hospitalrun-server.
  3. Navigate to the application and verify that records are available. For testing simplicity I would use the sample-data set here:
    https://github.com/HospitalRun/hospitalrun-frontend/blob/master/sample-data.txt
  4. Leave the app up for 5 minutes to make sure syncing has time to complete.
  5. Turn off the server and then check in the app if the records you saw while the server was running are still there.
    Screenshots (if applicable):
    N/A
    OS and Browser:
    Chrome 53.0.2785.143 (64-bit)
@jkleinsc jkleinsc added 🐛bug issue/pull request that documents/fixes a bug Offline help wanted indicates that an issue is open for contributions labels Oct 12, 2016
@jkleinsc
Copy link
Member Author

Just a couple notes about this issue:

  1. Syncing/offline use is limited to browsers that support serviceworker and Chrome is the preferred browser to use for offline use.
  2. The synchronization code is all encapsulated here:
    https://github.com/HospitalRun/hospitalrun-frontend/blob/master/app/serviceworkers/pouchdb-sync.js

Solving this issue requires knowledge of PouchDB and serviceworkers. If you have a question on the current code and/or those technologies, please feel free to ping me (@jkleinsc)

@gnowoel
Copy link
Contributor

gnowoel commented Oct 17, 2016

I'd like to try fixing this if no one else does. Not sure how long it'll take since I'm new to both PouchDB and Service Workers.

@mdarmanin
Copy link
Contributor

@tangollama I'd like to pick this one up

@mdarmanin
Copy link
Contributor

@jkleinsc I closed all the tabs while server was running, then cleared all the localhost cookies. I already had some entries so I didn't use the test data. Left the app running for 5 mins. Turned off the server and the records are still showing.

image

@gnowoel
Copy link
Contributor

gnowoel commented Oct 17, 2016

@mdarmani Perhaps you could also check this discussion out. I can confirm the problem mentioned.

@jkleinsc jkleinsc added in progress indicates that issue/pull request is currently being worked on and removed help wanted indicates that an issue is open for contributions labels Oct 17, 2016
@jkleinsc
Copy link
Member Author

@mdarmani since @gnowoel asked first, I'd like to give @gnowoel this issue @gnowoel it is yours unless for some reason you have changed your mind about it.

A couple of notes:

  1. Serviceworkers do not run on a hard refresh (eg cmd+shift+r/ctrl+shift+r)

  2. I suspect the issue may be tied to an upgrade to a newer version of PouchDB. We recently upgraded from 5.3.2 to 5.4.5

  3. Part of the issue may have to do with the fact that the sync process between pouch (local offline in browser store) and couch (server side db store) runs through the service worker proxies so I think it is possible if the network request fails on a sync we could end up in a situation where the service worker code falls back to the local pouchdb and ends up syncing the pouchdb to itself instead of to the server side couchdb. I'll explain a little bit more below how the service worker code is designed:

  4. If you look at the top of the service worker code you'll see code like this:

toolbox.router.get('/db/main/_all_docs', function(request, values, options) {

This code is using https://github.com/GoogleChrome/sw-toolbox to add network proxies for calls to couchdb. These calls try to get the data from the network, but if that fails they grab the data from the local pouchdb
2) The setupRemoteSync function sets up the syncing between pouchdb and couchdb. The one thing we should do is set a header in the pouchOptions.ajax.headers that notes that the request originated from the local pouchdb, eg something like:

pouchOptions.ajax.headers['x-local-pouch-request'] = true;

and then in the couchDBResponse function we can check to see if that header is set and if it is and the network request fails, that failure should be returned instead of calling runPouchFn
This would make sure that we aren't running into the issue where a failed server side sync is causing local pouch to sync with itself. There may be other issues, but this is at least one thing that needs improvement/fixing.

@gnowoel
Copy link
Contributor

gnowoel commented Oct 17, 2016

@jkleinsc Thanks for your guidance! Will check it out.

@gnowoel
Copy link
Contributor

gnowoel commented Oct 20, 2016

@jkleinsc No fix yet. Just some random findings :)

In my experiments, I tried to directly wire the local Pouch to the remote Couch, but the problem still persists. Considering the requests triggered within a service worker will not go through the proxy routes, I guess the reason should not be self-syncing.

From what I saw, the initial replication always worked, but it then just paused indefinitely. I tried many ways to rule out the unrelated facts, like writing a service worker by hand, upgrading PouchDB, and even disabling Ad blocker. But no matter what, the syncing still doesn’t work.

Then I moved the syncing part to the web page (outside of the service worker), and suddenly everything just went fine.

So I suspect it has to do with how we use PouchDB. We currently create the PouchDB database in two places, one in the web page for querying database, and the other in the service worker for syncing (and falling back, but never used I believe). Although we use the same name when creating and they are actually the same instance in the IndexdDB, but this arrangement might break syncing somehow. Not sure why though, because it should be legal to use one database in multiple tabs. Is it missing change events? Did it work before?

If this is what happened, I think we have at least two choices:

  • Use a proxy library (a fake PouchDB) for querying database, and the requests would be delegated to the real PouchDB in the service worker, which syncs with the remote CouchDB;
  • Use remote CouchDB for querying database exclusively, and intercept requests in the service worker for offline usage, which is when PouchDB comes into play.

I learned the first approach from worker-pouch, and the second one seems to be what are currently using. No idea which is better. But I think we need to remove the PouchDB from the web page either way.

But is it really the root of the problem? What's your thoughts on this?

@jkleinsc
Copy link
Member Author

@gnowoel thanks for the research. The local db created in the web page

return this._createLocalDB('localMainDB', pouchOptions);
is only a fallback for non service worker browsers and should never fire if seviceworkers are working. I guess we could verify whether or not this is happening or not via console.log or debug.

The code in the service worker basically does the same thing as worker-pouch except it doesn't do any of the work in the web page thread.

@jkleinsc
Copy link
Member Author

@gnowoel I just retested this and everything worked correctly for me offline. Can you retest and see if it works for you with the latest?

@gnowoel
Copy link
Contributor

gnowoel commented Oct 24, 2016

@jkleinsc In my case, the initial syncing always works, but the subsequent changes from either end won't be populated to the other one. I'm experimenting with a minimal setup (only index.html and sw.js) with various combinations, but I haven't got an answer yet.

@gnowoel
Copy link
Contributor

gnowoel commented Oct 25, 2016

@jkleinsc I still see the same problem in my mini setup. It works for the initial replication, but no syncing happens on data change afterwards.

Before testing, I first clear the data and unregister any existing Service Worker. Then I do a hard reload (Shift-Command-R) followed by a normal reload (Command-R). When the Service Worker is running, I manage the local PouchDB through the exposed localMainDB variable in Chrome's Developer Tools, like:

localMainDB.info().then(function(info) {
  console.log(info);
});

This way, I can change a record and see if it's populated to the remote CouchDB.

I can also do it in the other way around, by manually updating a record in CouchDB with Futon/Fauxton, and checking if it has been pushed back to the local PouchDB in the Service Worker's console window.

I check the status of the Service Worker in this page: chrome://serviceworker-internals/, and open a console window from here: chrome://inspect/#service-workers.

No idea about why syncing doesn't work in my testing. Does it have to do with the fact that a Service Worker is stoppable? It would be nice if you could take a look at the sw.js file in my mini setup, because I believe the root of the problem is hiding somewhere there.

@jkleinsc
Copy link
Member Author

@gnowoel as far as I can tell sw.js looks ok.

I asked on the PouchDB slack about this issue and here is what Nolan Lawson (one of the PouchDB maintainers) had to say:
From https://pouchdb.slack.com/archives/general/p1477069584001156
nolanlawson [1:06 PM]
jkleinsc I've never heard of that before, but in principle SWs can be terminated at any time by the browser so you have to prepare for that situation. the browser can shut it down for performance reasons; it's not guaranteed to be constantly running
if you need it to be constantly running then the best bet is Background Sync, which is a bit of a misnomer but basically just means "wake PouchDB up when the server says there's a change"

jkleinsc [1:40 PM]
nolanlawson are you talking about serviceworker background sync (eg https://developers.google.com/web/updates/2015/12/background-sync) or is there some convention in pouchdb for background sync?

nolanlawson [1:43 PM]
I was talking about this, yeah ^

[1:44]
I would have written a library for PouchDB, but I think honestly the best bet is 1) server pokes the browser 2) pouchdb does a single-shot replication 3) pouchdb waits for server to poke it again
which doesn't need a library

@broerse
Copy link

broerse commented Oct 25, 2016

Not sure about this but perhaps check if the ember-pouch unloadedDocumentChanged is fired when records should sync?

@gnowoel
Copy link
Contributor

gnowoel commented Oct 26, 2016

Nolan Lawson's suggestion is truly inspiring. Will see what I can do with the one-shot approach.

@gnowoel
Copy link
Contributor

gnowoel commented Oct 28, 2016

@jkleinsc I managed to make the Background Sync approach basically work and here's how.

In the web page:

  • Create local and remote databases
  • Sync databases for the first time
  • Watch for changes and trigger sync events
  • Register a service worker

And in the service worker:

  • Create local and remote databases
  • Capture the sync event and sync databases

This arrangement however has some known problems:

  • We have PouchDB instances in both the web page and the service worker, but the sync event will only be triggered when changes of data happen in the web page (i.e. changes from within the service worker will be ignored).
  • I tried to do the initial sync in the service worker, but failed with a 500 error (due to some _local records missing if we use db.replicate(), or empty response if we use db.sync()).

A sample app can be found here.

This is just for your consideration and I have no idea if it's applicable to HospitalRun. I think I need your guidance on what to do next.

@jkleinsc
Copy link
Member Author

jkleinsc commented Oct 31, 2016

@gnowoel thanks for your work on this. I have a couple comments:

  1. Ideally the local pouchdb should only be in the SW. I'd like to see some more details about the sync failing in the SW.
  2. I think the solution for this problem will probably require a multi faceted approach using background sync as well as possibly using the push API (eg https://serviceworke.rs/push-simple.html)
  3. I think instead of using one time sync events, periodic sync may be the better solution because we could trigger synchronization on a regular schedule instead of registering an event for every change (eg we wouldn't need the changes listener on the local pouchdb). More details on period sync are available here: https://github.com/WICG/BackgroundSync/blob/master/explainer.md#periodic-synchronization-in-design
  4. I think I need to spend some more time thinking about how we solve this problem. I'm attending Offline Camp this weekend and I'm hoping to have some time then to think and talk about this problem with others.

@jkleinsc
Copy link
Member Author

@gnowoel I have been meaning to get back to you on this but I have been pretty buried lately. Here's an update based on learnings from Offline Camp.

  1. It turns out that periodic sync is not supported by any browsers at the current time, so it is not an option for a solution.
  2. One of the reasons that my original solution was not working is because global scope in a serviceworker is unreliable at best:
    PouchDB continuous replication fails inside of a service worker pouchdb/pouchdb#5851 (comment). I think this also explains why you couldn't do the initial sync within a serviceworker. Also, that initial sync should be done on the SW install event, eg:

javascript
this.addEventListener('install', function(event) {
let localMainDB = new PouchDB('localMainDB');
let remoteDB = new PouchDB('http://couchadmin:test@localhost:5984/main');
localDB.sync(remoteDB);
});

1. We can use background sync(https://wicg.github.io/BackgroundSync) and the Push API (https://w3c.github.io/push-api/) to managing syncing within a SW.  Background sync can be initiated from either the UI thread or the SW thread.  Firing a background sync event from the SW is accomplished via the following:

``` javascript
self.registration.sync.register('sync');

@gnowoel
Copy link
Contributor

gnowoel commented Nov 15, 2016

@jkleinsc Thanks for the update and sorry for my late. I haven't got a chance to look into this issue until today.

The Background Sync approach used to work in my test app, but somehow the sync events are never triggered in today's experiments. So, before I can figure out how to integrate Push API and other ideas you mentioned above, I'd like to try making Background Sync work first.

No idea what happened. But will report back to you as soon as I've made any progress.

@gnowoel
Copy link
Contributor

gnowoel commented Nov 15, 2016

@jkleinsc I couldn't get Background Sync working again. So I downloaded the official Background Sync demo and hoped to find out where I did wrong. But to my surprise, the app failed on all Chrome, Opera and Firefox on my localhost. And when it failed, Chrome didn't output any error message.

However, the online version works fine for Chrome.

My OS and browsers are up-to-date:

  • macOS Sierra 10.12.1
  • Google Chrome 54.0.2840.98 64-bit
  • Opera 41.0.2353.56
  • Firefox 50.0

@jkleinsc
Copy link
Member Author

@gnowoel one thing I forgot to mention is that Background Sync is only currently supported in Chrome, so until other browsers implement it, it won't work there.

As far as the demo not working, that is strange. Maybe it had to do with running on localhost? The one thing I observed was that I needed to toggle my internet connection in order to test the background sync vs just turning off the web server. The spec seems to indicate that it is tied to network connection, not server reachability:

The user agent is considered to be online if the user agent has established a network connection. A user agent MAY use a stricter definition of being online. Such a stricter definition MAY take into account the particular service worker or origin a sync registration is associated with.
From: https://wicg.github.io/BackgroundSync/spec/#concepts

In theory you could polyfill background sync in other browsers by listening for the online event (see https://developer.mozilla.org/en-US/docs/Online_and_offline_events) and then sending a message to the service worker to fire the sync event.

@gnowoel
Copy link
Contributor

gnowoel commented Nov 16, 2016

@jkleinsc Toggling network connection did the trick (Guess it's becuase of my VPN or something). Now the sync events get fired normally. Thanks!

(To run the Background Sync demo app locally, I also need to adjust the scope when registering the service worker.)

If you don't mind, here I have some stupid questions to ask.

Push API

As far as I know, Push API would basically do the same thing as db.changes(). Considering Push API needs server-side cooperation, wouldn't it be simpler to just use db.changes() in the main thread, at least as the first step?

Primary database

It looks like we’re using remote CouchDB as the primary database. Is there any specific reason that we can't use local PouchDB directly and let it worry about how to sync behind the scene? That way, we don't have to deal with database switching and the timing gap caused by it. (Say we add a record online, and go offline immediately. The newly added record would disappear because it has not been synced back to the local database yet.)

Thoughts

I'm thinking if we could do the following?

  • Using PouchDB as the only database in the main thread
  • Watching data changes from both databases with db.changes() and firing sync events from the main thread
  • Listening on sync events and sync databases in the Service Worker

Since we don't connect to the remote CouchDB directly, there's no need to intercept the query requests in the Service Worker. Now the responsibility of the Service Worker would be:

  • Caching static assets
  • Syncing databases

@jkleinsc
Copy link
Member Author

@gnowoel I think maybe a little background might help explain things further.
When I first started working on HospitalRun, I started off with the local database being the primary database that would then sync changes to the remote CouchDB.
What I found out was that when the dataset started to get large, clients were having trouble syncing, so I ran into issues where users had changes locally that were not getting syncing to the remote CouchDB database and browsers were "freezing" during sync. In a one user system this would be ok, but since HospitalRun is multi-user it meant that other users were not seeing patients entered by other users.

As I investigated this problem further, I found out that IndexedDB (which PouchDB uses behind the scenes) blocks the DOM: https://nolanlawson.com/2015/09/29/indexeddb-websql-localstorage-what-blocks-the-dom/

Given this, I decided two things:

  1. To fix an immediate problem in a hospital, the primary database would be the remote CouchDB database. This meant not as good offline support but in a networked hospital environment everyone would be able to have access to the "latest" records.
  2. In order to bolster offline support dropped/impacted by moving the primary db to the remote CouchDB, I added service worker code to handle offline. Doing sync in the service worker means that any IndexedDB work gets done in a separate thread, which means there isn't extra load on the UI (DOM manipulating/main) thread.

Long term what I would like to see is a way of users selecting what mode they want to run in. For example:

  1. Offline first - make local pouchdb primary database.
  2. Network first - make remote couchdb primary database.
  3. Other - maybe strategy like "fastest" response between network and offline

Does that make sense?

As far as the PUSH API is concerned, the difference is who is listening to the changes feed and which db is the "primary database". You could in the browser listen for changes on the remote DB, but that would require a constant polling to pull in changes vs the server initiating a push when a change actually happens.

@gnowoel
Copy link
Contributor

gnowoel commented Nov 16, 2016

@jkleinsc Yes, it makes sense. Thanks for the detailed explanation! Will see what I can do...

@gnowoel
Copy link
Contributor

gnowoel commented Nov 21, 2016

@jkleinsc I updated the code with PR #820. It's by no means a complete solution, but I was thinking if we could:

  • fix the upfront bug first, and
  • make as few changes as possible.

What I did is just making Background Sync work, and in my simple tests, the problem seems to be fixed.

My implementation looks somewhat naive, because I use global variables for sharing database instances. There must be a better way, but I just don't know.

The initial sync happens in the main thread, because I can't do it in the service worker. Tried to put the code in different places, inside or outside of a function, in the install or activate handler, but nothing worked.

To check it out, perhaps you could add or change some records offline, and then go online and refresh the page. After a while, the databases should be synced.

Hope this is good enough at least for now.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
🐛bug issue/pull request that documents/fixes a bug in progress indicates that issue/pull request is currently being worked on
Projects
None yet
Development

No branches or pull requests

5 participants