Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV when attempting to resume a Puased syncSession #6558

Open
muddy-alex opened this issue Mar 15, 2024 · 19 comments
Open

SIGSEGV when attempting to resume a Puased syncSession #6558

muddy-alex opened this issue Mar 15, 2024 · 19 comments
Labels
O-Community T-Bug Waiting-For-Reporter Waiting for more information from the reporter before we can proceed

Comments

@muddy-alex
Copy link

How frequently does the bug occur?

Always

Description

I'm trying to force Realm to perform a sync of all offline changes (it's an asymmetric table).

According to the documentation, we should be able to call realm.syncSession.pause() and realm.syncSession.resume() to perform this operation, however, when calling resume() a SIGSEGV error is thrown.

Is this the correct way to force Realm to perform a sync with AppSync?

Stacktrace & log output

No response

Can you reproduce the bug?

Always

Reproduction Steps

No response

Version

12.6.2

What services are you using?

Both Atlas Device Sync and Atlas App Services

Are you using encryption?

No

Platform OS and version(s)

OSX/Linux

Build environment

NodeJS

Cocoapods version

No response

Copy link

sync-by-unito bot commented Mar 15, 2024

➤ PM Bot commented:

Jira ticket: RJS-2762

@kneth
Copy link
Contributor

kneth commented Mar 17, 2024

@muddy-alex There is no API to force sync as the sync client works mostly autonomously. pause()/resume() can trigger the sync client to take an extra round trip.

If it is possible, please share the stack trace. resume() should not cause a SIGSEGV.

@sync-by-unito sync-by-unito bot added the Waiting-For-Reporter Waiting for more information from the reporter before we can proceed label Mar 17, 2024
@muddy-alex
Copy link
Author

No stack trace, unfortunately.

Pause/Resume seems to work before the initial sync has been completed. However, once it has (based on the realm.syncSession.addConnectionNotification event handler being called), calling pause/resume causes the process to abruptly exit.

This is all I get...

Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)

@github-actions github-actions bot added Needs-Attention Reporter has responded. Review comment. and removed Waiting-For-Reporter Waiting for more information from the reporter before we can proceed labels Mar 18, 2024
@kneth
Copy link
Contributor

kneth commented Mar 20, 2024

Could be related to realm/realm-core#7349? It is possible for you to downgrade? If it works with an earlier version (12.4.0 or older), it could be an indication for us.

@sync-by-unito sync-by-unito bot added Waiting-For-Reporter Waiting for more information from the reporter before we can proceed and removed Needs-Attention Reporter has responded. Review comment. labels Mar 20, 2024
@muddy-alex
Copy link
Author

We were originally testing with 12.3.1 when we found the issue, but upgraded to 12.6.2 to check if it was resolved

@github-actions github-actions bot added Needs-Attention Reporter has responded. Review comment. and removed Waiting-For-Reporter Waiting for more information from the reporter before we can proceed labels Mar 20, 2024
@nirinchev
Copy link
Member

Let me get this straight - initial sync is happening, but you have created some data offline and don't want to wait for initial sync to complete before that data gets uploaded, which is why you're pausing and resuming the session?

As Kenneth said, even though this is likely not the intended flow, we shouldn't be crashing. But understanding the circumstances around the crash will allow us to repro it.

@nirinchev nirinchev added More-information-needed More information is needed to progress. The issue will close automatically in 2 weeks. and removed Needs-Attention Reporter has responded. Review comment. labels Mar 20, 2024
@nirinchev nirinchev self-assigned this Mar 20, 2024
@sync-by-unito sync-by-unito bot added the Waiting-For-Reporter Waiting for more information from the reporter before we can proceed label Mar 20, 2024
@muddy-alex
Copy link
Author

muddy-alex commented Mar 21, 2024

We only have specific time windows on internet access, which is why we need some kind of 'force-sync' feature as we've found that the NodeJS Client doesn't seem to do any further synchronisation during the lifetime of the process (tested over 2 days).

We've found that the initial sync seems to be restricted to 10MB upload, taking the short time window of internet access into account, we would like this 'force-sync' feature to keep pushing up the 10MB chunks until all offline data has been uploaded.

@github-actions github-actions bot added Needs-Attention Reporter has responded. Review comment. and removed More-information-needed More information is needed to progress. The issue will close automatically in 2 weeks. Waiting-For-Reporter Waiting for more information from the reporter before we can proceed labels Mar 21, 2024
@nirinchev
Copy link
Member

Thanks for that context - I'll chat with our sync team and see what they say.

@nirinchev
Copy link
Member

Hey, so following that conversation, here's a few more questions.

  1. What do you mean by "initial sync" here? Can you clarify what the flow looks like in your app - possibly with some example code snippets. It doesn't have to be the exact thing, but more for illustrative purposes - e.g.:
realm.subscriptions.update(...) // Subscribe for new data
realm.write(() => {
  for (var i = 0; i < 100; i++) {
    realm.create(...); // Create some offline data
  }
});
// User goes into internet accessible window
realm.syncSession.resume();
// wait for something?
realm.syncSession.pause();
  1. Where does the 10mb limit come from? There's nothing in the sync protocol that we're aware of that puts 10 mb limit on uploads and there's nothing we're aware of that would cause pausing/resuming session to force sync and have it behave differently compared to the session just being active.
  2. What is the connection state you're waiting for using addConnectionNotification. This API is intended to let you know that a connection is established, but provides no information whatsoever about the state of synchronization. How are you using it and what information are you inferring from the connection state?
  3. If you can consistently reproduce the crash, can you configure logging to log at trace level and get us the logs from a run that reproduces the crash?

@nirinchev nirinchev added More-information-needed More information is needed to progress. The issue will close automatically in 2 weeks. Waiting-For-Reporter Waiting for more information from the reporter before we can proceed and removed Needs-Attention Reporter has responded. Review comment. labels Mar 21, 2024
@danieltabacaru
Copy link

@muddy-alex There is no way currently to force-sync the local data: we first download any data from the server, and then we start uploading the local data. Pausing and resuming the session does not change this behavior in any way. We could add a config option to change that though. If you only have asymmetric tables in the schema, then there is nothing to download so upload would start immediately.

@muddy-alex
Copy link
Author

muddy-alex commented Mar 22, 2024

@kneth I can replicate the SIGSEGV here...

import Realm from 'realm';

const ATLAS_APP_ID = '';
const ATLAS_API_KEY = '';

export const Point = {
  name: 'Point',
  embedded: true,
  properties: {
    type: { type: 'string', default: () => 'Point' },
    coordinates: { type: 'list', objectType: 'double' }
  },
};

class Log extends Realm.Object {
   static schema = {
    name: 'Log',
    asymmetric: true,
    primaryKey: '_id',
    properties: {
      _id: 'objectId',
      timestamp: 'date',
      organisation: 'objectId',
      key: 'string',
      value: 'mixed?[]',
      metadata: 'mixed?{}',
      location: 'Point',
      locationFix: 'int'
    }
  };
}


const app = new Realm.App({ id: ATLAS_APP_ID });

Realm.App.Sync.setLogLevel(app, "trace");

const user = app.currentUser
  ? app.currentUser
  : await app.logIn(Realm.Credentials.apiKey(ATLAS_API_KEY));

const realm = await Realm.open({
  schema: [Log, Point],
  sync: {
    cancelWaitsOnNonFatalError: false,
    existingRealmFileBehavior: {
      type: Realm.OpenRealmBehaviorType.OpenImmediately,
      timeOut: 5000,
      timeOutBehavior: Realm.OpenRealmTimeOutBehavior.OpenLocalRealm
    },
    newRealmFileBehavior: {
      type: Realm.OpenRealmBehaviorType.OpenImmediately,
      timeOut: 5000,
      timeOutBehavior: Realm.OpenRealmTimeOutBehavior.OpenLocalRealm
    },
    flexible: true,
    onError: (session, error) => {
      console.error('Sync Error!');
      console.error(error);
    },
    user
  }
});

realm.syncSession.addConnectionNotification(() => {
  console.log('Sync Connected!');
});

realm.syncSession.addProgressNotification(Realm.ProgressDirection.Download, Realm.ProgressMode.ForCurrentlyOutstandingWork, (transferred, transferable) => {
  console.log(`Sync Downloading: ${transferred}/${transferable}`);
});

realm.syncSession.addProgressNotification(Realm.ProgressDirection.Upload, Realm.ProgressMode.ForCurrentlyOutstandingWork, (transferred, transferable) => {
  console.log(`Sync Uploading: ${transferred}/${transferable}`);
});

setTimeout(async () => {
  console.log('Force Sync');
  await realm.syncSession.pause()
  await realm.syncSession.resume()
}, 15 * 1000);

@github-actions github-actions bot added Needs-Attention Reporter has responded. Review comment. and removed More-information-needed More information is needed to progress. The issue will close automatically in 2 weeks. Waiting-For-Reporter Waiting for more information from the reporter before we can proceed labels Mar 22, 2024
@muddy-alex
Copy link
Author

@danieltabacaru it was the documentation here that lead me to believe that this might be possible

@muddy-alex
Copy link
Author

muddy-alex commented Mar 22, 2024

@nirinchev When I mention to 'initial sync', I am referring to when Realm uploads offline data for the first time the process starts. Based on my reproduction above, when you see

Sync Connected!
Sync Downloading: 0/0
Sync Uploading: 0/201
Sync Uploading: 201/201

We've found that when this initial sync runs, only a chunk (I can't remember why I assumed it was 10MB) is uploaded - rather than all of the offline data. This meant that we had to restart the process a few times before all data was pushed up to AppSync, hence the reason we are looking for a 'force-sync' solution.

@danieltabacaru
Copy link

@muddy-alex I believe the documentation is correct. Forcing sync in that context is forcing connecting to the server, but the behavior is as I described it.

I wrote a similar test and I see no issues with it. I cannot see where class Point is declared in your example. The one you have there looks like a schema. Maybe @kneth can have a look.

@danieltabacaru
Copy link

Is there any code following the resume? Maybe something is not cleaned-up properly. You can try to wait for upload completion.

@sync-by-unito sync-by-unito bot added Waiting-For-Reporter Waiting for more information from the reporter before we can proceed and removed Needs-Attention Reporter has responded. Review comment. labels Mar 22, 2024
@muddy-alex
Copy link
Author

@danieltabacaru Nothing follows the resume(). As soon as its called we get the SIGSEGV

@github-actions github-actions bot added Needs-Attention Reporter has responded. Review comment. and removed Waiting-For-Reporter Waiting for more information from the reporter before we can proceed labels Mar 22, 2024
@kneth
Copy link
Contributor

kneth commented Apr 3, 2024

@muddy-alex You are mixing class-based models (Log) and JS schemas (Point), and I wonder if it has a negative impact. I mean, our tests are either-or, so you might be in the land of undefined behavior.

@sync-by-unito sync-by-unito bot added Waiting-For-Reporter Waiting for more information from the reporter before we can proceed and removed Needs-Attention Reporter has responded. Review comment. labels Apr 3, 2024
@muddy-alex
Copy link
Author

muddy-alex commented Apr 4, 2024

I couldn't find a documented way of using Embedded objects with the Class syntax. Updating the above example to just use JS schemas still results in the SIGSEGV.

@kneth
Copy link
Contributor

kneth commented Apr 5, 2024

@muddy-alex Thank you to getting back. I'll bring up the issue at the next team meeting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-Community T-Bug Waiting-For-Reporter Waiting for more information from the reporter before we can proceed
Projects
None yet
Development

No branches or pull requests

4 participants