Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FLX subscriptions are not re-sent to the server if bootstrap is interrupted #7077

Closed
sync-by-unito bot opened this issue Oct 23, 2023 · 4 comments · Fixed by #7096
Closed

FLX subscriptions are not re-sent to the server if bootstrap is interrupted #7077

sync-by-unito bot opened this issue Oct 23, 2023 · 4 comments · Fixed by #7096
Assignees

Comments

@sync-by-unito
Copy link

sync-by-unito bot commented Oct 23, 2023

SubscriptionStore::get_next_pending_version() returns the next subscription that needs to be sent to the server with a snapshot version past the upload cursor. Empty uploads may be acknowledged during bootstraps and so that will advance the upload cursor past the snapshot version of the subscription. If the subscription bootstrap is interrupted, it is not picked up by the sync client the next time get_next_pending_version() is called (and as a side effect a notification on that subscription will no be fulfilled).

@sync-by-unito
Copy link
Author

sync-by-unito bot commented Oct 24, 2023

➤ danieltabacaru commented:

[~[email protected]] I cannot reproduce this yet, do you have the logs when it fails?

@ironage
Copy link
Contributor

ironage commented Oct 24, 2023

Here's the logs to demonstrate the hang without the fix of the "flx: interrupted bootstrap restarts/recovers on reconnect" built with multiplexing disabled.

Connection[3]: Session[3]: Begin processing pending FLX bootstrap for query version 1. (changesets: 1, original total changeset size: 1073125)
DB: 5313 Thread 0x16bd9b000: acquire writemutex
DB: 5313 Thread 0x16bd9b000: writemutex acquired
DB: 5313 Thread 0x16bd9b000: Start write 1424: 14 ref 9456
DB: 5313 Thread 0x16bd9b000: Start read 1477: 14 ref 9456
DB: 5313 Thread 0x16bd9b000: End transaction 1477
Connection[3]: Session[3]: Incomplete pending bootstrap found for query version 1
DB: 5313 Thread 0x16bd9b000: writemutex released
DB: 5313 Thread 0x16bd9b000: End transaction 1424
DB: 5313 Thread 0x16bd9b000: acquire writemutex
DB: 5313 Thread 0x16bd9b000: writemutex acquired
DB: 5313 Thread 0x16bd9b000: Start write 1477: 14 ref 9456
DB: 5313 Thread 0x16bd9b000: clearing pending bootstrap store of size: 1 transaction number 14, versions: {1}
DB: 5313 Thread 0x16bd9b000: Initiate commit version: 15
DB: 5313 Thread 0x16bd9b000: flx_subscription_sets state: [{"_key":0,"version":0,"state":3,"snapshot_version":3,"error":null,"subscriptions":[]},{"_key":1,"version":1,"state":2,"snapshot_version":7,"error":null,"subscriptions":[{"_key":0,"id":"6532d3894c4b386f670bfbeb","created_at":"2023-10-20 19:22:49.009482000","updated_at":"2023-10-20 19:22:49.009482000","name":null,"object_class":"TopLevel","query":"TRUEPREDICATE"}]}]

DB: 5313 Thread 0x16bd9b000: flx_pending_bootstrap state: []

DB: 5313 Thread 0x16bd9b000: Commit of size 3168 done in 343 us ref 9680
DB: 5313 Thread 0x16bd9b000: writemutex released
DB: 5313 Thread 0x16bd9b000: End transaction 1477
DB: 5313 Thread 0x16bd9b000: Start read 1032: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: End transaction 1032
Connection[3]: Session[3]: Progress handler called, downloaded = 75, downloadable(total) = 75, uploaded = 78, uploadable = 78, reliable_download_progress = false, snapshot version = 15
DB: 5313 Thread 0x16bd9b000: Start read 1032: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: End transaction 1032
DB: 5313 Thread 0x16bd9b000: Start read 1032: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: End transaction 1032
Connection[3]: Connecting to 'ws://localhost:9090/api/client/v2.0/app/flx_bootstrap_reconnect-tsgua/realm-sync'
WebSocket::Websocket()
Resolving 'localhost:9090'
Connecting to endpoint '::1:9090' (1/2)
Connected to endpoint '::1:9090' (from '::1:56486')
WebSocket::initiate_client_handshake()
HTTP request =
GET /api/client/v2.0/app/flx_bootstrap_reconnect-tsgua/realm-sync?baas_at=eyJhbGciOiJSUzI1NiIsImtpZCI6IjY1MzJkMzc3ZmQ1MzAwOWE0OWVjY2NmNSIsInR5cCI6IkpXVCJ9.eyJiYWFzX2RldmljZV9pZCI6IjY1MzJkMzdhZmQ1MzAwOWE0OWVjY2U2ZSIsImJhYXNfZG9tYWluX2lkIjoiNjUzMmQzNzdmZDUzMDA5YTQ5ZWNjY2YxIiwiZXhwIjoxNjk3ODMxNTU0LCJpYXQiOjE2OTc4Mjk3NTQsImlzcyI6IjY1MzJkMzdhZmQ1MzAwOWE0OWVjY2U3MSIsInN0aXRjaF9kZXZJZCI6IjY1MzJkMzdhZmQ1MzAwOWE0OWVjY2U2ZSIsInN0aXRjaF9kb21haW5JZCI6IjY1MzJkMzc3ZmQ1MzAwOWE0OWVjY2NmMSIsInN1YiI6IjY1MzJkMzdhZmQ1MzAwOWE0OWVjY2U2YiIsInR5cCI6ImFjY2VzcyJ9.Upu2a9ZwqO9USV7o2D0ES-vLbdgRirD9Zq_GiGbDXVKsNJfNble5zX2jlFORDRwp8udyXI0nmvLScmaQ5Iyi6JI9UVTpsyb-VXQ_tQUwSX4YIoobvKIBrxim4YKfUh5A-j8QQvfQzLj_2F_YBsvlalBcoCrMPbl2o3tLJh9l9u9Wgrk7yOroBhQxedfCiJNHp2DGyzAkR2yMxnEp14FmssOajwa_d0jkjioiCTl6IsmJP83fyNQ5sbz72ryI-dk2wCnqzuLRmXCV-nv7gcGuu9OaEyoaNHDpo-TsZ6AC8wDv9ic6oPew5PQYA26OYYWzQbyj3XQVOa4wOBySQe2rzQ HTTP/1.1
Host: localhost:9090
Connection: Upgrade
Sec-WebSocket-Key: Lz3KrOEs4Mi+1GzZ9liI8g==
Sec-WebSocket-Protocol: com.mongodb.realm-query-sync#10, com.mongodb.realm-query-sync#9, com.mongodb.realm-query-sync#8, com.mongodb.realm-query-sync#7, com.mongodb.realm-query-sync#6, com.mongodb.realm-query-sync#5, com.mongodb.realm-query-sync#4, com.mongodb.realm-query-sync#3, com.mongodb.realm-query-sync#2
Sec-WebSocket-Version: 13
Upgrade: websocket
User-Agent: RealmSync/13.23.1 (macOS Darwin 22.6.0 Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:28 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T6000 arm64)


WebSocket::handle_http_response_received()
HTTP response = HTTP/1.1 101 Switching Protocols
Cache-Control: no-cache, no-store, must-revalidate
Connection: Upgrade
Date: Fri, 20 Oct 2023 19:22:49 GMT
Sec-Websocket-Accept: UVwJRUP3DSns7QH+AqSKrh52QTQ=
Sec-Websocket-Protocol: com.mongodb.realm-query-sync#10
Strict-Transport-Security: max-age=31536000; includeSubdomains;
Upgrade: websocket
Vary: Origin
X-Appservices-Request-Id: 6532d389fd53009a49ecd1ad
X-Frame-Options: DENY


Connection[3]: Negotiated protocol version: 10
Connection[3]: Connected to app services with request id: "6532d389fd53009a49ecd1ad"
Connection[3]: Will emit a ping in 30378 milliseconds
Connection[3]: Session[3]: Sending: BIND(session_ident=3, need_client_file_ident=false, is_subserver=false, json_data="{"sessionReason":0}")
DB: 5313 Thread 0x16bd9b000: Start frozen 1032: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: Query find all: '(state == 3 or state == 6)', limit = -1
DB: 5313 Thread 0x16bd9b000: Query found: 1, Duration: 13 us
DB: 5313 Thread 0x16bd9b000: End transaction 1032
Connection[3]: Session[3]: Sending: IDENT(client_file_ident=35, client_file_ident_salt=8310911049450183104, scan_server_version=69, scan_client_version=5, latest_server_version=70, latest_server_version_salt=0, query_version=0, query_size=2, query="{}")
Connection[3]: Session[3]: Sending: MARK(request_ident=1)
Connection[3]: Session[3]: Received: MARK(request_ident=1, state=1)
DB: 5313 Thread 0x16bd9b000: Start read 1971: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: Query find all: 'version > 0 and (state == 1 or state == 2) and snapshot_version > 10', limit = -1
DB: 5313 Thread 0x16bd9b000: Query found: 0, Duration: 45 us
DB: 5313 Thread 0x16bd9b000: End transaction 1971
DB: 5313 Thread 0x16bd9b000: Start read 1032: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: End transaction 1032
Connection[3]: Session[3]: Sending: UPLOAD(progress_client_version=15, progress_server_version=69, locked_server_version=69, num_changesets=0)
Connection[3]: Session[3]: on_download_completion() with m_flx_pending_mark_version=-1, has store ? true
DB: 5313 Thread 0x16bd9b000: Start read 1424: 15 ref 9680
DB: 5313 Thread 0x16bd9b000: Query find all: 'version > 0 and (state == 1 or state == 2) and snapshot_version > 14', limit = -1
DB: 5313 Thread 0x16bd9b000: Query found: 0, Duration: 15 us
DB: 5313 Thread 0x16bd9b000: End transaction 1424
Connection[3]: Sending: PING(timestamp=2161016490, rtt=0)
Connection[3]: Received: PONG(timestamp=2161016490)
Connection[3]: Round trip time was 1 milliseconds
Connection[3]: Will emit a ping in 58476 milliseconds
Connection[3]: Sending: PING(timestamp=2161074968, rtt=1)
Connection[3]: Received: PONG(timestamp=2161074968)
Connection[3]: Round trip time was 1 milliseconds
Connection[3]: Will emit a ping in 56506 milliseconds
// hangs forever waiting on the promise of subscription completion in the test

@sync-by-unito
Copy link
Author

sync-by-unito bot commented Oct 24, 2023

➤ danieltabacaru commented:

thanks. please upload the full logs if you still have them.

@ironage
Copy link
Contributor

ironage commented Oct 24, 2023

Here's the full client logs for the test: full_logs_hang.txt
That is the current master, running against BAAS 27f42f55a7944ed7d8ba9fad1854a4b22714cb8d

@sync-by-unito sync-by-unito bot changed the title test hang on new BAAS: "flx: interrupted bootstrap restarts/recovers on reconnect" FLX subscriptions are not re-sent to the server if a bootstrap is interrupted Oct 30, 2023
@sync-by-unito sync-by-unito bot changed the title FLX subscriptions are not re-sent to the server if a bootstrap is interrupted FLX subscriptions are not re-sent to the server if bootstrap is interrupted Oct 30, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
2 participants