Simplify and optimize worker task scheduling #10417
The geojson simplification is great.
It looks like the callbacks from getGlyphs and getImages aren't getting added to the scheduler when I think they should be. This prioritization would matter if the glyphs for one tile loaded before the vector tile of another tile. Our benchmarks don't appear to be covering this case. But maybe that means it's fine as is.
By bypassing the scheduler, a whole bunch of work is now not measured as part of the workerTask diagnostic metric. This could be fixed by adding something like this to the other path or by putting all work through the scheduler (and making some of it immediate there).
The scheduler already gave messages the highest priority by default. Making all those immediate would probably work well.
Good point! I also didn't realize responses after returning the result from the other thread also get scheduled, so I restored the condition that this only happens on the worker side, and additionally added
👍 went with the first option.
From what I can tell it doesn't actually apply queuing to the responses to these calls. The messages have
src/source/worker_tile.js
@@ -169,7 +169,7 @@ class WorkerTile {
                     glyphMap = result;
                     maybePrepare.call(this);
                 }
-            }, undefined, undefined, taskMetadata);
+            }, undefined, true, taskMetadata);
I don't think we need to queue this on the main thread. We need to queue the response to this
So, the intent was to make mustQueue force queuing only on the worker thread, whether it's the task (if you're sending from the main thread) or the response (if you're sending from the worker).
src/util/actor.js
// executing the next task in our queue, postMessage preempts this and <cancel>
// messages can be processed. We're using a MessageChannel object to throttle the
// process() flow to one at a time.
if (isWorker() && data.mustQueue) {
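For illustration, the behaviour discussed here (handle messages immediately by default, and defer only mustQueue messages on the worker so that a later <cancel> can still remove them) might look roughly like the sketch below. SketchReceiver, receive and processNext are hypothetical names, and the explicit processNext call stands in for the MessageChannel-driven loop; this is not the actual actor.js code.

```javascript
// Hypothetical sketch, NOT the mapbox-gl-js Actor. All names are illustrative.
class SketchReceiver {
    constructor(isWorker, handler) {
        this.isWorker = isWorker; // stand-in for isWorker()
        this.handler = handler;   // runs a task: (id, payload) => void
        this.queue = [];          // ids of deferred tasks, in arrival order
        this.tasks = new Map();   // id -> message data for deferred tasks
    }

    // Called for every incoming message.
    receive(data) {
        if (data.type === '<cancel>') {
            // Drops the task if it is still waiting; a task that already ran
            // cannot be cancelled this way.
            this.tasks.delete(data.id);
            return;
        }
        if (this.isWorker && data.mustQueue) {
            // Deferred: a <cancel> arriving in the same batch can still remove it.
            this.tasks.set(data.id, data);
            this.queue.push(data.id);
        } else {
            // Default path: handle the message right away.
            this.handler(data.id, data.payload);
        }
    }

    // Runs at most one deferred task. Returns false when nothing was left to run.
    processNext() {
        while (this.queue.length > 0) {
            const id = this.queue.shift();
            const data = this.tasks.get(id);
            if (!data) continue; // cancelled while waiting
            this.tasks.delete(id);
            this.handler(id, data.payload);
            return true;
        }
        return false;
    }
}
```

In this sketch a queued loadTile followed by its <cancel> never reaches the handler, while a message without mustQueue runs as soon as it is received.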
We need to pass the responses from getImages and getGlyphs to the scheduler. Checking for the presence of callback.metadata in actor.js might be enough to decide whether to do that, but I think letting the scheduler decide that might be slightly cleaner.
The queuing on the main thread was intentional but it looks like it might not be needed since we dropped IE. I think this is the only case where we did queuing on the main thread. It was also applied to iOS Safari < 12.1 but I don't think it was actually needed there... not sure though. @arindam1993 do you remember if the queuing was only needed for IE?
Also old Safari versions wherein AbortController doesn't actually abort fetches.
@ansis OK, so the
This! Do you think it's worth moving all the queueing, throttling and cancellation logic into
Closes #10187. There are two independent commits here.
The first one simplifies setData coalescing logic in GeoJSON source, previously introduced in #5902. I did this first to make worker logic easier to reason about, but theoretically it should improve performance: by moving the coalescing logic to the main thread, we avoid flooding the worker message queue with tasks that will get discarded. Technically the flow and consequently the overall performance characteristics shouldn't change, as I tried to demonstrate on this very rough and clunky chart:
[chart image]
Previously, any setData calls that follow one that's still in progress would get sent to the worker, which remembers the last call while returning the previous one as "abandoned". Once that first setData call successfully finishes processing, a coalesce message is sent that tells the worker to additionally do an update for the last setData call it caught before.
Now, we simply don't send any worker messages on additional setData calls while a setData is already in progress, but we remember to issue one more setData with the last updated data if there are any updates which we previously refused.
The second commit is the one that fixes the Safari performance issue. It was caused by Safari being too slow to process worker tasks that are delayed to run after the current event loop (introduced in #8633), which we do to make sure <cancel> messages are processed before the tasks that were cancelled if both come together to the worker in a batch of postMessage calls. Previously, all messages were delayed, then #8913 made it delay only on the worker side, and finally #9031 introduced an explicit actor.send parameter to additionally delay getResource on the main thread to fix a perf regression. This PR changes the logic so that messages are handled immediately by default, and only delayed for <cancel> processing explicitly for calls where we commonly expect cancellations (mostly network-related), in this case loadTile, loadDEMTile and getResource. I confirmed that this fixes the Safari setData performance issue in the ticket, while not increasing the percentage of unfulfilled cancellations when browsing the map quickly (it's about 40% before and after the PR).
Tagging @kkaefer @ChrisLoer just in case because it affects the code you added significantly. Take a look if you have time, but no worries if not.
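The main-thread coalescing described for the first commit can be sketched as follows. This is a minimal illustration under assumptions, not the code from the PR: CoalescingSource, inFlight, hasPending and the sendToWorker(data, done) callback are hypothetical names standing in for the GeoJSON source and its actor call.

```javascript
// Hypothetical sketch of main-thread setData coalescing; names are illustrative
// and do not come from mapbox-gl-js.
class CoalescingSource {
    constructor(sendToWorker) {
        // sendToWorker(data, done) is an assumed stand-in for the actor call;
        // it must invoke done() once the worker has finished processing.
        this.sendToWorker = sendToWorker;
        this.inFlight = false;       // a setData request is being processed
        this.hasPending = false;     // a newer setData arrived in the meantime
        this.pendingData = undefined;
    }

    setData(data) {
        if (this.inFlight) {
            // Don't message the worker at all; just remember the latest data.
            this.pendingData = data;
            this.hasPending = true;
            return;
        }
        this.inFlight = true;
        this.sendToWorker(data, () => {
            this.inFlight = false;
            if (this.hasPending) {
                // Issue one more setData with the last data we refused earlier.
                const next = this.pendingData;
                this.hasPending = false;
                this.pendingData = undefined;
                this.setData(next);
            }
        });
    }
}
```

With this shape, three rapid setData calls produce two worker messages: the first call, then the most recent data once the first one completes. Intermediate updates never reach the worker queue.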
Launch Checklist
mapbox-gl-js changelog: <changelog>Fixes a performance regression in Safari on frequent GeoJSON setData calls</changelog>