queryRenderFeatures continued [not ready] #2224
Conversation
What's the memory impact of switching to a grid index and storing it in both the main thread and the worker?
I'm very 👍 on the high-level changes in this PR. Before shipping, I'd like for us to think about
force-pushed from 9b3a74d to d342793
Good question. I hadn't benchmarked that yet. It is generally only stored in one thread at a time. Async queries create a duplicate but that duplicate is discarded at the end of the query. We could maintain a duplicate version to reduce the cpu cost, but the cpu cost is pretty low and async queries should only be used for expensive queries. For each of the benchmarks I cropped the window so that it loaded only one tile. I used mapbox-streets-v6 because v7 merges features into huge multigeometries. This branch implements subfeature indexing (a bbox for each ring, not each feature) but the old one doesn't so it isn't a fair comparison when using v7.
The Grid in FeatureTree uses far less memory than rbush: 340 KB vs 1843 KB and 117 KB vs 420 KB. The Grid in CollisionTile uses more memory than rbush: 225 KB vs 20 KB and 280 KB.
Summary:
Buffer does a whole bunch of extra calculations like scaling values and it doesn't provide an interface for reading values. I think it might make sense to wait until dds settles down before trying to merge the two.
How would stateless workers help us write better integration tests?
Yes, I definitely want to write reusable benchmarks. Do we want to start using a framework for this or continue writing it ourselves (like with the buffer benchmark)?
ceb13b8 reduces Grid memory usage from 225 KB to 85 KB for the z12 case and from 280 KB to 169 KB for the z17 case.
@ansis fantastic! I'm all 👍 on the changes then. I'll review the code in more detail.
That's not correct. Those "extra calculations" are done in
It has a
The
Out of scope for this PR. I'm excited about the implications of a simpler interface for writing better integration tests through
I'm going to design a simple benchmark interface specification today. It'll be similar to how the buffer benchmark works.
Yeah, you're right. I confused the two. I'll take a closer look at this today.
Great!
Is it worth writing our own? Should we just use something like https://github.com/bestiejs/benchmark.js/?
benchmark interface specification pushed to #2231 😄
force-pushed from ceb13b8 to 806a77b
force-pushed from 8bdf3c7 to 68f186d
force-pushed from 806a77b to ca23cca
Rebased on master. Latest commit used to be 3fc75bd3127436715d0ec2e88d52aade617ea526.
force-pushed from e7b49f1 to 3530bb2
A convention I've been using, and have found fruitful:
force-pushed from 1ac623a to 1d8b595
force-pushed from 68f186d to 506bec5
force-pushed from 21dddbc to 4a48755
and accept only an array of strings. fix #2230
It's faster, but not fast enough to be worth it.
let's try to get by with just queryRenderedFeatures
and split intersection tests into a separate file
force-pushed from 6e8bcef to 056fbc0
force-pushed from 9a5432d to e5a4152
🎉
This builds on #2052.
This is not ready, but it could use some high level 👀. I probably should have gotten some 👀 a bit earlier. Let me know if any of the high level ideas here concern you. I'll comment again later when this is more polished.
The raw vector tile pbf, FeatureTree, and CollisionTile are now transferred to the main thread when a tile finishes loading. This lets us have a synchronous `queryRenderedFeatures`, it lets us eliminate `includeGeometry` without a performance hit, and it starts moving us toward more stateless workers.

A synchronous queryRenderedFeatures
This adds a synchronous version of `queryRenderedFeatures`. There is still an async version; it transfers data back to workers.

Despite being parallelized, the async version is not faster than the sync version: it ends up parsing the pbfs twice instead of once. It does spend less time on the main thread, though, which could be useful for large, expensive queries. I think it could be worthwhile to keep both versions as `queryRenderedFeatures` and `queryRenderedFeaturesAsync`.
Getting rid of includeGeometry

The `includeGeometry` param for `queryRenderedFeatures` only exists because sending geometries from the worker back to the main thread was slow. When the pbf is on the main thread, individual geometries can be deserialized lazily. This lets us drop `includeGeometry` without a performance penalty.

Indexing with Grid
I replaced rbush with a `Grid`, a different kind of spatial index. `Grid` can be serialized to an ArrayBuffer to make it transferable. The serialized version can be queried directly without deserializing individual elements to JavaScript objects. It is faster than rbush for collision detection and comparable for rendered feature querying; it's faster because insertions are faster.

Grid is a grid of cells. When a bbox is inserted, it is added to all the cells it intersects. When a bbox is queried, it looks in all cells that intersect that bbox. That's it.

It isn't faster than rbush in general, but it is faster for what we use it for: a fixed extent, mostly evenly sized bboxes, and a high number of insertions. More importantly, it's transferable.
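The insert/query scheme above can be sketched in a few lines of JavaScript. This is a hypothetical, simplified version using plain arrays (the real Grid packs the cell lists and bboxes flat into a single ArrayBuffer so the whole index is transferable):

```javascript
// Minimal sketch of a grid spatial index: insert a bbox into every cell it
// intersects; query by scanning the cells the query bbox intersects.
class Grid {
    constructor(extent, n) {
        this.extent = extent; // coordinate range covered (e.g. the tile extent)
        this.n = n;           // number of cells along each axis
        this.cells = [];      // per-cell lists of inserted item ids
        for (let i = 0; i < n * n; i++) this.cells.push([]);
        this.keys = [];       // user keys, indexed by item id
        this.bboxes = [];     // [x1, y1, x2, y2] per item, for the exact filter
    }

    _cell(coord) {
        // map a coordinate to a cell index along one axis, clamped to the grid
        const c = Math.floor(coord * this.n / this.extent);
        return Math.max(0, Math.min(this.n - 1, c));
    }

    insert(key, x1, y1, x2, y2) {
        const id = this.keys.length;
        this.keys.push(key);
        this.bboxes.push([x1, y1, x2, y2]);
        // add the id to every cell the bbox intersects
        for (let cy = this._cell(y1); cy <= this._cell(y2); cy++)
            for (let cx = this._cell(x1); cx <= this._cell(x2); cx++)
                this.cells[cy * this.n + cx].push(id);
    }

    query(x1, y1, x2, y2) {
        // scan every cell the query bbox intersects, dedupe, then do an exact
        // bbox test (an item can share a cell without actually overlapping)
        const seen = new Set();
        const result = [];
        for (let cy = this._cell(y1); cy <= this._cell(y2); cy++)
            for (let cx = this._cell(x1); cx <= this._cell(x2); cx++)
                for (const id of this.cells[cy * this.n + cx]) {
                    if (seen.has(id)) continue;
                    seen.add(id);
                    const b = this.bboxes[id];
                    if (b[0] <= x2 && b[1] <= y2 && b[2] >= x1 && b[3] >= y1)
                        result.push(this.keys[id]);
                }
        return result;
    }
}
```

Insertion is just a few array pushes with no tree rebalancing, which is where the speedup over rbush comes from under a high number of insertions.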
Storing arrays of "structs" in ArrayBuffers
There are a bunch of cases where we have giant arrays of fixed-size objects. I added a new abstraction, StructArray, that makes it easier to store these kinds of objects in ArrayBuffers. It's inspired by the new Buffer implementation. See 9a2e5fd's commit message for an example.
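The core layout trick can be illustrated with a hypothetical, hand-written struct array for a two-int16 record (the real StructArray generates accessors like this from a layout description; the names here are invented):

```javascript
// Fixed-layout records packed into one ArrayBuffer and read through a
// typed-array view, instead of allocating one JS object per record.
class PointStructArray {
    // layout per struct: int16 x, int16 y → 4 bytes
    constructor(capacity) {
        this.arrayBuffer = new ArrayBuffer(capacity * 4);
        this.int16 = new Int16Array(this.arrayBuffer);
        this.length = 0; // number of structs written so far
    }

    emplaceBack(x, y) {
        const o = this.length * 2; // 2 int16 slots per struct
        this.int16[o] = x;
        this.int16[o + 1] = y;
        return this.length++;
    }

    getX(i) { return this.int16[i * 2]; }
    getY(i) { return this.int16[i * 2 + 1]; }
}
```

Because all of the state lives in `this.arrayBuffer`, the whole array can be handed to `postMessage` as a transferable and rewrapped on the other side without per-element deserialization.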
Reducing worker state
Both Grid and StructArrays are transferable and don't need to be deserialized into individual objects before they can be used. This makes moving them between workers and the main thread really fast and reduces the amount of data that is "stuck" on workers. When `symbolInstances` is made completely transferable, we could get rid of WorkerTiles that live on after parsing has finished. Work requests would transfer all the data that is needed and not rely on the worker having any unique data. I think this is also more similar to how we want work requests to work in -native.

Since work won't be tied to a specific thread, it should be possible to distribute work more consistently and speed things up.
Benchmarks
Compared to #2052
Next steps
Cleaning things up. Fixing tests. Fixing a couple remaining bugs.
Do all the high level changes sound ok?
@lucaswoj @jfirebaugh @mourner