For data size <2 million entities
- Efficient storage data in memory
- Efficient search
- Prepare infrastructure for performance analyze
This repo is comparing bitmap
and map
as an index/cache for entities
map
is provided by Golang built-in librarybitmap
's are provided by https://github.com/RoaringBitmap/roaring- Few other bitmap repositories used occasionally, but excluded from the results described in this section
Considerations:
Data sizing
: there are ~500k
prices with ~8 conditions
for each priceFair competition
: there are different strategies to improve bothbitmap
andmap
and some of them were applied.High throughput
vsLow Latency
: low-latency is expected hereUsage
: this repository is comparingsearch
capabilities (e.g. find price by 8 input values/conditions).Usage excluded
: it doesn't take into consideration update or inserts. It can be an important use case for you depending on the architecture. Not only timing and memory usage (gc) is important here, but also you mind end up with x2 memory requirements (e.g. creating two caches at the same time) or will be forced to add READ lock for concurrency update\read. Map as well doesn't allow concurrent writes, so at least search compares oranges with orangesMeasurement
: timing is measured by thewrk
tool by calling service's json REST API (wrk is among very few available that supports microseconds)
Request body:
{
"offeringId": "a38e432c-3965-4c74-8251-aa640002d2b2",
"groupId": "group5",
"priceSpecId": "MRC",
"charValues": [
{
"char": "Term",
"value": "18"
},
{
"char": "B2B Traffic",
"value": "10GB"
},
{
"char": "VPN",
"value": "ad796998-f1c7-4fcc-9a6b-1b33042fb375"
},
{
"char": "B2B Bandwidth",
"value": "1Mbps"
},
{
"char": "Router",
"value": "Included"
}
]
}
Response body:
{
"id": "ed3a1325-9727-4bd0-ba29-a1d1486bf24d",
"spec": "MRC",
"value": 2320,
"currency": ""
}
Topic | Map | Roaring Bitmap |
---|---|---|
Client Requests for 30s run, count |
162 471 |
✅ 2 244 242 , which is 1 order of magnitude more. Note: Absolute numbers might change drastically for different options, but 1 order of magnitude is ~consistent |
Latency (90% percentile) |
7.083ms |
✅ 510µs , which is 1 order of magnitude faster |
Latency (99% percentile) |
9.534ms |
✅ 911µs , which is 1 order of magnitude faster |
Cache memory size | 180mb |
✅ 90mb , 2 times better. ONLY if original entities are not used |
-- Including index size | 4mb |
✅ 2mb , 2 times less |
-- Including entities size | 175mb |
✅ 87mb , 2 times less. ONLY if original entities are not used |
Usage memory (GOMEMLIMIT ) |
✅ tested with 500mb |
❌ 800MB for 100MB cache for similar latency, which is 1.6 times higher |
- ❌
Cost
:roaring bitmap
development and especially maintenance is higher than regularmap
because it is about supporting low-level data types. In contrastMap
is like business as usual - you will be able to spend more time on you actual business cases, then on dealing with performance. More over, you can check issues of differentbitmap libraries
and found issues hard-to-spot, like race conditions. Add your changes on top. Good testing, even for concurrent READ scenarios is required.
- build:
make build
- with codegen for extra options:
make build-generate
(might not work)
- with codegen for extra options:
- run app:
- map:
make run-map32-fiber
- bitmap:
make run-roaring32-fiber
- map:
- run & benchmark in '1 click':
make run-wrk-experiments
. After the run see reports- it runs the app for map, benchmarks for 5s, runs the app for bitmap and benchmarks again for 5s
- Search 1 price. Bitmap
curl -POST http://localhost:8091/v1/search/bitmap/prices \
-H "Content-Type: application/json" \
-d @benchmark/500k-large-groups/sample/search-price-request.json
- Search 1 price. Map
curl -POST http://localhost:8091/v1/search/map/prices \
-H "Content-Type: application/json" \
-d @benchmark/500k-large-groups/sample/search-price-request.json
- Search 5 prices. Bitmap
curl -H "Content-Type: application/json" -POST http://localhost:8091/v5/search/bitmap/bulk/prices \
-d @benchmark/500k-large-groups/sample/search-price-bulk-request-5-nd.json
- Search 10000 prices. Bitmap
time curl -H "Content-Type: application/json" -o /dev/null -POST http://localhost:8091/v5/search/bitmap/bulk/prices \
-d @benchmark/500k-large-groups/sample/search-price-bulk-request-10000-nd.json
- High-level
- In 1 click:
make run-wrk-experiments
. See reports - Wrk:
make wrk
- Wrk2:
make wrk2
- In 1 click:
- Low-level:
make bench
(go tests) - Memory:
make bench-memory-bitmap
See bitmap memorymake bench-memory-map
. See map memory
- Docker (beta)
- Build:
make build docker
- Run roaring32:
make docker-run-roaring32-fiber
- Run map32:
make docker-run-map32-fiber
- Run roaring32:
- Benchmarks
- Wrk:
make wrk -e IN_DOCKER=true
- Wrk2:
make wrk2 -e IN_DOCKER=true
- Wrk:
- Build:
GOMEMLIMIT
: You may need to set memory limit as800MB
for100MB
of the actual cache to get decent performance for99%
percentile. See bitmap 2gb vs 800mb. If degradation further is fine (we are still discussing timings<5ms
) see bitmap 2gb vs 500mb. No difference for map 2gb vs 500mbGoGC
: Bitmap comes with the cost of increased memory allocation, which means increased GC pressure. Value1000
looks good. For some background see bitmap. gogc 1000 vs 100Http Server
: when it comes to bitmaps, you can achieve micro-seconds latency and here every improvement matters. For example, bitmap. http server fiber vs default fiber got40%
better timings (319µs
improvement for 97% percentile). Map has fewer requests, and timing from the start way higher, so, it is1-3%
(135µs
for 97% percentile) mapDocker
: has fewer requests1 457 261
compare to non-docker option2 244 242
, but similar timing. It is not yet clear why
- Read Status, you might get some issues running non-essential benchmarks
- Read Design
- Take into consideration: very few search options are supported, so benchmarks are limited
- Read Benchmark Aggregation - it may not be up-to-date, but will give you the common understanding
of the
bitmap
benefits