
Refactor map memory, remove map memory limit. #247
Merged · 26 commits · Dec 7, 2020
Conversation

olanti-p
Contributor

@olanti-p olanti-p commented Dec 3, 2020

Summary

SUMMARY: Features "Unlimited map memory"

Purpose of change

Remove the limit on map memory capacity.
Follows #97.

Describe the solution

Right now, map memory is implemented via lru_cache, with absolute map square coordinates as keys. This makes it impossible to remove the limit on map memory size, since sooner or later players' computers would run out of memory trying to cram all memorized tiles into RAM.

This PR splits map memory into chunks (named memorized_submap, with size equal to an ordinary submap), and loads these chunks from the disk (or allocates new empty ones) as necessary.

The memorized_submaps are stored in a new folder, <save-name>/<player-name>.mm1/, in individual files with names equal to their coordinates. While simplistic, this solution produces a ton of small (<4kb) files, which could tank performance on some systems (e.g. Windows with an antivirus).
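
For illustration, here's a minimal sketch of the chunked lookup described above. The tripoint/SEEX names follow the usual Cataclysm conventions, but load_from_disk_or_create and the exact container layout are illustrative, not the code in this PR:

#include <array>
#include <map>
#include <memory>
#include <string>
#include <tuple>

constexpr int SEEX = 12; // submap width in map squares
constexpr int SEEY = 12; // submap height in map squares

struct tripoint {
    int x, y, z;
    bool operator<( const tripoint &o ) const {
        return std::tie( x, y, z ) < std::tie( o.x, o.y, o.z );
    }
};

// One chunk of memorized tiles, same footprint as a regular submap.
struct memorized_submap {
    std::array<std::string, SEEX * SEEY> tiles; // memorized graphical tile ids
};

class map_memory {
    public:
        // Look up (or lazily load/create) the chunk covering an absolute map square.
        memorized_submap &get_submap( const tripoint &abs_ms ) {
            const tripoint sm{ div_floor( abs_ms.x ), div_floor( abs_ms.y ), abs_ms.z };
            std::unique_ptr<memorized_submap> &ptr = submaps[sm];
            if( !ptr ) {
                ptr = load_from_disk_or_create( sm ); // illustrative helper
            }
            return *ptr;
        }
    private:
        // Floor division into submap coords; assumes SEEX == SEEY.
        static int div_floor( int a ) {
            return ( a >= 0 ? a : a - SEEX + 1 ) / SEEX;
        }
        std::unique_ptr<memorized_submap> load_from_disk_or_create( const tripoint & ) {
            return std::make_unique<memorized_submap>();
        }
        std::map<tripoint, std::unique_ptr<memorized_submap>> submaps;
};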

For ease of testing, I've implemented migration from existing memory map file <save-name>/<player-name>.mm on first load.

Upd: changed saving to use regions of 8x8 submaps to cut down on the number of files.

Describe alternatives you've considered

Mostly tweaking some numbers, e.g. size of memorized_submap, but I'd like to keep it equal to the size of a submap to avoid unnecessary coordinate conversions.
Also, there are some thoughts on how to deal with high data fragmentation on disk, I'll go into details below.

Testing

FPS

I've compiled the game in Visual Studio in 'Release' mode and ran the in-game Draw benchmark while measuring it with Visual Studio's profiler.

For testing purposes, I've selected the following case: a somewhat small map memory size of 71'843 tiles (the normal map memory limit in DDA is 57'600, with the memory banks bionic it's 2'880'000; the default limit in BN right now is 576'000), zoomed-out terrain with no visible tiles, but with approx. half of them memorized (see attached screenshot).

According to the profiler, time spent inside avatar::get_memorized_tile() went down from 10.43% to 0.11%, and something similar has likely happened to avatar::memorize_tile(), but that one is a bit harder to measure since it's already optimized by memory_seen_cache.

Overall, this is a performance improvement, and one that wasn't even ported from DDA (yay!).

Disk usage

In that test case, the size of the existing tile memory file was 2'089'403 bytes, and after migration the total size of all saved memorized_submaps went down to 1'535'971 bytes - likely because the old tile memory had to save data as an array of global map square coord + memorized_tile pairs, while memorized_submaps are saved as plain arrays of memorized_tiles.
Disk usage, on the other hand, went up from 2'093'056 to 2'359'296 bytes due to the huge number of files (1 vs 570).

Another test case. I've created a new character, and played for a few IRL hours.
In the end, maps folder had 1'115 files with total size of 7'442'102 bytes (9'289'728 on disk), and the tile memory folder had 3'083 files with total size of 9'070'291 bytes (12'730'368 on disk), with each memorized_submap taking from 1'591 to 4'033 bytes.

This makes me think that some sort of compression would be nice to have. Current ideas:

  1. Use run-length encoding. submaps already do that with tiles and furniture, and it would help with memorized_submaps that are mostly empty or contain many similar tiles (e.g. rivers), but it would likely have no effect on other memorized_submaps, since for each physical tile there are many variations of the graphical tile (e.g. t_wall vs t_wall with connections to neighboring walls).
  2. Derive memorized tiles from submaps. This approach could potentially shrink memorized_submap to an array of booleans, but I'm afraid it would require much work for little gain.
  3. Save memorized_submaps in bundles. mapbuffer already does that with submaps, writing 4 of them (2 by 2) per file. Since memorized_submaps don't eat as much RAM, they could be packed in bigger chunks (e.g. 6x6 or 8x8). This would cut down on disk usage, and improve save/load performance.
  4. Gzip compression. As CleverRaven#44218 demonstrated, it shouldn't be that hard, and the gains are huge.

I'm currently leaning towards 1+3, with possible 4 in the future.
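
For reference, a minimal sketch of what option 1 (run-length encoding) could look like, assuming memorized tiles are stored as strings; the function names are illustrative:

#include <string>
#include <utility>
#include <vector>

// Collapse consecutive identical tiles into (tile, count) pairs.
std::vector<std::pair<std::string, int>> rle_encode( const std::vector<std::string> &tiles )
{
    std::vector<std::pair<std::string, int>> runs;
    for( const std::string &t : tiles ) {
        if( !runs.empty() && runs.back().first == t ) {
            runs.back().second++;
        } else {
            runs.emplace_back( t, 1 );
        }
    }
    return runs;
}

// Expand (tile, count) pairs back into the flat tile array.
std::vector<std::string> rle_decode( const std::vector<std::pair<std::string, int>> &runs )
{
    std::vector<std::string> tiles;
    for( const auto &run : runs ) {
        tiles.insert( tiles.end(), run.second, run.first );
    }
    return tiles;
}

An all-empty mm_submap collapses to a single run, and a river or field submap with mostly identical tiles does nearly as well, which is where the savings come from.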

Upd: implemented 1 and 3, results can be seen here: #247 (comment)

Quirks

Existing map memory code has some quirks. Mainly, what is or isn't memorized depends on the currently displayed terrain. Even with 3D vision enabled, to memorize above-ground terrain you have to press x and look at the other levels. I'm not sure whether that needs fixing (probably yes), or how fixing it would impact performance / disk usage.

Additionally, in ascii mode the game does not commit tiles that are not visible on screen to memory. This one is probably easy to fix, and I'll most likely include the fix in this PR.

Upd: fixed the ascii mode bug. It required a bit of drawing code cleanup - the cleanup is a pure refactor and should not impact the drawing in any way (unless I botched it up, ofc).

Screenshots

Testing area
[screenshot]

@Coolthulhu
Member

Looks really good.
Also looks easy to extend, so that one day it could store items too.

@Coolthulhu
Member

Derive memorized tiles from submaps. This approach could potentially shrink memorized_submap to an array of booleans, but I'm afraid it would require much work for little gain.

In the future, I'll want to have it that way, but that's distant future.
With submaps, the same submap buffer class could be used for both "real" and memorized tiles, making handling of special types (vehicles, items) unnecessary. The issue would be memory usage and performance.

Shrinking it to bools wouldn't work cleanly: consider an "action at a distance" caused by a remotely updated submap - for example, a mission start editing an area. The bool approach would reveal the changes.
Though it would be nothing unforgivable, as long as reality bubble changes were handled properly.

I'm currently leaning towards 1+3, with possible 4 in the future.

I like 3 the best for now, since file access is much slower on Windows than on Linux - though I couldn't test the performance without booting Windows up.
When I implemented the first z-level features, saving/loading tons of submaps on Windows was unbearable, but Linux users reported no problems.

@olanti-p
Contributor Author

olanti-p commented Dec 3, 2020

With submaps, the same submap buffer class could be used for both "real" and memorized tiles, making handling of special types (vehicles, items) unnecessary. The issue would be memory usage and performance.

I didn't want memorized_submap to depend on submap in any way, for the sake of keeping the code separate. Also, the display area in the screenshot is ≈4 times larger than the current map, and having submaps be loaded based on display size sounds like trouble. Unless I misunderstood what you mean.

Shrinking it to bools wouldn't work cleanly: consider an "action at a distance" caused by remotely updated submap. For example, due to mission start editing an area. bool approach would reveal the changes.

Yes, that's why I was considering more of a hybrid approach.

The game right now generates terrain for some submaps from scratch, and theoretically it would be possible to have the memorized_submap for such a submap be generated from an array of booleans plus the freshly-generated terrain. When any change happens to the submap, the corresponding memorized_submap would be saved in the "full" form, as an array of tiles. Or just the changes would be saved. That could help compress some of the empty submaps (forests, or fields).

But none of that sounds very robust...

When I implemented the first z-level features, saving/loading tons of submaps on Windows was unbearable, but Linux users reported no problems.

Speaking of, map memory has weird behavior regarding 3d vision. I've updated PR description.

@Coolthulhu
Member

I didn't want memorized_submap to depend on submap in any way, for the sake of keeping the code separate.

Sure. Your code doesn't lock the idea out or anything, and it's easier to comprehend and optimize than having to deal with full submaps.
I'm just talking about a possible distant-future implementation, where map_memory would internally use a mapbuffer (the same as the one used for submaps) and memorizing a tile would be written roughly as

auto &memorized_submap = memorized_submap_buffer.get_or_create( ... );
memorized_submap[x][y].set_all_tile_data( real_submap[x][y] );

That is, memory of a tile would be a full copy of said tile, made at the time of memorization.

having submap be loaded based on display size sounds like trouble

If it was loaded only into a special mapbuffer used solely for memorized stuff, it wouldn't conflict with the real map.

When any change happens to the submap, the corresponding memorized_submap is saved in the "full" form, as an array of tiles.

Consider:

  • Memorize location L in its entirety
  • Save the game away from L. It is saved as a block of true.
  • Modify L without seeing it. For example, on other character.
  • Look at memory of L, without seeing it directly
  • The changes are visible, even though we never saw them happen

Now, it requires arcane stuff to happen and is very minor, so it totally wouldn't be considered a blocker or anything. Just an edge case that can't be solved perfectly without something like an observer pattern that updates all memory_maps covering a given submap.

@olanti-p
Contributor Author

olanti-p commented Dec 3, 2020

a possible distant future implementation, where map_memory would internally use a mapbuffer (same as one used for submaps)

That is, memory of a tile would be a full copy of said tile, made at the time of memorization.

That could be expensive memory-wise. It would also require some clever tricks to keep the rendering consistent, since some tiles, like walls or furniture, are rendered based on their connections to neighbours.
I believe something like CleverRaven#36381 would be easier, or a mix of that and your idea.

without something like an observer pattern that updates all memory_maps covering a given submap.

Yeah, that's what I was trying to say. I'm not the best at explanations, heh.

@Coolthulhu
Member

That could be expensive memory wise.

Yes, up to double. But it would allow memorizing items and vehicles, which would be incredibly convenient if you wanted to, say, find the nearest pair of welding goggles or a vehicle with a V12, check up on your stash to see if you're missing something, etc.

It might actually decrease memory usage too: storing entire tile strings could get expensive.
BTW, did you do a stress test of memory usage? It could easily be done with a test where you'd generate a ton of uniform submaps and then memorize them.
I wouldn't expect it to get anywhere near unacceptable; I'm just curious how good/bad it is.

Would also require some clever tricks to keep the rendering consistent

Doesn't sound too hard. Say, for every category of tile connection, generate an invisible, fake variant. It would never be drawn, but it would be memorized next to tiles that connect to things, if the real neighbor wasn't memorized.

@olanti-p
Contributor Author

olanti-p commented Dec 3, 2020

It might actually decrease memory usage too: storing entire tile strings could get expensive.

did you do a stress test of memory usage?

From what I've read across the internet, the memory cost of storing small strings in C++ depends heavily on the compiler and which standard library it uses. I don't expect a worst-case memorized_submap to go beyond 12 kb, which is pretty mild compared to a single submap (I don't know how much it uses, but from the code it looks like at least a hundred kb to me).
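
If anyone wants to check their own toolchain, a tiny standalone snippet like this shows the string object size and the small-string-optimization threshold (the exact numbers differ between libstdc++, libc++ and MSVC):

#include <cstdio>
#include <string>

int main()
{
    // Size of the string object itself (commonly 24-32 bytes on 64-bit implementations).
    std::printf( "sizeof(std::string) = %zu\n", sizeof( std::string ) );

    // Rough small-string-optimization threshold: strings at or below this capacity
    // live inside the object; longer ones heap-allocate.
    std::printf( "SSO capacity = %zu\n", std::string().capacity() );
}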

Then there's the fact that other z-levels are unlikely to have memorized_submaps allocated for them due to the whole "only tiles on current z-level are memorized" thing.

But I'm curious about the stress test myself - will see what can be done.

@olanti-p
Contributor Author

olanti-p commented Dec 4, 2020

Okay, I've made a few changes to the code based on the review, shortened memorized_submap -> mm_submap, and implemented batch saving & RLE compression.

Results for a fresh test case (grabbed a car and drove through fields and a small town; 115'568 tiles memorized):

Variant                        N files   Total size   Disk usage
maps folder                        313    2'465'052    2'945'024
old .mm file                         1    3'590'044    3'592'192
1 mm_submap per file               923    2'510'497    3'817'472
64 mm_submap per file               25    2'514'830    2'564'096
64 mm_submap per file + RLE         25    1'764'386    1'818'624

Looks good to me.

TODO:

  • memory stress test

  • deal with ncurses bug

  • playtest

  • region-based saving is implemented loosely based on current mapbuffer behavior:

    • during load, mm_submaps are loaded as regions, unpacked and added to the pool
    • during save, the submaps are assembled back into regions, and regions are saved.

    I'm not sure if I should replace std::map<tripoint, mm_submap> in the map_memory with std::map<tripoint, mm_region>, and keep submaps in these regions, or if I should keep it as is to match the mapbuffer.

@Coolthulhu
Member

I'm not sure if I should replace std::map<tripoint, mm_submap> in the map_memory with std::map<tripoint, mm_region>, and keep submaps in these regions, or if I should keep it as is to match the mapbuffer.

Ideally, all of the functions in the interface would use the same scale.
It may be a good idea to hide the region coords completely and only use them internally, at least for now.
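
A rough sketch of what hiding the region coords could look like - the public interface stays at submap scale, and region coordinates (8x8 mm_submaps per region, as in this PR) appear only inside the save/load path; all names here are illustrative, not the actual PR code:

#include <map>
#include <memory>
#include <tuple>

constexpr int MM_REG_SIZE = 8; // mm_submaps per region side

struct tripoint {
    int x, y, z;
    bool operator<( const tripoint &o ) const {
        return std::tie( x, y, z ) < std::tie( o.x, o.y, o.z );
    }
};

struct mm_submap { /* memorized tiles ... */ };

class map_memory {
    public:
        // Public interface stays at submap scale.
        mm_submap &get_submap( const tripoint &sm_pos );

    private:
        // Region coordinates exist only inside the (de)serialization path.
        static tripoint region_of( const tripoint &sm_pos ) {
            auto fdiv = []( int a ) {
                return ( a >= 0 ? a : a - MM_REG_SIZE + 1 ) / MM_REG_SIZE;
            };
            return tripoint{ fdiv( sm_pos.x ), fdiv( sm_pos.y ), sm_pos.z };
        }
        void save_region( const tripoint &reg_pos );  // packs up to 8x8 submaps into one file
        void load_region( const tripoint &reg_pos );  // reads a region file, unpacks into the pool
        std::map<tripoint, std::unique_ptr<mm_submap>> submaps;
};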

Collect map::draw code in one place, simplify indexing/bound checks.
Remove the "batch drawing" optimization: it doesn't help with drawing, but harms code clarity. Instead, convert all relevant draw methods to use 'wputch' and rely on caller to position the cursor properly.
@olanti-p
Contributor Author

olanti-p commented Dec 6, 2020

Here's a bunch of memory tests (I love tests); all "memory usage" was measured as heap usage.


From compiler's PoV,
sizeof(submap) = 22'400 bytes
sizeof(mm_submap) = 6'344 bytes.


Simulating "empty" tile memory.
Allocating 21'504 of each (32 * 32 * 21) and placing them in a std::map<tripoint, shared_ptr_fast<T>> gives:
≈21.984 kb heap usage per single submap (incl. ≈0.11 kb overhead from std::map and smart pointer)
≈6.296 kb heap usage per single mm_submap (incl. ≈0.10 kb overhead, almost equal to what submap had)


Simulating "full" tile memory.
I've decided to simply allocate 21'504 mm_submaps and fill them with random strings [5, 33] characters long (the numbers approximate what I've observed in the saved mm_submaps). As a result, I got
≈11.173 kb heap usage per single mm_submap (incl. ≈0.10 kb overhead calculated in the previous test)
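
For reproducibility, the simulation boils down to something like this (using std::shared_ptr in place of shared_ptr_fast, and a 12x12 array of strings as a stand-in for the real mm_submap layout):

#include <array>
#include <map>
#include <memory>
#include <random>
#include <string>
#include <tuple>

struct tripoint {
    int x, y, z;
    bool operator<( const tripoint &o ) const {
        return std::tie( x, y, z ) < std::tie( o.x, o.y, o.z );
    }
};

struct mm_submap {
    std::array<std::string, 12 * 12> tiles;
};

int main()
{
    std::mt19937 rng( 42 );
    std::uniform_int_distribution<int> len_dist( 5, 33 );
    std::map<tripoint, std::shared_ptr<mm_submap>> pool;

    // 32 * 32 submaps on each of 21 z-levels = 21'504 entries.
    for( int z = -10; z <= 10; z++ ) {
        for( int y = 0; y < 32; y++ ) {
            for( int x = 0; x < 32; x++ ) {
                auto sm = std::make_shared<mm_submap>();
                // "Full" variant: fill each tile with a string of random length 5-33;
                // skip this inner loop to simulate the "empty" variant.
                for( std::string &t : sm->tiles ) {
                    t.assign( len_dist( rng ), 'a' );
                }
                pool[tripoint{ x, y, z }] = std::move( sm );
            }
        }
    }
    // Heap usage is read from an external profiler at this point.
}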


Practical test 1.
Setup: an avatar with no tile memory drives a car at 90 km/h for 1 minute in a straight vertical line
(covering ≈32 overmap tiles, allocating an additional 1'344 mm_submaps).

Memory usage, kb, old implementation vs the shiny new one:

Implementation   At start   At end    Delta
old               320'192   699'460   +379'268
new               323'256   699'348   +376'092

Apparently, memory usage in this case is almost identical. I've tried this 2 times, and the numbers are within ±100 kb.


Practical test 2.
Pressed x, zoomed out and moved camera up/down 3 z levels.

Stage   Memory usage, kb   mm_submaps allocated
start            325'776                    768
end              374'224                  6'912
delta            +48'448                 +6'144

And this gives ≈7.88 kb per mm_submap


Conclusion:
Unless the player rapidly switches z-levels (or views other z-levels), memory usage should stay at approximately the same level. And even if they do, it doesn't grow uncontrollably. Possible optimizations:

  1. When saving, deallocate mm_submaps for other z-levels. Right now, the code preserves them under the assumption that if the player switched z-level once, they're likely to do it again (e.g. the player descended into a basement, or they're exploring a lab), so deallocating the mm_submaps only to load them again in the near future would be a waste of CPU cycles.
  2. When switching z-levels, go through the mm_submaps on the old z-level and deallocate the empty ones. That would require a small refactor, but seems doable. It wouldn't help with big empty fields though, since the avatar commits all those "air" tiles to memory anyway.
  3. Implement something akin to copy-on-write for empty mm_submaps. Should be easy, given they already check for the "empty" flag - just move the data into a unique_ptr and check whether that pointer is null (see the sketch after this list).
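
A minimal sketch of option 3 - really lazy allocation rather than full copy-on-write: an empty mm_submap carries no tile array until the first write. The class layout here is illustrative:

#include <array>
#include <memory>
#include <string>

constexpr int SEEX = 12;
constexpr int SEEY = 12;

class mm_submap {
    public:
        bool is_empty() const {
            return !tiles;
        }
        const std::string &tile( int x, int y ) const {
            static const std::string nothing;
            return tiles ? ( *tiles )[y * SEEX + x] : nothing;
        }
        void set_tile( int x, int y, const std::string &id ) {
            if( !tiles ) {
                // Allocate the tile array only when the submap stops being empty.
                tiles = std::make_unique<std::array<std::string, SEEX * SEEY>>();
            }
            ( *tiles )[y * SEEX + x] = id;
        }
    private:
        std::unique_ptr<std::array<std::string, SEEX * SEEY>> tiles;
};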

Besides that, I feel this PR is ready. I don't see any wonky behavior with tile memory, and the map drawing seems correct (had to tweak it a bit to allow memorizing off-screen tiles in ascii mode).

@olanti-p olanti-p changed the title [WIP] Refactor map memory, remove map memory limit. Refactor map memory, remove map memory limit. Dec 6, 2020
@Coolthulhu Coolthulhu merged commit 896e264 into cataclysmbnteam:upload Dec 7, 2020
@Coolthulhu
Member

Looks good, didn't see any problems during testing.
I didn't do a serious stress test (just "organic" testing), but I trust that it won't suddenly explode and start acting completely unlike in your results.

Ideally, memorizing tiles wouldn't require actually looking up and down z-levels manually. Though this looks like it could be a real performance hog, since from what I can tell, at the moment memorizing a tile requires actually drawing it.

@olanti-p olanti-p deleted the refactor-mm branch December 7, 2020 10:19
@Raikiri

Raikiri commented Dec 17, 2020

After glancing over how Cataclysm's save data is stored, I'm wondering: why is it not compressed in any way? The .map JSON files that take up the most space have a remarkably compressible structure - just compressing a bunch of them with default zip compression resulted in a ~30x compression ratio, which is only possible because of the repetitive nature of their contents.

So, is it possible to use something like zlib to do transparent compression/decompression for all in-game file operations? It should be pretty simple to convert all text files to be compressed and decompressed completely transparently to the rest of the code. There are lots of libraries that do that, for example https://github.com/rikyoz/bit7z or just zlib.

This could easily reduce overall disk usage and increase file access speed by a factor of 10 to 50. Besides, Windows file access is slow. Reading even compressed files from a single pack is way faster than reading individual uncompressed files.
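
As a rough illustration of what "transparent" compression could look like with plain zlib - the function names and the size-prefix framing here are made up for this sketch, not an existing API in the game:

#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>
#include <zlib.h>

// Write a string compressed with zlib, prefixed with the uncompressed size
// so that decompression knows how large a buffer to allocate.
void write_compressed( const std::string &path, const std::string &data )
{
    uLongf bound = compressBound( data.size() );
    std::vector<Bytef> buf( bound );
    if( compress2( buf.data(), &bound, reinterpret_cast<const Bytef *>( data.data() ),
                   data.size(), Z_DEFAULT_COMPRESSION ) != Z_OK ) {
        throw std::runtime_error( "compression failed" );
    }
    std::ofstream out( path, std::ios::binary );
    uint64_t raw_size = data.size();
    out.write( reinterpret_cast<const char *>( &raw_size ), sizeof( raw_size ) );
    out.write( reinterpret_cast<const char *>( buf.data() ), bound );
}

std::string read_compressed( const std::string &path )
{
    std::ifstream in( path, std::ios::binary );
    uint64_t raw_size = 0;
    in.read( reinterpret_cast<char *>( &raw_size ), sizeof( raw_size ) );
    std::vector<Bytef> buf( std::istreambuf_iterator<char>( in ), {} );
    std::string data( raw_size, '\0' );
    uLongf dest_len = raw_size;
    if( uncompress( reinterpret_cast<Bytef *>( &data[0] ), &dest_len,
                    buf.data(), buf.size() ) != Z_OK ) {
        throw std::runtime_error( "decompression failed" );
    }
    return data;
}

The rest of the code would keep passing JSON strings around and never know whether the bytes on disk were compressed or not.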

@olanti-p
Contributor Author

So is it possible to use something like zlib to do transparent compression/decompression for all in-game file operations?

Yes, CleverRaven#44218.

I've run some tests using the code from there - compressing each .map file in the maps folder lowered the overall size by an order of magnitude (it was something like ~12x), but since almost all .map files were already below 4 kb (the cluster size on my partition), actual disk usage dropped by a measly 30%.

While that is good, I see it as a waste of potential. It also has a downside - compressing a save full of compressed .map files produces an archive 2-3 times larger than if the .map files were plain JSON. Another downside - it becomes harder to manually edit the map (but who cares about that, it's not like people do it frequently).

But that's how that particular solution works, other approaches may work better.

for example https://github.com/rikyoz/bit7z

It could work, yes. If that library (or some other similar library) supports modifying archives, it should be easy to rig up a system that would stuff the .map files into archives instead of writing them to disk individually.

The question is how that would impact save/load times (I can see it going both ways), and how well that library supports other platforms (we're talking Linux/Windows/Mac/Android, x86/x64).

@Raikiri

Raikiri commented Dec 17, 2020

I've run some tests using the code from there - compressing each .map file in the maps folder lowered the overall size by an order of magnitude (it was something like ~12x), but since almost all .map files were already below 4 kb (the cluster size on my partition), actual disk usage dropped by a measly 30%.

Obviously, the next step is to use some sort of virtual file system instead of the actual file system, to store everything in one file. Compressing a bunch of files together (especially similar ones) is always superior to compressing each file individually. Bundling files into packages like this is how slow HDD access is optimized on consoles; the added benefit of also drastically compressing the content, in Cataclysm's case, comes as just a bonus. Heck, this is how the Quake engine has always worked, if I remember correctly. So even counting in all the compression/decompression overhead, I still expect to see a significant improvement in terms of both performance and disk usage.

We use our own implementation of such a file system in our engine (Path of Exile), but here are a bunch of OSS examples of such libraries: https://github.com/yevgeniy-logachev/vfspp , https://github.com/anthony-y/tiny-vfs . Note that they're typically designed to be a drop-in replacement for a physical filesystem, which is pretty convenient.
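
To illustrate the shape of such a layer - a drop-in VFS boils down to an interface like this (entirely illustrative, not code from either of the linked libraries):

#include <optional>
#include <string>

// The game always goes through this interface; the backend decides whether the
// "file" lives on disk or inside a single packed (optionally compressed) archive.
class virtual_filesystem {
    public:
        virtual ~virtual_filesystem() = default;
        virtual std::optional<std::string> read_file( const std::string &path ) = 0;
        virtual bool write_file( const std::string &path, const std::string &data ) = 0;
        virtual bool exists( const std::string &path ) const = 0;
};

// Save code would depend only on the interface, e.g.:
//   void save_submap( virtual_filesystem &vfs, const std::string &path,
//                     const std::string &json ) {
//       vfs.write_file( path, json );
//   }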

@Coolthulhu
Member

The question is how that would impact save/load time (I see it going both ways), and how well that library supports other platforms (we're speaking Linux/Windows/Mac/Android, x86/x64).

7z/xz/LZMA is notoriously slow to compress. Cataclysm saves are written about as often as they are read, so compression speed does matter.
https://quixdb.github.io/squash-benchmark/ suggests that zlib-ng, zstd and lz4 would be better here. Though without a big map from an actual game to test on, it's just a suggestion. I'll ask on Discord for some late-game save.
https://github.com/inikep/lzbench - this would be useful for testing things. It looks like it bundles many compressors in its repo, so it should be easy to use.

@Raikiri

Raikiri commented Dec 17, 2020

7z/xz/LZMA is notoriously slow to compress. Cataclysm saves are written about as often as they are read, so compression speed does matter.

A trick that should be possible is to run the dictionary training once (say, at build time) to create the dictionary that will be used for all future compression, so that the compressor would not have to rebuild it every time it needs to recompress something. This can work because I expect the dictionary to change very little, as the data structure is very similar across different playthroughs and users. I'm not sure which exact libraries support this feature, but it might be handy to look out for something like this.

@Coolthulhu
Member

A trick that should be possible to do is to run index once (say, in build time) to create the dictionary that will be used for future compression

This is explicitly supported on zstd.
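
A minimal sketch of the zstd dictionary workflow (ZDICT_trainFromBuffer + ZSTD_compress_usingDict); the sample collection and error handling are simplified for illustration:

#include <string>
#include <vector>
#include <zdict.h>
#include <zstd.h>

// Train a dictionary once from a set of sample .map files, then reuse it
// for every compression afterwards.
std::string train_dictionary( const std::vector<std::string> &samples, size_t dict_capacity )
{
    std::string joined;
    std::vector<size_t> sizes;
    for( const std::string &s : samples ) {
        joined += s;
        sizes.push_back( s.size() );
    }
    std::string dict( dict_capacity, '\0' );
    size_t dict_size = ZDICT_trainFromBuffer( &dict[0], dict_capacity,
                       joined.data(), sizes.data(),
                       static_cast<unsigned>( sizes.size() ) );
    if( ZDICT_isError( dict_size ) ) {
        return {};
    }
    dict.resize( dict_size );
    return dict;
}

std::string compress_with_dict( const std::string &data, const std::string &dict )
{
    std::string out( ZSTD_compressBound( data.size() ), '\0' );
    ZSTD_CCtx *ctx = ZSTD_createCCtx();
    size_t written = ZSTD_compress_usingDict( ctx, &out[0], out.size(),
                     data.data(), data.size(), dict.data(), dict.size(), 3 );
    ZSTD_freeCCtx( ctx );
    if( ZSTD_isError( written ) ) {
        return {};
    }
    out.resize( written );
    return out;
}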

@Raikiri

Raikiri commented Dec 17, 2020

Another random fact: Cataclysm should not need nearly as much write bandwidth as it needs read bandwidth, because in theory during playtime it only needs to save/compress changes (which is not much), but whenever you load a game, it needs to load/decompress the entire world. So (again, in theory; I'm not sure how saves are actually implemented) compression time should not be nearly as important as decompression time.

@Coolthulhu
Member

Coolthulhu commented Dec 17, 2020

Another random fact: Cataclysm should not need nearly as much write bandwidth as it needs read bandwidth, because in theory during playtime it only needs to save/compress changes (which's not much)

It needs to touch the timestamps of all loaded submaps. In theory, for uncompressed submaps, it would be possible to update only a few bits, but that doesn't work for compressed ones. At least not without extracting the timestamps out of them.

Debug world, 23 mb

./lzbench ../DEBUG.tar
lzbench 1.8 (64-bit Linux)   Assembled by P.Skibinski
Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                   8898 MB/s  8826 MB/s    21841920 100.00 ../DEBUG.tar
density 0.14.2 -1        2236 MB/s  1885 MB/s    11682046  53.48 ../DEBUG.tar
density 0.14.2 -2         850 MB/s  1413 MB/s     5462256  25.01 ../DEBUG.tar
density 0.14.2 -3         456 MB/s   436 MB/s     3583128  16.40 ../DEBUG.tar
fastlz 0.5.0 -1           600 MB/s  1183 MB/s     4718472  21.60 ../DEBUG.tar
fastlz 0.5.0 -2           535 MB/s  1143 MB/s     4588649  21.01 ../DEBUG.tar
lizard 1.0 -10            754 MB/s  1812 MB/s     5059516  23.16 ../DEBUG.tar
lizard 1.0 -11            727 MB/s  1796 MB/s     5046902  23.11 ../DEBUG.tar
lizard 1.0 -12            203 MB/s  2041 MB/s     4043638  18.51 ../DEBUG.tar
lizard 1.0 -13            119 MB/s  2130 MB/s     3736618  17.11 ../DEBUG.tar
lizard 1.0 -14             98 MB/s  2241 MB/s     3507047  16.06 ../DEBUG.tar
lz4 1.9.3                 904 MB/s  3191 MB/s     4998271  22.88 ../DEBUG.tar
lz4fast 1.9.3 -3          913 MB/s  3188 MB/s     5032220  23.04 ../DEBUG.tar
lz4fast 1.9.3 -17         931 MB/s  3168 MB/s     5348680  24.49 ../DEBUG.tar
lzf 3.6 -0                650 MB/s   884 MB/s     4788803  21.92 ../DEBUG.tar
lzf 3.6 -1                645 MB/s   885 MB/s     4713484  21.58 ../DEBUG.tar
lzfse 2017-03-08           56 MB/s  1003 MB/s     2284950  10.46 ../DEBUG.tar
lzjb 2010                 491 MB/s   361 MB/s     5090531  23.31 ../DEBUG.tar
lzo1b 2.10 -1             630 MB/s  1006 MB/s     4408072  20.18 ../DEBUG.tar
lzo1c 2.10 -1             620 MB/s  1012 MB/s     4457993  20.41 ../DEBUG.tar
lzo1f 2.10 -1             618 MB/s   916 MB/s     4549635  20.83 ../DEBUG.tar
lzo1x 2.10 -1             896 MB/s   986 MB/s     4580332  20.97 ../DEBUG.tar
lzo1y 2.10 -1             884 MB/s   900 MB/s     4353101  19.93 ../DEBUG.tar
lzrw 15-Jul-1991 -1       588 MB/s   690 MB/s     5335019  24.43 ../DEBUG.tar
lzrw 15-Jul-1991 -3       537 MB/s   962 MB/s     4796576  21.96 ../DEBUG.tar
lzrw 15-Jul-1991 -4       539 MB/s   757 MB/s     4710317  21.57 ../DEBUG.tar
lzrw 15-Jul-1991 -5       200 MB/s  1102 MB/s     3802134  17.41 ../DEBUG.tar
lzsse4fast 2019-04-18     447 MB/s  3905 MB/s     4589059  21.01 ../DEBUG.tar
lzsse8fast 2019-04-18     434 MB/s  3526 MB/s     4753667  21.76 ../DEBUG.tar
lzvn 2017-03-08            94 MB/s  1288 MB/s     3448628  15.79 ../DEBUG.tar
pithy 2011-12-24 -0       838 MB/s  2230 MB/s     4281897  19.60 ../DEBUG.tar
pithy 2011-12-24 -3       831 MB/s  2253 MB/s     4160971  19.05 ../DEBUG.tar
pithy 2011-12-24 -6       810 MB/s  2276 MB/s     4124618  18.88 ../DEBUG.tar
pithy 2011-12-24 -9       755 MB/s  2263 MB/s     4101315  18.78 ../DEBUG.tar
quicklz 1.5.0 -1          694 MB/s  1113 MB/s     4155866  19.03 ../DEBUG.tar
quicklz 1.5.0 -2          352 MB/s  1236 MB/s     3736999  17.11 ../DEBUG.tar
shrinker 0.1              516 MB/s  1286 MB/s     4419312  20.23 ../DEBUG.tar
snappy 2020-07-11         668 MB/s  2256 MB/s     4399119  20.14 ../DEBUG.tar
tornado 0.6a -1           605 MB/s   677 MB/s     4194408  19.20 ../DEBUG.tar
tornado 0.6a -2           456 MB/s   591 MB/s     3840181  17.58 ../DEBUG.tar
tornado 0.6a -3           307 MB/s   498 MB/s     2674943  12.25 ../DEBUG.tar
zstd 1.4.5 -1             553 MB/s  1254 MB/s     2395081  10.97 ../DEBUG.tar
zstd 1.4.5 -2             520 MB/s  1195 MB/s     2461963  11.27 ../DEBUG.tar
zstd 1.4.5 -3             511 MB/s  1234 MB/s     2364942  10.83 ../DEBUG.tar
zstd 1.4.5 -4             506 MB/s  1235 MB/s     2366930  10.84 ../DEBUG.tar
zstd 1.4.5 -5             266 MB/s  1243 MB/s     2149050   9.84 ../DEBUG.tar
done... (cIters=1 dIters=1 cTime=1.0 dTime=2.0 chunkSize=1706MB cSpeed=0MB)

Krwak's world, 151 mb

lzbench/lzbench Barlow.tar 
lzbench 1.8 (64-bit Linux)   Assembled by P.Skibinski
Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                   7664 MB/s  7710 MB/s   159948800 100.00 Barlow.tar
density 0.14.2 -1        2213 MB/s  1861 MB/s    86133206  53.85 Barlow.tar
density 0.14.2 -2         751 MB/s  1342 MB/s    36986604  23.12 Barlow.tar
density 0.14.2 -3         447 MB/s   433 MB/s    22424526  14.02 Barlow.tar
fastlz 0.5.0 -1           608 MB/s  1279 MB/s    30339927  18.97 Barlow.tar
fastlz 0.5.0 -2           564 MB/s  1331 MB/s    27403539  17.13 Barlow.tar
lizard 1.0 -10            838 MB/s  2294 MB/s    28533994  17.84 Barlow.tar
lizard 1.0 -11            819 MB/s  2292 MB/s    28209761  17.64 Barlow.tar
lizard 1.0 -12            234 MB/s  2589 MB/s    22650758  14.16 Barlow.tar
lizard 1.0 -13            133 MB/s  2682 MB/s    20804866  13.01 Barlow.tar
lizard 1.0 -14            115 MB/s  2782 MB/s    19567386  12.23 Barlow.tar
lz4 1.9.3                1036 MB/s  3799 MB/s    27887084  17.44 Barlow.tar
lz4fast 1.9.3 -3         1068 MB/s  3802 MB/s    28419843  17.77 Barlow.tar
lz4fast 1.9.3 -17        1123 MB/s  3748 MB/s    33253651  20.79 Barlow.tar
lzf 3.6 -0                603 MB/s  1057 MB/s    31079949  19.43 Barlow.tar
lzf 3.6 -1                634 MB/s  1076 MB/s    30320162  18.96 Barlow.tar
lzfse 2017-03-08           58 MB/s  1306 MB/s    14807982   9.26 Barlow.tar
lzjb 2010                 461 MB/s   375 MB/s    44283203  27.69 Barlow.tar
lzo1b 2.10 -1             655 MB/s  1156 MB/s    26015263  16.26 Barlow.tar
lzo1c 2.10 -1             616 MB/s  1108 MB/s    27797398  17.38 Barlow.tar
lzo1f 2.10 -1             577 MB/s  1033 MB/s    28158113  17.60 Barlow.tar
lzo1x 2.10 -1             991 MB/s  1101 MB/s    28148261  17.60 Barlow.tar
lzo1y 2.10 -1             984 MB/s  1026 MB/s    27217690  17.02 Barlow.tar
lzrw 15-Jul-1991 -1       515 MB/s   661 MB/s    42946078  26.85 Barlow.tar
lzrw 15-Jul-1991 -3       518 MB/s   958 MB/s    35835837  22.40 Barlow.tar
lzrw 15-Jul-1991 -4       540 MB/s   755 MB/s    34162494  21.36 Barlow.tar
lzrw 15-Jul-1991 -5       199 MB/s  1115 MB/s    28022788  17.52 Barlow.tar
lzsse4fast 2019-04-18     439 MB/s  4443 MB/s    28525983  17.83 Barlow.tar
lzsse8fast 2019-04-18     417 MB/s  4274 MB/s    29020482  18.14 Barlow.tar
lzvn 2017-03-08            97 MB/s  1496 MB/s    20717455  12.95 Barlow.tar
pithy 2011-12-24 -0       966 MB/s  2570 MB/s    24299837  15.19 Barlow.tar
pithy 2011-12-24 -3       972 MB/s  2605 MB/s    23524762  14.71 Barlow.tar
pithy 2011-12-24 -6       932 MB/s  2687 MB/s    23052713  14.41 Barlow.tar
pithy 2011-12-24 -9       872 MB/s  2663 MB/s    22874247  14.30 Barlow.tar
quicklz 1.5.0 -1          763 MB/s  1314 MB/s    23882622  14.93 Barlow.tar
quicklz 1.5.0 -2          404 MB/s  1512 MB/s    20157441  12.60 Barlow.tar
shrinker 0.1             2630 MB/s  5156 MB/s   139116454  86.98 Barlow.tar
snappy 2020-07-11         763 MB/s  2700 MB/s    28563615  17.86 Barlow.tar
tornado 0.6a -1           691 MB/s   802 MB/s    24807464  15.51 Barlow.tar
tornado 0.6a -2           549 MB/s   736 MB/s    22061904  13.79 Barlow.tar
tornado 0.6a -3           378 MB/s   629 MB/s    17130546  10.71 Barlow.tar
zstd 1.4.5 -1             701 MB/s  1722 MB/s    15885405   9.93 Barlow.tar
zstd 1.4.5 -2             662 MB/s  1637 MB/s    16206240  10.13 Barlow.tar
zstd 1.4.5 -3             644 MB/s  1726 MB/s    15355992   9.60 Barlow.tar
zstd 1.4.5 -4             626 MB/s  1731 MB/s    15364562   9.61 Barlow.tar
zstd 1.4.5 -5             280 MB/s  1749 MB/s    13909818   8.70 Barlow.tar
done... (cIters=1 dIters=1 cTime=1.0 dTime=2.0 chunkSize=1706MB cSpeed=0MB)

TL;DR: zstd is the clear winrar in both single file tests.

@Raikiri

Raikiri commented Dec 17, 2020

I think zstd looks like a good backend, but from an implementation and maintenance standpoint I think it's more reasonable to pick a library that already does all the VFS packing/unpacking/updating transparently, even if it had slightly worse performance parameters. Or maybe it would just use zstd as a backend, idk.

I've always been on the fence about using C-style unsafe libraries in C++ projects without a proper safe wrapper with smart pointers and such, but this one is up to you guys to decide - I have no clue what guidelines you follow in this repo.
