LMDB files compatibility between platforms #364

Closed
dustContributor opened this issue Jan 7, 2018 · 7 comments

Comments

@dustContributor

I was reading up about LMDB and I found this blog entry:

https://blog.separateconcerns.com/2016-04-03-lmdb-format.html

This caught my eye:

The first thing to know is that the data.mdb format is platform-specific, which means 
that you cannot necessarily open a database created on a machine on another one. 
In practice, on ARM and x86, the only thing you have to care about is whether the 
machine is 32 or 64 bits. 

I didn't see this explained anywhere on LMDB's site. I'm guessing it has something to do with padding and/or relative pointers stored in the file.

I also see a few defines that might affect this in LMDB's sources:
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#L190

I thought this "VL32" build might have something to do with it but after reading this I don't think so:
https://www.openldap.org/lists/openldap-devel/201410/msg00001.html

Is it possible to compile it in a way that makes the files compatible across 32-bit and 64-bit builds? As it stands, if I understand correctly, if you run a program in a 32-bit JVM, save something, and then load it in a 64-bit JVM, the DB file won't work.

@dustContributor
Author

dustContributor commented Jan 7, 2018

Ah, LMDB doesn't seem to have a pure in-memory mode. Hm, this puts a dent in what I was trying to accomplish.

I wanted to have read only LMDB files storing base game data and mods, and an in-memory representation of all of them combined. Then later write out "diffs" of what the player does as the would-be game save files.

Say the base game defines an apple on the counter. A mod modifies the apple's value. And a save game file modifies its position (because the player moved it).

EDIT: This looks pretty interesting: UnQLite. Made in ANSI C, works in a similar way but it allows you to make it persistent or in-memory, and it says the stored data is platform independent. Can't find benchmarks for it though.

@Spasi
Member

Spasi commented Jan 7, 2018

Cross-platform/architecture compatibility

LMDB files are indeed sensitive to CPU word size and endianness. We can reasonably assume little-endian CPUs everywhere, so 32-bit vs 64-bit is the real issue. The only 32-bit binaries we currently ship are on Windows x86, but we may have to worry about 32-bit ARM soon too. Some options:

  • Generate the application data twice, once to a 32-bit DB and once to a 64-bit DB. Ship the appropriate DB for the user's platform.
  • Generate the application data to a 32-bit DB, convert to a 64-bit DB (the opposite is not safe afaik). Again, ship the appropriate DB for the user's platform.
  • LWJGL switches to building 32-bit binaries with MDB_VL32. This indeed makes LMDB files compatible across architectures, at the expense of slightly lower performance on 32-bit (internal structures become 64-bit). Looks like Monero did this about a year ago (uses LMDB to store the blockchain) and I've also just tested it with success, so I guess this mode is supported and mature now. Obvious drawback: we break existing DBs, but I'm not aware of any LWJGL users that work with LMDB, except on server applications (x64-only).
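
A rough illustration of why word size matters and why fixed 64-bit structures avoid the problem. This is a sketch using Python's standard `struct` module, not LMDB code: the two-field record is hypothetical and is not LMDB's real page layout.

```python
import struct

# Hypothetical two-field record (say, a page number and a size).
# This is NOT LMDB's real layout; it only shows the effect of word size.

# Native layout: "N" is the platform's size_t, so each field takes 4 bytes
# on a 32-bit build and 8 bytes on a 64-bit build.
native_size = struct.calcsize("@NN")

# Fixed layout: "<" forces little-endian with standard sizes, and "Q" is
# always an unsigned 64-bit integer, whatever the machine's word size.
fixed_size = struct.calcsize("<QQ")
fixed_bytes = struct.pack("<QQ", 1, 2)

print("native record size:", native_size)
print("fixed record size:", fixed_size)
```

A file written with the native layout can only be read back by a build with the same word size, while the fixed layout produces the same bytes everywhere, at the cost of wider fields on 32-bit.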

In-memory mode

LMDB indeed does not have a pure in-memory mode, unless you count storing the DB on a RAM disk / tmpfs. However, you can get pretty close to it by enabling one or more of MDB_WRITEMAP, MDB_NOMETASYNC, MDB_NOSYNC, MDB_MAPASYNC. See the documentation for what they do exactly and how they interact with each other. You then use mdb_env_sync() to persist changes to disk whenever it is appropriate for your application.

Make sure to also use zero-copy reads/writes (see MDB_RESERVE).
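
As a loose analogy for the "write into the map, sync when you choose" pattern, here is a sketch with Python's standard `mmap` module. It does not use LMDB at all: the scratch file name is made up, and `flush()` merely plays the role that mdb_env_sync() plays in the description above.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file standing in for a database file.
path = os.path.join(tempfile.mkdtemp(), "scratch.bin")
with open(path, "wb") as f:
    f.truncate(4096)  # the file must have a size before it can be mapped

with open(path, "r+b") as f:
    view = mmap.mmap(f.fileno(), 4096)
    # Writing through the mapping only changes pages in memory; the OS
    # decides when they reach the disk.
    view[0:5] = b"apple"
    # Explicit sync point, chosen by the application.
    view.flush()
    view.close()

with open(path, "rb") as f:
    on_disk = f.read(5)
```

Between the write and the `flush()` call there is no durability guarantee, which is the trade-off the no-sync flags make.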

@dustContributor
Author

dustContributor commented Jan 7, 2018

Cool, the VL32 flag would work; it isn't much different from what LWJGL does when mapping all pointers to long, regardless of "bitness". The use case I was thinking of is using it as a backend for storing game data and game save files, keeping in mind the case where a player wants to copy save game files across systems or, like what Obsidian does with Tyranny, the developer requests the player's save game to reproduce the bug they're having.

As for the in-memory part, I saw that before and from what I googled around, it might have issues:
https://stackoverflow.com/questions/36170778/in-memory-databases-with-lmdb

For instance, people on SO say that the OS can cap the number of dirty pages before they get committed to disk, regardless of how LMDB was set up. Moreover, that mechanism varies by OS.

I also looked at MVStore a while ago, made in Java by the H2 people, but the internals looked... icky, especially the craziness of how it deals with serializing fields (some weird recursive calling of type handlers).

@Spasi
Member

Spasi commented Jan 8, 2018

Have you considered using two separate databases (i.e. two MDB_env *)? One in "safe" mode and one in nosync/async mode. Store all transient data to the async db and when it's time to save the game, copy the modified data over to the safe db. Whether the OS has persisted the transient data to disk or not shouldn't make a difference.
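
A minimal sketch of that two-store idea, under loud assumptions: a plain dict stands in for the nosync/async environment and a JSON file stands in for the "safe" one. The class name `TwoTierStore` and the file name are hypothetical, and nothing here is LMDB API.

```python
import json
import os
import tempfile


class TwoTierStore:
    """Transient in-memory data plus a durable file written only on save."""

    def __init__(self, safe_path):
        self.safe_path = safe_path
        self.transient = {}  # stands in for the nosync/async environment
        self.dirty = set()   # keys changed since the last save

    def put(self, key, value):
        self.transient[key] = value
        self.dirty.add(key)

    def save(self):
        # Load whatever the durable store already holds.
        safe = {}
        if os.path.exists(self.safe_path):
            with open(self.safe_path) as f:
                safe = json.load(f)
        # Copy over only the entries modified since the last save.
        for key in self.dirty:
            safe[key] = self.transient[key]
        # Write to a temp file, fsync it, then swap it into place.
        tmp_path = self.safe_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(safe, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.safe_path)
        self.dirty.clear()


store = TwoTierStore(os.path.join(tempfile.mkdtemp(), "save.json"))
store.put("apple", {"pos": "floor"})
exists_before_save = os.path.exists(store.safe_path)
store.save()
with open(store.safe_path) as f:
    saved = json.load(f)
```

Nothing durable exists until `save()` runs, so it does not matter whether the transient side was ever flushed by the OS.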

@dustContributor
Author

dustContributor commented Jan 10, 2018

Wouldn't it be simpler to just use a normally persisted DB for the save file in that case? I'd gain nothing from having two DBs if the idea is to serialize it anyway.

In any case, the thing is that I needed an in-memory version not for the save file but for the merging of base game data and mods. Save game stuff goes into another instance, which would work like a diff rather than an intersection of the game data.

  • Read-only file-backed: Game data, mods.
  • In-memory: Intersection between game data and mods. This one is constructed at runtime based on mod load order and never serialized.
  • Read-write file-backed: Game save file with "diffs" between the composed in-memory db and what the player does.

The process would work like this: read the game data, overwrite/add entries with mod data, and start recording changes from the playthrough into a separate DB, which gets written to a file when the player saves the game.
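
The three layers described above can be sketched with Python's standard `collections.ChainMap`. This is only an illustration of the lookup and write rules, not LMDB code: the record fields and the single-mod load order are made up.

```python
from collections import ChainMap

# Hypothetical records; keys and fields are made up for illustration.
base_game = {"apple": {"value": 1, "pos": "counter"}}
mod_a = {"apple": {"value": 5, "pos": "counter"}}
load_order = [mod_a]  # later mods override earlier ones

# In-memory merged view: ChainMap searches left to right, so the last mod
# in the load order goes first and the base game goes last.
merged = ChainMap(*reversed(load_order), base_game)

# The save layer sits on top. Lookups fall through to mods and base data;
# writes through `world` only ever touch `save_diff`.
save_diff = {}
world = merged.new_child(save_diff)

# The player moves the apple: the whole updated record goes into the diff.
world["apple"] = {**world["apple"], "pos": "floor"}
```

After this runs, `save_diff` holds only the apple record with the mod's value and the new position, while `base_game` and `mod_a` are untouched, so they can stay read-only.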

The game would be continuously recording the differences, since LMDB would be mapping that data to a file. Then, when the player saves, I'd copy the DB into another file-backed DB, close the first one (the player's game save file) and keep recording in the new one.

In any case, I'll check this one first: OpenHFT's ChronicleMap. It makes some good promises, especially the low allocations part.

If that doesn't fly, I'll try that workaround for LMDB or another Java library.

EDIT: Nvm, lib is LGPL. F**k me.

Thanks for your input! Should I close the issue? Will the next LWJGL builds have the VL32 define?

@Spasi
Member

Spasi commented Jan 10, 2018

I'll close it soon, when the VL32 change is live.

@dustContributor
Author

Cool, thanks!


2 participants