Error "Corruption: corrupted compressed block contents" #137
Comments
i guess the contents are compressed? |
Yeah, that would make sense, since this only happens when a lot of local storage data is continuously saved into the Local Storage leveldb folder. I'm very new to leveldb. Do you know which compression algorithms are commonly used with leveldb, or have you ever run into this issue yourself? In theory I should be able to decompress the data and then read it with plyvel again, but how would I figure out the compression algorithm? |
don't specify anything and it will detect and use snappy if needed |
also wondering what you're trying to imply here |
I just read through the documentation again and couldn't find any functionality to decompress. As for not specifying anything, if I understood you correctly you meant it like this(?): db = plyvel.DB(dbdir) That still ends up in the error from the title. I also used repair_db, and that causes my database to lose around 80% of the information it contained, including the key that was, I guess, saved in the compressed parts of the leveldb? |
just tried this on (a copy of) this directory on my machine:
with these versions:
which gives me
and similarly:
i see data from chrome, though it's chrome's internal binary format so good luck interpreting that. |
I have exactly the same versions as you, but I can't run your code snippet without getting the error in the title. Why is this happening for me? I'm using this repo: https://github.com/AustEcon/plyvel-wheels, with Python 3.8.10 and Chrome 96.0.4664.93 for Windows 10 64-bit |
🤔 perhaps your leveldb build lacks snappy support altogether? ( |
First off, thanks a lot for taking the time to actually help me, I really appreciate it 👍 I spent the entire time trying to reproduce this and eventually switched to a virtual machine with a fresh Windows 10, an English Chrome and Python 3.10.1. There I could run your snippet without issues, at first. But later on I figured out that there is some kind of additional compression once the local storage grows beyond 800 kb: all the files that were in the folder before 800 kb was reached get merged into one single file, which then ends up being between 200 and 300 kb. You can reproduce this yourself if you delete all contents of the Local Storage leveldb folder and then browse enough websites until 800 kb is reached. I prepared this short snippet for the
Python code I used afterwards to read the leveldb after the additional compression:
import plyvel
db = plyvel.DB(r'C:\Users\MyUserName\Appdata\Local\Google\Chrome\user data\Default\Local Storage\leveldb')
for each in db.iterator():
    print(each)
I don't understand what compression is used for this. On Wikipedia it says leveldb only uses snappy compression, and Chrome is listed as using leveldb, so what are they even doing after 800 kb to cause this error?
Also, this was not the case for me: not everything uses the internal binary format. Around 90% of it was always entirely visible to me as raw, human-readable strings. |
do you have a stack trace? which call fails exactly and where? never heard of two types of compression in leveldb. missing snappy lib leads to compression error messages that can be confusing. plyvel linux binary wheels accidentally suffered from that at some point in the past |
Short video I recorded: https://youtu.be/vBLqgjMJelw
The second time I ran the same code, after opening many more websites that pushed the Local Storage leveldb folder above 800 kb, the extra compression kicked in and the total size dropped to 247 kb. I also checked that this data must be compressed rather than discarded: websites that only use local storage to save account information such as a token still remembered me after the compression, so Chrome somehow decompresses it and feeds it back to local storage in the browser, which you can see under F12 > Application tab > Local Storage.
Here's the leveldb that plyvel raises an error with: https://easyupload.io/j2z8mr
Traceback (most recent call last):
File "C:\Users\Rando\Desktop\dog.py", line 3, in <module>
for each in db.iterator():print(each)
File "plyvel\_plyvel.pyx", line 841, in plyvel._plyvel.Iterator.__next__
File "plyvel\_plyvel.pyx", line 886, in plyvel._plyvel.Iterator.real_next
File "plyvel\_plyvel.pyx", line 91, in plyvel._plyvel.raise_for_status
plyvel._plyvel.CorruptionError: b'Corruption: corrupted compressed block contents'
Since Chrome is based on Chromium, I guess this might help if you understand C++, because I don't: https://github.com/chromium/chromium/search?q=.ldb |
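For anyone trying to narrow down where the iteration breaks: a minimal sketch that consumes an iterator of (key, value) pairs and reports how far it got before the error. The helper name scan_until_error is made up for this example and is not part of plyvel. It works on any iterable, so it can be tried without a database at hand.

```python
def scan_until_error(pairs, errors=(Exception,)):
    """Consume (key, value) pairs and report where iteration stopped.

    Returns (count, last_key, error). error is None if the iterator was
    exhausted normally. This is a hypothetical helper, not a plyvel API.
    """
    count = 0
    last_key = None
    try:
        for key, _value in pairs:
            count += 1
            last_key = key  # remember the last key that was read successfully
    except errors as exc:
        return count, last_key, exc
    return count, last_key, None
```

With plyvel this would presumably be called as scan_until_error(db.iterator(), errors=(plyvel.CorruptionError,)), going by the exception class in the traceback above. Since LevelDB iterates keys in sorted order, the last good key gives a rough idea of which part of the key space sits in the block that cannot be read.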
i tried this on my chrome profile's local storage database which is >10 mb large, and i cannot reproduce at all:
this dumps lots of stuff to the screen. my plyvel is compiled with libsnappy support, as
|
i tried the same on your sample database, and it also worked fine:
|
Did you try this with the Windows build of plyvel? Maybe @AustEcon did not compile it properly, or do you have any other idea why I can't run it? If libsnappy or whatever didn't work at all, I shouldn't have been able to build it / use plyvel on the smaller database in the first place, right? But as you can see in the video, it works there. Unfortunately, as soon as the database gets bigger and I try to read it with plyvel again, I get the error 😨 |
my testing was on an up-to-date linux system using the official (built by myself 🙃) plyvel wheel packages. i have not tried on windows, and i cannot / will not either; i have not used windows at all for ~20 years now. that said, technically, snappy is an optional dependency for leveldb, but not compiling leveldb against it is setting yourself up for nasty surprises… since it means databases using compression (most of them in the real world!) cannot be opened. i further suspect leveldb+snappy use opportunistic compression, meaning only data that benefits from it gets compressed. this could explain the ‘tipping point’ you see. |
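The 'tipping point' theory can be checked on the files themselves, without any LevelDB build. As far as I understand the LevelDB table format, each block in an .ldb file is followed by a 5-byte trailer whose first byte is the compression type (0 = none, 1 = snappy), and the 48-byte footer points at an index block that lists the data blocks. Recent writes also sit uncompressed in the .log file until they are flushed or compacted into .ldb tables, which would fit the observation that small databases open fine. The sketch below counts data blocks per compression type in one table file. The function names are my own, it assumes the standard table layout, and it does not verify checksums.

```python
import struct
from collections import Counter
from pathlib import Path

FOOTER_SIZE = 48                  # two block handles padded to 40 bytes + 8-byte magic
TABLE_MAGIC = 0xDB4775248B80FB57  # stored little-endian at the very end of the file
TYPE_NAMES = {0: "none", 1: "snappy"}


def read_varint(buf, pos):
    """Decode one varint starting at buf[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def type_name(data, offset, size):
    """Name the compression type stored in the trailer byte right after a block."""
    code = data[offset + size]
    return TYPE_NAMES.get(code, f"unknown({code})")


def block_compression(path):
    """Return (index_block_type, Counter of data block types) for one table file."""
    data = Path(path).read_bytes()
    if len(data) < FOOTER_SIZE or struct.unpack("<Q", data[-8:])[0] != TABLE_MAGIC:
        raise ValueError(f"{path} does not look like a LevelDB table file")

    footer = data[-FOOTER_SIZE:]
    _, pos = read_varint(footer, 0)        # metaindex offset (unused here)
    _, pos = read_varint(footer, pos)      # metaindex size (unused here)
    index_offset, pos = read_varint(footer, pos)
    index_size, _ = read_varint(footer, pos)

    index_type = type_name(data, index_offset, index_size)
    counts = Counter()
    if index_type != "none":
        # A compressed index block cannot be walked without a decompressor.
        return index_type, counts

    index = data[index_offset:index_offset + index_size]
    num_restarts = struct.unpack("<I", index[-4:])[0]
    entries_end = len(index) - 4 - 4 * num_restarts
    pos = 0
    while pos < entries_end:
        _shared, pos = read_varint(index, pos)
        unshared, pos = read_varint(index, pos)
        value_len, pos = read_varint(index, pos)
        pos += unshared                      # skip the key bytes
        handle = index[pos:pos + value_len]  # the value is the data block's handle
        pos += value_len
        offset, after = read_varint(handle, 0)
        size, _ = read_varint(handle, after)
        counts[type_name(data, offset, size)] += 1
    return index_type, counts
```

Running block_compression over every *.ldb file in a copy of the Local Storage leveldb folder should show whether any snappy blocks are present. If there are, a LevelDB build without snappy will not be able to read them, whatever the database size was when the first files were written.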
closing since this is very likely not an issue in this repo |
Just a side note: I think it's pretty funny how someone from the Chromium project / Google read through this issue ticket and removed every single line of code that was associated with .ldb. I think at this point they're intentionally trying to dodge decompression of Chrome's leveldb |
@Avnsx same problem here when using the Windows plyvel build from AustEcon/plyvel-wheels, |
@Avnsx a workaround is to use leveldb in WSL on Windows, |
Does plyvel have RepairDB? |
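On the RepairDB question: plyvel exposes it as the module-level function plyvel.repair_db(name), which is what was tried earlier in this thread. Since that attempt lost around 80% of the data, it seems safer to run the repair on a copy only. A minimal sketch, assuming plyvel is installed; the helper name repair_on_copy and its repair parameter are made up for this example, and the parameter exists only so the function can be exercised without plyvel.

```python
import shutil
import tempfile
from pathlib import Path


def repair_on_copy(src, repair=None):
    """Copy a LevelDB folder to a temp location and repair only the copy.

    repair defaults to plyvel.repair_db. Passing another callable is a
    hypothetical hook for trying the sketch without plyvel installed.
    Returns the path of the repaired copy; the original is left untouched.
    """
    if repair is None:
        import plyvel  # imported lazily so the module loads without plyvel
        repair = plyvel.repair_db

    dst = Path(tempfile.mkdtemp(prefix="leveldb-repair-")) / "db"
    shutil.copytree(src, dst)  # Chrome should be closed while copying
    repair(str(dst))
    return dst
```

Note that a LevelDB build without snappy presumably cannot read the compressed blocks during repair either, so a repair with such a build would be expected to drop them rather than recover them.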
FYI, I can confirm this problem occurs on Chromium database if your build of leveldb does not link in the snappy library. I had to rebuild leveldb with snappy (on Windows), then the problem disappeared. |
Can you publish your build, so I can try it? @zmic |
I'm trying to read Google Chrome's Local Storage leveldb. It is located in %LOCALAPPDATA%\Google\Chrome\User Data\Default\Local Storage\leveldb.
When I delete all contents of the folder, browse only a couple of websites and then close Chrome, I can read it with plyvel. But when I browse too many websites and then close Chrome, it starts outputting the error in the title. (Chrome has to be closed: otherwise the database is not readable and the most recent changes to local storage are not yet saved to the Local Storage leveldb, because Chrome is still using it and blocks other programs from reading it, unless you create a temporary copy of the folder and read that instead.) How do I solve this issue? Code used:
I'm using the plyvel for Windows 10 fork on Python 3.8.10, but I'm very sure it's not the issue and that your most up-to-date repo, which I can't even install on Windows (the most used operating system in the world), will replicate the exact same behaviour.