Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect logger output for non-Latin strings. #2925

Merged

Conversation

allexzander
Copy link
Contributor

Signed-off-by: allexzander [email protected]

@allexzander
Copy link
Contributor Author

Logging of non-Latin named characters outputs garbage (confirmed with Cyrillic and Chinese at least), making it hard to study sync bug reports for such languages. This is an attempt to fix that.

Especially looking forward to suggestions on how to do it better from @er-vin.

Copy link
Member

@er-vin er-vin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure my comments are really on point, they're more questions to ask ourselves on such tasks. I think I still lack a bit of context on that one. Or my brain is underpowered by inactivity... can't be excluded. :-)

src/common/ownsql.h Show resolved Hide resolved
src/gui/main.cpp Outdated
@@ -73,6 +73,9 @@ int main(int argc, char **argv)
#endif
OCC::Application app(argc, argv);

QTextCodec *codec1 = QTextCodec::codecForName("UTF-8");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that really makes sense outside of Windows? AFAIK everyone else is using UTF-8 by default now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@er-vin Might be not-needed outside of Windows, so conditional compilation could help. On Windows, though, adding this - helps to make logs written via qDebug, qCInfo, and others, readable in the .log files later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about this though... because we don't change the locale but then we enforce a different codec than the "right" codec for that locale... If it's "just" about the log file because we want to make sure it's always encoded in UTF-8 (because indeed right now it will be encoded in whatever codec required by the user locale) then I'd only enforce UTF-8 on the codec used in the QTextStream of Logger. This would have way less side effects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually now that I think of it, this might be the only change you need... if I'm not mistaken, all QByteArrays written to the log file at that point would be forcibly interpreted as UTF-8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually now that I think of it, this might be the only change you need... if I'm not mistaken, all QByteArrays written to the log file at that point would be forcibly interpreted as UTF-8.

@er-vin Works in most scenarios but fails for the SQLite case, so changes to bindValue and qCInfo(lcDb) << "Updating file record for path:" << QString::fromUtf8(record._path) << "inode:" << record._inode are still needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@er-vin Unless, I could go with changing QByteArray to QString for paths everywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually now that I think of it, this might be the only change you need... if I'm not mistaken, all QByteArrays written to the log file at that point would be forcibly interpreted as UTF-8.

@er-vin Works in most scenarios but fails for the SQLite case, so changes to bindValue and qCInfo(lcDb) << "Updating file record for path:" << QString::fromUtf8(record._path) << "inode:" << record._inode are still needed

OK, then I was likely wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@er-vin Unless, I could go with changing QByteArray to QString for paths everywhere.

Well, that'd be a much bigger and error prone endeavor. I'd keep that for much later, master doesn't need this right now. ;-)

@@ -909,7 +909,7 @@ bool SyncJournalDb::setFileRecord(const SyncJournalFileRecord &_record)
}
}

qCInfo(lcDb) << "Updating file record for path:" << record._path << "inode:" << record._inode
qCInfo(lcDb) << "Updating file record for path:" << QString::fromUtf8(record._path) << "inode:" << record._inode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you want is fromLocal8Bit? This is indeed what we should use for QByteArrays coming from the system before display.... that said I'm unsure, could be processed before and we already know it's utf8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@er-vin fromLocal8Bit gives garbage. This has been processed before, and yes we know it's UTF-8. It's coming from QByteArray csync_file_stat_t::path which was set to QString::fromWCharArray(handle->ffd.cFileName).toUtf8 before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. So we're good with utf8 indeed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I'm wondering if you might want to do the same with the _e2eMangleName below, this is also a path stored as QByteArray... you might want to check that too.

@allexzander
Copy link
Contributor Author

Not sure my comments are really on point, they're more questions to ask ourselves on such tasks. I think I still lack a bit of context on that one. Or my brain is underpowered by inactivity... can't be excluded. :-)

@er-vin

The context can be seen when browsing log files of a Desktop Client. Here is one of such examples:

2021-02-10 17:19:48:408 [ info nextcloud.sync.csync.reconciler ]: INSTRUCTION_REMOVE client file: Ãðàôèêè ïî ó÷àñòêàì/!Ñëóæáà äîêóìåíòàöèè

As you can see, the line after client file: looks some kind of gibberish. So, the PR is about fixing such cases. In fact, it works right now, but I am not sure if that's the proper way of fixing it. And yes, this happens on Windows.

@allexzander allexzander marked this pull request as ready for review February 15, 2021 18:22
@allexzander allexzander force-pushed the fix-incorrect-logger-output-for-non-latin-strings branch from 32d1f0e to 414e792 Compare February 15, 2021 18:23
@allexzander allexzander requested a review from er-vin February 15, 2021 18:23
@allexzander allexzander changed the title Fix incorrect logger output for non-latin strings. Fix incorrect logger output for non-Latin strings. Feb 15, 2021
@allexzander
Copy link
Contributor Author

@er-vin If you got a spare minute - feel free to add some nitpicks :)

@@ -55,6 +55,8 @@ class OCSYNC_EXPORT SyncJournalFileRecord
bool isDirectory() const { return _type == ItemTypeDirectory; }
bool isFile() const { return _type == ItemTypeFile || _type == ItemTypeVirtualFileDehydration; }
bool isVirtualFile() const { return _type == ItemTypeVirtualFile || _type == ItemTypeVirtualFileDownload; }
QString pathFromUtf8() const { return QString::fromUtf8(_path); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a "fromSomething" I'd expect the "Something" as parameter while this takes no parameter. What about "pathAsString()"? I'm even wondering if just "path()" would be enough... since we were toying with the idea of switching it all to QString instead, having such a QString path() const would be a nice API to slowly port against and when _path isn't used directly anymore it's a sign it could be switched to QString. Sounds like a potentially nice strategy for such a changer over a longer period of time.

@@ -55,6 +55,8 @@ class OCSYNC_EXPORT SyncJournalFileRecord
bool isDirectory() const { return _type == ItemTypeDirectory; }
bool isFile() const { return _type == ItemTypeFile || _type == ItemTypeVirtualFileDehydration; }
bool isVirtualFile() const { return _type == ItemTypeVirtualFile || _type == ItemTypeVirtualFileDownload; }
QString pathFromUtf8() const { return QString::fromUtf8(_path); }
QString e2eMangledNameFromUtf8() const { return QString::fromUtf8(_e2eMangledName); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@allexzander allexzander force-pushed the fix-incorrect-logger-output-for-non-latin-strings branch from cb0ae42 to 4d66d20 Compare February 16, 2021 07:30
@allexzander allexzander requested a review from er-vin February 16, 2021 07:38
@nextcloud-desktop-bot
Copy link

AppImage file: Nextcloud-PR-2925-4d66d209bc4d46e42a97ba20cda908be84702093-x86_64.AppImage

To test this change/fix you can simply download above AppImage file and test it.

Please make sure to quit your existing Nextcloud app and backup your data.

@allexzander allexzander merged commit e6ec648 into master Feb 16, 2021
@allexzander allexzander deleted the fix-incorrect-logger-output-for-non-latin-strings branch February 16, 2021 14:13
@allexzander allexzander restored the fix-incorrect-logger-output-for-non-latin-strings branch February 17, 2021 12:52
@nextcloud nextcloud deleted a comment from backportbot-nextcloud bot Feb 17, 2021
@allexzander
Copy link
Contributor Author

/backport to stable-3.1

@backportbot-nextcloud
Copy link

The backport to stable-3.1 failed. Please do this backport manually.

@allexzander allexzander deleted the fix-incorrect-logger-output-for-non-latin-strings branch February 17, 2021 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants