Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugs in handling non-ASCII characters in clipboard over restarts #357

Closed
comodoro opened this issue Dec 19, 2018 · 14 comments
Closed

Bugs in handling non-ASCII characters in clipboard over restarts #357

comodoro opened this issue Dec 19, 2018 · 14 comments
Labels
Bug Unexpected behavior from existing features. UpStream An issue related to an upstream dependency or project.

Comments

@comodoro
Copy link
Contributor

  1. If I select "závorka", do "stoosh three", restart Dragon and do "spark three", I get zxe1vorka. It should be encoded as z\xe1vorka. Probably affects characters 128-255. Confirmed by @mrob95

  2. Certain saved clipboard items give

<class 'toml.decoder.TomlDecodeError'>
Reserved escape sequence used (line 1 column 1 char 0)
<traceback object at 0x0EF9EB48>
  File "C:\workspaces\dns_userdir\caster\lib\utilities.py", line 85, in load_toml_file
    result = toml.loads(f.read())

  File "C:\Python27\lib\site-packages\toml\decoder.py", line 458, in loads
    raise TomlDecodeError(str(err), original, pos)

A clipboard file is pasted here: https://pastebin.com/jGeV5jvE

@Versatilus
Copy link
Collaborator

It looks like this is an issue upstream.
uiri/toml#201

If a fix isn't trivial to implement, it might be a good idea to switch this file and others which don't hold information usually edited by users to JSON while a fix is being worked on.

@comodoro
Copy link
Contributor Author

Thanks, good find. I am not sure what would be more work.

Changing to unicode at https://github.com/synkarius/caster/blob/bc5b2b8bd4a2495c910a4beb4260d50847f922a9/caster/lib/navigation.py#L100 seems to eliminate hex literals, but I am not certain about the whole issue

@Versatilus
Copy link
Collaborator

I'll have to investigate further. I should've paid more attention to the clipboard changes in Dragonfly. It used to all be Unicode and just work.

@LexiconCode LexiconCode added Bug Unexpected behavior from existing features. UpStream An issue related to an upstream dependency or project. labels Dec 19, 2018
@comodoro
Copy link
Contributor Author

I don't know, I think with Json it worked. The upstream issue mention was spot on, because the string \x causes problems, being saved as \u00, exactly as described in the issue. Still I would maybe consider dirty temporary fix before temporarily moving back to Json

@LexiconCode
Copy link
Member

@Versatilus @comodoro
It looks like we have two alternatives for Python implementations of toml.

@comodoro
Copy link
Contributor Author

@Versatilus @LexiconCode A little peek on them:

  • qtoml only supports Python 3
  • tomlkit seems to have problems with Unicode strings, but might be worth investigating if this is a trivial omission or not, as it has key sorting and there is Implement Sort keys when saving Toml files. #323, if I understand it correctly

@LexiconCode
Copy link
Member

Thank you for taking a closer look. How about for now, we utilize tomlkit as it seems to be a more advanced implementation?

  • generate comments as part of spec
  • The ability to modify TOML documents.
    and so on.

Then for non-user facing files such as clipboard will use json until/if toml parsing libraries become a bit more mature.

@comodoro
Copy link
Contributor Author

So it turns out I did not use Unicode literals, the tests in tomlkit are quite extensive and include Unicode without problems. So tomlkit seems to be OK everywhere.

@LexiconCode
Copy link
Member

LexiconCode commented Jan 18, 2019

@comodoro
tomlkit is not going to be a straight port.
It Caster loads after deleting .toml files but on second load the following stack trace is produce. Which means there's an issue with how toml are read. Which means that our current implementation is going to need some tweaking.

@comodoro
Copy link
Contributor Author

@LexiconCode This is really best solved upstream, I will post a patch there, in case of non-reaction there are options of Versatilus' going back to json, monkey patching or adapting tomlkit.

@LexiconCode
Copy link
Member

Sounds like a plan.

@LexiconCode
Copy link
Member

Let me know when you have a patch upstream and I'll comment supporting your implementation.

@comodoro
Copy link
Contributor Author

comodoro commented May 8, 2019

OK so it was just a preliminary shot, however it fails one of the tests and is still without reaction: uiri/toml#244. I have noticed as well that it can encode strings into TOML that the decoder is unable to decode, do not know if the decoder would have to be rewritten as well. All in all I think it is not worth the trouble as JSON was perfectly fine.

@comodoro
Copy link
Contributor Author

comodoro commented Jul 1, 2019

This was solved/cuircumvented by #582

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Unexpected behavior from existing features. UpStream An issue related to an upstream dependency or project.
Projects
None yet
Development

No branches or pull requests

3 participants