ascii restriction mode #24

Closed
esavier opened this issue Nov 26, 2021 · 8 comments
Labels: enhancement (New feature or request)

Comments

esavier commented Nov 26, 2021

Issue Description

I am using this tool mainly to fix filesystem issues that I run into after working with files from different sources. An example is file names that use UTF-8 characters outside the standard ASCII range, such as Asian, Slavic, extended Latin or German special characters (e.g. 'ąłśćż'). These files work without a problem on modern file systems like btrfs, but in some cases, for example when they are copied or transferred to file systems that do not support UTF-8 out of the box, the operation produces errors or fails outright, since the drivers do not know what to do with those characters.
This is where RnR comes in: I use it to strip those characters as far as I am able to, but it takes some wizardry to do properly and I am still not sure whether I have overlooked something. Also, my approach is extremely lossy at the moment.

Resolution Proposition

There is a library that transliterates special characters to ASCII ones: https://github.com/anyascii/anyascii
Could we get a flag that, for example in addition to running the regexp, forces special characters to be converted to ASCII ones? For example, --restrict.

This issue mainly concerns Unix/Linux platforms.
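
To make the request concrete, here is a minimal Python sketch of this kind of conversion. It is not anyascii and does not reflect how RnR works: it only uses the standard `unicodedata` module to drop accents, plus a small hand-picked `FALLBACK` table (an assumption for illustration) for letters that have no Unicode decomposition. Anything else, such as Asian scripts, still collapses to a placeholder, which is exactly the gap a full transliteration table like anyascii is meant to close.

```python
import unicodedata

# Hypothetical, hand-picked fallbacks for letters that NFKD cannot decompose.
FALLBACK = {
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
    "đ": "d", "Đ": "D",
    "ß": "ss",
    "æ": "ae", "Æ": "AE",
}


def to_ascii(name: str, placeholder: str = "_") -> str:
    """Return an ASCII-only approximation of a file name."""
    out = []
    # NFKD splits e.g. "ä" into "a" followed by a combining diaeresis.
    for ch in unicodedata.normalize("NFKD", name):
        if ch.isascii():
            out.append(ch)
        elif unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        else:
            # No decomposition: use the table, else give up on this character.
            out.append(FALLBACK.get(ch, placeholder))
    return "".join(out)


if __name__ == "__main__":
    for sample in ("BørkBørk.nfo", "Häuser.txt", "ąłśćż"):
        print(sample, "->", to_ascii(sample))
```

The design choice here is to normalize the whole name first, so that names already stored in decomposed form are handled the same way as precomposed ones.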

Author

esavier commented Nov 26, 2021

Worst case, I can implement this feature myself, but I wanted to ask first, since I may not be seeing everything, and I would also like a green light before I start working on it :)

ismaelgv self-assigned this on Nov 27, 2021
ismaelgv added the enhancement (New feature or request) and needs-discussion (This issue needs to be discussed) labels and removed the enhancement (New feature or request) label on Nov 27, 2021
Owner

ismaelgv commented Nov 27, 2021

Hi @esavier , thanks for opening this issue.

If I understood you correctly, you want to force conversion to ASCII characters when the output goes to filesystems that do not support UTF-8. Wouldn't it be enough to force the conversion using the current implementation?

Could you give me some examples so we can discuss the problem in more detail? This way we can see whether we need to implement something more or can just use the current features.

This may be somewhat related to #6.

Author

esavier commented Nov 27, 2021

Hi ismaelgv!
Yeah, sure, I will prepare example use cases tomorrow when I am at the PC again, together with a proper description and explanation :)

Owner

ismaelgv commented Nov 28, 2021

@esavier I've been playing around with any_ascii and added a new command to RnR. You can try it by building this project directly from master, since it is not released yet. It would be great if you could test this new command a bit and provide some feedback.

It shares some common options with the root command but does not take an expression or replacement.

You can pass a list of files:

rnr to-ascii ńôn-äscìí.txt # should translate to non-ascii.txt

Or use the recursive mode:

rnr to-ascii -r ./ 
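
For readers curious what a recursive pass involves, here is a rough Python sketch of a dry-run rename planner. It is not RnR's implementation; `plan_renames` and its `convert` parameter are hypothetical names, and `convert` stands in for whatever name conversion is wanted. The sketch walks the tree bottom-up so that entries inside a directory are planned before the directory itself, and it skips targets that would collide.

```python
import os


def plan_renames(root, convert):
    """Return (old_path, new_path) pairs, deepest entries first (dry run)."""
    plan = []
    targets = set()
    # topdown=False yields children before their parent directory, so
    # applying the plan in order never invalidates a later old_path.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = convert(name)
            if new_name == name:
                continue  # nothing to do for this entry
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            # Skip anything that would overwrite an existing entry or
            # clash with another planned rename.
            if new_path in targets or os.path.lexists(new_path):
                continue
            targets.add(new_path)
            plan.append((old_path, new_path))
    return plan


if __name__ == "__main__":
    # Example conversion: replace every non-ASCII character with "_".
    def demo(name):
        return "".join(c if c.isascii() else "_" for c in name)

    for old, new in plan_renames(".", demo):
        print(old, "->", new)
```

Nothing is renamed here; the returned pairs could be reviewed first and then passed to `os.rename` in the same order.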

Author

esavier commented Nov 29, 2021

Damn, that was fast!
Sorry I couldn't respond yesterday. Before I get into testing, here is what I meant:

BørkBørk.nfo -> BorkBork.nfo
Häuser.txt -> Hauser.txt
... and all of the Asian stuff you can think of

The problem appears when I need to copy stuff to old file systems, like ext3, FAT32/exFAT, or something exotic that does not support UTF-8; the drivers usually say "operation failed with code XXX".

It's hard to misuse anyascii as a lib, so I assume you added the option I meant.
I promise I will test it this evening :)

much kudos <3 !

Author

esavier commented Nov 30, 2021

Yeah, tested, it works marvelously.
For a moment I wondered whether it would be possible to chain it with a regexp rename, but that would be dangerous, unwieldy and non-trivial, so in the end I like the implementation quite a lot.

With this, the ASCII-like UTF-8 characters can easily be converted to actual ASCII, so the FAT32 driver no longer throws an error.
Thousand kudos :)

Also, just a note:
I noticed "Replace all file name chars with ASCII chars. This operation is extremely lossy." as the explanation.
Well, ...yes, but up to this point I was using a workaround that produced results like this:

./ńôn-äscìí_txt ⇒ ./__n-_sc___txt
./ńôn2-äscìí_txt ⇒ ./__n2-_sc___txt
./ńôn2-äscìí4_txt ⇒ ./__n2-_sc__4_txt

So the option with `anyascii` is not as bad as it looks :wink:
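
For comparison, the underscore results above are what a plain "replace every non-ASCII character" substitution produces. The Python sketch below reproduces them; the exact expression used in the original workaround is not given in this thread, so the pattern and the `strip_non_ascii` name are assumptions for illustration.

```python
import re

# Matches any single character outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7f]")


def strip_non_ascii(name: str, placeholder: str = "_") -> str:
    """Replace each non-ASCII character with a placeholder (lossy)."""
    return NON_ASCII.sub(placeholder, name)


if __name__ == "__main__":
    for sample in ("ńôn-äscìí_txt", "ńôn2-äscìí4_txt"):
        print(sample, "=>", strip_non_ascii(sample))
```

Every accented letter is lost this way, whereas a transliteration keeps the base letter, which is why the anyascii-based command loses far less information.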

Author

esavier commented Nov 30, 2021

We can close this if there is nothing else to discuss (^.^)

Owner

ismaelgv commented Nov 30, 2021

Thanks for testing the new feature. 😉

> Yeah, tested, it works marvelously.
> For a moment I wondered whether it would be possible to chain it with a regexp rename, but that would be dangerous, unwieldy and non-trivial, so in the end I like the implementation quite a lot.

I thought about combining them, but the complexity would rise and the user experience would degrade. In the end I split this into a separate command and refactored part of the application to ease the addition of new commands.

I am planning to release a new version this week; I want to fix some minor stuff first.

Fixed in e708713

ismaelgv added the enhancement (New feature or request) label and removed the needs-discussion (This issue needs to be discussed) label on Nov 30, 2021