Encoding issue in the name of the xhtml subpart when no id provided #217

Closed
mikrethor opened this issue Jan 7, 2020 · 6 comments

@mikrethor

mikrethor commented Jan 7, 2020

I have a problem with the generated XHTML files whenever an accented character is present in the title of a subdocument.

Let me give an example.

spine.adoc

= Asciidoctor EPUB3: Sample Book
Author Name
v1.0, 2014-04-15
:doctype: book
:producer: Asciidoctor
:keywords: Asciidoctor, samples, e-book, EPUB3, KF8, MOBI, Asciidoctor.js
:copyright: CC-BY-SA 3.0
// NOTE anthology adds support for an author per chapter; use book for a single author
:publication-type: anthology
:idprefix:
:idseparator: -

include::reference.adoc[]

reference.adoc

= Test é

The XHTML file generated for reference.adoc gets the name "test-é.xhtml".

The accented character is rendered properly (test-é.xhtml) in nav.xhtml, toc.ncx and package.opf.

I would expect the behavior to be consistent.
At the very least, the accented character should be removed from the filename and from every other reference to it in such a case.
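
To illustrate the fallback I have in mind, here is a minimal sketch in Python (not the project's Ruby code, and not how asciidoctor-epub3 actually derives IDs). `ascii_slug` is a hypothetical helper; the `-` separator is only meant to mirror the `:idseparator: -` setting above.

```python
import re
import unicodedata


def ascii_slug(title: str, separator: str = "-") -> str:
    """Hypothetical helper: derive an ASCII-only ID from a title."""
    # Decompose accented characters into base letter + combining mark
    # (for example, "é" becomes "e" followed by U+0301).
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, then collapse each run of other characters into the separator.
    slug = re.sub(r"[^a-z0-9]+", separator, stripped.lower())
    # Remove any separator left at the start or end.
    return slug.strip(separator)
```

With something like this, `ascii_slug("Test é")` would give `test-e`, and that same value could then be used for the filename and for every reference to it.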

For now, I have added an explicit id to every file whose title contains an accented character.

I tried to look in the code to correct it; any pointers would be appreciated. I am not familiar with Ruby.

I used version 1.5.0.alpha.9.

Thx

Xavier

@mojavelinux
Member

I'm a bit rusty on the rules, but if my memory serves me, it's EPUB3 and/or Kindle that doesn't allow accented characters in filenames. So I would recommend simply avoiding them: set the ID explicitly so that it does not include an accented character.

@slonopotamus
Contributor

slonopotamus commented Jan 22, 2020

The EPUB spec allows Unicode filenames: http://idpf.org/epub/301/spec/epub-ocf.html#sec-container-filenames

So, technically, it is an asciidoctor-epub3 bug that it doesn't produce them.

Though I agree with @mojavelinux: using Unicode filenames can easily lead to issues in various software.

@slonopotamus slonopotamus added this to the v1.5.0.alpha.11 milestone Jan 22, 2020
slonopotamus added a commit to slonopotamus/asciidoctor-epub3 that referenced this issue Jan 22, 2020
slonopotamus added a commit to slonopotamus/asciidoctor-epub3 that referenced this issue Jan 22, 2020
@mojavelinux
Member

> EPUB spec allows unicode filenames

This issue has come up before, but I misremembered the conclusion. I believe it was that zip files don't support Unicode filenames, or at least rubyzip doesn't. Either way, the error wasn't originating from the Asciidoctor EPUB3 code itself.

@slonopotamus
Contributor

> at least rubyzip doesn't

It does, though in, umm... a specific way. #243 is waiting for your review.

@mojavelinux
Member

> It does, though in umm... specific way

I should have said "didn't" at the time. I commented on the change. Looks like a necessary evil if we want to support this.
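
For reference, the ZIP format itself can carry Unicode names: bit 11 (0x800) of an entry's general purpose flags marks its name as UTF-8. Below is a minimal sketch using Python's standard `zipfile` module rather than rubyzip, whose handling is not shown here. `roundtrip_name` is a hypothetical helper written only for this illustration.

```python
import io
import zipfile

# Bit 11 of the general purpose bit flag: the entry name is encoded as UTF-8.
UTF8_FLAG = 0x800


def roundtrip_name(name: str) -> tuple:
    """Hypothetical helper: write one empty entry, read it back,
    and return (stored_name, utf8_flag_is_set)."""
    buffer = io.BytesIO()
    # Write an in-memory archive containing a single empty entry.
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(name, b"")
    # Reopen the same bytes for reading and inspect the entry metadata.
    with zipfile.ZipFile(buffer) as archive:
        info = archive.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_FLAG)
```

Calling `roundtrip_name("test-é.xhtml")` should return the name unchanged with the flag set, while a plain ASCII name such as `test-e.xhtml` should come back with the flag cleared. Readers that ignore the flag are one way such names end up garbled.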

slonopotamus added a commit to slonopotamus/asciidoctor-epub3 that referenced this issue Jan 23, 2020
@slonopotamus
Contributor

slonopotamus commented Jan 23, 2020

However, see #162. Reported a bug to epubcheck: w3c/epubcheck#1097.
