Poetry in txt to HTML conversion #26

Open
gbnewby opened this issue Dec 17, 2019 · 2 comments

@gbnewby
Collaborator

gbnewby commented Dec 17, 2019

Discussed among WWers and catalogers: this applies to files that are native .txt and are automatically converted to HTML by ebookmaker.

The problem is that spacing in poetry gets ignored. The easiest partial solution is to insert &nbsp; or similar whenever a line starts with a space.
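A minimal sketch of that partial fix, assuming the entity meant is a non-breaking space (&nbsp;) and that each leading space maps to one entity. The function name `preserve_leading_spaces` is hypothetical and is not part of ebookmaker:

```python
import html


def preserve_leading_spaces(line: str) -> str:
    """Turn leading spaces into &nbsp; entities so HTML keeps the indent.

    Hypothetical helper for illustration only; not ebookmaker code.
    """
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    # Escape the remaining text so characters like < and & stay literal.
    return "&nbsp;" * indent + html.escape(stripped, quote=False)


if __name__ == "__main__":
    for raw in ["The river runs", "    and does not wait"]:
        print(preserve_leading_spaces(raw))
```

This only preserves indentation. It does nothing about line breaks, which is why it is at best a partial solution.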

Problematic generated HTML may be seen, for example, here:
https://www.gutenberg.org/ebooks/60909
https://www.gutenberg.org/ebooks/60910

Compare the .txt to the generated .htm. When a new line begins with upper case, it's treated as a paragraph. When a new line begins with lower case, it's treated as part of the previous paragraph.
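To make the reported behavior concrete, here is a toy model of it in Python. It is only a sketch of the symptom as described above, not ebookmaker's actual conversion code. The function name `group_paragraphs` and the treatment of blank lines and non-letter characters are assumptions:

```python
def group_paragraphs(lines):
    """Toy model: a line starting in lower case joins the previous paragraph.

    Illustrative only; the real converter's rules may differ.
    """
    paragraphs = []
    start_new = True  # assumption: a blank line always ends the paragraph
    for line in lines:
        text = line.strip()  # leading indentation is discarded here
        if not text:
            start_new = True
            continue
        if not start_new and text[0].islower():
            paragraphs[-1] += " " + text
        else:
            paragraphs.append(text)
        start_new = False
    return paragraphs


if __name__ == "__main__":
    poem = ["The river runs", "    and does not wait", "The stones remain"]
    for paragraph in group_paragraphs(poem):
        print("<p>" + paragraph + "</p>")
```

Under this model the indented second line loses both its indent and its line break, which matches the complaint about poetry.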

Email pgww (or gutcat) if you want further examples or to discuss solutions. Thanks for considering this.

@eshellman
Collaborator

I can't think of any way to fix these without breaking conversion of thousands of non-poetry files.
Workarounds include:

  • wrapping the poetry in html by hand with <pre>
  • using rst markup instead of plain text
  • inventing new markup
  • magic
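The first workaround could be scripted rather than done entirely by hand. A minimal sketch, assuming the poem text has already been identified and extracted; the helper name `wrap_in_pre` is hypothetical:

```python
import html


def wrap_in_pre(poem: str) -> str:
    """Wrap plain-text poetry in <pre> so spaces and line breaks survive.

    Hypothetical helper for illustration only; not ebookmaker code.
    """
    # Trim surrounding blank lines and escape <, > and & in the text.
    body = html.escape(poem.strip("\n"), quote=False)
    return "<pre>\n" + body + "\n</pre>"


if __name__ == "__main__":
    print(wrap_in_pre("The river runs\n    and does not wait\n"))
```

Deciding which passages are poetry still has to happen by hand, which is the part that cannot be automated safely.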

@eshellman
Collaborator

We've seen a lot of problems with poetry in HTML books where <br> is used in conjunction with a CSS rule hiding br in a block span. kindlegen can't handle more than 10K hidden characters.
