This repository has been archived by the owner on Sep 5, 2018. It is now read-only.
Showing 2 changed files with 157 additions and 0 deletions.
@@ -0,0 +1,134 @@ | ||
# Kindle Format 10 (KFX) | ||
|
||
Inquiring further into Kindle Previewer 3 (beta), we may get a hint of the conversion process to KFX. | ||
|
||
At first sight, the process is epic. Please note [some folks have already been reverse-engineering it](http://www.mobileread.com/forums/showthread.php?t=263902). I can now confirm their findings.

I’ll quote from and refer to that thread’s posts since it would be disrespectful to steal and paraphrase.
|
||
**TL;DR:** we’re screwed. | ||
|
||
## Acknowledgments | ||
|
||
Thanks to jhowell, and all the other folks in the MobileRead thread, whose research proved so valuable when inquiring into dat shit.
|
||
## KFX Architecture | ||
|
||
It seems KFX is a follow-on to AZK, the Kindle format for iOS you’ll get when using Kindle Previewer 2.94.
|
||
More technical details about AZK [here](http://www.mobileread.com/forums/showpost.php?p=3097967&postcount=8) and [there](http://www.mobileread.com/forums/showpost.php?p=3100761&postcount=11). | ||
|
||
As a reminder, AZK is **not** the file Kindle for iOS uses; it is actually converted to KCR (Kindle Cloud Reader) when you sideload it on your iDevice. [This process has been partly documented](https://github.com/FriendsOfEpub/WillThatBeOverriden/tree/master/ReadingSystems/Kindle/Kindle-iOS) and involves a lot of HTML + CSS sanitization.

So, to sum things up, like AZK (and KCR), **KFX is a binary version of JSON** (JavaScript Object Notation). In other words, it is fairly likely the new Kindle renderer shares common traits with Kindle Cloud Reader, i.e. JavaScript built on top of jQuery and making use of webviews. That is still unclear though, so correct me if I’m wrong.

KindleGen is no longer involved in this AZK/KFX conversion and there is no “public converter” for those two formats. `EpubToKFXConverter-1.0.jar` is bundled with Kindle Previewer (like `azkcreator` for AZK), but you could probably use your command-prompt-fu to access it (see [this post in MobileRead forums](http://www.mobileread.com/forums/showpost.php?p=3262219&postcount=338)).
|
||
### History | ||
|
||
It looks like KFX has been around [since 2013](http://www.mobileread.com/forums/showpost.php?p=3182980&postcount=167). But it was [intended for magazines at first](http://www.mobileread.com/forums/showpost.php?p=3184542&postcount=170). In other words, KFX has been repurposed. | ||
|
||
### Technical details | ||
|
||
- Like KF8, KFX uses a WebKit-based rendering engine.
- Contents (except images) are encrypted/obfuscated (AES-256) in both DRM-free and DRM-protected files.
- Contents are divided into small JSON files (like AZK and KCR; you can [go check online](https://www.amazon.com/cloudreader) for the latter).
- Grayscale images use a high-compression format called JPEG XR (transparency is not supported at the moment).
- Color images are delivered in JPEG (at least for iOS). | ||
- It **seems** specific versions are prepared and delivered based on devices (e.g. grayscale images for eInk Readers). | ||
- Hyphens are turned on/off using `-webkit-hyphens`. | ||
- The hyphenation engine is [not written in JavaScript](http://www.mobileread.com/forums/showpost.php?p=3206602&postcount=237). | ||
|
||
### Goals | ||
|
||
In two words: Enhanced Typography. | ||
|
||
KFX brings support for drop caps, hyphenation & justification (H&J), kerning, ligatures, etc. | ||
|
||
What they don’t advertise, though, are the clever features that enhance the user experience:
|
||
- the renderer takes user settings into account and will adapt drop caps, H&J and `float` images dynamically, depending on font size; | ||
- the renderer supports gaiji images (`inline` images); | ||
- the renderer computes an RGAA2.0-compliant contrast ratio: `color` for text on background. | ||
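
How the renderer actually computes its contrast ratio is not documented. As a rough sketch, assuming it follows the WCAG 2.0 definition of contrast ratio (that is an assumption on my part, nothing in KP3 confirms it), the ratio between a text `color` and its background could be computed like this. The function names are made up for illustration.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.0 formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple of 0-255 integers."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# Black text on a white page gives the maximum possible ratio.
ratio = contrast_ratio((0, 0, 0), (255, 255, 255))
```

A renderer could compare such a ratio against a threshold and adjust `color` when the user switches page background, but which threshold Amazon uses (if any) is unknown.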
|
||
As the previous renderer (for KF8) [performed very poorly with kerning and ligatures](http://www.mobileread.com/forums/showpost.php?p=3172282&postcount=420), that could explain why they decided to build a new one and to handle a lot of things when processing files so that the new renderer doesn’t have to.
|
||
While this is **pure speculation**, another goal may have been the creation of a stronger walled garden. Indeed, it looks like it would be pretty useless to convert KFX to ePub (more about that later). | ||
|
||
### Notes | ||
|
||
KFX is usually not mentioned in the release notes of Kindle updates. The addition of Bookerly seems to be a hint, though. | ||
|
||
## Kindle Previewer | ||
|
||
[Kindle Previewer 3](http://www.amazon.com/gp/feature.html/?docId=1003018611) (beta) outputs KDF files, which are basically KFX data in an sqlite3 database instead of a KFX container. That could be an intermediate format like AZK, only this time they didn’t bother or have time to implement a bridge inside apps which support KFX.
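
Since a KDF file is an sqlite3 database, you can at least peek at its layout with stock tooling. Here is a minimal sketch using Python’s built-in `sqlite3` module. It makes no assumption about table names (I haven’t listed them here); it only asks SQLite which tables exist. The file name `book.kdf` in the usage comment is hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    # Note: sqlite3.connect() creates an empty file if db_path does not exist.
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


# Usage (hypothetical path to a file produced by Kindle Previewer 3):
# print(list_tables("book.kdf"))
```

Keep in mind the content stored in those tables is still binary data, so listing tables is only a starting point.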
|
||
There is a lot of interesting stuff to be found in KP3’s package: | ||
|
||
- `EpubToKFXConverter-1.0.jar`, the KFX converter; | ||
- `mobicontentdumper`, which dumps the mobi files generated by KindleGen as JSON files;
- `yjhtmlcleanerapp`, which cleans up code that gets in the way;
- `coreprocessor.js`, a script (3100+ lines beautified) whose aim is to parse CSS styles and alter HTML files.
|
||
That hints at a very complex process. | ||
|
||
And indeed, that process is mind-blowing. | ||
|
||
When converting a file with Kindle Previewer, which takes some time (yeah, that’s `coreprocessor` in action), temporary files are created. To sum things up: | ||
|
||
- the links to the external stylesheet(s) are removed (`head`);
- book styles are parsed and inlined (`style` attribute) on each tag;
- computed styles are then inlined (`computedstyle` attribute) as well. | ||
|
||
You end up with something like this: | ||
|
||
```
<p class="noindent" style="margin-top: 0px; margin-bottom: 0px; text-align: justify; display: block; text-indent: 0%; margin-left: 0%; margin-right: 0%; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:496px;height:96px;" >Text</p>
```
|
||
Check file `enhanced.xhtml` for further details. | ||
|
||
Computed styles are always the same: | ||
|
||
1. `font_style`; | ||
2. `font_weight`; | ||
3. `font_variant`; | ||
4. `width`; | ||
5. `height`. | ||
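
If you want to check that claim against your own temporary files, here is a small sketch that pulls every `computedstyle` attribute out of a document and splits it into key/value pairs. It relies only on Python’s standard `html.parser`; the class and function names are mine, not something found in KP3.

```python
from html.parser import HTMLParser

# The five properties observed in the temporary files, in order.
EXPECTED_KEYS = ["font_style", "font_weight", "font_variant", "width", "height"]


def parse_computedstyle(value):
    """Split 'key:value;key:value;' into a dict (insertion order is kept)."""
    result = {}
    for item in value.split(";"):
        if ":" in item:
            key, val = item.split(":", 1)
            result[key.strip()] = val.strip()
    return result


class ComputedStyleCollector(HTMLParser):
    """Collect (tag, parsed computedstyle) for every element carrying one."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "computedstyle" and value:
                self.found.append((tag, parse_computedstyle(value)))


sample = (
    '<p class="noindent" style="margin-top: 0px; " '
    'computedstyle="font_style:normal;font_weight:normal;'
    'font_variant:normal;width:496px;height:96px;" >Text</p>'
)
collector = ComputedStyleCollector()
collector.feed(sample)
tag, styles = collector.found[0]
has_expected_keys = list(styles) == EXPECTED_KEYS
```

Feeding it the whole `enhanced.xhtml` file instead of `sample` lets you verify that every element carries the same five keys.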
|
||
The purpose of such an “enhancement” is currently unknown. The most probable assumption is that “they are determining the final styling result of css for each html element and then translating that to their own simplified binary document model” ([source](http://www.mobileread.com/forums/showpost.php?p=3262351&postcount=342)). | ||
|
||
Indeed, “Html and css files are replaced by text with associated formatting instructions in binary data structures. The possible formatting instructions are based loosely on css properties, but changed and somewhat simplified.” | ||
|
||
As for drop caps, Amazon has created its own non-standard style properties, which are applied once several conditions have been met (e.g. it is not a floating image, `span` is the first element in `p`, the paragraph is the first element of its parent, the `font-size` of the `span` is bigger than the `font-size` of `p`, it is not a raised cap, etc.). As a matter of fact, that’s quite epic.
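
To make those conditions easier to picture, here is a sketch of the check as a single predicate. It only covers the conditions listed above (the real list is longer, hence the “etc.”), and every name in it is hypothetical. In particular, how KP3 decides that something is a raised cap is not known, so it is just a flag here.

```python
from dataclasses import dataclass


@dataclass
class DropCapCandidate:
    """Facts about a candidate element; all field names are hypothetical."""
    is_floating_image: bool
    span_is_first_in_paragraph: bool
    paragraph_is_first_in_parent: bool
    span_font_size: float       # same unit as paragraph_font_size
    paragraph_font_size: float
    is_raised_cap: bool         # detection method unknown, treated as given


def qualifies_as_drop_cap(candidate):
    """True only when every listed condition is met."""
    return (
        not candidate.is_floating_image
        and candidate.span_is_first_in_paragraph
        and candidate.paragraph_is_first_in_parent
        and candidate.span_font_size > candidate.paragraph_font_size
        and not candidate.is_raised_cap
    )


first_letter = DropCapCandidate(
    is_floating_image=False,
    span_is_first_in_paragraph=True,
    paragraph_is_first_in_parent=True,
    span_font_size=3.0,
    paragraph_font_size=1.0,
    is_raised_cap=False,
)
result = qualifies_as_drop_cap(first_letter)
```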
|
||
And now to the **ion** data representation… | ||
|
||
I don’t want to get too deep into the obscure details, so if you want more, check [this post](http://www.mobileread.com/forums/showpost.php?p=3269649&postcount=360).
|
||
To sum things up, content is divided into fragments, and I quote jhowell: | ||
|
||
1. `document_data`, a list of sections in reading order; | ||
2. each section is an HTML file in the source file. It has a page template and a reference to a story; | ||
3. a story contains a list of content, content types being based on HTML e.g. `container` for nested `div`, `text` for `div`, `p`, `h1`, etc.; | ||
4. formatting instructions are grouped into a `style` based on HTML attributes and CSS properties. | ||
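
The hierarchy above can be pictured with a few plain data classes. This is only a mental model: the class names, field names and the `"default"` page template are invented for illustration and do not reflect the actual binary layout.

```python
from dataclasses import dataclass, field


@dataclass
class Content:
    content_type: str                              # e.g. "container" or "text"
    style: dict = field(default_factory=dict)      # grouped formatting instructions
    text: str = ""
    children: list = field(default_factory=list)   # nested Content items


@dataclass
class Story:
    contents: list = field(default_factory=list)


@dataclass
class Section:
    page_template: str   # hypothetical identifier
    story: Story


@dataclass
class DocumentData:
    sections: list = field(default_factory=list)   # in reading order


def content_type_for(tag, has_nested_div=False):
    """Map an HTML tag to a content type, using only the examples given above."""
    if tag == "div" and has_nested_div:
        return "container"
    if tag in {"div", "p", "h1"}:
        return "text"
    raise ValueError("no mapping known for <%s>" % tag)


heading = Content(content_type_for("h1"), {"font-weight": "bold"}, "Title")
paragraph = Content(content_type_for("p"), {"text-align": "justify"}, "Text")
document = DocumentData([Section("default", Story([heading, paragraph]))])
```

Note that `heading` and `paragraph` end up with the same content type: once converted, only the style tells them apart.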
|
||
**Note:** if the parser encounters an unsupported style, it will abort conversion to KDF. A list of currently unsupported styles would be difficult to create if it doesn’t exist already, somewhere in KP3. | ||
|
||
## KFX to ePub | ||
|
||
You’re screwed. | ||
|
||
1. KFX is encrypted and there is no way to break it at the moment. | ||
2. HTML semantic markup is lost (`text` for several tags, same for `list`, same for inline elements like `em`, etc.). | ||
3. Formatting relies heavily on styles and not semantic tags. | ||
|
||
## What’s next? | ||
|
||
Well, Amazon is very secretive about KFX and is unlikely to document it, let alone provide a public converter you can use to create files in this format. After all, in the [Kindle Previewer 3 FAQ](http://www.amazon.com/gp/feature.html/?docId=1003018611), they write “You can’t side load your books with Enhanced Typesetting.” That may be temporary, but since KFX is a work in progress (WIP), it seems reasonable to imagine they’ll manage that on their side. It is a lot easier to update your own server and reprocess files when needed than to manage tens of thousands of people using a local/native converter to provide KFX files directly.

So, given this WIP status, a conversion process that turns documenting and maintaining default styles + overrides into a bloody nightmare, and the dynamic sanitization/styling which probably happens at the renderer level, **we won’t inquire further.** Sorry Not Sorry, life is short and I don’t want to waste any more time on this one.

You could try public shaming them to get documentation but I’m not convinced it would have any effect: remember AZK has not been documented either and it’s been years since its release.

If you want to sacrifice yourself though, please let us know. We’ll be glad to help kickstart your research where we left ours.
@@ -0,0 +1,23 @@ | ||
<!-- ?xml version="1.0" encoding="UTF-8"? --> | ||
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" style="display: block; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:512px;height:416px;" > | ||
<head style="display: none; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:auto;height:auto;" > | ||
<title style="display: none; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:auto;height:auto;" >Kindle Previewer 3 Sample ePub</title> | ||
<meta charset="utf-8" style="display: none; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:auto;height:auto;" /> | ||
<meta http-equiv="content-type" content="text/html; charset=UTF-8" /> | ||
<meta charset="UTF-8" /> | ||
</head> | ||
<body style="display: block; " body-margin-left="0%" body-margin-right="0%" body-margin-left-importance="" body-margin-right-importance="" computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:496px;height:375px;" > | ||
<a id="intro" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" ></a> | ||
<h1 class="header" style="font-size: 1.4em; text-align: center; margin-top: 1.5em; margin-bottom: 1.5em; display: block; margin-left: 0%; margin-right: 0%; line-height: 1.2em; " computedstyle="font_style:normal;font_weight:bold;font_variant:normal;width:496px;height:22px;" >About Enhanced Typesetting</h1> | ||
<p class="noindent" style="margin-top: 0px; margin-bottom: 0px; text-align: justify; display: block; text-indent: 0%; margin-left: 0%; margin-right: 0%; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:496px;height:96px;" >Enhanced Typesetting is a series of typographical and layout features that are automatically enabled for some Kindle books. These enhancements improve readability and create a more consistent display behavior across Kindle reading platforms, including Kindle devices, Fire tablets, and free Kindle reading apps for Android and iOS. Some Enhanced Typesetting features include:</p> | ||
<ul class="bulleted_list" style="text-align: left; display: block; list-style-type: disc; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:456px;height:80px;" > | ||
<li class="bullet" style="display: list-item; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:456px;height:16px;" ><a href="sample-3.xhtml" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" >Drop caps that dynamically adjust with font size</a></li> | ||
<li class="bullet" style="display: list-item; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:456px;height:16px;" ><a href="sample-4.xhtml" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" >Kerning and ligature enhancements</a></li> | ||
<li class="bullet" style="display: list-item; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:456px;height:16px;" ><a href="sample-4.xhtml" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" >Hyphenation and improved justification</a></li> | ||
<li class="bullet" style="display: list-item; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:456px;height:16px;" ><a href="sample-5.xhtml" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" >Dynamic font coloration</a></li> | ||
<li class="bullet" style="display: list-item; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:456px;height:16px;" ><a href="sample-6.xhtml" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" >Image optimizations</a></li> | ||
</ul> | ||
<p class="indent" style="margin-top: 0px; margin-bottom: 0px; text-align: justify; display: block; text-indent: 3.75%; margin-left: 0%; margin-right: 0%; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:496px;height:80px;" >If Enhanced Typesetting is enabled for your book, you’ll see “Enhanced Typesetting: Enabled” on that book’s detail page. We are continuously working to make Enhanced Typesetting compatible with more titles and will automatically enable Enhanced Typesetting features for your book when possible.</p> | ||
<p class="indent" style="margin-top: 0px; margin-bottom: 0px; text-align: justify; display: block; text-indent: 3.75%; margin-left: 0%; margin-right: 0%; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:496px;height:32px;" >Read more about Enhanced Typesetting at <a href="http://www.amazon.com/betterreading" style="display: inline; " computedstyle="font_style:normal;font_weight:normal;font_variant:normal;width:0px;height:0px;" >http://www.amazon.com/betterreading</a>.</p> | ||
</body> | ||
</html> |