HTML and EPUB documents are not produced with pipeline #152
Comments
There are a number of issues preventing the conversion of the example document to EPUB/HTML:
|
@danopolan @anetader: I estimate 2–4 hours of effort. When do you need this fixed? |
Thx for the analysis. It's not a priority now, so I will unassign you from this issue and we will plan its implementation when it is needed. |
@danopolan: Our current method for generating HTML and EPUB documents depends on the TeX4ht system. TeX4ht works by patching existing LaTeX packages to produce correct HTML output. However, this approach often lags behind active package development and can be unreliable, especially with modern, actively maintained LaTeX packages.
Over the past several years, the LaTeX team has been enhancing the LaTeX kernel to support the creation of PDF 2.0 documents. These documents are designed to be fully accessible, complying with both the PDF/UA-2 standard and the Well-Tagged PDF (WTPDF) specification. Accessible PDFs are particularly useful because the WTPDF specification provides a well-defined general algorithm for extracting HTML (and EPUB) content directly from PDF documents.
As with TeX4ht, many LaTeX packages are currently incompatible with accessible PDFs. However, the effort to make LaTeX packages compatible with accessible PDFs is decentralized, and the implementations are more likely to be stable and sustainably maintained by the package authors themselves. Given this, transitioning to accessible PDFs could be a better long-term alternative to TeX4ht. Moreover, future legal requirements might mandate ISTQB to produce materials as accessible PDFs in certain jurisdictions.
Creating accessible PDFs requires many features of the LuaTeX engine. While pdfTeX is technically compatible with accessible PDFs, it may lack support for some features required by this process. Additionally, we need LuaTeX for other purposes, as discussed in issue #51 and issue #145 on our GitHub repository. As a result, the switch to accessible PDFs would also necessitate migrating our TeX codebase from pdfTeX to LuaTeX.
In the meantime, we still need to maintain and work with the current code that relies on TeX4ht. To address this, I have reached out to the primary developer of TeX4ht for assistance in patching the LaTeX packages that are causing issues.
We could start addressing these problems as early as December or January, depending on your preference. |
@danopolan: Please let me know if this is something to look into. The primary developer of TeX4ht won't always be available, so now may be a good time to get the export to HTML and EPUB working and to set up automated tests to prevent future regressions. |
@Witiko if I understood correctly, there are two main things discussed.
Regarding 1), I am not sure about the required effort and timeline. But yes, accessible PDFs would be a priority for ISTQB. If this assumption is right, we do not need to reach out to the TeX4ht developer now. |
We need neither to migrate from pdfTeX to LuaTeX nor to fix the issues with TeX4ht in order to start making our documents more accessible. In theory, adding the following as the first line of our .tex files should make the output PDFs fully accessible, complying with the PDF/UA-2 standard:
\DocumentMetadata{pdfversion=2.0, pdfstandard=ua-2, testphase={phase-III, title, table, math, firstaid}}
In practice, LaTeX packages have varying degrees of support for PDF tagging. Furthermore, the LaTeX team is expected to support some features related to PDF tagging only in LuaTeX, not in pdfTeX. Therefore, some parts of the documents, such as tables, may be incorrectly tagged. However, unlike with TeX4ht, these are "soft" failures: we can detect them by parsing the logs for warnings and by checking the produced PDFs, but they shouldn't crash the compilation. |
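To make the suggestion above concrete, here is a minimal sketch of a standalone test document that uses the same \DocumentMetadata line. The title, section, and table contents are hypothetical placeholders, and the sketch assumes a recent LaTeX release that ships the tagging code; compiling with LuaLaTeX is also an assumption, based on the note that some tagging features may only be supported there.

```latex
% \DocumentMetadata has to come before \documentclass so that the
% kernel can set up PDF tagging before any class or package is loaded.
\DocumentMetadata{
  pdfversion=2.0,
  pdfstandard=ua-2,
  testphase={phase-III, title, table, math, firstaid}
}
\documentclass{article}

% Hypothetical placeholder title for this smoke test.
\title{Tagging smoke test}

\begin{document}
\maketitle

\section{Introduction}
A short paragraph, so that the PDF contains a tagged heading and
tagged body text.

% A small table: tables are one of the areas where tagging may still
% be incomplete, so check the log for warnings after compiling.
\begin{tabular}{ll}
  Format & Status \\
  PDF    & tagged \\
\end{tabular}

% A simple formula to exercise the math part of the test phase.
\[ a^2 + b^2 = c^2 \]
\end{document}
```

A file like this could serve as a quick check of what the current TeX installation tags correctly before the line is added to the real documents.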
Ok, so I see two actions here:
I would not proceed with either of these tasks now, since we are right before the release of CTAL TA in March and there is plenty of work going on based on the Beta review. But I would report both so we can start working on them when the time is right. |
Absolutely, I just wanted to put these ideas forward to help with long-term planning, as they'll need our attention eventually. Another important long-term task to consider (though I realize this is off-topic for this ticket) is enhancing the validation for input documents. |
I would most probably schedule a call for next month to discuss the long-term planning. |
After setting epub-output: true or html-output: true, the pipelines Produce HTML/EPUB document are failing.
See failed pipelines: https://github.com/istqborg/istqb_product_base/actions/runs/12083600897