PDF Windows out of memory #338
Comments
It is also catastrophically slow for Plateau, but I have already identified that this is a lutaml issue: metanorma/metanorma-plateau#159 |
I've tried to generate PDF locally (Win 10) for XML from Differences between GH and local machine: From https://github.com/metanorma/metanorma-cli/actions/runs/13111529625/job/36587857593:
Locally:
I don't understand why there are 1355 pages on GH... |
hello @Intelligent2013 thank you for continuing to check this issue. i will add the following information (based on my local build - win10) about the Plateau documents:
perhaps each document is being built? not sure if that is helpful information or not. |
@ReesePlews thank you! Actually, the issue occurs on the PDF generation for the document On my machine with JVM settings I think I'll get the same exception
The issue occurs due to the VeraPDF API. |
hello @Intelligent2013 thank you for the additional information. i dont recall checking or installing java when i installed mn on my machine. i am currently running in powershell (after the Plateau project is completed i will change to MS Linux WSL) here is my java information from within powershell
but when i look at "java settings" app i see this: System tab could there be an issue with my java install between windows and powershell? |
hello @ReesePlews the different versions in the powershell output (
no, |
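To compare the Java runtime seen by PowerShell with the one shown in the "java settings" app, it can help to ask the JVM itself what it is. The following is a minimal sketch, not part of mn2pdf; the class name `JvmInfo` is hypothetical. It only reads standard system properties and the maximum heap the JVM will try to use.

```java
// Hypothetical diagnostic helper, not part of mn2pdf or metanorma.
final class JvmInfo {

    // Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx or the default).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // One-line summary of the runtime that is actually executing this code.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.vendor=" + System.getProperty("java.vendor")
                + ", java.home=" + System.getProperty("java.home")
                + ", os.arch=" + System.getProperty("os.arch")
                + ", maxHeapMiB=" + maxHeapMiB();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this from the same PowerShell session that runs metanorma shows which installation (`java.home`) and which heap limit are really in effect there.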
Checking by veraPDF in the simple standalone application takes 113 sec. All unused objects are set to null:
src = null;
...
xsltConverter = null;
fontcfg = null;
System.gc();
but the used heap memory size is still 650 MB.
I'll investigate further. |
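The measurement described above (drop references, call `System.gc()`, then read the used heap) can be sketched with the standard `MemoryMXBean`. This is an illustrative sketch only: the class name `HeapProbe` and the 64 MiB test allocation are assumptions, and `System.gc()` is just a hint to the JVM, so the numbers will vary between runs and collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper for watching heap usage around a suspected leak.
final class HeapProbe {

    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    // Bytes currently used on the Java heap (live objects plus uncollected garbage).
    static long usedHeapBytes() {
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    static void report(String label) {
        System.out.printf("%-28s %d MiB used%n", label, usedHeapBytes() / (1024 * 1024));
    }

    public static void main(String[] args) {
        report("start");
        byte[] payload = new byte[64 * 1024 * 1024]; // stand-in for the PDF rendering work
        payload[0] = 1;
        report("after allocation");
        payload = null;   // drop the only reference
        System.gc();      // request (not force) a collection
        report("after null + System.gc()");
    }
}
```

If the used heap stays high after the last step in a real run, something still holds a reference to the data, for example a static cache or a listener, which is consistent with a leak rather than with delayed collection.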
hello @Intelligent2013 thank you for the additional information and checking. i will await your answer. |
Found memory leaks in the Apache FOP. The simple program:
File fPDF = new File("D:\\Work\\Metanorma\\XML\\PLATEAU\\test.pdf");
OutputStream out = null;
try {
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(); // identity transformer
FopFactory fopFactory = FopFactory.newInstance(new File("D:\\Work\\Metanorma\\XML\\PLATEAU\\document.presentation.pdf.pdf_fonts_config.xml.out"));
JEuclidFopFactoryConfigurator.configure(fopFactory);
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
foUserAgent.setProducer("Ribose Metanorma mn2pdf version " + Util.getAppVersion());
foUserAgent.getEventBroadcaster().addEventListener(new LoggingEventListener());
out = new FileOutputStream(fPDF);
out = new BufferedOutputStream(out);
String mime = MimeConstants.MIME_PDF;
Fop fop = fopFactory.newFop(mime, foUserAgent, out);
Source src = new StreamSource(new File("D:\\Work\\Metanorma\\XML\\PLATEAU\\test.pdf.fo.xml"));
Result res = new SAXResult(fop.getDefaultHandler());
transformer.transform(src, res);
} catch (Exception e) {
System.out.println(e.toString());
} finally {
// out is still null if an exception occurred before the stream was opened
if (out != null) {
out.close();
}
}
System.gc();
VeraPDFValidator v = new VeraPDFValidator();
v.validate(fPDF, PDF_UA_MODE);
Between Currently, I haven't figured out how to find the cause in the code. Possible workaround solutions:
|
|
Currently, PDF checking by the veraPDF Docker container is integrated into the mn-native-pdf repository: Regarding,
We have a scenario where the user runs the metanorma process via the command line on their own machine. Do we need to require the user to install Docker for PDF checking? Or would it be better to integrate one more jar (verapdfchecker.jar, for example) into the mn2pdf-ruby package and run it immediately after mn2pdf.jar?
I have to check the original Apache FOP with minimal dependencies. |
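The separate-jar idea raised above amounts to running the veraPDF check in its own JVM, so that whatever Apache FOP leaves on the heap is released when the first process exits. A minimal sketch of launching such a child JVM with `ProcessBuilder` follows. Everything specific here is an assumption: `verapdfchecker.jar` is only the example name from the comment, the class name `SeparateJvmCheck` is hypothetical, and the idea that the jar takes the PDF path as its single argument is invented for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher: runs a checker jar in a fresh JVM with its own heap limit.
final class SeparateJvmCheck {

    // Builds: <java.home>/bin/java -Xmx<maxHeap> -jar <jarPath> <pdf>
    static List<String> buildCommand(String jarPath, File pdf, String maxHeap) {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        String java = Paths.get(System.getProperty("java.home"), "bin",
                windows ? "java.exe" : "java").toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(java);
        cmd.add("-Xmx" + maxHeap);   // heap limit applies to the child JVM only
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add(pdf.getPath());      // assumed CLI: the PDF path is the only argument
        return cmd;
    }

    // Starts the child process, forwards its output, and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command, because the jar named here does not exist yet.
        List<String> cmd = buildCommand("verapdfchecker.jar", new File("test.pdf"), "1g");
        System.out.println(String.join(" ", cmd));
    }
}
```

Compared with the Docker route, this needs nothing beyond the Java runtime that mn2pdf already requires, at the cost of a second JVM start-up per document.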
@ronaldtse the memory leaks occur in the original Apache FOP without any changes. |
Under Windows in GHA, but not OSX or Ubuntu, 001-v5 and 002-v5 are running out of heap space when generating PDF; this is new. There seem to be processing issues with the SVG as well.
https://github.com/metanorma/metanorma-cli/actions/runs/13111529625/job/36587857593