
(URGENT) Critical performance problem caused by recent refactoring #159

Open
ronaldtse opened this issue Jan 28, 2025 · 23 comments

@ronaldtse
Contributor

ronaldtse commented Jan 28, 2025

As reported by @ReesePlews :

Previously, the 001-v5 document (sources/001-v5/document.adoc) compiled in around 1 hour.

Now the document takes over 3 hours without even any error message (and does not complete).

I suspect that the recent XML refactoring has introduced critical inefficiencies that do not work well with big documents.

The delivery date for Plateau is Jan 30 and this should be addressed by then.

@ronaldtse ronaldtse added the bug Something isn't working label Jan 28, 2025
@github-project-automation github-project-automation bot moved this to 🆕 New in Metanorma Jan 28, 2025
@ronaldtse ronaldtse moved this from 🆕 New to 🌋 Urgent in Metanorma Jan 28, 2025
@ronaldtse
Contributor Author

@kwkwan do you think this has anything to do with the recent xmi gem update?

@ronaldtse
Contributor Author

The most recent docker release still works.

However, the local build is messed up. I only have metanorma-cli in my Gemfile, so I am using all released gems.

@ReesePlews

thanks for checking this

Now the document takes over 3 hours without even any error message (and does not complete).

just to clarify here:
the plateau document (sources/001-v5) took 3 hr 30 min to generate locally, and it did complete.

@opoudjis
Contributor

Since the entirety of mn-samples-plateau, INCLUDING 001-v5, ran last week with the gems of that metanorma release on docker in 48 mins

https://github.com/metanorma/metanorma-cli/actions/runs/12876911452/job/35964616845

the "caused by recent refactoring" ranges from unlikely to nonsensical as having anything to do with metanorma.

I am now leaving this running on my side to find out what is going on, but I had no issues at all my side last week. If this turns out not to do with metanorma but with lutaml.

I will report back on my findings.

@opoudjis
Contributor

If this turns out to be lutaml (and if you're hearing nothing from metanorma, that is my suspicion), it would be useful for the metanorma lutaml plugin to introduce some sort of progress reporting, as I had done for AsciiMath processing...
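
For illustration, a minimal sketch of the kind of per-block progress reporting meant here, written as an Asciidoctor block macro; the class name, macro name and counter are hypothetical and not taken from metanorma-plugin-lutaml:

# Hypothetical sketch only -- not the plugin's actual code.
require "asciidoctor"
require "asciidoctor/extensions"

class KlassTableProgressMacro < Asciidoctor::Extensions::BlockMacroProcessor
  use_dsl
  named :klass_table_demo

  @counter = 0
  class << self
    attr_accessor :counter
  end

  def process(parent, target, attrs)
    started = Time.now
    self.class.counter += 1
    # ... build and return the table block here, as the real macro does ...
    warn "klass_table ##{self.class.counter}: #{target} processed in #{(Time.now - started).round(1)}s"
    nil
  end
end

A running block count plus per-block timing written to stderr would have made the 1.5-hour Lutaml phase visible much earlier.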

@opoudjis
Contributor

After 1.5 hours, I get:

asciidoctor: ERROR: <stdin>: line 18762: dropping cells from incomplete row detected end of table

That is still Asciidoctor processing, not Metanorma, which pretty much tells me straight that the delay is in Lutaml, since that is embedded within Asciidoc preprocessing. I'm going to restart and monitor metanorma-plugin-lutaml.

@opoudjis
Contributor

The Lutaml preprocessors are all pretty fast:

LutamlPreprocessor START
LutamlPreprocessor END
LutamlUmlClassPreprocessor START
LutamlUmlClassPreprocessor END
LutamlEaXmiBase START
LutamlEaXmiBase END
LutamlEaXmiBase START
LutamlEaXmiBase END
LutamlUmlClassPreprocessor START
LutamlUmlClassPreprocessor END

Each instance of LutamlKlassTableBlockMacro is taking something like half a minute to run. Will leave it running for a half hour to see how far it gets....

@opoudjis
Contributor

20 minutes in, it's chugging along, alternating runs of LutamlKlassTableBlockMacro and runs of LutamlGmlDictionaryBlockMacro. It has processed 150 such blocks so far.

I have left a warning in for when Metanorma processing starts, and I am going out in 10 mins, but I will leave it running for a few hours to see what is going on...

@opoudjis
Contributor

30 minutes, 230 blocks processed. Whether or not it hangs, the slowdown is clearly in Lutaml.

I'll leave it running, though I will be out for a few hours...

@opoudjis
Contributor

45 mins, 300 blocks. I'm going out...

@opoudjis
Contributor

Got back: it finished Lutaml processing after 1.5 hours, and then it crashed because of a bug in the warning I had added for when Metanorma processing starts. I'm now restarting it, and I'll see what happens next.

I think there is a real slowdown in Lutaml compared to before that needs to be addressed; I just want to confirm that there isn't an issue downstream in Metanorma.

@opoudjis opoudjis assigned kwkwan and unassigned opoudjis Jan 28, 2025
@opoudjis
Contributor

On my computer:

Lutaml took 1.5 hrs to run

Metanorma took 1 minute and compiled successfully.

I suspect that the recent XML refactoring has introduced critical inefficiencies that do not work well with big documents.

Wrong. This is not my problem. It's @kwkwan's.

@ReesePlews

hello @opoudjis thank you for investigating this and providing the details. i will follow up with @ronaldtse and @kwkwan.

@ronaldtse
Contributor Author

This may be a lutaml-model performance issue, since that's the only change in the "xmi" gem: it migrated from Shale to lutaml-model.

Ping @HassanAkbar

@HassanAkbar
Member

@ronaldtse I have investigated this performance issue thoroughly, and here are the findings:

  • It is confirmed that the issue was introduced when the xmi gem migrated from Shale to lutaml-model.
  • The issue is in the from_xml method of lutaml-model: it is slow because we first generate an intermediate doc hash and then deserialize that hash into the model instance.

Here are the benchmarking results from a sample spec of the xmi gem.

Benchmarking of the from_xml method of lutaml-model:
[benchmark screenshot]

As you can see, it takes ~3 seconds to deserialize the XML from xmi.

  • Shale, on the other hand, shows much better performance and deserializes the XML in less than 1 second.
    Here are the benchmark results for Shale:
[benchmark screenshot]

Conclusion:
We need to optimize lutaml-model to deserialize XML faster. We may have to remove the conversion to the doc hash and deserialize the adapter object directly into the instance to improve performance; I am currently looking into that.
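
For reference, a rough way to reproduce this kind of measurement locally is sketched below; the fixture path and entry-point class are placeholders rather than the actual spec being benchmarked:

require "benchmark"
require "xmi"

# Placeholder fixture and entry point -- substitute a real sample XMI file and
# the model class exercised by the xmi gem spec in question.
xml = File.read("spec/fixtures/sample.xmi")
model_class = Xmi::Sparx::SparxRoot # assumed entry point; adjust as needed

Benchmark.bm(10) do |bench|
  bench.report("from_xml") { model_class.from_xml(xml) }
end

# Run once against xmi v0.3.11 (Shale-based) and once against a later,
# lutaml-model-based release to obtain the comparison described above.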

@ronaldtse
Contributor Author

ronaldtse commented Jan 29, 2025

Thanks @HassanAkbar for pinpointing the problem. I agree with the next step of improving performance by eliminating the doc_hash. My next question is: why are we calling to_xml at all? In this usage no XML is generated through lutaml-model... what I mean is that there could be another source of inefficiency that is not to_xml.

@ronaldtse
Contributor Author

Technically, this document generation only uses "to_liquid" in the "xmi" gem. I wonder if there is a performance issue there as well. Ping @kwkwan
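
One way to check whether the time goes into from_xml or into to_liquid would be to time the two calls separately; a sketch with placeholder names follows (the actual input file and entry-point class would need to be substituted):

require "benchmark"
require "xmi"

# Placeholders -- substitute the real XMI input and the class the document
# generation actually deserializes.
xml   = File.read("spec/fixtures/sample.xmi")
model = nil

parse  = Benchmark.measure { model = Xmi::Sparx::SparxRoot.from_xml(xml) } # deserialization
liquid = Benchmark.measure { model.to_liquid }                             # Liquid conversion

puts format("from_xml:  %.2fs", parse.real)
puts format("to_liquid: %.2fs", liquid.real)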

@ReesePlews

hello @HassanAkbar thank you for looking into this.

i generated the document locally and it completed successfully, but it took 3 hrs and 40 min.

over the past two weeks the client and i were discussing and testing liquid template updates with @kwkwan for the sorting order of the autogenerated UML tables in the Plateau project (mn-samples-plateau). late last week i requested that the updated template PRs be merged to the main branch. i did not know there were other, non-liquid updates being made. previously the document regularly took about 1 hr 30 min to run. there were updates to lutaml in Nov 2024 that solved some speed issues at that time. after that i believed the only changes were mostly to the liquid templates, which are document specific in the mn-samples-plateau repo. these templates are used for the autogenerated tables in sources/001-v5.

do you think there is some type of looping/evaluation issue in the liquid templates that could be causing this slowdown? i believe the liquid templates look at the data coming from the xmi and consider it by row, or possibly by cell, in each table. with a couple hundred tables that could cause a slowdown if there was a problem with the templates.

however, the document compile does complete successfully. the client is now checking whether those sorting-order updates to the UML tables are still correct.

not sure if this is helpful information or not. thanks for checking into this.

@HassanAkbar
Member

My next question is: why are we calling to_xml? In this usage

@ronaldtse, actually the issue here is not with to_xml but with from_xml.
From what I figured out, the xmi gem reads the XML files and performs these steps during deserialization with from_xml:

  1. It reads XML to an adapter instance
  2. It makes a doc hash
  3. It makes an instance from doc hash

I am working to eliminate step 2 so that the instance is created directly from the adapter instance, improving performance and saving time.
Please let me know if I am missing something here, or if there is any other flow that we need to look at.
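
To make the proposed change concrete, here is a purely illustrative, self-contained toy of the two flows (using REXML and a Struct; this is not lutaml-model's actual code):

require "rexml/document"

# Toy illustration of the two deserialization flows (NOT lutaml-model internals).
Package = Struct.new(:name, :version)

XML = '<package version="1.0"><name>xmi</name></package>'

# Current flow: XML -> parsed document -> intermediate hash -> instance
def from_xml_via_doc_hash(xml)
  doc      = REXML::Document.new(xml)                        # step 1: read XML into a document
  doc_hash = { "name"    => doc.root.elements["name"].text,
               "version" => doc.root.attributes["version"] } # step 2: build the doc hash
  Package.new(doc_hash["name"], doc_hash["version"])         # step 3: instance from the hash
end

# Proposed flow: XML -> parsed document -> instance (intermediate hash removed)
def from_xml_direct(xml)
  doc = REXML::Document.new(xml)
  Package.new(doc.root.elements["name"].text, doc.root.attributes["version"])
end

p from_xml_via_doc_hash(XML) # => #<struct Package name="xmi", version="1.0">
p from_xml_direct(XML)       # => same result, one fewer intermediate structure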

@ReesePlews

hello @HassanAkbar is there anything to report on the status of this issue? do let us know when you have feedback to discuss. thank you for working on this.

@kwkwan

kwkwan commented Feb 3, 2025

Technically, this document generation only uses "to_liquid" in the "xmi" gem. I wonder if there is a performance issue there as well. Ping @kwkwan

Hi @ronaldtse, the main difference between v0.3.11 and later versions of the xmi gem is just the switch from Shale to lutaml-model. I think there is not much room to improve performance there.

I compared the time taken to run bundle exec rake. In v0.3.11 it took 10 s; in v0.3.13 it took 1 min (about 6× slower). I think @HassanAkbar's finding is correct: the performance of from_xml in lutaml-model should be improved.

In the meantime, we can pin the xmi gem to v0.3.11 in the mn-samples-plateau project.
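
For example, in the project's Gemfile, assuming the usual metanorma-cli setup and that the dependency constraints allow the pin:

# Gemfile (sketch): pin xmi to the last Shale-based release until the
# lutaml-model from_xml performance regression is resolved.
source "https://rubygems.org"

gem "metanorma-cli"
gem "xmi", "0.3.11"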

@kwkwan

kwkwan commented Feb 3, 2025

FYI, it takes 33 min to compile sources/001-v5/document.adoc on my local machine after pinning the xmi gem to v0.3.11.

@ReesePlews

wow, that is a great improvement! trying to test now based on the info from the other issue. thank you.
