-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve schema compiler performance #1396
Conversation
daffodil-core/src/main/scala/org/apache/daffodil/core/dsom/DFDLSchemaFile.scala
Show resolved
Hide resolved
daffodil-lib/src/main/scala/org/apache/daffodil/lib/util/Timer.scala
Outdated
Show resolved
Hide resolved
All tests pass on my system. |
daffodil-core/src/main/scala/org/apache/daffodil/core/compiler/Compiler.scala
Outdated
Show resolved
Hide resolved
daffodil-core/src/main/scala/org/apache/daffodil/core/runtime1/SchemaSetRuntime1Mixin.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a regression somewhere in this commit that is costing us concurrency in when running the test suite. Nothing is jumping out at me, but comparing this commit to the main branch, running the full test suite takes roughly 3x as long on this branch than on main.
Yeah, I think I see where this happened. isError is not a lazy val. A bunch of work is done each time it is called. Most of that work will just be finding out all the various things have already been evaluated, but there is substantial overhead just traversing all the objects to find that out. I will investigate further. |
@jadams-tresys the most recent commit addresses the slow-down of the daffodil test suite. Was due to isError being called inside the synchronized block for the schema compiler. This was unnecessary. Calling isError outside the synchronized block restores the parallelism the loss of which was responsible for the slow-down, or at least that's my story and I'm sticking with it :-) |
With this change, schema compilation for Link16 drops from 2+ minutes to 18 seconds. Added instrumentation to compiler which shows that most time (by far, like 80+ percent) is spent in Xerces validating each schema file individually. This is unnecessary. We only need to validate the top level schema to detect UPA errors. Fixed other error causing repeated evaluation of OOLAG LV values. Eliminate isError inside synchronized block Avoids slowdown of Daffodil test suite. DAFFODIL-2965, DAFFODIL-2781
69e3876
to
bec6843
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
With this change, schema compilation for
Link16 drops from 2+ minutes to 18 seconds.
Added instrumentation to compiler which shows
that most time (by far, like 80+ percent) is
spent in Xerces validating each schema
file individually. This is unnecessary. We
only need to validate the top level schema
to detect UPA errors.
Fixed other error causing repeated
evaluation of OOLAG LV values.
DAFFODIL-2965, DAFFODIL-2781
This is the output from the instrumentation before the functional improvements in this change set:
Here is the instrumentation after: