Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code removal regression #2327

Closed
sdeleuze opened this issue Apr 9, 2020 · 14 comments
Closed

Code removal regression #2327

sdeleuze opened this issue Apr 9, 2020 · 14 comments
Assignees
Labels
bug native-image spring spring related issue

Comments

@sdeleuze
Copy link
Collaborator

sdeleuze commented Apr 9, 2020

It seems there is a big regression at code removal level between 19.3.1 and 20.0.0. The reports mentioned are available in https://github.com/sdeleuze/graal-issues/tree/master/code-removal-regression.

The very same jafu example:

With GraalVM 19.3.1:

  • Image build time: 36.2s
  • RSS memory: 14.7M
  • Image size: 15.0M

With GraalVM 20.0.0:

  • Image build time: 43.2s
  • RSS memory: 17.9M
  • Image size: 21.8M

A diff between packages reports available in the jafu-sample-19.3.1-reports and jafu-sample-20.0.0-reports shows that GraaVM 20.0.0 is unable to detect that following packages and related classes are not used:

  • com.sun.org.apache.xalan
  • com.sun.org.apache.xerces
  • com.sun.xml.internal.stream
  • javax.xml.datatype
  • javax.xml.namespace
  • javax.xml.stream
  • javax.xml.transform
  • jdk.xml.internal
  • org.w3c.dom
  • org.xml.sax.helpers

Same problem with the jafu-webmvc one:

With GraalVM 19.3.1:

  • Image build time: 78.8s
  • RSS memory: 38.3M
  • Image size: 37.8M

With GraalVM 20.0.0:

  • Image build time: 87.6s
  • RSS memory: 50.6M
  • Image size: 55.8M

Similar diff with a lot of XML related classed can be observed between packages reports available in the jafu-webmvc-sample-19.3.1-reports and jafu-webmvc-sample-20.0.0-reports.

@eginez
Copy link
Contributor

eginez commented Apr 9, 2020

@arodionov do you think this is related to your recent commit on xml packages?

@arodionov
Copy link

@eginez eginez assigned arodionov and unassigned cstancu Apr 16, 2020
@arodionov
Copy link

The XML-parsers support has been added in GraalVM 20.0.0.
Now it is possible to use Spring XML configuration without the agent.

As a result, if the code that loads parser via reflection is reachable, the specified parser and all related classes will be loaded too.
Spring with Java 8 triggers the DocumentBuilderFactory and SAXParserFactory parsers by the following branch of the call-tree: org.springframework.boot.SpringApplication.run -> org.springframework.boot.SpringApplication.getSpringFactoriesInstances -> org.springframework.core.io.support.SpringFactoriesLoader.loadFactoryNames -> org.springframework.core.io.support.SpringFactoriesLoader.loadSpringFactories -> org.springframework.core.io.support.PropertiesLoaderUtils.loadProperties -> org.springframework.core.io.support.PropertiesLoaderUtils.fillProperties -> java.util.Properties.loadFromXML -> java.util.Properties$XmlSupport.load -> ... -> javax.xml.parsers.FactoryFinder.find
This loads ~490 classes.

In Java 11 there is another implementation of the java.util.Properties.loadFromXML, with direct SAXParser call (without reflection usage). So, for Java 11 images size shouldn't change between GraalVM versions.
Now I am working on reducing the number of pulled classes, by splitting them between different parsers. The size of the image should decrease for 2Mb after this.

@sbrannen
Copy link

sbrannen commented Apr 21, 2020

Thanks for the detailed analysis, @arodionov.

For this particular use case (looking up the spring.factories file), Spring will never actually load the properties from an XML format, since the spring.factories file name does not end in .xml. So perhaps SpringFactoriesLoader could be reworked to omit the possibility of loading an XML file when loading spring.factories files, or perhaps we find some other workaround.

Though, Spring is always going to use java.util.Properties in such use cases. So perhaps it's a moot point about trying to avoid the loading of those XML types unless we introduce a substitution for Java 8.

@arodionov
Copy link

@sbrannen

unless we introduce a substitution for Java 8.

Yes, it could be a solution. For Java 11 java.util.Properties brings only ~16 classes with SAXParser.

@sdeleuze
Copy link
Collaborator Author

I have been able to restore previous size and memory consumption for our jafu sample (the most basic one) with this substitution. That said I still think optimizing the number of classes compiled make sense for use cases where XML parsers are effectively needed.

The jafu-webmvc one still includes them even if does not use XML converters, so I suspect another codepath includes it, I will have a deeper look.

@arodionov Could you explain how you proceed to diagnose what triggers XML parser usage? I have been myself using -H:+PrintAnalysisCallTree + searching in the call_tree_project_*_*.txt occurrences of directly calls + XML parser class fully qualified name, but it is not super efficient. Do you have a more efficient workflow?

@arodionov
Copy link

@sdeleuze There are 8 classes instantiated via reflection, so we create handlers for the following methods where those classes instantiated.
If one of those methods are reachable via the call-tree, all XML-classes will be added to the image.

I still think optimizing the number of classes compiled make sense for use cases where XML parsers are effectively needed.

Working on this, to make reachability handlers more precise and include only a limited set of XML-classes.

@arodionov
Copy link

arodionov commented Apr 22, 2020

@sdeleuze For jafu-webmvc there are several branches that trigger XML-parsers:

  • org.apache.catalina.servlets.DefaultServlet.renderXml(javax.servlet.http.HttpServletRequest, java.lang.String, org.apache.catalina.WebResource, javax.xml.transform.Source, java.lang.String) brings TransformerFactory.newInstance()
  • org.apache.tomcat.util.digester.Digester.getFactory() brings SAXParserFactory.newInstance()
  • org.springframework.http.converter.xml.SourceHttpMessageConverter.readDOMSource(InputStream, HttpInputMessage) brings DocumentBuilderFactory.newInstance()
  • org.springframework.http.converter.xml.SourceHttpMessageConverter.readStAXSource(java.io.InputStream, org.springframework.http.HttpInputMessage) brings javax.xml.stream.XMLInputFactory.newInstance()

@sdeleuze
Copy link
Collaborator Author

@arodionov Thanks I am working on fixing those on Spring side.

@fhanik Could you have a look to the Tomcat ones?

@sdeleuze
Copy link
Collaborator Author

sdeleuze commented Apr 23, 2020

Ok so using this branch with various optimizations to remove XML classes using @arodionov guidance and unused converters, I have these figures.

Before:
Build memory: 6.83GB
Image build time: 81.2s
RSS memory: 50.2M
Image size: 55.8M
Startup time: 0.012 (JVM running for 0.013)

After:
Build memory: 6.40GB
Image build time: 59.5s
RSS memory: 37.1M
Image size: 37.1M
Startup time: 0.009 (JVM running for 0.01)

So we are back at the level of previous figures.

Beware that for now I just remove org.apache.catalina.servlets.DefaultServlet.renderXml and org.apache.tomcat.util.digester.Digester.getFactory via substitutions, I guess we will have to use a more refined solution.

It seems also that a lot of crypto classes are now included, I am wondering if HTTPS support is now detected automatically via the code rather than enabled via a command line parameter, bu that's probably out of the scope of that issue.

@sdeleuze
Copy link
Collaborator Author

After a second look, XML classes are still included, the great figures I got seems to come for the sum other the various optimizations I did. I have not found yet what triggers their inclusion.

If you want to reproduce:

These XML classes are included despite having removed the 4 branches previously mentioned that were triggering XML parsers:

javax.xml.datatype.DatatypeFactory.<init>():void
javax.xml.namespace.QName.<init>(String, String):void
javax.xml.namespace.QName.<init>(String, String, String):void
javax.xml.namespace.QName.equals(Object):boolean
javax.xml.namespace.QName.getLocalPart():String
javax.xml.namespace.QName.getNamespaceURI():String
javax.xml.namespace.QName.hashCode():int
javax.xml.namespace.QName.toString():String
javax.xml.parsers.DocumentBuilder.<init>():void
javax.xml.parsers.DocumentBuilder.parse(InputStream):Document
javax.xml.parsers.DocumentBuilderFactory.<init>():void
javax.xml.parsers.DocumentBuilderFactory.isCoalescing():boolean
javax.xml.parsers.DocumentBuilderFactory.isExpandEntityReferences():boolean
javax.xml.parsers.DocumentBuilderFactory.isIgnoringComments():boolean
javax.xml.parsers.DocumentBuilderFactory.isIgnoringElementContentWhitespace():boolean
javax.xml.parsers.DocumentBuilderFactory.isNamespaceAware():boolean
javax.xml.parsers.DocumentBuilderFactory.isValidating():boolean
javax.xml.parsers.DocumentBuilderFactory.newInstance():DocumentBuilderFactory
javax.xml.parsers.DocumentBuilderFactory.setNamespaceAware(boolean):void
javax.xml.parsers.DocumentBuilderFactory.setValidating(boolean):void
javax.xml.parsers.FactoryConfigurationError.<init>(Exception, String):void
javax.xml.parsers.FactoryConfigurationError.<init>(String):void
javax.xml.parsers.FactoryConfigurationError.getCause():Throwable
javax.xml.parsers.FactoryConfigurationError.getMessage():String
javax.xml.parsers.FactoryFinder$1.<init>(Class):void
javax.xml.parsers.FactoryFinder$1.run():Object
javax.xml.parsers.FactoryFinder.dPrint(String):void
javax.xml.parsers.FactoryFinder.find(Class, String):Object
javax.xml.parsers.FactoryFinder.findServiceProvider(Class):Object
javax.xml.parsers.FactoryFinder.getProviderClass(String, ClassLoader, boolean, boolean):Class
javax.xml.parsers.FactoryFinder.newInstance(Class, String, ClassLoader, boolean):Object
javax.xml.parsers.FactoryFinder.newInstance(Class, String, ClassLoader, boolean, boolean):Object
javax.xml.parsers.ParserConfigurationException.<init>(String):void
javax.xml.parsers.SAXParser.<init>():void
javax.xml.parsers.SAXParserFactory.<init>():void
javax.xml.parsers.SAXParserFactory.newInstance():SAXParserFactory
javax.xml.parsers.SecuritySupport$1.<init>(SecuritySupport):void
javax.xml.parsers.SecuritySupport$1.run():Object
javax.xml.parsers.SecuritySupport$2.<init>(SecuritySupport, String):void
javax.xml.parsers.SecuritySupport$2.run():Object
javax.xml.parsers.SecuritySupport$3.<init>(SecuritySupport, File):void
javax.xml.parsers.SecuritySupport$3.run():Object
javax.xml.parsers.SecuritySupport$5.<init>(SecuritySupport, File):void
javax.xml.parsers.SecuritySupport$5.run():Object
javax.xml.parsers.SecuritySupport.doesFileExist(File):boolean
javax.xml.parsers.SecuritySupport.getContextClassLoader():ClassLoader
javax.xml.parsers.SecuritySupport.getFileInputStream(File):FileInputStream
javax.xml.parsers.SecuritySupport.getSystemProperty(String):String
javax.xml.stream.XMLEventFactory.<init>():void
javax.xml.stream.XMLInputFactory.<init>():void
javax.xml.stream.XMLOutputFactory.<init>():void
javax.xml.stream.XMLStreamException.<init>():void
javax.xml.stream.XMLStreamException.getLocation():Location
javax.xml.transform.TransformerFactory.<init>():void
javax.xml.transform.dom.DOMSource.<init>(Node):void
javax.xml.transform.dom.DOMSource.setNode(Node):void
javax.xml.transform.sax.SAXTransformerFactory.<init>():void
javax.xml.transform.stream.StreamSource.<init>(InputStream):void
javax.xml.transform.stream.StreamSource.setInputStream(InputStream):void

I guess I missed other branches but I have not found yet where they are. I am reporting that to double check there is not an underlying bug or unexpected behavior.

@arodionov
Copy link

@sdeleuze Will look again what triggers the XML inclusion.

@sdeleuze
Copy link
Collaborator Author

sdeleuze commented Apr 24, 2020

Oh I think I found, it could beorg.apache.catalina.servlets.DefaultServlet.findXsltSource, I will try to remove that one as well. I also need to disable this initialization block, I guess recent GraalVM native distributions now enable the security manager by default:

static {
        if (Globals.IS_SECURITY_ENABLED) {
            factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false);
            secureEntityResolver = new SecureEntityResolver();
        } else {
            factory = null;
            secureEntityResolver = null;
        }
    }

@sdeleuze
Copy link
Collaborator Author

Ok I managed to remove those one as well by enabling static initialization at build time for those Tomcat classes (security manager is still disabled by default):

Build memory: 4.58GB
Image build time: 59.3s
RSS memory: 33.1M
Image size: 29.7M
Startup time: 0.01 (JVM running for 0.01)

I will let you close this issue when you have finished your optimizations, on my side it is ok.

sdeleuze added a commit to sdeleuze/spring-native that referenced this issue May 15, 2020
This commit introduces various optimizations that allow to significantly
reduce Spring Boot applications footprint when compiled as native images.

The first one is the capability to remove XML parsers (see related
oracle/graal#2327 GraalVM issue) including those configured habitually
by Spring. XML Parsers from Logback, Tomcat, Spring Boot, Spring Framework
Core/WebMvc/WebFlux can be disabled via -Dspring.graal.remove-xml-support=true
and is implemented via various substitutions.
Notice for now FormHttpMessageConverter is used instead of
AllEncompassingFormHttpMessageConverter due to oracle/graal#2458.

It is now also possible to remove SpEL via -Dspring.graal.remove-spel-support=true
and JMX via -Dspring.graal.remove-jmx-support=true.

BackgroundPreinitializer are now disabled since they do not make sense with
GraalVM native.

spring-graal-feature is now added as a regular Maven dependency in order to make
the new org.springframework.boot.NativePropertiesListener taken in account when
the agent is used.

Key samples and petclinic ones have been updated to take advantage of those
optimizations. Spring Fu samples now also takes advantages of
spring-projects-experimental/spring-fu#269 and those capabilities.

In the long run, these "not so easy to maintain substitutions" could maybe be
replaced by a new GraalVM capability that would monitor used classes by the
JVM via an agent and pass this list to native-image compiler to remove those
unused classes from the native image in order to load classes lazily like
the JVM does. It would also have the advantage to not require an explicit
option to enable those optimizations.

Closes spring-atticgh-109
dsyer pushed a commit to scratches/spring-graalvm-native that referenced this issue Sep 23, 2020
This commit introduces various optimizations that allow to significantly
reduce Spring Boot applications footprint when compiled as native images.

The first one is the capability to remove XML parsers (see related
oracle/graal#2327 GraalVM issue) including those configured habitually
by Spring. XML Parsers from Logback, Tomcat, Spring Boot, Spring Framework
Core/WebMvc/WebFlux can be disabled via -Dspring.graal.remove-xml-support=true
and is implemented via various substitutions.
Notice for now FormHttpMessageConverter is used instead of
AllEncompassingFormHttpMessageConverter due to oracle/graal#2458.

It is now also possible to remove SpEL via -Dspring.graal.remove-spel-support=true
and JMX via -Dspring.graal.remove-jmx-support=true.

BackgroundPreinitializer are now disabled since they do not make sense with
GraalVM native.

spring-graal-feature is now added as a regular Maven dependency in order to make
the new org.springframework.boot.NativePropertiesListener taken in account when
the agent is used.

Key samples and petclinic ones have been updated to take advantage of those
optimizations. Spring Fu samples now also takes advantages of
spring-projects-experimental/spring-fu#269 and those capabilities.

In the long run, these "not so easy to maintain substitutions" could maybe be
replaced by a new GraalVM capability that would monitor used classes by the
JVM via an agent and pass this list to native-image compiler to remove those
unused classes from the native image in order to load classes lazily like
the JVM does. It would also have the advantage to not require an explicit
option to enable those optimizations.

Closes spring-atticgh-109
dsyer pushed a commit to scratches/spring-graalvm-native that referenced this issue Sep 23, 2020
This commit introduces various optimizations that allow to significantly
reduce Spring Boot applications footprint when compiled as native images.

The first one is the capability to remove XML parsers (see related
oracle/graal#2327 GraalVM issue) including those configured habitually
by Spring. XML Parsers from Logback, Tomcat, Spring Boot, Spring Framework
Core/WebMvc/WebFlux can be disabled via -Dspring.graal.remove-xml-support=true
and is implemented via various substitutions.
Notice for now FormHttpMessageConverter is used instead of
AllEncompassingFormHttpMessageConverter due to oracle/graal#2458.

It is now also possible to remove SpEL via -Dspring.graal.remove-spel-support=true
and JMX via -Dspring.graal.remove-jmx-support=true.

BackgroundPreinitializer are now disabled since they do not make sense with
GraalVM native.

spring-graal-feature is now added as a regular Maven dependency in order to make
the new org.springframework.boot.NativePropertiesListener taken in account when
the agent is used.

Key samples and petclinic ones have been updated to take advantage of those
optimizations. Spring Fu samples now also takes advantages of
spring-projects-experimental/spring-fu#269 and those capabilities.

In the long run, these "not so easy to maintain substitutions" could maybe be
replaced by a new GraalVM capability that would monitor used classes by the
JVM via an agent and pass this list to native-image compiler to remove those
unused classes from the native image in order to load classes lazily like
the JVM does. It would also have the advantage to not require an explicit
option to enable those optimizations.

Closes spring-atticgh-109
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug native-image spring spring related issue
Projects
None yet
Development

No branches or pull requests

6 participants