Investigate ways to speed up FHIRJsonParser #2424
Comments
I added code to
I also ran a separate benchmark with VisualVM to see where the "hot spots" are for both versions of the parser.

Hot spots for the original version of the parser:
Hot spots for the loop / switch version of the parser:
The loop / switch version of the parser is likely slower because it requires tracking state between loop iterations to ensure that certain elements are not processed twice. This is primarily an issue when we have primitive-typed elements with extensions. The evidence that supports this theory is the [...]

In either case, a lot of time is spent a) validating XHTML content and b) parsing date/time strings using the [...]

We can mitigate this by making XHTML content validation optional for situations where we are reading from the database and we know it has already been checked. I added a flag to test out this theory and ran a JMH benchmark using the original parser (one run checks XHTML content and the other does not). Here are the results:
This result shows that, by simply skipping XHTML content validation, we see a 30% improvement in throughput. We have several options on how to proceed:
For option 4, we can look at the structure definition for the narrative data type. There are two constraints that include the [...]

Both of these could be implemented in Java and would likely be lighter weight than using [...] HAPI does not currently implement the [...]
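To make the "implement it in Java" idea concrete, here is a minimal sketch of a single-pass StAX check on a narrative `div`. Everything in it is an assumption for illustration: the class name, the method name, and especially the abbreviated allow-list, which is not the rule set from the FHIR specification. It only shows the shape of a streaming check that rejects malformed markup, unexpected elements, and empty content.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class NarrativeCheckSketch {
    // Abbreviated, hypothetical allow-list; a real implementation needs the full rule set.
    private static final Set<String> ALLOWED = Set.of(
            "div", "p", "b", "i", "br", "span", "a", "img",
            "ul", "ol", "li", "table", "tr", "td", "th", "h1", "h2", "h3");

    // Configured once and not modified afterwards.
    private static final XMLInputFactory FACTORY = createFactory();

    private static XMLInputFactory createFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // No DTDs or external entities; named entities such as &nbsp; are therefore rejected.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return factory;
    }

    /** Returns true if the markup is well-formed, rooted at div, uses only allowed elements, and is not empty. */
    static boolean isAcceptable(String xhtml) {
        boolean sawRoot = false;
        boolean hasContent = false;
        try {
            XMLStreamReader reader = FACTORY.createXMLStreamReader(new StringReader(xhtml));
            try {
                while (reader.hasNext()) {
                    switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        String name = reader.getLocalName();
                        if (!sawRoot) {
                            if (!"div".equals(name)) {
                                return false;
                            }
                            sawRoot = true;
                        }
                        if (!ALLOWED.contains(name)) {
                            return false;
                        }
                        if ("img".equals(name)) {
                            hasContent = true; // an image counts as content in this sketch
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        if (!reader.isWhiteSpace()) {
                            hasContent = true;
                        }
                        break;
                    default:
                        break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            return false; // not well-formed
        }
        return sawRoot && hasContent;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("<div xmlns=\"http://www.w3.org/1999/xhtml\"><p>Hello</p></div>"));
        System.out.println(isAcceptable("<div xmlns=\"http://www.w3.org/1999/xhtml\"> </div>"));
    }
}
```

Whether this is actually faster than the current validation would need to be measured with the same JMH benchmark; the sketch only shows that the check fits in one streaming pass.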
Additional thoughts from the team:

@lmsurpre mentioned the following relevant tickets:

From @punktilious: Between eliminating the extra read(s), only reading metadata (fingerprint) in certain scenarios, and skipping some of the validation, I’m confident that we will have solid improvement in performance. Deferral of resource deserialization is a potential option but will likely require a fair bit more refactoring.

Personally I think [...]
I put together a proof-of-concept where I updated the [...]

    @Override
    public Coverage build() {
        // Default: extended validation stays on
        return new Coverage(this, true);
    }

    @Override
    public Coverage build(boolean extendedValidation) {
        return new Coverage(this, extendedValidation);
    }

This parameter is passed to the constructor of the object that uses the builder for its construction:

    private Coverage(Builder builder, boolean extendedValidation) {
        super(builder, extendedValidation);
        coverage = ValidationSupport.requireNonNull(builder.coverage, "coverage");
        priority = builder.priority;
        // Only the more expensive checks are guarded by the flag
        if (extendedValidation) {
            ValidationSupport.checkReferenceType(coverage, "coverage", "Coverage");
            ValidationSupport.requireValueOrChildren(this);
        }
    }

I also added a configuration property to the [...]

    return builder.build(getPropertyOrDefault(FHIRParser.PROPERTY_EXTENDED_VALIDATION, java.lang.Boolean.TRUE, java.lang.Boolean.class));

I updated [...]

    @Benchmark
    public Resource benchmarkJsonParser(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.jsonParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, true);
        return parsers.jsonParser.parse(new StringReader(state.JSON_SPEC_EXAMPLE));
    }

    @Benchmark
    public Resource benchmarkJsonParserWithoutExtendedValidation(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.jsonParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, false);
        return parsers.jsonParser.parse(new StringReader(state.JSON_SPEC_EXAMPLE));
    }

    @Benchmark
    public Resource benchmarkXMLParser(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.xmlParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, true);
        return parsers.xmlParser.parse(new StringReader(state.XML_SPEC_EXAMPLE));
    }

    @Benchmark
    public Resource benchmarkXMLParserWithoutExtendedValidation(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.xmlParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, false);
        return parsers.xmlParser.parse(new StringReader(state.XML_SPEC_EXAMPLE));
    }

Here are the numbers from 5 random spec examples:
There is a marked improvement in [...]

What I've learned from this experiment:
I created a second proof-of-concept that uses a thread-local configuration property. I added a thread-local map to [...]

    /**
     * A global map of configuration properties
     */
    private static final Map<String, Object> globalProperties = new ConcurrentHashMap<>();

    /**
     * A thread local map of configuration properties
     */
    private static final ThreadLocal<Map<String, Object>> localProperties = new ThreadLocal<Map<String, Object>>() {
        @Override
        public Map<String, Object> initialValue() {
            return new HashMap<>();
        }
    };

and added method signatures with a [...]

    public static void setProperty(String name, Object value, boolean local) {
        Map<String, Object> properties = local ? localProperties.get() : globalProperties;
        properties.put(requireNonNull(name), requireNonNull(value));
    }

    public static Object getProperty(String name, boolean local) {
        Map<String, Object> properties = local ? localProperties.get() : globalProperties;
        return properties.get(requireNonNull(name));
    }

Then I created a thread-local configuration property named [...]

    public static final String PROPERTY_EXTENDED_VALIDATION = "com.ibm.fhir.model.extendedValidation";

    // Getter aligned with the property above and with the getExtendedValidation() call used below;
    // DEFAULT_EXTENDED_VALIDATION is assumed to be defined next to the property name.
    public static boolean getExtendedValidation() {
        return getPropertyOrDefault(PROPERTY_EXTENDED_VALIDATION, DEFAULT_EXTENDED_VALIDATION, Boolean.class, true);
    }

    public static void setExtendedValidation(boolean extendedValidation) {
        setProperty(PROPERTY_EXTENDED_VALIDATION, extendedValidation, true);
    }

I added a parser configuration property as I did in my first experiment. The JSON and XML parsers check the parser configuration property and set the thread-local property accordingly:

    // com.ibm.fhir.model.parser.FHIRJsonParser
    public <T extends Resource> T parseAndFilter(InputStream in, Collection<java.lang.String> elementsToInclude) throws FHIRParserException {
        FHIRModelConfig.setExtendedValidation(getPropertyOrDefault(FHIRParser.PROPERTY_EXTENDED_VALIDATION, java.lang.Boolean.TRUE, java.lang.Boolean.class));
        //... <snip>
    }

Methods in [...]

    public static void checkXHTMLContent(String value) {
        // Skip the schema validation entirely when extended validation is off for this thread
        if (!FHIRModelConfig.getExtendedValidation()) {
            return;
        }
        try {
            Validator validator = THREAD_LOCAL_VALIDATOR.get();
            validator.reset();
            validator.validate(new StreamSource(new StringReader(value)));
        } catch (Exception e) {
            throw new IllegalStateException(String.format("Invalid XHTML content: %s", e.getMessage()), e);
        }
    }

After running [...]

These results are slightly different from the "parameterized builder method" version in that the XML parser without extended validation appears to be consistently faster than the XML parser with extended validation. This wasn't as apparent in the previous results.

One advantage of this approach is that we don't need to change the public API and can implement this as more of a "power user" configuration option.

@lmsurpre suggested that we change the behavior of [...]
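One thing to watch with any thread-local switch is that the value stays on the thread after the call that set it, which matters on pooled threads. The sketch below is not code from the prototype: `ScopedValidationSketch`, `withExtendedValidation` and `checkContent` are hypothetical names, and the "validation" is a stand-in. It shows one way to scope the setting to a unit of work and restore the previous value afterwards.

```java
import java.util.function.Supplier;

final class ScopedValidationSketch {
    // Hypothetical stand-in for the thread-local property; validation is on by default.
    private static final ThreadLocal<Boolean> EXTENDED_VALIDATION =
            ThreadLocal.withInitial(() -> Boolean.TRUE);

    static boolean isExtendedValidation() {
        return EXTENDED_VALIDATION.get();
    }

    /** Runs the work with the given setting, then restores whatever was set before. */
    static <T> T withExtendedValidation(boolean enabled, Supplier<T> work) {
        boolean previous = EXTENDED_VALIDATION.get();
        EXTENDED_VALIDATION.set(enabled);
        try {
            return work.get();
        } finally {
            // Restore even if the work throws, so a pooled thread does not keep the setting.
            EXTENDED_VALIDATION.set(previous);
        }
    }

    /** Stand-in for an expensive check that consults the thread-local switch. */
    static String checkContent(String value) {
        if (!isExtendedValidation()) {
            return value;
        }
        if (value.isEmpty()) {
            throw new IllegalStateException("Invalid content: empty");
        }
        return value;
    }

    public static void main(String[] args) {
        String skipped = withExtendedValidation(false, () -> checkContent(""));
        System.out.println("length=" + skipped.length()
                + ", validation restored=" + isExtendedValidation());
    }
}
```

In the prototype the parsers set the property at the start of every parse, so parsing itself always sees the intended value; the restore step mainly protects code that builds model objects directly on the same thread afterwards.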
Re Investigate ways to speed up FHIRJsonParser #2424
Maybe something here is compelling performance-wise: https://advancedweb.hu/a-categorized-list-of-all-java-and-jvm-features-since-jdk-8-to-16/#performance-improvements or naturally in the JDK.
Example: https://build.fhir.org/ig/HL7/carin-bb/Organization-OrganizationProvider1.json.html. Basically, we would assume text.div.where(text.status = 'generated') is valid (this could be turned on or off by configuration).
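As a rough illustration of that suggestion, the decision could be reduced to a small predicate. The class name, method name and the `trustGenerated` flag are hypothetical; only the `generated` narrative status code comes from FHIR.

```java
final class GeneratedNarrativeSketch {
    /**
     * Hypothetical policy: returns true when the div should still go through XHTML validation.
     * A generated narrative is trusted only when the configuration flag allows it.
     */
    static boolean shouldValidateDiv(String narrativeStatus, boolean trustGenerated) {
        boolean generated = "generated".equals(narrativeStatus);
        return !(trustGenerated && generated);
    }

    public static void main(String[] args) {
        System.out.println(shouldValidateDiv("generated", true));
        System.out.println(shouldValidateDiv("additional", true));
    }
}
```

Keeping the flag off by default would preserve the current behavior for callers that do not opt in.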
We'd need to modify the Audit APIs. Today we take the Resource and then strip off the values we need. We could vastly simplify this by passing only the relevant data (e.g. Resource.id) as part of the adapter instead of deserializing.
- 1 Level: Would it be good to test more of the hierarchical resources with codeable concepts? Also, what happens with a Cnt of 100 in the benchmark? Does the error rate drop?
For this "investigate" ticket, I put together one final prototype. It is very similar to the "parameterized build method" version, but instead it parameterizes the builder factory method and introduces the concept of a builder factory. I updated the [...]

    @Override
    public Builder toBuilder() {
        return new Builder().from(this);
    }

    @Override
    public Builder toBuilder(Map<java.lang.String, ?> options) {
        return new Builder(options).from(this);
    }

    public static Builder builder() {
        return new Builder();
    }

    public static Builder builder(Map<java.lang.String, ?> options) {
        // Pass the options through; the no-arg constructor would silently drop them
        return new Builder(options);
    }

Updates to the [...]

    public static class Builder extends DomainResource.Builder {
        private List<Identifier> identifier = new ArrayList<>();
        private AccountStatus status;
        private CodeableConcept type;
        private String name;
        private List<Reference> subject = new ArrayList<>();
        private Period servicePeriod;
        private List<Coverage> coverage = new ArrayList<>();
        private Reference owner;
        private String description;
        private List<Guarantor> guarantor = new ArrayList<>();
        private Reference partOf;

        private Builder() {
            super();
        }

        private Builder(Map<java.lang.String, ?> options) {
            super(options);
        }

        // ... <snip>
    }

Updates to the [...]

    public abstract class AbstractBuilder<T> implements Builder<T> {
        protected final Map<String, ?> options;

        public AbstractBuilder() {
            this(Collections.emptyMap());
        }

        public AbstractBuilder(Map<String, ?> options) {
            this.options = Objects.requireNonNull(options, "options");
        }

        @Override
        public abstract T build();

        public Map<String, ?> getOptions() {
            return options;
        }
    }

Added code to [...]

    public final class FHIRModelBuilderFactory {
        private final Map<java.lang.String, ?> options;

        private FHIRModelBuilderFactory(Map<java.lang.String, ?> options) {
            super();
            this.options = Objects.requireNonNull(options, "options");
        }

        public static FHIRModelBuilderFactory newInstance(Map<java.lang.String, ?> options) {
            return new FHIRModelBuilderFactory(options);
        }

        // One factory method per model class
        public Account.Builder accountBuilder() {
            return Account.builder(options);
        }

        public Account.Coverage.Builder accountCoverageBuilder() {
            return Account.Coverage.builder(options);
        }

        public Account.Guarantor.Builder accountGuarantorBuilder() {
            return Account.Guarantor.builder(options);
        }

        public ActivityDefinition.Builder activityDefinitionBuilder() {
            return ActivityDefinition.builder(options);
        }

        // ... <snip>

        // One toBuilder overload per model class
        public Account.Builder toBuilder(Account account) {
            return account.toBuilder(options);
        }

        public Account.Coverage.Builder toBuilder(Account.Coverage accountCoverage) {
            return accountCoverage.toBuilder(options);
        }

        public Account.Guarantor.Builder toBuilder(Account.Guarantor accountGuarantor) {
            return accountGuarantor.toBuilder(options);
        }

        public ActivityDefinition.Builder toBuilder(ActivityDefinition activityDefinition) {
            return activityDefinition.toBuilder(options);
        }

        public ActivityDefinition.DynamicValue.Builder toBuilder(ActivityDefinition.DynamicValue activityDefinitionDynamicValue) {
            return activityDefinitionDynamicValue.toBuilder(options);
        }

        public ActivityDefinition.Participant.Builder toBuilder(ActivityDefinition.Participant activityDefinitionParticipant) {
            return activityDefinitionParticipant.toBuilder(options);
        }

        // ... <snip>
    }

The idea here is that the "builder factory" class can be used to create builders and uniformly configure them with a single map of options as follows:

    // usage
    FHIRModelBuilderFactory f = FHIRModelBuilderFactory.newInstance(options);
    Patient patient = f.patientBuilder()
        .name(f.humanNameBuilder()
            .family(string("Doe"))
            .build())
        .build();

Then in the parser classes we would use a "builder factory" instance to create all of the builders. The map of options would be created from parser options.

My thoughts after (partially) implementing this approach:
I completely agree with these points.
* Issue #2424 - add support for non-validating builders/parsers
* Issue #2424 - update copyright header
* Issue #2424 - updates per PR feedback
* Issue #2424 - updated Javadoc
* Issue #2424 - introduce ignoringUnrecognizedElements field
* Issue #2424 - updated unit test
* Issue #2424 - fixed issue with Builder.validate method
* Issue #2424 - updated Javadoc wording

Signed-off-by: John T.E. Timm <[email protected]>
John decided on a variation of the options presented here: a flag on the parser (and the underlying builders) that indicates whether it should perform validation (or not) while building objects. This decision was made after taking another look at Effective Java, the Google Protobuf Java library, the javapoet Java library, and this Stack Overflow:
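To make the chosen direction easier to picture, here is a minimal, self-contained sketch of a validating flag that lives on the builder and is set by the parser. It is not the merged implementation: the `Coding`-like type, the `system|code` token format, and the names `setValidating` and `Parser` are assumptions made for illustration.

```java
final class ValidatingBuilderSketch {
    /** Hypothetical, heavily reduced model type. */
    static final class Coding {
        final String system;
        final String code;

        private Coding(Builder builder) {
            system = builder.system;
            code = builder.code;
            // Checks run only when the builder is in validating mode.
            if (builder.validating && code == null) {
                throw new IllegalStateException("Missing required element: 'code'");
            }
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            private String system;
            private String code;
            private boolean validating = true; // validation stays on unless switched off

            Builder system(String system) {
                this.system = system;
                return this;
            }

            Builder code(String code) {
                this.code = code;
                return this;
            }

            Builder setValidating(boolean validating) {
                this.validating = validating;
                return this;
            }

            Coding build() {
                return new Coding(this);
            }
        }
    }

    /** Hypothetical parser for "system|code" tokens that passes its flag to every builder it creates. */
    static final class Parser {
        private boolean validating = true;

        void setValidating(boolean validating) {
            this.validating = validating;
        }

        Coding parse(String token) {
            int bar = token.indexOf('|');
            String system = bar < 0 ? null : token.substring(0, bar);
            String code = bar < 0 ? token : token.substring(bar + 1);
            return Coding.builder()
                    .setValidating(validating)
                    .system(system)
                    .code(code.isEmpty() ? null : code)
                    .build();
        }
    }

    public static void main(String[] args) {
        Parser parser = new Parser();
        parser.setValidating(false); // e.g. when reading data that was validated on the way in
        Coding coding = parser.parse("http://loinc.org|");
        System.out.println("code=" + coding.code);
    }
}
```

Compared with the thread-local and builder-factory prototypes, the flag travels with each builder instance, so there is no hidden per-thread state and no parallel factory class to generate.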
FHIRJsonParser is currently generated from FHIR structure definitions using com.ibm.fhir.tools.CodeGenerator. The current structure of the parser is to check for all possible elements, which results in more map lookups and method invocations than are needed for a typical instance. A better structure might be something like this:
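As one possible reading of the two structures discussed in this issue, and not the code produced by the generator, the sketch below contrasts probing for every possible element with looping over only the elements that are present. It uses a plain `Map` as a stand-in for a JSON object, and the `Item` type and method names are hypothetical. The `_priority` case follows the FHIR JSON convention of carrying a primitive's extensions in an underscore-prefixed sibling property.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LoopSwitchParserSketch {
    /** Hypothetical, heavily reduced model type. */
    static final class Item {
        String id;
        Integer priority;
        Object priorityExtension; // taken from the "_priority" sibling property
    }

    /** Probing style: one lookup per element the type could have, present or not. */
    static Item parseByProbing(Map<String, Object> json) {
        Item item = new Item();
        item.id = (String) json.get("id");
        item.priority = (Integer) json.get("priority");
        item.priorityExtension = json.get("_priority");
        return item;
    }

    /** Loop / switch style: only the keys that actually occur are visited. */
    static Item parseByLoop(Map<String, Object> json) {
        Item item = new Item();
        for (Map.Entry<String, Object> entry : json.entrySet()) {
            switch (entry.getKey()) {
            case "id":
                item.id = (String) entry.getValue();
                break;
            // A primitive and its "_" sibling can arrive in either order, so partial
            // state has to be kept until both have been seen.
            case "priority":
                item.priority = (Integer) entry.getValue();
                break;
            case "_priority":
                item.priorityExtension = entry.getValue();
                break;
            default:
                throw new IllegalArgumentException("Unrecognized element: '" + entry.getKey() + "'");
            }
        }
        return item;
    }

    public static void main(String[] args) {
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("_priority", "extension data");
        json.put("priority", 1);
        Item item = parseByLoop(json);
        System.out.println("priority=" + item.priority + ", extension=" + item.priorityExtension);
    }
}
```

The loop version also detects unrecognized elements for free, while the probing version needs a separate pass for that; the cost is the extra bookkeeping for split primitives noted in the comment.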