Investigate ways to speed up FHIRJsonParser #2424

JohnTimm · 2021-05-24T17:54:43Z

FHIRJsonParser is currently generated from FHIR structure definitions using com.ibm.fhir.tools.CodeGenerator. The current structure of the parser is to check for all possible elements which results in more map lookups and method invocations than are needed for a typical instance. A better structure might be something like this:

Patient.Builder builder = Patient.builder();
for (String key : jsonObject.keySet()) {
    switch (key) {
    case "active":
        builder.active(parseBoolean(...));
        break;
    case "gender":
        builder.gender(parseString(...));
        break;
    // ...
}
return builder.build();

The text was updated successfully, but these errors were encountered:

JohnTimm · 2021-05-25T15:41:48Z

I added code to com.ibm.fhir.tools.CodeGenerator to generate a separate Json parser that uses the loop / switch structure mentioned in the description. I then ran a JMH benchmark comparing the performance of the two versions. What I found is that the loop / switch version is actually ~14% slower than the original version:

FHIRParserBenchmark.benchmarkJsonParser             patient-example  thrpt    4  6087.543 ±  814.819  ops/s
FHIRParserBenchmark.benchmarkNewJsonParser          patient-example  thrpt    4  5237.625 ±  723.408  ops/s

I also ran separate benchmark with VisualVM to see where the "hot spots" are for both versions of the parser. Hot spots for the original version of the parser:

java.time.ZonedDateTime.from (16.8%)
javax.xml.validation.Validator.validate (16.7%)
java.time.YearMonth.from (7.2%)
java.time.format.DateTimeFormatter.parseBest (7.1%)

Hot spots for the loop / switch version of the parser:

javax.xml.validation.Validator.validate (15%)
java.time.ZonedDateTime.from (9.7%)
java.util.HashSet.add (8.7%)
java.time.format.DateTimeFormatter.parseBest (6.8%)
com.ibm.fhir.model.util.JsonSupport.checkForUnrecognizedElements (6%)

The loop / switch version of the parser is likely slower because it requires tracking state in between loop iterations to ensure that certain elements are not processed twice. This is primarily an issue when we have primitive typed elements with extensions. The evidence that supports this theory is the java.util.HashSet.add hot spot.

In either case, a lot of time is spent a) validating XHTML content and b) parsing date/time strings using the java.time library.

We can mitigate this by making XHTML content validation optional for situations where we are reading from the database and we know it has already been checked. I added a flag to test out this theory and ran a JMH benchmark using the original parser (one run checks XHTML content and the other does not). Here are the results:

FHIRParserBenchmark.benchmarkJsonParser             patient-example  thrpt    4  5674.831 ± 1300.087  ops/s
FHIRParserBenchmark.benchmarkNewJsonParser          patient-example  thrpt    4  7344.746 ± 1886.645  ops/s

This result shows that, by simply skipping XHTML content validation, we see a 30% improvement in throughput.

We have several options on how to proceed:

Skip XHTML content validation altogether
Make XHTML content validation optional using a thread local configuration option in FHIRModelConfig with the default value true (current behavior)
Move XHTML content validation to the FHIRPath engine (there is an htmlChecks FHIRPath function that is attached to constraints on the Narrative data type)
Simplify XHTML content validation
Some combination of the above

For option 4, we can look at the structure definition for the narrative data type. There are two constraints that include the htmlChecks FHIRPath function that also include a reference to an XPath version:

The narrative SHALL contain only the basic html formatting elements and attributes described in chapters 7-11 (except section 4 of chapter 9) and 15 of the HTML 4.0 standard, <a> elements (either name or href), images and internally contained style attributes

not(descendant-or-self::*[not(local-name(.)=('a', 'abbr', 'acronym', 'b', 'big', 'blockquote', 'br', 'caption', 'cite', 'code', 'col', 'colgroup', 'dd', 'dfn', 'div', 'dl', 'dt', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'img', 'li', 'ol', 'p', 'pre', 'q', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr', 'tt', 'ul', 'var'))]) and not(descendant-or-self::*/@*[not(name(.)=('abbr', 'accesskey', 'align', 'alt', 'axis', 'bgcolor', 'border', 'cellhalign', 'cellpadding', 'cellspacing', 'cellvalign', 'char', 'charoff', 'charset', 'cite', 'class', 'colspan', 'compact', 'coords', 'dir', 'frame', 'headers', 'height', 'href', 'hreflang', 'hspace', 'id', 'lang', 'longdesc', 'name', 'nowrap', 'rel', 'rev', 'rowspan', 'rules', 'scope', 'shape', 'span', 'src', 'start', 'style', 'summary', 'tabindex', 'title', 'type', 'valign', 'value', 'vspace', 'width'))])

and

The narrative SHALL have some non-whitespace content

descendant::text()[normalize-space(.)!=''] or descendant::h:img[@src]

Both of these could be implemented in Java and would likely be lighter weight than using javax.xml.validation.Validator. We do XHTML content validation during object construction and thus our com.ibm.fhir.path.function.HtmlChecksFunction.apply method always returns true:

https://github.com/IBM/FHIR/blob/05a1e05108a69d4f72f7afcfd111395b3630c5cd/fhir-path/src/main/java/com/ibm/fhir/path/function/HtmlChecksFunction.java#L33-L36

HAPI does not currently implement the htmlChecks function:

https://github.com/hapifhir/org.hl7.fhir.core/blob/5db9b65c217aa3014aff62313397b51762314200/org.hl7.fhir.r4/src/main/java/org/hl7/fhir/r4/utils/FHIRPathEngine.java#L2885-L2888

JohnTimm · 2021-05-25T15:52:09Z

Additional thoughts from the team:

@lmsurpre mentioned the following relevant tickets:
Avoid redundant reads during udpates: #2283
No need to read full resources during updates: #2417

From @punktilious
The doRead now returns a SingleResourceResult so potentially we could just stash the raw data there and use that for the audit instead of deserializing/serializing. I haven't looked at the audit APIs yet to see what that might need.

Between eliminating the extra read(s), only reading metadata (fingerprint) in certain scenarios and skipping some of the validation I’m confident that we will have solid improvement in performance.

Deferral of resource deserialization is a potential option but will likely require a fair bit more refactoring.

lmsurpre · 2021-05-25T19:09:22Z

Personally I think Deferral of resource deserialization is on the same order of magnitude as #2417 (and I actually like it a bit more). Should we open an issue for that idea?

JohnTimm · 2021-05-26T15:10:02Z

I put together a proof-of-concept where I updated the CodeGenerator to generate Builder classes that have a second build method that takes a single boolean parameter called extendedValidation:

            @Override
            public Coverage build() {
                return new Coverage(this, true);
            }

            @Override
            public Coverage build(boolean extendedValidation) {
                return new Coverage(this, extendedValidation);
            }

This parameter is passed to the constructor of the object that uses the builder for its construction:

        private Coverage(Builder builder, boolean extendedValidation) {
            super(builder, extendedValidation);
            coverage = ValidationSupport.requireNonNull(builder.coverage, "coverage");
            priority = builder.priority;
            if (extendedValidation) {
                ValidationSupport.checkReferenceType(coverage, "coverage", "Coverage");
                ValidationSupport.requireValueOrChildren(this);
            }
        }

I also added a configuration property to the FHIRAbstractParser class called PROPERTY_EXTENDED_VALIDATION. Finally, I updated the generation of the Json and XML parsers so that they call the parameterized build method as follows:

return builder.build(getPropertyOrDefault(FHIRParser.PROPERTY_EXTENDED_VALIDATION, java.lang.Boolean.TRUE, java.lang.Boolean.class));

I updated FHIRParserBenchmark to compare the parser performance with and without extended validation:

    @Benchmark
    public Resource benchmarkJsonParser(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.jsonParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, true);
        return parsers.jsonParser.parse(new StringReader(state.JSON_SPEC_EXAMPLE));
    }
    
    @Benchmark
    public Resource benchmarkJsonParserWithoutExtendedValidation(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.jsonParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, false);
        return parsers.jsonParser.parse(new StringReader(state.JSON_SPEC_EXAMPLE));
    }
    
    @Benchmark
    public Resource benchmarkXMLParser(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.xmlParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, true);
        return parsers.xmlParser.parse(new StringReader(state.XML_SPEC_EXAMPLE));
    }
    
    @Benchmark
    public Resource benchmarkXMLParserWithoutExtendedValidation(FHIRParsers parsers, FHIRParserState state) throws Exception {
        parsers.xmlParser.setProperty(FHIRParser.PROPERTY_EXTENDED_VALIDATION, false);
        return parsers.xmlParser.parse(new StringReader(state.XML_SPEC_EXAMPLE));
    }

Here are the numbers from 5 random spec examples:

-- parameterized build method version ---

servicerequest-example-lipid
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4   9882.252 ±  834.817  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  20597.446 ± 2059.247  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4   6961.406 ± 1510.813  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4   8045.860 ± 2437.989  ops/s

observation-example-vitals-panel
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4  16717.863 ± 1152.632  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  39479.768 ± 7987.225  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4  11272.700 ±  169.419  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4  12126.900 ±  805.585  ops/s

composition-example
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4   6938.540 ± 1326.027  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  15598.913 ±  520.688  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4   4935.206 ±  455.430  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4   5311.247 ± 1072.577  ops/s

diagnosticreport-example
Benchmark                                                                 Mode  Cnt     Score     Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4   575.583 ± 294.283  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  1445.721 ± 586.085  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4   538.146 ±  39.644  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4   648.089 ±  93.675  ops/s

audit-event-example-logout
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4  10266.561 ± 1314.894  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  30512.253 ± 7860.949  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4   6789.172 ± 2049.612  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4   6842.786 ± 1530.199  ops/s

There is a marked improvement in FHIRJsonParser performance and a slight improvement in FHIRXMLParser performance when running without extended validation.

What I've learned from this experiment:

I really don’t like exposing the extendedValidation flag so overtly in the public API
I also don’t like that it is a single flag. Perhaps generalizing to Map<String, ?> options would be more appropriate.
I think the option of using a thread-local FHIRModelConfig properties is cleaner and less invasive. We already use ThreadLocal in at least one other spot in the fhir-model module
I think that the Java microbenchmarking numbers support making a change

JohnTimm · 2021-05-26T15:26:51Z

Personally I think Deferral of resource deserialization is on the same order of magnitude as #2417 (and I actually like it a bit more). Should we open an issue for that idea?

I have created the following issue:
#2432

JohnTimm · 2021-05-26T19:03:16Z

I created a second proof-of-concept that uses a thread-local configuration property. I added a thread-local map to FHIRModelConfig:

    /**
     * A global map of configuration properties
     */
    private static final Map<String, Object> globalProperties = new ConcurrentHashMap<>();

    /**
     * A thread local map of configuration properties
     */
    private static final ThreadLocal<Map<String, Object>> localProperties = new ThreadLocal<Map<String, Object>>() {
        @Override
        public Map<String, Object> initialValue() {
            return new HashMap<>();
        }
    };

and added method signatures with a boolean local parameter:

    public static void setProperty(String name, Object value, boolean local) {
        Map<String, Object> properties = local ? localProperties.get() : globalProperties;
        properties.put(requireNonNull(name), requireNonNull(value));
    }
    
    public static Object getProperty(String name, boolean local) {
        Map<String, Object> properties = local ? localProperties.get() : globalProperties;
        return properties.get(requireNonNull(name));
    }

Then I created a thread-local configuration property named EXTENDED_VALIDATION:

    public static final String PROPERTY_EXTENDED_VALIDATION = "com.ibm.fhir.model.extendedValidation";
    
    public static boolean getExtendedCodeableConceptValidation() {
        return getPropertyOrDefault(PROPERTY_EXTENDED_CODEABLE_CONCEPT_VALIDATION, DEFAULT_EXTENDED_CODEABLE_CONCEPT_VALIDATION, Boolean.class, true);
    }

    public static void setExtendedValidation(boolean extendedCodedElementValidation) {
        setProperty(PROPERTY_EXTENDED_VALIDATION, extendedCodedElementValidation, true);
    }

I added a parser configuration property as I did in my first experiment. The Json and XML parsers check the parser configuration property and set the thread-local property accordingly:

    // com.ibm.fhir.model.parser.FHIRJsonParser
    public <T extends Resource> T parseAndFilter(InputStream in, Collection<java.lang.String> elementsToInclude) throws FHIRParserException {
        FHIRModelConfig.setExtendedValidation(getPropertyOrDefault(FHIRParser.PROPERTY_EXTENDED_VALIDATION, java.lang.Boolean.TRUE, java.lang.Boolean.class));
        //... <snip>
    }

Methods in ValidationSupport are then guarded using this configuration property:

    public static void checkXHTMLContent(String value) {
        if (!FHIRModelConfig.getExtendedValidation()) {
            return;
        }
        try {
            Validator validator = THREAD_LOCAL_VALIDATOR.get();
            validator.reset();
            validator.validate(new StreamSource(new StringReader(value)));
        } catch (Exception e) {
            throw new IllegalStateException(String.format("Invalid XHTML content: %s", e.getMessage()), e);
        }
    }

After runningFHIRParserBenchmark, here are the results for 5 random spec examples:

-- ThreadLocal configuration property version ---

patient-example-infant-twin-2
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4  13866.226 ±  470.760  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  28243.930 ± 2659.462  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4  10271.131 ± 1325.849  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4  17104.964 ± 1551.517  ops/s

operation-coverageeligibilityrequest-submit
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4  11963.914 ± 1576.891  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  29793.192 ± 5166.065  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4   7481.299 ± 1468.037  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4  13216.107 ± 1534.359  ops/s

operationoutcome-example-allok
Benchmark                                                                 Mode  Cnt       Score       Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4   49695.766 ±  9420.258  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  178567.699 ± 21809.022  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4   26461.004 ±  4328.145  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4   43535.497 ±  9015.497  ops/s

schedule-provider-location2-example
Benchmark                                                                 Mode  Cnt      Score       Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4  27664.936 ±  1411.245  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  55143.203 ± 14185.432  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4  16105.559 ±  5391.986  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4  25062.477 ±  6644.957  ops/s

visionprescription-example-1
Benchmark                                                                 Mode  Cnt      Score      Error  Units
FHIRParserBenchmark.benchmarkJsonParser                                  thrpt    4  16703.184 ± 3232.652  ops/s
FHIRParserBenchmark.benchmarkJsonParserWithoutExtendedValidation         thrpt    4  24531.976 ± 2672.914  ops/s
FHIRParserBenchmark.benchmarkXMLParser                                   thrpt    4  10831.041 ± 1297.799  ops/s
FHIRParserBenchmark.benchmarkXMLParserWithoutExtendedValidation          thrpt    4  13331.628 ± 5288.716  ops/s

These results are slightly different than the "parameterized builder method" version in that the XML parser without extended validation appears to be consistently faster than the XML parser with extended validation. This wasn't as apparent in the previous results.

One advantage of this approach is that we don't need to change the public API and can implement this as more of a "power user" configuration option. @lmsurpre suggested that we change the behavior of FHIRModelConfig so that if getProperty is called it will first check for local property and then fall back to a global property. Using this design, way we could have multiple global properties (e.g. checkReferenceTypes, extendedCodeableConceptValidation, checkXHTMLContent, etc.) and then override those with local property values when we are reading from the database. We could also have a single parser option that covers all three of those properties. Finally, we could have an option VALIDATION_DURING_OBJECT_CONSTRUCTION that could be used to disable all validation during objection construction. This would include other checks.

prb112 · 2021-05-27T13:09:49Z

Re Investigate ways to speed up FHIRJsonParser #2424

Time to check this on JDK 11 (at least), does the performance hold true. Maybe we switch to JDK11 compatibility in fhir-parent and fhir-benchmark

maybe something here is compelling performance wise - https://advancedweb.hu/a-categorized-list-of-all-java-and-jvm-features-since-jdk-8-to-16/#performance-improvements or naturally in the JDK

If you take out the XHTML from the samples, where are the hotspots? I know the XHTML in many examples is a textual representation of the resource. I think you mentioned having a configuration to avoid processing, could it be a performance switch to avoid processing generated XHTML.

"text" : {
    "status" : "generated",
    "div" : "<div>"
}

Example ... https://build.fhir.org/ig/HL7/carin-bb/Organization-OrganizationProvider1.json.html

basically, we would assume text.div.where(text.status = 'generated') is valid (based on a configuration it could be turned on or off).

If we dig into the datetime processing is there some lazy loading we can force to happen in advance?
For the comment on Investigate ways to speed up FHIRJsonParser #2424 (comment) What's the default hashset size? is it always trying to grow. (this was in addition to your supposition on the primitive typed elements)
Audit API enhancement (re. Investigate ways to speed up FHIRJsonParser #2424 (comment))

The doRead now returns a SingleResourceResult so potentially we could just stash the raw data there and 
use that for the audit instead of deserializing/serializing. I haven't looked at the audit APIs yet to see what
that might need.

We'd need to modify the Audit APIs, today we take the Resource and then strip off the values we need. We could vastly simplify by only passing the relevant data (e.g. Resource.id as part of the adapter instead of deserializing.)

I like the deferral of resource deserialization. as a follow on, do we know the original resource type? if we do, we can stream that directly, rather than reparsing and generating to the output stream. (e.g. if we know originally xml and requested JSON we'd generate/stream, else if XML is requested we'd stream directly)
extendValidation ... could you give a list of possible configurations you'd see? ... ah... CODEABLE CONCEPT
I like Lee's suggestion on FHIRModelConfig changes.
Not to bias a result, the EOB from CarinBB can be one of the most hierachical resources. e.g.

| - 1 Level
| - 2 Level
| -3 Level

Would it be good to test more of the hierarchical resources with codeable concepts? Also, what happens with a Cnt of 100 in the benchmark? does the error rate drop?

JohnTimm · 2021-05-28T13:58:20Z

For this "investigate" ticket, I put together one final prototype. It is very similar to the "parameterized build method" version but instead it parameterizes the builder factory method and introduces the concept of a builder factory. I updated the CodeGenerator to generate these method signatures:

    @Override
    public Builder toBuilder() {
        return new Builder().from(this);
    }

    @Override
    public Builder toBuilder(Map<java.lang.String, ?> options) {
        return new Builder(options).from(this);
    }

    public static Builder builder() {
        return new Builder();
    }

    public static Builder builder(Map<java.lang.String, ?> options) {
        return new Builder();
    }

Updates to the Builder class with an alternate constructor:

    public static class Builder extends DomainResource.Builder {
        private List<Identifier> identifier = new ArrayList<>();
        private AccountStatus status;
        private CodeableConcept type;
        private String name;
        private List<Reference> subject = new ArrayList<>();
        private Period servicePeriod;
        private List<Coverage> coverage = new ArrayList<>();
        private Reference owner;
        private String description;
        private List<Guarantor> guarantor = new ArrayList<>();
        private Reference partOf;

        private Builder() {
            super();
        }

        private Builder(Map<java.lang.String, ?> options) {
            super(options);
        }
        // ... <snip>
    }

Updates to the AbstractBuilder class (the base class of all Builder classes):

public abstract class AbstractBuilder<T> implements Builder<T> {
    protected final Map<String, ?> options;

    public AbstractBuilder() {
        this(Collections.emptyMap());
    }

    public AbstractBuilder(Map<String, ?> options) {
        this.options = Objects.requireNonNull(options, "options");
    }

    @Override
    public abstract T build();

    public Map<String, ?> getOptions() {
        return options;
    }
}

Added code to CodeGenerator to generate a "builder factory" class called FHIRModelBuilderFactory:

public final class FHIRModelBuilderFactory {
    private final Map<java.lang.String, ?> options;

    private FHIRModelBuilderFactory(Map<java.lang.String, ?> options) {
        super();
        this.options = Objects.requireNonNull(options, "options");
    }

    public static FHIRModelBuilderFactory newInstance(Map<java.lang.String, ?> options) {
        return new FHIRModelBuilderFactory(options);
    }

    public Account.Builder accountBuilder() {
        return Account.builder(options);
    }

    public Account.Coverage.Builder accountCoverageBuilder() {
        return Account.Coverage.builder(options);
    }

    public Account.Guarantor.Builder accountGuarantorBuilder() {
        return Account.Guarantor.builder(options);
    }

    public ActivityDefinition.Builder activityDefinitionBuilder() {
        return ActivityDefinition.builder(options);
    }
   // ... <snip>
    public Account.Builder toBuilder(Account account) {
        return account.toBuilder(options);
    }

    public Account.Coverage.Builder toBuilder(Account.Coverage accountCoverage) {
        return accountCoverage.toBuilder(options);
    }

    public Account.Guarantor.Builder toBuilder(Account.Guarantor accountGuarantor) {
        return accountGuarantor.toBuilder(options);
    }

    public ActivityDefinition.Builder toBuilder(ActivityDefinition activityDefinition) {
        return activityDefinition.toBuilder(options);
    }

    public ActivityDefinition.DynamicValue.Builder toBuilder(ActivityDefinition.DynamicValue activityDefinitionDynamicValue) {
        return activityDefinitionDynamicValue.toBuilder(options);
    }

    public ActivityDefinition.Participant.Builder toBuilder(ActivityDefinition.Participant activityDefinitionParticipant) {
        return activityDefinitionParticipant.toBuilder(options);
    }
    // ... <snip>
}

The idea here is that the "builder factory" class can be used to create builders and uniformly configure them with a single map of options as follows:

// usage
FHIRModelBuilderFactory f = FHIRModelBuilderFactory.newInstance(options);
Patient patient = f.patientBuilder()
    .name(f.humanNameBuilder()
        .family(string("Doe"))
        .build())
    .build();

Then in the parser classes we would use a "builder factory" instance to create all of the builders. The map of options would be created from parser options. My thoughts after (partially) implementing this approach:

We need to generate two additional methods in FHIRModelBuilderFactory for EVERY code subtype (e.g. AccountStatus) which seems like a lot if we add that to what was already generated (2 methods for each Resource, Data Type and nested class).
It still seems somewhat error prone for an API user despite the introduction of a "builder factory" class
It seems like a heavyweight approach with a huge blast area in the public model APIs all for what will likely be (at least initially) a single flag (e.g. extendedValidation)
I'm still leaning towards the "thread local configuration" version of the experiment. It has the smallest blast radius and is mostly transparent to API users.

prb112 · 2021-06-01T12:10:46Z

It seems like a heavyweight approach with a huge blast area in the public model APIs all for what will likely be (at least initially) a single flag (e.g. extendedValidation)
I'm still leaning towards the "thread local configuration" version of the experiment. It has the smallest blast radius and is mostly transparent to API users.

I completely agree with these points.

Signed-off-by: John T.E. Timm <[email protected]>

* Issue #2424 - add support for non-validating builders/parsers Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - update copyright header Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updates per PR feedback Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updated Javadoc Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - introduce ignoringUnrecognizedElements field Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updated unit test Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - fixed issue with Builder.validate method Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updated Javadoc wording Signed-off-by: John T.E. Timm <[email protected]>

lmsurpre · 2021-06-05T20:34:41Z

John decided on a variation of the options presented here: a flag on the parser (and the underlying builders) that indicates whether it should perform validation (or not) while building objects.

This decision was made after taking another look at Effective Java, the Google Protobuf Java library, the javapoet Java library, and this Stack Overflow:
https://stackoverflow.com/questions/38173274/builder-pattern-validation-effective-java
Specifically the mention of this quote from Effective Java:

It is critical that they be checked after copying the parameters from the builder to the object, and that they be checked on the object fields rather than the builder fields (Item 39). If any invariants are violated, the build method should throw an IllegalStateException (Item 60).

* Issue #2424 - add support for non-validating builders/parsers Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - update copyright header Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updates per PR feedback Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updated Javadoc Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - introduce ignoringUnrecognizedElements field Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updated unit test Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - fixed issue with Builder.validate method Signed-off-by: John T.E. Timm <[email protected]> * Issue #2424 - updated Javadoc wording Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added the performance performance label May 24, 2021

JohnTimm self-assigned this May 24, 2021

JohnTimm mentioned this issue May 27, 2021

Add support for the deferral of resource deserialization when reading resources from the database #2432

Open

JohnTimm added this to the Sprint 2021-07 milestone May 28, 2021

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - add support for non-validating builders/parsers

c5151fc

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - update copyright header

88cd2d5

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - updates per PR feedback

45807ca

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - updated Javadoc

8d0eb6e

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - introduce ignoringUnrecognizedElements field

3a7e480

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - updated unit test

db7abbd

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - fixed issue with Builder.validate method

f9c4c05

Signed-off-by: John T.E. Timm <[email protected]>

JohnTimm added a commit that referenced this issue Jun 1, 2021

Issue #2424 - updated Javadoc wording

372e652

Signed-off-by: John T.E. Timm <[email protected]>

lmsurpre closed this as completed Jun 5, 2021

This was referenced Jul 12, 2021

Support skipping resource validation for trusted ingestion #2428

Open

Reject FHIR resources with the null char \u0000 or similar control chars #2605

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate ways to speed up FHIRJsonParser #2424

Investigate ways to speed up FHIRJsonParser #2424

JohnTimm commented May 24, 2021 •

edited

Loading

JohnTimm commented May 25, 2021

JohnTimm commented May 25, 2021 •

edited

Loading

lmsurpre commented May 25, 2021 •

edited

Loading

JohnTimm commented May 26, 2021 •

edited

Loading

JohnTimm commented May 26, 2021

JohnTimm commented May 26, 2021 •

edited

Loading

prb112 commented May 27, 2021 •

edited

Loading

JohnTimm commented May 28, 2021 •

edited

Loading

prb112 commented Jun 1, 2021

lmsurpre commented Jun 5, 2021 •

edited

Loading

Investigate ways to speed up FHIRJsonParser #2424

Investigate ways to speed up FHIRJsonParser #2424

Comments

JohnTimm commented May 24, 2021 • edited Loading

JohnTimm commented May 25, 2021

JohnTimm commented May 25, 2021 • edited Loading

lmsurpre commented May 25, 2021 • edited Loading

JohnTimm commented May 26, 2021 • edited Loading

JohnTimm commented May 26, 2021

JohnTimm commented May 26, 2021 • edited Loading

prb112 commented May 27, 2021 • edited Loading

JohnTimm commented May 28, 2021 • edited Loading

prb112 commented Jun 1, 2021

lmsurpre commented Jun 5, 2021 • edited Loading

JohnTimm commented May 24, 2021 •

edited

Loading

JohnTimm commented May 25, 2021 •

edited

Loading

lmsurpre commented May 25, 2021 •

edited

Loading

JohnTimm commented May 26, 2021 •

edited

Loading

JohnTimm commented May 26, 2021 •

edited

Loading

prb112 commented May 27, 2021 •

edited

Loading

JohnTimm commented May 28, 2021 •

edited

Loading

lmsurpre commented Jun 5, 2021 •

edited

Loading