issues #1839 and #1743 - support search parameter disambiguation

1. Update ParametersMap to support storing multiple search parameters with the same code
2. Address #1743 by collecting to a map instead of a list
3. Update SearchUtil.getSearchParameter to lookup the search parameter by URI from the config if possible (instead of applying the filter to the full set of built-in parameters).
4. Update the docs to reflect that search parameter filtering now applies to tenant-specific search parameters as well.

This should help us move toward #1596.

Also fixed a bad trace message and did some minor formatting / javadoc.

Signed-off-by: Lee Surprenant <[email protected]>
LinuxForHealth · Dec 23, 2020 · 2a6e351
1 parent c888e1d

Showing 7 changed files with 160 additions and 145 deletions.
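
The first item of the commit message (one code mapping to several search parameters) can be illustrated with a minimal, self-contained sketch. It is not the project's `ParametersMap`: the class name `CodeIndexSketch`, its methods, and the use of plain URL strings in place of `SearchParameter` objects are all assumptions made for illustration. Only the `computeIfAbsent(...).add(...)` pattern is taken from the diff below.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for ParametersMap; URLs stand in for SearchParameter objects.
class CodeIndexSketch {
    // One code may now map to several parameters (here: several canonical URLs).
    private final Map<String, Set<String>> codeMap = new LinkedHashMap<>();
    // Each URL identifies exactly one parameter; the value is that parameter's code.
    private final Map<String, String> urlMap = new LinkedHashMap<>();

    void insert(String code, String url) {
        // Create the set on first use, then add; an existing entry is never replaced.
        codeMap.computeIfAbsent(code, k -> new HashSet<>()).add(url);
        urlMap.put(url, code);
    }

    Set<String> lookupByCode(String code) {
        // Return an empty set rather than null for unknown codes (a choice made in this sketch).
        return codeMap.getOrDefault(code, Collections.emptySet());
    }

    int distinctCodes() {
        return codeMap.size();
    }

    int distinctParameters() {
        return urlMap.size();
    }

    public static void main(String[] args) {
        CodeIndexSketch index = new CodeIndexSketch();
        // Two hypothetical parameters that share the code "favorite-color".
        index.insert("favorite-color", "http://example.org/sp/a");
        index.insert("favorite-color", "http://example.org/sp/b");
        System.out.println(index.lookupByCode("favorite-color").size());
        System.out.println(index.distinctCodes() + " code(s), "
                + index.distinctParameters() + " parameter(s)");
    }
}
```

Once a code can map to several parameters, the number of distinct codes and the number of distinct parameters can differ. In the diff below, `values()` switches to `urlMap.values()` while `size()` still returns `codeMap.size()`, so callers that compare the two may want to keep that difference in mind.
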
diff --git a/docs/src/pages/guides/FHIRSearchConfiguration.md b/docs/src/pages/guides/FHIRSearchConfiguration.md
@@ -36,11 +36,11 @@ To configure tenant-specific search parameters, create a file called `extension-
 
 When the IBM FHIR Server processes a request associated with the `acme` tenant, the server uses the built-in search parameters and the user-defined search parameters defined in the `acme` tenant's `extension-search-parameters.json` file. Likewise, when processing a request associated with the `qpharma` tenant, the server uses the built-in search parameters and the user-defined search parameters defined in the `qpharma` tenant's `extension-search-parameters.json` file.
 
-If a tenant-specific extension-search-parameters.json file does not exist, the server falls back to the global `extension-search-parameters.json` file found at `${server.config.dir}/config/default/extension-search-parameters.json`.
+If a tenant-specific extension-search-parameters.json file does not exist, the server falls back to the global `extension-search-parameters.json` file found at `${server.config.dir}/config/default/extension-search-parameters.json`. For performance reasons, we recommend having an `extension-search-parameters.json` for each tenant.
 
-The IBM FHIR Server caches search parameters in memory (organized first by tenant id, then by resource type and search parameter). Any updates to a tenant's `extension-search-parameters.json` file causes the IBM FHIR Server to re-load the tenant's search parameters and refresh the information stored in the cache, without requiring a server re-start. This allows the deployer to deploy a new tenant's `extension-search-parameters.json` or update an existing file without re-starting the IBM FHIR Server and any subsequent requests processed by the IBM FHIR Server after the updates have been made use the updated search parameters. However, it is important to note that this process **does not** re-index already-created resources that are stored on the IBM FHIR Server. One technique for updating the indices for a given resource type is to `read` and `update` each resource instance with itself, triggering search parameter extraction (and creating a new version of each resource).
+The IBM FHIR Server caches search parameters in memory (organized first by tenant id, then by resource type and search parameter code). Any updates to a tenant's `extension-search-parameters.json` file causes the IBM FHIR Server to re-load the tenant's search parameters and refresh the information stored in the cache, without requiring a server re-start. This allows the deployer to deploy a new tenant's `extension-search-parameters.json` or update an existing file without re-starting the IBM FHIR Server and any subsequent requests processed by the IBM FHIR Server after the updates have been made use the updated search parameters. However, it is important to note that this process **does not** re-index already-created resources that are stored on the IBM FHIR Server.
 
-Starting in version 4.5.0, the IBM FHIR Server supports [re-indexing resources](#2-re-index) with an updated set of search parameters. This is very similar to creating a new version of the resources, except in this case the version number doesn't change and the data for the resource never leaves the server.
+Starting in version 4.5.0, the IBM FHIR Server supports [re-indexing resources](#2-re-index) with an updated set of search parameters. This is very similar to creating a new version of the resources, except in this case the lastUpdated time and the resource version number don't change and the data for the resource never leaves the server.
 
 #### 1.1.1 Search parameters configuration: extension-search-parameters.json
 To configure the IBM FHIR Server with one or more custom search parameters, create a file called `extension-search-parameters.json` and populate the contents with a Bundle of `SearchParameter` resources.
@@ -95,10 +95,10 @@ The `fhir-search` module requires that the [expression](https://www.hl7.org/fhir
 
 A few things to note are:
 - This SearchParameter includes an xpath element for completeness, but the IBM FHIR Server does not use the XPath during extraction; it only uses the expression (FHIRPath).
-- The SearchParameter with a path including `value` use the Choice data types which are determined based on the SearchParameter type .
+- The SearchParameter with a path including `value` use the Choice data types which are determined based on the SearchParameter type.
 - Each time a resource is created or updated, the IBM FHIR Server evaluates the FHIRPath expression applicable to the resource type and indexes the values of the matching elements, making these available via a search where the query parameter name matches the `code` element on the `SearchParameter` definition.
 
-In the preceding example, extension elements (on a Patient resource) with a url of `http://ibm.com/fhir/extension/Patient/favorite-color` are indexed by the `favorite-color` search parameter. To search for Patients with a favorite color of "pink", users could send an HTTP GET request to a URL like `[base]/api/v1/Patient?favorite-color:exact=pink`.
+In the preceding example, extension elements (on a Patient resource) with a url of `http://ibm.com/fhir/extension/Patient/favorite-color` are indexed by the `favorite-color` search parameter. To search for Patients with a favorite color of "pink", users could send an HTTP GET request to a URL like `[base]/Patient?favorite-color:exact=pink`.
 
 For more information on search parameters, see the [HL7 FHIR specification](https://www.hl7.org/fhir/R4/searchparameter.html).
 
@@ -108,11 +108,11 @@ When creating the SearchParameter FHIRPath expression, be sure to test both the
 If a search parameter expression extracts an element with a data type that is incompatible with the declared search parameter type, the server skips the value and logs a message. For choice elements, like Extension.value, its recommended to restrict the expression to values of the desired type by using the `as` function. For example, to select only Decimal values from the http://example.org/decimal extension, use an expressions like `Basic.extension.where(url='http://example.org/decimal').value.as(Decimal)`.
 
 ### 1.2 Filtering
-The IBM FHIR Server supports the filtering of built-in search parameters. The default behavior of the IBM FHIR Server is to consider all built-in search parameters when storing resources or processing search requests, but you can configure inclusion filters to restrict the IBM FHIR Server's view to specific search parameters on a given resource type. This filtering feature does not apply to user-defined search parameters in the extension-search-parameters.json file. User-defined search parameters are always included in the IBM FHIR Server's view regardless of the configured inclusion filters.
+The IBM FHIR Server supports the filtering of search parameters through `fhir-server-config.json`. The default behavior of the IBM FHIR Server is to consider all built-in and tenant-specific search parameters when storing resources or processing search requests, but you can configure inclusion filters to restrict the IBM FHIR Server's view to specific search parameters on a given resource type.
 
 Why would you want to filter built-in search parameters? The answer lies in how search parameters are used by the IBM FHIR Server. When the FHIR server processes a _create_ or _update_ operation, it stores the resource contents in the datastore, along with search index information that is used by the IBM FHIR Server when performing search operations. The search index information stored for a particular resource instance is driven by the search parameters defined for that resource type. Therefore if you are storing a resource whose type has a lot of built-in search parameters defined for it (e.g. `Patient`), then you could potentially be storing a lot of search index information for each resource.
 
-For performance and scalability reasons, it might be desirable to limit the number of search parameters considered during a _create_ or _update_ operation for particular resource types, if those search parameters will never be used in search operations. After all, if there will be no need to use the search index information, there's no need to store it. For example, if you know that due to the way in which a particular tenant's applications use the FHIR REST API that those applications will never need to search Patients by birthdate, then there would be no need to store search index information for the `birthdate` attribute in `Patient` resources. Consequently, you could filter out the `birthdate` search parameter for the `Patient` resource type and not lose any needed functionality, but yet save a little bit of storage capacity in your datastore.
+For performance and scalability reasons, it might be desirable to limit the number of search parameters considered during a _create_ or _update_ operation for particular resource types. If there will be no need to use the search index information, there's no need to store it. For example, if you know that due to the way in which a particular tenant's applications use the FHIR REST API that those applications will never need to search Patients by birthdate, then there would be no need to store search index information for the `birthdate` attribute in `Patient` resources. Consequently, you could filter out the `birthdate` search parameter for the `Patient` resource type and not lose any needed functionality, but yet save a little bit of storage capacity in your datastore.
 
 The search parameter filtering feature is supported through a set of inclusion rules specified via the `fhirServer/resources` property group in `fhir-server-config.json`. The search parameter inclusion rules allow you to define a set of search parameters per resource type that should be included in the IBM FHIR Server's view of search parameters when storing resources and performing search operations. The following snippet shows the general form for specifying inclusion rules:
 
@@ -227,7 +227,7 @@ However, in order for a resource to be returned as expected on a compartment sea
 }
 ```
 
-In order to avoid this issue, inclusion critera search parameters should not be filtered out. If any filtering is configured in `fhir-server-config.json` for resources that may be members of a compartment, their inclusion criteria search parameters should be included in the list of allowed search parameters. Again using the example above, if search parameter filtering is specified for the `Observation` resource type, the `subject` and `performer` search parameters must be specified in the `searchParameters` list (assuming an entry of `"*": "*"` is not specified) in order for `Observation` resources to be returned in `Patient` compartment searches. The following snippet illustrates a search parameter configuration in which the `subject` and `performer` search parameters have been included in the list of allowed search parameters:
+In order to avoid this issue, inclusion criteria search parameters should not be filtered out. If any filtering is configured in `fhir-server-config.json` for resources that may be members of a compartment, their inclusion criteria search parameters should be included in the list of allowed search parameters. Again using the example above, if search parameter filtering is specified for the `Observation` resource type, the `subject` and `performer` search parameters must be specified in the `searchParameters` list (assuming an entry of `"*": "*"` is not specified) in order for `Observation` resources to be returned in `Patient` compartment searches. The following snippet illustrates a search parameter configuration in which the `subject` and `performer` search parameters have been included in the list of allowed search parameters:
 
 ```
 "resources": {

diff --git a/fhir-search/src/main/java/com/ibm/fhir/search/parameters/ParametersMap.java b/fhir-search/src/main/java/com/ibm/fhir/search/parameters/ParametersMap.java
@@ -5,10 +5,9 @@
  */
 package com.ibm.fhir.search.parameters;
 
-import java.util.ArrayList;
 import java.util.Collection;
+import java.util.HashSet;
 import java.util.LinkedHashMap;
-import java.util.List;
 import java.util.Map;
 import java.util.Map.Entry;
 import java.util.Objects;
@@ -24,7 +23,7 @@
 public class ParametersMap {
     private static final Logger log = Logger.getLogger(ParametersMap.class.getName());
 
-    private final Map<String, SearchParameter> codeMap;
+    private final Map<String, Set<SearchParameter>> codeMap;
     private final Map<String, SearchParameter> urlMap;
 
     /**
@@ -44,38 +43,13 @@ public void insert(String code, SearchParameter parameter) {
         Objects.requireNonNull(parameter, "cannot insert a null parameter");
 
         String url = parameter.getUrl().getValue();
-        if (codeMap.containsKey(code)) {
-            SearchParameter previous = codeMap.get(code);
-            if (previous.getExpression() == null || previous.getExpression().equals(parameter.getExpression())) {
-                if (log.isLoggable(Level.FINE) && !url.equals(previous.getUrl().getValue())) {
-                    log.fine("SearchParameter with code '" + code + "' already exists with the same expression; "
-                            + "adding additional url '" + url + "'");
-                }
-            } else {
-                // Sometimes the base spec defines a search parameter like 'Type1.field | Type2.field' and an IG
-                // will refine that for a single resource, so try splitting on '|' and matching the subcomponents
-                String[] split = previous.getExpression().getValue().split("\\|");
-                List<String> clauses = new ArrayList<>();
-                for (String string : split) {
-                    string = string.trim();
-                    if (string.startsWith(parameter.getBase().get(0).getValue())) {
-                        clauses.add(string);
-                    }
-                }
-                String previousExpressionString = String.join(" | ", clauses);
-                if (previousExpressionString != null && previousExpressionString.equals(parameter.getExpression().getValue())) {
-                    if (!url.equals(previous.getUrl().getValue())) {
-                        log.info("SearchParameter with code '" + code + "' already exists with a similar expression; "
-                                + "adding additional url '" + url + "'");
-                    }
-                } else {
-                    log.warning("SearchParameter with code '" + code + "' already exists with a different expression;\n"
-                            + "replacing [url=" + previous.getUrl().getValue() + ", expression=" + previous.getExpression().getValue()
-                            + "] with [url=" + url + ", expression=" + parameter.getExpression().getValue() + "]");
-                }
+        Set<SearchParameter> previousParams = codeMap.get(code);
+        if (previousParams != null && previousParams.size() > 0) {
+            if (log.isLoggable(Level.FINE)) {
+                log.fine("SearchParameter with code '" + code + "' already exists; adding additional parameter with url '" + url + "'");
             }
         }
-        codeMap.put(code, parameter);
+        codeMap.computeIfAbsent(code, k -> new HashSet<>()).add(parameter);
 
         if (urlMap.containsKey(url)) {
             SearchParameter previous = urlMap.get(url);
@@ -97,12 +71,12 @@ public void insert(String code, SearchParameter parameter) {
      * @implSpec package-private to prevent insertion from outside the package
      */
     public void insertAll(ParametersMap map) {
-        for (Entry<String, SearchParameter> entry : map.codeMap.entrySet()) {
-            insert(entry.getKey(), entry.getValue());
+        for (SearchParameter sp : map.urlMap.values()) {
+            insert(sp.getCode().getValue(), sp);
         }
     }
 
-    public SearchParameter lookupByCode(String searchParameterCode) {
+    public Set<SearchParameter> lookupByCode(String searchParameterCode) {
         return codeMap.get(searchParameterCode);
     }
 
@@ -111,7 +85,7 @@ public SearchParameter lookupByUrl(String searchParameterUrl) {
     }
 
     public Collection<SearchParameter> values() {
-        return codeMap.values();
+        return urlMap.values();
     }
 
     public boolean isEmpty() {
@@ -122,7 +96,7 @@ public int size() {
         return codeMap.size();
     }
 
-    public Set<Entry<String, SearchParameter>> codeEntries() {
+    public Set<Entry<String, Set<SearchParameter>>> codeEntries() {
         return codeMap.entrySet();
     }
 

diff --git a/fhir-search/src/main/java/com/ibm/fhir/search/parameters/ParametersUtil.java b/fhir-search/src/main/java/com/ibm/fhir/search/parameters/ParametersUtil.java
@@ -99,12 +99,12 @@ private static Map<String, ParametersMap> buildSearchParametersMap() {
 
             if (parameter.getExpression() == null || !parameter.getExpression().hasValue()) {
                 if (log.isLoggable(Level.FINE)) {
-                    log.fine(String.format(MISSING_EXPRESSION, parameter.getCode().getValue()));
+                    log.fine(String.format(MISSING_EXPRESSION_WARNING, parameter.getCode().getValue()));
                 }
             } else {
                 /*
                  * In R4, SearchParameter changes from a single Base resource to an array.
-                 * As Base is an array, there are going be potential collisions in the map.
+                 * As Base is an array, there are going to be potential collisions in the map.
                  */
                 List<ResourceType> types = parameter.getBase();
                 for (ResourceType type : types) {

diff --git a/...rc/main/java/com/ibm/fhir/search/parameters/cache/TenantSpecificSearchParameterCache.java b/...rc/main/java/com/ibm/fhir/search/parameters/cache/TenantSpecificSearchParameterCache.java
@@ -25,7 +25,7 @@
 /**
  * This class implements a cache of SearchParameters organized by tenantId. Each object stored in the cache will be a
  * two-level map of SearchParameters organized first by resource type, then by search parameter code.
- * 
+ *
  * Note: While we support json format only, to enable XML, it's best to create a new cache specific to XML. This change
  * should change one line in this class, and be instantiated in the SearchUtil, and embedded in the call to Parameters.
  * Alternatively, one could, upon not finding the JSON file, load the XML file.
@@ -59,11 +59,10 @@ public Map<String, ParametersMap> createCachedObject(File f) throws Exception {
             log.fine(String.format(LOG_FILE_LOAD, f.toURI()));
         }
         try (Reader reader = new FileReader(f);) {
+            // Default is to use JSON in R4
             Bundle bundle = FHIRParser.parser(Format.JSON).parse(reader);
             return ParametersUtil.buildSearchParametersMapFromBundle(bundle);
         } catch (Throwable t) {
-            // In R4, there are two files used with postfix JSON.
-            // Default is to use JSON in R4
             throw new FHIROperationException(String.format(OPERATION_EXCEPTION, f.getAbsolutePath()), t);
         }
     }