issue #3136 added documentation and create container feature
Signed-off-by: Robin Arnold <[email protected]>
punktilious committed Feb 23, 2022
1 parent 960d880 commit e362758
Showing 4 changed files with 291 additions and 3 deletions.
192 changes: 192 additions & 0 deletions fhir-persistence-blob-app/README.md
@@ -0,0 +1,192 @@
# Payload Offload for Azure Blob

The IBM FHIR Server can be configured to store the resource payload records in an Azure Blob container. Each record is stored as a JSON string with UTF-8 encoding. The IBM FHIR Server relies on the blob service to compress/encrypt the data at rest. The blob service must apply the necessary security controls required when storing PHI, and connection to the service must use an encrypted transport.

## Configuring the IBM FHIR Server to use Payload Offloading with Azure Blob

To enable payload offloading using Azure Blob storage, complete the steps summarized below:

1. Pick a container name to use for each tenant/datasource combination;
2. Create a database.properties file with connection information for the FHIRSERVER user (not FHIRADMIN);
3. Configure payload offloading in the main `default/fhir-server-config.json` file;
4. Configure payload offloading in each tenant-specific `fhir-server-config.json` file;
5. Create the container using the Azure Blob user interface, the Azure Blob API, or the IBM FHIR Server `fhir-persistence-blob-app-*-cli.jar` tool;
6. (Optional) Periodically run the reconciliation tool to remove any orphan resource records from the offload datastore.

Take the following restrictions into account:

1. Payload offloading is enabled at the server level. Offloading cannot be enabled/disabled on a per-tenant basis. If you want to support offloading for just one tenant, use a different IBM FHIR Server instance;
2. Payload offloading must be configured prior to ingesting any resource data;
3. Payload offloading must not be disabled after resource data has been ingested.

### 1. Pick a Container Name

Each tenant/datasource combination requires its own container. Use a name that can be easily identified as belonging to the tenant and datasource, subject to the Azure Blob service naming restrictions.
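
As a rough illustration of the naming restrictions, the sketch below checks a candidate name against the Azure Blob container rules (3 to 63 characters; lowercase letters, digits and hyphens only; starting and ending with a letter or digit; no consecutive hyphens). The `ContainerNames` class and its `suggest` helper are hypothetical and not part of the IBM FHIR Server; the server always uses the name you configure explicitly.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of the IBM FHIR Server. */
final class ContainerNames {
    // 3-63 characters overall (lookahead), groups of lowercase letters/digits
    // separated by single hyphens, so no leading, trailing or doubled hyphen.
    private static final Pattern VALID =
            Pattern.compile("^(?=.{3,63}$)[a-z0-9]+(?:-[a-z0-9]+)*$");

    static boolean isValid(String name) {
        return name != null && VALID.matcher(name).matches();
    }

    /** Assumed convention: "<tenant>-<datasource>", normalized to allowed characters. */
    static String suggest(String tenantId, String dsId) {
        String candidate = (tenantId + "-" + dsId)
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")   // e.g. '_' becomes '-'
                .replaceAll("^-+|-+$", "");      // trim leading/trailing hyphens
        if (!isValid(candidate)) {
            throw new IllegalArgumentException("Cannot derive a valid container name from: "
                    + tenantId + "/" + dsId);
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(isValid("default-default"));
        System.out.println(suggest("Tenant_1", "default"));
    }
}
```

Note that a normalization like this can map two different tenant names (for example `a_b` and `a-b`) to the same container name, so treat the suggestion as a starting point and confirm that each tenant/datasource pair ends up with a distinct container.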

### 2. Create the `database.properties` file

Create a properties file containing the RDBMS connection information. Do not use the FHIRADMIN user, which is only used by the RDBMS schema creation tool. Instead, use the database user configured in the IBM FHIR Server `datasource.xml` file. Following the principle of least privilege, this user typically has only the privileges the application needs to use objects in the FHIR data (`fhirdata`) schema:

```
db.host=localhost
db.port=5432
db.database=fhirdb
user=fhirserver
password=change-password
currentSchema=fhirdata
```
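
Before running the tool, it can help to sanity-check the file. The following is a minimal sketch, assuming that all six keys from the example above are required and that a PostgreSQL JDBC URL of the standard form `jdbc:postgresql://host:port/database` is wanted. The `DbPropertiesCheck` class is hypothetical; how the CLI itself consumes these properties is not shown here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical pre-flight check; not part of the IBM FHIR Server tooling. */
final class DbPropertiesCheck {
    // Assumption: every key in the README example is treated as required.
    static final String[] REQUIRED = {
        "db.host", "db.port", "db.database", "user", "password", "currentSchema"
    };

    static Properties load(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing required property: " + key);
            }
        }
        return props;
    }

    /** Builds a standard PostgreSQL JDBC URL from the loaded properties. */
    static String postgresUrl(Properties props) {
        return "jdbc:postgresql://" + props.getProperty("db.host")
                + ":" + props.getProperty("db.port")
                + "/" + props.getProperty("db.database");
    }

    public static void main(String[] args) {
        String example = "db.host=localhost\ndb.port=5432\ndb.database=fhirdb\n"
                + "user=fhirserver\npassword=change-password\ncurrentSchema=fhirdata\n";
        System.out.println(postgresUrl(load(new StringReader(example))));
    }
}
```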

### 3. Add Payload Offload Configuration to `default/fhir-server-config.json`

In the main `default/fhir-server-config.json`, configure the `fhirServer/persistence/factoryClassname` as shown below and add a `fhirServer/persistence/payload` block containing the connection information for each datasource you have defined under `fhirServer/persistence/datasources` (typically there is just one datasource called `default`).
```
{
    "__comment": "FHIR Server configuration",
    "fhirServer": {
        ...
        "persistence": {
            "factoryClassname": "com.ibm.fhir.persistence.blob.FHIRPersistenceJDBCBlobFactory",
            "datasources": {
                ...
            },
            "payload": {
                "default": {
                    "__comment": "Azure Blob (azurite docker) configuration for storing FHIR resource payload data",
                    "type": "azure.blob",
                    "connectionProperties" : {
                        "connectionString": "your-azure-connection-string",
                        "containerName": "default-default"
                    }
                }
            }
        }
    }
}
```

Container names allowed by Azure Blob are more restrictive than IBM FHIR Server tenant and datasource names (for example, `_` is not allowed). For this reason, the container name for each tenant and datasource must be specified explicitly in the `fhirServer/persistence/payload/<ds-id>/connectionProperties/containerName` property.

### 4. Configure Payload Offloading Per Tenant

In each tenant `fhir-server-config.json` file, add a `fhirServer/persistence/payload` block containing the connection information for each datasource you have defined under `fhirServer/persistence/datasources` (typically there is just one datasource called `default`).

```
{
    "__comment": "FHIR Server configuration",
    "fhirServer": {
        ...
        "persistence": {
            "datasources": {
                ...
            },
            "payload": {
                "default": {
                    "__comment": "Azure Blob (azurite docker) configuration for storing FHIR resource payload data",
                    "type": "azure.blob",
                    "connectionProperties" : {
                        "connectionString": "your-azure-connection-string",
                        "containerName": "default-default"
                    }
                }
            }
        }
    }
}
```


### 5. Create the Container

The container can be created using the Azure Blob service or by running the following command:

```
java -jar fhir-persistence-blob-app-*-cli.jar \
--fhir-config-dir /path/to/wlp/usr/servers/defaultServer \
--tenant-id <tenant-id> \
--ds-id <ds-id> \
--create-container \
--confirm
```

The container name is read from the tenant's `fhir-server-config.json` configuration, so complete that configuration step before running this command.

The command is designed to be idempotent: it checks whether the container already exists before attempting to create it. If another process creates the container after the exists check but before the create command is issued, the command fails. This window is very small, and rerunning the command then reports that the container already exists.
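
The check-then-create flow can be sketched as follows. This is a simplified illustration using a hypothetical `ContainerClient` interface and an in-memory stand-in for the blob service; it is not the Azure SDK client used by the real tool, and the `Outcome` enum is an invention for the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch of the exists-then-create flow; all names are hypothetical. */
final class IdempotentCreateSketch {
    /** Minimal stand-in for a container client. */
    interface ContainerClient {
        boolean exists();
        /** Throws IllegalStateException if the container already exists. */
        void create();
    }

    enum Outcome { ALREADY_EXISTS, CREATED, DRY_RUN }

    static Outcome ensureContainer(ContainerClient client, boolean dryRun) {
        if (client.exists()) {
            return Outcome.ALREADY_EXISTS;   // nothing to do, so reruns are safe
        }
        if (dryRun) {
            return Outcome.DRY_RUN;          // report only, make no changes
        }
        // If another process creates the container between exists() and create(),
        // create() throws and the failure propagates to the caller.
        client.create();
        return Outcome.CREATED;
    }

    /** In-memory fake used only to exercise the flow. */
    static final class InMemoryClient implements ContainerClient {
        private final Set<String> store;
        private final String name;

        InMemoryClient(Set<String> store, String name) {
            this.store = store;
            this.name = name;
        }

        @Override
        public boolean exists() {
            return store.contains(name);
        }

        @Override
        public void create() {
            if (!store.add(name)) {
                throw new IllegalStateException("Container already exists: " + name);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryClient client = new InMemoryClient(new HashSet<>(), "default-default");
        System.out.println(ensureContainer(client, true));
        System.out.println(ensureContainer(client, false));
        System.out.println(ensureContainer(client, false));
    }
}
```

Because the first step is always the exists check, a rerun after a lost race takes the "already exists" path and makes no further changes.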

### 6. Running Reconciliation

When payload offloading is configured, the IBM FHIR Server stores the payload in an Azure Blob container. This store operation is not transactional, so if the global transaction is rolled back, any cleanup of the payload data stored during the transaction must be handled by the application code.

If the IBM FHIR Server terminates before this cleanup is completed, records may be left in the container which are not associated with any resource record in the RDBMS. Although this is likely to be uncommon, the IBM FHIR Server provides a reconciliation tool to scan the container and look for resource payload records which do not have a corresponding RDBMS record. The reconciliation tool can optionally delete these records.
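
Conceptually, reconciliation is a set difference between the keys found in the container and the keys known to the RDBMS. The sketch below illustrates that idea with plain in-memory collections; the `ReconcileSketch` class, the opaque string keys, and the in-memory "container" are assumptions made for the example and do not reflect the tool's actual paging, key format, or database queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Conceptual sketch of orphan detection; all names and key formats are hypothetical. */
final class ReconcileSketch {
    /**
     * Finds payload keys with no matching RDBMS record.
     * When dryRun is false the orphans are also removed from the container.
     */
    static List<String> reconcile(Set<String> container, Set<String> dbKeys, boolean dryRun) {
        List<String> orphans = new ArrayList<>();
        // Iterate over a copy so entries can be removed from the container safely.
        for (String key : new ArrayList<>(container)) {
            if (!dbKeys.contains(key)) {
                orphans.add(key);
                if (!dryRun) {
                    container.remove(key);
                }
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        Set<String> container = new TreeSet<>(Arrays.asList("key-1", "key-2", "key-3"));
        Set<String> dbKeys = new HashSet<>(Arrays.asList("key-1", "key-3"));
        System.out.println("dry run orphans: " + reconcile(container, dbKeys, true));
        System.out.println("deleted orphans: " + reconcile(container, dbKeys, false));
        System.out.println("remaining: " + container);
    }
}
```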

The following examples use PostgreSQL as the database type, but the tool also supports db2 and derby as options.

To identify orphan records without deleting anything, run:
```
java -jar fhir-persistence-blob-app-*-cli.jar \
--fhir-config-dir /path/to/wlp/usr/servers/defaultServer \
--tenant-id default \
--ds-id default \
--db-properties database.properties \
--db-type postgresql \
--reconcile \
--dry-run \
--max-scan-seconds 600
```

To identify and delete orphan records, run:
```
java -jar fhir-persistence-blob-app-*-cli.jar \
--fhir-config-dir /path/to/wlp/usr/servers/defaultServer \
--tenant-id default \
--ds-id default \
--db-properties database.properties \
--db-type postgresql \
--reconcile \
--confirm \
--max-scan-seconds 600
```

To continue an earlier scan-and-delete run that did not complete, take the last continuation token value reported in the log of the previous run and pass it with `--continuation-token` when running the command again:

```
java -jar fhir-persistence-blob-app-*-cli.jar \
--fhir-config-dir /path/to/wlp/usr/servers/defaultServer \
--tenant-id default \
--ds-id default \
--db-properties database.properties \
--db-type postgresql \
--reconcile \
--confirm \
--max-scan-seconds 600 \
--continuation-token "<token>"
```
The continuation token can be identified in the log by searching for the string `__CONTINUATION_TOKEN__ =`. The application logs an INFO message after each page of records is processed, so use the last occurrence found in the log. The value of the continuation token is opaque and only meaningful to the Azure Blob service.
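
A small helper can pull the last token out of a saved log. This is a sketch under one stated assumption: the token is taken to be the remainder of the line after the `__CONTINUATION_TOKEN__ =` marker, with surrounding whitespace trimmed. Check that against your actual log output before relying on it; the `ContinuationTokenFinder` class is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical log-scanning helper; the assumed line layout may differ from real logs. */
final class ContinuationTokenFinder {
    static final String MARKER = "__CONTINUATION_TOKEN__ =";

    /** Returns the value following the last occurrence of the marker, if any. */
    static Optional<String> lastToken(List<String> logLines) {
        String last = null;
        for (String line : logLines) {
            int idx = line.indexOf(MARKER);
            if (idx >= 0) {
                // Assumption: everything after the marker on this line is the token.
                last = line.substring(idx + MARKER.length()).trim();
            }
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "INFO: scan started",
                "INFO: __CONTINUATION_TOKEN__ = first-page-token",
                "INFO: __CONTINUATION_TOKEN__ = second-page-token");
        System.out.println(lastToken(log).orElse("(none found)"));
    }
}
```

To run this against a real log file, the lines could be read with `java.nio.file.Files.readAllLines` and passed to `lastToken`.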

## Development

For development, payload offload can be configured to point to a local `azurite` container which emulates the API of the Azure Blob store service. Run the following command to start a local `azurite` container:

```
podman run -d -p 10000:10000 \
-v ./data/azurite:/data:z \
mcr.microsoft.com/azure-storage/azurite \
azurite-blob --blobHost 0.0.0.0 --blobPort 10000
```

More info on using `azurite` for development can be found in the official documents here: https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azurite?tabs=docker-hub.

## Command Line Options

| Option | Argument Type | Description |
| -------- | ---- | ----------- |
| --fhir-config-dir `<dir>` | String | (Required) Path to the IBM FHIR Server base server configuration directory. |
| --tenant-id `<tenant-id>` | String | (Optional, Default="default") The tenant identifier. |
| --ds-id `<ds-id>` | String | (Optional, Default="default") The datasource identifier. |
| --db-properties `<db-properties>` | String | (Required for running with `--reconcile`) File name of a `.properties` file containing database connection details. |
| --db-type `<db-type>` | String | (Required for running with `--reconcile`) The database type, `postgresql`, `db2` or `derby`. |
| --reconcile | | Run the reconciliation process |
| --create-container | | Create the container. The container name is obtained by reading the IBM FHIR Server tenant payload configuration. |
| --dry-run | | (Default) Do not make any changes (create/delete). |
| --confirm | | (Optional) Enable changes (create/delete). |
| --max-scan-seconds `<seconds>` | Integer | (Optional) Stop the scan after `<seconds>`. The scan emits a continuation token which can be used to restart the scan from a prior point. |
| --continuation-token `<token>` | String | (Optional) Start the scan from a previous point.|
@@ -0,0 +1,67 @@
/*
 * (C) Copyright IBM Corp. 2022
 *
 * SPDX-License-Identifier: Apache-2.0
 */

package com.ibm.fhir.persistence.blob.app;

import java.util.logging.Level;
import java.util.logging.Logger;

import com.ibm.fhir.config.FHIRRequestContext;
import com.ibm.fhir.exception.FHIRException;
import com.ibm.fhir.persistence.blob.BlobContainerManager;
import com.ibm.fhir.persistence.blob.BlobManagedContainer;

/**
 * Create the container if it doesn't currently exist
 */
public class CreateContainer {
    private static final Logger logger = Logger.getLogger(CreateContainer.class.getName());
    private final String tenantId;
    private final String dsId;
    private final boolean dryRun;

    /**
     * Public constructor
     * @param tenantId
     * @param dsId
     * @param dryRun
     */
    public CreateContainer(String tenantId, String dsId, boolean dryRun) {
        this.tenantId = tenantId;
        this.dsId = dsId;
        this.dryRun = dryRun;
    }

    /**
     * Create the container for the configured tenant and datasource pair
     */
    public void run() throws FHIRException {
        // Set up the request context for the configured tenant and datastore
        FHIRRequestContext.set(new FHIRRequestContext(tenantId, dsId));

        // Check to see if the container already exists, and if not, then
        // issue the create container command and wait for the response
        BlobManagedContainer bmc = BlobContainerManager.getSessionForTenantDatasource();
        try {
            Boolean exists = bmc.getClient().exists().block();
            if (exists != null && exists.booleanValue()) {
                logger.info("Container already exists: " + bmc.getContainerName());
                return;
            }

            if (!this.dryRun) {
                logger.info("Creating container: " + bmc.getContainerName());
                bmc.getClient().create().block();
                logger.info("Container created");
            } else {
                logger.info("[dry-run] Creating container: " + bmc.getContainerName());
            }
        } catch (RuntimeException x) {
            logger.log(Level.SEVERE, "failed to create container: " + bmc.getContainerName(), x);
            throw x;
        }
    }
}
@@ -16,6 +16,7 @@

import com.ibm.fhir.config.FHIRConfiguration;
import com.ibm.fhir.database.utils.model.DbType;
import com.ibm.fhir.exception.FHIRException;
import com.ibm.fhir.persistence.blob.BlobContainerManager;

/**
@@ -33,6 +34,7 @@ public class Main {
    private String dsId;
    private DbType dbType;
    private boolean reconcile;
    private boolean createContainer;
    private boolean dryRun = true;
    private String continuationToken = null;

@@ -92,6 +94,9 @@ protected void parseArgs(String[] args) {
            case "--reconcile":
                this.reconcile = true;
                break;
            case "--create-container":
                this.createContainer = true;
                break;
            case "--confirm":
                this.dryRun = false;
                break;
@@ -135,14 +140,31 @@ protected void process() throws Exception {
        }

        FHIRConfiguration.setConfigHome(fhirConfigDir);

        boolean didSomething = false;
        if (createContainer) {
            createContainer();
            didSomething = true;
        }

        if (this.reconcile) {
            runReconciliation();
            didSomething = true;
        }

        if (!didSomething) {
            throw new IllegalArgumentException("Must specify at least one action");
        }
    }

    /**
     * Create the container
     */
    private void createContainer() throws FHIRException {
        CreateContainer action = new CreateContainer(this.tenantId, this.dsId, this.dryRun);
        action.run();
    }

    /**
     * Run the reconciliation process
     */
@@ -40,4 +40,11 @@ public BlobContainerAsyncClient getClient() {
    public BlobPropertyGroupAdapter getProperties() {
        return this.properties;
    }

    /**
     * @return the containerName property value
     */
    public String getContainerName() {
        return properties.getContainerName();
    }
}
