This repository contains an Azure DevOps pipeline and associated scripts for delivering survey data. The process is triggered by Concourse jobs and utilises various components to securely transfer data from Blaise to on-premises locations.
Survey configurations are defined in JSON files within the `configurations` folder. Here's the standard default configuration:
```json
{
  "deliver": {
    "spss": true,
    "ascii": true,
    "json": false,
    "xml": false
  },
  "createSubFolder": false,
  "packageExtension": "zip",
  "auditTrailData": true,
  "batchSize": 0,
  "throttleLimit": 3
}
```
| Setting | Description |
|---|---|
| `deliver` | Specifies which file formats to include in the delivered package. |
| `spss` | Metadata in SPSS format. |
| `ascii` | Data in ASCII format, used for SPSS. |
| `json` | Data in JSON format. |
| `xml` | Metadata in XML format. |
| `createSubFolder` | If true, creates a timestamped subfolder for the non-Blaise delivery formats. |
| `packageExtension` | Determines the package file extension (e.g., zip). |
| `auditTrailData` | If true, includes a CSV file containing audit trail information. |
| `batchSize` | Sets the maximum number of cases per batch (0 for all records). |
| `throttleLimit` | Limits the number of concurrently processed questionnaires. |
The default configuration delivers the Blaise data along with SPSS and ASCII formats. To customise the configuration for a survey, create a JSON file named `<survey>.json` in the `configurations` folder, where `<survey>` is the survey type acronym (e.g., OPN, LM, IPS).
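For illustration, a delivery script could resolve a survey's configuration by falling back to the default when no survey-specific file exists. The sketch below assumes Python, a `default.json` file name, and a `load_config` helper, none of which are taken from this repository.

```python
import json
from pathlib import Path

# Hypothetical path to the configurations folder described above.
CONFIG_DIR = Path("configurations")


def load_config(survey: str) -> dict:
    """Return the survey-specific configuration, falling back to the default.

    The file name "default.json" and this helper are illustrative only.
    """
    survey_file = CONFIG_DIR / f"{survey}.json"
    path = survey_file if survey_file.exists() else CONFIG_DIR / "default.json"
    with open(path) as f:
        return json.load(f)


# Example: read the batching settings for an OPN delivery run.
config = load_config("OPN")
print(config["batchSize"], config["throttleLimit"])
```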
- Concourse job is triggered on schedule or manually.
- Job passes survey name and Azure DevOps pipeline ID to another pipeline, which calls shell scripts.
- Shell script calls Python script.
- Python script initiates Azure DevOps pipeline via secure HTTP request (see the sketch after this list).
- Azure DevOps pipeline runs data delivery YAML on a dedicated VM via agent.
- YAML executes scripts, referencing survey-specific config JSON.
- PowerShell scripts set up the environment (Blaise license, Manipula).
- Blaise-CLI fetches survey data using NuGet package.
- Manipula generates various data formats (CSV, JSON, SPSS, ASCII).
- PowerShell zips the data and places it in the NiFi staging bucket.
- Cloud function encrypts the zip and moves it to the NiFi bucket.
- Another cloud function publishes zip metadata to Pub/Sub.
- NiFi monitors the Pub/Sub topic.
- NiFi consumes the message, unzips the data, and delivers it on-premises.
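For illustration, the secure HTTP request that initiates the Azure DevOps pipeline could look like the sketch below, which calls the Azure DevOps "Run pipeline" REST endpoint using a personal access token. The organisation, project, template parameter name, and environment variable are placeholders, not values from this repository.

```python
import base64
import os

import requests

# Placeholders: substitute the real organisation, project, and pipeline ID.
ORG = "my-organisation"
PROJECT = "my-project"


def trigger_pipeline(pipeline_id: int, survey: str) -> int:
    """Queue an Azure DevOps pipeline run and return its run ID (illustrative sketch)."""
    url = (
        f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/pipelines/"
        f"{pipeline_id}/runs?api-version=7.1-preview.1"
    )
    # Azure DevOps accepts a personal access token via basic auth with an empty username.
    pat = os.environ["AZURE_DEVOPS_PAT"]
    token = base64.b64encode(f":{pat}".encode()).decode()
    body = {
        # The parameter name "survey" is assumed here, not taken from the pipeline YAML.
        "templateParameters": {"survey": survey},
    }
    response = requests.post(url, json=body, headers={"Authorization": f"Basic {token}"})
    response.raise_for_status()
    return response.json()["id"]
```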
An automated solution for delivering survey data files from a sandbox NiFi Google Cloud Storage bucket to the development (dev) NiFi Google Cloud Storage bucket.
This supports future survey uplifts and testing in sandbox environments by delivering test data to the development (dev) environment for business approval.
The cloud function also renames files to distinguish them from normal survey data.
- A `.zip` file containing survey data arrives in the sandbox NiFi bucket.
- The cloud function is triggered and checks whether the filename follows the `dd_<survey>_<timestamp>.zip` pattern.
- If the `dd` prefix is found, the process continues; otherwise the file is ignored.
- The file is renamed to include the sandbox/environment suffix, like so: `dd_<survey>_sandbox_<env_suffix>_<timestamp>.zip`
- The renamed file is then copied into the development (dev) NiFi bucket.
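A minimal sketch of such a cloud function, assuming a first-generation Python Cloud Function triggered by object creation in the sandbox bucket and using the `google-cloud-storage` client. The bucket name, environment suffix, and filename handling are placeholders and simplifications rather than the repository's actual implementation.

```python
import os
import re

from google.cloud import storage

# Placeholders: the real bucket name and suffix would come from the environment.
DEV_BUCKET = os.environ.get("DEV_NIFI_BUCKET", "dev-nifi-bucket")
ENV_SUFFIX = os.environ.get("ENV_SUFFIX", "sandbox01")

client = storage.Client()


def copy_to_dev(event, context):
    """Triggered when an object is finalised in the sandbox NiFi bucket."""
    name = event["name"]  # e.g. a hypothetical dd_opn_26032025_112000.zip

    # Only act on data delivery files named dd_<survey>_<timestamp>.zip.
    match = re.match(r"^dd_([^_]+)_(.+)\.zip$", name)
    if not match:
        return  # not a dd file, so ignore it

    survey, timestamp = match.groups()
    # Rename to dd_<survey>_sandbox_<env_suffix>_<timestamp>.zip so the file is
    # distinguishable from normal survey data in the dev bucket.
    new_name = f"dd_{survey}_sandbox_{ENV_SUFFIX}_{timestamp}.zip"

    source_bucket = client.bucket(event["bucket"])
    source_bucket.copy_blob(source_bucket.blob(name), client.bucket(DEV_BUCKET), new_name)
```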