Blaise Data Delivery

This repository contains an Azure DevOps pipeline and associated scripts for delivering survey data. The process is triggered by Concourse jobs and utilises various components to securely transfer data from Blaise to on-premises locations.

Configuration

Survey configurations are defined in JSON files within the configurations folder. Here's the standard default configuration:

{
    "deliver" : {
        "spss" : true,
        "ascii": true,
        "json" : false,
        "xml" : false
    },

    "createSubFolder" : false,
    "packageExtension" : "zip",
    "auditTrailData" : true,
    "batchSize" : 0,
    "throttleLimit" : 3
}

Configuration settings

| Setting | Description |
|---------|-------------|
| deliver | Specifies which file formats to include in the delivered package. |
| deliver.spss | Metadata in SPSS format. |
| deliver.ascii | Data in ASCII format, used for SPSS. |
| deliver.json | Data in JSON format. |
| deliver.xml | Metadata in XML format. |
| createSubFolder | If true, creates a timestamped subfolder for the non-Blaise delivery formats. |
| packageExtension | Determines the package file extension (e.g., zip). |
| auditTrailData | If true, includes a CSV file containing audit trail information. |
| batchSize | Sets the maximum number of cases per batch (0 delivers all records in one batch). |
| throttleLimit | Limits the number of concurrently processed questionnaires. |
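
To make the batchSize and throttleLimit semantics concrete, here is a minimal illustrative sketch in Python. The pipeline itself applies these settings in its own PowerShell scripts, so treat this purely as an interpretation; the deliver_batch callable is hypothetical.

```python
# Illustrative only: how batchSize and throttleLimit could be interpreted.
# The real pipeline applies these settings in its own scripts; the
# deliver_batch callable below is hypothetical.
from concurrent.futures import ThreadPoolExecutor


def chunk(cases, batch_size):
    """Split cases into batches; a batch_size of 0 means one batch with every record."""
    if batch_size <= 0:
        return [cases]
    return [cases[i:i + batch_size] for i in range(0, len(cases), batch_size)]


def deliver_questionnaire(questionnaire, cases, batch_size, deliver_batch):
    """Deliver one questionnaire's cases, batch by batch."""
    for batch in chunk(cases, batch_size):
        deliver_batch(questionnaire, batch)


def deliver_all(questionnaires, config, deliver_batch):
    """Process questionnaires concurrently, never more than throttleLimit at once."""
    with ThreadPoolExecutor(max_workers=config["throttleLimit"]) as pool:
        for questionnaire, cases in questionnaires.items():
            pool.submit(deliver_questionnaire, questionnaire, cases,
                        config["batchSize"], deliver_batch)
```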

Setting up a new survey

The default configuration delivers the Blaise data along with SPSS and ASCII formats. To customise the configuration for a survey, create a JSON file named <survey>.json in the configurations folder, where <survey> is the survey type acronym (e.g., OPN, LM, IPS).
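
As a rough illustration of how the survey-specific file might be picked up, a loader could fall back to the default configuration when no <survey>.json exists. The default file name and function below are assumptions for the sketch, not the pipeline's actual code.

```python
# Hypothetical sketch of configuration resolution; file and function names
# are assumptions rather than the pipeline's actual implementation.
import json
from pathlib import Path

CONFIG_DIR = Path("configurations")


def load_config(survey: str) -> dict:
    """Return <survey>.json if it exists, otherwise the default configuration."""
    survey_file = CONFIG_DIR / f"{survey}.json"
    default_file = CONFIG_DIR / "default.json"  # assumed name for the default config
    path = survey_file if survey_file.exists() else default_file
    return json.loads(path.read_text())


# e.g. load_config("IPS") picks up configurations/IPS.json if it exists.
```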

High-level data delivery process

  1. A Concourse job is triggered on a schedule or manually.
  2. The job passes the survey name and Azure DevOps pipeline ID to another pipeline, which calls shell scripts.
  3. The shell script calls a Python script.
  4. The Python script initiates the Azure DevOps pipeline via a secure HTTP request (see the sketch after this list).
  5. The Azure DevOps pipeline runs the data delivery YAML on a dedicated VM via an agent.
  6. The YAML executes scripts, referencing the survey-specific config JSON.
  7. PowerShell scripts set up the environment (Blaise license, Manipula).
  8. Blaise-CLI fetches the survey data using a NuGet package.
  9. Manipula generates various data formats (CSV, JSON, SPSS, ASCII).
  10. PowerShell zips the data and places it in the NiFi staging bucket.
  11. A cloud function encrypts the zip and moves it to the NiFi bucket.
  12. Another cloud function publishes the zip metadata to Pub/Sub.
  13. NiFi monitors the Pub/Sub topic.
  14. NiFi consumes the message, unzips the data, and delivers it on-premises.
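
Step 4 amounts to an authenticated call to the Azure DevOps REST API. Below is a minimal sketch assuming a personal access token and the pipeline Runs endpoint; the organisation, project, and parameter names are placeholders rather than the repository's real values.

```python
# Minimal sketch of triggering an Azure DevOps pipeline run over HTTPS.
# Organisation, project, and parameter names are placeholders.
import os

import requests

ORG = "my-organisation"   # placeholder
PROJECT = "my-project"    # placeholder


def trigger_pipeline(pipeline_id: int, survey: str) -> int:
    """Queue a pipeline run and return its run ID."""
    url = (
        f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/pipelines/"
        f"{pipeline_id}/runs?api-version=7.1"
    )
    response = requests.post(
        url,
        auth=("", os.environ["AZURE_DEVOPS_PAT"]),  # PAT as the basic-auth password
        json={"templateParameters": {"survey": survey}},  # assumed parameter name
    )
    response.raise_for_status()
    return response.json()["id"]
```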

Sandbox data delivery cloud function

This is an automated solution for delivering survey data files from a sandbox NiFi Google Cloud Storage bucket to the development (dev) NiFi Google Cloud Storage bucket.
It supports future survey uplifts and testing in sandbox environments by delivering test data to the dev environment for business approval.
The cloud function also renames files to distinguish them from normal survey data.

Sandbox data delivery process

  1. A .zip file containing survey data arrives in the sandbox NiFi bucket.
  2. The cloud function is triggered and checks whether the filename matches the pattern dd_<survey>_<timestamp>.zip.
  3. If the dd prefix is found, the process continues; otherwise the file is ignored.
  4. The function renames the file to include a sandbox/environment suffix, like so: dd_<survey>_sandbox_<env_suffix>_<timestamp>.zip
  5. The renamed file is then copied into the development (dev) NiFi bucket (see the sketch below).
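
A minimal sketch of such a cloud function, assuming a Cloud Storage finalise trigger and the google-cloud-storage client; the bucket names, environment suffix, and entry-point name are placeholders rather than the deployed function's actual code.

```python
# Illustrative Cloud Function for the sandbox-to-dev copy; names are placeholders.
import os

from google.cloud import storage

DEV_BUCKET = os.environ.get("DEV_NIFI_BUCKET", "dev-nifi-bucket")  # placeholder
ENV_SUFFIX = os.environ.get("ENV_SUFFIX", "sandbox01")             # placeholder


def copy_sandbox_data(event, context):
    """Triggered when an object is finalised in the sandbox NiFi bucket."""
    name = event["name"]
    if not name.startswith("dd_") or not name.endswith(".zip"):
        return  # not a data delivery zip, so ignore it

    # dd_<survey>_<timestamp>.zip -> dd_<survey>_sandbox_<env_suffix>_<timestamp>.zip
    prefix, survey, timestamp = name[: -len(".zip")].split("_", 2)
    new_name = f"{prefix}_{survey}_sandbox_{ENV_SUFFIX}_{timestamp}.zip"

    client = storage.Client()
    source_bucket = client.bucket(event["bucket"])
    source_bucket.copy_blob(source_bucket.blob(name), client.bucket(DEV_BUCKET), new_name)
```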