SEL-735 Meter Event Data Pipeline

This repository contains a set of Bash scripts that form a data pipeline for automatically downloading, organizing, and archiving event data from an SEL-735 meter. The pipeline is divided into two main executable scripts:

  1. data_pipeline.sh: Handles the first four steps:

    • Connecting to the meter via FTP
    • Downloading new files
    • Organizing and creating metadata
    • Compressing data
  2. archive_pipeline.sh: Handles the final step:

    • Archiving and transferring event data to the dedicated server.

Prerequisites

Ensure you have the following before running the pipeline:

  • Unix-like environment (Linux, macOS, or a Unix-like Windows terminal)
  • FTP credentials for the meter
  • Meter Configuration
  • Must have installed:
    • lftp
    • yq
    • zip
    • rsync
    • jq
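
One quick way to confirm that the tools listed above are on your PATH before the first run is a check like the one below. This is a minimal sketch and not part of the repository; the function name `missing_tools` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): print each given
# command name that cannot be found on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools listed under Prerequisites and report any that are absent.
missing=$(missing_tools lftp yq zip rsync jq)
if [ -n "$missing" ]; then
    printf 'Missing required tools:\n%s\n' "$missing" >&2
fi
```

If nothing is printed, every listed tool was found.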

Installation

  1. Clone the repository:

    git clone git@github.com:acep-uaf/camio-meter-streams.git
    cd camio-meter-streams/cli_meter

    Note: You can check your SSH connection with ssh -T git@github.com

Configuration

General Configuration Steps

  1. Navigate to the config directory and copy each example configuration file to a new file:

    cd config
    cp config.yml.example config.yml
    cp archive_config.yml.example archive_config.yml
  2. Update the configuration files with the target details:

    • config.yml: Add the FTP server credentials and meter configuration data.
    • archive_config.yml: Add the source and destination directories and other relevant details.
  3. Secure the configuration files so that only the owner can read and write:

    chmod 600 config.yml
    chmod 600 archive_config.yml
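
To double-check the result of the chmod step, a small portable test like the following can help. It is a sketch under the assumption that you run it from the config directory; `has_mode_600` is a hypothetical helper name, not something the repository provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file's permission bits are
# exactly 600 (rw-------). With a plain octal mode, find's -perm test
# matches the mode exactly; -prune keeps find from descending.
has_mode_600() {
    [ -n "$(find "$1" -prune -perm 600 2>/dev/null)" ]
}

# Warn about any existing config file that does not have mode 600.
for f in config.yml archive_config.yml; do
    if [ -e "$f" ] && ! has_mode_600 "$f"; then
        echo "Warning: $f does not have mode 600" >&2
    fi
done
```

Using find rather than stat avoids the option differences between the Linux and macOS versions of stat.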

Execution

To run the data pipeline and then transfer data to the target server:

  1. Run the Data Pipeline

    Execute the data_pipeline script from the cli_meter directory. The script requires a configuration file specified via the -c/--config flag. If this is your first time running the pipeline, the initial download may take a few hours. To pause the download safely, see: How to Stop the Pipeline

    Command

    ./data_pipeline.sh -c config/config.yml
  2. Run the Archive Pipeline

    After the data_pipeline script completes, execute the archive_pipeline script from the cli_meter directory. The script requires a configuration file specified via the -c/--config flag.

    Command

    ./archive_pipeline.sh -c config/archive_config.yml

    Notes

    The rsync step uses the --exclude flag to skip the working directory, so only complete files are transferred.

  3. Run the Cleanup Process (Conditional)

    If the archive_pipeline script completes successfully and the enable_cleanup flag is set to true in the archive configuration file, the cleanup.sh script will be executed automatically. This script removes outdated event files based on the retention period specified in the configuration file.

    If the enable_cleanup flag is not set to true, you can run the cleanup manually by passing in the archive configuration file.

    Command

    ./cleanup.sh -c config/archive_config.yml

    Notes

    Ensure that the archive_config.yml file is properly configured with the retention periods for each directory in the cleanup process.
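
To get a feel for what a retention period means in practice, the sketch below lists files older than a given number of days. It is an illustration only and an assumption about how such a rule can be expressed; it is not the logic of cleanup.sh, and `list_expired_files`, the path, and the 30-day value are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not the repository's cleanup.sh):
# list regular files under DIR that were last modified more than DAYS days ago.
list_expired_files() {
    local dir=$1 days=$2
    # -mtime +N matches files whose age in whole days is greater than N
    find "$dir" -type f -mtime +"$days"
}

# Usage (hypothetical path and retention period):
#   list_expired_files /path/to/events 30
```

Listing candidates first works as a dry run: you can review what a given retention period would remove before enabling automatic cleanup.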

How to Stop the Pipeline

When you need to stop the pipeline:

  • To Stop Safely/Pause Download:
    • Use Ctrl+C to interrupt the process.
    • If interrupting the process doesn't work, try Ctrl+\ to quit.
    • To resume the download, rerun the data_pipeline command. The download resumes from where it left off, provided the same config file (-c) is used.
  • Avoid Using Ctrl+Z:
    • Do not use Ctrl+Z to suspend the process, as it may cause the pipeline to end without properly closing the FTP connection.
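
The difference between these key combinations comes down to signals: Ctrl+C sends SIGINT and Ctrl+\ sends SIGQUIT, both of which a Bash script can trap to clean up before exiting, whereas Ctrl+Z sends SIGTSTP, which only suspends the process. The sketch below shows one common way a script can handle interruption; whether data_pipeline.sh does exactly this is an assumption, and `on_interrupt` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical handler: runs when the script receives SIGINT (Ctrl+C)
# or SIGQUIT (Ctrl+\), so resources can be released before exiting.
on_interrupt() {
    echo "Interrupted: closing connection and cleaning up"
    # A real script would close the FTP session and remove partial
    # files here (assumed steps, not taken from the repository).
}

# Run the handler, then exit with the conventional 128+signal status.
trap 'on_interrupt; exit 130' INT
trap 'on_interrupt; exit 131' QUIT
```

Because a suspended process never reaches a handler like this, any open connection is simply left idle until it times out.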

Testing

This repository includes automated tests for the scripts using Bats (Bash Automated Testing System) along with the helper libraries bats-assert, bats-mock, and bats-support. The tests are located in the test directory and run automatically on all pull requests using GitHub Actions.

Prerequisites

Ensure you have cloned the repository with its required submodules. They should be located under the test and test/test_helper directories:

  • bats-core
  • bats-assert
  • bats-mock
  • bats-support
  1. Clone the repository with submodules:

    git clone --recurse-submodules git@github.com:acep-uaf/camio-meter-streams.git

    If you have already cloned the repository without submodules, you can initialize and update them with:

    git submodule update --init --recursive

Running the Tests

  1. Navigate to the project directory:

    cd /path/to/camio-meter-streams/cli_meter
  2. Run all the tests:

    bats test

Adding Tests

When making changes to the pipeline, it is essential to add or update tests to cover the new or modified functionality. Follow these steps to add tests:

  1. Locate the appropriate test file:

    Navigate to the test directory and identify the test file that corresponds to the functionality you're modifying. If no such file exists, create a new test file using the .bats extension (e.g., my_script_test.bats).

  2. Write your tests:

    Use bats-assert, bats-mock, and bats-support helper libraries to write comprehensive tests. Refer to the bats-core documentation.

    If your tests require shared variables or helper functions, define them in test/test_helper/commons.bash to ensure consistency and reusability across multiple test files. For example:

    # commons.bash
    MY_VARIABLE="common value"
    function my_helper_function {
        echo "This is a helper function"
    }

    Example test structure:

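    @test "description of the test case" {
        # Arrange
        # Set up any necessary environment or input data.

        # Act
        # `run` stores the exit status in $status and the output in $output,
        # which assert_success and assert_output then check.
        run command-to-test

        # Assert
        assert_success
        assert_output "expected output"
    }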
  3. Run your tests locally:

    Ensure your new tests pass locally by running bats test.

  4. Commit and push your changes:

    Include your test updates in the same pull request as the code changes.

Continuous Testing with GitHub Actions

All tests in the repository are automatically executed through GitHub Actions on every pull request. This ensures that all contributions meet quality and functionality standards before merging. Ensure your pull request passes all tests to avoid delays in the review process.