Add ECG Data Pipeline Tool (#53)
# *Add ECG Data Pipeline Tool*

## ♻️ Current situation & Problem
This PR adds the ECG Data Pipeline tool into a separate folder in the
root of the project repository. Currently, the project lacks a dedicated
pipeline for processing, analyzing and visualising the ECG data. This
addition aims to provide a streamlined way to handle ECG data with
enhanced analysis and visualisation capabilities.


## ⚙️ Release Notes 
- Added ECG Data Pipeline folder containing the pipeline tool for
processing, analyzing, and visualising ECG data.
- The pipeline includes modules for data preparation, access to
Firebase, utility functions, and data visualization, alongside an
interactive Python notebook (ECGDataPipelineTemplate.ipynb) for ECG data
review.
- Migration Guide: No breaking changes introduced. Users can integrate
the pipeline tool into their workflow by cloning the repository and
following the setup instructions in the README.

```python
!git clone https://github.com/StanfordBDHG/PediatricAppleWatchStudy.git
%cd PediatricAppleWatchStudy/ECGDataPipeline
```

## 📚 Documentation
The ECG Data Pipeline tool is fully documented. The README within the
ECG Data Pipeline folder provides an overview, setup instructions, and
usage examples for integrating and using the tool.

## ✅ Testing
A job that builds and tests the ECG Data Pipeline notebook was added to
the existing `build-and-test` workflow. This includes setting up Python,
NodeJS, Java, and LaTeX environments; installing necessary dependencies;
and executing the notebook with Firebase emulators to validate the data
pipeline's functionality. The notebook is then converted to a PDF
document, which is uploaded as an artifact for review.
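
The execute-and-convert step described above can also be reproduced locally. The following is a minimal sketch, not the project's own tooling: it assumes the Firebase CLI and Jupyter are installed and on `PATH`, and the helper names `build_emulator_command` and `run_notebook_in_emulator` are hypothetical. The command it assembles mirrors the `firebase emulators:exec` invocation in the workflow file.

```python
import shlex
import shutil
import subprocess


def build_emulator_command(notebook: str, import_dir: str) -> list[str]:
    """Assemble the argv for running a notebook inside the Firebase emulators.

    Hypothetical helper: it only mirrors the CLI call used in the workflow.
    """
    # The inner command is passed to `emulators:exec` as a single string,
    # so it is quoted and joined here.
    inner = shlex.join(
        ["jupyter", "nbconvert", "--to", "pdf", "--execute", notebook]
    )
    return ["firebase", "emulators:exec", f"--import={import_dir}", inner]


def run_notebook_in_emulator(notebook: str, import_dir: str) -> int:
    """Run the command and return its exit code (assumes the CLIs exist)."""
    if shutil.which("firebase") is None:
        raise RuntimeError("firebase CLI not found on PATH")
    command = build_emulator_command(notebook, import_dir)
    return subprocess.run(command, check=False).returncode
```

Calling `run_notebook_in_emulator("./ECGDataPipelineTemplate/ECGDataPipelineTemplate.ipynb", "./ECGDataPipelineTemplate/sample_data")` from the repository root would start the emulators with the sample data, execute the notebook, and shut the emulators down afterwards.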

### Code of Conduct & Contributing Guidelines 

By creating this pull request, you agree to follow our [Code
of
Conduct](https://github.com/StanfordBDHG/.github/blob/main/CODE_OF_CONDUCT.md)
and [Contributing
Guidelines](https://github.com/StanfordBDHG/.github/blob/main/CONTRIBUTING.md):
- [x] I agree to follow the [Code of
Conduct](https://github.com/StanfordBDHG/.github/blob/main/CODE_OF_CONDUCT.md)
and [Contributing
Guidelines](https://github.com/StanfordBDHG/.github/blob/main/CONTRIBUTING.md).

---------

Co-authored-by: Paul Schmiedmayer <[email protected]>
Vicbi and PSchmiedmayer authored Mar 21, 2024
1 parent 5085051 commit 84a0ddd
Showing 24 changed files with 4,537 additions and 35 deletions.
50 changes: 50 additions & 0 deletions .github/workflows/build-and-test.yml
@@ -49,3 +49,53 @@ jobs:
      coveragereports: PAWS.xcresult
    secrets:
      token: ${{ secrets.CODECOV_TOKEN }}
  buildandtestdatapipelinenotebook:
    name: Build and Test ECG Data pipeline Notebook
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
      - name: Setup NodeJS
        uses: actions/setup-node@v3
      - name: Setup Java
        uses: actions/setup-java@v3
        with:
          distribution: 'microsoft'
          java-version: '17'
      - name: Setup LaTex
        run: |
          sudo apt-get update -y
          sudo apt-get install -y --no-install-recommends pandoc texlive-xetex texlive-fonts-recommended texlive-plain-generic || true
          if ! dpkg -l pandoc texlive-xetex texlive-fonts-recommended texlive-plain-generic; then
            sudo apt-get update --fix-missing
            sudo apt-get install -y pandoc texlive-xetex texlive-fonts-recommended texlive-plain-generic
          fi
      - name: Cache Firebase Emulators
        uses: actions/cache@v3
        with:
          path: ~/.cache/firebase/emulators
          key: ${{ runner.os }}-${{ runner.arch }}-firebase-emulators-${{ hashFiles('~/.cache/firebase/emulators/**') }}
      - name: Install Firebase CLI Tools
        run: npm install -g firebase-tools
      - name: Install Infrastructure
        run: |
          python -m pip install --upgrade pip
          pip install jupyterlab
      - name: Install ECGDataPipelineTemplate Dependencies
        run: |
          pip install pandas numpy matplotlib firebase-admin requests ipywidgets pytz
      - name: Set Firestore Emulator Environment Variable
        run: |
          echo "FIRESTORE_EMULATOR_HOST=localhost:8080" >> $GITHUB_ENV
          echo "GCLOUD_PROJECT=ecgdatapipelinetemplate" >> $GITHUB_ENV
      - name: Run Firebase Emulator & Execute Notebook
        run: |
          firebase emulators:exec --import=./ECGDataPipelineTemplate/sample_data "jupyter nbconvert --to pdf --execute ./ECGDataPipelineTemplate/ECGDataPipelineTemplate.ipynb"
        env:
          CI: true
      - uses: actions/upload-artifact@v4
        with:
          name: ECGDataPipelineTemplate.pdf
          path: ECGDataPipelineTemplate.pdf
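
The workflow above exports `FIRESTORE_EMULATOR_HOST` so that Firestore clients talk to the local emulator rather than a live project. A notebook may want to confirm this before reading or writing data. The sketch below is one possible guard, not code from this commit; the function name `firestore_emulator_endpoint` is hypothetical, and it only inspects the environment without contacting Firebase.

```python
import os
from typing import Mapping, Optional, Tuple


def firestore_emulator_endpoint(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, int]]:
    """Return (host, port) if FIRESTORE_EMULATOR_HOST is set, else None.

    Hypothetical helper: raises ValueError when the value is not "host:port".
    """
    env = os.environ if env is None else env
    value = env.get("FIRESTORE_EMULATOR_HOST", "").strip()
    if not value:
        return None  # No emulator configured; a client would use a live project.
    host, separator, port = value.rpartition(":")
    if not separator or not host or not port.isdigit():
        raise ValueError(f"Expected 'host:port', got {value!r}")
    return host, int(port)
```

With the values set in the workflow, `firestore_emulator_endpoint()` would return `("localhost", 8080)`; a return value of `None` signals that the emulator variable is missing.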

17 changes: 17 additions & 0 deletions .gitignore
@@ -46,3 +46,20 @@ firebase-debug.*.log*

# Swift Package List
PAWS/package-list.json

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints
.virtual_documents

# IPython
profile_default/
ipython_config.py

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
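
The bracket pattern `*.py[cod]` added above covers the three byte-compiled suffixes in one rule. As a rough illustration only, Python's `fnmatch` module uses similar glob syntax (Git's own matching rules differ in details such as directory handling), so it can show which file names this kind of pattern selects. The helper name `is_ignored_bytecode` is hypothetical.

```python
from fnmatch import fnmatchcase

# Same glob as the .gitignore entry; [cod] matches exactly one of c, o, or d.
BYTECODE_PATTERN = "*.py[cod]"


def is_ignored_bytecode(filename: str) -> bool:
    """Approximate check of a file name against the byte-code glob."""
    return fnmatchcase(filename, BYTECODE_PATTERN)


for name in ["module.pyc", "module.pyo", "module.pyd", "module.py"]:
    print(name, is_ignored_bytecode(name))
```

Plain `.py` sources do not match because the pattern requires one extra character after `py`.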
13 changes: 13 additions & 0 deletions .reuse/dep5
@@ -0,0 +1,13 @@
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

Files: ECGDataPipelineTemplate/Figures/*
Copyright: 2024 Stanford University and the project authors (see CONTRIBUTORS.md)
License: MIT
Comment: All figures are part of the Stanford Spezi Data Pipeline Template open source project.

Files: ECGDataPipelineTemplate/sample_data/*
Copyright: 2024 Stanford University and the project authors (see CONTRIBUTORS.md)
License: MIT
Comment: All files are part of the Stanford Spezi Data Pipeline Template open source project.


171 changes: 171 additions & 0 deletions ECGDataPipelineTemplate/.gitignore
@@ -0,0 +1,171 @@
#
# This source file is part of the Stanford Spezi open-source project
#
# SPDX-FileCopyrightText: 2024 Stanford University and the project authors (see CONTRIBUTORS.md)
#
# SPDX-License-Identifier: MIT
#

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints
.virtual_documents

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

.DS_Store