This repository has been archived by the owner on Jan 29, 2024. It is now read-only.
First sketch of the overall pipeline (+ research relevant tools/approaches) #562
Labels
🗄️ database
Creation and maintenance of a database of scientific literature
🚀 Feature
We are getting closer and closer to implementing all the relevant steps of our ETL pipeline. IMO it might be beneficial to create a first sketch of the overall pipeline and also look into tools that might be relevant.
Some ideas/points to discuss:
a) What logic/tool do we use to define the pipeline (raw shell script, DVC-like tools, ...)?
b) How do we trigger it (cronjob, manually, ...)?
c) How do we monitor (ideally live) the progress of a running pipeline (similar to what GitHub Actions or Jenkins do)?
d) How do we test it (e.g. by extending the current integration tests, related to #532)?
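To make the questions above a bit more concrete, here is a rough sketch of what the "plain script" option could look like: the pipeline is an ordered list of named stages, each stage logs when it starts and finishes (a very basic form of progress monitoring), and the whole thing is a regular function that an integration test can call. Everything here is an assumption for the sake of discussion: `Stage`, `run_pipeline`, and the `extract`/`transform`/`load` placeholders are hypothetical names, not existing code in this repository, and a DVC-like tool would replace most of it.

```python
import logging
import time
from dataclasses import dataclass
from typing import Any, Callable, List

logger = logging.getLogger("pipeline")


@dataclass
class Stage:
    """A named pipeline step (hypothetical helper, not an existing class)."""

    name: str
    func: Callable[[Any], Any]


def run_pipeline(stages: List[Stage], data: Any = None) -> Any:
    """Run the stages in order, passing each output to the next stage."""
    total = len(stages)
    for i, stage in enumerate(stages, start=1):
        logger.info("[%d/%d] starting %s", i, total, stage.name)
        start = time.perf_counter()
        data = stage.func(data)
        elapsed = time.perf_counter() - start
        logger.info("[%d/%d] finished %s in %.2fs", i, total, stage.name, elapsed)
    return data


# Placeholder stages with made-up data, only to show the shape of the pipeline.
def extract(_: Any) -> List[str]:
    return ["  Paper A ", "paper b"]


def transform(records: List[str]) -> List[str]:
    return [r.strip().lower() for r in records]


def load(records: List[str]) -> dict:
    return {"n_loaded": len(records), "records": records}


STAGES = [
    Stage("extract", extract),
    Stage("transform", transform),
    Stage("load", load),
]

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(run_pipeline(STAGES))
```

Since this is just a script, it could be run manually or from a cronjob, and a test could call `run_pipeline(STAGES)` directly and check the result. The log lines only give coarse progress, so this would not by itself answer the live-monitoring question.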