This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

Handle external signals to topic-extract gracefully #575

Open
jankrepl opened this issue Feb 10, 2022 · 1 comment
Labels
🗄️ database Creation and maintenance of a database of scientific literature

Comments

@jankrepl
Contributor

jankrepl commented Feb 10, 2022

🚀 Feature

It would be very useful to be able to send a signal (e.g. SIGINT / keyboard interrupt) to a running bbs_database topic-extract process and have a guarantee that all articles processed so far are written to the output .jsonl file before the process exits.

Currently, one has to wait until all articles have been processed.

Why useful?

  • Debugging, e.g. timeout 100 bbs_database topic-extract .... I am currently working on the overall pipeline, and it would be nice to get "some results" quickly rather than "all results" slowly.

The other commands (download, parse) at least save their results to disk as they run, so killing them is not a big deal.
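A minimal sketch of what this could look like, not the actual bbs_database code. The names run_extraction, extract and output_path are hypothetical. Each result is written and flushed as soon as it is ready, and the signal handler only sets a flag, so the article in progress is finished before the loop stops. SIGTERM is handled as well because GNU timeout sends SIGTERM by default, not SIGINT.

```python
import json
import signal


def run_extraction(articles, extract, output_path):
    """Process articles one by one and stop cleanly on SIGINT or SIGTERM.

    `extract` is a hypothetical callable returning a JSON-serializable dict.
    Returns the number of records written during this run.
    """
    stop_requested = False

    def request_stop(signum, frame):
        # Only record the request; the loop below decides when to stop.
        nonlocal stop_requested
        stop_requested = True

    # Install the handlers and remember the previous ones for restoring.
    previous = {
        sig: signal.signal(sig, request_stop)
        for sig in (signal.SIGINT, signal.SIGTERM)
    }
    n_written = 0
    try:
        with open(output_path, "a", encoding="utf-8") as f:
            for article in articles:
                if stop_requested:
                    break
                f.write(json.dumps(extract(article)) + "\n")
                f.flush()  # make the record visible on disk right away
                n_written += 1
    finally:
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)
    return n_written
```

With something like this in place, a run wrapped in timeout would leave a valid .jsonl file containing every article completed before the signal arrived.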

@jankrepl jankrepl added the 🗄️ database Creation and maintenance of a database of scientific literature label Feb 10, 2022
@jankrepl
Contributor Author

A couple of related ideas:

  • Write to the .jsonl file more regularly (maybe not necessary)
  • Check whether any entries already exist in the .jsonl file and, if so, skip those articles
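The second idea could be sketched roughly as follows. This assumes that every record in the .jsonl file carries a field identifying its article; the field name "path" and the helper names load_done_ids and pending_articles are hypothetical and not taken from bbs_database. Lines that fail to parse, such as a last line truncated by a kill, are ignored, so the corresponding article is processed again.

```python
import json
from pathlib import Path


def load_done_ids(output_path, key="path"):
    """Collect identifiers of articles already present in the output file."""
    done = set()
    path = Path(output_path)
    if not path.exists():
        return done
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)[key])
            except (json.JSONDecodeError, KeyError, TypeError):
                # Malformed or truncated record: treat the article as not done.
                continue
    return done


def pending_articles(article_ids, output_path, key="path"):
    """Return the articles that still need processing, in their original order."""
    done = load_done_ids(output_path, key)
    return [a for a in article_ids if a not in done]
```

Opening the output file in append mode and feeding only the pending articles to the extraction loop would then let an interrupted run resume where it stopped.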
