Proposal: Create a database backend with an associated API #50
Comments
@PrajwalM2212 recommended sqlite as well: I think we can just choose sqlite3 because 1. It is faster. 2. It is good for applications where the code that executes SQL statements and the application reside on the same machine. 3. It also supports huge amounts of data, up to 140 TB, with good performance. 4. It is provided as part of the Python standard library. https://www.sqlite.org/whentouse.html
The main requirement is that the storage be self-contained, right? That's why Redis is not an option? @nishakm
@zoek1 That was one of the reasons why I suggested sqlite. Since we are only using the cache for analysis purposes (our internal use), sqlite gives the best value.
At this time, my main concern is to move away from storing data in a YAML file and into something that is queryable. The discussion I would really like to have is whether we should be using a key-value store (like Redis) or a relational database (like sqlite). One thing about choosing a relational database is that you will need to put time into designing the database. Once done, it is difficult to undo. Key-value stores are easier to change, but suffer from the same problem as the flat YAML file: as more data gets added, it becomes less queryable. I am personally leaning towards implementing this in sqlite because we already have a data model, and making an API for queries means the database can be switched with something else.
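To illustrate the point about a query API making the database swappable, here is a minimal sketch. The names `CacheBackend`, `SQLiteBackend`, and the `packages` table with its columns are hypothetical and are not the project's actual data model; only the standard-library `sqlite3` module is assumed.

```python
import sqlite3
from abc import ABC, abstractmethod


class CacheBackend(ABC):
    """Hypothetical query API. Callers depend on this, not on a database."""

    @abstractmethod
    def add_package(self, layer_id, name, version):
        """Record that a layer contains a package."""

    @abstractmethod
    def get_packages(self, layer_id):
        """Return (name, version) tuples for a layer, sorted by name."""


class SQLiteBackend(CacheBackend):
    """One possible implementation backed by the stdlib sqlite3 module."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        # Assumed schema, for illustration only.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS packages ("
            "layer_id TEXT NOT NULL, name TEXT NOT NULL, version TEXT, "
            "PRIMARY KEY (layer_id, name))"
        )

    def add_package(self, layer_id, name, version):
        # 'with conn' commits on success and rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO packages VALUES (?, ?, ?)",
                (layer_id, name, version),
            )

    def get_packages(self, layer_id):
        rows = self.conn.execute(
            "SELECT name, version FROM packages "
            "WHERE layer_id = ? ORDER BY name",
            (layer_id,),
        )
        return rows.fetchall()
```

Code written against `CacheBackend` would not need to change if a Redis-backed or other implementation were added later; only a new subclass would be required.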
My research shows that using a JSON file as a backend greatly improves performance: YAML backend: 76 seconds. We would still like a database backend so folks can set up a centralized, queryable repository, but for now, switching the caching format from YAML to JSON is an easy improvement.
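As a rough sketch of what a JSON-backed cache file could look like, using only the standard library: the function names `save_cache` and `load_cache` and the shape of the cache dictionary are assumptions made for illustration, not the project's actual code.

```python
import json
import os
import tempfile


def save_cache(cache, path):
    """Write the cache dict to 'path' as JSON.

    The data is written to a temporary file in the same directory first,
    then moved into place, so a crash does not leave a half-written cache.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(cache, f, indent=2, sort_keys=True)
    os.replace(tmp_path, path)


def load_cache(path):
    """Return the cache dict stored at 'path', or an empty dict if absent."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

This keeps the flat-file approach and only changes the serialization format, so it does not address the queryability concern raised above.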
What's the status of this proposal and can I work on it?
I don't know if it is possible, but since we are aiming to store the container image in a database, can't we convert the Docker image data to JSON and then store that JSON in a Redis database? JSON greatly increases performance, and accessing the data through Redis is also faster.
It could be useful to have a database backend so that data can be more easily organized and queried. I think SQLite would be a good fit (at least at first) due to its ease of setup and management via the sqlite3 module in the standard library. Eventually we can add support for other databases.
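To make "more easily organized and queried" concrete, here is a small sketch using the standard-library `sqlite3` module. The `layers` and `packages` tables, their columns, the sample rows, and the helper `layers_with_package` are all hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE layers (
        id INTEGER PRIMARY KEY,
        digest TEXT UNIQUE NOT NULL
    );
    CREATE TABLE packages (
        layer_id INTEGER NOT NULL REFERENCES layers(id),
        name TEXT NOT NULL,
        version TEXT
    );
    """
)

# Made-up sample data.
conn.executemany(
    "INSERT INTO layers (id, digest) VALUES (?, ?)",
    [(1, "sha256:aaa"), (2, "sha256:bbb")],
)
conn.executemany(
    "INSERT INTO packages (layer_id, name, version) VALUES (?, ?, ?)",
    [(1, "bash", "5.0"), (2, "bash", "5.1"), (2, "zlib", "1.2.11")],
)


def layers_with_package(conn, name):
    """Return (digest, version) for every layer that contains 'name'."""
    rows = conn.execute(
        "SELECT l.digest, p.version "
        "FROM packages AS p JOIN layers AS l ON l.id = p.layer_id "
        "WHERE p.name = ? ORDER BY l.digest",
        (name,),
    )
    return rows.fetchall()
```

A question such as "which layers contain `bash`, and at what version?" becomes a single parameterized query, where a flat file would require loading and scanning everything.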