Proposal: Create a database backend with an associated API #50

Open
jeevers opened this issue May 15, 2018 · 8 comments
Labels
proposal Propose a change to the project super Super issue - other issues are linked to this one

Comments

@jeevers

jeevers commented May 15, 2018

It could be useful to have a database backend so that data can be more easily organized and queried. I think SQLite would be a good fit (at least at first) due to its ease of setup and management via the sqlite3 module in the standard library. Eventually we can add support for other databases.

@nishakm nishakm added the proposal Propose a change to the project label Nov 29, 2018
@nishakm
Contributor

nishakm commented Feb 26, 2020

@PrajwalM2212 recommended sqlite as well: "I think we can just choose sqlite3 because 1. It is faster. 2. It is good for applications where the code that executes SQL statements and the application reside on the same machine. 3. It also supports huge amounts of data, up to 140 TB, with greater performance. 4. It is provided as part of the Python standard library." https://www.sqlite.org/whentouse.html

@nishakm nishakm added this to the Near Future milestone Mar 19, 2020
@nishakm nishakm added the GSoC For Google Summer of Code label Mar 19, 2020
@zoek1
Contributor

zoek1 commented Mar 25, 2020

The main requirement is that the storage be self-contained, right? Is that why Redis is not an option? @nishakm

@PrajwalM2212
Contributor

PrajwalM2212 commented Mar 26, 2020

@zoek1 That was one of the reasons why I suggested sqlite. Since we are only using the cache for analysis purposes (our internal use), sqlite gives the best value.

@nishakm
Contributor

nishakm commented Mar 26, 2020

At this time, my main concern is to move away from storing data in a YAML file and into something that is queryable. The discussion I would really like to have is whether we should be using a key-value store (like Redis) or a relational database (like sqlite). One thing about choosing a relational database is that you will need to put time into designing the database. Once done, it is difficult to undo. Key-value stores are easier to change, but suffer from the same problems as the flat YAML file which is that as more data gets added, it becomes less queryable. I am personally leaning towards implementing this in sqlite because we already have a data model and making an API for queries means the database can be switched with something else.
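To make the "API in front of the database" idea concrete, here is a minimal sketch using only the standard library. The `CacheBackend` interface, the `SQLiteBackend` class, and the `packages` table with its columns are all hypothetical names chosen for illustration; they are not the project's actual data model.

```python
import sqlite3
from typing import List, Protocol, Tuple


class CacheBackend(Protocol):
    """Hypothetical query API. Callers depend only on these methods,
    so the storage behind them can be swapped later."""

    def add_package(self, layer_hash: str, name: str, version: str) -> None: ...
    def packages_in_layer(self, layer_hash: str) -> List[Tuple[str, str]]: ...
    def layers_with_package(self, name: str) -> List[str]: ...


class SQLiteBackend:
    """One possible implementation of CacheBackend on top of sqlite3."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        # Assumed schema: one row per (layer, package) pair.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS packages ("
            "layer_hash TEXT NOT NULL, "
            "name TEXT NOT NULL, "
            "version TEXT NOT NULL, "
            "PRIMARY KEY (layer_hash, name))"
        )

    def add_package(self, layer_hash: str, name: str, version: str) -> None:
        # "with conn" commits on success and rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO packages VALUES (?, ?, ?)",
                (layer_hash, name, version),
            )

    def packages_in_layer(self, layer_hash: str) -> List[Tuple[str, str]]:
        cur = self.conn.execute(
            "SELECT name, version FROM packages "
            "WHERE layer_hash = ? ORDER BY name",
            (layer_hash,),
        )
        return cur.fetchall()

    def layers_with_package(self, name: str) -> List[str]:
        cur = self.conn.execute(
            "SELECT layer_hash FROM packages WHERE name = ? ORDER BY layer_hash",
            (name,),
        )
        return [row[0] for row in cur.fetchall()]
```

A query such as `layers_with_package("zlib")` is the kind of lookup that is awkward against a flat file but is a single indexed `SELECT` here. Because callers only see the interface, a different store could later be placed behind the same three methods.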

@rnjudge rnjudge modified the milestones: Near Future, Beta Release May 21, 2020
@nishakm nishakm added super Super issue - other issues are linked to this one show-stopper We really really need a solution! and removed GSoC For Google Summer of Code labels Jun 1, 2020
@nishakm
Contributor

nishakm commented Jun 10, 2020

My research shows that using a json file as a backend greatly improves performance:

yaml backend: 76 seconds
json backend: 0.47 seconds

We would still like a database backend so folks can set up a centralized, queryable repository, but for now, switching the caching format from YAML to JSON is an easy improvement.
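For reference, a JSON-backed cache needs nothing beyond the standard library. The sketch below is an assumption about what such load and save helpers might look like; the function names and the shape of the cache dictionary are hypothetical and not taken from the project.

```python
import json
import os


def save_cache(cache: dict, path: str) -> None:
    """Write the whole cache dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f)


def load_cache(path: str) -> dict:
    """Read the cache back; a missing file is treated as an empty cache."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

The file is still read and written as a whole, so this speeds up loading but does not make the data queryable, which is why the database backend remains the longer-term goal.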

@nishakm nishakm changed the title Proposal: Replace cache.yml with a database backend Proposal: Create a database backend with an associated API Jun 10, 2020
@nishakm nishakm removed the show-stopper We really really need a solution! label Jun 10, 2020
@nishakm nishakm added the GSoC For Google Summer of Code label Jan 19, 2021
@nishakm
Contributor

nishakm commented Jan 19, 2021

  1. Design CRUD API for different items in the database: CRUD API for the cache #792
  2. Implement the database: Implement a database backend #863
  3. Implement the sync mechanism: Create a sync mechanism between the cache and the hosted database #862

@ashok-arora

What's the status of this proposal and can I work on it?

@rnjudge rnjudge removed this from the Beta Release milestone Jan 25, 2022
@urmilkalaria

I don't know if it is possible, but since we are aiming to store the container image in a database, can't we convert the Docker image to JSON format and then store the JSON data in a Redis database? JSON greatly improves performance, and accessing the data through Redis is also faster.


7 participants