docs: ADR for mediator-message-storage [skip ci] #530
Changes from 3 commits
271a340
3903663
660f165
1a568a3
836c5c8
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# Mediator message storage | ||
|
||
- Status: draft [ accepted | deprecated | superseded by [xxx](yyyymmdd-xxx.md)] | ||
- Deciders: Yurii Shynbuiev, Benjamin Voiturier, Shailesh Patil, Fabio Pinheiro, David Poltorak | ||
- Date: [2023-05-09] | ||
- Tags: storage, db, message, mongo, postgres, sql | ||
|
||
## Context and Problem Statement | ||
Mediator storage | ||
Relational databases like PostgreSQL store data in structured tables, with rows and columns that help establish relationships between various tables and entities. | ||
SQL is used in PostgreSQL to save, retrieve, access, and manipulate the database data. | ||
While PostgreSQL may be ideal for managing structured data streams, it tends to struggle when dealing with big unstructured data, as maintaining such relations can increase time complexity significantly. | ||
PostgreSQL relies on relational data models that need to be defined in advance. | ||
Changing any field in a table requires writing, running, and maintaining migration scripts. This works, tools are available for it, and we already use them in PrismAgent, | ||
but it raises the maintenance cost of the software. | ||
Contrastingly, non-relational databases like MongoDB excel in these scenarios because data isn't constrained to a single table. | ||
This approach permits massive data streams to be imported directly into the system without the burden of setting up increasingly complex relationships and keys. | ||
MongoDB typically stores data within documents using JSON (JavaScript Object Notation) or BSON (Binary JavaScript Object Notation), which simplifies the handling of big data and complex data streams. | ||
A document database supports a rapid, iterative development cycle because its documents map closely to the objects used in code. | ||
MongoDB is faster at inserts and at queries that use nested references instead of joins. | ||
In Mediator, the data we send and receive are JSON messages, so storing them in a document database lets us avoid unnecessary serialization and deserialization when processing them. | ||
Review comment: Wouldn't using Postgres with a single text/string field accomplish the same thing? What are the read and write paths we need to support: sequential or random access (get by ID or search)? The cost (cognitive load, someone to install it, someone to write migrations and manage it, something to sort it out when it goes wrong) can be high even when using a tool that automates such operations. We still need to understand how to run it locally for development and for self-hosted deployments. |
||
By default, PostgreSQL scales vertically, whereas MongoDB can scale horizontally, which can help with scalability problems. | ||
Storing Mediator messages is a simple, straightforward write with no transactional workflow involved, so we don't gain much by using a relational database like PostgreSQL. | ||
|
||
Below are two options we can use to reduce infrastructure management: | ||
MongoDB Atlas: fully managed MongoDB in the cloud, which can reduce infrastructure management. https://www.mongodb.com/atlas/database | ||
Review comment: How do these reduce the overhead? Is it worth the cost over using an existing DB that's already running and storing JSON blobs? |
Reply: It removes the need to worry about backups, scalability, and cluster management. |
||
Amazon DocumentDB (with MongoDB compatibility) https://aws.amazon.com/documentdb/ | ||
|
||
|
||
## References used | ||
https://www.plesk.com/blog/various/mongodb-vs-postgresql/ | ||
https://www.dbvis.com/thetable/json-vs-jsonb-in-postgresql-a-complete-comparison/ | ||
https://severalnines.com/blog/overview-json-capabilities-within-postgresql/ | ||
Review comment: This link paints a very good picture of the JSON support. My personal view from the information provided is that it would be better to start with a simple Postgres database with a single column for the JSON data and a key that we can do gets on, and then, if we find it's too slow or has other problems, introduce the complexity of another DB. |
Reply: For the Mediator pickup protocol, we need to query the JSON by a key or by another field. In PostgreSQL we won't be able to index the field inside the JSON blob, so if the database grows, Postgres won't be the ideal solution. We have seen the same problem with the messages we store in PrismAgent. |
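The single-column approach proposed in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Mediator's actual schema: SQLite from the Python standard library stands in for PostgreSQL so the sketch runs without a server, and the table name `messages`, the message fields `id` and `to`, and the extra `recipient` column are all hypothetical. The `recipient` column shows one possible way to look messages up by a field other than the key, by copying that field out of the JSON at write time; whether that is sufficient for the pickup protocol is the open question in this thread.

```python
import json
import sqlite3

# Assumption: SQLite (stdlib) stands in for PostgreSQL so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " id TEXT PRIMARY KEY,"      # key used for gets
    " recipient TEXT NOT NULL,"  # hypothetical field copied out of the message
    " body TEXT NOT NULL)"       # the raw JSON message, stored unchanged
)
# Plain index on the copied-out column, so lookups do not parse the JSON blob.
conn.execute("CREATE INDEX idx_messages_recipient ON messages (recipient)")


def store(message: dict) -> None:
    """Store the message as a JSON string next to its key and recipient."""
    conn.execute(
        "INSERT INTO messages (id, recipient, body) VALUES (?, ?, ?)",
        (message["id"], message["to"], json.dumps(message)),
    )


def get_by_id(message_id: str):
    """Random access by key; returns None when the key is unknown."""
    row = conn.execute(
        "SELECT body FROM messages WHERE id = ?", (message_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def find_by_recipient(recipient: str) -> list:
    """Search by the copied-out field, ordered by key."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE recipient = ? ORDER BY id",
        (recipient,),
    ).fetchall()
    return [json.loads(r[0]) for r in rows]


# Hypothetical sample messages.
store({"id": "m1", "to": "did:example:alice", "payload": "hello"})
store({"id": "m2", "to": "did:example:bob", "payload": "hi"})
store({"id": "m3", "to": "did:example:alice", "payload": "again"})
```

Calling `get_by_id("m1")` returns the original message as a dictionary, and `find_by_recipient("did:example:alice")` returns the two messages addressed to that recipient.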
||
https://www.mongodb.com/docs/manual/core/schema-validation/ | ||
https://www.mongodb.com/compare/mongodb-dynamodb | ||
https://www.projectpro.io/article/dynamodb-vs-mongodb/826 | ||
|
||
|
Review comment: The title usually depicts the decision being made. In this case, if the outcome is to be MongoDB, the title could be: use-mongo-db-for-message-storage.md
This way, just the filenames tell part of the story. This should be part of the guidelines, but having read them, they don't make this clear. I'll move them to the handbook and update them.