docs: ADR for mediator-message-storage [skip ci] #530

Merged on May 24, 2023 (5 commits)
Changes from 3 commits
38 changes: 38 additions & 0 deletions docs/decisions/20230515-mediator-message-storage.md
@@ -0,0 +1,38 @@
# Mediator message storage

Review comment: The title usually states the decision being made. In this case, if the outcome is to use MongoDB, the file could be named use-mongo-db-for-message-storage.md.

That way the filenames alone tell part of the story. This should be part of the guidelines, but having read them, they don't make it clear. I'll move them to the handbook and update them.


- Status: draft
- Deciders: Yurii Shynbuiev, Benjamin Voiturier, Shailesh Patil, Fabio Pinheiro, David Poltorak
- Date: 2023-05-09
- Tags: storage, db, message, mongo, postgres, sql

## Context and Problem Statement
Mediator storage
Relational databases like PostgreSQL store data in structured tables, with rows and columns that help establish relationships between various tables and entities.
PostgreSQL uses SQL to store, retrieve, and manipulate data.
While PostgreSQL is well suited to structured data, it tends to struggle with large volumes of unstructured data, because maintaining relations between tables can significantly increase processing time.
PostgreSQL relies on relational data models that need to be defined in advance.
Changing any field in a table requires writing, running, and maintaining migration scripts.
This works, tooling is available, and we already use it in PrismAgent, but it raises the maintenance cost of the software.
In contrast, non-relational databases like MongoDB excel in these scenarios because data isn't constrained to a single table.
This approach permits massive data streams to be imported directly into the system without the burden of setting up increasingly complex relationships and keys.
MongoDB stores data as JSON-like documents, encoded internally as BSON (Binary JSON), which simplifies the handling of big data and complex data streams.
A document database supports a rapid, iterative development cycle because documents map directly onto the objects used in application code.
MongoDB is typically faster at inserts and at queries that read nested (embedded) data instead of using joins.
In the Mediator, the data we send and receive are JSON messages, so storing them in a document database lets us avoid unnecessary serialization and deserialization when processing them.
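
As a rough illustration of the point about avoiding extra serialization, the sketch below contrasts keeping a parsed message whole with flattening it into table-like rows. The message shape, the field names, and both helper functions are hypothetical, and plain Python containers stand in for the databases.

```python
import json

# Hypothetical message; the field names are illustrative, not the Mediator's schema.
RAW_MESSAGE = json.dumps({
    "id": "msg-1",
    "type": "example/forward",
    "to": ["did:example:alice"],
    "body": {"next": "did:example:bob", "attachments": [{"id": "a1"}]},
})


def store_as_document(collection: dict, raw_message: str) -> str:
    """Document style: parse once and keep the nested structure intact."""
    doc = json.loads(raw_message)
    collection[doc["id"]] = doc
    return doc["id"]


def store_as_rows(messages: list, recipients: list, raw_message: str) -> str:
    """Relational style: split the message across two table-like lists.

    The nested body has no column of its own here, so it is serialized again.
    """
    doc = json.loads(raw_message)
    messages.append((doc["id"], doc["type"], json.dumps(doc["body"])))
    for recipient in doc["to"]:
        recipients.append((doc["id"], recipient))
    return doc["id"]


collection, messages, recipients = {}, [], []
store_as_document(collection, RAW_MESSAGE)
store_as_rows(messages, recipients, RAW_MESSAGE)

# Nested fields are directly reachable in the document form ...
next_hop_doc = collection["msg-1"]["body"]["next"]
# ... but need another parse step in the flattened form.
next_hop_row = json.loads(messages[0][2])["next"]
```

Both forms hold the same information; the difference is how many conversion steps sit between the stored record and the message the application works with.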

Review comment: Wouldn't using Postgres with a single text/string field accomplish the same thing? What are the read and write paths we need to support: sequential or random access (get by ID or search)?

The cost (cognitive load, someone to install it, someone to write migrations and manage it, something to sort it out when it goes wrong) can be high even when using a tool that automates such operations. We still need to understand how to run it locally for development and for self-hosted setups.

PostgreSQL by default scales vertically, whereas MongoDB can scale horizontally, which can help with scalability.
Storing Mediator messages is a simple, straightforward write with no transactional workflow involved, so we don't gain much from a relational database like PostgreSQL.
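
Horizontal scaling usually means partitioning data across several nodes. The sketch below shows one common idea, hash-based routing on a key, in plain Python. It is not how MongoDB's sharding is implemented; the shard count, the choice of the recipient as the key, and the function name are assumptions made for illustration.

```python
import hashlib


def shard_for(recipient: str, shard_count: int) -> int:
    """Map a recipient key to a shard index using a stable hash."""
    digest = hashlib.sha256(recipient.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical cluster of three shards, each modeled as a list of messages.
shards = [[] for _ in range(3)]
for recipient in ["did:example:alice", "did:example:bob", "did:example:alice"]:
    shards[shard_for(recipient, len(shards))].append({"to": recipient})
```

Each write touches exactly one shard, which fits a workload of independent writes with no cross-record transactions.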

Below are two options that can reduce infrastructure management:
MongoDB Atlas: fully managed MongoDB in the cloud. https://www.mongodb.com/atlas/database

Review comment: How do these reduce the overhead? Is it worth the cost compared with using an existing DB that's already running and storing JSON blobs?

@mineme0110 (Contributor Author) commented on May 24, 2023

Amazon DocumentDB (with MongoDB compatibility) https://aws.amazon.com/documentdb/


## References used
https://www.plesk.com/blog/various/mongodb-vs-postgresql/
https://www.dbvis.com/thetable/json-vs-jsonb-in-postgresql-a-complete-comparison/
https://severalnines.com/blog/overview-json-capabilities-within-postgresql/

Review comment: This link paints a very good picture of PostgreSQL's JSON support. From the information provided, my personal view is that it would be better to start with a simple Postgres database that has a single column for the JSON data and a key we can do gets on. Then, if we find it's too slow or has other problems, we can introduce the complexity of another DB.
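
The approach suggested in this comment can be sketched as a two-column table. The example uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server; the table name, column names, and helper functions are hypothetical, and a PostgreSQL version would differ in driver and column types.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, body TEXT NOT NULL)")


def put_message(conn: sqlite3.Connection, message: dict) -> None:
    """Store the whole message as JSON text under its id."""
    conn.execute(
        "INSERT INTO messages (id, body) VALUES (?, ?)",
        (message["id"], json.dumps(message)),
    )


def get_message(conn: sqlite3.Connection, message_id: str):
    """Fetch one message by key; returns None when the id is unknown."""
    row = conn.execute(
        "SELECT body FROM messages WHERE id = ?", (message_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


put_message(conn, {"id": "msg-1", "to": "did:example:alice", "body": {"text": "hi"}})
fetched = get_message(conn, "msg-1")
```

This covers the get-by-key path only; lookups on fields inside the stored JSON are a separate question.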


Author reply: For the Mediator pickup protocol, we need to query the JSON by key and by other fields. In PostgreSQL we won't be able to index a field inside the JSON blob, so if the database grows, Postgres won't be the ideal solution. We have seen the same problem with the messages we stored in PrismAgent.
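
The access pattern raised in this reply, looking messages up by a field inside the stored JSON rather than by the primary key, can be pictured as follows. The sketch uses Python's sqlite3 module as a stand-in, with a hypothetical table and a hypothetical `to` field, and it treats the body as opaque text, which is the situation the reply describes. Without an index on that field, every row has to be read and parsed.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
for i, recipient in enumerate(["alice", "bob", "alice", "carol"]):
    message = {"id": f"msg-{i}", "to": f"did:example:{recipient}"}
    conn.execute(
        "INSERT INTO messages VALUES (?, ?)", (message["id"], json.dumps(message))
    )


def pickup_by_scan(conn: sqlite3.Connection, recipient: str):
    """Return (matching messages, rows parsed) using a full table scan."""
    matches, parsed = [], 0
    for (body,) in conn.execute("SELECT body FROM messages ORDER BY id"):
        message = json.loads(body)
        parsed += 1
        if message["to"] == recipient:
            matches.append(message)
    return matches, parsed


found, rows_parsed = pickup_by_scan(conn, "did:example:alice")
```

Here two messages match but all four rows are parsed, so the work grows with the size of the table rather than with the number of matches.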

https://www.mongodb.com/docs/manual/core/schema-validation/
https://www.mongodb.com/compare/mongodb-dynamodb
https://www.projectpro.io/article/dynamodb-vs-mongodb/826