API: Extremely Poor Docker Resource Utilization Efficiency #2730

palisadoes · 2024-12-02T22:39:18Z

Describe the bug

We run a demonstration instance of Talawa-API on a GoDaddy VPS server running Ubuntu. It has the following resources:

1 core
2 GB of RAM
40 GB of disk

Other information:

The demo instance is intended to create an evaluation environment for new GitHub contributors and users alike as they decide to use Talawa. The DB of the demo instance gets reset every day.
Talawa API runs natively on this VPS server with acceptable performance with one user. The load average is approximately 1, which is the target value for a system with only 1 core.
When Talawa API runs on the server using Docker, the load average reaches 130 and the swap process is the top CPU consumer. The system is so overloaded that only one SSH session at a time is achievable.
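
The comparison of load average against core count above can be reproduced with a short script. This is only a minimal sketch: the helper name `load_per_core` is hypothetical, and `os.getloadavg()` is available on Unix-like systems only.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the core count.

    A value near 1.0 means the cores are fully busy; values far above
    1.0 mean work is queuing up.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


if __name__ == "__main__":
    # Read the 1, 5 and 15 minute load averages from the host.
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"1-min load per core: {load_per_core(one_min, cores):.2f}")
```

With the figures reported here, `load_per_core(130, 1)` is 130 times the target of 1 for a single-core host.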

The purpose of this issue is to find ways to tune all Talawa-API Dockerfile and app configurations to lower its CPU and RAM utilization by at least 75%.

With the current Docker performance, very few developers or end users will want to try Talawa themselves.
This has been a recurring issue with Talawa API. The poor performance threatens the success of our current MongoDB based MVP.

To Reproduce
Steps to reproduce the behavior:

Run Talawa-API on a system
See excessive resource utilization

Expected behavior

Acceptable resource usage, such that it can run easily on a mid-range laptop without impacting its performance

Actual behavior

Poor performance

Potential internship candidates

Please read this if you are planning to apply for a Palisadoes Foundation internship

Student Internship Programs talawa#359

prayanshchh · 2024-12-03T07:07:05Z

Can you please assign this to me? I want to work on this issue, but I will need guidance.

varshith257 · 2024-12-03T08:11:27Z

This is mostly related to reducing the Docker image size.

prayanshchh · 2024-12-06T03:19:55Z

Different ways to approach this issue:

1. Multi-Stage Builds
Using a multi-stage build can help separate the build and runtime environments, ensuring that only production-ready artifacts are included in the final image. This can be achieved by:

Installing dependencies and building the application in the first stage.
Copying only the necessary files (e.g., dist, node_modules) into a minimal runtime stage.

2. Optimizing Base Images
Switching to optimized base images can dramatically reduce size:

Baseline Image (Full Node.js): ~900 MB
Using Multi-Stage with Slim: ~400–500 MB
Using Multi-Stage with Alpine: ~250–300 MB
With Distroless: ~150–200 MB

3. Using Compression Tools
Tools like docker-slim can further compress the final image by analyzing and stripping unused dependencies and files:
With docker-slim: ~100–150 MB.

Please suggest a method that doesn't impact compatibility with the codebase.

palisadoes · 2024-12-06T03:30:15Z

@prayanshchh

Please investigate the best solution and propose it after testing on your system. It's not just RAM, but also ways to reduce the CPU overhead.

prayanshchh · 2024-12-06T04:01:32Z

alright sir

vasujain275 · 2024-12-07T17:35:50Z

@palisadoes @prayanshchh

The main problem I found with the API is that we have to run it in dev mode in the production Docker environment because our build process for the Talawa API is broken, so we can't use npm run start. If we resolve the build issue, we can drastically improve performance and security of the docker container.

I think @varshith257 also tried to solve the build process issue a few months back. Any updates on that?

palisadoes · 2024-12-07T18:00:00Z

Would this PR by @adithyanotfound provide any insights?

Refactored DockerFile to improve efficiency talawa-admin#2607

palisadoes · 2024-12-07T18:00:59Z

@vasujain275 Why do you say the build process is broken? Can you create an issue for someone else to try to fix it?

prayanshchh · 2024-12-07T19:31:20Z

> Would this PR by @adithyanotfound provide any insights?
>
> Refactored DockerFile to improve efficiency talawa-admin#2607

Yes, this helps. I will start work on this in two days; I have my end-of-semester exams.

prayanshchh · 2024-12-14T07:53:10Z

I am unassigning myself from the issue due to lack of progress.

PurnenduMIshra129th · 2024-12-14T19:36:21Z

@palisadoes Please assign this to me.

PurnenduMIshra129th · 2024-12-17T18:30:30Z

@palisadoes What is the load average when the API runs without Docker, i.e. what is the baseline performance? I need this because I want to focus only on improving the Docker performance. Otherwise I will have to use a profiler to determine whether the issue is in the Docker container or in the code, such as an unoptimized query.

PurnenduMIshra129th · 2024-12-17T19:00:29Z

@palisadoes For now I have limited its CPU and memory usage. I also added a multi-stage build and used a lightweight image. But I think this will only handle up to a certain number of users. To handle the load effectively, can I use Kubernetes or another service? It would scale the pods as the load increases, reduce the CPU usage and improve the performance. If not, can the VPS server where the container is hosted provide such a mechanism? One more question: how do I put more load on this API, given that I am the only user at the time of testing?

vasujain275 · 2024-12-17T19:08:47Z

> @palisadoes For now I have limited its CPU and memory usage. I also added a multi-stage build and used a lightweight image. But I think this will only handle up to a certain number of users. To handle the load effectively, can I use Kubernetes or another service? It would scale the pods as the load increases, reduce the CPU usage and improve the performance. If not, can the VPS server where the container is hosted provide such a mechanism?

1. We don't need k8s.
2. Multi-stage builds and a lightweight base image will not help; we already have multi-stage builds with Alpine images. The main issue is our build process.
3. @palisadoes Due to my end-of-semester exams I am not able to create the GraphQL build error issue right now, which is the main performance blocker on this. I will get to it in 2-3 days once my exams end. Sorry for the delay.
4. I think we should close the Docker performance related issues as they create unnecessary confusion. Our Docker images are well optimised. The main issue is that we are running our API in dev mode in them. Once the build is fixed we can modify the Dockerfiles to see the performance improvements.

PurnenduMIshra129th · 2024-12-17T19:20:09Z

I don't understand what you mean by a build-related issue. Are you saying that unnecessary node modules or something similar are included when building the Docker image, and that this is causing the issue? I need further clarity. Also, above you commented that you are not able to run npm run start, but it is working fine because the API service is starting.

palisadoes · 2024-12-17T20:25:01Z

> @palisadoes For now I have limited its CPU and memory usage. I also added a multi-stage build and used a lightweight image. But I think this will only handle up to a certain number of users. To handle the load effectively, can I use Kubernetes or another service? It would scale the pods as the load increases, reduce the CPU usage and improve the performance. If not, can the VPS server where the container is hosted provide such a mechanism?
>
> 1. We don't need k8s.
> 2. Multi-stage builds and a lightweight base image will not help; we already have multi-stage builds with Alpine images. The main issue is our build process.
> 3. @palisadoes Due to my end-of-semester exams I am not able to create the GraphQL build error issue right now, which is the main performance blocker on this. I will get to it in 2-3 days once my exams end. Sorry for the delay.
> 4. I think we should close the Docker performance related issues as they create unnecessary confusion. Our Docker images are well optimised. The main issue is that we are running our API in dev mode in them. Once the build is fixed we can modify the Dockerfiles to see the performance improvements.

OK.

PurnenduMIshra129th · 2024-12-17T21:02:01Z

@palisadoes I ran a load test on the server with Docker and without Docker. With a duration of 30 sec at 2 req/sec (a total of 60 requests in 30 sec), both have an equal success rate. When I ran the same test for the same duration at 5 req/sec (150 requests in 30 sec), the server without Docker performed slightly better. However, the server can't handle 150 requests in 30 sec: many of the requests were still being processed and did not complete, and only 40 of them were successful. If you want to run Docker on a low-end server for a small user base, say 50 to 60 requests in 60 sec (considering a mediocre device with 4 GB of RAM and 4 cores), it will handle the requests easily if talawa-api reduces its CPU-intensive tasks. If we also limit the CPU usage it will still cope, but with some slowness. What do you say?
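
The request counts and success rate quoted above can be checked with a few lines of arithmetic. This is a minimal sketch only; the function names are hypothetical and it does not reproduce the load-testing tool that was actually used.

```python
def total_requests(rate_per_sec: float, duration_sec: float) -> int:
    """Number of requests sent at a constant rate over the test duration."""
    return round(rate_per_sec * duration_sec)


def success_rate(successful: int, total: int) -> float:
    """Fraction of requests that completed successfully."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


light = total_requests(2, 30)  # the 2 req/sec run
heavy = total_requests(5, 30)  # the 5 req/sec run
print(f"light run: {light} requests, heavy run: {heavy} requests")
print(f"heavy run success rate: {success_rate(40, heavy):.1%}")
```

With 40 successful requests out of 150, roughly three quarters of the requests in the heavier run did not complete.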

palisadoes · 2024-12-20T21:46:33Z

@PurnenduMIshra129th please coordinate with @vasujain275

There appear to be multiple causes. The application is clearly overusing resources.

Here is additional information.

Cloud Based API Instance for Developers #1428 (comment)

PurnenduMIshra129th · 2024-12-21T19:38:33Z

@vasujain275 Yes, you are correct, the build process is broken. After the build it does not work properly. Also, when I try to run npm run prod it does not run and gives multiple errors. Do you have any thoughts on this? Should we use import instead of require?

bandhan-majumder · 2025-01-30T03:21:47Z

@palisadoes is there anything I can help with?

palisadoes · 2025-01-31T12:18:38Z

We need to focus on the app performance which is causing docker to appear to be slow.

@PurnenduMIshra129th the bare minimum needs to be done to get the demo instance usable on the cloud server. Please coordinate with @vasujain275

gautam-divyanshu · 2025-01-31T16:12:20Z

@vasujain275 @PurnenduMIshra129th What's the status?

palisadoes · 2025-01-31T16:29:57Z

I need a volunteer to take this issue over as @vasujain275 doesn't seem to be available.
- Cloud Based API Instance for Developers #1428
The API needs to run under the talawa-api user and Admin needs to run under the talawa-admin user
The host is api-demo.talawa.io which is the same OS instance as admin-demo.talawa.io
Whoever is interested in working on this should contact me on Slack and post their public SSH key so they can log in to the server. They will also get sudo access.
This absolutely needs to be resolved this weekend. We are highlighting the demo as part of our GSoC 2025 application, as justifiable validation of our progress.

PurnenduMIshra129th · 2025-01-31T16:34:38Z

@palisadoes @varshith257 If you provide some guidance then I can give it a try.

palisadoes · 2025-01-31T16:47:44Z

> @palisadoes @varshith257 If you provide some guidance then I can give it a try.

The guidance is clearly defined in the issue.

PurnenduMIshra129th · 2025-01-31T16:51:18Z

@palisadoes Let me first set up the GoDaddy server on my machine.

palisadoes · 2025-01-31T17:21:39Z

You can't. You don't have access to the server in question and its current configuration. You need login access to do so. This is what needs to be done.

API: Extremely Poor Docker Resource Utilization Efficiency #2730 (comment)

PurnenduMIshra129th · 2025-01-31T17:39:29Z

@palisadoes I think I need some credentials to log in to the server. Can I get the credentials so that I can try? My steps would be: first I will create the Docker image in the production environment, then I will set its resource limits to the minimum it needs to perform. If applicable, I can then look at any other changes that would be helpful.

palisadoes · 2025-01-31T17:41:01Z

> @palisadoes I think I need some credentials to log in to the server. Can I get the credentials so that I can try?

Send me your SSH public key and the username you require on Slack.

varshith257 · 2025-01-31T18:05:02Z

@palisadoes I think the deployment of what @vasujain275 shared with me is done. All that is left is replacing the dev image with the prod image, and then we are good to go, I guess.

PurnenduMIshra129th · 2025-01-31T18:24:43Z

@gautam-divyanshu I now have access to the server, so I am trying to work out what to do. I don't have much of an idea about this.

PurnenduMIshra129th · 2025-01-31T18:26:45Z

@palisadoes @varshith257 Yes, it is running on the server, but I can see that its memory limit is still set to the full amount. See the screenshot, in particular the limit column.

palisadoes · 2025-01-31T20:24:36Z

@PurnenduMIshra129th

Is this the develop branch running on the server? That is known to work.
Also, is there any room left for talawa-admin to run in docker too? We cannot have the apps consuming all available resources.
Is it set up to:
1. Reload the sample DB every day?
2. Reload the app whenever there are updates to the develop branch?

These are fundamental questions to get the app running.

PurnenduMIshra129th · 2025-01-31T23:31:02Z

@palisadoes Docker is currently running on the server, but I can see that no limit is set there. For now talawa-api has access to all the RAM and CPU the system has, so we have to set the limits in its compose file and deploy again. The load is also not too high at the moment. Once we update the limits on the API side we can run the admin. Since the load is low right now we could run Talawa-admin already, but it is better to first set the resource limits on both sides and then deploy. After we deploy both, we can benchmark them as suggested by @varshith257.

I have not checked the sample DB reloading or the updating from the develop branch.
I am currently writing test cases, so I am a bit busy there.

See the limit column: it has access to all of our available RAM, so in a critical condition it will use all of it. The swap process will then start managing memory, and eventually the load will increase if multiple users connect at the same time.
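
The check described here, comparing the limit column against the host's total RAM, could be scripted roughly as follows. This is a sketch under assumptions: it expects the "MEM USAGE / LIMIT" text that `docker stats` prints (for example via `docker stats --no-stream --format "{{.MemUsage}}"`), the sample values are hypothetical, and the helper names are made up for illustration.

```python
import re
from typing import Tuple

# Binary units as printed in the docker stats memory column.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def parse_size(text: str) -> float:
    """Convert a size such as '512MiB' into bytes."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(B|KiB|MiB|GiB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def parse_mem_usage(column: str) -> Tuple[float, float]:
    """Split 'USED / LIMIT' into (used_bytes, limit_bytes)."""
    used, limit = column.split("/")
    return parse_size(used), parse_size(limit)


def limit_looks_unset(column: str, host_total_bytes: float,
                      tolerance: float = 0.05) -> bool:
    """Return True when the limit is within `tolerance` of the host's RAM,
    which suggests no explicit memory limit was configured."""
    _, limit = parse_mem_usage(column)
    return limit >= host_total_bytes * (1 - tolerance)


# Hypothetical readings for a host with about 1.93 GiB of usable RAM.
host_ram = 1.93 * 1024 ** 3
print(limit_looks_unset("850MiB / 1.93GiB", host_ram))  # limit equals host RAM
print(limit_looks_unset("300MiB / 512MiB", host_ram))   # explicit 512 MiB limit
```

A container whose limit matches the host's RAM can grow until the host starts swapping, which is the failure mode reported at the top of this issue.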

VanshikaSabharwal · 2025-02-01T04:06:36Z

Can I work on this issue, @palisadoes @varshith257, so that it gets solved faster?

palisadoes · 2025-02-01T07:25:31Z

That means the docker file in develop is configured in a suboptimal way. It's probably true in Admin too.

PurnenduMIshra129th · 2025-02-01T07:47:41Z

@palisadoes Yes, it is true.

palisadoes · 2025-02-01T15:55:00Z

@PurnenduMIshra129th

Since they need to be updated, can you lead that effort?
Both develop* branches in admin will need to be updated

PurnenduMIshra129th · 2025-02-01T16:12:41Z

@palisadoes OK, I will do it.

palisadoes · 2025-02-01T18:40:28Z

@VanshikaSabharwal has got access to the server. Please coordinate with her.
Create PRs to fix the resource allocation against this issue for all affected branches.

PurnenduMIshra129th · 2025-02-01T19:20:59Z

@palisadoes OK, I will contact her on Slack.

adithyanotfound · 2025-02-02T03:21:09Z

Refactored DockerFile to improve efficiency talawa-admin#2607

I had previously made a PR that optimized the admin for production, but I can’t find the changes in the codebase. Were they reverted or overwritten?

palisadoes · 2025-02-02T04:25:04Z

@adithyanotfound

They were. We overwrote the develop-postgres branch on top of develop. This was because we had cloned develop-postgres from develop and stopped updates on develop at the same time. Your PR must have been missed in the review period.
Can you reapply the changes?

I'm sorry this happened. There are a lot of moving pieces to manage, and I missed this.

adithyanotfound · 2025-02-02T04:42:26Z

@palisadoes No worries! I’ll reapply the changes and open a new PR shortly.

PurnenduMIshra129th · 2025-02-02T07:04:54Z

@palisadoes @adithyanotfound Can you tell me what the changes were, or which file they were in?

PurnenduMIshra129th · 2025-02-02T07:07:30Z

@palisadoes To start talawa-api on the server, do we only have to start the container, or is there anything else we have to do after that?

palisadoes · 2025-02-02T13:50:31Z

@PurnenduMIshra129th

@VanshikaSabharwal and I managed to set up the API on the server using the development environment.

We created a /etc/cron.d/talawa-api file that explains the process

Please coordinate with her.

We need better online documentation on this too.

palisadoes · 2025-02-02T13:51:48Z

Ideally, we should be using production and not develop instances for the API and Admin on the server. Please try to get that working.

PurnenduMIshra129th · 2025-02-02T16:05:33Z

@palisadoes OK, I will complete that soon.

palisadoes added the bug Something isn't working label Dec 2, 2024

github-actions bot added feature request unapproved Unapproved for Pull Request labels Dec 2, 2024

varshith257 removed the unapproved Unapproved for Pull Request label Dec 3, 2024

varshith257 assigned prayanshchh Dec 3, 2024

palisadoes mentioned this issue Dec 3, 2024

Cloud Based API Instance for Developers #1428

Open

palisadoes changed the title ~~Extremely Poor Docker Resource Utilization Efficiency~~ API: Extremely Poor Docker Resource Utilization Efficiency Dec 4, 2024

palisadoes added good first issue Good for newcomers and removed feature request labels Dec 4, 2024

prayanshchh removed their assignment Dec 14, 2024

Cioppolo14 assigned PurnenduMIshra129th Dec 14, 2024
