performance on deployment? #105
Comments
I am not sure how you measure the upload rate. Normally the HTTP request only returns after the artifact has been analyzed, the BLOB has been stored and the channel aggregation has been performed. So I guess the actual transfer rate is much higher, and the average is then lowered by the final wait period, in which the channel is being processed. As the channel gets more and more content, aggregating the channel may take more time. I do know several spots which could be improved, performance-wise; it is simply a matter of time/effort. Most operations (like Maven and P2 access) are optimized so that reading is fast, while modifications might be slower. As a general recommendation: the more the database can cache, the faster access will be. 200 bundles should not be an issue. Modifications on one channel are performed sequentially, so more CPU cores will only help with parallel operations on other channels or with other read operations (web UI, repository adapters).
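To illustrate why the measured upload rate drops as the channel grows, the request handling roughly follows the flow described above. This is only a hedged sketch, not the actual Package Drone code; the interface and method names are made up for illustration:

```java
// Illustrative sketch only -- the real Package Drone upload path differs in detail.
// It shows why the HTTP response is delayed by channel processing: the reply is
// only sent after the (channel-size-dependent) aggregation step has finished.
public class UploadFlowSketch {

    interface BlobStore { String store(byte[] data); }                    // hypothetical
    interface MetadataExtractor { void extract(String blobId); }          // hypothetical
    interface ChannelAggregator { void aggregateAll(String channelId); }  // hypothetical

    private final BlobStore blobs;
    private final MetadataExtractor extractor;
    private final ChannelAggregator aggregator;

    public UploadFlowSketch(BlobStore blobs, MetadataExtractor extractor, ChannelAggregator aggregator) {
        this.blobs = blobs;
        this.extractor = extractor;
        this.aggregator = aggregator;
    }

    public void handleUpload(String channelId, byte[] artifact) {
        String blobId = blobs.store(artifact);   // fast, roughly proportional to file size
        extractor.extract(blobId);               // analyze the single uploaded artifact
        aggregator.aggregateAll(channelId);      // touches every artifact in the channel
        // only now does the HTTP request return, so the apparent KB/s drops
        // as the channel grows, even though the transfer itself was fast
    }
}
```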
I deploy artifacts with maven-deploy (a Groovy script which, for every bundle within a folder, creates a temporary pom file and deploys the artifact together with the generated pom file). If I run the Groovy script on a folder with 20 jar files/bundles, I can see how the deployment becomes slower for every jar/pom file. Jenkins builds which do a single deployment for every Maven project (the target of the project) in a multi-module build also need a large amount of time to run.
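For reference, a bulk deploy loop like the one described could look roughly like the sketch below (the original is a Groovy script; this is a plain Java rendering of the same idea). The coordinates, repository ID and URL are placeholders, not values taken from the issue:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of a bulk deploy loop: for every jar in a folder, write a minimal pom
// and call "mvn deploy:deploy-file". Coordinates and the repository URL are
// placeholders; a real script would derive them from the bundle manifests.
public class BulkDeploy {
    public static void main(String[] args) throws IOException, InterruptedException {
        File folder = new File("bundles");
        File[] jars = folder.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) return; // folder does not exist

        for (File jar : jars) {
            String artifactId = jar.getName().replace(".jar", "");
            Path pom = Files.createTempFile(artifactId, ".pom");
            Files.writeString(pom, String.join("\n",
                "<project xmlns=\"http://maven.apache.org/POM/4.0.0\">",
                "  <modelVersion>4.0.0</modelVersion>",
                "  <groupId>com.example</groupId>",
                "  <artifactId>" + artifactId + "</artifactId>",
                "  <version>1.0.0</version>",
                "  <packaging>jar</packaging>",
                "</project>"));
            // each iteration uploads jar + pom, i.e. two requests per bundle
            new ProcessBuilder("mvn", "deploy:deploy-file",
                    "-Dfile=" + jar.getAbsolutePath(),
                    "-DpomFile=" + pom.toAbsolutePath(),
                    "-DrepositoryId=pdrone",
                    "-Durl=http://pdrone.example.com/maven/my-channel")
                .inheritIO().start().waitFor();
        }
    }
}
```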
Well, this highly depends on your channel configuration and the operations. Some operations require a full channel rebuild, which includes extracting metadata from the BLOBs; this is the slowest operation. The normal "add" operation should be much quicker than a full channel rebuild. In the optimal case it just processes the one artifact being added. But an addition may trigger other operations as well: adding an OSGi bundle will trigger the creation of two additional artifacts if the P2 metadata aspect is enabled, and a full channel aggregation if the P2 repository aspect is enabled. This requires the storage manager to load all artifact information at the end of the operation. Since Maven uploads every file independently, this can trigger a lot of operations, and each one has to wait until the work is complete.

To be honest, I don't have statistical data about performance. Right now there is no facility in Package Drone which measures and records performance, but this should be one of the next steps, in order to decide which areas need performance improvements. I guess one performance "bug" is the loading of artifacts when processing the channel. For example, one step is to check generator artifacts for regeneration. Right now all artifacts are loaded from the database (without metadata, though) and checked, discarding the irrelevant entries in the Java code. This could be done by a proper JPA query instead.

So, guessing again what helps most, I would say configure PostgreSQL so that it can cache the read access to artifacts. Again, right now proper JPA queries are missing, so in most cases all channel artifacts are read.
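A sketch of what such a JPA query could look like, filtering generator artifacts in the database instead of loading every artifact of the channel and discarding the rest in Java. The entity and field names are assumptions for illustration, not the actual Package Drone schema:

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.TypedQuery;

// Hypothetical example: push the filtering into the database instead of doing it in Java.
// The entity and its fields are made-up names, not the real Package Drone schema.
@Entity
class ArtifactEntity {
    @Id Long id;
    String channelId;
    String generatorId; // null for artifacts that were not created by a generator
}

public class GeneratorArtifactQuery {
    // Return only the artifacts of one channel that have a generator attached,
    // instead of loading every channel artifact and filtering in Java code.
    public List<ArtifactEntity> findGeneratorArtifacts(EntityManager em, String channelId) {
        TypedQuery<ArtifactEntity> q = em.createQuery(
            "SELECT a FROM ArtifactEntity a"
            + " WHERE a.channelId = :channelId AND a.generatorId IS NOT NULL",
            ArtifactEntity.class);
        q.setParameter("channelId", channelId);
        return q.getResultList();
    }
}
```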
Ok, so we are getting close to the reason for my performance problems. I use channels with the P2 metadata generation, OSGi and P2 repository aspects. So I think I run into the full channel aggregation after each deployment.
I am planning to make the milestone 2 release today, including the fix for P2. After that I will start to add some functionality for performance measurement, so that we can actually measure and don't need to guess ;-)
So I added a little bit of tracing and ran the stress tests again, which deploy bundles to channels that already contain some bundles. About 99% of the time is consumed by running the channel aggregator, which is run twice for each Maven upload (jar + pom). And about 60% of that time is spent scanning for artifacts in order to aggregate. I will dig a bit more into this; just to let you know that the first assumption (that the database operations might be the issue) seems correct.
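The tracing mentioned here could be as simple as a timer around the aggregation step. The sketch below is a minimal, assumed approach, not the profiling code that was actually added to Package Drone:

```java
import java.util.function.Supplier;

// Minimal timing sketch. Not the actual Package Drone profiling code -- just the
// kind of measurement that shows where the time of an upload request is spent.
public class Timed {
    public static <T> T measure(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("%s took %d ms%n", label, millis);
        }
    }

    public static void main(String[] args) {
        // usage example with a dummy task standing in for the channel aggregation
        Timed.measure("channel aggregation", () -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return null;
        });
    }
}
```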
Thanks for the information. So we can hope for some improvements (or better guidance on the PostgreSQL configuration) in the future...
Yes, absolutely. However, I am not sure what the time frame for this will be. The more I look at it, the more I think it might be a good idea to actually cache a few things in Package Drone itself. But such a change would require some deeper changes, and I don't want to make them in the 0.10.x version.
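Such an application-side cache could be as simple as a map from channel ID to the already-loaded artifact list, invalidated on every modification. The sketch below is only an assumed illustration of the idea, not a design that exists in Package Drone:

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Assumed illustration of an in-process cache for per-channel artifact lists.
// Reads reuse the cached list; every modification of a channel invalidates its
// entry, so the next aggregation rebuilds it from the database.
public class ChannelArtifactCache<A> {
    private final ConcurrentHashMap<String, List<A>> cache = new ConcurrentHashMap<>();

    public List<A> artifactsOf(String channelId, Function<String, List<A>> loadFromDatabase) {
        return cache.computeIfAbsent(channelId, loadFromDatabase);
    }

    public void invalidate(String channelId) {
        cache.remove(channelId); // call this after every add/remove on the channel
    }
}
```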
I see that you have added some profiling/monitoring functionality. So in the future I can see the results of configuration changes in PostgreSQL. This can help with optimizing the PostgreSQL settings.
Just an idea, in case Alexander needs a workaround.

So, as a workaround (only), someone who deploys multiple bundles could do the following: suspend the channel aggregation before the bulk deployment, upload all the bundles, and then resume/trigger the aggregation once afterwards.
Very good idea. I would like to see some functionality like freeze and thaw, or suspend and resume like you described. Actually, that second operation could be scheduled automatically after some time, or requested externally. Requesting it twice should result in a no-op and be cheap, so before starting a new build you could always request it. However, during a build it might be problematic if someone re-downloads artifacts which were just uploaded. So it should be an opt-in functionality, as you described!
Yes, we should keep it as an option that someone could use. We should not break stuff by adding a new option. I thought about the same issue (if the build process downloads something that was uploaded by the build process). This could also be solved, but could result in a lot of work.

I think for a first shot, disable / enable / trigger would be a good choice for the effort-benefit ratio.
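A rough idea of what such an opt-in disable / enable / trigger switch could look like; this is purely hypothetical and not an API that exists in Package Drone. Note that the resume/trigger operation is idempotent, as suggested above:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Purely hypothetical sketch of a per-channel "suspend/resume aggregation" switch.
// Package Drone has no such API; this only illustrates the idea discussed above.
public class AggregationSwitch {
    private final Set<String> suspended = ConcurrentHashMap.newKeySet();

    /** Stop aggregating the channel; additions only store the artifacts. */
    public void suspend(String channelId) {
        suspended.add(channelId);
    }

    /** Resume and run one aggregation. Calling this twice is a cheap no-op. */
    public void resume(String channelId, Runnable aggregateOnce) {
        if (suspended.remove(channelId)) {
            aggregateOnce.run(); // aggregate once for the whole bulk upload
        }
    }

    public boolean isSuspended(String channelId) {
        return suspended.contains(channelId);
    }
}
```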
However, what I actually want to do is fix those performance bugs :-) I already got one. This will delay the 0.10.x release a little bit, but it brings enormous speed improvements to the system even without caching, so I think it is worth it. Another one is already being worked on as well; hopefully Monday, since I already head out for the weekend today :-) A third, and the last one for my test case, might be a bit more tricky, since it involves not only the database but also the blob store. However, I have not looked into this one yet, so maybe there is a simple fix for it as well. The most important step was to add profiling. It shows where the problems are and allows fixing them.
Whatever you want. ;-) I already stated:
If you could solve the performance issues without that workaround: nice ;-)
Actually I want both :-) A function like that always comes in handy at some point!
I just uploaded the release 0.10.0-m3, which contains huge performance improvements. Since it also contains some other changes, it might be a good idea to keep a copy of 0.10.0-m2 😉 But you can switch between m2 and m3 as you like.
I have been testing the new milestone for several hours, so far without new problems.
Performance seems to be better, but I have some problems with the generated content.xml and artifacts.xml for a channel.
OK, I suggest you switch back. If you have any additional information I would be glad to have it. I will have a look at it tomorrow; hopefully I can wrap it up in a test case.
I returned to 0.10.0-m2.
Yes, this can happen, since the jar files are only stored temporarily at first and added to the channel at a later time, when the full information from the Maven upload is present.
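Presumably this works along the lines of the sketch below: the jar is parked under its coordinates, and the channel addition happens only once the accompanying pom has arrived. This is an assumed illustration, not the actual Package Drone implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Assumed illustration of deferring the channel addition until a Maven upload is
// complete: the jar is parked under its coordinates and only added to the channel
// once the matching pom (i.e. the full information) has arrived.
public class MavenUploadStaging {
    public interface ChannelSink {
        void add(String coordinates, byte[] jar, byte[] pom);
    }

    private final Map<String, byte[]> pendingJars = new HashMap<>();

    public void onJarUploaded(String coordinates, byte[] jar) {
        pendingJars.put(coordinates, jar); // stored temporarily, not yet in the channel
    }

    public void onPomUploaded(String coordinates, byte[] pom, ChannelSink channel) {
        byte[] jar = pendingJars.remove(coordinates);
        if (jar != null) {
            channel.add(coordinates, jar, pom); // now the artifact shows up in the channel
        }
    }
}
```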
On my Package Drone instance (0.10.0-m2) I deploy concurrently with Maven from multiple Jenkins jobs into the same channel.
How many artifacts are in this channel? Are there any errors on the console or in the database log?
In the console log there are only exceptions about aborted connections (because the Maven deployment aborts). The database table "ARTIFACTS" contains 10,000 rows, spread across different channels.
With the new version 0.10.0-m6 (improved cleanup aspect) the deployment of 52 artifacts (going from 1733 to 1785 artifacts in one channel) takes 6:46 min. This is OK for me and a huge improvement over older versions. Thanks for the new statistics information!
I have created an empty channel and fill it sequentially (in a loop) with a Groovy-based script using normal Maven deploy operations with many OSGi bundles. The first artifacts are deployed quickly, but after some deploys (approx. 30-40) the deployment becomes slower and slower. Especially the deployment of the pom file (generated by Groovy) or other small files is very slow (<5 KB/sec).
Is this a problem with the internal pdrone caches or with the PostgreSQL database?
At the start (empty channel) I get upload rates of >500 KB/sec; with a filled channel the upload rate is 0.5-100 KB/sec (higher for bigger files).
What are your experiences with heavily filled channels (>200 bundles)? Or is this a problem with deferred processing in the background (queues)? What do you recommend regarding memory (JVM) and CPU cores?