Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

publishRecursive() performance #450

Open
2 tasks done
mfendeksilverstripe opened this issue Jun 6, 2024 · 1 comment
Open
2 tasks done

publishRecursive() performance #450

mfendeksilverstripe opened this issue Jun 6, 2024 · 1 comment

Comments

@mfendeksilverstripe
Copy link
Contributor

mfendeksilverstripe commented Jun 6, 2024

Module version(s) affected

1.13.7

Description

Publish recursive performance seems to have some gaps. We started replacing this method with something more performant and also by using asynchronous processing such as queued jobs. Most notable issues we found are:

  • long execution time
  • causing DB write issues such as timeouts and deadlocks (likely caused by the item above)

I've performed a benchmark on a content page that contains 21 content blocks, some with nested models and some with assets.

Benchmark for standard publishRecursive()

  • 5.33 seconds - First time publish (no models have live version)
  • 4.02 seconds - Single model modified on draft & published via page

Benchmark for customised publish action

  • 3.03 seconds - First time publish (no models have live version)
  • 1.10 seconds - Single model modified on draft & published via page

We found that the customised approach is much more resilient to DB deadlocks too. Processing thousands of queued jobs that execute this type of publish is fast and has almost no issues.

The performance gap is even wider if you go to pages with 100+ content blocks and if you introduce parallel processing via queued jobs.

How to reproduce

<?php

namespace App\Tasks;

use App\Pages\FeaturePage;
use Exception;
use Page;
use SilverStripe\Control\HTTPRequest;
use SilverStripe\Dev\BuildTask;
use SilverStripe\ORM\DataObject;
use SilverStripe\ORM\FieldType\DBDatetime;
use SilverStripe\Versioned\RecursivePublishable;
use SilverStripe\Versioned\Versioned;
use TractorCow\Fluent\Model\Locale;
use TractorCow\Fluent\State\FluentState;

class PerformantPublishTask extends BuildTask
{
    private static string $segment = 'performant-publish-task';

    /**
     * @var string
     */
    protected $title = 'Publish recursive benchmark task';

    /**
     * @var string
     */
    protected $description = 'Benchmark the performance of publish recursive and potentially provide better options.';

    /**
     * @param HTTPRequest|mixed $request
     */
    public function run(mixed $request): void
    {
        $type = $request->getVar('type');

        FluentState::singleton()->withState(function (FluentState $state) use ($type): void {// Comment out if your project doesn't have Fluent installed
            $defaultLocale = Locale::getDefault();

            // Optionally set locale if configured
            if ($defaultLocale) {
                $state->setLocale($defaultLocale->Locale);
            }

            echo 'Fetching page...' . PHP_EOL;

            /** @var FeaturePage $page */
            $page = FeaturePage::get()->byID(928);// Add your page type and ID

            $performantPublish = ($type === 'experimental');
            echo sprintf('Running publish with type %s...', $performantPublish ? 'experimental' : 'standard') . PHP_EOL;

            $start = microtime(true);

            if ($performantPublish) {
                $this->performantPublishRecursive($page);
            } else {
                $page->publishRecursive();
            }

            // Reformat to seconds
            $end = microtime(true) - $start;
            echo sprintf('%.2f', $end) . PHP_EOL;
        });
    }

    /**
     * @param Page $page
     * @return void
     * @throws Exception
     */
    private function performantPublishRecursive(Page $page): void
    {
        $now = DBDatetime::now()->Rfc2822();

        DBDatetime::withFixedNow($now, function () use ($page) {
            $modelsToPublish = [
                $page,
            ];

            /** @var DataObject|Versioned|RecursivePublishable $model */
            while ($model = array_shift($modelsToPublish)) {
                // We don't have anything to publish on a non-versioned model
                if (!$model->hasExtension(Versioned::class)) {
                    continue;
                }

                // Publish the current model if needed
                if (!$model->isPublished() || $model->stagesDiffer()) {
                    $model->publishSingle();
                }

                // Queue up owned models for publishing
                $ownedModels = $model
                    ->findOwned(false)
                    ->toArray();

                $modelsToPublish = array_merge($modelsToPublish, $ownedModels);
            }
        });
    }
}

Possible Solution

The provided code snippet is quite crude but it shows the opportunity for some options that might help with the performance of publishRecursive(). Here is a list of recommendations:

  • avoid using true recursion, the base performance of this is poor, use stack based recursion instead
  • add configuration to opt out of with translation wrapper, this might not be needed in some cases, if publishing of assets is involved the operation might be really slow due to slow shared drive on some infrastructure setups, translation wrap might cause DB deadlock and timeouts in such cases
  • add configuration to skip on using ChangeSet, this feature seems to have only very marginal use (mostly related to campaign admin) and it also produces DB records that have to be cleaned up regularly, versioned history seems to be working fine without it due to with fixed now

Validations

  • Check that there isn't already an issue that reports the same bug
  • Double check that your reproduction steps work in a fresh installation of silverstripe/installer (with any code examples you've provided)
@mfendeksilverstripe
Copy link
Contributor Author

mfendeksilverstripe commented Nov 5, 2024

Added a more refined implementation - we're currently using it for our fixtures generation and it yields about 30% speed increase which also includes extension points logic. This implementation is Fluent friendly.

class PerformantPublishService
{

    use Injectable;

    /**
     * @param DataObject|Versioned $model
     * @return void
     * @throws Exception
     */
    public function publishRecursiveWithoutChangeSet(DataObject $model): void
    {
        if (!$model->hasExtension(Versioned::class)) {
            // Model can't be published
            return;
        }

        if (!$model->isInDB()) {
            // Model isn't saved in the DB so we can't publish it
            return;
        }

        $modelsToPublish = $this->findModelsToPublish($model);
        $now = DBDatetime::now()->Rfc2822();

        DBDatetime::withFixedNow($now, static function () use ($model, $modelsToPublish): void {
            // get the last published version
            $original = $model->isPublished()
                ? Versioned::get_by_stage($model->baseClass(), Versioned::LIVE)->byID($model->ID)
                : null;

            $model->invokeWithExtensions('onBeforePublishRecursive', $original);

            /** @var DataObject|Versioned $modelToPublish */
            foreach ($modelsToPublish as $modelToPublish) {
                $modelToPublish->publishSingle();
            }

            $model->invokeWithExtensions('onAfterPublishRecursive', $original);
        });
    }

    private function findModelsToPublish(DataObject $model): array
    {
        $models = [
            $model,
        ];

        $modelsToPublish = [];

        /** @var DataObject|Versioned $model */
        while ($model = array_shift($models)) {
            // We won't be inspecting any models that aren't saved in the DB
            if (!$model->isInDB()) {
                continue;
            }

            // Mark model as needing publishing in case it needs to be published
            if ($model->hasExtension(Versioned::class) && (!$model->isPublished() || $model->stagesDiffer())) {
                $modelsToPublish[]= $model;
            }

            // Discover and add owned models for inspection
            $relatedModels = $model
                // We intentionally avoid "true" recursive traversal here as it's not performant
                // (often the cause of memory usage spikes and longer execution time due to deeper stack depth)
                // Instead we use "stack based" recursive traversal approach for better performance
                // which avoids the nested method calls
                ->findOwned(false)
                ->toArray();
            $models = array_merge($models, $relatedModels);
        }

        // Return the list in the reverse order, bottom first so we get to publish the root model as the last step
        return array_reverse($modelsToPublish);
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants