AECU is executed too often in AEMaaCS #228

Closed
royteeuwen opened this issue Apr 5, 2024 · 8 comments

@royteeuwen
Contributor

When using AECU in AEMaaCS, the scripts seem to be executed too many times. The screenshot below shows an example. Probably because multiple author instances are executing them at the same time?

Screenshot 2024-04-05 at 15 03 22

@nhirrle
Collaborator

nhirrle commented Apr 5, 2024

The scripts need to be written so that they can run multiple times, yes.
See the paragraph at https://github.com/valtech/aem-easy-content-upgrade?tab=readme-ov-file#startup-hook-since-600

@royteeuwen
Contributor Author

royteeuwen commented Apr 5, 2024

@nhirrle hmm, it seems to execute the scripts multiple times very consistently, and it doesn't seem to detect correctly (anymore?) that this is happening :/. You can see it in my screenshot: almost all runs are like this. Could this be improved somehow? It's really hard to write scripts that run over multiple pages without ending up making duplicate modifications and throwing exceptions because of them. You'd have to refresh your ResourceResolver constantly.

@nhirrle
Collaborator

nhirrle commented Apr 5, 2024

@royteeuwen can you provide some more details? Does the script use the always selector? And any idea whether this is with a recent AEMaaCS release? It would be good if it could be verified with some sample scripts on a sandbox.

@royteeuwen
Contributor Author

@nhirrle no, the script does not use an .always. selector. You can see in my screenshot that it starts the run of the same scripts twice (at 12:56:16 and 12:56:20) and didn't detect that a run was already in progress. When the 12:56:20 run started, one of the 7 scripts had already been completed by the 12:56:16 run, which is why it reports only 6 scripts.

Because the two runs happen at (almost) exactly the same time, the second run, executing the same script, fails with the following:

javax.jcr.InvalidItemStateException: OakState0001: Unresolved conflicts in /content/my-site/de/demo/sprint-1/button/jcr:content/root/main-container/social-wrapper-container/content-container/button
	at org.apache.jackrabbit.oak.api.CommitFailedException.asRepositoryException(CommitFailedException.java:238)
	at org.apache.jackrabbit.oak.api.CommitFailedException.asRepositoryException(CommitFailedException.java:213)
	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.newRepositoryException(SessionDelegate.java:737)
	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.save(SessionDelegate.java:551)
	at org.apache.jackrabbit.oak.jcr.session.SessionImpl$9.performVoid(SessionImpl.java:459)
	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.performVoid(SessionDelegate.java:299)
	at org.apache.jackrabbit.oak.jcr.session.SessionImpl.save(SessionImpl.java:456)
	at com.adobe.granite.repository.impl.CRX3SessionImpl.save(CRX3SessionImpl.java:220)
	at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321)
	at Script1.run(Script1.groovy:42)
	at org.codehaus.groovy.vmplugin.v8.IndyInterface.selectMethod(IndyInterface.java:355)
	at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321)
	at be.orbinson.aem.groovy.console.impl.DefaultGroovyConsoleService.runScript(DefaultGroovyConsoleService.groovy:75)
	at de.valtech.aecu.core.service.AecuServiceImpl.executeScript(AecuServiceImpl.java:214)
	at de.valtech.aecu.core.service.AecuServiceImpl.execute(AecuServiceImpl.java:188)
	at de.valtech.aecu.core.service.AecuServiceImpl.executeWithInstallHookHistory(AecuServiceImpl.java:374)
	at de.valtech.aecu.core.service.AecuServiceImpl.executeWithInstallHookHistory(AecuServiceImpl.java:357)
	at de.valtech.aecu.startuphook.AecuCloudStartupService.startAecuMigration(AecuCloudStartupService.java:175)
	at de.valtech.aecu.startuphook.AecuCloudStartupService.checkAndRunMigration(AecuCloudStartupService.java:94)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.jackrabbit.oak.api.CommitFailedException: OakState0001: Unresolved conflicts in /content/my-site/de/demo/sprint-1/button/jcr:content/root/main-container/social-wrapper-container/content-container/button
	at org.apache.jackrabbit.oak.plugins.commit.ConflictValidator.failOnMergeConflict(ConflictValidator.java:115)
	at org.apache.jackrabbit.oak.plugins.commit.ConflictValidator.propertyAdded(ConflictValidator.java:84)
	at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.propertyAdded(CompositeEditor.java:82)
	at org.apache.jackrabbit.oak.spi.commit.EditorDiff.propertyAdded(EditorDiff.java:81)

The script itself is not an .always script. It executes the following, and the query finds at least 1000 results:

xpathQuery("/jcr:root/content//*[" +
        "(@sling:resourceType='my-site/components/button/v1/button')" +
// IRL there are more resource types, removing for this ticket
        "]").each { node ->
    println "button path: " + node.getPath()

    if (node.hasProperty("linkURL")) {
        node.getProperty("linkURL").remove()
        println "Removed linkURL property from: " + node.getPath()
    }
    if (node.hasProperty("linkTarget")) {
        def linkTargetValue = node.getProperty("linkTarget").getString()
        node.setProperty("linkWindowTarget", linkTargetValue)

        node.getProperty("linkTarget").remove()
        println "Renamed linkTarget to linkWindowTarget in: " + node.getPath()
    }
}

session.save()

The one thing I could do to "improve" it is to move the session.save() inside the .each and do a refresh every time. But this would make the script run a lot longer.
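A possible middle ground between a single save at the end and a save per node is to save in batches. The following is a minimal sketch in plain Java, not AECU or JCR code: `processInBatches` is a hypothetical helper, and the `Runnable` merely stands in for a call such as `session.save()`. It would only shorten the window in which two concurrent runs can conflict; it does not prevent the script from being started twice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: apply a change to each item and persist after every batchSize items. */
final class BatchedSaveSketch {

    /**
     * Applies {@code change} to every item and runs {@code saver} after each
     * full batch, plus once more for a trailing partial batch.
     * {@code saver} is an assumption: a stand-in for something like session.save().
     * Returns the number of times {@code saver} was run.
     */
    static <T> int processInBatches(List<T> items, int batchSize,
                                    Consumer<T> change, Runnable saver) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int pending = 0; // changes made since the last save
        int saves = 0;
        for (T item : items) {
            change.accept(item);
            pending++;
            if (pending == batchSize) {
                saver.run();
                saves++;
                pending = 0;
            }
        }
        if (pending > 0) { // persist the last, incomplete batch
            saver.run();
            saves++;
        }
        return saves;
    }

    public static void main(String[] args) {
        // Hypothetical paths standing in for the ~1000 query results.
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            paths.add("/content/example/node-" + i);
        }
        int saves = processInBatches(paths, 100, path -> { }, () -> { });
        System.out.println("saves: " + saves); // prints "saves: 10"
    }
}
```

With 1000 results and a batch size of 100, this performs 10 saves instead of 1000, so the runtime cost should sit much closer to the single-save version.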

@nhirrle
Collaborator

nhirrle commented Apr 5, 2024

Thanks for the details. I will have a look at the beginning of next week. There is also a related ticket, #227.

@nhirrle
Collaborator

nhirrle commented Apr 8, 2024

Hi @royteeuwen
So the issue is happening because the scripts are executed on the cluster, but the AECU code does not take this into account.
We will need to migrate the code to Sling Jobs instead and execute them on the leader only.

This requires major changes and a new release.
More info - https://adapt.to/2021/presentations/adaptto2021-designing-a-cluster-aware-application-joerg-hoh.pdf

Unfortunately I can only commit to a fix within the next 4 weeks.

@royteeuwen
Contributor Author

@nhirrle OK! Just make sure not to create the job on every AEM instance, because then you would still execute the rules x times.
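To illustrate the concern, here is a minimal sketch in plain Java of guarding job creation with a leader check. It is a simulation under stated assumptions, not the AECU implementation and not the Sling API: `SharedQueue`, `onStartup`, and the topic name are hypothetical, and the `isLeader` flag is simply passed in rather than obtained from the cluster topology.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: every instance runs the startup hook, but only the leader queues the job. */
final class LeaderOnlyJobSketch {

    /** Hypothetical stand-in for a job queue that is shared by the whole cluster. */
    static final class SharedQueue {
        private final List<String> topics = new ArrayList<>();

        synchronized void add(String topic) {
            topics.add(topic);
        }

        synchronized int size() {
            return topics.size();
        }
    }

    /**
     * Hypothetical startup hook, called once on each instance.
     * Non-leader instances return without creating a job.
     */
    static void onStartup(boolean isLeader, SharedQueue queue, String topic) {
        if (!isLeader) {
            return;
        }
        queue.add(topic);
    }

    public static void main(String[] args) {
        SharedQueue queue = new SharedQueue();
        // Three simulated author instances, exactly one of which is the leader.
        boolean[] leaderFlags = {true, false, false};
        for (boolean isLeader : leaderFlags) {
            onStartup(isLeader, queue, "example/migration/startup");
        }
        System.out.println("jobs queued: " + queue.size()); // prints "jobs queued: 1"
    }
}
```

Without the `isLeader` guard, the same three startups would queue three jobs, and the migration would still run once per instance.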

@nhirrle
Collaborator

nhirrle commented Jun 3, 2024

One further observation: an exception is thrown during startup.
03.06.2024 05:01:36.209 [cm-p23458-e585661-aem-author-6784cb8fb6-2hxg7] *INFO* [sling-threadpool-649d8a9a-d1be-415c-b890-89cb09b8d432-(apache-sling-job-thread-pool)-35-AECU Cloud Startup Job Queue(de/valtech/aecu/cloud/AecuStartupJobTopic)] de.valtech.aecu.startuphook.AecuStartupJobConsumer AECU migration started
03.06.2024 05:01:36.275 [cm-p23458-e585661-aem-author-6784cb8fb6-2hxg7] *ERROR* [sling-threadpool-649d8a9a-d1be-415c-b890-89cb09b8d432-(apache-sling-job-thread-pool)-35-AECU Cloud Startup Job Queue(de/valtech/aecu/cloud/AecuStartupJobTopic)] de.valtech.aecu.startuphook.AecuStartupJobConsumer Error while executing AECU migration
de.valtech.aecu.api.service.AecuException: Path is invalid
	at de.valtech.aecu.core.service.AecuServiceImpl.findCandidates(AecuServiceImpl.java:117) [de.valtech.aecu.core:6.5.1.SNAPSHOT]
	at de.valtech.aecu.core.service.AecuServiceImpl.getFiles(AecuServiceImpl.java:97) [de.valtech.aecu.core:6.5.1.SNAPSHOT]
	at de.valtech.aecu.core.service.AecuServiceImpl.executeWithInstallHookHistory(AecuServiceImpl.java:363) [de.valtech.aecu.core:6.5.1.SNAPSHOT]
	at de.valtech.aecu.core.service.AecuServiceImpl.executeWithInstallHookHistory(AecuServiceImpl.java:357) [de.valtech.aecu.core:6.5.1.SNAPSHOT]
	at de.valtech.aecu.startuphook.AecuStartupJobConsumer.process(AecuStartupJobConsumer.java:32) [de.valtech.aecu.cloud.startup.hook:6.5.1.SNAPSHOT]
	at org.apache.sling.event.impl.jobs.JobConsumerManager$JobConsumerWrapper.process(JobConsumerManager.java:543) [org.apache.sling.event:4.3.14]
	at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJob(JobQueueImpl.java:351) [org.apache.sling.event:4.3.14]
	at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.access$100(JobQueueImpl.java:60) [org.apache.sling.event:4.3.14]
	at org.apache.sling.event.impl.jobs.queues.JobQueueImpl$1.run(JobQueueImpl.java:287) [org.apache.sling.event:4.3.14]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
