-
Notifications
You must be signed in to change notification settings - Fork 607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make ui.content package no longer block replication queue on AEMaaCS #2523
Make ui.content package no longer block replication queue on AEMaaCS #2523
Conversation
cd13fb2
to
009ab7d
Compare
create path /var/acs-commons(nt:folder) | ||
|
||
# AEM classic does not know this system user, but creating it below system/acs-commons shouldn't do any harm | ||
create service user sling-distribution-importer with path system/acs-commons |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kwin Does this service user already exist on AEM CS? If it does, i assume this "creation" will noop right? (obviously, if it does exist OOTB - it wont exist at "/home/users/system/acs-commons")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, noop when user exists anywhere: https://github.com/apache/sling-org-apache-sling-jcr-repoinit/blob/5a14f4c3c96381f44310316689fdd4bd048bf436/src/main/java/org/apache/sling/jcr/repoinit/impl/UserVisitor.java#L62. On AEM Cloud it exists at /home/users/system/cq:services/internal/...
. Compare also with https://issues.apache.org/jira/browse/SLING-9857.
@kwin awesome! Before merge/release - just want to make sure it's released into the right semver based on current compatibility [1] Basically, 6.3+ will support the repo-init of /var/acs-commons and the ACS Commons service-worker ACLs, right? .. so we can do a 4.x release. [1] https://adobe-consulting-services.github.io/acs-aem-commons/pages/compatibility.html |
AEM 6.3.0 ships with org.apache.sling.repoinit.parser 1.1.0 and org.apache.sling.jcr.repoinit 1.1.2. The only thing which is not supported by the latter is intermediate path support (apache/sling-org-apache-sling-jcr-repoinit@d039439, being added with jcr.repoinit 1.1.8). Not sure in which SP that has been added... Also I would very much appreciate if you could test in AEMaaCS prior to release (I only tested against AEM Cloud SDK Quickstart) |
Pull Request Test Coverage Report for Build 6276
💛 - Coveralls |
Agh - probably should require the SP (though, I can't see in release notes what adds this support).. Feels too messy adding "acs-commons-foo-bar" outside of /home/users/system/acs-commons I imagine anyone on 6.3 should be on a later SP anyhow, so it should in practice be a non-breaking release (though I guess we should release as 5.0 anyhow -- which might be good to clearly call out AEM CS xompatability).. We can always release 6.x with the other big changes after more testing. And yes - I'll deploy to AEM CS in the cloud to make sure it builds/deploys So... I guess we're good with this, PR -- We just need to figure out what AEM 6.x + SP? versions are supported :) |
Just FYI - working through some unrelated issues getting this branch to build/deploy w CM to AEM CS |
Deploy step failing on Build Image step. I did have to add a missing opening quote in the repoinit script, and suppress some warnings around creating temp files to get it to get this far. It looks like there is a problem w/ replicating it still - though I haven't dug into it (just checked the status, and tossed the log here in case someone has some cycles to check it out) |
scripts=[ | ||
# these users and ACLs are only necessary on author | ||
create service user acs-commons-workflow-remover-service with path system/acs-commons | ||
set principal ACL for acs-commons-workflow-remover-service | ||
allow jcr:read, rep:write on /var/workflow/instances | ||
end | ||
|
||
create service user acs-commons-workflowpackagemanager-service with path system/acs-commons | ||
set principal ACL for acs-commons-workflowpackagemanager-service | ||
allow jcr:read on /var/workflow/packages | ||
end | ||
" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scripts=[ | |
# these users and ACLs are only necessary on author | |
create service user acs-commons-workflow-remover-service with path system/acs-commons | |
set principal ACL for acs-commons-workflow-remover-service | |
allow jcr:read, rep:write on /var/workflow/instances | |
end | |
create service user acs-commons-workflowpackagemanager-service with path system/acs-commons | |
set principal ACL for acs-commons-workflowpackagemanager-service | |
allow jcr:read on /var/workflow/packages | |
end | |
" | |
] | |
scripts=[" | |
# these users and ACLs are only necessary on author | |
create service user acs-commons-workflow-remover-service with path system/acs-commons | |
set principal ACL for acs-commons-workflow-remover-service | |
allow jcr:read, rep:write on /var/workflow/instances | |
end | |
create service user acs-commons-workflowpackagemanager-service with path system/acs-commons | |
set principal ACL for acs-commons-workflowpackagemanager-service | |
allow jcr:read on /var/workflow/packages | |
end | |
" | |
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing opening quote
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 71adee8.
Although the log does not contain details about why it is blocked (more should be visible in the authors log) I assume some other rights are missing.
It seems that these filter rules conflict with that
Would be good to have those limitations more clearly documented though.... |
I manually executed cp2fm (https://github.com/apache/sling-org-apache-sling-feature-cpconverter) on the ui.content package and got
The following JSON feature model was emitted:
The content package itself was not modified though in any way (i.e. it still contains the no longer necessary ACLs) I don't know how exactly the |
@kwin np - ill see if I can find someone to poke around under the covers on Monday to help us understand exactly whats breaking (and hopefully get it documented, if it should be a public-facing detail) Ill push the latest updates through the pipeline again as well to see if there's any new effect (good or bad :)) |
We should try to revert all changes related to ACLs and system users in this PR if we are sure that the cp2fm converter correctly extracts those (and they don't prevent deploying the content package later on). Then we would only need to maintain a repoinit for creation of the actual /var/acs-commons node. |
@kwin just a FYI - your latest built/deployed to successful clean AEM CS (vs. on-top of prior failure). I can find some time tonight to ensure the repoinits actaully did their job as well (nodes/acls/service users). LMK if you want to run any more updates through a pipeline. |
Would be good to first deploy 4.11.2 (usually first deployment runs fine despite var nodes) and then run the update on the mutable repository with this PR. Still I am wondering if the commit f8dcc55 fixed the build, or that was just coincidence. Would also be good to try to manually deploy the acs-aem-commons-ui.content package from author to publish and afterwards check the distribution queue. |
kk - i can try to recreate my envs and do the fresh installs. It seems positive that it succeeded to the previously-failing dev pipeline, and then the same commit build to Stage/Prod fine. |
@kwin shoot - ran it again on a brand new Dev env, and it failed similarly.. so i guess it was a coincidence it passed before :( |
The log does not contain the relevant information. Any chance you can get the error.log on the publish or at least the replication.log from the author? @davidjgonzalez Do you think you can somehow get further information about the limitations of mutable content packages? |
I made several more tests with modified acs-aem-commons-ui.content packages (in version 4.11.0) in the Cloud. With a package containing oak:index, home and the root ACLs I see the following errors
So it seems that also the rep:policy at the root node needs to be removed. |
@davidjgonzalez Can you try again with the updated PR? In case that still fails we would need the publish log to see the underlying issue. |
@davidjgonzalez Any update? A lot of people are waiting for a AEMaaCS compliant release.... |
Travis build is failing, any insight into the oakpal failure?
|
oakpal does not evaluate repoinit configurations. We probably need to fix oakpal or just disable those checks. |
I'm ok with either, but we can't push this PR through if it breaks the build. I'll leave the decision to you, I have no particular preference. |
Havent found anyone that's been able to dig around the backend. There are issues with some service users (but i don't expect that should fail the build?) There's also an issue with applying ACLs to /var/workflow/instances - as it thinks the path is non-existent. TBH, im not sure which of these errors in the log qualifies as a build-failing event. |
@davidjgonzalez Thanks for the log. It seems that |
I now removed the failing oakPal checks in 2eb7587 |
2eb7587
to
9019eaf
Compare
@kwin fyi - i had to fix a number of ERRORs being thrown, mostly around repoinit trying to set ACLs on paths that didnt exist yet, and service users that didnt exist when OSGi components were starting (basically moved more stuff out to repoinit). It looks like the ui.content package is still failing to publish
I do see
I suspect? that maybe that last line that lists the path as Im trying to shuffle a few things around in repoinit to see if it has any effect.. LMK if you see anything that stands out. I can make a PR back to you with the other changes i had to make too. |
The line
is Instead it is triggered by the default repoinit config (https://repo1.maven.org/maven2/com/adobe/aem/aem-sdk-api/2021.2.4944.20210221T230729Z-210128/, line 5763ff):
If the content distribution does still not work it must have another reason. That can only be seen in the publish log unfortunately. Can you attach that as well? Please also commit the other changes.... |
@kwin let me try again with a fresh pull from yours - i think you fixed the prior errors with missing service-users and paths that i had patched into my branch. Ill grab the Publish logs as well. |
[AEM CS Deployment] Removed failing test
[AEM CS Deployment] Suppress false positives during SonarQube scan
@kwin shoot - didnt work. I pushed some more fixes to your branch as a PR (some immediate OSGi components that required service users were throwing ERRORs; figured might as well fix them to rule them out as potential causes) Attaching the deploy log and the publish logs. I'm not seeing any errors in the publish logs, and whats logs seems reasonable (though I'm not that in tune w/ the inner workings of the deployment process) I've requested access to some backend systems to see if I can poke around myself, but in the meantime - not sure if you see anything of interest. Keep in mind I've done a few deployments today - so you'll want to true up timestamps off the deploy_step log:
|
The attached deploy_step log shows an error from the 3rd of March, while both author and publish error are from 4th of March. Can you attach the deploy_step log from the 4th of March? |
@kwin Oh wow - i just realized that ALL deploy_step logs for my pipeline executions have the same file name .. I always assumed that that number postfix (905717) was unique per execution... make sense why i had |
[AEM CS Deployment] Fixes to prevent ERRORs in bundle
mapping is available
@@ -0,0 +1,29 @@ | |||
scripts=[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved back to regular ACLs set via the content package.
The root access on root should be a granted to a user group via repoinit. The service user can be member of that group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean this entire repo unit? This script is to prevent ERRORs on bundle/component start. Ultimately they will be in repoinits anyhow..?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how rep:Acls on mutable content are converted, as those must be set during the real pod execution.If they are converted that would indeed change the semantics as with content packages ACLs on non-existing content is silently ignored while for repoinit it leads to an exception....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double-checked that its fine (preferred) to use repoinit to manage the mutable space, so unless there are other concerns i vote to keep it as repoinit (since that's the ideal state anyhow).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, agree, @davidjgonzalez do you remove the creation of the /etc/tags
path then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes - sorry, did a deployment to make sure it was ok. Let me check on it.
I don't think we can even put /etc/tags into ui.content package, since the mere presence of that node causes TagManager API to use it (/etc/tags) as the tag root.
…nager (but not local or Travis)
[AEM CS Deployment] Fixes and reversions
58f4f4f
into
Adobe-Consulting-Services:acs-aem-commons-5.0.0
* master: [maven-release-plugin] prepare for next development iteration [maven-release-plugin] prepare release acs-aem-commons-5.0.2 v5.0.2 release Don't set ACLs on potentially non-existing path /etc/tags (Adobe-Consulting-Services#2547) [maven-release-plugin] prepare for next development iteration [maven-release-plugin] prepare release acs-aem-commons-5.0.0 [maven-release-plugin] prepare for next development iteration [maven-release-plugin] prepare release acs-aem-commons-4.12.0 4.12.0 changelog Sorter (Update) (Adobe-Consulting-Services#2544) Adobe-Consulting-Services#2542 - Fixed VanityUrlAdjuster package location (Adobe-Consulting-Services#2543) ACS Redirect Manager (Adobe-Consulting-Services#2513) Adding (backwards compatible) options to the i18nprovider / injector to: (Adobe-Consulting-Services#2518) Add append option to dataimporter (Adobe-Consulting-Services#2535) Make sure show hide of tabs also works when coral panel is of region (Adobe-Consulting-Services#2540) Make ui.content package no longer block replication queue on AEMaaCS (Adobe-Consulting-Services#2523) Introduce marker interface for ClusterLeader (Adobe-Consulting-Services#2499) # Conflicts: # CHANGELOG.md
Give underlying system user write access to /var/acs-commons
Set other ACLs below /var only on Author
This closes #2341