Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate CI Failures #3064

Closed
cwperks opened this issue Jul 28, 2023 · 11 comments · Fixed by #3069
Closed

Investigate CI Failures #3064

cwperks opened this issue Jul 28, 2023 · 11 comments · Fixed by #3069
Assignees
Labels
bug Something isn't working untriaged Require the attention of the repository maintainers and may need to be prioritized

Comments

@cwperks
Copy link
Member

cwperks commented Jul 28, 2023

The PR queue is piling up with various issues seen on pull requests.

There is one known error causing flakiness in primarily the ComplianceAuditlogTest due to an issue described #3021 [1] which is being actively worked on by @DarshitChanpura.

As of writing, the top 3 PRs in the queue are failing with various different issues:

Getting to the root of CI issues and getting stable CI will help clear the PR queue from piling up

Related

@cwperks cwperks added bug Something isn't working untriaged Require the attention of the repository maintainers and may need to be prioritized labels Jul 28, 2023
@peternied
Copy link
Member

I've broken the windows wget install into a separate issue that I'll track down

@peternied
Copy link
Member

I'm seeing the following consistant failures, lets start with these:

test group / name failure count (4== all runs failed)
ciSecurityIntegrationTest -
org.opensearch.security.DataStreamIntegrationTests.testFieldMaskingOnDataStream 4
org.opensearch.security.DataStreamIntegrationTests.testFLSOnBackingIndicesOfDataStream 4
citest -
org.opensearch.security.auditlog.compliance.ComplianceAuditlogTest.testSourceFilter 4
org.opensearch.security.auditlog.compliance.ComplianceAuditlogTest.testInternalConfig 4
org.opensearch.security.auditlog.compliance.ComplianceAuditlogTest.testSourceFilterMsearch 4
dlicDlsflsTest -
org.opensearch.security.dlic.dlsfls.FlsExistsFieldsTest.testExistsField 4
org.opensearch.security.dlic.dlsfls.FlsFieldsTest.testFields2 4
org.opensearch.security.dlic.dlsfls.CustomFieldMaskedTest.testCustomMaskedGet 4
org.opensearch.security.dlic.dlsfls.FlsFieldsWcTest.testFields2 4
org.opensearch.security.dlic.dlsfls.DateMathTest.testSearch 4
org.opensearch.security.dlic.dlsfls.FlsFlatTests.testAllFlatFieldFlsApplied 4
org.opensearch.security.dlic.dlsfls.DateMathTest.testSearchWc2 4
org.opensearch.security.dlic.dlsfls.FlsFlatTests.testSeveralFlatFieldFlsApplied 4
org.opensearch.security.dlic.dlsfls.DfmOverwritesAllTest.testDFMRestrictedUser 4
org.opensearch.security.dlic.dlsfls.FlsFlatTests.testSingleFlatFieldFlsApplied 4
org.opensearch.security.dlic.dlsfls.DlsFlsCrossClusterMinimalRoundtripSearchTests.testCcsDifferentConfig 4
org.opensearch.security.dlic.dlsfls.FlsFlatTests.testSingleFlatFieldFlsAppliedForLimitedResults 4
org.opensearch.security.dlic.dlsfls.DlsFlsCrossClusterSearchTest.testCcs 4
org.opensearch.security.dlic.dlsfls.FlsIndexingTests.testAllIndexFlsApplied 4
org.opensearch.security.dlic.dlsfls.DlsFlsCrossClusterSearchTest.testCcsDifferentConfigBoth 4
org.opensearch.security.dlic.dlsfls.FlsIndexingTests.testAllIndexFlsAppliedWithAlias 4
org.opensearch.security.dlic.dlsfls.FlsDlsTestAB.testDlsFlsAB 4
org.opensearch.security.dlic.dlsfls.FlsIndexingTests.testSeveralIndexFlsApplied 4
org.opensearch.security.dlic.dlsfls.CustomFieldMaskedComplexMappingTest.testComplexMappingSearch 4
org.opensearch.security.dlic.dlsfls.FlsIndexingTests.testSingleIndexFlsApplied 4
org.opensearch.security.dlic.dlsfls.DateMathTest.testSearchWc 4
org.opensearch.security.dlic.dlsfls.FlsIndexingTests.testSingleIndexFlsAppliedForLimitedResults 4
org.opensearch.security.dlic.dlsfls.DlsFlsCrossClusterMinimalRoundtripSearchTests.testCcs 4
org.opensearch.security.dlic.dlsfls.FlsKeywordTests.testKeywordsAreAutomaticallyFiltered 4
org.opensearch.security.dlic.dlsfls.DlsFlsCrossClusterSearchTest.testCcsDifferentConfig 4
org.opensearch.security.dlic.dlsfls.FlsTest.testFlsGet 4
org.opensearch.security.dlic.dlsfls.FlsDlsTestMulti.testDlsFls 4
org.opensearch.security.dlic.dlsfls.FlsTest.testFlsSearch 4
org.opensearch.security.dlic.dlsfls.DfmOverwritesAllTest.testDFMRestrictedAndUnrestrictedOneIndex 4
org.opensearch.security.dlic.dlsfls.IndexPatternTest.testSearch 4
org.opensearch.security.dlic.dlsfls.FieldMaskedTest.testMaskedGet 4
org.opensearch.security.dlic.dlsfls.IndexPatternTest.testSearchWc 4
org.opensearch.security.dlic.dlsfls.DlsFlsCrossClusterMinimalRoundtripSearchTests.testCcsDifferentConfigBoth 4
org.opensearch.security.dlic.dlsfls.IndexPatternTest.testSearchWcRegex 4
org.opensearch.security.dlic.dlsfls.CustomFieldMaskedTest.testCustomMaskedSearch 4
org.opensearch.security.dlic.dlsfls.MFlsTest.testFlsMGetSearch 4

@peternied
Copy link
Member

I'm going to tackle citest

  • @RyanL1997 could you look into the ciSecurityIntegrationTest tests
  • @scrawfor99 could you look into the dlicDlsflsTest tests

@peternied
Copy link
Member

Note; there are still these tests that are intermittently failing, please address them in # of failure order after addressing the consistent failures

Row Labels Count of Failed Test
ciSecurityIntegrationTest -
org.opensearch.security.IntegrationTests.testDeleteByQueryDnfof 1
org.opensearch.security.HttpIntegrationTests.testBulk 1
org.opensearch.security.SlowIntegrationTests.testDelayInSecurityIndexInitialization 1
org.opensearch.security.HttpIntegrationTests.testHTTPBasic 1
org.opensearch.security.TransportUserInjectorIntegTest.testSecurityUserInjection 1
org.opensearch.security.SystemIntegratorsTests.testInjectedAdminUser 1
org.opensearch.security.HttpIntegrationTests.testTenantInfo 3
citest -
org.opensearch.security.auditlog.compliance.ComplianceAuditlogTest.testWriteHistory 1
org.opensearch.security.auditlog.compliance.RestApiComplianceAuditlogTest.testAutoInit 1
org.opensearch.security.auditlog.compliance.ComplianceAuditlogTest.testUpdate 1
org.opensearch.security.util.SettingsBasedSSLConfiguratorV4Test.testPemClientAuth 1
org.opensearch.security.multitenancy.test.MultitenancyTests.testMtMulti 1
com.amazon.dlic.auth.http.jwt.keybyoidc.SingleKeyHTTPJwtKeyByOpenIdConnectAuthenticatorTest.mismatchedAlgTest 1
org.opensearch.security.SecurityAdminTests.testSecurityAdminHostnameVerificationNotEnforced 1
org.opensearch.security.auditlog.compliance.ComplianceAuditlogTest.testComplianceEnable 2
dlicDlsflsTest -
org.opensearch.security.dlic.dlsfls.CCReplicationTest.testReplication 1

@stephen-crawford
Copy link
Contributor

Issue is not from the REST API change.

@cwperks
Copy link
Member Author

cwperks commented Jul 28, 2023

I believe these test failures are a result of the Lucene upgrade in core: opensearch-project/OpenSearch#8668

Some audit log events are working like access granted or access denied, but its failing to log document read events.

Looking into this further now.

Could any of the recent changes in IndexSearcher relate? https://github.com/apache/lucene/commits/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java

Used here: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/DlsFlsFilterLeafReader.java#L232

@reta
Copy link
Collaborator

reta commented Jul 28, 2023

Thanks @cwperks , looking into it!

@peternied
Copy link
Member

We are seeing issues in DLS/FLS & Audit logging because the way we detect field access is through the DlsFlsFilterLeafReader which looks like no longer works in the same with the most recent version of lucene. This class is used to notify the audit logging systems after a document read has been completed so it can save the document get audit event and it should be of little surprise that the DLS/FLS systems depend on knowing what fields were accessed.

This is what is impacting all of the different classes of tests that are consistently failing [comment] - same root cause. Intermittent tests still need consideration.


Pretty sure the underlying reason is coming from changes to IndexReader [link] document(...) methods have been marked as deprecated. We will need to use the alternative paths / adapt for them.

  /**
   * Expert: visits the fields of a stored document, for custom processing/loading of each field. If
   * you simply want to load all fields, use {@link #document(int)}. If you want to load a subset,
   * use {@link DocumentStoredFieldVisitor}.
   *
   * @deprecated use {@link #storedFields()} to retrieve one or more documents
   */
  @Deprecated
  public abstract void document(int docID, StoredFieldVisitor visitor) throws IOException;

@peternied
Copy link
Member

Haha- well @cwperks you beat me to it. I say that is definitely what is going on.

@reta
Copy link
Collaborator

reta commented Jul 29, 2023

@cwperks thanks a lot for the investigation, are you looking into the fix or you need a hand there? (I started to look but haven't yet reached the cause and/or solution), thank you

@cwperks
Copy link
Member Author

cwperks commented Jul 29, 2023

Hey @reta , I just opened up a draft PR a second ago: #3069

It was a Lucene-related change that broke the build, but it wasn't the upgrade to 9.8 that broke the security plugin's build. It was this PR: opensearch-project/OpenSearch#7792

willyborankin pushed a commit that referenced this issue Jul 31, 2023
…3069)

There are multiple PRs in core affecting the security plugin that the
security plugin needs to adapt to.

- opensearch-project/OpenSearch#7792
- opensearch-project/OpenSearch#8826
- opensearch-project/OpenSearch#8668

I am opening a Draft PR that includes a fix for the Lucene-related test
failures which was caused by
opensearch-project/OpenSearch#7792

Resolves: #3064

Signed-off-by: Craig Perkins <[email protected]>
cwperks added a commit to cwperks/security that referenced this issue Jul 31, 2023
…pensearch-project#3069)

There are multiple PRs in core affecting the security plugin that the
security plugin needs to adapt to.

- opensearch-project/OpenSearch#7792
- opensearch-project/OpenSearch#8826
- opensearch-project/OpenSearch#8668

I am opening a Draft PR that includes a fix for the Lucene-related test
failures which was caused by
opensearch-project/OpenSearch#7792

Resolves: opensearch-project#3064

Signed-off-by: Craig Perkins <[email protected]>
(cherry picked from commit 08d1734)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working untriaged Require the attention of the repository maintainers and may need to be prioritized
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants