Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop accessing System Indices directly in REST tests #62501

Open
14 of 38 tasks
gwbrown opened this issue Sep 16, 2020 · 19 comments
Open
14 of 38 tasks

Stop accessing System Indices directly in REST tests #62501

gwbrown opened this issue Sep 16, 2020 · 19 comments
Labels
:Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. Meta :ml/Transform Transform :ml Machine learning :Security/Security Security issues without another label Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:ML Meta label for the ML team Team:Security Meta label for security team >test Issues or PRs that are addressing/adding tests

Comments

@gwbrown
Copy link
Contributor

gwbrown commented Sep 16, 2020

As part of formalizing System Indices in Elasticsearch (#50251), we will be restricting access to System Indices via the REST interface, except by means of purpose-built APIs. For example, accessing .watches directly will not be allowed - all interaction with the .watches index via the REST API should use the purpose-built Watcher APIs.

If it is necessary to access System Indices via the REST API today, an API should be added to fulfill that need rather than using generic APIs to do so.

There are a number of places in our REST tests which access System Indices directly. These tests will be modified by #60945 to avoid failures by consuming the deprecation warning as necessary, but that is not a long-term solution. The tests below should be evaluated to determine if there is a necessary API that is currently missing, and if so, add that API and convert the test to use it. If not, we should determine an alternate method for testing the functionality which does not directly access System Indices via the REST API.

Tests/test-related methods which currently access System Indices directly via the REST API:

Test Framework

Tasks

  • delete_by_query/80_slices.yml - Multiple slices with wait_for_completion=false
  • delete_by_query/80_slices.yml - Multiple slices with rethrottle
  • reindex/80_slices.yml - Multiple slices with wait_for_completion=false
  • reindex/80_slices.yml - Multiple slices with rethrottle
  • update_by_query/80_slices.yml - Multiple slices with wait_for_completion=false
  • update_by_query/80_slices.yml - Multiple slices with rethrottle
  • rolling_upgrade .. upgraded_cluster/10_basic.yml - Find a task result record from the old cluster

Transforms

  • TransformUsageIT.testUsage
  • TransformConfigurationIndexIT.testDeleteConfigurationLeftOver
  • TranformInternalIndexIT.testUpdateDeletesOldTransformConfig
  • TransformRestTestCase.wipeTransforms
  • TranformRobustnessIT.beEvilAndDeleteTheTransformIndex
  • rolling-upgrade .. upgraded_cluster/80_transform_jobs_crud.yml - Test index mappings for latest internal index and audit index

ML

Watcher

Security

  • FullClusterRestartIT.testSecurityNativeRealm
  • users/10_basic.yml - Test put user with password hash
@gwbrown gwbrown added >test Issues or PRs that are addressing/adding tests :Delivery/Build Build or test infrastructure Meta :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. :ml Machine learning :Security/Security Security issues without another label :Data Management/Watcher :ml/Transform Transform labels Sep 16, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Build)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (:Security/Security)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Watcher)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Task Management)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@elasticmachine elasticmachine added Team:Core/Infra Meta label for core/infra team Team:Security Meta label for security team Team:Data Management Meta label for data/management team Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. labels Sep 16, 2020
@hendrikmuhs
Copy link

As for some of the transform tests: Some tests challenge the robustness of the system (e.g. beEvilAndDeleteTheTransformIndex), another verifies mappings. I am not arguing that all transform tests should be kept as they are, but at least some have a good reason(the others should avoid using system indices).

It would be good to have a way to mark single tests as valid case, similar to @SuppressWarnings (that excludes yaml tests, I think that's fine).

@Mpdreamz
Copy link
Member

cc @elastic/clients-team this will affect us as well. expand_wildcards won't be allowed to touch system indices anymore either so a new dedicated API to delete all system API's needs to be introduced so we can reliably run yaml rest spec setup/teardown.

@gwbrown
Copy link
Contributor Author

gwbrown commented Sep 21, 2020

Some tests challenge the robustness of the system (e.g. beEvilAndDeleteTheTransformIndex)

Validating the behavior of the system in the event of a user doing something they shouldn't like this is a valid use case for either using this parameter and/or ignoring deprecation warnings as necessary for as long as this behavior is allowed. However, our intent is to disable access to System Indices via REST entirely, so these tests would no longer be valid at that time (because the user can no longer be evil and delete the transform index). In that case, we should note that the test case can be removed when it is no longer possible to access System Indices via REST.

another verifies mappings

These cases still run into the same issue - eventually, it will not be possible to access these indices via the general-purpose index APIs. For these, I can think of a few options - we could either move the mapping verification into Elasticsearch and expose it as an endpoint (e.g. POST _ml/verify_index_mappings), or add a special purpose API that returns the mappings of these indices.

@droberts195
Copy link
Contributor

so we can reliably run yaml rest spec setup/teardown.

This raises a very important question. At the moment all indices are removed in between YAML tests. This is done in the test client. What is the plan for system indices in between YAML tests? Will they eventually not be cleaned up in between YAML tests like user-visible indices? Or will there be a special API to delete all system indices in between YAML tests? Basically, the problem Martijn outlined doesn't just affect the language clients maintained by the clients team but also the test harness used within the Elasticsearch repo, and the solution should be the same for both.

@gwbrown
Copy link
Contributor Author

gwbrown commented Sep 21, 2020

System Indices will be wiped between REST tests (both YAML and Java) at the same point they are today. Exactly how we're going to implement this when the allow_system_index_access flag is removed isn't fully nailed down yet, but it's likely to take the form of a special API to delete all system indices, sort of a "factory reset"-type API. In any event, we (the team implementing System Indices functionality, see assignees of #50251) will handle any API changes necessary to Elasticsearch and the test infrastructure in the Elasticsearch repo to clean up System Indices as they are today and ensure the same mechanism is exposed for the Clients team.

Reworking the tests to not wipe system indices between tests would be a large undertaking, and has not been something we've even seriously considered thus far.

@hendrikmuhs
Copy link

I think a "factory reset" API should be added as a must to the meta issue before eventually locking the access to system indices. We see users doing a manual reset by deleting the indices very often in ML. A challenge to that: For a factory reset any jobs(ml/transform/rollup/...) must be stopped. You also might not want to reset everything but only parts. I think it makes sense to have a reset per feature and a global reset that calls all of them.

@droberts195
Copy link
Contributor

I like the idea of a "factory reset" API. That would make our cleanup in between tests as simple as "call the factory reset API". At the moment we have lots of bits of cleanup code that are written differently for internal cluster tests, REST tests and integration tests that use the node client. If that logic was moved into Elasticsearch server-side then the client-side cleanup code would be trivial. This would also help the language clients in the long run (after the initial pain of switching over). At the moment they have to try to mimic the various bits of cleanup code we run in every language.

A challenge to that: For a factory reset any jobs(ml/transform/rollup/...) must be stopped.

Yes agreed.

To properly "factory reset" would involve more than simply deleting all the system indices. In fact, having a documented API that just deletes all system indices could cause problems. For ML if system indices are deleted before persistent tasks are cancelled then those persistent tasks become hard to deal with. Also, ML has a mix of hidden and system indices, and deleting the system indices while leaving the hidden indices behind will cause weird effects if ML is used again afterwards.

So a factory reset would look something like:

  1. Cancel all persistent tasks
  2. Wait for all pending tasks to complete (so the side effects of cancelling the persistent tasks don't interfere with the index deletion)
  3. Delete all hidden indices whose names begin with .
  4. Delete all system indices

@gwbrown
Copy link
Contributor Author

gwbrown commented Sep 22, 2020

@hendrikmuhs @droberts195 I've opened #62778 to cover that issue, and linked it from the ESRestTestCase.wipeAllIndices point above. Let's continue discussion over there. I'll also add this as a topic in this week's System Indices sync.

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Oct 1, 2020
This test accesses system indices for 2 reasons.

First, it creates a filter that has a different type. This was done
to assert that filter is not returned from the APIs. However,
now that access to the `.ml-meta` index is restricted,
it is not really a concern.

Second, it creates a `.ml-meta` index without mappings to test
the get API does not fail due to lack of mappings on a sorted field,
namely the `filter_id`. Once again, this test is less useful once
system indices have restricted access.

Relates elastic#62501
martijnvg added a commit that referenced this issue Dec 4, 2020
Make use of the new query watches api to simply list watches or
to delete all watches using the delete watch api.

Relates to #62501
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 4, 2020
Backporting elastic#65835 to 7.x branch.

Make use of the new query watches api to simply list watches or
to delete all watches using the delete watch api.

Relates to elastic#62501
martijnvg added a commit that referenced this issue Dec 4, 2020
Backporting #65835 to 7.x branch.

Make use of the new query watches api to simply list watches or
to delete all watches using the delete watch api.

Relates to #62501
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 4, 2020
Make use for the cluster state api to check the index.format setting
of the .watches index instead of querying the index settings api.

This removes the last direct usage of .watches index mentioned in elastic#62501.
martijnvg added a commit that referenced this issue Dec 7, 2020
Make use for the cluster state api to check the index.format setting
of the .watches index instead of querying the index settings api.

This removes the last direct usage of .watches index mentioned in #62501.
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 7, 2020
Backport elastic#65884 to 7.x branch.

Make use for the cluster state api to check the index.format setting
of the .watches index instead of querying the index settings api.

This removes the last direct usage of .watches index mentioned in elastic#62501.
martijnvg added a commit that referenced this issue Dec 7, 2020
Backport #65884 to 7.x branch.

Make use for the cluster state api to check the index.format setting
of the .watches index instead of querying the index settings api.

This removes the last direct usage of .watches index mentioned in #62501.
@mark-vieira mark-vieira removed :Delivery/Build Build or test infrastructure Team:Delivery Meta label for Delivery team labels Jan 29, 2021
@mark-vieira
Copy link
Contributor

FYI, I'm removing the Delivery team label here since the focus here is on specific test implementations not testing infrastructure.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Mar 7, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone dakrone removed :Data Management/Watcher Team:Data Management Meta label for data/management team labels May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. Meta :ml/Transform Transform :ml Machine learning :Security/Security Security issues without another label Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:ML Meta label for the ML team Team:Security Meta label for security team >test Issues or PRs that are addressing/adding tests
Projects
None yet
Development

No branches or pull requests

9 participants