Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge lastest code from author #3

Merged
merged 105 commits into from
Aug 31, 2019
Merged

Conversation

dengweisysu
Copy link
Owner

  • Have you signed the contributor license agreement?
  • Have you followed the contributor guidelines?
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
  • If submitting code, have you checked that your submission is for an OS and architecture that we support?
  • If you are submitting this code for a class then read our policy for that.

Andrey Ershov and others added 30 commits August 26, 2019 12:17
Today if non-TLS record is received on TLS port generic exception will
be logged with the stack-trace.
SSLExceptionHelper.isNotSslRecordException method does not work because
it's assuming that NonSslRecordException would be top-level.
This commit addresses the issue and the log would be more concise.
Adding some logging to track down #45953 and making the failing assertion log more detail
The snapshot status when blocking can still be INIT in rare cases when
the new cluster state that has the snapshot in `STARTED` hasn't yet
become visible.
Fixes #45917
This is essentially the same issue fixed in #43362 but for http request
version instead of the request method. We have to deal with the
case of not being able to parse the request version, otherwise
channel closing fails.

Fixes #43850
This commit refactors the S3 credentials tests in 
RepositoryCredentialsTests so that it now uses a single 
node (ESSingleNodeTestCase) to test how secure/insecure 
credentials are overriding each other. Using a single node 
makes it much easier to understand what each test is actually 
testing and IMO better reflect how things are initialized.

It also allows to fold into this class the test 
testInsecureRepositoryCredentials which was wrongly located 
in S3BlobStoreRepositoryTests. By moving this test away, the 
S3BlobStoreRepositoryTests class does not need the 
allow_insecure_settings option anymore and thus can be 
executed as part of the usual gradle test task.
This commit enhances logging for 2 cases:

1. If non-TLS enabled node receives transport message from TLS enabled
node on transport port.
2. If non-TLS enabled node receives HTTPs request on transport port.
Since #45473, we trim translog below the local checkpoint of the safe
commit immediately if soft-deletes enabled. In
testRestoreLocalHistoryFromTranslog, we should have a safe commit after
recoverFromTranslog is called; then we will trim translog files which
contain only operations that are at most the global checkpoint.

With this change, we relax the assertion to ensure that we don't put
operations to translog while recovering history from the local translog.
Since credentials are required to access such a repository, and these
repositories are accessed over an encrypted protocol (https), this
commit adds support to consider S3-backed artifact repositories as
secure. Additionally, we add tests for this functionality.
This adds support for verifying that snippets with the `console-result`
language are valid json. It also switches the response snippets on the
`docs/get` page from `js` to `console-result` which will allow clients
to provide "alternatives" for them like they can now do with
`// CONSOLE` snippets.
This adds a pipeline aggregation that calculates the cumulative
cardinality of a field.  It does this by iteratively merging in the
HLL sketch from consecutive buckets and emitting the cardinality up
to that point.

This is useful for things like finding the total "new" users that have
visited a website (as opposed to "repeat" visitors).

This is a Basic+ aggregation and adds a new Data Science plugin
to house it and future advanced analytics/data science aggregations.
This commit introduces PKI realm delegation. This feature
supports the PKI authentication feature in Kibana.

In essence, this creates a new API endpoint which Kibana must
call to authenticate clients that use certificates in their TLS
connection to Kibana. The API call passes to Elasticsearch the client's
certificate chain. The response contains an access token to be further
used to authenticate as the client. The client's certificates are validated
by the PKI realms that have been explicitly configured to permit
certificates from the proxy (Kibana). The user calling the delegation
API must have the delegate_pki privilege.

Closes #34396
The native process requires that there be a non-zero number of rows to analyze. If the flag --rows 0 is passed to the executable, it throws and does not start.

When building the configuration for the process we should not start the native process if there are no rows.

Adding some logging to indicate what is occurring.
* [ML] add supported types to no fields error message

* adding supported types to logger debug
)

 * Add support for a Range field ValuesSource, including decode logic for range doc values and exposing RangeType as a first class enum
 * Provide hooks in ValuesSourceConfig for aggregations to control ValuesSource class selection on missing & script values
 * Branch aggregator creation in Histogram and DateHistogram based on ValuesSource class, to enable specialization based on type.  This is similar to how Terms aggregator works.
 * Prioritize field type when available for selecting the ValuesSource class type to use for an aggregation
…IT (#45978)

SearchRestCancellationIT aborts an http request, and then checks that
the corresponding search task has been cancelled on the server-side.
There are no guarantees that the task has already been marked cancelled
after the `cancel` calls returns, and there is no easy wait for that.

This commit introduces an assertBusy to try and wait for the search task
to be marked cancelled.

Closes #45911
This commit starts from the simple premise that the use of node settings
in blob store repositories is a mistake. Here we see that the node
settings are used to get default settings for store and restore throttle
rates. Yet, since there are not any node settings registered to this
effect, there can never be a default setting to fall back to there, and
so we always end up falling back to the default rate. Since this was the
only use of node settings in blob store repository, we move them. From
this, several places fall out where we were chaining settings through
only to get them to the blob store repository, so we clean these up as
well. That leaves us with the changeset in this commit.
* Streamline GS search topic.

* Added missing comma.

* Update docs/reference/getting-started.asciidoc 

Co-Authored-By: István Zoltán Szabó <[email protected]>
Currently we use a custom CopyBytesSocketChannel for interfacing with
netty. We have integration tests that use this channel, however we never
verify the read and write behavior in the face of potential partial
writes. This commit adds a test for this behavior.
Today we create new engines under IndexShard#mutex. This is not ideal
because it can block the cluster state updates which also execute under
the same mutex. We can avoid this problem by creating new engines under
a separate mutex.

Closes #43699
Some generics were specified at too fine-grained a level.
* [DOCS] Streamlined GS aggs section.

* Update docs/reference/getting-started.asciidoc

Co-Authored-By: James Rodewig <[email protected]>
The root project uses the base plugin to get a clean task, but does not
actually need the assemble task. This commit changes the root project to
use the lifecycle-base plugin, which while still creating the assemble
task, won't add any dependencies to it.
We recently added a check to `ESIntegTestCase` in order to verify that
no http channels are being tracked when we close clusters and the
REST client. Close listeners though are invoked asynchronously, hence
this check may fail if we assert before the close listener that removes
the channel from the map is invoked.

With this commit we add an `assertBusy` so we try and wait for the map
to be empty.

Closes #45914
Closes #45955
jasontedor and others added 29 commits August 29, 2019 08:53
This commit adds AdoptOpenJDK to the testing matrix.
This uses strict validation for SLM policy ids, similar to what we use
for index names.

Resolves #45997
* Change the upload order of of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard
* Inspired by #39657
This renames the "data-science" plugin to "analytics".
Also removes the enabled flag
* [ML] Regression dependent variable must be numeric

This adds a validation that the dependent variable of a regression
analysis must be numeric.

* Address review comments and fix some problems

In addition to addressing the review comments, this
commit fixes a few issues I found during testing.

In particular:

- if there were mappings for required fields but they were
not included we were not reporting the error
- if explicitly included fields had unsupported types we were
not reporting the error

Unfortunately, I couldn't get those fixed without refactoring
the code in `ExtractedFieldsDetector`.
…nded max scores. (#46105)

When a query contains a mandatory clause that doesn't track the max score per
block, we disable the max score optimization. Previously, we were doing this by
wrapping the collector with a FilterCollector that always returned
ScoreMode.COMPLETE.

However we weren't adjusting totalHitsThreshold, so the collector could still
call Scorer#setMinCompetitiveScore. It is against the method contract to call
setMinCompetitiveScore when the score mode is COMPLETE, and some scorers like
ReqOptSumScorer throw an error in this case.

This commit tries to disable the optimization by always setting
totalHitsThreshold to max int, as opposed to wrapping the collector.
This commit removes the `classic` similarity from code and docs in master (8.0). The `classic` similarity cannot be used on indices created after 7.0.

Closes #46058
This commit expands the documented directory layout of the rpm and deb
packages to include the bundled jdk.

closes #45150
Currently in production instances of Elasticsearch we set a couple of
system properties by default. We currently do not apply all of these
system properties in tests. This commit applies these properties in the
tests.
This commit removes the oxymoron of insecure secure settings from the
code base. In particular, we remove the ability to set the access_key
and secret_key for S3 repositories inside the repository definition (in
the cluster state). Instead, these settings now must be in the
keystore. Thus, it also removes some leniency where these settings could
be placed in the elasticsearch.yml, would not be rejected there, but
would not be consumed for any purpose.
This commit modifies the HTTP server used in S3BlobStoreRepositoryTests 
so that it randomly returns server errors for any type of request executed by
 the SDK client. It is now possible to verify that the repository tests are s
uccessfully completed even if one or more errors were returned by the S3 
service in response of a blob upload, a blob deletion or a object listing request 
etc.

Because injecting errors forces the SDK client to retry requests, the test limits
 the maximum errors to send in response for each request at 3 retries.
This commit forbids settings that are not in any namespace, all setting
names must now contain a dot.
…#45964)

If a node is misconfigured to talk to remote node HTTP port (instead of
transport port) eventually it will receive an HTTP response from the
remote node on transport port (this happens when a node sends
accidentally line terminating byte in a transport request).
If this happens today it results in a non-friendly log message and a
long stack trace.
This commit adds a check if a malformed response is HTTP response. In
this case, a concise log message would appear.
The test assumption was calling the wrong method resulting in a URL
encoding before returning the data.

Closes #44970
When recovering a shard locally, we use a translog snapshot from
newSnapshotFromGen which consists of all readers from a certain
generation. In the test, we use newSnapshotFromMinSeqNo for the
expectation. The snapshot of this method includes only readers
containing operations in the requesting range.

Closes #46022
* Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581
* Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons
   * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization
* Fixes #41581
This commit moves the plugin.mandatory settings from the plugin
directory page in the docs to the installing plugins page in the docs.
This commit takes the reworking of plugin.mandatory docs even farther by
taking this setting to its own page.
Some netty behavior is controlled by system properties. While we want to
test with the defaults for Elasticsearch for most tests, within netty we
want to ensure these netty settings exhibit correct behavior. This
commit adds variants of test and integTest tasks for netty which set the
unpooled and direct buffer pooled allocators.

relates #45881
Unfortunately, #42791 destabilized SLM tests because those tests use
rate limiting the snapshot write rate to a very low value globally.
Now that the various files in a snapshot get uploaded in parallel
this can lead to a few threads in parallel way overshooting the low
value throughput value used by the rate limiter and then making it
wait for minutes which times out the tests that then try to abort
the snapshot (see #21759 for details, aborting a snapshot only
happens when writing bytes to the repository).

For now the old behavior of the test from before my changes can
be restored by moving to a single threaded snapshot pool but
we should find a better way of testing the SLM behaviour here in
a follow-up.
@dengweisysu dengweisysu merged commit fbbf8b5 into dengweisysu:master Aug 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.