
merge latest code from author #2

Merged
merged 59 commits into dengweisysu:master on Aug 26, 2019

Conversation

dengweisysu
Owner

No description provided.

dnhatn and others added 30 commits August 21, 2019 18:14
If soft deletes are enabled, we will trim the translog above the local
checkpoint of the safe commit immediately. However, if the translog
durability is async, the last commit might not be the safe commit, as the
local checkpoint won't advance until the translog is synced. Therefore, we
need to verify the translog stats busily (retrying until they converge).

Closes #45801
Relates #45473
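
As a rough illustration (not the actual change from this commit), verifying "busily" means retrying the stats assertion, e.g. in the style of ESTestCase.assertBusy inside a test; the getTranslogStats helper below is hypothetical:

```java
// Sketch only: with async durability the safe commit can lag behind the last commit,
// so the translog-stats assertion is retried until it holds rather than checked once.
assertBusy(() -> {
    TranslogStats stats = getTranslogStats(shard); // hypothetical helper for the shard's translog stats
    assertThat(stats.getUncommittedOperations(), equalTo(0));
});
```
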
Today we can release a Store using CancellableThreads. If we are holding
the last reference, then we will verify the node lock before deleting
the store. Checking the node lock performs some I/O on a FileChannel. If the
current thread is interrupted, then the channel will be closed and the
node lock will also become invalid.

Closes #45237
* correct version bounds to 7.4 for cat.alias rest tests

* re-enable bwc tests
…bs on S3 (#45383)

This commit adds tests to verify the behavior of the S3BlobContainer and
its underlying AWS SDK client when the remote S3 service responds with
errors or does not respond at all. The expected behavior is that requests are
retried multiple times before the client gives up and the S3BlobContainer
bubbles up an exception.

The test verifies the behavior of BlobContainer.writeBlob() and
BlobContainer.readBlob(). In the case of S3, writing a blob can be executed
as a single upload or using multipart requests; the test checks both scenarios
by writing a small blob and then a large one.
…erty is specified (#45662)

Follow-up of #45626.
Now we always output transport.publish_address with the CNAME and log a
deprecation warning if the es.transport.cname_in_publish_address property
is specified.
This commit also adds a test which will fail once the Elasticsearch version is
changed to 9, to make sure we remove the property when reversioning.

Closes #39970
Follow-up on #32806.

The system property es.http.cname_in_publish_address is deprecated
starting from 7.0.0, and a deprecation warning should be emitted if the
property is specified.
This PR will go to 7.x and master.
A follow-up PR to remove the es.http.cname_in_publish_address property
completely will go to master.
Follow-up on #32806.

The system property es.http.cname_in_publish_address is deprecated
starting from 7.0.0, and a deprecation warning should be emitted if the
property is specified.
This commit goes to 7.x and master.
A follow-up PR to remove the es.http.cname_in_publish_address property
completely will go to master.
* [DOCS] Add template docs to scripts. Reorder template examples.

* Adds a 'Search template' section to the 'How to use scripts' chapter.
  This links to the 'Search template' chapter for detailed info and
  examples.

* Reorders and retitles several examples in the 'Search template'
  chapter. This is primarily to make examples for storing, deleting, and
  using search templates more prominent.

* Change <templatename> to <templateid>
Adds index versioning for the internal data frame transform index. Allows new indices to be created and referenced; `GET` requests now query over the index pattern and take the latest doc (based on index name).
Follow-up of #45616.

Starting with 8.0.0, support for the es.http.cname_in_publish_address setting is
completely removed.
In internal test cluster tests we check that wiping all indices was acknowledged,
but in REST tests we didn't.
This aligns the behavior in both kinds of tests.
Relates to #45605, which might be caused by unacknowledged deletes that were just slow.
* [ML][Transforms] unifying logging, adding some more logging

* using ParameterizedMessage instead of string concatenation

* fixing bracket closure
* Added HLRC support for PinnedQueryBuilder

Relates to #44074
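
As a hedged sketch of what the HLRC support enables (the package of PinnedQueryBuilder and the field and ID values below are assumptions, not taken from the PR):

```java
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.xpack.searchbusinessrules.PinnedQueryBuilder; // package assumed

// Pin two documents ahead of the organically ranked results of the match query.
PinnedQueryBuilder pinned = new PinnedQueryBuilder(
        QueryBuilders.matchQuery("title", "elasticsearch"), // organic query
        "doc-1", "doc-2");                                  // pinned document IDs
```
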
* [ML] Adding data frame analytics stats to _usage API

* making the size of analytics stats 10k
…thrown when the process doesn't start correctly. (#45846)
Two examples had swapped the order of lang and code when creating a
script.

Relates #43884
Today, when rolling a new translog generation, we block all write
threads until a new generation is created. This choice is perfectly
fine except in a highly concurrent environment with the async translog
setting. We can reduce the blocking time by pre-syncing the current
generation, without holding the write lock, before rolling. The new step
would fsync most of the data of the current generation without
blocking write threads.

Closes #45371
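
A hedged sketch of the pre-sync idea described above, not the actual Translog code; the method names below are invented for illustration:

```java
// Fsync the bulk of the current generation before taking the write lock, so write
// threads are only blocked for the small remainder plus the generation switch.
void rollGeneration() throws IOException {
    current.sync();                      // pre-sync outside the write lock
    try (Releasable ignored = writeLock.acquire()) {
        current.sync();                  // sync whatever arrived since the pre-sync
        createNewGenerationAndSwitch();  // hypothetical: create and switch to the new generation
    }
}
```
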
This commit namespaces the existing processors setting under the "node"
namespace. In doing so, we deprecate the existing processors setting in
favor of node.processors.
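
For illustration, a minimal sketch of the renamed setting using the standard Settings builder (the value 4 is arbitrary):

```java
import org.elasticsearch.common.settings.Settings;

Settings nodeSettings = Settings.builder()
        .put("node.processors", 4) // replaces the now-deprecated flat "processors" key
        .build();
```
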
In case of an in-progress snapshot this endpoint was broken because
it tried to execute repository operations in the callback on a
transport thread, which is not allowed (only the generic or snapshot
pools are allowed here).
jasontedor and others added 29 commits August 22, 2019 19:34
This commit enables testing against JDK 14.
Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers about this behavior while still highlighting alternative solutions.
In the Sys V init scripts, we check for Java. This is not needed, since
the same check happens in elasticsearch-env when starting up. Having
this duplicate check has bitten us in the past, where we made a change
to the logic in elasticsearch-env, but missed updating it here. Since
there is no need for this duplicate check, we remove it from the Sys V
init scripts.
This commit changes the tests added in #45383 so that the fixture that
emulates the S3 service now sometimes consumes all of the request body
before sending an error, sometimes consumes only a part of the request
body, and sometimes consumes nothing. The idea here is to beef up the
tests that write blobs, because the client's retry logic relies on
marking and resetting the blob's input stream.

This pull request also changes testWriteBlobWithRetries() so that it
(rarely) tests with a large blob (up to 1 MB), which is more than the client's
default read limit on input streams (131 KB).

Finally, it optimizes the ZeroInputStream so that it is a bit more efficient
(it now works using an internal buffer and System.arraycopy()).
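
A hedged illustration of why the retry logic depends on mark/reset of the blob's input stream; uploadOnce, blobSource and readLimit are placeholders rather than SDK or test code:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

InputStream blob = new BufferedInputStream(blobSource); // blobSource assumed to exist
blob.mark(readLimit);      // remember the start of the body before the first attempt
try {
    uploadOnce(blob);      // hypothetical single upload attempt
} catch (IOException e) {
    blob.reset();          // rewind so the retry re-sends exactly the same bytes
    uploadOnce(blob);
}
```
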
Closing a `RemoteClusterConnection` while concurrently trying to connect
could result in the listener being invoked twice.

This fixes
RemoteClusterConnectionTest#testCloseWhileConcurrentlyConnecting

Closes #45845
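
The race boils down to the usual "notify the listener at most once" pattern, sketched below with an AtomicBoolean; this is not the actual RemoteClusterConnection fix, and the wrapped listener is assumed to exist:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import org.elasticsearch.action.ActionListener;

final AtomicBoolean notified = new AtomicBoolean();
ActionListener<Void> onceListener = ActionListener.wrap(
        r -> { if (notified.compareAndSet(false, true)) listener.onResponse(r); },
        e -> { if (notified.compareAndSet(false, true)) listener.onFailure(e); });
```
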
* [ML][Transforms] fix doSaveState check

* removing unnecessary log statement
Previously, the stats API reported a progress percentage
for DF analytics tasks that are running and are in the
`reindexing` or `analyzing` state.

This means that when the task is `stopped` there is no progress
reported. Thus, one cannot distinguish between a task that never
ran and one that completed.

In addition, there are blind spots in the progress reporting.
In particular, we do not account for when data is loaded into the
process. We also do not account for when results are written.

This commit addresses the above issues. It changes progress
to being a list of objects, each one describing the phase
and its progress as a percentage. We currently have 4 phases:
reindexing, loading_data, analyzing, writing_results.

When the task stops, progress is persisted as a document in the
state index. The stats API now reports in-memory progress
if the task is running, or returns the persisted document
(if there is one).
…#45688)

This commit makes all the async methods in the high level client return the `Cancellable` object that the low level client now exposes.

Relates to #45379 
Closes #44802
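
A small usage sketch of the new return value, assuming a RestHighLevelClient named client and an ActionListener named listener already exist:

```java
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.client.Cancellable;
import org.elasticsearch.client.RequestOptions;

Cancellable cancellable =
        client.searchAsync(new SearchRequest("my-index"), RequestOptions.DEFAULT, listener);
// ... later, if the response is no longer needed:
cancellable.cancel();
```
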
This PR modifies the logic in IngestService to preserve the original content type 
on the IndexRequest, such that when a document with a content type like SMILE 
is submitted to a pipeline, the resulting document that is persisted will remain in 
the original content type (SMILE in this case).
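
For illustration, a hedged sketch of the scenario described, a SMILE-encoded source sent through an ingest pipeline keeping its content type; the index name, pipeline name and smileBytes are placeholders:

```java
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.common.xcontent.XContentType;

IndexRequest request = new IndexRequest("my-index")
        .setPipeline("my-pipeline")
        .source(smileBytes, XContentType.SMILE); // the SMILE content type now survives ingest processing
```
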
This fixes two bugs:
- A recently introduced bug where an NPE will be thrown if a catch block is 
empty.
- A long-time bug where an NPE will be thrown if multiple catch blocks in a 
row are empty for the same try block.
If two translog syncs happen concurrently, then one can return before
its operations are marked as persisted. In general, this should not be
an issue; however, peer recoveries currently rely on this assumption.

Closes #29161
AbstractSimpleTransportTestCase.testTransportProfilesWithPortAndHost
expects a host to only have a single IPv4 loopback address, which isn't
necessarily the case. Allow for one or more addresses.
* [ML][Transforms] adjusting when and what to audit

* Update DataFrameTransformTask.java

* removing unnecessary audit message
The processors setting was deprecated in version 7.4.0 of Elasticsearch
for removal in Elasticsearch 8.0.0. This commit removes the processors
setting.
Now that processors is no longer a valid Elasticsearch setting, this
commit removes translation for it in the Docker entrypoint.
This commit deprecates the pidfile setting in favor of node.pidfile.
Now that the deprecation of pidfile has been backported to 7.4.0, this
commit adjusts the version-conditional logic in cluster formation tasks
for setting pidfile versus node.pidfile.
The TransportAction class has several ways to execute the action, some
of which will create a task. This commit removes those non-task-aware
variants in favor of handling task creation inside NodeClient for local
actions.
The pidfile setting was deprecated in version 7.4.0 of Elasticsearch for
removal in Elasticsearch 8.0.0. This commit removes the pidfile setting.
This commit allows the Transport Actions for the SSO realms to
indicate the realm that should be used to authenticate the
constructed AuthenticationToken. This is useful when many
authentication realms of the same type have been configured
and the caller of the API (Kibana or a custom web app) already
knows which realm should be used, so there is no need to iterate over
all the realms of the same type.
The realm parameter is added to the relevant REST APIs as optional,
so as not to introduce any breaking change.
@dengweisysu dengweisysu merged commit 5097ed2 into dengweisysu:master Aug 26, 2019