Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bind to non-localhost for transport in some cases #82973

Merged

Conversation

jkakavas
Copy link
Member

When enrolling a new node to an existing cluster, we sometimes need to
bind transport to non-localhost addresse. If the other nodes of the
cluster are on different hosts than this node, then the default
configuration of binding transport layer to localhost will prevent this
node to join the cluster even after "successful" enrollment.
We check the non-localhost transport addresses that we receive during
enrollment and if any of these are not in the list of non-localhost IP
addresses that we gather from all interfaces of the current host, we
assume that at least some other node in the cluster runs on another host.

  • If we don't find any non localhost addresses on all interfaces of the
    current host, we don't attempt to bind to non localhost addresses as this
    would result on an error on startup.
  • If the transport layer addresses we found out in enrollment are all
    localhost, we cannot be sure where we are still on the same host, but we
    assume that as it is safer to do so and do not bind to non localhost for
    this node either.

When enrolling a new node to an existing cluster, we sometimes need to
bind transport to non-localhost addresse. If the other nodes of the
cluster are on different hosts than this node, then the default
configuration of binding transport layer to localhost will prevent this
node to join the cluster even after "successful" enrollment.
We check the non-localhost transport addresses that we receive during
enrollment and if any of these are not in the list of non-localhost IP
addresses that we gather from all interfaces of the current host, we
assume that at least some other node in the cluster runs on another host.
- If we don't find any non localhost addresses on all interfaces of the
current host, we don't attempt to bind to non localhost addresses as this
would result on an error on startup.
- If the transport layer addresses we found out in enrollment are all
localhost, we cannot be sure where we are still on the same host, but we
assume that as it is safer to do so and do not bind to non localhost for
this node either.
@jkakavas jkakavas added >bug :Security/Security Security issues without another label v8.0.0 labels Jan 24, 2022
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Jan 24, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

boolean inEnrollmentMode,
List<String> transportStringAddresses,
InetAddress[] localIP
) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found the method just a bit hard to follow.

I believe conceptually it goes over the non-loopback node transport (publish) addresses, which it gets from the node enroll API, and checks if all of them are an addresses on the current host (Network.getAllAddresses).

We can agree to a rewrite in a follow-up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can agree to a rewrite in a follow-up.

lets do it now! This is a small enough PR, no need to do the work again in a follow up. Do you have any concrete suggestions on what to change ? Comments ? Anything you'd prefer to change in code ?

Your description is spot on:

  1. If this is not an enrolling node, we should only bind to localhost ( default behavior )
  2. We get all non loopback addresses of the current host.
    a. If there are all localhost, we bind this node to localhost too for the transport layer, otherwise the node will fail to even start
  3. We get all transport addresses we found out in the enrollement token, the URI - InetAddress.getByName dance is necessary because we have strings of host:port and we want to get the host
    a. If any of these, is not an IP address that we have on the current host, we assume there is a node of the cluster that is not on the current host, and as such we bind transport to non-localhost so that it can talk to that node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a concrete suggestion:

    static boolean anyRemoteHostNodeAddress(List<String> allNodesTransportPublishAddresses, InetAddress[] allHostAddresses) {
        final List<InetAddress> allAddressesList = Arrays.asList(allHostAddresses);
        for (String nodeStringAddress : allNodesTransportPublishAddresses) {
            try {
                final URI uri = new URI("http://" + nodeStringAddress);
                final InetAddress nodeAddress = InetAddress.getByName(uri.getHost());
                if (false == nodeAddress.isLoopbackAddress() && false == allAddressesList.contains(nodeAddress)) {
                    // this node's address is on a remote host
                    return true;
                }
            } catch (URISyntaxException | UnknownHostException e) {
                // I don't know what to do with invalid ip addresses here. Ignoring
            }
        }
        return false;
    }

I think this captures the point 3 above. We do the String->InetAddress conversion in multiple places, and would be reasonable to factor out, but we don't have to deal with that now.
Point 2. I think is covered by the hostSettingValue method, which will avoid to bind to _site_ in any case, if the host doesn't have "site" addresses configured.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point 2. I think is covered by the hostSettingValue method, which will avoid to bind to site in any case, if the host doesn't have "site" addresses configured.

good point!

Here is a concrete suggestion:

Sounds good to me, I incorporated your suggestion

@@ -325,7 +325,7 @@ private static void outputInformationToConsole(
+ ", using the enrollment token that you generated."
);
} else if (Strings.isEmpty(nodeEnrollmentToken)) {
builder.append(infoBullet + " Configure other nodes to join this cluster:");
builder.append(infoBullet + " Configure other nodes to join this cluster:");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand you're tackling #82740 (comment) .
If it is important to you, take care to align the leading spacing of subsequent rows.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is important to you, take care to align the leading spacing of subsequent rows.

I don't think that matters much ( in my opinion at least ). I just wanted to make sure that the words do not collide with the unicode symbols ( not sure if we can fix that universally either but I thought I'd try and we can always iterate based on feedback )

Copy link
Contributor

@albertzaharovits albertzaharovits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@elasticsearchmachine
Copy link
Collaborator

Hi @jkakavas, I've created a changelog YAML for you.

@jkakavas jkakavas added the auto-backport Automatically create backport pull requests when merged label Jan 25, 2022
@jkakavas jkakavas merged commit fbcc9d5 into elastic:master Jan 25, 2022
jkakavas added a commit to jkakavas/elasticsearch that referenced this pull request Jan 25, 2022
When enrolling a new node to an existing cluster, we sometimes need to
bind transport to non-localhost addresse. If the other nodes of the
cluster are on different hosts than this node, then the default
configuration of binding transport layer to localhost will prevent this
node to join the cluster even after "successful" enrollment.
We check the non-localhost transport addresses that we receive during
enrollment and if any of these are not in the list of non-localhost IP
addresses that we gather from all interfaces of the current host, we
assume that at least some other node in the cluster runs on another host.
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.0

jkakavas added a commit that referenced this pull request Jan 25, 2022
When enrolling a new node to an existing cluster, we sometimes need to
bind transport to non-localhost addresse. If the other nodes of the
cluster are on different hosts than this node, then the default
configuration of binding transport layer to localhost will prevent this
node to join the cluster even after "successful" enrollment.
We check the non-localhost transport addresses that we receive during
enrollment and if any of these are not in the list of non-localhost IP
addresses that we gather from all interfaces of the current host, we
assume that at least some other node in the cluster runs on another host.
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Jan 26, 2022
* upstream/master: (762 commits)
  [DOCS] Add note to that log4j customization is outside the support scope (elastic#82668)
  Batch Index Settings Update Requests (elastic#82896)
  [DOCS] Delete pipeline containing stored script (elastic#83102)
  Try again to fix changelog areas after reorg (elastic#83100)
  Bind to non-localhost for transport in some cases (elastic#82973)
  [DOCS] Reuse multi-level `join` warning (elastic#82976)
  Remove unnecessary CopyOnWriteHashMap class (elastic#83040)
  Adjust changelog categories after reorg (elastic#83087)
  [DOCS] Fix typo in `action.destructive_requires_name` breaking change (elastic#83085)
  Stack Monitoring: Add Enterprise Search monitoring index templates (elastic#82743)
  [DOCS] Fix stored script example snippet (elastic#83056)
  [DOCS] Re-add network traffic para to `term` query (elastic#83047)
  [DOCS] Rename example stored script (elastic#83054)
  [ML][DOCS] Add Trained model APIs to the REST APIs index (elastic#82791)
  [ML] Update running process when global calendar changes (elastic#83044)
  [Transform] Fix condition on which the transform stops processing buckets (elastic#82852)
  [DOCS] Fixes field names in ML sum functions. (elastic#83048)
  [ML] fix NLP tokenization never_split handling around punctuation (elastic#82982)
  Construct dynamic updates directly via object builders (elastic#81449)
  Emit trace.id into audit logs (elastic#82849)
  ...

# Conflicts:
#	client/rest-high-level/src/test/java/org/elasticsearch/client/IndicesClientIT.java
#	client/rest-high-level/src/test/java/org/elasticsearch/client/documentation/ILMDocumentationIT.java
#	server/src/main/java/org/elasticsearch/action/admin/indices/rollover/Condition.java
#	server/src/test/java/org/elasticsearch/action/admin/indices/rollover/ConditionTests.java
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/RolloverActionTests.java
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/TimeseriesLifecycleTypeTests.java
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/WaitForRolloverReadyStepTests.java
@pugnascotia pugnascotia removed the v8.0.0 label Feb 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >bug :Security/Security Security issues without another label Team:Security Meta label for security team v8.0.0-rc2 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants