Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLOUD-4046 Deploy on dual-stack ocp4.8 fails #269

Closed
wants to merge 1 commit into from

Conversation

buuhsmead
Copy link

On a dual-stack OpenShift cluster the command 'hostname -i' gives two IPs back. We only need one IP, either IPv4 or IPv6, to build JBOSS_HA_ARGS.

Please make sure your PR meets the following requirements:

[ x] Pull Request title is properly formatted: [CLOUD-4046] Subject
[ x] Pull Request contains link to the JIRA issue
https://issues.redhat.com/browse/CLOUD-4046
[ x] Pull Request contains a description of the issue
On a dual-stack openshift cluster the command 'hostname -i' returns both IPs (IPv4 and IPv6). To fill IP_ADDR. That var IP_ADDR is directly used to build the var JBOSS_HA_ARG. So we get a double IP while it should be one IP.
[x ] Pull Request does not include fixes for issues other than the main ticket
[ x] Attached commits represent units of work and are properly formatted
[ x] You have read and agreed to the Developer Certificate of Origin (DCO) (see CONTRIBUTING.md)

@@ -289,7 +289,7 @@ generate_dns_ping_config() {

configure_ha_args() {
# Set HA args
IP_ADDR=`hostname -i`
IP_ADDR=`hostname -i| cut -d " " -f1`
Copy link
Contributor

@jfdenise jfdenise Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buuhsmead , I am not sure that we can take the first IP address. The server is by default configured to prefer IPv4. In case we have multiple IP addresses we should get the ipv4 I guess. @luck3y could add more there.

We have multiple places where hostname command is used to retrieve the address, seems that we should handle them all.
For example:
https://github.com/wildfly/wildfly-cekit-modules/blob/master/jboss/container/wildfly/launch/messaging/added/launch/messaging.sh#L80
https://github.com/wildfly/wildfly-cekit-modules/blob/master/jboss/container/wildfly/launch/keycloak/added/keycloak.sh#L794 (this one is a left over and should be removed).

Even in the wildfly-s2i project: https://github.com/wildfly/wildfly-s2i/blob/master/wildfly-cloud-legacy-galleon-pack/feature-pack/src/main/resources/packages/wildfly.s2i.common/content/bin/openshift-launch.sh#L8
although this one is overridden by HA args.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfdenise,
I overlooked the prefer IPv4 requirement.
I have another, not so nice, solution.

hostname -i | tr " " "\n" | sort -r | head -n 1

Is that a way of thinking or is it better to search for a (complete) different solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buuhsmead +1 to what @jfdenise has said. Your most recent idea will return one address, it may not be the ipv4 address though, which inside of OpenShift is likely the one you want. hostname -I (capital i) will get you some of the way but the ordering is not guaranteed. I'll have to have a think about how we should do this in a safer way.

Copy link
Contributor

@jfdenise jfdenise Oct 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we could detect the ':' character. An IPv4 address will never contain the ':' character.
What do you think of this function that echo the first non ipv6 address it finds:

function get_host_name() {
 local host=
 for x in $(hostname -i)
 do
  if [[ $x =~ ":" ]]; then
   continue
  else
   host=$x
   break
  fi
 done
 echo "$host"
}

We could add this function to https://github.com/wildfly/wildfly-cekit-modules/blob/master/jboss/container/wildfly/launch-config/os/added/launch/launch-common.sh
And have all places where we call hostname be replaced.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfdenise I think that looks better, but I think it shouldn't be called get_host_name() as its really looking to return a v4 ip. Perhaps call it get_host_ipv4()? Also we could consider having an env variable for when this doesn't work in some environment and allows the ip / hostname to be set via a var, something like WILDFLY_BIND_HOST_ADDR maybe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the name.
For the env variable, I would say that, for current WildFly s2i architecture, we could keep what we have. User can set the JBOSS_HA_IP env variable already.

For WildFly s2i V2, we are keeping backward compatibility (so JBOSS_HA_IP is still recognized when using the cloud feature-pack).
When using vanilla wildfly feature-pack, the new launcher script (present in the image) defines SERVER_PUBLIC_BIND_ADDRESS that fallbacks on hostname -i (that we would replace by a call to a similar function).
So I feel that we would be good.

@@ -287,16 +287,22 @@ generate_dns_ping_config() {
echo "${config}"
}

get_first_ip4() {
ip -4 -o route get to 1 | sed -n 's/.*src \([0-9.]\+\).*/\1/p'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this new attempt! The 'ip' command is not present in our images. Although we could install it, it would be great to just use commands that exist at the operating system level.
You can try : docker run -it --rm quay.io/wildfly/wildfly-centos7:latest bash
Then type your command in the shell.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfdenise this is an attempt with the available commands inside the wildly image. There is a python 2.7.5 available ;-).

cat get-ip4.py

import socket
h_name = socket.gethostname()
IP_addres = socket.gethostbyname(h_name)
print("Host Name is:" + h_name)
print("Computer IP Address is:" + IP_addres)
[jboss@wildfly ~]$ hostname -I
10.131.0.159 fd02:0:0:4::9f
[jboss@wildfly ~]$ hostname -i
fd02:0:0:4::9f 10.131.0.159
[jboss@wildfly ~]$ python get-ip4.py
Host Name is:wildfly
Computer IP Address is:10.131.0.159

I am running on a dual-stack OCP4.9 via 'kubectl run -i --tty --rm wildfly --image=quay.io/wildfly/wildfly-centos7:latest --restart=Never -- bash'

Maybe it can be a possible alternative to use the above python code instead of installing the command /usr/bin/ip.

another 2ct ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get the address from inside the container from /proc, these can probably be filtered and 127.0.0.1 removed, perhaps?

awk '/32 host/ { print f } {f=$2}' <<< "$(</proc/net/fib_trie)" | sort -n | uniq | grep -v '127.0.0.'

Copy link
Contributor

@luck3y luck3y Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I like the idea of adding another dep on python, even though it is there -- its something we may want to not have a dependency in a future image. Shell or installing ip via RPM would both be preferable IMO.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@luck3y your '/proc/net/fib_trie' is working nicely.

[jboss@wildfly ~]$ awk '/32 host/ { print f } {f=$2}' <<< "$(</proc/net/fib_trie)" | sort -n | uniq | grep -v '127.0.0.'
10.131.0.159

And is working with the current available commands. No need to install '/usr/bin/ip' or adding dependency on python.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buuhsmead , FYI the future images will not contain python. Seems that this approach will allow us to get rid-off the hostname command.
Would you mind to add the new function to https://github.com/wildfly/wildfly-cekit-modules/blob/master/jboss/container/wildfly/launch-config/os/added/launch/launch-common.sh and replace all occurrences of hostname in all scripts?
Thank-you.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no problem. Will work on it this week.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buuhsmead Thanks! This looks a lot better now, I left some comments on a review.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think I have all the request in this PR. So it can be reviewed then I assume.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @buuhsmead , would you mind to have "just the changes" related to this PR. Any formatting changes (or changes unrelated to the fix) be removed? That will help a lot reviewing. One thing to do also would be to squash all your commits into a single onr. Thank-you!

@buuhsmead
Copy link
Author

Add the function 'get_host_ipv4' into launch_common.sh and replaced all 'hostname -i'

$elytron_assert
$oidc_elytron
$ejb_config" >> ${CLI_SCRIPT_FILE}
echo "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect these spaces are here on purpose to pad the next string that might be added?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to return the WS between the quotes.

@@ -889,7 +889,7 @@ function add_route_with_default_port() {
fi
fi
done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WS changes like this are fine, but it is preferred they're done in a seperate PR to seperate them from the functional changes.

# For this function we need to return a single ipv4 address
function get_host_ipv4() {
awk '/32 host/ { print f } {f=$2}' <<< "$(</proc/net/fib_trie)" | sort -n | uniq | grep -v '127.0.0.'
Copy link
Contributor

@luck3y luck3y Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part that worries me here is still when > 1 address is returned. I think it would be much more reliable to get the addresses, and if there are > 1 of them log a warning such as 'WARNING: get_host_ipv4() returned X ipv4 addresses, only the first (192.168.45.12) will be used. To override this value, please set JBOSS_HA_IP.

Something like that, anyway.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@luck3y It was a challenge, but we have a warning message. ;-)

@buuhsmead
Copy link
Author

Sorry, but I am stuck at creating one commit out of the several I have already done. I am not able to squash, noob problem. Can you please advice how to proceed? Even if I have to start over, I am willing todo so.

@buuhsmead buuhsmead requested a review from luck3y November 3, 2021 08:05
@luck3y
Copy link
Contributor

luck3y commented Nov 3, 2021

Sorry, but I am stuck at creating one commit out of the several I have already done. I am not able to squash, noob problem. Can you please advice how to proceed? Even if I have to start over, I am willing todo so.

Hi @buuhsmead. To squash, here's what you usually do:

$ git log
(look for your first commit in the chain here, and take the hash)
$ git rebase -i e31d7739b89392d88235b9866c3a024825f1d196
(an editor should appear with the commits listed, leave the first commit at pick and change the previous ones to s (squash)
$ git push -f
(since you've squashed you have to force push to update your github branch.)

Let me know if you need any more details here.

@buuhsmead
Copy link
Author

@luck3y thanks a bunch. That was the help I needed.
Now I hope I can pass the review ;-)

@jfdenise
Copy link
Contributor

jfdenise commented Nov 9, 2021

@buuhsmead , we still have some formatting changes. If you could remove them. Thank-you.

@buuhsmead
Copy link
Author

@jfdenise Hope it is good now? I did a WS edit and a squash. Good that you kept on pushing me to get it right, thanks.

# For this function we need to return a single ipv4 address
function get_host_ipv4() {
mapfile < <(awk '/32 host/ { print f } {f=$2}' <<< "$(</proc/net/fib_trie)" | sort -n | uniq | grep -v '127.0.0.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried locally and have a syntax error for this line.
Could you add some comments to explain what the command is doing, it is not obvious when looking at the command. Specially the awk part.
Is the variable named MAPFILE?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAPFILE is the default variable name used by Bash when using the builtin function mapfile. https://www.geeksforgeeks.org/mapfile-command-in-linux-with-examples/

For the awk part I actually just copied over from @luck3y. It is a awk block pattern '/pattern/ {command}'
the /32 host/ block-pattern filters out lines containing 32 or host from the output of /proc/net/fib_trie

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated / added info just above the function definition.

@jfdenise
Copy link
Contributor

Hi @buuhsmead do we have update for this issue? Thank-you.

@jfdenise
Copy link
Contributor

jfdenise commented Dec 1, 2021

@buuhsmead , don't you see a syntax error when running the get_host_ipv4 function?

@buuhsmead
Copy link
Author

@jfdenise Sorry, but I do no see an error message. Could you pass the error message you are seeing?.
I am testing it on hardware ('CentOS Linux 8') and inside a Pod ('Red Hat Enterprise Linux 8.5 (Ootpa)') running on OCP4.9.4 .

@jfdenise
Copy link
Contributor

jfdenise commented Dec 1, 2021

I see the error on my local RHEL7 and when executing a script containing only the function on the image.
./test.sh: line 2: syntax error near unexpected token <' ./test.sh: line 2: mapfile < <(awk '/32 host/ { print f } {f=$2}' <<< "$(</proc/net/fib_trie)" | sort -n | uniq | grep -v '127.0.0.')'

To test on the image:
docker run -it --rm quay.io/wildfly/wildfly-centos7:latest bash
In another shell:
docker container ls (You check the container ID)
docker cp test.sh [container id]:/tmp/test.sh
Then back to first shell
sh /tmp/test.sh

@buuhsmead
Copy link
Author

@jfdenise Thanks for the way you test it and show it to me. I got the same error when using 'sh' but not when using 'bash'. It is not a sh function but a bash function (mapfile).

bash /tmp/test.sh -- no error
sh /tmp/test.sh -- gives an error

So that is the difference. Then I have to find something else for the warning message. The warning message let me switch over to mapfile, overlooking the /bin/sh.

Can I have some time to refactor it into /bin/sh version?

@jfdenise
Copy link
Contributor

jfdenise commented Dec 1, 2021

@buuhsmead , that is strange, my sh is actually a link to bash. I think that is simply a syntax error that can be handled.
BTW: ping s not present in the image. Your first approach should work by adjusting it.

@buuhsmead
Copy link
Author

@jfdenise ping is available in the image used for testing quay.io/wildfly/wildfly-centos7:latest

Ok, let's assume ping is not available. I will have another look at it.

@jfdenise
Copy link
Contributor

jfdenise commented Dec 1, 2021

@buuhsmead , I think you can fix your original command.
Yes, ping is in centos7 image. Some other images are using the same module and don't contain ping.

@jfdenise
Copy link
Contributor

@buuhsmead , thanks to your investigation we are planning to offer support for dual stack and IPv6. I will close your PR.
The 0.23.x PR is: #369
Thank-you again, that really help the IPv4 support!

@jfdenise jfdenise closed this Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants