aarch64 flake: iptables - port forwarding ipv4 - sctp #433

Closed
Luap99 opened this issue Oct 6, 2022 · 4 comments · Fixed by #610

Comments

@Luap99
Member

Luap99 commented Oct 6, 2022

not ok 48 firewalld - port forwarding ipv4 - sctp
# (from function `assert' in file test/helpers.bash, line 282,
#  from function `run_nc_test' in file test/helpers.bash, line 561,
#  from function `test_port_fw' in file test/helpers.bash, line 489,
#  in test file test/200-bridge-firewalld.bats, line 231)
#   `test_port_fw proto=sctp' failed
#  nsenter -n -m -w -t 9513 ip link set lo up
#  nsenter -n -m -w -t 9513 dbus-daemon --address=unix:path=/tmp/netavark_bats.xyCUHe/netavark-firewalld --print-pid --config-file=/var/tmp/netavark/test/testfiles/firewalld-dbus.conf
# 9524
# firewalld pid: 9525
#  nsenter -n -m -w -t 9513 firewall-cmd --state
# not running
# [ rc=252 ]
#  nsenter -n -m -w -t 9513 firewall-cmd --state
# running
# {
#   "container_id": "jRjOH1qLZBfsQizL9OBdJjIDlqGakNjhHoVDsXD4dlvIfn54G0GU8anz6uBAFRzO",
#   "container_name": "name-OQLhfplYp6",
#   "port_mappings": [
#     {
#       "host_ip": "",
#       "container_port": 14140,
#       "host_port": 13895,
#       "range": 1,
#       "protocol": "sctp"
#     }
#   ],
#   "networks": {
#     "podman1": {
#       "static_ips": [
#         "10.107.37.189"
#       ],
#       "interface_name": "eth0"
#     }
#   },
#   "network_info": {
#     "podman1": {
#       "name": "podman1",
#       "id": "ed82e3a703682a9c09629d3cf45c1f1e7da5b32aeff3faf82837ef4d005356e6",
#       "driver": "bridge",
#       "network_interface": "podman1",
#       "subnets": [
#         {"subnet":"10.107.37.0/24","gateway":"10.107.37.1"}
#       ],
#       "ipv6_enabled": true,
#       "internal": false,
#       "dns_enabled": true,
#       "ipam_options": {
#         "driver": "host-local"
#       }
#     }
#   }
# }
#  nsenter -n -m -w -t 9513 ./bin/netavark setup /proc/9515/ns/net
# {"podman1":{"dns_search_domains":["dns.podman"],"dns_server_ips":["10.107.37.1"],"interfaces":{"eth0":{"mac_address":"b2:d9:aa:05:97:68","subnets":[{"gateway":"10.107.37.1","ipnet":"10.107.37.189/24"}]}}}}
#  nsenter -n -m -w -t 9513 nc -4 --sctp 10.107.37.1 13895
# #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# #|     FAIL: ncat received data
# #| expected: 'riNSXmdQSy'
# #|   actual: ''
# #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I have seen this twice now, so I am filing this as an issue so we can keep track.

https://cirrus-ci.com/task/5590240982204416
https://cirrus-ci.com/task/6384006940852224

@Luap99
Member Author

Luap99 commented Dec 19, 2022

Now seen with tcp as well: https://cirrus-ci.com/task/4785774147665920

@Luap99
Member Author

Luap99 commented Jan 20, 2023

Failed several times in the nightly cron job, as reported by @lsm5: https://cirrus-ci.com/task/6728664411275264

@Luap99
Member Author

Luap99 commented Jan 24, 2023

This has been happening a lot recently; we should look into it.
cc @baude @mheon @flouthoc

@cevich
Member

cevich commented Jan 31, 2023

Happened twice in a row again on yesterday's netavark/main Cirrus-Cron job. Always aarch64 (running on AWS).

Luap99 added a commit to Luap99/netavark that referenced this issue Mar 7, 2023
I am not 100% sure because I was unable to reproduce this even with
hack/get_ci_vm.sh. This flakes permanently in CI so there must be a
difference in the environment which causes it.

However, looking at the code, we spawn the nc listener in the background
so there is no guarantee that it will already be listening when we make
the nc connect call. To fix this we use the wait_for_port logic to ensure
the port is bound.
For now there is no sctp support; the /proc file format for it is
completely different and I didn't want to spend more time on it, so I
just added a sleep and called it good enough.

Fixes containers#433

Signed-off-by: Paul Holzinger <[email protected]>
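
The race described in the commit message (a background nc listener that may not be bound yet when the client connects) can be illustrated with a minimal sketch. This is not the project's actual wait_for_port helper: the function names port_in_proc_table and wait_for_listener are hypothetical, and the sketch assumes a Linux /proc/net/tcp style table in which field 2 is the local address as HEXIP:HEXPORT and field 4 is the socket state (0A meaning LISTEN). It covers TCP only; as the commit notes, SCTP uses a different /proc format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from test/helpers.bash.

# port_in_proc_table PORT FILE
# Succeeds if FILE (in /proc/net/tcp format) contains a socket in the
# LISTEN state (st == 0A) whose local port equals PORT.
port_in_proc_table() {
    local port_hex
    # /proc/net/tcp prints ports as 4-digit uppercase hex.
    printf -v port_hex '%04X' "$1"
    awk -v p="$port_hex" '
        NR > 1 {
            # $2 is local_address (HEXIP:HEXPORT), $4 is the state.
            n = split($2, a, ":")
            if (a[n] == p && $4 == "0A") found = 1
        }
        END { exit !found }' "$2"
}

# wait_for_listener PORT [TRIES] [FILE]
# Polls FILE (default /proc/net/tcp) every 0.1s until PORT is listening
# or TRIES attempts (default 50) have been used up.
wait_for_listener() {
    local port=$1 tries=${2:-50} table=${3:-/proc/net/tcp}
    while (( tries-- > 0 )); do
        port_in_proc_table "$port" "$table" && return 0
        sleep 0.1
    done
    return 1
}
```

In a test this would sit between starting the listener in the background and running the connecting nc, for example `wait_for_listener "$port" || exit 1`. Because the tests above run inside a network namespace via nsenter, the table would have to be read from within that namespace; how that is wired up is left open here.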
Luap99 added a commit to Luap99/netavark that referenced this issue Mar 9, 2023
flouthoc pushed a commit to flouthoc/netavark that referenced this issue Apr 24, 2023