
node critical. Synced check and then report HTTP request failed: Get /dev/null: unsupported protocol scheme #17809

Open · sdvdxl opened this issue Jun 20, 2023 · 5 comments

Comments


sdvdxl commented Jun 20, 2023

Overview of the Issue

Reproduction Steps

  1. Initialize a Docker Swarm.

  2. Create a Docker stack from the following docker-compose file:

docker-compose.yml:
version: '3.8'
services:
  consul:
    hostname: consul
    image: "harbor.hekr.me/iotos/consul:1.15.3"
    deploy:
      replicas: 1
      placement:
        max_replicas_per_node: 1
        constraints: [node.role == manager]
    ports:
      - "8500:8500"
      - "8300:8300"
      - "8301:8301"
      - "8302:8302"
      - "8600:8600"
    volumes:
      - consulData:/consul/data
    networks:
      iot-os-network:
      #ipv4_address: 172.20.0.2
    command: agent -server -bootstrap-expect 1 -ui -bind '{{ GetPrivateInterfaces | include "network" "172.20.0.0/24" | attr "address" }}' -client=0.0.0.0

networks:
  iot-os-network:
    ipam:
      config:
        - subnet: 172.20.0.0/24

volumes:
  consulData:
  mongoData:
  redisData:
  minioData:
  clickhouseData:
  logsData:
  driversData:
  confData:
  mysqlData:
  zookeeperData:
  ibosData:
  3. Run for a few days.
  4. The logs show:

Synced check "2R9qN31gaZdi9fySX8RiWD4ujhS"
2023/06/13 16:10:12 [WARN] agent: Check "2R9qN31gaZdi9fySX8RiWD4ujhS" HTTP request failed: Get /dev/null: unsupported protocol scheme ""

  5. To recover, the check must be deregistered (see the sketch below): curl -X PUT http://127.0.0.1:8500/v1/agent/check/deregister/2R9qN31gaZdi9fySX8RiWD4ujhS
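
A minimal recovery sketch, assuming the agent HTTP API is reachable on 127.0.0.1:8500 and jq is installed (substitute the check ID reported by your own agent):

# Deregister the orphaned check by its ID.
curl -s -X PUT http://127.0.0.1:8500/v1/agent/check/deregister/2R9qN31gaZdi9fySX8RiWD4ujhS

# Verify it is gone: the ID should no longer appear in the agent's check list.
curl -s http://127.0.0.1:8500/v1/agent/checks | jq 'keys'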

Consul info for both Client and Server

Client info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 1
    services = 2
build:
    prerelease =
    revision = 7ce982ce
    version = 1.15.3
    version_metadata =
consul:
    acl = disabled
    bootstrap = true
    known_datacenters = 1
    leader = true
    leader_addr = 172.20.0.38:8300
    server = true
raft:
    applied_index = 2792157
    commit_index = 2792157
    fsm_pending = 0
    last_contact = 0
    last_log_index = 2792157
    last_log_term = 6
    last_snapshot_index = 2785461
    last_snapshot_term = 6
    latest_configuration = [{Suffrage:Voter ID:0ebd7757-8fe9-9bae-b624-2e21a087c6c2 Address:172.20.0.38:8300}]
    latest_configuration_index = 0
    num_peers = 0
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 6
runtime:
    arch = amd64
    cpu_count = 8
    goroutines = 157
    max_procs = 8
    os = linux
    version = go1.20.4
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 1
    event_time = 6
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1
    members = 1
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1
    members = 1
    query_queue = 0
    query_time = 1
Client agent HCL config: (not provided)

Server info (agent command line)
agent -server -bootstrap-expect 1 -ui -bind '{{ GetPrivateInterfaces | include "network" "172.20.0.0/24" | attr "address" }}' -client=0.0.0.0

Operating system and Environment details

docker info

Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.7.1-docker)
scan: Docker Scan (Docker Inc., v0.12.0)

Server:
Containers: 42
Running: 15
Paused: 0
Stopped: 27
Images: 35
Server Version: 20.10.12
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: error
NodeID: m739hilgnx2hjv9a9jylyjisi
Is Manager: true
Node Address: 211.66.32.176
Manager Addresses:
211.66.32.176:2377
Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
Default Runtime: runc
Init Binary: docker-init
containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-1062.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 62.74GiB
Name: gzic-lsjnglpt-2
ID: N6FK:ZIYH:XWFU:FFQE:SZZI:GQGM:RSB5:HAIY:XHVZ:SFTY:H3SW:TJKD
Docker Root Dir: /data/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://xxxxx.mirror.aliyuncs.com/
https://xxxx.mirror.swr.myhuaweicloud.com/
Live Restore Enabled: false

OS info

Linux gzic-lsjnglpt-2 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Log Fragments

(log output was attached as screenshots)

huikang (Collaborator) commented Jun 22, 2023

@sdvdxl, thanks for reporting. I noticed that the following command recovers from the issue:

curl -X PUT http://127.0.0.1:8500/v1/agent/check/deregister/2R9qN31gaZdi9fySX8RiWD4ujhS

Could you help clarify the definition of the check 2R9qN31gaZdi9fySX8RiWD4ujhS? The screenshot shows that its Node value is "consul" but its serviceName is "".
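
For reference, a sketch of how the stored entry for that check can be dumped from the local agent, assuming jq is available; the fields of interest are Name, ServiceName, Type, and Output:

curl -s http://127.0.0.1:8500/v1/agent/checks | jq '."2R9qN31gaZdi9fySX8RiWD4ujhS"'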


sdvdxl commented Jun 26, 2023

I don't know where it came from; the only service I actively register is iot-xx. After this check is deregistered, a new one like it may appear after a while, again without a serviceName and reporting the same error.


phil-lavin commented Jul 1, 2023

We have just seen this failure across over 100 nodes. There is a failing health check on all of them called 2Rxye2uPfKB1LhGyfsmDR4n3Rdy. We don't know where this came from; it just appeared today. It isn't present on non-failing nodes. FWIW, all of the failing nodes run Nomad.

(screenshot of the failing check)

De-registering the check on affected nodes recovers them: curl --request PUT "http://${CONSUL_HTTP_ADDR}/v1/agent/check/deregister/2Rxye2uPfKB1LhGyfsmDR4n3Rdy"


phil-lavin commented Jul 1, 2023

We are starting to think this is the result of a 'security' scanner looking for CVE-2022-29153, very probably the Nuclei scanner: projectdiscovery/nuclei-templates#6488. The signature of the bad check that gets created is exactly consistent with the above-mentioned PR.

Issue raised on the nuclei-templates repo: projectdiscovery/nuclei-templates#7595
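
For context, any client that can reach the agent HTTP API (exposed here via -client=0.0.0.0 and a published port 8500) can register such a check. A minimal sketch of the kind of request involved; the check ID, name, and interval below are illustrative and not taken from the actual Nuclei template:

# Registering an HTTP check whose target is "/dev/null" produces the same
# "Get /dev/null: unsupported protocol scheme" warning seen in this issue.
curl -s -X PUT http://<agent-address>:8500/v1/agent/check/register \
  -d '{"ID": "example-scanner-check", "Name": "example-scanner-check", "HTTP": "/dev/null", "Interval": "10s"}'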

@phil-lavin

Confirmed with our security folks that this was a Nuclei scan being conducted against our infrastructure, from a box inside the network. If others are seeing this erroneous /dev/null check, ensure you don't have Nuclei running inside your network, and also ensure that your Consul agents are not directly accessible from the public Internet, as this may be the result of a malicious third party scanning your infrastructure.

Nuclei have pushed a fix to make the test more sane and also mark it as intrusive: projectdiscovery/nuclei-templates#7597
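
One way to reduce exposure for a setup like the compose file above is to stop binding the HTTP API to every interface. A sketch that reuses the same go-sockaddr template already used for -bind (enabling ACLs with a default-deny policy is a further hardening step documented by Consul):

# Restrict the client (HTTP/DNS) address to the private overlay network
# instead of -client=0.0.0.0.
agent -server -bootstrap-expect 1 -ui \
  -bind '{{ GetPrivateInterfaces | include "network" "172.20.0.0/24" | attr "address" }}' \
  -client '{{ GetPrivateInterfaces | include "network" "172.20.0.0/24" | attr "address" }}'

With the client address restricted, the API and UI are served only on the overlay-network address, so the published 8500 port mapping should also be dropped or firewalled rather than left open to the Internet.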
