1.23.0 causes nxdomain error #764

Open
Kalaww opened this issue Feb 27, 2025 · 12 comments

Comments

@Kalaww

Kalaww commented Feb 27, 2025

I upgraded hackney from 1.20 to 1.23 and noticed that some of my HTTP requests now fail with nxdomain in my GitLab CI job. I am using hackney through the Tesla library.

Trace

[hackney trace 80 <0.798.0> 2025:02:27 10:33:43 4446] request 
   Content: [{module,hackney},
             {line,313},
             {method,get},
             {url,{hackney_url,hackney_tcp,http,<<"rabbitmq:15672">>,
                               <<"/api/vhosts">>,<<"/api/vhosts">>,<<>>,<<>>,
                               "rabbitmq",15672,<<>>,<<>>}},
             {headers,[{<<"authorization">>,<<"Basic Z3Vlc3Q6Z3Vlc3Q=">>},
                       {<<"content-type">>,<<"application/json">>}]},
             {body,[]},
             {options,[{pool,default}]}]
[hackney trace 60 <0.798.0> 2025:02:27 10:33:43 4447] no proxy env setup, request without proxy 
   Content: [{module,hackney},{line,695}]
[hackney trace 60 <0.798.0> 2025:02:27 10:33:43 4448] connect 
   Content: [{module,hackney_connect},
             {line,32},
             {transport,hackney_tcp},
             {host,"rabbitmq"},
             {port,15672},
             {dynamic,true}]
[hackney trace 80 <0.798.0> 2025:02:27 10:33:43 4450] no socket in the pool 
   Content: [{module,hackney_pool},{line,88},{pool,default}]
[hackney trace 60 <0.798.0> 2025:02:27 10:33:43 4452] happy eyeballs, try to connect using IPv6 
   Content: [{module,hackney_happy},
             {line,32},
             {hostname,"rabbitmq"},
             {port,15672},
             {timeout,8000}]
[hackney trace 80 <0.798.0> 2025:02:27 10:33:43 4464] connect error 
   Content: [{module,hackney_pool},
             {line,108},
             {pool,default},
             {error,{error,nxdomain}}]
[hackney trace 80 <0.798.0> 2025:02:27 10:33:43 4464] connect error 
   Content: [{module,hackney_connect},{line,238}]
@Kalaww
Author

Kalaww commented Feb 27, 2025

I tested with 1.22.0 and hit the same issue.
Also, when I run my tests locally using localhost instead of rabbitmq as the hostname, it works fine.

@Lgdev07

Lgdev07 commented Feb 27, 2025

Hi, we also have the same problem with v1.22 and v1.23.

@benoitc
Owner

benoitc commented Feb 27, 2025

Well, how can I reproduce this? How do you define the IP attached to this hostname? On which OS?

@Kalaww
Author

Kalaww commented Feb 27, 2025

I am sorry for the lazy description. I managed to reproduce it with Docker Compose. Create a file test.ex and a docker-compose.yaml file with the contents below, then run docker compose up --build | grep elixir.
If you change the hackney version to 1.20.1, the issue goes away.
Also, I am using Elixir 1.18 and OTP 26.

test.ex

Mix.install([{:hackney, "== 1.23.0"}])

:hackney.request(
  :get,
  "http://rabbitmq:15672/api/vhosts",
  [{"Authorization", "Basic Z3Vlc3Q6Z3Vlc3Q="}],
  <<>>,
  []
)
|> IO.inspect()

docker-compose.yaml

services:
  rabbitmq:
    image: rabbitmq:3.13-management-alpine
    ports:
      - "5672:5672"
      - "15672:15672"
    environment:
      RABBITMQ_DEFAULT_USER: guest
      RABBITMQ_DEFAULT_PASS: guest

  elixir:
    image: elixir:1.18-otp-26-alpine
    depends_on:
      - rabbitmq
    command: sh -c "elixir /test.ex"
    volumes:
      - ./test.ex:/test.ex

@halfdan

halfdan commented Feb 28, 2025

elixir-1    | ==> mix_install
elixir-1    | ===> Analyzing applications...
elixir-1    | ===> Compiling certifi
elixir-1    | ===> Analyzing applications...
elixir-1    | ===> Compiling parse_trans
elixir-1    | ===> Analyzing applications...
elixir-1    | ===> Compiling metrics
elixir-1    | ===> Analyzing applications...
elixir-1    | ===> Compiling hackney
elixir-1    | {:ok, 200,
elixir-1    |  [
elixir-1    |    {"cache-control", "no-cache"},
elixir-1    |    {"content-length", "220"},
elixir-1    |    {"content-security-policy",
elixir-1    |     "script-src 'self' 'unsafe-eval' 'unsafe-inline'; object-src 'self'"},
elixir-1    |    {"content-type", "application/json"},
elixir-1    |    {"date", "Fri, 28 Feb 2025 13:48:37 GMT"},
elixir-1    |    {"server", "Cowboy"},
elixir-1    |    {"vary", "accept, accept-encoding, origin"}
elixir-1    |  ], #Reference<0.3583026400.1675624450.69706>}

Strangely, this works for me with the example above (run on an M1 Mac).

The bug sounds like #758 which was fixed (for me) with 1.23.0.

@sandromehic

I have a similar issue, and it seems to be linked to the fact that Docker Compose might not be configured to run with IPv6. I was able to reproduce this (:nxdomain errors with hackney 1.22+), but I'm still not sure whether it's my local Docker configuration or something in hackney.

Is there a way to configure hackney not to use IPv6, just to confirm that this is causing the issue? I'm an Elixir dev, so a setting in config.exs that hackney picks up by default would be great.

@sandromehic

I was able to solve my issue: the fault seems to have been the configuration of my DNS server. After some digging, I found that I couldn't reproduce the issue on a different network, but it kept happening on my home Wi-Fi.

Looking at the hackney code, I tried running inet_res:getbyname('localhost', 'a'). and it returned the nxdomain error on my home network. After changing the DNS server my Wi-Fi uses, the error is gone, and I can no longer reproduce it with hackney.

It is still strange, because everything else (like curl) was working normally and I never had an issue with my ISP's DNS server, but something particular about it seems to give Erlang's inet_res module trouble 🤷‍♂

@Kalaww maybe this can help you fix your issue as well
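
One possible reading of the inet_res result above: inet_res is Erlang's own DNS client and only asks the configured name servers, while curl and most other tools go through the OS resolver, which typically also consults /etc/hosts. A name such as localhost can therefore resolve through the OS and still come back as nxdomain from inet_res. Below is a minimal Elixir sketch for comparing the two lookup paths on an affected machine. The ResolverCheck module and its function names are hypothetical helpers (not part of hackney), and the 5-second timeout is an arbitrary choice.

```elixir
defmodule ResolverCheck do
  @moduledoc """
  Hypothetical diagnostic helper (not part of hackney): compares Erlang's
  DNS client (:inet_res) with the default lookup path (:inet.getaddr/2).
  """

  # Arbitrary timeout for the DNS-only queries, in milliseconds.
  @dns_timeout 5_000

  @doc "Resolves `host` through both paths, for IPv4 and IPv6."
  def check(host) when is_binary(host) do
    name = String.to_charlist(host)

    %{
      # DNS-only lookups for A and AAAA records.
      dns_a: summarize(:inet_res.getbyname(name, :a, @dns_timeout)),
      dns_aaaa: summarize(:inet_res.getbyname(name, :aaaa, @dns_timeout)),
      # Lookups through the VM's configured lookup methods
      # (the native OS resolver by default).
      system_v4: summarize(:inet.getaddr(name, :inet)),
      system_v6: summarize(:inet.getaddr(name, :inet6))
    }
  end

  @doc "Normalizes both result shapes to {:ok, addresses} | {:error, reason}."
  # :inet_res.getbyname/3 returns a hostent record: a 6-element tuple whose
  # last element is the address list.
  def summarize({:ok, {:hostent, _name, _aliases, _type, _len, addrs}}), do: {:ok, addrs}
  # :inet.getaddr/2 returns a single address tuple.
  def summarize({:ok, addr}) when is_tuple(addr), do: {:ok, [addr]}
  def summarize({:error, reason}), do: {:error, reason}
end
```

Calling ResolverCheck.check("rabbitmq") inside the elixir container (or ResolverCheck.check("localhost") on the host) returns a map with one entry per lookup path. If system_v4 succeeds while dns_a is {:error, :nxdomain}, the name is probably being answered by something other than the DNS servers that inet_res queries, which would be consistent with the behaviour described above.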

@benoitc
Owner

benoitc commented Feb 28, 2025

@sandromehic which OS is used inside the Docker instance?

@sandromehic

> @sandromehic which os is used inside the docker instance?

Alpine; the elixir:1.15.4-otp-25-alpine image, to be exact.

However, I had the issue even on my host machine, so it looks like it was related more to my entire home network configuration than to a particular Docker setting.

When running something like this on my host machine (with both Erlang/OTP 25 and 27), I would still get the nxdomain error:

rebar3 shell
1> inet_res:getbyname('localhost', 'a').

@alvarodoofinder

I can confirm this error in one of our Elixir projects as well.

@benoitc
Owner

benoitc commented Mar 4, 2025

@alvarodoofinder do you have any logs or a setup to share that would allow reproducing the issue?

@alvarodoofinder

@benoitc Yes, but it's complicated. Some members of the team were not affected by this error, and one of them was affected only at home, not in the office. The problem emerges SOMETIMES when starting the project locally, and I can provide this log:

[Image: screenshot of the log, not recoverable]

The relevant code in the project, from sync_permissions.ex:24:

  @impl Mix.Task
  @spec run([binary]) :: any
  def run(_args) do
    Application.load(:XXXPROJECT)
    Enum.each(@required_apps, &Application.ensure_all_started/1)
    Sync.sync_permissions()
  rescue
    e ->
      Logger.error("#{inspect(e)}")

      Sentry.capture_message("Error synchronizing permissions",
        extra: %{error: inspect(e)}
      )

      :ok
  end

Of course, the instances ARE started before sync_permissions(), which ends up calling this function:

  @spec do_query(binary, map) :: response()
  def do_query(query, variables \\ %{}) do
    case Cortex.query("#{@host}/internal/graphql", query, variables,
           connection_opts: @options,
           headers: [projectmastertoken: @token]
         ) do
      {:ok, %Cortex.Response{data: data, errors: []}} ->
        {:ok, data}

      {:ok, %Cortex.Response{errors: err}} ->
        {:error, err}

      {:error, err} ->
        # http errors
        {:error, err}
    end
  end

Hope it helps!
