Cannot configure Stackdriver output plugin #761
Comments
Hi @theFroh, looking at the error I see the following
That means the plugin could not establish a network connection with Google services. Please validate on your end that your system can reach the following HTTPS endpoints: |
Hey @edsiper, the machine definitely has outbound access, and in particular, those two endpoints are definitely accessible from the machine:
Cheers for assisting! |
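A reachability check like the one discussed above can be sketched in a few lines. The endpoint list did not survive in this thread, so the two host names below are assumptions (the OAuth2 host from the JWT "aud" claim and the Cloud Logging API host), and `can_connect` is a hypothetical helper, not part of Fluent Bit.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Assumed endpoints; substitute the ones your plugin version contacts.
    for host in ("www.googleapis.com", "logging.googleapis.com"):
        print(host, "reachable" if can_connect(host) else "unreachable")
```

This only proves that a TCP handshake succeeds from the same machine. It says nothing about which address family the resolver hands back to another program, which turns out to matter later in the thread.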
Would you please enable trace messages with 'Log_Level trace' (in the [SERVICE] section) and share the output? |
No worries, that only really adds a JWT signature printout, though.
The JWT signature has a payload containing (with our correct account name removed):
{
"iss": "<STATS SERVICE ACCOUNT>@<PROJECT NAME>.iam.gserviceaccount.com",
"scope": "https://www.googleapis.com/auth/logging.write",
"aud": "https://www.googleapis.com/oauth2/v4/token",
"exp": 1537150012,
"iat": 1537147012
}
And header:
{
"alg": "RS256",
"typ": "JWT"
}
I can't check if the JWT itself is valid as I've not got the secret or public key to verify with. |
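Even without the key, the header and payload of a JWT can be read, because both are just base64url-encoded JSON. The sketch below rebuilds a token from the claims quoted above (with a placeholder account name and a dummy signature, both assumptions) and decodes it again without verifying anything. `decode_jwt_unverified` is a hypothetical helper name.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, the form used inside a JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_jwt_unverified(token: str) -> tuple[dict, dict]:
    """Return (header, payload) without checking the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


# Illustrative token only: placeholder account name, dummy signature.
header = {"alg": "RS256", "typ": "JWT"}
payload = {
    "iss": "stats@example-project.iam.gserviceaccount.com",
    "scope": "https://www.googleapis.com/auth/logging.write",
    "aud": "https://www.googleapis.com/oauth2/v4/token",
    "exp": 1537150012,
    "iat": 1537147012,
}
token = ".".join([
    b64url_encode(json.dumps(header).encode()),
    b64url_encode(json.dumps(payload).encode()),
    b64url_encode(b"not-a-real-signature"),
])

decoded_header, decoded_payload = decode_jwt_unverified(token)
lifetime_seconds = decoded_payload["exp"] - decoded_payload["iat"]
print(decoded_header["alg"], decoded_payload["aud"], lifetime_seconds)
```

For the claims in this comment the lifetime works out to 3000 seconds, which is within the one-hour window Google documents for service-account assertions, so the claims themselves look plausible.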
I will try to replicate the problem on a 16.04 box. I tested again on my 18.04 box and it works fine. |
No issues here. If you generate a new token file, does it work? |
What is providing your 16.04 testing box? Mine is just a standard, run of the mill VPS; not provided by AWS or the like. To generate a new token, I've followed the following steps from Google as they seem the most applicable:
This reports the same
Am I missing any steps here, or misinterpreting any of the documentation, whether on Fluent Bit's or Google's end?
EDIT: I have also just nabbed the JWT signature from the logs again; it is definitely referencing the correct account in there. |
I deployed Fluent Bit 0.14 in a K8s cluster. The important config is the env variable
From the fluent-bit-ds.yaml file:
The above config needs a secret created like so: kubectl create secret generic --namespace=kube-system stackdriver-service-account --from-file=./stackdriver-service-account.json
I mostly followed the instructions from here: swapped the elasticsearch OUTPUT with stackdriver. But I also tried the simple configmap suggested here: https://docs.fluentbit.io/manual/output/stackdriver
I believe I got the Stackdriver authentication working:
Problem is I don't see logs in my stackdriver project. The final configmap I use is:
I did not set the env variables such as SERVICE_ACCOUNT_EMAIL & SERVICE_ACCOUNT_SECRET because I already have GOOGLE_SERVICE_CREDENTIALS set up. I did not set resource to global, thinking this is already the default. Are there any other logs I can get to dig in more? I don't know what else to try at this point. |
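When relying on GOOGLE_SERVICE_CREDENTIALS alone, one cheap sanity check is to confirm that the mounted key file parses as JSON and carries the usual service-account fields. This is a sketch under assumptions: the field list reflects what a Google service-account key file normally contains, which of them Fluent Bit actually requires is not confirmed here, and `missing_credential_fields` is a hypothetical helper.

```python
import json
import os
import tempfile

# Fields a Google service-account key file normally carries (assumed subset).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_credential_fields(path: str) -> list[str]:
    """Return the expected fields that are absent or empty in the key file."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return [field for field in REQUIRED_FIELDS if not data.get(field)]


# Demo against a throwaway file with dummy values, not a real key.
sample = {
    "type": "service_account",
    "project_id": "example-project",
    "private_key": "dummy-key-material",
    "client_email": "stats@example-project.iam.gserviceaccount.com",
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
print(missing_credential_fields(tmp.name))
os.unlink(tmp.name)
```

Inside the pod, the same function could be pointed at `os.environ["GOOGLE_SERVICE_CREDENTIALS"]` to check the file the plugin is told to read.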
@stevenarvar look under |
@theFroh this is definitely an issue with not being able to hit the Google API servers from the box. Please check your connectivity from the box to those services. I was getting the same error, and once I enabled the traffic to go through, it worked. I do get a few errors when the pod first starts, but afterwards it works. The reason for the initial connection failure in my env is that I am running Istio and those pods have to init before the traffic is routed correctly. I have tested with
I had to enable traffic to the following URLs:
logs:
|
Is there any extra information that we could add to the documentation? Or is it good to close the ticket? |
@edsiper the two domains should be added to the docs. And the logging should print the full URL for which access was denied or the request failed, for example |
@varun-da Just in response to your own reply before, definitely understand that it is a likely cause, but the first thing we checked off in this issue was connectivity from the box to those two addresses. I can confirm I still have connectivity. I'm still hitting the issue, though:
Cheers for the assistance! |
@theFroh the next step I would take is to call the googleapis.com server from that box using curl in verbose mode, passing the JWT token to get the OAuth2 token. Perhaps @edsiper can point to the documentation for doing this. I think I found it: https://developers.google.com/identity/protocols/OAuth2ServiceAccount
Example from the page, I added the
This would definitely help in debugging this further. |
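The token exchange described on that Google page is a form-encoded POST carrying the signed JWT as an assertion. As a sketch of what such a request looks like, the snippet below builds it with the standard library and prints it without sending. The token URL is taken from the "aud" claim quoted earlier in the thread; the JWT string is a placeholder, and `build_token_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request

TOKEN_URL = "https://www.googleapis.com/oauth2/v4/token"  # "aud" claim from the thread
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def build_token_request(signed_jwt: str) -> urllib.request.Request:
    """Build (but do not send) the POST that trades a signed JWT for an access token."""
    body = urllib.parse.urlencode(
        {"grant_type": GRANT_TYPE, "assertion": signed_jwt}
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder assertion; paste the JWT from the trace log to try it for real.
request = build_token_request("HEADER.PAYLOAD.SIGNATURE")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
# To actually send it: urllib.request.urlopen(request, timeout=10).read()
```

A successful response from the same box shows that the credentials and the route are fine for that client, though not necessarily that another program will pick the same resolved address.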
@varun-da Ah, that's definitely a great way to test here. Running it myself with the token as reported in the logs yields a success in my books:
Which doesn't really clear anything up unfortunately. I wonder how Fluentbit's networking differs. |
+1, I am hit by this too. I get a 200 when I do the curl with the JWT token copied from the logs, and the same oauth error from fluentbit logs. |
I'm getting the exact same thing:
|
How did other folks resolve this?
i've tracked the error back to this line: Line 324 in ba0e6c5
i don't know what can cause flb_upstream_conn_get to fail...
|
I was never able to. |
that specific upstream connection error is a TCP connection error reaching the HTTPS end-point. |
Thanks for the pointer @edsiper |
@jakeswenson did you try tls.debug N? See https://docs.fluentbit.io/manual/configuration/tls_ssl
If you try the same thing on a Linux box, does it work? I am wondering if there is an issue on BSD that needs to be fixed. |
@edsiper i just tried with that setting and i am seeing no new output. Does
I ran with here is my config
also i ran a i can try to find a linux box to try this on, but it may take some time... until then it seems like the error is in the http library after dns but before actually sending a packet.... any thoughts @edsiper? |
We use a pretty common libc function to resolve DNS: https://github.com/fluent/fluent-bit/blob/master/src/flb_network.c#L215
Hmm, not sure what it can be, since you should at least see a warning or error message. |
i've been able to patch and build my own version of fluent bit to print a bit more logging to try and find where the error is. |
EINVAL = invalid argument. Which function returned that? connect()?
On Thu, Mar 28, 2019 at 4:16 PM Jake Swenson wrote:
i've been able to patch and build my own version of fluent bit to print a bit more logging to try and find where the error is.
https://github.com/fluent/fluent-bit/blob/master/src/flb_network.c#L311
this line is failing with errno 22 (EINVAL)
i have no idea why or what this means... any thoughts @edsiper?
|
yes, |
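For anyone mapping a raw errno number from patched logging back to its symbolic name, the standard library can do the lookup. This is only a convenience sketch; the message text comes from the local C library and varies by platform.

```python
import errno
import os

code = 22  # the value reported from the patched build
# errorcode maps the number to its symbolic name; strerror gives the libc message.
print(errno.errorcode[code], "-", os.strerror(code))
```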
This appears to be related to ipv6. If I turn off ipv6 support as follows, things work as expected.
|
Wait, what? @sebbacon thanks for testing that disabling IPv6 fixes it. I think it's a poor experience if, instead of the plugin filtering out IPv6 addresses when it doesn't support them, I have to modify my machine to disable IPv6 just to run fluent-bit. |
Also i can verify that i have ipv6 enabled (on loopback...) and that google (obviously) has an AAAA record:
# host www.googleapis.com
www.googleapis.com is an alias for googleapis.l.google.com.
googleapis.l.google.com has address 172.217.3.202
googleapis.l.google.com has address 172.217.14.202
googleapis.l.google.com has address 172.217.14.234
googleapis.l.google.com has IPv6 address 2607:f8b0:400a:803::200a
# ifconfig
lo0: flags=8048<LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128 tentative
inet6 fe80::1%lo0 prefixlen 64 tentative scopeid 0x1
inet 127.0.0.1 netmask 0xff000000
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
groups: lo
epair1b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8<VLAN_MTU>
inet 10.0.51.50 netmask 0xffff0000 broadcast 10.0.255.255
nd6 options=1<PERFORMNUD>
media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
status: active
groups: epair
|
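The output above shows the resolver returning both A and AAAA records on a host whose only IPv6 addresses sit on loopback. A small sketch of how to see what `getaddrinfo` hands back, grouped by address family, and how a client can ask for IPv4 results only, follows. This illustrates the general technique and is not Fluent Bit's code; both helper names are hypothetical.

```python
import socket


def resolve_by_family(host: str, port: int = 443) -> dict[str, list[str]]:
    """Group the addresses getaddrinfo returns for host by address family."""
    grouped: dict[str, list[str]] = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        if family == socket.AF_INET:
            grouped["IPv4"].append(sockaddr[0])
        elif family == socket.AF_INET6:
            grouped["IPv6"].append(sockaddr[0])
    return grouped


def resolve_ipv4_only(host: str, port: int = 443) -> list[str]:
    """Ask the resolver for AF_INET results only, skipping AAAA records."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return [sockaddr[0] for *_rest, sockaddr in infos]


# A numeric address keeps the demo independent of DNS and network access.
print(resolve_by_family("127.0.0.1"))
print(resolve_ipv4_only("127.0.0.1"))
```

Calling `resolve_by_family("www.googleapis.com")` on an affected host would show whether an IPv6 address is among the candidates a client might try to connect to.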
In an environment where IPv6 is prioritized over IPv4, I found that Fluent Bit failed to establish upstream connections to OAuth2 and Stackdriver Logging, so I reported #1348. The fix has been merged into v1.2.
You can specify
However, oauth2 is a little different. |
In addition, the out_bigquery plugin probably has the same problem. |
Signed-off-by: Eduardo Silva <[email protected]>
thanks everyone for the report, I've added ipv6 mode to out_bigquery on 466191c |
i've built and run fluent-bit
config:
it doesn't matter if i configure
is there anything else i can do to help debug this? |
It looks like the output above doesn't have trace messages. Would you please re-run it? (I see trace enabled in the config, but I don't see it in the output.) |
@edsiper as i'm sure you know
I'm certain it's not building that by default, and i need to read up on how it's enabled using the options framework
Are there any log lines in particular you're looking for from tracing? |
FYI: the Stackdriver output plugin has been improved heavily lately (thanks to the Google team's involvement in the project), so I am closing this ticket. Please create a new one if you still face an issue. |
I am still seeing this in 1.7. The stackdriver plugin logs nothing even at trace. |
For new issues please open a new ticket. FYI: v1.7.6 was tested extensively with Stackdriver on Google Cloud: a 10-hour run sending 150k messages per second, no issues found. |
Bug Report
Describe the bug
I have followed the configuration guide for Stackdriver in the manual, but have had no success in establishing a connection to Stackdriver.
To Reproduce
fluent-bit on an Ubuntu 16.04 LTS box
/etc/google/auth/
/etc/td-agent-bit/td-agent-bit.conf to include:
systemctl restart td-agent-bit.service
systemctl status td-agent-bit.service:
Expected behavior
I expected authentication to succeed against Stackdriver.
Your Environment
[OUTPUT] section as described above.
I had to comment out the Plugins_File plugins.conf line as this file does not exist by default and I couldn't find any documentation on the intended contents of such a file. (I also attempted putting the [OUTPUT] config for stackdriver into this file, as well as just leaving the file blank.)
stackdriver output plugin.
Additional context
I'm trying to use fluent-bit to consume and send through server stats from a VPS we have, that is not part of our Google Cloud cluster.