error fetching the log events from Auth0 #110

Closed
hobbsh opened this issue Oct 25, 2023 · 50 comments
Labels
bug Something isn't working

Comments

@hobbsh (Contributor) commented Oct 25, 2023

What happened?

Seeing this in the logs upon setup, unsure what the issue could be as it's not terribly descriptive.

ts=2023-10-25T21:31:53.365Z caller=log.go:168 level=error msg="error collecting event logs metrics from the selected Auth0 tenant" error="error fetching the log events from Auth0: failed to send the request: context canceled"
  1. Created Machine-to-Machine app in Auth0 with documented grants
  2. Gave exporter access to the credentials at runtime
  3. I see authorization successes in Auth0 UI

What should have happened?

The request would ideally have succeeded.

Reproduction

Running with an internal Helm chart because the existing one has a bug, but that should not be getting in the way here.

hobbsh added the bug (Something isn't working) label Oct 25, 2023
@tfadeyi (Owner) commented Oct 26, 2023

It's odd. Do you mind listing the grants you have set? Is read:logs present?

Were there any additional --flags or values passed to the exporter?

Did you manually make a request to the exporter? Or was it Prometheus?

So you deployed the exporter and it errored on the first request, is that correct?

@hobbsh (Contributor, Author) commented Oct 26, 2023

Grants: read:user_idp_tokens, read:clients and read:logs

Flags:

  - args:
    - export
    - --web.listen-address
    - "9301"
    - --tls.disabled

I have tried manually making a request but only / loads, I cannot get /metrics to load. The only thing I see in the logs is the above error, besides very occasional 429 errors.

@hobbsh (Contributor, Author) commented Oct 26, 2023

So I looked a bit closer at the config, and it looks like setting --auth0.from to a more recent date helped the request succeed. I guess we have too many logs looking back to the default; maybe the default is worth updating to something like now-7d.

Now I am running into a different error that appears to be an actual bug:

ts=2023-10-26T16:59:29.705Z caller=log.go:168 level=error msg="error collecting event logs metrics from the selected Auth0 tenant" error="error fetching the log events from Auth0: failed to unmarshal response payload: json: cannot unmarshal number into Go struct field logWrapper.user_id of type string"
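For reference, this failure can be reproduced outside the exporter with a few lines of Go. The struct below is a simplified stand-in for the wrapper the SDK decodes into (the real type lives in go-auth0's management package), so treat it as an illustration rather than the library's actual code.

package main

import (
    "encoding/json"
    "fmt"
)

// Simplified stand-in for the SDK's internal wrapper type.
type logWrapper struct {
    UserID string `json:"user_id"`
}

func main() {
    // A log event whose user_id arrives as a JSON number instead of a string.
    payload := []byte(`{"user_id": 12345}`)

    var lw logWrapper
    if err := json.Unmarshal(payload, &lw); err != nil {
        // json: cannot unmarshal number into Go struct field logWrapper.user_id of type string
        fmt.Println(err)
    }
}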

@tfadeyi (Owner) commented Oct 26, 2023

Would you be able to change the permissions to add read:users rather than read:user_idp_tokens?

@hobbsh (Contributor, Author) commented Oct 26, 2023

That did not seem to have any effect. Kicked the pod to be sure.

@tfadeyi (Owner) commented Oct 26, 2023

The error seems to come from the Auth0 Go client: https://github.com/auth0/go-auth0/blob/main/management/log.go#L160.
By any chance do you have a non-sensitive example log you can share? If not, that's OK; I can try to see if I can reproduce it.

@hobbsh (Contributor, Author) commented Oct 27, 2023

I'll see what I can do, but so far, looking through the logs, nothing is sticking out other than some logs not containing a value for user_id (not the int type the error suggests). Is there a way to make the exporter print the offending log? I'm surprised Auth0's own SDK can't handle its own logs 😬

@tfadeyi (Owner) commented Oct 27, 2023

Thank you! In the Auth0 log dashboard for your tenant, you should be able to try this Lucene query: user_id: > 0 or user_id: > -2147483648; maybe it will return the log.

Is there a way to make the exporter print the offending log?

Unfortunately not; the call to the Auth0 SDK seems to fail before returning any logs. I could add a log statement and maybe capture the logs preceding the problem log.

@tfadeyi (Owner) commented Oct 27, 2023

By any chance are you using a custom Auth0 connection for your users? https://auth0.com/docs/manage-users/user-accounts/identify-users

@hobbsh (Contributor, Author) commented Oct 27, 2023

The queries returned no results but we are using a custom database connection for users, yes.

@tfadeyi (Owner) commented Oct 30, 2023

I've tried to replicate the issue by adding a custom connector to my Auth0 tenant and setting the user_id to an integer, but the system automatically prefixes it, producing "auth0|1".

It doesn't seem possible for a user_id to be an int.
By any chance are you using a unique identity provider?

I'm going to open an issue in the auth0 go client.

@hobbsh (Contributor, Author) commented Nov 8, 2023

Under Identity Provider Attributes for a specific user in the Auth0 dashboard, I see user_id is set to auth0|<some_int> and in the object under identities, I see something like:

[
  {
    "user_id": "<some_int>",
    "provider": "auth0",
    "connection": "Username-Password-Authentication",
    "isSocial": false
  }
]

I don't know off the top of my head if it's common for user_id in identities to be in this format - it's still technically a string though.
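If a decoder ever needed to tolerate both shapes of user_id, a custom unmarshaller along these lines would do it. This is only a sketch of the idea; it is not how go-auth0 is actually structured.

package main

import (
    "encoding/json"
    "fmt"
)

// FlexibleID accepts a user_id that arrives as either a JSON string or a
// JSON number and normalises it to a string.
type FlexibleID string

func (f *FlexibleID) UnmarshalJSON(data []byte) error {
    var s string
    if err := json.Unmarshal(data, &s); err == nil {
        *f = FlexibleID(s)
        return nil
    }
    var n json.Number
    if err := json.Unmarshal(data, &n); err != nil {
        return err
    }
    *f = FlexibleID(n.String())
    return nil
}

func main() {
    var a, b struct {
        UserID FlexibleID `json:"user_id"`
    }
    _ = json.Unmarshal([]byte(`{"user_id": "auth0|123"}`), &a)
    _ = json.Unmarshal([]byte(`{"user_id": 123}`), &b)
    fmt.Println(a.UserID, b.UserID) // auth0|123 123
}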

@hobbsh (Contributor, Author) commented Nov 8, 2023

@tfadeyi Is it possible to make modifications to log out what value was encountered or is this all wrapped up in go-auth0?

@tfadeyi (Owner) commented Nov 8, 2023

Under Identity Provider Attributes for a specific user in the Auth0 dashboard, I see user_id is set to auth0|<some_int> and in the object under identities, I see something like:

[
  {
    "user_id": "<some_int>",
    "provider": "auth0",
    "connection": "Username-Password-Authentication",
    "isSocial": false
  }
]

Hey @hobbsh, thank you for reporting that instance. As you said, it is still technically a string, so it should be unmarshalled correctly.

@tfadeyi Is it possible to make modifications to log out what value was encountered or is this all wrapped up in go-auth0?

Unfortunately, it is all abstracted by go-auth0. I've created an issue, auth0/go-auth0#303, but I'm still waiting for a reply.

I think an alternative would be to generate a client for Auth0 separate from go-auth0, but it would require quite a bit of effort.

@hobbsh (Contributor, Author) commented Nov 15, 2023

@tfadeyi I don't know why it took me this long to do it, but I built and ran it locally with the same parameters as in Kubernetes and it works with no problem. What could the discrepancy be?

@tfadeyi (Owner) commented Nov 15, 2023

So when you ran it locally you didn't encounter the unmarshalling error?

I did make some changes to the request timeouts, which are available in the v0.2.2 release, but I'm not sure if those solve the issue.

@hobbsh (Contributor, Author) commented Nov 15, 2023

The problem seemed to be not including the TOKEN env var. I just so happened to export it by accident using the env-dev.sh script locally before setting CLIENT_ID/CLIENT_SECRET instead. I noticed your Helm chart has it set even when using a Machine-to-Machine Application, so I just set it to the same value as CLIENT_SECRET (wondering if it could just be anything) and it worked.

@hobbsh (Contributor, Author) commented Nov 15, 2023

On a side note, I am still seeing context canceled logs on occasion using 0.2.2 and wondering if I should increase either the Prometheus scrape interval or somehow set a longer Auth0 scrape interval.

ts=2023-11-15T22:58:12.310Z caller=log.go:168 level=error msg="error collecting event logs metrics from the selected Auth0 tenant" error="error fetching the users from Auth0: failed to send the request: context canceled"

It also looks like Prometheus is hitting its scrape timeout as well.

@tfadeyi (Owner) commented Nov 16, 2023

I noticed your helm chart has it set even when using a Machine-to-Machine Application, so I just set it to the same thing as the CLIENT_SECRET (wondering if it could just be anything) and it worked.

I see, so you still encountered the unmarshalling issue when TOKEN was not set, but once you set the TOKEN as CLIENT_SECRET it worked, is that correct?
So in your local environment you have TOKEN, CLIENT_SECRET and CLIENT_ID set right?

I noticed your helm chart has it set even when using a Machine-to-Machine Application

Yeah, it simplified the Helm chart; it leaves the decision about which credentials to use to the exporter code.
In the exporter, client credentials have priority over the token.
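For illustration, the priority described above amounts to something like the sketch below; the function and its names are hypothetical, not the exporter's actual code.

package main

import (
    "errors"
    "fmt"
)

// selectCredentials mirrors the priority described above: client credentials
// win over a static token when both are supplied.
func selectCredentials(clientID, clientSecret, token string) (string, error) {
    switch {
    case clientID != "" && clientSecret != "":
        return "client-credentials", nil
    case token != "":
        return "static-token", nil
    default:
        return "", errors.New("no Auth0 credentials were provided")
    }
}

func main() {
    mode, _ := selectCredentials("some-id", "some-secret", "some-token")
    fmt.Println(mode) // client-credentials: the token is ignored
}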

@tfadeyi (Owner) commented Nov 16, 2023

On a side note, I am still seeing context canceled logs on occasion using 0.2.2 and wondering if I should increase either the Prometheus scrape interval or somehow set a longer Auth0 scrape interval.

I thought the changes from 0.2.2 might help in this scenario but it seems like it takes too much time to fetch the tenant users.

What are your current scraping period and timeout?

@hobbsh (Contributor, Author) commented Nov 16, 2023

What are your current scraping period and timeout?

I set interval to 60s and scrapeTimeout to 20s on the ServiceMonitor and that seems to do the trick. Average duration looks like it's ~15s (scrapeTimeout defaults to 10s).

I see, so you still encountered the unmarshalling issue when TOKEN was not set, but once you set the TOKEN as CLIENT_SECRET it worked, is that correct? So in your local environment you have TOKEN, CLIENT_SECRET and CLIENT_ID set right?

Correct, and it's also set in our Kubernetes deployment.

@hobbsh (Contributor, Author) commented Nov 17, 2023

So, rolling this out to our upper environments, I am seeing timeouts again. I wonder if it would make sense to either make the exporter not look back for logs at all and start from time.Now(), and/or be able to filter by client name. Additionally, it might be helpful to see more details about exactly why it's timing out; the error message implies that there are too many logs, but we don't know what that number is, for instance.

ts=2023-11-17T00:40:51.032Z caller=log.go:168 level=error msg="Could not finish fetching all the log events, too many might be present, from Auth0 in the given request context, try adding the --auth0.from" error="failed to send the request: context canceled"
ts=2023-11-17T00:40:51.275Z caller=log.go:168 level=error msg="error collecting event logs metrics from the selected Auth0 tenant" error="error fetching the users from Auth0: failed to send the request: context canceled"

@tfadeyi (Owner) commented Nov 17, 2023

I should have a couple of hours to look into these. I'll update the logs to include the number of logs fetched before erroring.

What value did you assign to auth0.from?

I think adding the client filter would be a good idea; I'll look at how to implement it.

I'm still trying to work out why it worked when you set both the token and the client credentials. If you don't mind, could I see your redacted exporter configuration?

@hobbsh (Contributor, Author) commented Nov 17, 2023

Last night I set it to 11/17, which it had just turned over to in UTC, so the look-back should have been minimal; however, after running OK for a couple of hours, the timeouts came back.

As for the config:

     container:
       args:
         - "export"
         - "--tls.disabled"
         - "--auth0.from"
         - "2023-11-17"
         - "--log.level"
         - "debug"

And setting DOMAIN, TOKEN, CLIENT_ID and CLIENT_SECRET via the environment (the last 3 being sourced from SecretsManager).

@tfadeyi (Owner) commented Nov 18, 2023

@hobbsh (Contributor, Author) commented Nov 19, 2023

Still the same issue unfortunately. So if the exporter relies on --auth0.from, what happens when the pod restarts and the checkpoint is now potentially days/weeks/months in the past? Is that how this would work? Shouldn't it default to something dynamic and recent to overcome this?

@tfadeyi (Owner) commented Nov 19, 2023

Do you mind posting the error log again?

Currently, if --auth0.from is not set, it defaults to time.Now(), and the checkpoint is fetched by getting the last log of the previous day; if no logs are present on the previous day, it keeps going back one day at a time until it finds one.

It does this for a maximum of 30 days back, which I think is the regular retention period for logs in Auth0.
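As a rough illustration of that checkpoint search (with a hypothetical lastLogOfDay lookup standing in for the real Auth0 query, not the exporter's actual function):

package main

import (
    "fmt"
    "time"
)

// findCheckpoint walks backwards one day at a time, looking for the most
// recent day that has any log events, and gives up after 30 days (roughly
// Auth0's default log retention).
func findCheckpoint(now time.Time, lastLogOfDay func(day time.Time) (string, bool)) (string, bool) {
    for i := 1; i <= 30; i++ {
        day := now.AddDate(0, 0, -i)
        if id, ok := lastLogOfDay(day); ok {
            return id, true
        }
    }
    return "", false
}

func main() {
    // Hypothetical data: only 2023-11-18 has any logs.
    logsByDay := map[string]string{"2023-11-18": "log_abc123"}
    lookup := func(day time.Time) (string, bool) {
        id, ok := logsByDay[day.Format("2006-01-02")]
        return id, ok
    }
    start, _ := time.Parse("2006-01-02", "2023-11-20")
    fmt.Println(findCheckpoint(start, lookup)) // log_abc123 true
}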

@hobbsh (Contributor, Author) commented Nov 20, 2023

Ah, so I didn't remove auth0.from previously. I just did, and things are looking much better! I'll see how things look in a couple of days. Thank you so much for your support!

@hobbsh (Contributor, Author) commented Nov 20, 2023

We did get an abnormal spike in activity/logs overnight, which I'm sure contributed to it, as collection started having issues around that time. It doesn't appear that we crossed 12k logs in the past 24 hours until a few hours ago, though.

@tfadeyi (Owner) commented Nov 20, 2023

@hobbsh (Contributor, Author) commented Nov 21, 2023

It ran for ~2 hours, then the user_id error came back:

error collecting event metrics from the selected Auth0 tenant" error="error fetching the log events from Auth0: failed to unmarshal response payload: json: cannot unmarshal number into Go struct field logWrapper.user_id of type string

If I restart the pod, it starts running again.

@tfadeyi (Owner) commented Nov 21, 2023

Did the error only occur once? When it occurred, did the exporter completely stop working? Is it able to answer queries from Prometheus?

I'm asking because the exporter should be able to recover from request errors.

@hobbsh (Contributor, Author) commented Nov 21, 2023

The error repeats at every collection interval for Auth0 and could be a red herring. The Prometheus scraping fails at this point too - the target (ServiceMonitor) is considered down as far as Prometheus is concerned because it is blowing past the scrapeTimeout. kubectl port-forward and trying to hit localhost:9301/metrics takes a very long time (almost 2 minutes).

curl localhost:9301/metrics  0.00s user 0.01s system 0% cpu 1:46.12 total

The exporter does not seem to be resource constrained but I am bumping requests to see if it helps.

@hobbsh (Contributor, Author) commented Nov 21, 2023

Looking at the metrics after curl'ing locally, it looks like that one request 5x'd the memory usage on the pod.

@hobbsh (Contributor, Author) commented Dec 5, 2023

@tfadeyi Unfortunately, after 2.5 hours (almost exactly), metrics stop flowing. With v0.2.5 I at least get a panic now. It picks back up fine after the pod restarts (in the case I just saw, it was rescheduled), which also makes me think that somehow this condition is not causing the health check endpoint to report unhealthy.

{"time":"2023-12-05T20:46:42.399349475Z","level":"-","prefix":"echo","file":"recover.go","line":"120","message":"[PANIC RECOVER] interface conversion: interface {} is nil, not []*management.User goroutine 20524 [running]:\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).Export.Recover.RecoverWithConfig.func14.1.1()\n\tgithub.com/labstack/echo/[email protected]/middleware/recover.go:100 +0x150\npanic({0xafbda0?, 0xc00037a420?})\n\truntime/panic.go:914 +0x21f\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).collect(0xc000214dc0, {0x15195e0, 0xc000d35d60}, 0xbf522e?)\n\tgithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter/exporter.go:176 +0x65e\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).Export.(*exporter).metrics.func12({0x1524020, 0xc00020c3c0})\n\tgithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter/exporter.go:108 +0xf6\ngithub.com/labstack/echo/v4.(*Echo).add.func1({0x1524020, 0xc00020c3c0})\n\tgithub.com/labstack/echo/[email protected]/echo.go:582 +0x4b\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).Export.Timeout.TimeoutWithConfig.TimeoutConfig.ToMiddleware.func15.1({0x1524020, 0xc00020c3c0})\n\tgithub.com/labstack/echo/[email protected]/middleware/timeout.go:113 +0x6e\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).Export.func1.Middleware.func1({0x1524020, 0xc00020c3c0})\n\tgithub.com/tfadeyi/auth0-simple-exporter/pkg/logging/logging.go:65 +0x106\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).Export.Recover.RecoverWithConfig.func14.1({0x1524020, 0xc00020c3c0})\n\tgithub.com/labstack/echo/[email protected]/middleware/recover.go:131 +0x119\ngithub.com/labstack/echo/v4.(*Echo).ServeHTTP(0xc00034c000, {0x1517780?, 0xc000d3ae00}, 0xc00020aa00)\n\tgithub.com/labstack/echo/[email protected]/echo.go:669 +0x399\nnet/http.serverHandler.ServeHTTP({0xc000266600?}, {0x1517780?, 0xc000d3ae00?}, 0x6?)\n\tnet/http/server.go:2938 +0x8e\nnet/http.(*conn).serve(0xc00025e1b0, {0x15195a8, 0xc0004ba300})\n\tnet/http/server.go:2009 +0x5f4\ncreated by net/http.(*Server).Serve in goroutine 67\n\tnet/http/server.go:3086 +0x5cb\n\ngoroutine 1 [semacquire, 231 minutes]:\nsync.runtime_Semacquire(0xc00034a330?)\n\truntime/sema.go:62 +0x25\nsync.(*WaitGroup).Wait(0xb3f100?)\n\tsync/waitgroup.go:116 +0x48\ngolang.org/x/sync/errgroup.(*Group).Wait(0xc000339fc0)\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:53 +0x25\ngithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter.(*exporter).Export(0xc000214dc0)\n\tgithub.com/tfadeyi/auth0-simple-exporter/pkg/exporter/server.go:140 +0x1225\ngithub.com/tfadeyi/auth0-simple-exporter/cmd.serveExporterCmd.func2(0xc0000dab00?, {0xbd142a?, 0x4?, 0xbd142e?})\n\tgithub.com/tfadeyi/auth0-simple-exporter/cmd/exporter.go:51 +0x59b\ngithub.com/spf13/cobra.(*Command).execute(0xc0001a6000, {0xc00007c210, 0x3, 0x3})\n\tgithub.com/spf13/[email protected]/command.go:983 +0xabc\ngithub.com/spf13/cobra.(*Command).ExecuteC(0x19ca780)\n\tgithub.com/spf13/[email protected]/command.go:1115 +0x3ff\ngithub.com/spf13/cobra.(*Command).Execute(...)\n\tgithub.com/spf13/[email protected]/command.go:1039\ngithub.com/spf13/cobra.(*Command).ExecuteContext(...)\n\tgithub.com/spf13/[email protected]/command.go:1032\ngithub.com/tfadeyi/auth0-simple-exporter/cmd.Execute({0x15195a8?, 0xc00007c000})\n\tgithub.com/tfadeyi/auth0-simple-exporter/cmd/root.go:44 +0x8d\nmain.main()\n\tgithub.com/tfadeyi/auth0-simple-exporter/main.go:44 +0x112\n\ngoroutine 21 [syscall, 231 
minutes]:\nos/signal.signal_recv()\n\truntime/sigqueue.go:152 +0x29\nos/signal.loop()\n\tos/signal/signal_unix.go:23 +0x13\ncreated by os/signal.Notify.func1.1 in goroutine 1\n\tos/signal/signal.go:151 +0x1f\n\ngoroutine 5 [select, 231 minutes]:\nos/signal.NotifyContext.func1()\n\tos/signal/signal.go:288 +0x66\ncreated by os/signal.NotifyContext in goroutine 1\n\tos/signal/signal.go:287 +0x165\n\ngoroutine 67 [IO wait]:\ninternal/poll.runtime_pollWait(0x7fc4869b0c60, 0x72)\n\truntime/netpoll.go:343 +0x85\ninternal/poll.(*pollDesc).wait(0xc00022a100?, 0x4?, 0x0)\n\tinternal/poll/fd_poll_runtime.go:84 +0x27\ninternal/poll.(*pollDesc).waitRead(...)\n\tinternal/poll/fd_poll_runtime.go:89\ninternal/poll.(*FD).Accept(0xc00022a100)\n\tinternal/poll/fd_unix.go:611 +0x2ac\nnet.(*netFD).accept(0xc00022a100)\n\tnet/fd_unix.go:172 +0x29\nnet.(*TCPListener).accept(0xc0000bc0c0)\n\tnet/tcpsock_posix.go:152 +0x1e\nnet.(*TCPListener).\n"}
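The message at the top of the trace is the classic symptom of asserting a nil interface value straight to a concrete type (here []*management.User). Below is a minimal, self-contained illustration of the mechanism and the comma-ok guard that avoids the panic; it is not the exporter's actual fix.

package main

import "fmt"

func main() {
    var result interface{} // nil, e.g. because a fetch returned nothing

    // users := result.([]string) // a direct assertion would panic:
    // "interface conversion: interface {} is nil, not []string"

    users, ok := result.([]string) // comma-ok assertion: no panic
    if !ok {
        fmt.Println("nothing fetched this scrape; skipping these metrics")
        return
    }
    fmt.Println(len(users))
}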

@tfadeyi (Owner) commented Dec 5, 2023

I'll make a new release with a fix; hopefully this fixes the issue.
https://github.com/tfadeyi/auth0-simple-exporter/pkgs/container/auth0-simple-exporter/154921705?tag=v0.2.6

@hobbsh (Contributor, Author) commented Dec 6, 2023

Thanks! The panic stopped using 0.2.6 but I'm still seeing this log unfortunately:

ts=2023-12-06T16:45:26.032Z caller=log.go:168 level=error users_found=0 msg="Request was terminated by the client,the exporter could not finish polling the Auth0 user client to fetch the tenant users.Please increase the client timeout" error="failed to send the request: context canceled"

@tfadeyi (Owner) commented Dec 6, 2023

I'll be adding a new flag to disable the metric https://github.com/tfadeyi/auth0-simple-exporter/pull/132/files#diff-5ba0a0c121dd1ea9d455c9163c3d6ca5d477d225c3ed873668aa7138e51e5e0a, and I'll make a release.

Is the exporter still not exporting log metrics?

@tfadeyi (Owner) commented Dec 6, 2023

There should be a new release, v0.3.0-rc.2 (https://github.com/tfadeyi/auth0-simple-exporter/pkgs/container/auth0-simple-exporter); it should have the flag to disable the users metric, exposed as metrics.users.enabled on the chart.

@hobbsh (Contributor, Author) commented Dec 7, 2023

So far so good, been running for the longest continuous stretch it ever has! 🤞

@tfadeyi (Owner) commented Dec 9, 2023

@hobbsh are things still stable with the exporter?

@hobbsh (Contributor, Author) commented Dec 10, 2023

@tfadeyi Looking great, thanks for working through this with me!

@tfadeyi (Owner) commented Dec 10, 2023

Perfect! Then I'll make a full stable release soon!

tfadeyi closed this as completed Dec 11, 2023
@hobbsh (Contributor, Author) commented Dec 18, 2023

@tfadeyi Unfortunately the exporter is still hanging (although after much longer periods of operation). It is still timing out fetching logs (this seems to correlate with spikes in activity, but not always). I'm wondering a couple of things:

  • Would it make sense to add a client.timeout argument? (A rough idea of what that could drive is sketched after the log below.)
  • Can the current fetch time be logged so we know how long it takes and to inform the timeout setting?

ts=2023-12-18T17:05:45.104Z caller=log.go:168 level=error logs_events_found=9261 from="2023-12-18 01:52:11.445 +0000 UTC" msg="Request was terminated by the client,the exporter could not finish polling the Auth0 log client to fetch the tenant logs.Please increase the client timeout or try adding the --auth0.from flag" error="failed to send the request: context canceled"
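As mentioned in the first bullet above, a client.timeout flag could feed a per-request deadline roughly like the sketch below; the flag, the helper, and its wiring are hypothetical, not the exporter's current code.

package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "time"
)

// fetchWithTimeout applies a per-request deadline of the kind a client.timeout
// flag could drive.
func fetchWithTimeout(parent context.Context, url string, timeout time.Duration) ([]byte, error) {
    ctx, cancel := context.WithTimeout(parent, timeout)
    defer cancel()

    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        // A deadline hit here surfaces as "context deadline exceeded" rather
        // than the "context canceled" seen when the scraper gives up first.
        return nil, err
    }
    defer resp.Body.Close()
    return io.ReadAll(resp.Body)
}

func main() {
    // Hypothetical endpoint, just to exercise the helper.
    body, err := fetchWithTimeout(context.Background(), "https://example.com", 5*time.Second)
    if err != nil {
        fmt.Println("fetch failed:", err)
        return
    }
    fmt.Println("fetched", len(body), "bytes")
}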

tfadeyi reopened this Dec 18, 2023
@tfadeyi (Owner) commented Dec 18, 2023

@hobbsh I think we can make the client timeout a flag/value.

Currently the timeout is disabled, so I think the Prometheus timeout must have ended the request.

I'll add more logs about the fetch times.

I'll see if I can also add a whitelist flag for the Auth0 clients to fetch metrics from.

By any chance, do you have a general idea of how much time passes before the issue occurs?

@hobbsh (Contributor, Author) commented Dec 19, 2023

By any chance, do you have a general idea of how much time passes before the issue occurs?

Looks like anywhere between 48-96 hours. The last instance lasted for ~90 hours. I am also still curious whether this is something the health check could catch, as it seems like something just gets stuck. I've also been looking at resource usage to see if I can correlate anything, and usage does look interesting in periods where the exporter stops functioning. Usage in the grand scheme is very small, but comparing functional vs. non-functional definitely paints a clear picture. The elevated sections are when the exporter stopped functioning.

(Screenshot, 2023-12-18: pod resource usage graph)

tfadeyi added a commit that referenced this issue Feb 2, 2024
Better logic around the exporter health.

Related #110

Signed-off-by: oluwole fadeyi <[email protected]>
@tfadeyi (Owner) commented Feb 7, 2024

@hobbsh sorry for the lack of updates. If you are still running the exporter, I've made a new release that includes additional health checks.

tfadeyi closed this as completed Mar 9, 2024
@hobbsh (Contributor, Author) commented Mar 10, 2024

@tfadeyi I've been running it for several weeks and it looks much better, thanks so much for the follow up and dedication to this issue!
