monitor-opentelemetry-exporter: Unhandled rejection on network issues or 500 from AI /v2/track endpoint #12851
Ideally, something could be added to CI to test all azure-sdk-for-js modules: #12609
#12856 I built the @microsoft/opentelemetry-exporter-azure-monitor package from the default branch after cloning and was able to use rush and the included npm scripts to build and pack a .tgz, which upon installing had a different public interface than the previously published module.
Is my understanding correct that you are saying you are unable to verify the fix because of the change in available APIs?
Are you in the teams chat with your preview customers who were instructed
to use the hook you just removed? This will be unusable for everyone.
@xirzec: I pinged Matthew McCleary in the Teams chat to ask for you and the azure-monitor team to be invited. He's out of office until the 16th. We're going to be refocusing our efforts on the OpenTelemetry exporter because the cross-team development effort between the AI and Monitor teams seems to be in disarray (or suffering from holiday vacations that overlap with our January shipping schedule). FWIW, there's not much chatting going on there; I think everyone has moved on because this is unusable in its current state.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @sameergMS, @dadunl.
I reviewed the changes in #12563 and made a small test app to try them out. My test app just kept generating random spans while it ran, and when disconnecting from WiFi I saw some of the expected errors.
And while it does seem like ENOTFOUND should be handled as retriable, at least it didn't bring down the app, and after reconnecting to WiFi, events started sending successfully once again. So I think this issue is resolved by #12563, though I do understand that #12856 is blocking you until it is resolved.
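To illustrate the "ENOTFOUND should be retriable" point above, here is a minimal sketch of how an exporter could classify transient failures and retry with backoff. This is not the exporter's actual implementation; the names (`isRetriable`, `sendWithRetry`), the code/status sets, and the backoff parameters are assumptions for illustration only.

```javascript
// Hypothetical retriable-error classification; not the azure-sdk code.
const RETRIABLE_CODES = new Set(["ENOTFOUND", "ECONNRESET", "ETIMEDOUT", "EAI_AGAIN"]);
const RETRIABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);

function isRetriable(err) {
  if (err && RETRIABLE_CODES.has(err.code)) return true; // network-level failures
  if (err && RETRIABLE_STATUS.has(err.statusCode)) return true; // transient HTTP errors
  return false;
}

// Retry a send() function with exponential backoff; rethrow only when the
// error is non-retriable or attempts are exhausted, so the caller can
// handle it instead of the process dying on an unhandled rejection.
async function sendWithRetry(send, maxAttempts = 3, baseDelayMs = 100) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await send();
    } catch (err) {
      if (!isRetriable(err) || attempt >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

With a scheme like this, a dropped WiFi connection (ENOTFOUND) or a transient 500 from the ingestion endpoint would be retried rather than crashing the observed process.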
This issue will be resolved in the next release of the exporter. I have created a task to add retry logic for network issues.
Fix available in the latest release of the package.
@hectorhdzg: @xirzec: @ramya-rao-a: I'm sorry that this isn't the correct channel; however, I don't have permission to open support tickets as I'm an outside contractor. I had to switch away from OpenTelemetry after earlier versions caused a production outage. Now we're out of preview and have been trying for several months to reproduce a small fraction of what we had with OpenTelemetry using Application Insights. I'm running into an issue with the AI ingestion endpoint completely disregarding certain properties I send depending on device type and SDK version. There is a conflict between App Center continuous backup and Application Insights: it appears the only way to send device/location/user data is with continuous export, as the ingestion endpoint doesn't process any of the well-known Azure-defined properties we send for the device type and SDK version specified. None of this is documented for customers. I fear that the solution involves no less than two product teams, and we have already dedicated hundreds of hours of engineering time working around backend issues on your side that support fails to address. Support met with SMEs, and the solution provided was to send less data and use eslint, which I found to be offensive and inappropriate. We need an SRE or engineering resource, and for Case #12012282500654 (which the manager on the project had to open for me) to be escalated, please. Matt McCleary said he would escalate the ticket for us a few months ago, but I no longer have access to message him on the OpenTelemetry Preview Teams chat. Thanks for your time/help.
@Dawgfan ^ |
Hey @jmealo, assuming the problem is with using the …, can you consider logging a GitHub issue here with the details of the problem, so that we have the history and the context to help you?
Package: @azure/monitor-opentelemetry-exporter
Version: 1.0.0-preview.6
OS: Linux
Describe the bug
Errors during the transmission or persistence of spans crashes the process under observation with an unhandled rejection rather than retrying the request or raising an error.
To Reproduce
Steps to reproduce the behavior:
1a. Get a 500 internal server error from the /v2/track AI ingestion endpoint -- it will throw an unhandled rejection with a value of …
2. …
Expected behavior
For all azure-sdk js packages not to throw unhandled rejections/exceptions, and to handle internal errors internally or provide error-handling mechanisms for end users. At the very least, a usable stack trace would be nice.
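Until the package provides such a mechanism, a caller can defend against this class of crash by routing every fire-and-forget export through a catch handler. This is a hedged sketch of a caller-side workaround, not azure-sdk API; `safeExport` and `onError` are hypothetical names.

```javascript
// Wrap an async export function so any rejection (e.g. a 500 from /v2/track
// or a network failure) is delivered to onError instead of becoming an
// unhandled rejection that crashes the process under observation.
function safeExport(exportFn, onError) {
  return (...args) =>
    Promise.resolve()
      .then(() => exportFn(...args))
      .catch((err) => onError(err)); // swallow here; caller decides whether to log or retry
}
```

The caller still sees every failure (via `onError`, where it can log the stack trace or schedule a retry), but the process stays up.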
Additional context
There are unreleased fixes committed by @markwolff on GitHub that have not been published to NPM.