forked from open-telemetry/opentelemetry-specification
Add Performance and Blocking specification (open-telemetry#130)
* Add Performance and Blocking specification

  Performance and Blocking specification is specified in a separate document and is linked from Language Library Design principles document.

  Implements issue: open-telemetry#94

* PR fix (open-telemetry#94).
  - Write about Metrics & Logging to cover entire API
  - Write about shut down / flush operations
  - Leave room for blocking implementation options (should not block "as default behavior")
  - Grammar & syntax fix

* PR fix (open-telemetry#94).
  - Not limit for tracing, metrics.

* PR fix (open-telemetry#94).
  - Mentioned about inevitable overhead
  - Shutdown may block, but it should support configurable timeout also

* PR fix (open-telemetry#94)
  - s/traces/telemetry data/
  - Syntax fix

  Co-Authored-By: Yang Song <[email protected]>

* PR fix (open-telemetry#130)
  - Remove duplication with open-telemetry#186
  - Mention about configurable timeout of flush operation

* PR fix (open-telemetry#130)
  - Not specify default strategy (blocking or information loss)
1 parent 5e2a1e4, commit f5518ea
Showing 2 changed files with 52 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Performance and Blocking of OpenTelemetry API | ||
|
||
This document defines common principles that will help designers create language libraries that are safe to use. | ||
|
||
## Key principles | ||
|
||
Here are the key principles: | ||
|
||
- **The library should not block the end-user application by default.** | ||
- **The library should not consume unbounded memory.** | ||
|
||
Although monitoring inevitably adds some overhead, the API should degrade the end-user application as little as possible. In particular, it should neither block the end-user application nor consume too much memory. | ||
|
||
See also [Concurrency and Thread-Safety](concurrency.md) if the implementation supports concurrency. | ||
|
||
### Tradeoff between non-blocking and memory consumption | ||
|
||
Incomplete asynchronous I/O tasks or background tasks may consume memory to preserve their state. In such a case, there is a tradeoff between dropping some tasks to prevent memory starvation and keeping all tasks to prevent information loss. | ||
|
||
If a language library faces such a tradeoff, it should provide the following options to the end user: | ||
|
||
- **Prevent information loss**: Preserve all information, at the possible cost of consuming many resources. | ||
- **Prevent blocking**: Drop some information under overwhelming load, and emit a warning log when information loss starts and when it has recovered. | ||
  - The library should provide an option to change the dropping threshold. | ||
  - Preferably, the library provides a metric that represents the effective sampling ratio. | ||
  - The language library might provide this option for Logging. | ||
|
||
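The "prevent blocking" option above can be illustrated with a bounded buffer. The following is a minimal sketch, not a prescribed design: the class `DroppingBuffer`, its `max_items` threshold, and the `effective_sampling_ratio` method are hypothetical names chosen for this example. It drops new items instead of blocking once the threshold is reached, logs a warning when dropping starts and when it stops, and reports the fraction of items that were kept.

```python
import logging
import threading
from collections import deque

logger = logging.getLogger("telemetry.buffer")


class DroppingBuffer:
    """Bounded buffer that drops new items instead of blocking when full.

    Hypothetical example class; not part of any OpenTelemetry API.
    """

    def __init__(self, max_items=2048):
        # max_items is the user-configurable dropping threshold (assumed name).
        self.max_items = max_items
        self._items = deque()
        self._lock = threading.Lock()
        self._dropping = False
        self.offered = 0
        self.accepted = 0

    def offer(self, item):
        """Enqueue without blocking; return False if the item was dropped."""
        with self._lock:
            self.offered += 1
            if len(self._items) >= self.max_items:
                if not self._dropping:
                    self._dropping = True
                    logger.warning("buffer full (%d items): dropping telemetry data",
                                   self.max_items)
                return False
            if self._dropping:
                self._dropping = False
                logger.warning("buffer has room again: no longer dropping telemetry data")
            self._items.append(item)
            self.accepted += 1
            return True

    def drain(self):
        """Remove and return all buffered items, as an exporter would."""
        with self._lock:
            items = list(self._items)
            self._items.clear()
            return items

    def effective_sampling_ratio(self):
        """Fraction of offered items that were kept (1.0 if nothing was offered)."""
        with self._lock:
            return self.accepted / self.offered if self.offered else 1.0


# Usage: a threshold of 2 keeps the first two items and drops the rest.
buf = DroppingBuffer(max_items=2)
results = [buf.offer(i) for i in range(4)]
drained = buf.drain()
recovered = buf.offer(4)  # there is room again, so this item is accepted
ratio = buf.effective_sampling_ratio()
```

A "prevent information loss" variant would instead grow the buffer without a limit (or block the caller), which is why the choice between the two is left to the end user.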
### End-user application should be aware of the size of logs | ||
|
||
Logging could consume a lot of memory by default if the end-user application emits too many logs. This default behavior is intended to preserve logs rather than drop them. To keep resource usage bounded, the end user should consider reducing the logs that are passed to the exporters. | ||
|
||
Therefore, the language library should provide a way to filter which logs are captured by OpenTelemetry. End-user applications may want to write many logs to a log file or stdout (or somewhere else) without sending all of them to OpenTelemetry exporters. | ||
|
||
In the documentation of the language library, it is a good idea to point out that too many logs consume many resources by default, and then to explain how to filter logs. | ||
|
||
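One way such filtering could look is sketched below with Python's standard `logging` module. This is an illustration under assumptions, not the API of any OpenTelemetry library: `MinimumLevelFilter` and `ListHandler` are hypothetical names, and `ListHandler` merely stands in for a handler that would forward records to an exporter. All records still reach stdout, while only records at `WARNING` or above reach the export path.

```python
import logging
import sys


class MinimumLevelFilter(logging.Filter):
    """Pass only records at or above min_level (hypothetical helper)."""

    def __init__(self, min_level):
        super().__init__()
        self.min_level = min_level

    def filter(self, record):
        return record.levelno >= self.min_level


class ListHandler(logging.Handler):
    """Stand-in for an exporter-backed handler; keeps records in memory."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


app_logger = logging.getLogger("app")
app_logger.setLevel(logging.DEBUG)
app_logger.propagate = False

# Everything is still written to stdout.
app_logger.addHandler(logging.StreamHandler(sys.stdout))

# Only WARNING and above are passed on towards the exporter.
export_handler = ListHandler()
export_handler.addFilter(MinimumLevelFilter(logging.WARNING))
app_logger.addHandler(export_handler)

app_logger.debug("cache miss for key %s", "user:42")
app_logger.warning("retrying request")
```

A level threshold is only one possible criterion; a filter could equally select by logger name or by record attributes.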
### Shutdown and explicit flushing could block | ||
|
||
The language library could block the end-user application when it shuts down. On shutdown, it has to flush data to prevent information loss. The language library should support a user-configurable timeout if it blocks on shutdown. | ||
|
||
If the language library supports an explicit flush operation, that operation could also block, but it should support a configurable timeout. | ||
|
||
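A minimal sketch of flush and shutdown operations with a configurable timeout follows. It assumes a single background worker thread; `BackgroundExporter`, `export_fn`, and `timeout_s` are hypothetical names used only for this illustration. Both operations may block the caller, but never longer than the given timeout, and they return `False` when the data could not be exported in time.

```python
import threading
import time


class BackgroundExporter:
    """Exports items on a worker thread (hypothetical example class)."""

    def __init__(self, export_fn):
        self._export_fn = export_fn  # assumed callable that sends one item
        self._cond = threading.Condition()
        self._pending = []
        self._in_flight = 0
        self._closed = False
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, item):
        """Queue an item without blocking; return False after shutdown."""
        with self._cond:
            if self._closed:
                return False
            self._pending.append(item)
            self._cond.notify_all()
            return True

    def _run(self):
        while True:
            with self._cond:
                self._cond.wait_for(lambda: self._pending or self._closed)
                if not self._pending:
                    return  # closed and nothing left to export
                batch, self._pending = self._pending, []
                self._in_flight = len(batch)
            for item in batch:
                self._export_fn(item)  # export outside the lock
            with self._cond:
                self._in_flight = 0
                self._cond.notify_all()

    def flush(self, timeout_s=5.0):
        """Block until everything submitted so far is exported, or time out."""
        with self._cond:
            return self._cond.wait_for(
                lambda: not self._pending and self._in_flight == 0,
                timeout=timeout_s,
            )

    def shutdown(self, timeout_s=5.0):
        """Stop accepting items, flush, and stop the worker within timeout_s."""
        deadline = time.monotonic() + timeout_s
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        flushed = self.flush(timeout_s)
        self._worker.join(max(0.0, deadline - time.monotonic()))
        return flushed and not self._worker.is_alive()


# Usage: export three items, then flush and shut down with a 2 second limit.
exported = []
exporter = BackgroundExporter(exported.append)
for i in range(3):
    exporter.submit(i)
flushed = exporter.flush(timeout_s=2.0)
stopped = exporter.shutdown(timeout_s=2.0)
```

Returning a boolean rather than raising on timeout is a design choice of this sketch; it lets the application decide whether unexported data is worth reporting.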
## Documentation | ||
|
||
If a language-specific implementation has special characteristics that are not described in this document, those characteristics should be documented. |