Standard system metrics and semantic conventions #119
bogdandrutu merged 18 commits into open-telemetry:master
james-bebbington left a comment:
LGTM. It may also be worth defining the data type (Int64 or Double)
jlegoff left a comment:
Julien from New Relic - I work on our infrastructure product and I have a couple of comments / questions. Sorry if they are obvious, I'm still getting up to speed with OTEL!
jmacd left a comment:
This looks great to me. I especially like "usage" and "utilization" as standard names.
jkwatson left a comment:
Definitely good enough to be approved as an OTEP and move on to the spec itself.
I believe this PR is ready to be merged, but when writing this up for the specs repo, it would be good to add a convention for process counts (with "state" = running / inactive).
|Name                  |Units  |Instrument Type  |Value Type|Label Key|Label Values                       |
|----------------------|-------|-----------------|----------|---------|-----------------------------------|
|system.cpu.time       |seconds|SumObserver      |Double    |state    |idle, user, system, interrupt, etc.|
|                      |       |                 |          |cpu      |1 - #cores                         |
|system.cpu.utilization|1      |UpDownSumObserver|Double    |state    |idle, user, system, interrupt, etc.|
s/UpDownSumObserver/ValueObserver
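One way to read this suggestion: `system.cpu.time` is a cumulative total that only grows, so a sum-style observer fits it, while `system.cpu.utilization` is a ratio derived from that total over an interval, and ratios from different intervals are not meaningfully added together. The sketch below illustrates the relationship in plain Python without using any OpenTelemetry API. The function name `cpu_utilization` and the snapshot dictionaries are hypothetical, introduced only for this illustration, and the sketch assumes a single CPU.

```python
from typing import Dict


def cpu_utilization(
    prev: Dict[str, float],
    curr: Dict[str, float],
    elapsed_seconds: float,
) -> Dict[str, float]:
    """Derive per-state utilization from two cumulative CPU time snapshots.

    `prev` and `curr` map a state label (idle, user, system, ...) to the
    cumulative seconds spent in that state, i.e. the kind of value that
    system.cpu.time reports. The result is a unitless fraction per state,
    i.e. the kind of value that system.cpu.utilization reports.
    Assumption: one CPU, so the fractions sum to roughly 1.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        state: (curr[state] - prev.get(state, 0.0)) / elapsed_seconds
        for state in curr
    }


# Hypothetical snapshots taken 10 seconds apart.
before = {"idle": 100.0, "user": 20.0, "system": 5.0}
after = {"idle": 107.0, "user": 22.0, "system": 6.0}
utilization = cpu_utilization(before, after, elapsed_seconds=10.0)
print(utilization)
```

The cumulative inputs can be summed or differenced across any window, which is why they suit a sum instrument; the output is only valid for the window it was computed over, which is the property that points toward a value-style observer.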
Conventions from [OTEP 119](open-telemetry/oteps#119)
* System metrics semantic conventions. Conventions from [OTEP 119](open-telemetry/oteps#119)
* change process count to UpDownSumObserver
* fix system.cpu.utilization, use better example
* first several comments
* add description columns, update units to UCUM
* markdown-toc
* clarify OS process level metrics
* clarify load average example
* move general conventions + OTEP 108 into README.md
* renamed swap -> paging
* add addition fs labels
* fix links
* fix link
* Update specification/metrics/semantic_conventions/README.md
  Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
* Update specification/metrics/semantic_conventions/README.md
  Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
* fix tigran comments
* add disk io_time and operation_time
* add descriptions/footnotes for dropped packets and net errors
* lint, more info for net dropped packets/errors
* "dropped_packets" -> "dropped"
* Apply suggestions from James' code review
  Co-authored-by: James Bebbington <jbebbington@google.com>
* comments from James' code review
* clarify windows perf counter
* Update specification/metrics/semantic_conventions/README.md
  Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
* reflow text

Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
Co-authored-by: James Bebbington <jbebbington@google.com>
Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
* standard system and runtime metric names
* added more conventions and tables
* formatting
* cleanup writing/grammar
* Made tables shorter, cleaned up, added runtime overview
* more small fixes
* Tweaks and moved "Open Questions" to the end
* added PR number to filename
* lint
* Update tables, add runtime examples, from review
* More edits addressing review comments
  - Clarify these are metric instrument names (not "metrics")
  - Remove discussion points I left inline
  - Add unresolved comments from review to open questions
* add open question on versioning
* removed open question about versioning
* unabbreviate "net" and "ops"

Co-authored-by: Bogdan Drutu <bogdandrutu@gmail.com>
See open-telemetry/opentelemetry-specification#651. This OTEP proposes some standard system metric names as well as semantic conventions for naming system/runtime metrics. This mostly follows the work done in #108 and the Collector. I left a few TODOs and open questions, the biggest things being standard runtime metrics and process metrics.