Update WSGI & Flask integrations to follow new semantic conventions #299
Update WSGI & Flask integrations to follow new semantic conventions #299c24t merged 3 commits intoopen-telemetry:masterfrom
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #299 +/- ##
==========================================
- Coverage 85.56% 84.16% -1.40%
==========================================
Files 33 33
Lines 1600 1636 +36
Branches 178 193 +15
==========================================
+ Hits 1369 1377 +8
- Misses 185 204 +19
- Partials 46 55 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
codeboten
left a comment
There was a problem hiding this comment.
Couple of comments, nothing blocking.
| # pylint:disable=too-many-branches,too-many-return-statements | ||
| if code < 100: | ||
| return StatusCanonicalCode.UNKNOWN | ||
| if code <= 299: |
There was a problem hiding this comment.
Not sure if it would be worth adding a check for 200 at the top since i suspect that's going to be by large the status code we hit. Possibly a premature optimization though
There was a problem hiding this comment.
I wouldn't want to guess. 204 No Content is probably also a popular response code to POST requests and the like. I think before that I'd rather do things like the classical _StatusCanonicalCode=StatusCanonicalCode default parameter to avoid accessing globals.
There was a problem hiding this comment.
It feels to me that, rather than this range, just creating a dict with hard-coded values would be more performant. Might be worth a benchmark but would make behavior consistent regardless of what codes are common.
There was a problem hiding this comment.
You mean a dict with 500 entries?
| "http.server_name": environ["SERVER_NAME"], | ||
| "http.scheme": environ["wsgi.url_scheme"], | ||
| "host.port": int(environ["SERVER_PORT"]), | ||
| } |
There was a problem hiding this comment.
Is there ever a case where any of these environment variables wouldn't be present? If so using environ.get may be better, unless raising an exception is preferable
There was a problem hiding this comment.
According to PEP3333, these are required: https://www.python.org/dev/peps/pep-3333/#environ-variables. I guess there might be nonconforming implementations out there. I have no strong opinion here. Maybe it would be better to come up with a convention of using try:/except: to make sure the integrations don't break applications.
| if flavor.upper().startswith(_HTTP_VERSION_PREFIX): | ||
| flavor = flavor[len(_HTTP_VERSION_PREFIX) :] | ||
| if flavor: | ||
| result["http.flavor"] = flavor |
There was a problem hiding this comment.
Could use setifnotnone here
There was a problem hiding this comment.
Unfortunately not, since I use "" as second argument to get above, in order to unconditionally use upper and startswith. I think there is really no way to save a conditional here.
Conflicts: ext/opentelemetry-ext-flask/src/opentelemetry/ext/flask/__init__.py ext/opentelemetry-ext-flask/tests/test_flask_integration.py ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py ext/opentelemetry-ext-wsgi/tests/test_wsgi_middleware.py
c24t
left a comment
There was a problem hiding this comment.
LGTM, and I think it couldn't hurt to protect against non-PEP3333-compliant envs as in https://github.com/open-telemetry/opentelemetry-python/pull/299/files#r350626307.
|
@Oberon00 let me know when you think this is ready to merge |
|
Unless you think I should address the non-PEP3333-compliant environ safety issues first, I think this is ready (but it might cause merge conflicts for the named tracers PR #301). |
toumorokoshi
left a comment
There was a problem hiding this comment.
Approved, looks like you had enough, but I guess another one wouldn't hurt :)
| "http.status_code": 404, | ||
| "http.status_text": "NOT FOUND", | ||
| }, | ||
| {"http.status_code": 404, "http.status_text": "NOT FOUND"}, |
There was a problem hiding this comment.
random and more general question: is it needed to have "status_text"? feels to me like it adds a lot to the payload and is often identical.
There was a problem hiding this comment.
No, the status text is optional. But note that the integration does not insert static text here but actual "output" from the WSGI application. EDIT: See spec: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md#common-attributes
| # pylint:disable=too-many-branches,too-many-return-statements | ||
| if code < 100: | ||
| return StatusCanonicalCode.UNKNOWN | ||
| if code <= 299: |
There was a problem hiding this comment.
It feels to me that, rather than this range, just creating a dict with hard-coded values would be more performant. Might be worth a benchmark but would make behavior consistent regardless of what codes are common.
| elif not url.startswith(environ["wsgi.url_scheme"] + ":"): | ||
| # Something fishy is in RAW_URL. Let's fall back to request_uri() | ||
| url = wsgiref_util.request_uri(environ) | ||
| def setifnotnone(dic, key, value): |
There was a problem hiding this comment.
I think we should follow the underscore naming convention, even for this small method name.
| result["http.target"] = target | ||
| else: | ||
| url = wsgiref_util.request_uri(environ) | ||
| result["http.url"] = wsgiref_util.request_uri(environ) |
There was a problem hiding this comment.
it feels to me like http.url could be decomposed into http.target and http.host or something along those lines: otherwise if I wanted the target, I would have to know and check for url in the case that target did not exist, and perform a split to extract that myself.
There was a problem hiding this comment.
Yeah, we could decompose, but I see no benefit in it. Who do you mean with "myself" above? I think the back-end is a better place for such logic, the integrations should send information to the back-end as "raw" as they want as more raw usually means more info.
Updates flask & WSGI integration to follow new semantic conventions for HTTP as of open-telemetry/opentelemetry-specification#263 (see https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md).
The HTTP client (requests) integration will need an update too, and for that the wsgi
opentelemetry.ext.wsgi.http_status_to_canonical_codewill need to be used by the requests-instrumentation -- either by just adding a dependency (since ext-wsgi is and will stay dependency-free itself, this would technically not be a problem), or by splitting this function into a new package. This will be done in a follow-up PR.