-
Notifications
You must be signed in to change notification settings - Fork 890
Description
high-level problem
To create a new request-aware greenlet within a flask request, we copy the existing flask-req-ctx and push a new req-ctx onto the req-ctx-stack. When this greenlet+req-ctx ends, it tries to call the patched _teardown_request within flask-instrumentor, which throws errors because the greenlet doesn't "own" the span, and therefore shouldn't be allowed to end it.
Describe your environment
I created a test repo with all of the source code for reproduction: https://github.com/matthewgrossman/otel-flask-repro. The important file is the app.py.
From trying a few different versions, it doesn't seem that the pinned deps matter too much, but here's a requirements.txt on macOS 13.0.1:
gevent==22.10.2
flask==1.1.4
opentelemetry-instrumentation-flask
opentelemetry-sdk
opentelemetry-api
If you'd prefer, I could turn this env into a docker image to make reproing easier for others. They key deps are flask, gevent, and opentelemetry-instrumentation-flask, and I've noticed this problem in every variation of those deps that I've tried.
Steps to reproduce
$ python3.8 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ python -m flask run -p 8080in other shell:
curl localhost:8080/test -H 'x-test-header: val`In the running shell with the flask service, you should see an error like:
(venv) ~/src/otel-flask-repro » python -m flask run -p 8080
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://127.0.0.1:8080/ (Press CTRL+C to quit)
Failed to detach context
Traceback (most recent call last):
File "/Users/mgrossman/src/otel-flask-repro/venv/lib/python3.8/site-packages/opentelemetry/context/__init__.py", line 157, in detach
_RUNTIME_CONTEXT.detach(token) # type: ignore
File "/Users/mgrossman/src/otel-flask-repro/venv/lib/python3.8/site-packages/opentelemetry/context/contextvars_context.py", line 50, in detach
self._current_context.reset(token) # type: ignore
ValueError: <Token var=<ContextVar name='current_context' default={} at 0x10f888040> at 0x10f88f240> was created in a different Context
Failed to detach context
Traceback (most recent call last):
File "/Users/mgrossman/src/otel-flask-repro/venv/lib/python3.8/site-packages/opentelemetry/context/__init__.py", line 157, in detach
_RUNTIME_CONTEXT.detach(token) # type: ignore
File "/Users/mgrossman/src/otel-flask-repro/venv/lib/python3.8/site-packages/opentelemetry/context/contextvars_context.py", line 50, in detach
self._current_context.reset(token) # type: ignore
ValueError: <Token var=<ContextVar name='current_context' default={} at 0x10f888040> at 0x10f88f180> was created in a different Context
127.0.0.1 - - [15/Dec/2022 11:17:54] "GET /test HTTP/1.1" 200 -
What is the expected behavior?
I wouldn't expect to see this error message.
What is the actual behavior?
This error message is caused by the greenlet's flask-req-ctx trying to call _teardown_request for the overall parent flask-req-ctx. This causes issues with contextvars and span ownership.
When a new request comes into the flask application, FlaskInstrumentor set the current span+ctxmanager on the flask.request.environ, seen here: flask-instrumentor
When we create a new greenlet, we copy in the current contextvars context, as well as create a new flask-req-context based upon the existing flask-req-context. When this greenlet ends, the teardown gets called and attempts to call activation.__exit__. This causes the error seen in the above reproduction steps.
Just a few lines above, there already is a guard clause meant to stop this exact problem: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/instrumentation/opentelemetry-instrumentation-flask/src/opentelemetry/instrumentation/flask/__init__.py#L434-L435.
However, this only ends up protecting us in situations where the ending-req-ctx has a fresh flask.request.environ, like with a test_request_context (noted in the comments of the guard clause). When creating greenlets based on the existing flask request context, this guard clause doesn't sufficiently realize that this ending-req-ctx should also escape early, and not attempt to end the span
Additional context
In the reproduction source code, I provided a workaround. When you create a new greenlet based on the current request context, first pop off the _ENVIRON_ACTIVATION_KEY to ensure the code does go through the guard clause (workaround source code). Commenting out the existing wrap_fn_in_req_context__broken and commenting-in the wrap_fn_in_req_context__workaround removes the error.
However, I'm hoping there's some way to fix such that my code shouldn't need to pop a specific key from an environ before spawning greenlets. A couple options:
- Somehow indicate in the environ which flask-req-ctx-id ended up creating the span. When calling
_teardown_request, escape early if the currentflask-req-ctxdoesn't match the original req-ctx that created the span in the first place - ?? open to suggestions!