Conversation
toumorokoshi
left a comment
There was a problem hiding this comment.
Thanks! Almost LGTM, just missing a couple licenses in files.
| if isclass(estimator): | ||
| name = estimator.__name__ | ||
| else: | ||
| name = estimator.__class__.__name__ |
There was a problem hiding this comment.
is it worth guarding and raising and exception in this case? it looks like this is assuming that it's a BaseEstimator object.
There was a problem hiding this comment.
Whether a class or object is passed, we want the name of the class for naming the span. This handles both.
There was a problem hiding this comment.
Yep! That's clear to me. I was just wondering if you need additional constraints on whether the object / class is an estimator.
| def implement_spans_fn(func: Callable): | ||
| @wraps(func) | ||
| def wrapper(*args, **kwargs): | ||
| with get_tracer(__name__, __version__).start_as_current_span( |
There was a problem hiding this comment.
nitpick: this code looks like it could be refactored and shared with the wrapper code above.
| name="{cls}.{func}".format(cls=name, func=func.__name__), | ||
| ) as span: | ||
| if span.is_recording(): | ||
| for key, val in attributes.items(): |
There was a problem hiding this comment.
tracing the call tree, it looks like this facility only supports static attributes to annotate an instrumentation.
I would wonder if it's valuable to add a callable edition, where it will pass in the object that is starting the operation, so additional information can be extracted (e.g. some parameterization of the model)?
It's not like this couldn't be added later, just a thought.
| @@ -0,0 +1,40 @@ | |||
| import numpy as np | |||
There was a problem hiding this comment.
test code also requires the license in the header. could you add that?
| @@ -0,0 +1,175 @@ | |||
| from sklearn.ensemble import RandomForestClassifier | |||
c28e9f4 to
b5f8973
Compare
toumorokoshi
left a comment
There was a problem hiding this comment.
LGTM! Thanks. It's really cool to have instrumentation even for machine learning systems.
|
|
||
| ## Unreleased | ||
|
|
||
| - Initial release |
There was a problem hiding this comment.
would it be worth adding the PR link here?
instrumentation/opentelemetry-instrumentation-sklearn/setup.cfg
Outdated
Show resolved
Hide resolved
Co-authored-by: Leighton Chen <lechen@microsoft.com>
Description
Provides an opentelemetry instrumentation package for sklearn models, instrumenting internal spans at the estimator level. The motivation is to provide observability into machine learning models that run for realtime predictive applications that have many complex transformers and predictors.
The instrumentor adds spans to sklearn estimators according to a set of default estimator methods (namely
fit,predict,predict_probaandtransform) and other configuration parameters that determine how spans are implemented through the model hierarchy. The default configuration also handlesPipelineandFeatureUnionhierarchies. Since sklearn's API is easily extended, the configuration parameters allow for custom model hierarchy traversal, allowing spans to be implemented in custom estimators as well.Type of change
How Has This Been Tested?
There are multiple tests for instrumentation on/off, span attributes, and configuration args.
Checklist:
Originally open-telemetry/opentelemetry-python#1054