Author: Kaushal Kumar
Please describe the end goal of this project
We want to come to a conclusion on all the fields that should be present in query group stats, since the _node/stats output is already quite large and we expect anywhere from 0 to 100 query groups in the cluster.
Although the number of query groups is limited, each query group tracks node-level metrics, so the actual number of such stat objects in the cluster will be #dataNodes * #queryGroups.
If the cluster is large, e.g. dataNodes = 200 and queryGroups = 50, there will be 10000 of these objects, which is a lot and can make the stats output hard to read.
The schema I am proposing for a single query group stat is something like the following:
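To make the scaling concrete, here is a minimal sketch of the arithmetic above. The function name `stat_object_count` is purely illustrative and not part of any OpenSearch API.

```python
def stat_object_count(data_nodes: int, query_groups: int) -> int:
    """Each data node reports one stat object per query group."""
    return data_nodes * query_groups


# The example from the text: 200 data nodes and 50 query groups.
print(stat_object_count(200, 50))  # 10000
```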
{
  "queryGroupId": {
    "completions": Long,
    "rejections": Long,
    "CPU": { "currentUsage": Double, "cancellations": Long },
    "Memory": { "currentUsage": Double, "cancellations": Long }
  }
}
Apart from the current resource usage, all metric values in this schema are cumulative counters since the process start time.
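As a rough sketch of how the proposed schema could be modelled, the snippet below mirrors the field names from the schema above. The class names `ResourceStats` and `QueryGroupStats` are assumptions made for illustration; the actual implementation would live in the OpenSearch codebase and may look quite different.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ResourceStats:
    currentUsage: float = 0.0  # point-in-time value, not cumulative
    cancellations: int = 0     # cumulative counter since process start


@dataclass
class QueryGroupStats:
    completions: int = 0       # cumulative counter since process start
    rejections: int = 0        # cumulative counter since process start
    CPU: ResourceStats = field(default_factory=ResourceStats)
    Memory: ResourceStats = field(default_factory=ResourceStats)


# Simulate some activity on one node for one query group.
stats = QueryGroupStats()
stats.completions += 3          # counters only ever grow
stats.CPU.cancellations += 1
stats.CPU.currentUsage = 0.42   # usage is overwritten with the latest reading

# Serialize under the query group id, as in the proposed schema.
payload = {"queryGroupId": asdict(stats)}
print(payload)
```

Keeping the usage value separate from the counters makes it clear which fields a consumer can diff between two polls and which ones are only meaningful as a snapshot.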
Keeping these things in mind, I am more inclined towards keeping the stats API for this separate.
I think feature-related stats should only be provided when explicitly requested, using either:
- queryParam
- PathParam
Currently, however, if a feature is enabled and has stats, they are returned by default. If a client consumes _node/stats and then upgrades to an OpenSearch version whose response contains an additional stats object, the client code can break.
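The opt-in behaviour described above could look roughly like the sketch below. Everything here is hypothetical: `build_node_stats`, the `query_groups` section name, and the sample values are placeholders for illustration and do not reflect the real stats response or its parameters.

```python
# Sections every client already receives (values are made up).
DEFAULT_SECTIONS = {
    "os": {"cpu_percent": 12},
    "jvm": {"heap_used_percent": 40},
}

# Feature stats that are only returned on explicit request (hypothetical name).
OPT_IN_SECTIONS = {
    "query_groups": {"queryGroupId": {"completions": 0, "rejections": 0}},
}


def build_node_stats(requested=()):
    """Return the default sections plus any opt-in sections the caller named."""
    response = dict(DEFAULT_SECTIONS)
    for name in requested:
        if name in OPT_IN_SECTIONS:
            response[name] = OPT_IN_SECTIONS[name]
    return response


# An existing client sees the same shape before and after an upgrade.
print(sorted(build_node_stats()))                  # ['jvm', 'os']
# A client that wants the new stats asks for them explicitly.
print(sorted(build_node_stats(["query_groups"])))  # ['jvm', 'os', 'query_groups']
```

With this shape, adding a new feature's stats never changes the default response, so clients that parse it strictly are not affected by an upgrade.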
Supporting References
#12342
Issues
#12342
Related component
Search:Resiliency