Skip to content

Tracing instrumentation at deep search path#6

Closed
dzane17 wants to merge 4 commits intomainfrom
new-tracing
Closed

Tracing instrumentation at deep search path#6
dzane17 wants to merge 4 commits intomainfrom
new-tracing

Conversation

@dzane17
Copy link
Copy Markdown
Owner

@dzane17 dzane17 commented Jan 6, 2024

Description

Tracing instrumentation at deep search path

Added spans at:

  1. Coordinator node
  2. Coordinator node search phases
  3. Shard query and fetch phases

Testing

% curl -X PUT "localhost:9200/test2?pretty" -H 'Content-Type: application/json' -d'                                                                    
{                                                  
  "settings": {
    "number_of_shards": 10
  }
}'

% curl -X POST "localhost:9200/_bulk?pretty" -H 'Content-Type: application/json' -d'
{ "index" : { "_index" : "test2", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test2", "_id" : "2" } }
{ "field2" : "value2" }
'

% curl -XGET 'localhost:9200/_search?pretty&phase_took' -H 'X-Opaque-Id: my-id' -H 'Content-Type: application/json' -d'{
 "query": { "match_all": {} }
}'

Screenshot 2023-12-13 at 9 57 15 AM

Sample Span Details

1. Coordinator node

Screenshot 2023-12-13 at 2 10 42 PM

{
   "traceId":"e522537a055adc6fc59667b58347942c",
   "spanId":"5b42e305497d91f8",
   "parentSpanId":"dab8a74cc0bb10c1",
   "name":"coordinatorRequest",
   "kind":2,
   "startTimeUnixNano":"1702333271222480100",
   "endTimeUnixNano":"1702333271229240384",
   "attributes":[
      {
         "key":"thread.name",
         "value":{
            "stringValue":"opensearch[88665a158598.ant.amazon.com][transport_worker][T#8]"
         }
      },
      {
         "key":"total_hits",
         "value":{
            "stringValue":"4 hits"
         }
      },
      {
         "key":"shards",
         "value":{
            "stringValue":"{total:2, successful:2, skipped:0, failed:0}"
         }
      }
   ],
   "status":{
      
   }
}
2. Coordinator node phase

Screenshot 2023-12-13 at 2 11 40 PM

{
   "traceId":"e522537a055adc6fc59667b58347942c",
   "spanId":"c8ef2d7ad2584a6c",
   "parentSpanId":"5b42e305497d91f8",
   "name":"coordinatorQuery",
   "kind":2,
   "startTimeUnixNano":"1702333271223157629",
   "endTimeUnixNano":"1702333271225489595",
   "attributes":[
      {
         "key":"thread.name",
         "value":{
            "stringValue":"opensearch[88665a158598.ant.amazon.com][transport_worker][T#8]"
         }
      }
   ],
   "status":{
      
   }
}
3. Shard phase

Screenshot 2023-12-13 at 2 14 18 PM

{
   "traceId":"e522537a055adc6fc59667b58347942c",
   "spanId":"637c47eaa1d956da",
   "parentSpanId":"c8ef2d7ad2584a6c",
   "name":"shardQuery",
   "kind":2,
   "startTimeUnixNano":"1702333271224104730",
   "endTimeUnixNano":"1702333271224698102",
   "attributes":[
      {
         "key":"thread.name",
         "value":{
            "stringValue":"opensearch[88665a158598.ant.amazon.com][search][T#3]"
         }
      },
      {
         "key":"index",
         "value":{
            "stringValue":"test1"
         }
      },
      {
         "key":"shard_id",
         "value":{
            "intValue":"0"
         }
      }
   ],
   "status":{
      
   }
}

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dzane17 dzane17 changed the title Coordinator req & phase tracing Tracing instrumentation at deep search path Jan 8, 2024
@dzane17 dzane17 force-pushed the new-tracing branch 2 times, most recently from 86c6175 to 68f1723 Compare January 9, 2024 02:12
Comment on lines +34 to +45
if (searchRequestContext.getPhaseSpan() != null) {
searchRequestContext.getPhaseSpan().endSpan();
}
searchRequestContext.setPhaseSpan(
tracer.startSpan(
SpanBuilder.from(
"coordinator" + capitalize(context.getCurrentPhase().getName()),
new SpanContext(searchRequestContext.getRequestSpan())
)
)
);
SpanScope spanScope = tracer.withSpanInScope(searchRequestContext.getPhaseSpan());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every SpanScope has to be closed. This is not the right way to add the span in a Scope. It actually add the span the TheadLocal and wait for it to be ended then only it will be removed. Further spans can use this span as a parent span. Here it will stay in the TheardLocal of the current thread and if the phase is being executed in an async fashion and ended by some other thread or callback then this Span still will stay in the ThreadLocal and can be wrongly mapped as a parent for other spans from different requests.

I think that's the reason you had to end the phase before start at line no 34.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I understand the purpose of SpanScope for attaching/detaching to threads but have one follow up question. Let's say we have a main function that calls child functions a, b, & c. We want to start the span somewhere in a() and end it somewhere in c(). How can we declare the SpanScope in this case?

void main() {
    a()
    b()
    c()
}

a() {
    ...
    startSpan()
    ...
}

b() {
    ...
}

void c() {
    ...
    endSpan()
    ...
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added SpanScope for all spans in latest commit

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gaganjuneja there was a bug in the listener interface that was causing an issue. It is now fixed in latest revision. Please review again

@dzane17 dzane17 force-pushed the new-tracing branch 2 times, most recently from ef55bf6 to 8d9e4a1 Compare January 11, 2024 18:57
@dzane17 dzane17 requested a review from Gaganjuneja January 12, 2024 08:55
Signed-off-by: David Zane <davizane@amazon.com>
Signed-off-by: David Zane <davizane@amazon.com>
Signed-off-by: David Zane <davizane@amazon.com>
Signed-off-by: David Zane <davizane@amazon.com>
@dzane17 dzane17 closed this Jan 30, 2024
@dzane17 dzane17 deleted the new-tracing branch July 25, 2024 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants