[RFC] Query Visibility #11008

@deshsidd

Introduction

Improving search performance starts with understanding which queries a cluster receives and how the system behaves while executing them. That understanding is the basis for improving query processing, the user experience, and overall query performance.

We propose "Query Visibility", an initiative to provide comprehensive visibility into how users interact with the OpenSearch search platform. This Request for Comments (RFC) outlines the problem we are trying to solve and the milestones for the major features we plan to deliver.

Problem Statement

Today, OpenSearch offers little visibility into the performance of search queries. Without detailed insights, it is difficult to identify where in the query execution process delays and bottlenecks occur. When a query takes a long time to execute, it is hard to pinpoint the root cause for users.

Proposal

To address this, we propose introducing comprehensive metrics and tracing capabilities for search queries. These will provide visibility into query execution and let us analyze query patterns, traffic volumes, popular query attributes, query structures, execution phases, and latency, among other metrics.

With Query Visibility, we aim to improve our ability to gather, analyze, and derive actionable insights from user interactions with the search platform. We expect this visibility to drive improvements in query performance, search functionality, and system robustness for OpenSearch users.

Roadmap/Features

This roadmap outlines the planned workstreams and their tentative target release versions. Each feature will be covered by a detailed follow-up RFC with an in-depth proposal, and those RFCs will be linked here.

1. Capturing Query Patterns

Target OpenSearch Release: 2.12/2.13
RFC Link:
Feature Highlights:

  • Extraction and categorization of query patterns from the search workload on the cluster.
  • Categorizing search queries by type.
  • Capturing the hierarchical structure of queries, including nested subqueries.
  • Extraction of field-related information, such as the number and types of fields.
  • Identifying the types of aggregations.
  • Capturing the number and types of fields returned in the response.
  • Utilizing the metrics framework to collect the above information.
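
To make the idea of capturing a query's type, nesting, and fields concrete, here is a minimal sketch in Python. It is not the proposed implementation: `extract_shape` and `COMPOUND_CLAUSES` are hypothetical names, and the walker makes simplifying assumptions about the query DSL (only `bool` is treated as a compound query, and a leaf query's first key is assumed to be its field name, which does not hold for queries such as `multi_match`).

```python
from collections import Counter

# Hypothetical helper, for illustration only; not part of OpenSearch.
COMPOUND_CLAUSES = {"must", "should", "filter", "must_not"}


def extract_shape(query, depth=0, shape=None):
    """Walk a query DSL dict and collect (depth, query_type, field) tuples."""
    if shape is None:
        shape = []
    for query_type, body in query.items():
        if query_type == "bool":
            shape.append((depth, "bool", None))
            for clause, subqueries in body.items():
                if clause not in COMPOUND_CLAUSES:
                    continue  # skip settings such as minimum_should_match
                if isinstance(subqueries, dict):
                    subqueries = [subqueries]  # a clause may hold a single query
                for sub in subqueries:
                    extract_shape(sub, depth + 1, shape)
        else:
            # Assumption: leaf queries (term, match, range, ...) are keyed by field.
            field = next(iter(body), None) if isinstance(body, dict) else None
            shape.append((depth, query_type, field))
    return shape


example_query = {
    "bool": {
        "must": [{"match": {"title": "opensearch"}}],
        "filter": [
            {"range": {"timestamp": {"gte": "now-1d"}}},
            {"bool": {"should": [{"term": {"status": "ok"}}]}},
        ],
    }
}

shape = extract_shape(example_query)
type_counts = Counter(query_type for _, query_type, _ in shape)
fields = sorted({field for _, _, field in shape if field is not None})
max_depth = max(depth for depth, _, _ in shape)
```

The resulting `type_counts`, `fields`, and `max_depth` values are the kind of low-cardinality attributes that could be emitted as metric dimensions, without recording the query values themselves.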

2. Top N Queries

Target OpenSearch Release: 2.12/2.13

  • Provide the ability to identify the top N queries by latency.
  • Capture query execution traces using the Request Tracing Framework.
  • Extend the tracing framework to capture resource utilization and add further dimensions for top N, such as memory and average CPU consumption. In the future, we will also include cache and disk usage.
  • Instrument query execution phases (such as query and fetch) and search operations (such as aggregations and filtering) to determine the resources a query uses in each span of its execution.
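
One common way to track the top N queries by latency with bounded memory is a min-heap of size N. The sketch below illustrates that technique in Python. It is an assumption about how such tracking could work, not the design proposed here, and `TopNQueries` and `QueryRecord` are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueryRecord:
    latency_ms: float
    query_id: str = field(compare=False)  # ordering uses latency only


class TopNQueries:
    """Keep the N slowest queries seen so far using a size-bounded min-heap."""

    def __init__(self, n):
        if n <= 0:
            raise ValueError("n must be positive")
        self.n = n
        self._heap = []  # the fastest of the retained queries sits at index 0

    def record(self, query_id, latency_ms):
        rec = QueryRecord(latency_ms, query_id)
        if len(self._heap) < self.n:
            heapq.heappush(self._heap, rec)
        elif rec > self._heap[0]:
            # Slower than the fastest retained query, so it replaces it.
            heapq.heapreplace(self._heap, rec)

    def top(self):
        """Return the retained records, slowest first."""
        return sorted(self._heap, reverse=True)


tracker = TopNQueries(n=2)
for query_id, latency_ms in [("q1", 12.0), ("q2", 250.0), ("q3", 40.0), ("q4", 95.0)]:
    tracker.record(query_id, latency_ms)
slowest = [(r.query_id, r.latency_ms) for r in tracker.top()]
```

Each `record` call costs O(log N), so the tracker can sit on the search path without keeping every query in memory. Ranking by another dimension, such as CPU or memory, would only require changing the comparison key.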

3. Query Visibility Plugin (APIs and Dashboard)

Target OpenSearch Release: 2.12/2.13

  • Build a plugin, API, and dashboard to surface the top N queries, query latency in each query phase, query patterns, query shapes, and other resource information related to query execution spans.
  • Create a dedicated, user-friendly query analytics dashboard within OpenSearch that provides real-time insights into query patterns, query performance, and resource consumption.
  • The real-time query monitoring dashboard will provide a live feed of incoming queries, their execution times, and resource consumption. This enables users to spot performance issues immediately.
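
As a rough illustration of the per-phase latency such an API or dashboard could surface, the Python sketch below times named phases with a context manager. `PhaseTimer` is a hypothetical helper and the phase names are placeholders. The actual plugin would rely on the tracing framework rather than this code.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock latency per named phase (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the timer can be tested deterministically
        self.latency_ms = {}

    @contextmanager
    def phase(self, name):
        start = self._clock()
        try:
            yield
        finally:
            elapsed_ms = (self._clock() - start) * 1000.0
            self.latency_ms[name] = self.latency_ms.get(name, 0.0) + elapsed_ms


timer = PhaseTimer()
with timer.phase("query"):
    sum(range(1000))  # stand-in for the query phase
with timer.phase("fetch"):
    sorted(range(100))  # stand-in for the fetch phase
```

After both blocks run, `timer.latency_ms` holds one entry per phase, which is the shape of data a latency-by-phase view would need.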

Conclusion

In summary, "Query Visibility" is an initiative to improve our understanding of query patterns and system behavior during query execution. By capturing query patterns, metrics, and traces, we aim to optimize query processing, improve the user experience, and strengthen system performance. These insights will enable targeted enhancements that benefit our users and the efficiency of the system.

Labels

RFC, Roadmap:Search, Search, Search:Query Insights, enhancement
