[RFC] Tracking Search Pipeline Execution

### Is your feature request related to a problem? Please describe

With the expansion of search pipeline processors, tracking data transformations and understanding data flow through complex processors is becoming challenging. The introduction of ML inference processors, which can manipulate model inputs and outputs, increases the need for a tool to visualize and debug the flow of data across these processors. Such functionality would aid in troubleshooting and optimizing pipeline configurations, and would provide transparency into the end-to-end transformations of search requests and responses.


As search pipeline processors grow in complexity, there is an increasing need to do the following (see the [related issue](https://github.com/opensearch-project/OpenSearch/issues/14745)):

1. Track how data flows and transforms through each processor.
2. Debug data transformations and pinpoint any failures within the pipeline.
3. View the end-to-end pipeline execution for both the request and response sides of a search.

This capability would also be valuable for frontend plugins like the [Flow Framework](https://github.com/opensearch-project/flow-framework), helping users configure and test complex ingest and search pipelines.

### Describe the solution you'd like



# Adding `verbose_pipeline` Parameter to Search Request [Preferred]

#### Overview
In this approach, the `verbose_pipeline` parameter is introduced as a query parameter in the search request URL. When used in conjunction with the `search_pipeline` parameter, it activates a debugging mode, allowing detailed tracking of search pipeline processor execution without requiring a new API or changes to the `Explain` API.
![searchRequestflow drawio](https://github.com/user-attachments/assets/e4348c54-be28-4375-baff-27e0f49b2eaf)
* * *

#### Pros

1. **Minimal Changes to Existing Workflow**:

    * No need for a new API endpoint; the debugging functionality is seamlessly integrated into the existing search request.

1. **Backward Compatibility**:

    * The `verbose_pipeline` parameter is optional and defaults to `false`. Existing search requests remain unaffected unless explicitly updated to include `verbose_pipeline=true`.

1. **Alignment with OpenSearch Design**:

    * Consistent with the design of existing search features, such as the `profile` query parameter.

#### Cons


1. **Performance Impact**:

    * Verbose mode is intended for debugging, and activating it may slightly increase computational load because of the additional processor-level logging. By integrating with the existing [search backpressure](https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/search-backpressure/) mechanism, the system can dynamically manage resource usage, ensuring stability while allowing detailed debugging during low-load periods.

* * *
#### Example Request
`GET /my_index/_search?search_pipeline=my_debug_pipeline&verbose_pipeline=true`



#### Example Response

```
{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 50,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1.0,
        "_source": { "field": "value" }
      }
    ]
  },
  "processor_result": [
    {
      "processor": "filter_query",
      "status": "success",
      "execution_time": 3,
      "input": { "query": { "match_all": {} } },
      "output": { "query": { "filtered_query": { "match_all": {} } } }
    },
    {
      "processor": "collapse",
      "status": "success",
      "execution_time": 5,
      "input": { "hits": [...] },
      "output": { "collapsed_hits": [...] }
    }
  ]
}
```

* * *

#### Common Fields for All Processors

Each entry in `processor_result`, regardless of processor type, includes the following common fields:

* **processor**: The name or type of the processor (e.g., `filter_query`, `collapse`).
* **status**: Indicates whether the processor completed successfully (`success`) or encountered an error (`failure`).
* **execution_time**: The time taken by the processor to execute, in milliseconds.
* **input**: The input data provided to the processor. The structure of this field varies depending on the processor type.
* **output**: The transformed data output by the processor. The structure of this field varies depending on the processor type.
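A client could consume these common fields roughly as follows. This is only a minimal sketch, assuming the response carries the `processor_result` array shown in the example response; the `ProcessorResult` class and the `parse_processor_results` and `summarize` helpers are hypothetical names, not part of any OpenSearch client, and the sample data marks `collapse` as failed purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ProcessorResult:
    """Client-side view of one proposed `processor_result` entry (hypothetical type)."""
    processor: str
    status: str
    execution_time: int  # milliseconds, per the field description
    input: Any = None    # structure varies by processor type
    output: Any = None   # structure varies by processor type


def parse_processor_results(response: Dict[str, Any]) -> List[ProcessorResult]:
    # Assumption: an absent array means verbose mode was not enabled.
    return [
        ProcessorResult(
            processor=entry["processor"],
            status=entry["status"],
            execution_time=entry["execution_time"],
            input=entry.get("input"),
            output=entry.get("output"),
        )
        for entry in response.get("processor_result", [])
    ]


def summarize(results: List[ProcessorResult]) -> Dict[str, Any]:
    # Total time across processors, plus the names of any that did not succeed.
    return {
        "total_execution_time": sum(r.execution_time for r in results),
        "failed_processors": [r.processor for r in results if r.status != "success"],
    }


# Illustrative data only; "collapse" is set to "failure" to exercise the summary.
sample_response = {
    "processor_result": [
        {"processor": "filter_query", "status": "success", "execution_time": 3,
         "input": {"query": {"match_all": {}}},
         "output": {"query": {"filtered_query": {"match_all": {}}}}},
        {"processor": "collapse", "status": "failure", "execution_time": 5},
    ]
}

summary = summarize(parse_processor_results(sample_response))
print(summary)
```

Keeping `input` and `output` untyped reflects that their structure depends on the processor type, as described in the sections below.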

#### Request Processor Fields

For processors that handle the incoming search request:

* **input**: The original search request before processing (e.g., the query, filters, and other parameters).
* **output**: The modified search request after this processor has applied its transformations.

Example:

```
{
  "processor": "filter_query",
  "status": "success",
  "execution_time": 3,
  "input": { "query": { "match_all": {} } },
  "output": { "query": { "filtered_query": { "match_all": {} } } }
}
```

#### Search Phase Result Processor Fields

For processors that handle intermediate results during the search phase:

* **input**: The set of search hits or results passed into this processor.
* **output**: The modified or filtered set of search hits after the processor has completed its operation.

```
{
  "processor": "normalization-processor",
  "status": "success",
  "execution_time": 5,
  "input": {
    "hits": [
      { "_index": "my_index", "_id": "1", "_score": 1.0, "_source": { "field": "value1" } },
      { "_index": "my_index", "_id": "2", "_score": 0.9, "_source": { "field": "value2" } }
    ]
  },
  "output": {
    "hits": [
      { "_index": "my_index", "_id": "1", "_score": 1.0, "_source": { "field": "value1" } }
    ]
  }
}
```
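One way to use such an entry when debugging is to compare the document IDs on each side. The following is a sketch under the assumption that `input` and `output` both expose a `hits` array with `_id` values, as in the example above; `hit_ids` and `diff_hits` are hypothetical helper names.

```python
from typing import Any, Dict, List


def hit_ids(side: Dict[str, Any]) -> List[str]:
    # Collect document IDs in the order they appear.
    return [hit["_id"] for hit in side.get("hits", [])]


def diff_hits(entry: Dict[str, Any]) -> Dict[str, List[str]]:
    # Report which IDs the processor dropped or introduced.
    before = hit_ids(entry.get("input", {}))
    after = hit_ids(entry.get("output", {}))
    return {
        "removed": [i for i in before if i not in after],
        "added": [i for i in after if i not in before],
    }


# Mirrors the example entry above, with `_source` omitted for brevity.
entry = {
    "processor": "normalization-processor",
    "status": "success",
    "execution_time": 5,
    "input": {"hits": [{"_id": "1", "_score": 1.0}, {"_id": "2", "_score": 0.9}]},
    "output": {"hits": [{"_id": "1", "_score": 1.0}]},
}

changes = diff_hits(entry)
print(changes)
```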

#### Response Processor Fields

For processors that handle the final search response:

* **input**: The raw search response from the previous phase or processor.
* **output**: The final transformed response to be returned to the client.

```
{
  "processor": "rerank",
  "status": "success",
  "execution_time": 4,
  "input": { "hits": [ ... ] },
  "output": { "hits": [ ... ] }
}
```

* * *


#### Verbose Mode Support Across Search Pipeline Configurations

Verbose mode is designed to work with every way of specifying a search pipeline, so the same debugging output is available regardless of the method chosen. The supported configurations are:


1. Default Search Pipeline

```
PUT /my_index/_settings
{
  "index.search.default_pipeline": "my_pipeline"
}

GET /my_index/_search?verbose_pipeline=true
```

2. Specified Search Pipeline by ID

```
GET /my_index/_search?search_pipeline=my_pipeline&verbose_pipeline=true
```

3. Ad-Hoc (Temporary) Search Pipeline

```
POST /my_index/_search?verbose_pipeline=true
{
  "query": {
    "match": { "text_field": "some search text" }
  },
  "search_pipeline": {
    "request_processors": [
      {
        "filter_query": {
          "query": { "term": { "visibility": "public" } }
        }
      }
    ],
    "response_processors": [
      {
        "collapse": {
          "field": "category"
        }
      }
    ]
  }
}
```
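The three configurations differ only in where the pipeline is specified, so a client could build the request path with one small helper. This is a sketch, not an existing client API: `build_search_path` is a hypothetical function, and it assumes the proposed `verbose_pipeline` query parameter is accepted as shown in the requests above.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_path(index: str, pipeline_id: Optional[str] = None,
                      verbose: bool = True) -> str:
    # Omit `search_pipeline` for the default-pipeline and ad-hoc cases:
    # the pipeline then comes from index settings or the request body.
    params = {}
    if pipeline_id is not None:
        params["search_pipeline"] = pipeline_id
    if verbose:
        params["verbose_pipeline"] = "true"
    query = urlencode(params)
    return f"/{index}/_search" + (f"?{query}" if query else "")


default_or_adhoc_path = build_search_path("my_index")
by_id_path = build_search_path("my_index", pipeline_id="my_pipeline")
print(default_or_adhoc_path)
print(by_id_path)
```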



### Related component

Search

### Describe alternatives you've considered

_No response_

### Additional context

_No response_
